API Documentation Extraction
This page is about making documentation of the Vesta APIs available in a usable on-line format.
The Documentation Gap
Vesta has a sizable set of libraries with APIs that are available in 5 programming languages. There is not nearly enough documentation to help people work out how to use these interfaces.
The core code is written in C++ and is divided into 13 distinct libraries. (There are 3 sub-libraries that are a part of basics_umb and 10 that are a part of vesta_umb.) We've added interfaces to a subset of these libraries using SWIG to make them callable from 4 more languages (Perl, Python, Tcl, and Java). Those interfaces have some subtle differences to make them map more naturally onto the different languages. (For example, some parts of the C++ API use call-back functions to process a set of items, whereas the wrapped APIs instead return a list of data structures.)
Unfortunately, there's not much documentation of all this. The C++ header files do include some useful comments, but making use of them requires finding the right source file and then reading through the C++. This is a non-trivial exercise for programmers who wish to work in one of the other languages. Also, the subtle changes made in the wrappers are undocumented except in the SWIG input files. So, in order to use the wrapped APIs in these other languages, someone would have to read both the C++ header files which define the core API and the SWIG input files. This is obviously too much effort to expect.
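To make the call-back versus list difference concrete, here is a minimal sketch in Python. The `Directory` class and its method names are invented for illustration and do not correspond to any actual Vesta class or wrapper; the sketch only shows the general shape of the two styles.

```python
from typing import Callable, List


class Directory:
    """Hypothetical stand-in for a collection of named entries (not a Vesta class)."""

    def __init__(self, names: List[str]) -> None:
        self._names = list(names)

    def for_each(self, callback: Callable[[str], bool]) -> None:
        """Call-back style: invoke `callback` once per entry.

        Iteration stops early if the callback returns False.
        """
        for name in self._names:
            if not callback(name):
                break

    def entries(self) -> List[str]:
        """Wrapper style: gather every entry and return them as a list."""
        collected: List[str] = []

        def collect(name: str) -> bool:
            collected.append(name)
            return True  # keep iterating until all entries are seen

        self.for_each(collect)
        return collected
```

A caller of the list-returning form can simply write `for name in directory.entries(): ...`, which is the kind of idiom a scripting-language user would expect, while the call-back form leaves control of the iteration with the library.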
What Users Need
- Documentation of all the elements of the C++ API:
- Classes and other types
- Member variables and member functions
- Template parameters for polymorphic types
- Indexes of the above elements
- By individual library
- Links back to the C++ source code for each element
- Description of each alternate language interface
- Which elements of the C++ API are available
- Differences in the interface
Processing and Intermediate Formats
For the core APIs, it seems best to extract documentation from the code itself (i.e. a literate programming approach). Getting the most out of this will require adding structured comments to the source code, though some tools should allow us to generate at least minimal indexing and documentation from the code as it is today.
Documenting the SWIG interfaces will probably require creating our own documentation annotation and extraction system. (SWIG 1.1 had a documentation generation system. We're using SWIG 1.3 which currently lacks a cohesive documentation system, though it has some limited support for generating documentation for certain languages.) Hopefully the limited scope will keep it from being too much of a burden.
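As a rough illustration of what such an annotation and extraction system might look like, the sketch below pulls per-language notes out of a SWIG input file. The `//@doc <language>: <text>` comment convention is purely an assumption made up for this example; it is not a SWIG feature and no such spec exists yet. The sample declarations are likewise hypothetical.

```python
import re
from typing import Dict, List

# Hypothetical annotation syntax: "//@doc <language>: <text>"
DOC_LINE = re.compile(r"^\s*//@doc\s+(\w+):\s*(.*)$")


def extract_docs(swig_source: str) -> List[Dict[str, str]]:
    """Attach each run of //@doc lines to the next non-blank line after it."""
    records: List[Dict[str, str]] = []
    pending = []  # (language, text) pairs waiting for a declaration
    for line in swig_source.splitlines():
        match = DOC_LINE.match(line)
        if match:
            pending.append((match.group(1), match.group(2)))
        elif line.strip() and pending:
            for language, text in pending:
                records.append(
                    {"language": language, "text": text, "declaration": line.strip()}
                )
            pending = []
    return records


# Made-up SWIG input fragment, for illustration only.
SAMPLE = """\
%module example
//@doc python: Returns a list instead of taking a call-back.
//@doc perl: Returns an array reference.
void list_entries();
"""

for record in extract_docs(SAMPLE):
    print(record["language"], "|", record["declaration"], "|", record["text"])
```

Because ordinary C++ comments are accepted in SWIG input files, a convention along these lines would not interfere with wrapper generation, though a real spec would need to handle multi-line declarations and nested scopes.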
We believe that the right solution will involve XML as an intermediate format. This should enable both flexible presentation through XSL and some advanced cross-indexing and searching capabilities (such as searching for every function which takes a particular type as an argument).
What Exists Today
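The kind of search mentioned above can be sketched with Python's standard `xml.etree.ElementTree` module. The XML layout here (`api`, `library`, `function`, and `param` elements) is a made-up schema for illustration, not the output format of Doxygen or SWIG, and the library, function, and type names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical intermediate format; every name below is invented.
SAMPLE_XML = """
<api>
  <library name="example_lib">
    <function name="open_item">
      <param type="PathName" name="path"/>
      <param type="int" name="flags"/>
    </function>
    <function name="close_item">
      <param type="int" name="handle"/>
    </function>
    <function name="rename_item">
      <param type="PathName" name="old_path"/>
      <param type="PathName" name="new_path"/>
    </function>
  </library>
</api>
"""


def functions_taking(root: ET.Element, type_name: str) -> List[Tuple[str, str]]:
    """Return (library, function) pairs for functions with a parameter of `type_name`."""
    matches = []
    for library in root.iter("library"):
        for function in library.iter("function"):
            if any(p.get("type") == type_name for p in function.iter("param")):
                matches.append((library.get("name"), function.get("name")))
    return matches


root = ET.fromstring(SAMPLE_XML)
for library_name, function_name in functions_taking(root, "PathName"):
    print(f"{library_name}: {function_name}")
```

Once the API description is in a structured form like this, an index by argument type, by return type, or by library is a short query rather than a manual read through header files.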
The only guide which provides any help today in understanding the Vesta APIs is the vcheckout dissection.
- Experiment with Doxygen and evaluate its XML output format.
- Experiment with SWIG's XML output format also.
DONE: The information from SWIG's XML output is not useful for documentation.
- Develop a spec for embedding documentation in SWIG input files and extracting it
There are at least two other kinds of documentation which it would be nice to incorporate into the code and keep up to date automatically: