General Information

Probably the best short overview of Vesta and the motivation for its creation is the 24-page paper "The Vesta Approach to Software Configuration Management". If you prefer slides, an alternative overview is in a presentation given at CodeCon 2004.

The hardcover book from Springer-Verlag, "Software Configuration Management Using Vesta", goes into much more depth.

We've tried to make the LearningVesta page a central resource with pointers to other resources you can use to learn about different aspects of Vesta.

If you have any questions, don't hesitate to use:

Finding Your Way Around the Code

We have a cross-reference of the code from Doxygen that you might find useful. A couple of caveats:

There's a dissection of an old version of the checkout tool that might be helpful for getting up to speed on the Vesta APIs. It's not directly related to the server's NFS layer and it covers a bunch of other topics (attributes, replication), but it will probably be easier to digest.

Inside the Vesta Repository

LongIds are NFS Filehandles

The class used to identify files and directories, both for the v2 NFS interface and for the Vesta-specific RPC interface, is LongId. You can find its declaration in the file VestaSource.H. The implementation is spread across several files, partly because some parts differ between the client and the server. You should probably start with the client-side files: VestaSourceCommon.C and VDirSurrogateOnly.C. After that you may want to look at the server-side file: VestaSource.C.

A LongId is mostly an encoded sequence of integer directory indexes. This should be obvious if you look at the code for LongId::append and LongId::getParent.
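
To make the idea concrete, here is a minimal sketch of an identifier built from a sequence of directory indexes. It is not the real LongId: the name ToyLongId, the one-byte-per-index encoding, the root tag value, and the use of an all-zero id as "no such id" are all assumptions made for illustration. The only detail taken from outside the toy is the 32-byte size, which is the fixed size of an NFS v2 filehandle. For the actual encoding, read LongId::append and LongId::getParent.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-in for LongId. Hypothetical layout, NOT Vesta's encoding:
// byte 0 is a non-zero root tag, each following non-zero byte is one
// directory index (1..255), and zero bytes are unused space.
struct ToyLongId {
    std::array<std::uint8_t, 32> bytes;  // NFS v2 filehandles are 32 bytes

    ToyLongId() { bytes.fill(0); }

    // In this toy, an all-zero id means "no such id".
    bool isNull() const { return bytes[0] == 0; }

    // Number of bytes in use: the root tag plus one byte per index.
    std::size_t length() const {
        std::size_t n = 1;
        while (n < bytes.size() && bytes[n] != 0) ++n;
        return n;
    }

    // Child id: this id with one more directory index appended.
    // Returns a null id if the index cannot be encoded or no room is left.
    ToyLongId append(unsigned index) const {
        std::size_t n = length();
        if (isNull() || index == 0 || index > 255 || n >= bytes.size())
            return ToyLongId();
        ToyLongId child = *this;
        child.bytes[n] = static_cast<std::uint8_t>(index);
        return child;
    }

    // Parent id: this id with its last directory index removed.
    // Returns a null id for a root, which has no index to remove.
    ToyLongId getParent() const {
        std::size_t n = length();
        if (isNull() || n <= 1) return ToyLongId();
        ToyLongId parent = *this;
        parent.bytes[n - 1] = 0;
        assert(parent.length() == n - 1);
        return parent;
    }

    bool operator==(const ToyLongId& other) const { return bytes == other.bytes; }
};

// Hypothetical root id for the toy; the tag value 1 is arbitrary.
inline ToyLongId makeToyRoot() {
    ToyLongId root;
    root.bytes[0] = 1;
    return root;
}
```

With this layout, makeToyRoot().append(3).append(7) names "entry 7 inside entry 3 of the root", and calling getParent() on it yields the id for entry 3 again. The fixed size is also why append can fail: a deep enough path runs out of room.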

There are some special cases when the first byte of the LongId is zero:

VDirChangeable packed representation

The directory of everything in the repository is kept in virtual memory while the repository daemon is running (and it usually runs continuously). This uses a tightly packed data structure that's optimized more for space than speed. In the server source code see VDirChangeable.H and VDirChangeable.C for how this is implemented.

Some Quirks Related to NFS

Directories in the Vesta repository can be a delta of changes over some other directory. The working copies users modify are a set of changes over the version their checkout was based on. (This is how constant-time checkouts are possible for arbitrarily large source sets: no copying takes place until you make changes.) The implementation uses even directory entry indexes for the immutable portion and odd directory entry indexes for the mutable portion. This keeps the indexes of the immutable pieces the same as mutable entries are added. However, when an immutable file is modified and a copy-on-write happens, the NFS filehandle (and LongId) needs to remain the same. This is implemented by the sameAsBase bit in the directory entry. If that bit is set, the directory entry has replaced something with the same name in the basis directory, and the code looks up the basis directory's entry and uses its directory index.
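
The even/odd scheme and the sameAsBase lookup can be sketched as follows. This is a toy model, not VDirChangeable's packed layout: the names ToyEntry and ToyDeltaDir, the choice to start even indexes at 2 and odd indexes at 1, and the use of 0 for "not found" are assumptions for illustration. Only the even/odd split and the meaning of sameAsBase come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One entry in the mutable (delta) portion of a directory.
struct ToyEntry {
    std::string name;
    bool sameAsBase;  // true if this entry replaced a base entry of the same name
};

// Toy directory: a list of mutable changes layered over an immutable base.
struct ToyDeltaDir {
    std::vector<std::string> base;   // immutable entries, in order
    std::vector<ToyEntry> changes;   // mutable entries, in order

    // Assumed numbering: immutable entries get 2, 4, 6, ...
    static unsigned baseIndex(std::size_t pos) { return 2 * (static_cast<unsigned>(pos) + 1); }
    // Assumed numbering: mutable entries get 1, 3, 5, ...
    static unsigned mutableIndex(std::size_t pos) { return 2 * static_cast<unsigned>(pos) + 1; }

    // Position of `name` in the base, or -1 if the base has no such entry.
    int findBase(const std::string& name) const {
        for (std::size_t j = 0; j < base.size(); ++j)
            if (base[j] == name) return static_cast<int>(j);
        return -1;
    }

    // Directory index that would go into the filehandle, or 0 if absent.
    unsigned indexFor(const std::string& name) const {
        for (std::size_t i = 0; i < changes.size(); ++i) {
            if (changes[i].name != name) continue;
            if (changes[i].sameAsBase) {
                // Copy-on-write replacement: keep using the base entry's index
                // so the filehandle does not change.
                int j = findBase(name);
                if (j >= 0) return baseIndex(static_cast<std::size_t>(j));
            }
            return mutableIndex(i);
        }
        int j = findBase(name);
        return j >= 0 ? baseIndex(static_cast<std::size_t>(j)) : 0;
    }

    // A brand-new file goes into the mutable portion with an odd index.
    void create(const std::string& name) { changes.push_back(ToyEntry{name, false}); }

    // Modifying an immutable file adds a mutable entry flagged sameAsBase.
    void copyOnWrite(const std::string& name) {
        assert(findBase(name) >= 0);  // only base entries can be copied on write
        changes.push_back(ToyEntry{name, true});
    }
};
```

In this sketch, adding new files never disturbs the even indexes, and a file that has been copied on write still reports the index it had while it was immutable, which is the property the filehandle depends on.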

When a file or directory is renamed, its directory index and even the directory that contains it can change. However, the NFS filehandle has to remain valid. When an object is moved, it leaves behind a forwarding pointer which can be followed to find where the object is now. See the files VForward.H and VForward.C for how this is implemented. In the VDirChangeable structure, the directory entry gets its type changed to VestaSource::deleted or VestaSource::outdated, and the VForward is stored in the entry where a file/directory reference would be for other directory entry types.
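
The forwarding idea can be illustrated with a small handle table. This is a sketch under simplifying assumptions, not the VForward implementation: handles are plain integers rather than LongIds, and ToySlot, ToyStore, and their member functions are hypothetical names. What it mirrors from the description above is that the old slot changes kind and stores the forwarding pointer in place of the object reference.

```cpp
#include <cassert>
#include <map>
#include <string>

// One slot in the toy handle table.
struct ToySlot {
    enum Kind { Live, Forwarded };
    Kind kind;
    std::string path;    // meaningful when kind == Live
    unsigned forwardTo;  // meaningful when kind == Forwarded
};

struct ToyStore {
    std::map<unsigned, ToySlot> slots;
    unsigned next;

    ToyStore() : next(1) {}  // handle 0 is reserved to mean "unknown"

    // Create a new object and return its handle.
    unsigned create(const std::string& path) {
        unsigned h = next++;
        slots[h] = ToySlot{ToySlot::Live, path, 0};
        return h;
    }

    // Follow forwarding pointers until a live slot is reached.
    // Returns 0 if the handle is unknown. Forward pointers always point at
    // newer (larger) handles here, so the loop cannot cycle.
    unsigned follow(unsigned handle) const {
        std::map<unsigned, ToySlot>::const_iterator it = slots.find(handle);
        while (it != slots.end() && it->second.kind == ToySlot::Forwarded)
            it = slots.find(it->second.forwardTo);
        return it == slots.end() ? 0 : it->first;
    }

    // Rename: the object moves to a new slot, and the old slot is turned
    // into a forwarding pointer so old handles keep working.
    unsigned rename(unsigned handle, const std::string& newPath) {
        unsigned live = follow(handle);
        assert(live != 0);  // renaming an unknown handle is a caller bug
        unsigned moved = create(newPath);
        slots[live] = ToySlot{ToySlot::Forwarded, "", moved};
        return moved;
    }

    // Current path for a handle, or "" if the handle is unknown.
    std::string pathOf(unsigned handle) const {
        unsigned live = follow(handle);
        return live == 0 ? std::string() : slots.find(live)->second.path;
    }
};
```

A client that cached the handle returned by create can still call pathOf with it after any number of renames. Each rename adds one hop to the chain, so a lookup through an old handle costs one extra step per move.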

Each tool run during a build (e.g. compiling one source file) takes place in a separate volatile directory that exists just as long as that tool runs. This means that each successive tool run has a different set of NFS filehandles (and LongIds). However, a lot of the files tend to be the same from one tool run to the next when you're doing something like processing a collection of source files with the same tool. To be able to exploit the client machine's filesystem cache, we need the NFS filehandles for identical files to be the same across multiple tool invocations. To enable this, the default is to use filehandles based on the input files' contents rather than on a sequence of directory indexes. This requires that every file that exists at the start of a tool run be read-only. The tool can create new files or delete existing files, but it can't modify the contents of existing files.

Vesta's "Simple" RPC System

This is the library used for all the Vesta-specific RPCs for operations that can't be represented through the filesystem (e.g. taking an immutable snapshot of a mutable working directory).

The code of the library and some simple test programs can be found:

Vesta: NotesForCiti (last edited 2008-08-08 22:38:49 by KenSchalk)