Vesta uses complete filesystem and environment encapsulation based on the chroot system call. Each time you set up a new tool (or even a new version of a tool) to run under Vesta control, you need to construct a complete filesystem with everything it needs to run. Sometimes this requires a process of trial and error to discover all the files used by a tool. This page describes a number of techniques you can use to do this.

Background

When the Vesta builder runs a tool, the tool is restricted to an isolated subset of the filesystem. This subset is defined by the value of ./root at the point when _run_tool is called.

Vesta's tool launching process uses the chroot(2) and execve(2) system calls to restrict filesystem accesses to a directory under Vesta's control and to completely define the environment for each tool. Your system may have man pages which describe these in detail (though that's probably more information than you need). If you're not familiar with the concept of chroot, Wikipedia's description may also be helpful.
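To make the chroot-then-execve sequence concrete, here is a minimal sketch in Python. It is only an illustration of the two system calls, not Vesta's actual launcher; the function name run_in_chroot and the use of exit status 127 for a failed launch are assumptions made for this example. Calling chroot requires root privileges.

```python
import os

def run_in_chroot(root_dir, argv, env):
    """Run argv[0] inside root_dir with exactly the environment env.

    Returns the child's exit status, 127 if the chroot or exec failed
    (a tool could also exit with 127 itself), or the negated signal
    number if the child was killed by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: confine ourselves, then replace the process image.
        try:
            os.chroot(root_dir)            # requires root privileges
            os.chdir("/")                  # working directory must lie inside the new root
            os.execve(argv[0], argv, env)  # env is the tool's complete environment
        finally:
            os._exit(127)                  # only reached if one of the calls above raised
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

Because the environment is passed explicitly to execve, the tool sees only the variables listed in env, and because of the chroot it sees only the files under root_dir. Anything missing from either shows up as a failure inside the tool.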

Tools that you run under Vesta will usually need a number of things other than the tool itself included in the filesystem:

Constructing a Filesystem from OS Components

Most modern operating systems have a system for dividing the installed files into different named components. Each such component represents a subset of the files and directories that make up the operating system. Many Linux systems use either the Red Hat package manager (RPM) or the Debian package manager, but these are just two examples.

To make it easier to construct the filesystem needed to run a tool, we import entire OS components into Vesta and allow the user to simply give a list of OS component names which need to be in the filesystem. For example, on a Debian system in order to construct a filesystem needed to run the lexical analyzer generator flex, we would ask for:

  1. The OS component for flex itself
  2. The OS component for the m4 macro pre-processor (which recent versions of flex use)
  3. The OS component for the C run-time library

This can be done with a single function call:

    ./build_root(<"flex", "m4", "libc6">)

This is of course just one approach implemented on top of Vesta in the Vesta SDL language, but we do think it's a useful one. For more on this see std_env which describes the standard build environment.

Evaluator: -fsdeps

One of the Vesta evaluator's debug flags is "-fsdeps". This will print out one line for each dependency recorded by a filesystem access as a tool runs. Here's an example from the introduction to writing bridges:

% vmake -fsdeps
[...]
0/hostname: grep a
0/FS dependency: !/./root/.WD/grep
0/FS dependency: N/./root/bin/grep
0/FS dependency: !/./root/lib
0/FS dependency: !/./root/usr
0/Error: invoking _run_tool: [...]

The first character of each dependency path indicates what kind of operation the tool was performing:

In this case we can see that the tool looked for /.WD/grep, found that it didn't exist, and then moved on to /bin/grep and used that file. It then looked for /lib and /usr, neither of which existed, and then failed. (This example is specifically meant to illustrate what happens when you leave out certain key filesystem components like the loader and the C run-time library.)
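When the -fsdeps output gets long, it can help to filter it mechanically. The following is a rough sketch in Python that pulls the dependency lines out of saved evaluator output and lists the paths marked with "!", which in the example above are the ones the tool looked for and did not find. The function names are hypothetical, and the regular expression is based only on the line format shown in the example, so it may need adjusting for other output.

```python
import re

# Matches lines such as "0/FS dependency: !/./root/.WD/grep".
# The one-character kind is captured as-is; this sketch does not
# interpret it beyond the "!" case seen in the example.
FS_DEP = re.compile(r"^\d+/FS dependency: (.)/\./root(/.*)$")

def parse_fsdeps(output):
    """Return a list of (kind, path) pairs, with paths relative to ./root."""
    deps = []
    for line in output.splitlines():
        m = FS_DEP.match(line.strip())
        if m:
            deps.append((m.group(1), m.group(2)))
    return deps

def missing_paths(deps):
    """Paths whose dependency line carried the "!" marker."""
    return [path for kind, path in deps if kind == "!"]

# The dependency lines from the vmake -fsdeps example above.
SAMPLE = """\
0/hostname: grep a
0/FS dependency: !/./root/.WD/grep
0/FS dependency: N/./root/bin/grep
0/FS dependency: !/./root/lib
0/FS dependency: !/./root/usr
"""

if __name__ == "__main__":
    for path in missing_paths(parse_fsdeps(SAMPLE)):
        print(path)
```

A missing path is not necessarily a problem (the lookup of /.WD/grep is just a search path miss), but a missing directory like /lib is a strong hint about what to add to the filesystem.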

For more on dependencies and how they are recorded when tools run, see HowCachingWorks and the description of how _run_tool's dependency recording can be controlled.

Evaluator: -evalcalls

Another Vesta evaluator debug flag is "-evalcalls". This prints out one message for every call-back to the evaluator requesting information about some part of the encapsulated filesystem. This usually doesn't provide any additional useful information beyond "-fsdeps" and is mostly interesting to developers working on modifying Vesta. However it's worth knowing about and may be useful in some obscure cases.

Evaluator: -stop-before/after-tool

@@@ Not written yet @@@

Determining Needed Shared Libraries

As mentioned above, tools usually need some shared libraries to run. There's usually a command you can run to get a list of the shared libraries needed by an executable, though it varies depending on the operating system:
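As one illustration, assuming a Linux system where the ldd(1) command is available, the sketch below runs ldd on an executable and extracts the library paths from its output. The function names and the sample text are made up for this example (the sample is illustrative, not captured output), and other operating systems use different commands and output formats.

```python
import re
import subprocess

def parse_ldd(output):
    """Extract absolute library paths from ldd-style output."""
    paths = []
    for line in output.splitlines():
        line = line.strip()
        # Form 1: "libc.so.6 => /lib/.../libc.so.6 (0x...)"
        m = re.match(r"^\S+ => (/\S+)", line)
        if not m:
            # Form 2: "/lib64/ld-linux-x86-64.so.2 (0x...)" (the loader)
            m = re.match(r"^(/\S+) \(0x", line)
        if m:
            paths.append(m.group(1))
    return paths

def shared_libraries(executable):
    """Run ldd on an executable (assumes ldd is installed) and parse the result."""
    result = subprocess.run(["ldd", executable],
                            capture_output=True, text=True, check=True)
    return parse_ldd(result.stdout)

# Illustrative sample in the usual ldd layout.
SAMPLE = (
    "\tlinux-vdso.so.1 (0x00007ffc1a5f2000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2b3c000000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f2b3c400000)\n"
    "\tlibfoo.so.1 => not found\n"
)

if __name__ == "__main__":
    print(parse_ldd(SAMPLE))
```

Entries without a path (such as the virtual linux-vdso object) and libraries reported as "not found" are skipped, since there is no file to add to the tool's filesystem for them.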

Monitoring System Calls

Sometimes it's helpful to monitor the system calls made by a tool. There's usually some utility you can use to do this, though you'll need to import it into Vesta and include it in your tool's filesystem. The exact command depends on the operating system:
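For example, on Linux the strace utility prints one line per system call. Assuming you have saved such a trace to a file, a small script can pick out the file-related calls and show which lookups failed. The sketch below is in Python; the function names are hypothetical, the sample lines are illustrative rather than captured from a real run, and the regular expression only covers a handful of calls.

```python
import re

# Matches a few file-related calls in strace's usual "name(args) = result" layout.
SYSCALL = re.compile(
    r'\b(open|openat|stat|access|execve)\('
    r'(?:AT_FDCWD, )?"([^"]*)".*\) = (-?\d+)(?: (E[A-Z]+))?'
)

def parse_strace(output):
    """Return (syscall, path, result, errno_name) tuples for matching lines."""
    calls = []
    for line in output.splitlines():
        m = SYSCALL.search(line)
        if m:
            name, path, result, errno_name = m.groups()
            calls.append((name, path, int(result), errno_name))
    return calls

def missing_files(calls):
    """Paths for which a call failed with ENOENT (no such file or directory)."""
    return [path for _, path, result, err in calls
            if result < 0 and err == "ENOENT"]

# Illustrative trace lines, not real output.
SAMPLE = '''\
execve("/bin/grep", ["grep", "a"], [/* 3 vars */]) = 0
open("/etc/ld.so.cache", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
close(3) = 0
'''

if __name__ == "__main__":
    print(missing_files(parse_strace(SAMPLE)))
```

As with -fsdeps, not every failed lookup matters, but the list is usually short enough to review by hand.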

Environment Variables

Some tools use environment variables during their processing. If they're not set correctly, the tool may not operate correctly. Unfortunately, there's no way to observe which environment variables a tool uses. (getenv is a library call, not a system call, so strace cannot monitor it.) If the tool's documentation doesn't tell you, you may need to resort to examining the tool's internal functioning.

For compiled binaries, environment variable names usually appear in the output of strings(1) run on the binary (though many other strings will obviously be included as well). As environment variable names typically follow the convention of using uppercase, underscores, and digits, you can usually find them with a simple filter:

% strings /usr/bin/gcc | grep '^[A-Z_0-9]*$'
[...]
PATH
GCC_EXEC_PREFIX
COMPILER_PATH
LIBRARY_PATH
LPATH
BINUTILS
_ROOT
POSIX
LC_COLLATE
LC_CTYPE
LC_MONETARY
LC_NUMERIC
LC_TIME
LC_MESSAGES
LC_ALL
LC_XXX
LANGUAGE
LANG
TMPDIR
TEMP
[...]
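If strings(1) isn't available where you need it, the same filter can be approximated in a few lines of Python. This is a sketch rather than a replacement for strings: it treats any run of four or more printable ASCII bytes as a string, keeps those made up entirely of uppercase letters, digits, and underscores, and (unlike the pipeline above) drops duplicates. The function name is hypothetical.

```python
import re

# Runs of at least four printable ASCII bytes, roughly what strings(1) reports.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# The same character class as the grep pattern above, but requiring a non-empty match.
ENV_NAME = re.compile(r"[A-Z_0-9]+")

def candidate_env_names(data):
    """Return candidate environment variable names found in binary data."""
    names = []
    seen = set()
    for match in PRINTABLE_RUN.finditer(data):
        text = match.group().decode("ascii")
        if ENV_NAME.fullmatch(text) and text not in seen:
            seen.add(text)
            names.append(text)
    return names

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        for name in candidate_env_names(f.read()):
            print(name)
```

As the gcc output shows, the result still contains false positives (fragments such as _ROOT or placeholders such as LC_XXX), so treat it as a list of candidates to check against the documentation.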

For scripts, you can usually come up with a simple pattern which will find environment variable references:
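One such pattern, sketched here in Python under the assumption of a Bourne-style shell script, looks for $NAME and ${NAME} references. The function name is hypothetical, and the pattern cannot tell environment variables apart from ordinary shell variables set within the script, so the result is again a list of candidates. Other scripting languages refer to the environment differently and would need their own patterns.

```python
import re

# "$NAME" or "${NAME...}"; positional and special parameters such as
# $1 or $@ are skipped because a name must start with a letter or underscore.
SHELL_VAR = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)")

def shell_variable_references(script_text):
    """Return the sorted set of variable names referenced in a shell script."""
    return sorted({m.group(1) for m in SHELL_VAR.finditer(script_text)})

if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as f:
        for name in shell_variable_references(f.read()):
            print(name)
```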

Vesta: ToolTroubleshooting (last edited 2007-06-21 22:57:40 by KenSchalk)