- Reorganizing Repositories
- Reasons for Renaming or Deleting Repos Dirs
- Current Options
- Other Ideas
Repository "appendable directories" (those below /vesta, both "package-parents" created with mkdir or vmkdir, and "packages" created with vcreate) and "immutable directories" (those created with vadvance and vcheckin) can never be renamed. They can be deleted, but they leave behind ghosts. (See: RepositoryRules.)
Essentially, when you create something under /vesta (a directory, a package, a version), it's immortal. Versions, once created, are frozen forever.
There are two main motivations for these restrictions:
Build repeatability. In order for a build to do precisely the same thing in the future that it does now, once a version is created, it must always refer to the same contents. So, once you create /vesta/example.com/dir/pkg/1, that version must always have the same contents. If you could rename /vesta/example.com/dir/pkg to /vesta/example.com/otherdir/otherpkg, your build descriptions would still refer to the old name (via an import), and you wouldn't be able to repeat past builds. If you could delete /vesta/example.com/dir without leaving behind a ghost, you could re-create /vesta/example.com/dir/pkg/1 with different contents, which would change the result of builds you had created in the past.
Replication invariance. Just as /vesta/example.com/dir/pkg/1 is supposed to have the same contents for all time in the repository where it is first created, it's also supposed to have the same contents in any repository you replicate it to. If you could delete objects without leaving ghosts, you could replace /vesta/example.com/dir/pkg/1 in one repository and leave the old contents in another repository. (For more on this see: "Vesta Replication Design" in the vrepl(1) man page and Partial Replication in the Vesta Software Repository.)
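The two rules above (names are never reused, and deletion leaves a ghost) can be illustrated with a toy model. This is only a sketch for intuition, not how the repository is implemented; the class and method names (`AppendableDirectory`, `NameTaken`, `create`, `delete`, `lookup`) are invented for this example.

```python
class NameTaken(Exception):
    """Raised when a name is already bound, to a live object or to a ghost."""


class AppendableDirectory:
    """Toy model: names can be added, but a name is never reused."""

    GHOST = object()  # sentinel standing in for a deleted entry

    def __init__(self):
        self._entries = {}

    def create(self, name, contents):
        # A name that was ever bound (live or ghost) can never be bound again.
        if name in self._entries:
            raise NameTaken(name)
        self._entries[name] = contents

    def delete(self, name):
        # Deleting replaces the entry with a ghost instead of freeing the name.
        if name not in self._entries:
            raise KeyError(name)
        self._entries[name] = self.GHOST

    def is_ghost(self, name):
        return self._entries.get(name) is self.GHOST

    def lookup(self, name):
        value = self._entries[name]
        if value is self.GHOST:
            raise KeyError(f"{name} is a ghost")
        return value


# Usage: once version "1" has existed, its name cannot get new contents.
pkg = AppendableDirectory()
pkg.create("1", "original contents")
pkg.delete("1")
try:
    pkg.create("1", "different contents")
except NameTaken:
    print("refused: the ghost still reserves the name")
```

Because the ghost keeps the name reserved, a past build that imported version 1 can fail to find it, but it can never silently pick up different contents.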
But the desire to rename or delete them remains, nonetheless.
Reasons for Renaming or Deleting Repos Dirs
There are several common reasons why people want to delete or rename repository directories and packages.
Typo on vmkdir or vcreate
I.e., you type vmkdir progrm when you meant vmkdir program.
Splitting up packages
Since programs tend to get bigger over time as source files are added, packages tend to grow in file count. Eventually one may want to split a package up into two or more separate packages.
Refactoring
This is a general reorganization of dirs, packages, and files, whether for readability, because the human team structure changed, or for many other reasons.
Packages Becoming Obsolete
Maybe some component of your design is no longer used. You probably don't want to completely delete it (as old builds may refer to it), but maybe you don't want users to see it when looking through the contents of the repository.
(Really this is a sub-problem of both splitting up packages and refactoring.)
Create New Dirs
If you vmkdir progrm by accident, you can just say to yourself "oops" and vmkdir program and just live with both of those lying around, one of which you never use again. You can vrm progrm, of course, but this still leaves progrm lying around as a ghost.
Number Your Root Dir
When you create the top-level dir of your tree, even the namespace, create it as project1.example.com instead of just project.example.com so that when you eventually want to refactor, you just move to project2.example.com. A disadvantage of this approach is that, except for possibly old-version attributes, your new project2.example.com tree has lost connection to your old tree.
It's relatively easy to automate such a migration. The mirror-legacy-version.pl script attached to this page is an example of one way to do a simple migration to a new top-level directory (though it won't do any re-organization, so probably isn't appropriate for most uses).
A potential problem with this is that paths to objects below /vesta may get hard-coded into various files (scripts, replicator instructions, weeder instructions, etc.). The mirror-legacy-version.pl script only helps with re-writing SDL imports.
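To make the hard-coded path problem concrete, here is a minimal sketch of rewriting an old root to a new one in arbitrary text (SDL, scripts, replicator or weeder instructions). It is not the mirror-legacy-version.pl script and does not parse SDL; it only does a guarded textual substitution. The function name `rewrite_root` and the sample import line are assumptions made for this illustration.

```python
import re

OLD_ROOT = "/vesta/project1.example.com"
NEW_ROOT = "/vesta/project2.example.com"


def rewrite_root(text, old_root=OLD_ROOT, new_root=NEW_ROOT):
    """Replace old_root with new_root wherever it appears as a whole path prefix."""
    # The lookahead requires the root to be followed by "/", by a character
    # that cannot continue a name, or by end of text, so that
    # "/vesta/project1.example.com.au" is left alone.
    pattern = re.compile(re.escape(old_root) + r"(?=/|[^\w.\-]|$)")
    # A function replacement avoids any backslash processing of new_root.
    return pattern.sub(lambda match: new_root, text)


# Usage with an illustrative import line:
line = "from /vesta/project1.example.com/dir import pkg = pkg/1/build.ves;"
print(rewrite_root(line))
```

A textual rewrite like this only finds references in files it is pointed at, which is why paths embedded in scripts and tool configuration outside the repository remain a risk.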
Optionally Make Ghosts Invisible
An idea that's been suggested in the past (and has a tracker entry) is to make ghosts invisible when listing a directory through the NFS interface. This could be implemented without too much difficulty. The ghosts would still exist and still perform their essential function of preventing names from being recreated with different contents. However, they wouldn't show up when listing a directory.
This would probably be controlled by an attribute, which would either be required to hide a ghost or could be set to make a ghost visible again.
We could make this applicable to non-ghosts as well, which would make it possible to hide an obsolete package but leave it in the repository so that old builds could use it.
An alternative suggestion is implementing a Vesta-specific vls command that can optionally hide ghosts.
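The listing behavior discussed in this section could look roughly like the following sketch. Everything here is hypothetical: the `Entry` type, the attribute name `hidden`, and the choice that ghosts are hidden by default unless the attribute says otherwise are assumptions picked for illustration, since the text leaves that choice open.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    name: str
    is_ghost: bool = False
    attributes: dict = field(default_factory=dict)


def visible_names(entries, show_hidden=False, hide_attr="hidden"):
    """Return the names a directory listing would show."""
    names = []
    for entry in entries:
        setting = entry.attributes.get(hide_attr)
        if setting == "true":
            hidden = True            # explicitly hidden (e.g. an obsolete package)
        elif setting == "false":
            hidden = False           # explicitly made visible again
        else:
            hidden = entry.is_ghost  # assumed default: ghosts hidden, others shown
        if show_hidden or not hidden:
            names.append(entry.name)
    return names


# Usage: the ghost and the obsolete package disappear from the default listing.
directory = [
    Entry("program"),
    Entry("progrm", is_ghost=True),
    Entry("oldpkg", attributes={"hidden": "true"}),
]
print(visible_names(directory))
print(visible_names(directory, show_hidden=True))
```

Hidden entries still exist and still reserve their names; only the listing changes.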
Don't "Freeze" Until Needed
In the conversation which prompted the creation of this page, JohnVk suggested the idea of allowing changes to appendable directories until they need to be "frozen". There are two obvious times when you would need to "freeze" something:
- When a source is first used by a build
- When a source is replicated to a peer repository
It seems like this would be difficult to implement, and would actually be more difficult for users to understand than the current rules. They would need some way to find out whether something can be renamed, and would probably have difficulty understanding why something suddenly changes from being renameable/replaceable to being frozen/immortal.
Mount A Historical View of the Repository
Really I'm not clear on how much utility this would have, as it would mostly help with the behavior of scripts/programs which are not part of a Vesta build but which want to access the repository via NFS or look at attributes.
Transparently Follow Renames
- Keep a database of old and new package positions. If an import is not found, the database would be used to find it in case it was moved. Of course, this would mean we would have to reserve the name so a new package with the same name isn't created in the old position. The idea here is that someone is unhappy with the package (not needed anymore) or its position and wouldn't really care to create the same name in the same position again. Perhaps, if they did, we could just ask if they would like to move the package back.
This seems problematic to me (KenSchalk) for several reasons:
- It makes the evaluator's behavior dependent upon data which changes over time. Even though the intent is not to introduce anything which could break repeatability, this seems inherently unwise.
- It requires a new query-able database that's part of the repository's stable, transactional data store. This definitely has to be made part of the repository, as changes to it would have to be atomic with the , but it's a non-trivial amount of work just to add this database.
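The rename database idea, and the first objection to it, can be seen in a small sketch. The function `resolve`, the `renames` mapping, and the `existing` set are all invented for this illustration and say nothing about how the evaluator or repository actually work.

```python
def resolve(import_path, renames, existing):
    """Return import_path if it exists, otherwise follow recorded renames."""
    seen = set()
    path = import_path
    while path not in existing:
        if path in seen:
            raise ValueError(f"rename cycle while resolving {import_path}")
        seen.add(path)
        for old, new in renames.items():
            # Match the old package position itself or anything below it.
            if path == old or path.startswith(old + "/"):
                path = new + path[len(old):]
                break
        else:
            raise LookupError(f"{import_path} not found and no rename recorded")
    return path


existing = {"/vesta/example.com/otherdir/otherpkg/1"}
old_import = "/vesta/example.com/dir/pkg/1"

# Before the rename is recorded, the old import cannot be resolved.
renames = {}
try:
    resolve(old_import, renames, existing)
except LookupError as err:
    print("before:", err)

# After it is recorded, the identical import resolves to the new position.
renames["/vesta/example.com/dir/pkg"] = "/vesta/example.com/otherdir/otherpkg"
print("after:", resolve(old_import, renames, existing))
```

The same import text gives different answers depending on the state of `renames` at evaluation time, which is exactly the time-dependent behavior the first objection warns about.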
A better option for supporting this would be to have SDL refer to its imports not by a path below /vesta, but by something persistent (like the fingerprint of the immutable directory). This would essentially mean treating the path to a version (/vesta/vestasys.org/vesta/release/12) as a "pet name" for a more permanent referent, like its fingerprint (e8ca04dd4c895ee654ef9ceaec291e39). The downside to this approach would be that SDL imports would become far less readable:
from e8ca04dd4c895ee654ef9ceaec291e39 import vesta_release_12 = build.ves;
It's probably better to just leave the package in its original position even after migrating to using it at the new position.
Having said all this, there are some other reasons why it would be very helpful to have an accessible data structure which stores the DAG of versions for a package. That's not quite what Jim said, but such a data structure would probably include information about where all copies of the exact same version exist, which could be used for a purpose like this.
Transparently Re-write Old SDL Imports
- Actually have Vesta change the contents of existing versions in /vesta to effectively change the import paths of any SDL code importing the old path. This wouldn't actually create new versions, just change the old ones. Yeah, I know... scary stuff, but once done, you can completely forget about the fact that you moved a package. I guess the fun part here is remote Vesta sites importing the package. I can see us searching our entire repository, but I'm not sure whether it is possible or feasible to also ensure that no other repositories use it.
My (KenSchalk) instinct is to reject any idea that involves re-writing history. I like the previous idea more.
Unrelated: Tracking Renames Between Versions
There's an RFE about automatically handling tracking of renames, but this is about a different problem: keeping track of renames of files/directories within a package (i.e. the changes from one version to the next). It has nothing to do with renaming in appendable directories.