When you invoke vmake, it builds all the files and outputs them.
Great. Except... that makes it overwrite the last-modified times of files that haven't changed, which is a real problem whenever the things being make'd are content on any kind of distribution server (files, http, etc).
At the simplest level, it would be nice if vmake had a flag to overwrite files only if they differ from the ones already there. This small change would instantly make building a cache-friendly action. The knock-on effects range from the small-but-universal (you can ftp your built files to a remote host using "mput -r *" and recursively put the entire directory, safe in the knowledge that only what's changed will be transferred) to the huge-but-rare (you can vmake content directly into deployment/delivery directories, safe in the knowledge that it won't blow all the unchanged files out of all the caches on the planet).
We've talked on IRC about several options:
Have the evaluator preserve the mtime of the shipped file.
In other words:
- If it's a source file being shipped, the mtime of the shipped file would be the last time a user edited it.
- If it's a derived file being shipped, the mtime of the shipped file would be when the tool finished writing it.
This would be pretty easy to implement and seems to mostly solve the problem. The main hole is that if something gets weeded out of the cache and you then re-build and re-ship, the derived files are regenerated with new mtimes, so they will be updated at the destination even if their contents are unchanged.
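As a rough illustration of what "preserve the mtime" means at the file level, here is a minimal Python sketch. It is not evaluator code; the function name `ship_preserving_mtime` is made up for this example, and it assumes the file to be shipped is an ordinary file on the local filesystem.

```python
import os
import shutil


def ship_preserving_mtime(src, dst):
    """Copy src to dst, then give dst the same timestamps as src.

    Hypothetical helper for illustration only; not part of vmake.
    """
    shutil.copyfile(src, dst)
    st = os.stat(src)
    # Carry the source's access and modification times over to the copy,
    # so the shipped file does not look "new" just because it was re-shipped.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```

With this behaviour, re-shipping an unchanged source file leaves its mtime at the destination alone. A derived file that had to be rebuilt after weeding would still carry a fresh mtime, which is the hole described above.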
Compare files before shipping
Whenever shipping a file, compare the contents of the file to be shipped with the one already at the destination. Don't copy the new file unless it's different from the old one. This would slow down shipping significantly, but would completely solve the problem posed.
Of course, one could write a script that does this for you: ship the result to another directory (even as links), then do the comparison and selective copying to the real destination. This makes me think that this functionality doesn't really need to go into the evaluator.
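Such a script could look roughly like the following Python sketch. The names `ship_changed`, `staging` and `dest` are invented for this example; it assumes you have already shipped into a staging directory and want to copy into the real destination only what differs.

```python
import filecmp
import os
import shutil


def ship_changed(staging, dest):
    """Copy files from staging into dest only when their contents differ.

    Returns the sorted list of relative paths that were actually copied.
    Hypothetical helper for illustration only; not part of vmake.
    """
    copied = []
    for root, _dirs, files in os.walk(staging):
        rel_root = os.path.relpath(root, staging)
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.normpath(os.path.join(rel_root, name))
            dst = os.path.join(dest, rel)
            # shallow=False forces a byte-by-byte comparison instead of
            # trusting size and mtime.
            if os.path.isfile(dst) and filecmp.cmp(src, dst, shallow=False):
                continue  # identical contents: leave dst and its mtime alone
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copyfile(src, dst)
            copied.append(rel)
    return sorted(copied)
```

The byte-by-byte comparison is where the slowdown mentioned above comes from: every file is read in full at both ends. Files that exist only in the destination are left in place; whether stale files should be removed is a separate decision this sketch does not make.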
Parse the old .log file
When shipping, the evaluator writes a file named ".log" which tells what build the file came from. It also includes the time the shipping was performed and the shortid that was copied.
This could be used to see if the same shortid is being shipped and whether the file at the destination has been modified since the last ship. However this has a number of problems:
- The same shortid doesn't always contain the same thing. Weeding can delete shortids, and a new source or derived file could later be assigned the same shortid.
- mtime is a really bad way to determine if something has changed.
Basically, this idea seems really bad to me (without some significant modification, possibly turning it into the compare-before-shipping option above). It would be terribly broken to ask the evaluator to ship the result of a build and get something other than the result of that build.
AdamMartin asked:
- A source file in one function, whose result is then shipped from another function - does this become a derived file as far as the outer function is concerned?
In some sense it's both a source and a derived. However it will have the same shortid and its mtime won't be changed merely by being returned from a function.
It's a source in that it's still referenced from the source portion of the repository (/vesta). It's a derived in that it's protected by a cache entry.
Hypothetically, you could delete the source version which protects the shortid but still have a cache entry protecting it. This would change its status to just being a derived.