Saturday, February 19, 2011

Fossil - a sweet spot in the VCS space

Over the course of my career, I've dealt with clients using a variety of VCSs, starting with RCS and including at least one that the client developed for in-house use. I recently had reason to try the generally little-known fossil DVCS, and was pleasantly surprised.

The Sweet Spot

Fossil, unlike other DVCSs I'm familiar with, doesn't include the repository in the working directory. Instead, the repository can be on any locally accessible disk - much like some of the server VCS systems running without a server. You can then check out multiple workspaces - in different directories, of course - from the same repository, again like a server VCS. This has a couple of advantages.

First, consider the following scenario. I'm working on the next version of a client's product, busy adding the cutting-edge features that keep it in constant demand. A "how did we release it with that" bug (introduced by some other developer) shows up, and I'm asked to fix it yesterday. After getting the fix tested and committed to the appropriate release branch, I'll want to evaluate it and possibly merge it to the development branch I've been working on. We can ignore the hard part of dealing with the bug - debugging our processes to figure out how we managed to release a product with such a nasty bug in the first place.

In general, I have two choices in how to work on this bug. I can either commit any outstanding changes in my current workspace and then change it to the release branch, or I can use a different workspace. Doing the commit is painful unless I happen to be close to a natural commit point. It is painful enough that more than one VCS (including recent snapshots of fossil) has tools for saving and restoring uncommitted changes while moving between branches, or even for moving them between branches. With a server VCS, creating a second workspace is easy - I just create the second workspace and check out the release branch. In practice, I probably already have such a workspace set up, because I'll have been dealing with the release code all along. Merging the fix back into development is trivial - I just issue the appropriate merge command.

With most DVCSs, creating a second workspace involves creating a second repository. The real difference between a DVCS and a server VCS will come when I decide to merge the fix from the release branch. I'll have to pull the fix into my development repository before I can do the merge. Exactly how hard this is will depend on the VCS configuration. If the branches are actually branches in a central repository, then I can just update and merge. If the branches are represented by different repositories, then I'll have to pull from the release repository - or my release repository - instead of just updating before I can do the merge. Either way, it's a slightly more complicated process, making it take just a little longer and be just a little bit more likely to go wrong.

With fossil, I can do things either way: I can create a new repository for the release branch and open a workspace from that, or (if the branch is represented by a fossil branch) I can check out the release branch from the repository I'm already using in a new workspace. By not tying repositories to workspaces, fossil has the flexibility of both server VCSs and DVCSs.
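The second option can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a recommended workflow: the repository file name, directory layout, branch name release-1.0, and commit messages are all made up for illustration. It also assumes a fossil binary on the PATH and a USER environment variable that fossil can use to identify the committer.

```shell
# One repository file, two workspaces checked out from it.
# All paths, names, and messages here are hypothetical.
work="$(mktemp -d)"
repo="$work/project.fossil"

fossil init "$repo"                  # the repository lives outside any workspace
mkdir "$work/dev" "$work/release"

# Development workspace, on trunk.
cd "$work/dev" || exit 1
fossil open "$repo"
echo "first draft" > README.txt
fossil add README.txt
fossil commit -m "Initial import"

# Create the release branch, then open it in a second workspace.
fossil branch new release-1.0 trunk
cd "$work/release" || exit 1
fossil open "$repo" release-1.0

# Fix the bug on the release branch...
echo "urgent fix" >> README.txt
fossil commit -m "Fix release bug"

# ...and merge it into development. No pull is needed first,
# because both workspaces share the same repository.
cd "$work/dev" || exit 1
fossil merge release-1.0
fossil commit -m "Merge release fix into trunk"
```

The point of the layout is the last step: because the release fix was committed to the same repository file the development workspace uses, the merge can see it immediately.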

Another place fossil can act like either a VCS or DVCS is in copying changes to/from a remote URL. If set to autosync mode, a commit to a local repo will automatically push to the default remote repo, and an update from the local repo will do a pull from the remote repo before doing the update. In essence, fossil acts just like a server VCS if you turn autosync mode on. This is a sufficiently popular mode of operation that other DVCSs can do this, though they usually require a little external help.
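A rough sketch of autosync in action, under some loud assumptions: the file central.fossil stands in for the remote repository (a real setup would normally clone from an http, https, or ssh URL instead of a local file), and every path, file name, and commit message is hypothetical. It assumes fossil is on the PATH and can work out a user name from the environment.

```shell
# "central.fossil" plays the part of the remote repository in this sketch.
work="$(mktemp -d)"
fossil init "$work/central.fossil"

# Clone it; the clone records where it came from as its default remote.
fossil clone "$work/central.fossil" "$work/local.fossil"

mkdir "$work/ws"
cd "$work/ws" || exit 1
fossil open "$work/local.fossil"
fossil settings autosync on      # explicitly turn autosync on for this repository

echo "notes" > NOTES.txt
fossil add NOTES.txt
fossil commit -m "Add notes"     # commits locally, then pushes to the default remote

fossil update                    # pulls from the default remote, then updates
```

With autosync on, there is no separate push or pull step in the day-to-day loop, which is what makes it feel like a server VCS.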

A final place fossil hits near the sweet spot is in workspace pollution. Being a long-time Perforce user, I find VCSs that leave lots of metadata files in my workspace a pain. I constantly find myself running some generic unix command for dealing with trees, seeing those files pop up in the results, cursing, and then either running the VCS-specific command for that purpose, or re-running the unix command with an appropriate filter added to remove them. Perforce wins this one - it doesn't put metadata files in my workspace. Fossil is a close second, having only one.

Philosophy


One attractive thing - to me, anyway - about fossil is the philosophy. The developers believe that development history is in the past, and hence should be unchangeable. Most server VCS's do this as well (though the knowledgeable can edit the on-disk representation of history in an emergency), but DVCS developers seem to think that creating a linear development history in the repository is important functionality, even if the result is a lie. Fossil doesn't support this (again, modulo hand editing the on-disk repository), which I find attractive, though others may not.

The target audience for fossil is the small workgroup. It isn't designed for handling a distributed system with thousands of developers and multiple layers of evaluators looking at patches and either rejecting or committing and passing them upstream to eventually reach the release engineer. While it seems to have all the features required to work in such a project, that is not its target, and I'm not really qualified to decide whether it would work as such or not.

Other wins


Installation


What prodded me to look at fossil in the first place was needing a DVCS that ran on a relatively old platform. I spent a day trying to get one of the requirements of a more popular DVCS to find a third-party library that comes with most modern Unix systems, or is available for them as an optional package. After that, I decided to see if fossil could be used instead. While it didn't build out of the box - it needed that same third-party library - turning off a config option for features I didn't need easily solved that.

Even better, compared to other DVCSs, fossil is a simple install. There's just one file - the fossil binary - to install. Or remove if you decide you don't like it, or update if you want to install a newer version.

Server options

Like most DVCSs, fossil has a command to start a server so it can be used for ad-hoc push/pull/cloning over the network. Unlike most of the others, the fossil binary can be used that way in most common servers with little or no setup. It can run natively out of inetd, passing it either a repository or a directory of repositories (each ending in .fossil) as an argument to serve either that repository or the repositories in that directory.
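As a rough illustration of the two invocations described above, here is a sketch that writes an inetd entry to a scratch file instead of touching /etc/inetd.conf. The service name, the fossil user account, the binary location, and the /home/fossil/repos directory are all assumptions made for the example; check your system's inetd documentation for the exact field layout it expects.

```shell
# Ad-hoc serving from the command line (runs until interrupted, so it is
# left commented out here). The argument can be one repository or a
# directory of repositories whose names end in .fossil.
#   fossil server /home/fossil/repos --port 8080

# inetd: a one-line entry that hands each connection to "fossil http".
# Written to a local scratch file; every name and path is hypothetical.
conf_fragment="./inetd.conf.fossil"
cat > "$conf_fragment" <<'EOF'
http stream tcp nowait fossil /usr/bin/fossil fossil http /home/fossil/repos
EOF

cat "$conf_fragment"
```

Pointing the last field at a single file such as /home/fossil/repos/project.fossil would serve just that repository instead of the whole directory.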

Fossil can also run as a CGI script, requiring a two or three line fossil script - yes, it starts with #!/usr/bin/fossil or the equivalent - that points to the repository or a directory of repositories as per the inetd invocation.
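A minimal sketch of creating such a CGI script, assuming a local cgi-bin directory and a repository at /home/fossil/repos/project.fossil. Both paths, and the script name, are hypothetical; a real web server will have its own CGI directory and permission rules.

```shell
# Create a two-line fossil CGI script. All paths here are made up.
cgi_dir="./cgi-bin"
mkdir -p "$cgi_dir"

cat > "$cgi_dir/project" <<'EOF'
#!/usr/bin/fossil
repository: /home/fossil/repos/project.fossil
EOF

# The web server has to be able to execute the script.
chmod +x "$cgi_dir/project"
```

To serve a whole directory of repositories instead, the second line would be a "directory:" line naming that directory in place of the "repository:" line.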


While these options might not be suitable for a large project, they are perfectly adequate for the small workgroup that is fossil's target audience.

A complete project management solution

DVCS's that do VCS operations over HTTP often provide ways to get human-readable information out of the repository. After all, this is a thing people commonly want to do with source repositories. So it's not surprising that fossil does that. But the fossil server does authentication, and allows logged in users to perform pretty much any operation that can be performed at the command line. This is the recommended GUI for those who want one.

Further, the fossil server provides the facilities you expect to find on a repository hosting service: a wiki, a blog, and an integrated issue tracking system, plus the ability to customize all of those things. Even better, having all of these in a distributed VCS means they are all distributed as well. You can edit wiki pages or modify tickets locally, and then push them out to a remote repository.

While none of these facilities has as many features as a complete external project, all of them are perfectly adequate for the small groups that this project targets. Further, installation and configuration - when compared to, say, setting up mercurial and trac - is nearly trivial.

Almost all of my clients would have been able to use the fossil tools, and in many cases they would have wound up with a better solution than we were using at the time.

Downsides


Like all things, fossil isn't perfect. It's relatively new, and still under development. So like many such open source projects, the documentation is a bit on the thin side, and what there is is a bit disorganized.

Some features - especially those mostly useful on larger projects, like support for subrepositories or code review integration - aren't available, or at least not yet.

Releases seem to be a bit haphazard - the user base appears to still be mostly developers, and the ease of installation makes tracking sources an easy way to stay current.

The lack of a rebase facility - among other things - may alienate some users, but this is getting into a matter of taste - exactly what features do you want in your VCS?

It's not clear that all of the features in fossil should be bundled in one program - modularity is generally a desirable property. On the other hand, wikis and tickets are generally version-controlled, and using the same VCS for them as for source has a certain attraction to it.

Finally, googling for help with fossil tends to produce a lot of hits about bones rather than VCS software.

Summary


If you're running a project large enough that changing SCMs is itself a project, then fossil probably isn't for you. On the other hand, fossil is well worth a look if you're working on a small project that doesn't have a wiki or issue tracking system and could use one, and you have a system that you can host binaries on (or you can talk the provider into installing the fossil package for use in your CGI scripts). Setting up the fossil server is easier than setting up just the VCS server for most VCSs, and it gets you a wiki and issue tracking system in a single step.