Preface

Why revision control? Why Mercurial?

Revision control is the process of managing multiple versions of a piece of information. In its simplest form, this is something that many people do by hand: every time you modify a file, save it under a new name that contains a number, each one higher than the number of the preceding version.

Manually managing multiple versions of even a single file is an error-prone task, though, so software tools to help automate this process have long been available. The earliest automated revision control tools were intended to help a single user to manage revisions of a single file. Over the past few decades, the scope of revision control tools has expanded greatly; they now manage multiple files, and help multiple people to work together. The best modern revision control tools have no problem coping with thousands of people working together on projects that consist of hundreds of thousands of files.

The arrival of distributed revision control is relatively recent, and so far this new field has grown due to people's willingness to explore ill-charted territory.

I am writing a book about distributed revision control because I believe that it is an important subject that deserves a field guide. I chose to write about Mercurial because it is the easiest tool to learn the terrain with, and yet it scales to the demands of real, challenging environments where many other revision control tools buckle.

Why use revision control?

There are a number of reasons why you or your team might want to use an automated revision control tool for a project.

- It will track the history and evolution of your project, so you don't have to. For every change, you'll have a log of who made it; why they made it; when they made it; and what the change was.
- When you're working with other people, revision control software makes it easier for you to collaborate. For example, when people more or less simultaneously make potentially incompatible changes, the software will help you to identify and resolve those conflicts.
- It can help you to recover from mistakes. If you make a change that later turns out to be in error, you can revert to an earlier version of one or more files. In fact, a really good revision control tool will even help you to efficiently figure out exactly when a problem was introduced.
- It will help you to work simultaneously on, and manage the drift between, multiple versions of your project.

Most of these reasons are equally valid, at least in theory, whether you're working on a project by yourself, or with a hundred other people.

A key question about the practicality of revision control at these two different scales ("lone hacker" and "huge team") is how its benefits compare to its costs. A revision control tool that's difficult to understand or use is going to impose a high cost.

A five-hundred-person project is likely to collapse under its own weight almost immediately without a revision control tool and process. In this case, the cost of using revision control might hardly seem worth considering, since without it, failure is almost guaranteed.

On the other hand, a one-person "quick hack" might seem like a poor place to use a revision control tool, because surely the cost of using one must be close to the overall cost of the project. Right?

Mercurial uniquely supports both of these scales of development. You can learn the basics in just a few minutes, and due to its low overhead, you can apply revision control to the smallest of projects with ease. Its simplicity means you won't have a lot of abstruse concepts or command sequences competing for mental space with whatever you're really trying to do. At the same time, Mercurial's high performance and peer-to-peer nature let you scale painlessly to handle large projects.

No revision control tool can rescue a poorly run project, but a good choice of tools can make a huge difference to the fluidity with which you can work on a project.

The many names of revision control

Revision control is a diverse field, so much so that it is referred to by many names and acronyms. Here are a few of the more common variations you'll encounter:

- Revision control (RCS)
- Software configuration management (SCM), or configuration management
- Source code management
- Source code control, or source control
- Version control (VCS)

Some people claim that these terms actually have different meanings, but in practice they overlap so much that there's no agreed or even useful way to tease them apart.

This book is a work in progress

I am releasing this book while I am still writing it, in the hope that it will prove useful to others.
I am writing under an open license in the hope that you, my readers, will contribute feedback and perhaps content of your own.

About the examples in this book

This book takes an unusual approach to code samples. Every example is "live": each one is actually the result of a shell script that executes the Mercurial commands you see. Every time an image of the book is built from its sources, all the example scripts are automatically run, and their current results compared against their expected results.

The advantage of this approach is that the examples are always accurate; they describe exactly the behaviour of the version of Mercurial that's mentioned at the front of the book. If I update the version of Mercurial that I'm documenting, and the output of some command changes, the build fails.

There is a small disadvantage to this approach, which is that the dates and times you'll see in examples tend to be "squashed" together in a way that they wouldn't be if the same commands were being typed by a human. Where a human can issue no more than one command every few seconds, with any resulting timestamps correspondingly spread out, my automated example scripts run many commands in one second.

As an instance of this, several consecutive commits in an example can show up as having occurred during the same second. You can see this occur in the bisect example, for instance.

So when you're reading examples, don't place too much weight on the dates or times you see in the output of commands. But do be confident that the behaviour you're seeing is consistent and reproducible.
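
The "run the example, compare with the expected result, fail the build on a mismatch" check described above can be sketched roughly as follows. This is not the book's actual build tooling; it is a minimal Python sketch, and the names `run_example` and `ExampleMismatch` are hypothetical. A harmless Python one-liner stands in for a Mercurial command so that the sketch runs even where Mercurial is not installed.

```python
import subprocess
import sys


class ExampleMismatch(Exception):
    """Raised when an example's output differs from the recorded output."""


def run_example(command, expected_output):
    """Run one example command and compare its output with the recorded one.

    `command` is an argument list; `expected_output` is the text that was
    recorded the last time the example was known to be correct.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    if result.stdout != expected_output:
        # A real build would stop here, so stale examples never get published.
        raise ExampleMismatch(
            f"{command!r}: expected {expected_output!r}, got {result.stdout!r}"
        )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a Mercurial command (an assumption made for portability).
    stand_in = [sys.executable, "-c", "print('hello')"]
    run_example(stand_in, "hello\n")
    print("example output matches the recorded output")
```

The design point is simply that a mismatch is treated as an error rather than a warning, which is what keeps published examples in step with the tool's real behaviour.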

Trends in the field

There has been an unmistakable trend in the development and use of revision control tools over the past four decades, as people have become familiar with the capabilities of their tools and constrained by their limitations.

The first generation began by managing single files on individual computers. Although these tools represented a huge advance over ad-hoc manual revision control, their locking model and reliance on a single computer limited them to small, tightly-knit teams.

The second generation loosened these constraints by moving to network-centered architectures, and managing entire projects at a time. As projects grew larger, they ran into new problems. With clients needing to talk to servers very frequently, server scaling became an issue for large projects. An unreliable network connection could prevent remote users from being able to talk to the server at all. As open source projects started making read-only access available anonymously to anyone, people without commit privileges found that they could not use the tools to interact with a project in a natural way, as they could not record their changes.

The current generation of revision control tools is peer-to-peer in nature. All of these systems have dropped the dependency on a single central server, and allow people to distribute their revision control data to where it's actually needed. Collaboration over the Internet has moved from being constrained by technology to being a matter of choice and consensus. Modern tools can operate offline indefinitely and autonomously, with a network connection only needed when syncing changes with another repository.

A few of the advantages of distributed revision control

Even though distributed revision control tools have for several years been as robust and usable as their previous-generation counterparts, people using older tools have not yet necessarily woken up to their advantages. There are a number of ways in which distributed tools shine relative to centralised ones.

For an individual developer, distributed tools are almost always much faster than centralised tools. This is for a simple reason: a centralised tool needs to talk over the network for many common operations, because most metadata is stored in a single copy on the central server. A distributed tool stores all of its metadata locally. All else being equal, talking over the network adds overhead to a centralised tool. Don't underestimate the value of a snappy, responsive tool: you're going to spend a lot of time interacting with your revision control software.

Distributed tools are indifferent to the vagaries of your server infrastructure, again because they replicate metadata to so many locations. If you use a centralised system and your server catches fire, you'd better hope that your backup media are reliable, and that your last backup was recent and actually worked. With a distributed tool, you have many backups available on every contributor's computer.

The reliability of your network will affect distributed tools far less than it will centralised tools. You can't even use a centralised tool without a network connection, except for a few highly constrained commands.
With a distributed tool, if your network connection goes down while you're working, you may not even notice. The only thing you won't be able to do is talk to repositories on other computers, something that is relatively rare compared with local operations. If you have a far-flung team of collaborators, this may be significant.

Advantages for open source projects

If you take a shine to an open source project and decide that you would like to start hacking on it, and that project uses a distributed revision control tool, you are at once a peer with the people who consider themselves the "core" of that project. If they publish their repositories, you can immediately copy their project history, start making changes, and record your work, using the same tools in the same ways as insiders. By contrast, with a centralised tool, you must use the software in a "read only" mode unless someone grants you permission to commit changes to their central server. Until then, you won't be able to record changes, and your local modifications will be at risk of corruption any time you try to update your client's view of the repository.

The forking non-problem

It has been suggested that distributed revision control tools pose some sort of risk to open source projects because they make it easy to "fork" the development of a project. A fork happens when there are differences in opinion or attitude between groups of developers that cause them to decide that they can't work together any longer. Each side takes a more or less complete copy of the project's source code, and goes off in its own direction.

Sometimes the camps in a fork decide to reconcile their differences. With a centralised revision control system, the technical process of reconciliation is painful, and has to be performed largely by hand. You have to decide whose revision history is going to "win", and graft the other team's changes into the tree somehow. This usually loses some or all of one side's revision history.

What distributed tools do with respect to forking is they make forking the only way to develop a project. Every single change that you make is potentially a fork point. The great strength of this approach is that a distributed revision control tool has to be really good at merging forks, because forks are absolutely fundamental: they happen all the time.

If every piece of work that everybody does, all the time, is framed in terms of forking and merging, then what the open source world refers to as a "fork" becomes purely a social issue. If anything, distributed tools lower the likelihood of a fork:

- They eliminate the social distinction that centralised tools impose: that between insiders (people with commit access) and outsiders (people without).
- They make it easier to reconcile after a social fork, because all that's involved from the perspective of the revision control software is just another merge.

Some people resist distributed tools because they want to retain tight control over their projects, and they believe that centralised tools give them this control.
However, if you're of this belief, and you publish your CVS or Subversion repositories publicly, there are plenty of tools available that can pull out your entire project's history (albeit slowly) and recreate it somewhere that you don't control. So while your control in this case is illusory, you are forgoing the ability to fluidly collaborate with whatever people feel compelled to mirror and fork your history.

Advantages for commercial projects

Many commercial projects are undertaken by teams that are scattered across the globe. Contributors who are far from a central server will see slower command execution and perhaps less reliability. Commercial revision control systems attempt to ameliorate these problems with remote-site replication add-ons that are typically expensive to buy and cantankerous to administer. A distributed system doesn't suffer from these problems in the first place. Better yet, you can easily set up multiple authoritative servers, say one per site, so that there's no redundant communication between repositories over expensive long-haul network links.

Centralised revision control systems tend to have relatively low scalability. It's not unusual for an expensive centralised system to fall over under the combined load of just a few dozen concurrent users. Once again, the typical response tends to be an expensive and clunky replication facility.
Since the load on a central server (if you have one at all) is many times lower with a distributed tool (because all of the data is replicated everywhere), a single cheap server can handle the needs of a much larger team, and replication to balance load becomes a simple matter of scripting.

If you have an employee in the field, troubleshooting a problem at a customer's site, they'll benefit from distributed revision control. The tool will let them generate custom builds, try different fixes in isolation from each other, and search efficiently through history for the sources of bugs and regressions in the customer's environment, all without needing to connect to your company's network.

Why choose Mercurial?

Mercurial has a unique set of properties that make it a particularly good choice as a revision control system.

- It is easy to learn and use.
- It is lightweight.
- It scales excellently.
- It is easy to customise.

If you are at all familiar with revision control systems, you should be able to get up and running with Mercurial in less than five minutes. Even if not, it will take no more than a few minutes longer. Mercurial's command and feature sets are generally uniform and consistent, so you can keep track of a few general rules instead of a host of exceptions.

On a small project, you can start working with Mercurial in moments. Creating new changes and branches; transferring changes around (whether locally or over a network); and history and status operations are all fast. Mercurial attempts to stay nimble and largely out of your way by combining low cognitive overhead with blazingly fast operations.

The usefulness of Mercurial is not limited to small projects: it is used by projects with hundreds to thousands of contributors, each containing tens of thousands of files and hundreds of megabytes of source code.

If the core functionality of Mercurial is not enough for you, it's easy to build on. Mercurial is well suited to scripting tasks, and its clean internals and implementation in Python make it easy to add features in the form of extensions. There are a number of popular and useful extensions already available, ranging from helping to identify bugs to improving performance.

Mercurial compared with other tools

Before you read on, please understand that this section necessarily reflects my own experiences, interests, and (dare I say it) biases. I have used every one of the revision control tools listed below, in most cases for several years at a time.

Subversion

Subversion is a popular revision control tool, developed to replace CVS. It has a centralised client/server architecture.

Subversion and Mercurial have similarly named commands for performing the same operations, so if you're familiar with one, it is easy to learn to use the other. Both tools are portable to all popular operating systems.

Prior to version 1.5, Subversion had no useful support for merges. At the time of writing, its merge tracking capability is new, and known to be complicated and buggy.

Mercurial has a substantial performance advantage over Subversion on every revision control operation I have benchmarked.
I have measured its advantage as ranging from a factor of two to a factor of six when compared with Subversion 1.4.3's ra_local file store, which is the fastest access method available. In more realistic deployments involving a network-based store, Subversion will be at a substantially larger disadvantage. Because many Subversion commands must talk to the server and Subversion does not have useful replication facilities, server capacity and network bandwidth become bottlenecks for modestly large projects.

Additionally, Subversion incurs substantial storage overhead to avoid network transactions for a few common operations, such as finding modified files (status) and displaying modifications against the current revision (diff). As a result, a Subversion working copy is often the same size as, or larger than, a Mercurial repository and working directory, even though the Mercurial repository contains a complete history of the project.

Subversion is widely supported by third party tools. Mercurial currently lags considerably in this area. This gap is closing, however, and indeed some of Mercurial's GUI tools now outshine their Subversion equivalents. Like Mercurial, Subversion has an excellent user manual.

Because Subversion doesn't store revision history on the client, it is well suited to managing projects that deal with lots of large, opaque binary files. If you check in fifty revisions to an incompressible 10MB file, Subversion's client-side space usage stays constant. The space used by any distributed SCM will grow rapidly in proportion to the number of revisions, because the differences between each revision are large.
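
To make the arithmetic in the binary-file example concrete, here is a rough back-of-the-envelope model in Python. It rests on simplifying assumptions that are mine, not measurements: the centralised client is modelled as holding the working file plus one pristine copy, and the distributed repository is modelled as storing a delta as large as the whole file for every revision, with no compression. The function names are hypothetical.

```python
def centralised_client_mb(file_mb, revisions):
    """Client-side space for a centralised tool, in MB.

    Assumption: the client keeps the working file plus one pristine copy,
    so the total does not depend on the number of revisions.
    """
    return 2 * file_mb


def distributed_client_mb(file_mb, revisions):
    """Client-side space for a distributed tool, in MB.

    Assumption: the file is incompressible and every revision's delta is
    as large as the file itself, so history grows linearly.
    """
    working_copy = file_mb
    history = revisions * file_mb
    return working_copy + history


if __name__ == "__main__":
    # The scenario from the text: fifty revisions of a 10MB file.
    print("centralised:", centralised_client_mb(10, 50), "MB")
    print("distributed:", distributed_client_mb(10, 50), "MB")
```

Under these assumptions the centralised client stays at 20MB however many revisions are committed, while the distributed clone reaches 510MB after fifty revisions. Real figures will differ, but the linear growth is the point.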

In addition, it's often difficult or, more usually, impossible to merge different versions of a binary file. Subversion's ability to let a user lock a file, so that they temporarily have the exclusive right to commit changes to it, can be a significant advantage to a project where binary files are widely used.

Mercurial can import revision history from a Subversion repository. It can also export revision history to a Subversion repository. This makes it easy to "test the waters" and use Mercurial and Subversion in parallel before deciding to switch. History conversion is incremental, so you can perform an initial conversion, then small additional conversions afterwards to bring in new changes.

Git

Git is a distributed revision control tool that was developed for managing the Linux kernel source tree. Like Mercurial, its early design was somewhat influenced by Monotone.

Git has a very large command set, with version 1.5.0 providing 139 individual commands. It has something of a reputation for being difficult to learn. Compared to Git, Mercurial has a strong focus on simplicity.

In terms of performance, Git is extremely fast. In several cases, it is faster than Mercurial, at least on Linux, while Mercurial performs better on other operations. However, on Windows, the performance and general level of support that Git provides is, at the time of writing, far behind that of Mercurial.

While a Mercurial repository needs no maintenance, a Git repository requires frequent manual "repacks" of its metadata. Without these, performance degrades, while space usage grows rapidly.
A server that contains many Git repositories that are not rigorously and frequently repacked will become heavily disk-bound during backups, and there have been instances of daily backups taking far longer than 24 hours as a result. A freshly packed Git repository is slightly smaller than a Mercurial repository, but an unpacked repository is several orders of magnitude larger.

The core of Git is written in C. Many Git commands are implemented as shell or Perl scripts, and the quality of these scripts varies widely. I have encountered several instances where scripts charged along blindly in the presence of errors that should have been fatal.

Mercurial can import revision history from a Git repository.

CVS

CVS is probably the most widely used revision control tool in the world. Due to its age and internal untidiness, it has been only lightly maintained for many years.

It has a centralised client/server architecture. It does not group related file changes into atomic commits, making it easy for people to "break the build": one person can successfully commit part of a change and then be blocked by the need for a merge, causing other people to see only a portion of the work they intended to do. This also affects how you work with project history. If you want to see all of the modifications someone made as part of a task, you will need to manually inspect the descriptions and timestamps of the changes made to each file involved (if you even know what those files were).

CVS has a muddled notion of tags and branches that I will not attempt to even describe.
It does not support renaming of files or directories well, making it easy to corrupt a repository. It has almost no internal consistency checking capabilities, so it is usually not even possible to tell whether or how a repository is corrupt. I would not recommend CVS for any project, existing or new.

Mercurial can import CVS revision history. However, there are a few caveats that apply; these are true of every other revision control tool's CVS importer, too. Due to CVS's lack of atomic changes and unversioned filesystem hierarchy, it is not possible to reconstruct CVS history completely accurately; some guesswork is involved, and renames will usually not show up. Because a lot of advanced CVS administration has to be done by hand and is hence error-prone, it's common for CVS importers to run into multiple problems with corrupted repositories (completely bogus revision timestamps and files that have remained locked for over a decade are just two of the less interesting problems I can recall from personal experience).

Commercial tools

Perforce has a centralised client/server architecture, with no client-side caching of any data. Unlike modern revision control tools, Perforce requires that a user run a command to inform the server about every file they intend to edit.

The performance of Perforce is quite good for small teams, but it falls off rapidly as the number of users grows beyond a few dozen. Modestly large Perforce installations require the deployment of proxies to cope with the load their users generate.
bos@583: bos@583: bos@583: bos@583: bos@583: Choosing a revision control tool bos@583: bos@584: With the exception of CVS, all of the tools listed above bos@583: have unique strengths that suit them to particular styles of bos@583: work. There is no single revision control tool that is best bos@583: in all situations. bos@583: bos@584: As an example, Subversion is a good choice for working bos@583: with frequently edited binary files, due to its centralised bos@583: nature and support for file locking. bos@583: bos@584: I personally find Mercurial's properties of simplicity, bos@583: performance, and good merge support to be a compelling bos@583: combination that has served me well for several years. bos@583: bos@583: bos@583: bos@583: bos@583: bos@583: Switching from another tool to Mercurial bos@583: bos@584: Mercurial is bundled with an extension named convert, which can incrementally bos@583: import revision history from several other revision control bos@583: tools. By incremental, I mean that you can bos@583: convert all of a project's history to date in one go, then rerun bos@583: the conversion later to obtain new changes that happened after bos@583: the initial conversion. bos@583: bos@584: The revision control tools supported by convert are as follows: bos@583: bos@584: Subversion bos@584: CVS bos@584: Git bos@584: Darcs bos@584: bos@584: In addition, convert can bos@583: export changes from Mercurial to Subversion. This makes it bos@583: possible to try Subversion and Mercurial in parallel before bos@583: committing to a switchover, without risking the loss of any bos@583: work. bos@583: bos@584: The convert command bos@583: is easy to use. Simply point it at the path or URL of the bos@583: source repository, optionally give it the name of the bos@583: destination repository, and it will start working. After the bos@583: initial conversion, just run the same command again to import bos@583: new changes. 
bos@583: bos@583: bos@583: bos@583: A short history of revision control bos@583: bos@584: The best known of the old-time revision control tools is bos@583: SCCS (Source Code Control System), which Marc Rochkind wrote at bos@583: Bell Labs, in the early 1970s. SCCS operated on individual bos@583: files, and required every person working on a project to have bos@583: access to a shared workspace on a single system. Only one bos@583: person could modify a file at any time; arbitration for access bos@583: to files was via locks. It was common for people to lock files, bos@583: and later forget to unlock them, preventing anyone else from bos@583: modifying those files without the help of an bos@583: administrator. bos@583: bos@584: Walter Tichy developed a free alternative to SCCS in the bos@583: early 1980s; he called his program RCS (Revision Control System). bos@583: Like SCCS, RCS required developers to work in a single shared bos@583: workspace, and to lock files to prevent multiple people from bos@583: modifying them simultaneously. bos@583: bos@584: Later in the 1980s, Dick Grune used RCS as a building block bos@583: for a set of shell scripts he initially called cmt, but then bos@583: renamed to CVS (Concurrent Versions System). The big innovation bos@583: of CVS was that it let developers work simultaneously and bos@583: somewhat independently in their own personal workspaces. The bos@583: personal workspaces prevented developers from stepping on each bos@583: other's toes all the time, as was common with SCCS and RCS. Each bos@583: developer had a copy of every project file, and could modify bos@583: their copies independently. They had to merge their edits prior bos@583: to committing changes to the central repository. bos@583: bos@584: Brian Berliner took Grune's original scripts and rewrote bos@583: them in C, releasing in 1989 the code that has since developed bos@583: into the modern version of CVS. 
CVS subsequently acquired the ability to operate over a network connection, giving it a client/server architecture. CVS's architecture is centralised; only the server has a copy of the history of the project. Client workspaces just contain copies of recent versions of the project's files, and a little metadata to tell them where the server is. CVS has been enormously successful; it is probably the world's most widely used revision control system.

In the early 1990s, Sun Microsystems developed an early distributed revision control system, called TeamWare. A TeamWare workspace contains a complete copy of the project's history. TeamWare has no notion of a central repository. (CVS relied upon RCS for its history storage; TeamWare used SCCS.)

As the 1990s progressed, awareness grew of a number of problems with CVS. It records simultaneous changes to multiple files individually, instead of grouping them together as a single logically atomic operation. It does not manage its file hierarchy well; it is easy to make a mess of a repository by renaming files and directories. Worse, its source code is difficult to read and maintain, which made the "pain level" of fixing these architectural problems prohibitive.

In 2001, Jim Blandy and Karl Fogel, two developers who had worked on CVS, started a project to replace it with a tool that would have a better architecture and cleaner code. The result, Subversion, does not stray from CVS's centralised client/server model, but it adds multi-file atomic commits, better namespace management, and a number of other features that make it a generally better tool than CVS. Since its initial release, it has rapidly grown in popularity.

More or less simultaneously, Graydon Hoare began working on an ambitious distributed revision control system that he named Monotone. While Monotone addresses many of CVS's design flaws and has a peer-to-peer architecture, it goes beyond earlier (and subsequent) revision control tools in a number of innovative ways. It uses cryptographic hashes as identifiers, and has an integral notion of "trust" for code from different sources.

Mercurial began life in 2005. While a few aspects of its design are influenced by Monotone, Mercurial focuses on ease of use, high performance, and scalability to very large projects.

Colophon: this book is Free

This book is licensed under the Open Publication License, and is produced entirely using Free Software tools. It is typeset with DocBook XML. Illustrations are drawn and rendered with Inkscape.

The complete source code for this book is published as a Mercurial repository, at http://hg.serpentine.com/mercurial/book.