\chapter{Introduction}
\label{chap:intro}

\section{About revision control}

Revision control is the management of multiple versions of a piece of
information.  In its simplest form, it's a process that many people
perform by hand: every time you modify a file, save it under a new
name that contains a number, each one higher than the number of the
preceding version.

Manually managing multiple versions of even a single file is an
error-prone task, though, so software tools to help automate this
process have long been available.  The earliest automated revision
control tools were intended to help a single user to manage revisions
to a single file.  Over the past several decades, the scope of
revision control tools has expanded greatly; they now manage multiple
files, and help multiple people to work together.  The best modern
revision control tools will have no problem coping with thousands of
people working together on a single project, which might consist of
hundreds of thousands of files.

\subsection{Why use revision control?}

There are a number of reasons why you or your team might want to use
an automated revision control tool for a project.
\begin{itemize}
\item The software gives you a unified way of working with your
  project's files.
\item When you're working with other people, it makes it easier for
  you to collaborate.  For example, when people more or less
  simultaneously make potentially incompatible changes, the software
  will help you to identify and resolve those conflicts.
\item It will track the history of your project.
  For every change,
  you'll have a log of \emph{who} made it; \emph{why} they made it;
  \emph{when} they made it; and \emph{what} the change was.
\item It can help you to recover from mistakes.  If you make a change
  that later turns out to be in error, you can revert to an earlier
  version of one or more files.  In fact, a \emph{really} good
  revision control tool will even help you to efficiently figure out
  exactly when a problem was introduced (see
  section~\ref{sec:undo:bisect} for details).
\item It will help you to work simultaneously on, and manage the drift
  between, multiple versions of your project.
\end{itemize}
Most of these reasons are equally valid---at least in theory---whether
you're working on a project by yourself, or with a hundred other
people.

A key question about the practicality of revision control at these two
different scales (``lone hacker'' and ``huge team'') is how its
\emph{benefits} compare to its \emph{costs}.  A revision control tool
that's difficult to understand or use is going to impose a high cost.

For example, a five-hundred-person project is likely to collapse under
its own weight almost immediately without a revision control tool and
process.  In this case, the cost of using revision control might
hardly seem worth considering, since \emph{without} it, failure is
almost guaranteed.

On the other hand, a one-person ``quick hack'' might seem like a poor
place to use a revision control tool, because surely the cost of using
one must be close to the overall cost of the project.  Right?

Mercurial uniquely supports \emph{both} of these scales of
development.
You can learn the basics in just a few minutes, and due
to its low overhead, you can apply revision control to the smallest of
projects with ease.  Its simplicity means you won't have a lot of
abstruse concepts or command sequences competing for mental space with
whatever you're \emph{really} trying to do.  At the same time,
Mercurial's high performance and peer-to-peer nature let you scale
painlessly to handle large projects.

\subsection{The many names of revision control}

Revision control is a diverse field, so much so that it doesn't
actually have a single name or acronym.  Here are a few of the more
common names and acronyms you'll encounter:
\begin{itemize}
\item Configuration management (CM)
\item Revision control (RCS)
\item Software configuration management (SCM)
\item Source code management
\item Source control
\item Version control (VCS)
\end{itemize}
Some people claim that these terms actually have different meanings,
but in practice they overlap so much that there's no agreed or even
useful way to tease them apart.

\section{A short history and hierarchy of revision control}

The best known of the old-time revision control tools is SCCS (Source
Code Control System), which Marc Rochkind wrote at Bell Labs in the
early 1970s.  SCCS operated on individual files, and required every
person working on a project to have access to a shared workspace on a
single system.  Only one person could modify a file at any time;
arbitration for access to files was via locks.  It was common for
people to lock files, and later forget to unlock them, preventing
anyone else from modifying those files without the help of an
administrator.

Walter Tichy developed a free alternative to SCCS in the early 1980s;
he called his program RCS (Revision Control System).  Like SCCS, RCS
required developers to work in a single shared workspace, and to lock
files to prevent multiple people from modifying them simultaneously.

Later in the 1980s, Dick Grune used RCS as a building block for a set
of shell scripts he initially called cmt, but then renamed to CVS
(Concurrent Versions System).  The big innovation of CVS was that it
let developers work simultaneously and somewhat independently in their
own personal workspaces.  The personal workspaces prevented developers
from stepping on each other's toes all the time, as was common with
SCCS and RCS.  Each developer had a copy of every project file, and
could modify their copies independently.  They had to merge their
edits prior to committing changes to the central repository.

Brian Berliner took Grune's original scripts and rewrote them in~C,
releasing in 1989 the code that has since developed into the modern
version of CVS.  CVS subsequently acquired the ability to operate over
a network connection, giving it a client/server architecture.  CVS's
architecture is centralised; only the server has a copy of the history
of the project.  Client workspaces just contain copies of recent
versions of the project's files, and a little metadata to tell them
where the server is.  CVS has been enormously successful; it is
probably the world's most widely used revision control system.

In the early 1990s, Sun Microsystems developed an early distributed
revision control system, called TeamWare.  A TeamWare workspace
contains a complete copy of the project's history.  TeamWare has no
notion of a central repository.
(CVS relied upon RCS for its history
storage; TeamWare used SCCS.)

As the 1990s progressed, awareness grew of a number of problems with
CVS.  It records simultaneous changes to multiple files individually,
instead of grouping them together as a single logically atomic
operation.  It does not manage its file hierarchy well; it is easy to
make a mess of a repository by renaming files and directories.  Worse,
its source code is difficult to read and maintain, which made the
``pain level'' of fixing these architectural problems prohibitive.

In 2001, Jim Blandy and Karl Fogel, two developers who had worked on
CVS, started a project to replace it with a tool that would have a
better architecture and cleaner code.  The result, Subversion, does
not stray from CVS's centralised client/server model, but it adds
multi-file atomic commits, better namespace management, and a number
of other features that make it a generally better tool than CVS.
Since its initial release, it has rapidly grown in popularity.

More or less simultaneously, Graydon Hoare began working on an
ambitious distributed revision control system that he named Monotone.
While Monotone addresses many of CVS's design flaws and has a
peer-to-peer architecture, it goes beyond earlier (and subsequent)
revision control tools in a number of innovative ways.  It uses
cryptographic hashes as identifiers, and has an integral notion of
``trust'' for code from different sources.

Mercurial began life in 2005.  While a few aspects of its design are
influenced by Monotone, Mercurial focuses on ease of use, high
performance, and scalability to very large projects.

\subsection{On a single system}

\subsection{Network-based, but centralised}

\subsection{Fully distributed}


\section{Advantages of distributed revision control}

\subsection{For open source projects}

\subsection{For commercial projects}

\subsection{Myths about distributed revision control}

\section{Why choose Mercurial?}


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: