hgbook

annotate en/intro.tex @ 218:75fd236d736b

History of SCM tools.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu May 10 17:21:09 2007 -0700 (2007-05-10)
rev   line source
bos@16 1 \chapter{Introduction}
bos@16 2 \label{chap:intro}
bos@16 3
bos@217 4 \section{About revision control}
bos@155 5
bos@217 6 Revision control is the management of multiple versions of a piece of
bos@217 7 information. In its simplest form, it's a process that many people
bos@217 8 perform by hand: every time you modify a file, save it under a new
bos@217 9 name that contains a number, each one higher than the number of the
bos@217 10 preceding version.
bos@217 11
bos@217 12 Manually managing multiple versions of even a single file is an
bos@217 13 error-prone task, though, so software tools to help automate this
bos@217 14 process have long been available. The earliest automated revision
bos@217 15 control tools were intended to help a single user to manage revisions
bos@217 16 to a single file. Over the past several decades, the scope of
bos@217 17 revision control tools has expanded greatly; they now manage multiple
bos@217 18 files, and help multiple people to work together. The best modern
bos@217 19 revision control tools will have no problem coping with thousands of
bos@217 20 people working together on a single project, which might consist of
bos@217 21 hundreds of thousands of files.
bos@217 22
bos@217 23 \subsection{Why use revision control?}
bos@217 24
bos@217 25 There are a number of reasons why you or your team might want to use
bos@217 26 an automated revision control tool for a project.
bos@217 27 \begin{itemize}
bos@217 28 \item The software gives you a unified way of working with your
bos@218 29 project's files.
bos@218 30 \item When you're working with other people, it makes it easier for
bos@218 31 you to collaborate. For example, when people more or less
bos@218 32 simultaneously make potentially incompatible changes, the software
bos@218 33 will help you to identify and resolve those conflicts.
bos@217 34 \item It will track the history of your project. For every change,
bos@217 35 you'll have a log of \emph{who} made it; \emph{why} they made it;
bos@217 36 \emph{when} they made it; and \emph{what} the change was.
bos@217 37 \item It can help you to recover from mistakes. If you make a change
bos@217 38 that later turns out to be in error, you can revert to an earlier
bos@217 39 version of one or more files. In fact, a \emph{really} good
bos@217 40 revision control tool will even help you to efficiently figure out
bos@217 41 exactly when a problem was introduced (see
bos@217 42 section~\ref{sec:undo:bisect} for details).
bos@218 43 \item It will help you to work simultaneously on, and manage the drift
bos@218 44 between, multiple versions of your project.
bos@217 45 \end{itemize}
bos@218 46 Most of these reasons are equally valid---at least in theory---whether
bos@218 47 you're working on a project by yourself, or with a hundred other
bos@218 48 people.
bos@218 49
bos@218 50 A key question about the practicality of revision control at these two
bos@218 51 different scales (``lone hacker'' and ``huge team'') is how its
bos@218 52 \emph{benefits} compare to its \emph{costs}. A revision control tool
bos@218 53 that's difficult to understand or use is going to impose a high cost.
bos@218 54
bos@218 55 For example, a five-hundred-person project is likely to collapse under
bos@218 56 its own weight almost immediately without a revision control tool and
bos@218 57 process. In this case, the cost of using revision control might
bos@218 58 hardly seem worth considering, since \emph{without} it, failure is
bos@218 59 almost guaranteed.
bos@218 60
bos@218 61 On the other hand, a one-person ``quick hack'' might seem like a poor
bos@218 62 place to use a revision control tool, because surely the cost of using
bos@218 63 one must be close to the overall cost of the project. Right?
bos@218 64
bos@218 65 Mercurial uniquely supports \emph{both} of these scales of
bos@218 66 development. You can learn the basics in just a few minutes, and due
bos@218 67 to its low overhead, you can apply revision control to the smallest of
bos@218 68 projects with ease. Its simplicity means you won't have a lot of
bos@218 69 abstruse concepts or command sequences competing for mental space with
bos@218 70 whatever you're \emph{really} trying to do. At the same time,
bos@218 71 Mercurial's high performance and peer-to-peer nature let you scale
bos@218 72 painlessly to handle large projects.
bos@217 73
bos@217 74 \subsection{The many names of revision control}
bos@217 75
bos@217 76 Revision control is a diverse field, so much so that it doesn't
bos@217 77 actually have a single name or acronym. Here are a few of the more
bos@217 78 common names and acronyms you'll encounter:
bos@217 79 \begin{itemize}
bos@217 80 \item Configuration management (CM)
bos@217 81 \item Revision control (RCS)
bos@217 82 \item Software configuration management (SCM)
bos@218 83 \item Source code management
bos@218 84 \item Source control
bos@217 85 \item Version control (VCS)
bos@217 86 \end{itemize}
bos@217 87 Some people claim that these terms actually have different meanings,
bos@217 88 but in practice they overlap so much that there's no agreed or even
bos@217 89 useful way to tease them apart.
bos@155 90
bos@218 91 \section{A short history and hierarchy of revision control}
bos@155 92
bos@218 93 The best known of the old-time revision control tools is SCCS (Source
bos@218 94 Code Control System), which Marc Rochkind wrote at Bell Labs in the
bos@218 95 early 1970s. SCCS operated on individual files, and required every
bos@218 96 person working on a project to have access to a shared workspace on a
bos@218 97 single system. Only one person could modify a file at any time;
bos@218 98 arbitration for access to files was via locks. It was common for
bos@218 99 people to lock files, and later forget to unlock them, preventing
bos@218 100 anyone else from modifying those files without the help of an
bos@218 101 administrator.
bos@218 102
bos@218 103 Walter Tichy developed a free alternative to SCCS in the early 1980s;
bos@218 104 he called his program RCS (Revision Control System). Like SCCS, RCS
bos@218 105 required developers to work in a single shared workspace, and to lock
bos@218 106 files to prevent multiple people from modifying them simultaneously.
bos@218 107
bos@218 108 Later in the 1980s, Dick Grune used RCS as a building block for a set
bos@218 109 of shell scripts he initially called cmt, but then renamed to CVS
bos@218 110 (Concurrent Versions System). The big innovation of CVS was that it
bos@218 111 let developers work simultaneously and somewhat independently in their
bos@218 112 own personal workspaces. The personal workspaces prevented developers
bos@218 113 from stepping on each other's toes all the time, as was common with
bos@218 114 SCCS and RCS. Each developer had a copy of every project file, and
bos@218 115 could modify their copies independently. They had to merge their
bos@218 116 edits prior to committing changes to the central repository.
bos@218 117
bos@218 118 Brian Berliner took Grune's original scripts and rewrote them in~C,
bos@218 119 releasing in 1989 the code that has since developed into the modern
bos@218 120 version of CVS. CVS subsequently acquired the ability to operate over
bos@218 121 a network connection, giving it a client/server architecture. CVS's
bos@218 122 architecture is centralised; only the server has a copy of the history
bos@218 123 of the project. Client workspaces just contain copies of recent
bos@218 124 versions of the project's files, and a little metadata to tell them
bos@218 125 where the server is. CVS has been enormously successful; it is
bos@218 126 probably the world's most widely used revision control system.
bos@218 127
bos@218 128 In the early 1990s, Sun Microsystems developed an early distributed
bos@218 129 revision control system, called TeamWare. A TeamWare workspace
bos@218 130 contains a complete copy of the project's history. TeamWare has no
bos@218 131 notion of a central repository. (CVS relied upon RCS for its history
bos@218 132 storage; TeamWare used SCCS.)
bos@218 133
bos@218 134 As the 1990s progressed, awareness grew of a number of problems with
bos@218 135 CVS. It records simultaneous changes to multiple files individually,
bos@218 136 instead of grouping them together as a single logically atomic
bos@218 137 operation. It does not manage its file hierarchy well; it is easy to
bos@218 138 make a mess of a repository by renaming files and directories. Worse,
bos@218 139 its source code is difficult to read and maintain, which makes the
bos@218 140 ``pain level'' of fixing these architectural problems prohibitive.
bos@218 141
bos@218 142 In 2000, Jim Blandy and Karl Fogel, two developers who had worked on
bos@218 143 CVS, started a project to replace it with a tool that would have a
bos@218 144 better architecture and cleaner code. The result, Subversion, does
bos@218 145 not stray from CVS's centralised client/server model, but it adds
bos@218 146 multi-file atomic commits, better namespace management, and a number
bos@218 147 of other features that make it a generally better tool than CVS.
bos@218 148 Since its initial release, it has rapidly grown in popularity.
bos@218 149
bos@218 150 More or less simultaneously, Graydon Hoare began working on an
bos@218 151 ambitious distributed revision control system that he named Monotone.
bos@218 152 Monotone addresses many of CVS's design flaws and has a
bos@218 153 peer-to-peer architecture, but it also goes beyond earlier (and subsequent)
bos@218 154 revision control tools in a number of innovative ways. It uses
bos@218 155 cryptographic hashes as identifiers, and has an integral notion of
bos@218 156 ``trust'' for code from different sources.
bos@218 157
bos@218 158 Mercurial began life in 2005. While a few aspects of its design are
bos@218 159 influenced by Monotone, Mercurial focuses on ease of use, high
bos@218 160 performance, and scalability to very large projects.
bos@155 161
bos@155 162 \subsection{On a single system}
bos@155 163
bos@155 164 \subsection{Network-based, but centralised}
bos@155 165
bos@155 166 \subsection{Fully distributed}
bos@155 167
bos@155 168
bos@155 169 \section{Advantages of distributed revision control}
bos@155 170
bos@155 171 \subsection{For open source projects}
bos@155 172
bos@155 173 \subsection{For commercial projects}
bos@155 174
bos@155 175 \subsection{Myths about distributed revision control}
bos@155 176
bos@155 177 \section{Why choose Mercurial?}
bos@155 178
bos@16 179
bos@16 180 %%% Local Variables:
bos@16 181 %%% mode: latex
bos@16 182 %%% TeX-master: "00book"
bos@16 183 %%% End: