hgbook

view en/intro.tex @ 218:75fd236d736b

History of SCM tools.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu May 10 17:21:09 2007 -0700 (2007-05-10)
1 \chapter{Introduction}
2 \label{chap:intro}
4 \section{About revision control}
6 Revision control is the management of multiple versions of a piece of
7 information. In its simplest form, it's a process that many people
8 perform by hand: every time you modify a file, save it under a new
9 name that contains a number, each one higher than the number of the
10 preceding version.
12 Manually managing multiple versions of even a single file is an
13 error-prone task, though, so software tools to help automate this
14 process have long been available. The earliest automated revision
15 control tools were intended to help a single user to manage revisions
16 to a single file. Over the past several decades, the scope of
17 revision control tools has expanded greatly; they now manage multiple
18 files, and help multiple people to work together. The best modern
19 revision control tools will have no problem coping with thousands of
20 people working together on a single project, which might consist of
21 hundreds of thousands of files.
23 \subsection{Why use revision control?}
25 There are a number of reasons why you or your team might want to use
26 an automated revision control tool for a project.
27 \begin{itemize}
28 \item The software gives you a unified way of working with your
29 project's files.
30 \item When you're working with other people, it makes it easier for
31 you to collaborate. For example, when people more or less
32 simultaneously make potentially incompatible changes, the software
33 will help you to identify and resolve those conflicts.
34 \item It will track the history of your project. For every change,
35 you'll have a log of \emph{who} made it; \emph{why} they made it;
36 \emph{when} they made it; and \emph{what} the change was.
37 \item It can help you to recover from mistakes. If you make a change
38 that later turns out to be in error, you can revert to an earlier
39 version of one or more files. In fact, a \emph{really} good
40 revision control tool will even help you to efficiently figure out
41 exactly when a problem was introduced (see
42 section~\ref{sec:undo:bisect} for details).
43 \item It will help you to work simultaneously on, and manage the drift
44 between, multiple versions of your project.
45 \end{itemize}
46 Most of these reasons are equally valid---at least in theory---whether
47 you're working on a project by yourself, or with a hundred other
48 people.
50 A key question about the practicality of revision control at these two
51 different scales (``lone hacker'' and ``huge team'') is how its
52 \emph{benefits} compare to its \emph{costs}. A revision control tool
53 that's difficult to understand or use is going to impose a high cost.
55 For example, a five-hundred-person project is likely to collapse under
56 its own weight almost immediately without a revision control tool and
57 process. In this case, the cost of using revision control might
58 hardly seem worth considering, since \emph{without} it, failure is
59 almost guaranteed.
61 On the other hand, a one-person ``quick hack'' might seem like a poor
62 place to use a revision control tool, because surely the cost of using
63 one must be close to the overall cost of the project. Right?
65 Mercurial uniquely supports \emph{both} of these scales of
66 development. You can learn the basics in just a few minutes, and due
67 to its low overhead, you can apply revision control to the smallest of
68 projects with ease. Its simplicity means you won't have a lot of
69 abstruse concepts or command sequences competing for mental space with
70 whatever you're \emph{really} trying to do. At the same time,
71 Mercurial's high performance and peer-to-peer nature let you scale
72 painlessly to handle large projects.
74 \subsection{The many names of revision control}
76 Revision control is a diverse field, so much so that it doesn't
77 actually have a single name or acronym. Here are a few of the more
78 common names and acronyms you'll encounter:
79 \begin{itemize}
80 \item Configuration management (CM)
81 \item Revision control (RCS)
82 \item Software configuration management (SCM)
83 \item Source code management
84 \item Source control
85 \item Version control (VCS)
86 \end{itemize}
87 Some people claim that these terms actually have different meanings,
88 but in practice they overlap so much that there's no agreed or even
89 useful way to tease them apart.
91 \section{A short history and hierarchy of revision control}
93 The best known of the old-time revision control tools is SCCS (Source
94 Code Control System), which Marc Rochkind wrote at Bell Labs in the
95 early 1970s. SCCS operated on individual files, and required every
96 person working on a project to have access to a shared workspace on a
97 single system. Only one person could modify a file at any time;
98 arbitration for access to files was via locks. It was common for
99 people to lock files, and later forget to unlock them, preventing
100 anyone else from modifying those files without the help of an
101 administrator.
103 Walter Tichy developed a free alternative to SCCS in the early 1980s;
104 he called his program RCS (Revision Control System). Like SCCS, RCS
105 required developers to work in a single shared workspace, and to lock
106 files to prevent multiple people from modifying them simultaneously.
108 Later in the 1980s, Dick Grune used RCS as a building block for a set
109 of shell scripts he initially called cmt, but then renamed to CVS
110 (Concurrent Versions System). The big innovation of CVS was that it
111 let developers work simultaneously and somewhat independently in their
112 own personal workspaces. The personal workspaces prevented developers
113 from stepping on each other's toes all the time, as was common with
114 SCCS and RCS. Each developer had a copy of every project file, and
115 could modify their copies independently. They had to merge their
116 edits prior to committing changes to the central repository.
118 Brian Berliner took Grune's original scripts and rewrote them in~C,
119 releasing in 1989 the code that has since developed into the modern
120 version of CVS. CVS subsequently acquired the ability to operate over
121 a network connection, giving it a client/server architecture. CVS's
122 architecture is centralised; only the server has a copy of the history
123 of the project. Client workspaces just contain copies of recent
124 versions of the project's files, and a little metadata to tell them
125 where the server is. CVS has been enormously successful; it is
126 probably the world's most widely used revision control system.
128 In the early 1990s, Sun Microsystems developed an early distributed
129 revision control system, called TeamWare. A TeamWare workspace
130 contains a complete copy of the project's history. TeamWare has no
131 notion of a central repository. (CVS relied upon RCS for its history
132 storage; TeamWare used SCCS.)
134 As the 1990s progressed, awareness grew of a number of problems with
135 CVS. It records simultaneous changes to multiple files individually,
136 instead of grouping them together as a single logically atomic
137 operation. It does not manage its file hierarchy well; it is easy to
138 make a mess of a repository by renaming files and directories. Worse,
139 its source code is difficult to read and maintain, which makes the
140 ``pain level'' of fixing these architectural problems prohibitive.
142 In 2000, Jim Blandy and Karl Fogel, two developers who had worked on
143 CVS, started a project to replace it with a tool that would have a
144 better architecture and cleaner code. The result, Subversion, does
145 not stray from CVS's centralised client/server model, but it adds
146 multi-file atomic commits, better namespace management, and a number
147 of other features that make it a generally better tool than CVS.
148 Since its initial release, it has rapidly grown in popularity.
150 More or less simultaneously, Graydon Hoare began working on an
151 ambitious distributed revision control system that he named Monotone.
152 Monotone addresses many of CVS's design flaws and has a
153 peer-to-peer architecture; it also goes beyond earlier (and subsequent)
154 revision control tools in a number of innovative ways. It uses
155 cryptographic hashes as identifiers, and has an integral notion of
156 ``trust'' for code from different sources.
158 Mercurial began life in 2005. While a few aspects of its design are
159 influenced by Monotone, Mercurial focuses on ease of use, high
160 performance, and scalability to very large projects.
162 \subsection{On a single system}
164 \subsection{Network-based, but centralised}
166 \subsection{Fully distributed}
169 \section{Advantages of distributed revision control}
171 \subsection{For open source projects}
173 \subsection{For commercial projects}
175 \subsection{Myths about distributed revision control}
177 \section{Why choose Mercurial?}
180 %%% Local Variables:
181 %%% mode: latex
182 %%% TeX-master: "00book"
183 %%% End: