hgbook

view en/ch00-preface.xml @ 609:c44d5854620b

Fix up chapter 1.
author Bryan O'Sullivan <bos@serpentine.com>
date Tue Mar 31 22:38:30 2009 -0700 (2009-03-31)
parents 4ce9d0754af3
children 3b33dd6aba87
line source
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <preface id="chap:preface">
4 <?dbhtml filename="preface.html"?>
5 <title>Preface</title>
7 <sect1>
8 <title>Why revision control? Why Mercurial?</title>
10 <para id="x_6d">Revision control is the process of managing multiple
11 versions of a piece of information. In its simplest form, this
12 is something that many people do by hand: every time you modify
13 a file, save it under a new name that contains a number, each
14 one higher than the number of the preceding version.</para>
16 <para id="x_6e">Manually managing multiple versions of even a single file is
17 an error-prone task, though, so software tools to help automate
18 this process have long been available. The earliest automated
19 revision control tools were intended to help a single user to
20 manage revisions of a single file. Over the past few decades,
21 the scope of revision control tools has expanded greatly; they
22 now manage multiple files, and help multiple people to work
23 together. The best modern revision control tools have no
24 problem coping with thousands of people working together on
25 projects that consist of hundreds of thousands of files.</para>
27 <para id="x_6f">The arrival of distributed revision control is relatively
28 recent, and so far this new field has grown due to people's
29 willingness to explore ill-charted territory.</para>
31 <para id="x_70">I am writing a book about distributed revision control
32 because I believe that it is an important subject that deserves
33 a field guide. I chose to write about Mercurial because it is
34 the easiest tool to learn the terrain with, and yet it scales to
35 the demands of real, challenging environments where many other
36 revision control tools buckle.</para>
38 <sect2>
39 <title>Why use revision control?</title>
41 <para id="x_71">There are a number of reasons why you or your team might
42 want to use an automated revision control tool for a
43 project.</para>
45 <itemizedlist>
46 <listitem><para id="x_72">It will track the history and evolution of
47 your project, so you don't have to. For every change,
48 you'll have a log of <emphasis>who</emphasis> made it;
49 <emphasis>why</emphasis> they made it;
50 <emphasis>when</emphasis> they made it; and
51 <emphasis>what</emphasis> the change
52 was.</para></listitem>
53 <listitem><para id="x_73">When you're working with other people,
54 revision control software makes it easier for you to
55 collaborate. For example, when people more or less
56 simultaneously make potentially incompatible changes, the
57 software will help you to identify and resolve those
58 conflicts.</para></listitem>
59 <listitem><para id="x_74">It can help you to recover from mistakes. If
60 you make a change that later turns out to be in error, you
61 can revert to an earlier version of one or more files. In
62 fact, a <emphasis>really</emphasis> good revision control
63 tool will even help you to efficiently figure out exactly
64 when a problem was introduced (see <xref
65 linkend="sec:undo:bisect"/> for details).</para></listitem>
66 <listitem><para id="x_75">It will help you to work simultaneously on,
67 and manage the drift between, multiple versions of your
68 project.</para></listitem>
69 </itemizedlist>
71 <para id="x_76">Most of these reasons are equally
72 valid&emdash;at least in theory&emdash;whether you're working
73 on a project by yourself, or with a hundred other
74 people.</para>
76 <para id="x_77">A key question about the practicality of revision control
77 at these two different scales (<quote>lone hacker</quote> and
78 <quote>huge team</quote>) is how its
79 <emphasis>benefits</emphasis> compare to its
80 <emphasis>costs</emphasis>. A revision control tool that's
81 difficult to understand or use is going to impose a high
82 cost.</para>
84 <para id="x_78">A five-hundred-person project is likely to collapse under
85 its own weight almost immediately without a revision control
86 tool and process. In this case, the cost of using revision
87 control might hardly seem worth considering, since
88 <emphasis>without</emphasis> it, failure is almost
89 guaranteed.</para>
91 <para id="x_79">On the other hand, a one-person <quote>quick hack</quote>
92 might seem like a poor place to use a revision control tool,
93 because surely the cost of using one must be close to the
94 overall cost of the project. Right?</para>
96 <para id="x_7a">Mercurial uniquely supports <emphasis>both</emphasis> of
97 these scales of development. You can learn the basics in just
98 a few minutes, and due to its low overhead, you can apply
99 revision control to the smallest of projects with ease. Its
100 simplicity means you won't have a lot of abstruse concepts or
101 command sequences competing for mental space with whatever
102 you're <emphasis>really</emphasis> trying to do. At the same
103 time, Mercurial's high performance and peer-to-peer nature let
104 you scale painlessly to handle large projects.</para>
106 <para id="x_7b">No revision control tool can rescue a poorly run project,
107 but a good choice of tools can make a huge difference to the
108 fluidity with which you can work on a project.</para>
110 </sect2>
112 <sect2>
113 <title>The many names of revision control</title>
115 <para id="x_7c">Revision control is a diverse field, so much so that it is
116 referred to by many names and acronyms. Here are a few of the
117 more common variations you'll encounter:</para>
118 <itemizedlist>
119 <listitem><para id="x_7d">Revision control (RCS)</para></listitem>
120 <listitem><para id="x_7e">Software configuration management (SCM), or
121 configuration management</para></listitem>
122 <listitem><para id="x_7f">Source code management</para></listitem>
123 <listitem><para id="x_80">Source code control, or source
124 control</para></listitem>
125 <listitem><para id="x_81">Version control
126 (VCS)</para></listitem></itemizedlist>
127 <para id="x_82">Some people claim that these terms actually have different
128 meanings, but in practice they overlap so much that there's no
129 agreed or even useful way to tease them apart.</para>
131 </sect2>
132 </sect1>
134 <sect1>
135 <title>This book is a work in progress</title>
137 <para id="x_83">I am releasing this book while I am still writing it, in the
138 hope that it will prove useful to others. I am writing under an
139 open license in the hope that you, my readers, will contribute
140 feedback and perhaps content of your own.</para>
142 </sect1>
143 <sect1>
144 <title>About the examples in this book</title>
146 <para id="x_84">This book takes an unusual approach to code samples. Every
147 example is <quote>live</quote>&emdash;each one is actually the result
148 of a shell script that executes the Mercurial commands you see.
149 Every time an image of the book is built from its sources, all
150 the example scripts are automatically run, and their current
151 results compared against their expected results.</para>
153 <para id="x_85">The advantage of this approach is that the examples are
154 always accurate; they describe <emphasis>exactly</emphasis> the
155 behaviour of the version of Mercurial that's mentioned at the
156 front of the book. If I update the version of Mercurial that
157 I'm documenting, and the output of some command changes, the
158 build fails.</para>
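<para>To make the idea concrete, the check performed at build time can
  be pictured roughly as follows. This is only a sketch, not the book's
  actual build machinery: the function name
  <literal>check_example</literal> and the convention of keeping each
  example's expected output in a separate file are assumptions made
  for illustration.</para>

<programlisting>
```shell
# Hypothetical helper, not part of the book's real build scripts.
# Usage: check_example SCRIPT EXPECTED_OUTPUT_FILE
# Runs SCRIPT with sh and compares what it prints on standard output
# with the recorded output. Returns 0 on a match and 1 otherwise.
check_example() {
    script="$1"
    expected_file="$2"
    actual="$(sh "$script")"
    expected="$(cat "$expected_file")"
    if [ "$actual" = "$expected" ]; then
        return 0
    fi
    echo "output of $script no longer matches $expected_file"
    return 1
}
```
</programlisting>

<para>A build could call such a function once per example script and
  stop at the first non-zero status, which is how a change in
  Mercurial's output would turn into a failed build.</para>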
160 <para id="x_86">There is a small disadvantage to this approach, which is
161 that the dates and times you'll see in examples tend to be
162 <quote>squashed</quote> together in a way that they wouldn't be
163 if the same commands were being typed by a human. Where a human
164 can issue no more than one command every few seconds, with any
165 resulting timestamps correspondingly spread out, my automated
166 example scripts run many commands in one second.</para>
168 <para id="x_87">As an instance of this, several consecutive commits in an
169 example can show up as having occurred during the same second.
170 You can see this occur in the <literal
171 role="hg-ext">bisect</literal> example in <xref
172 linkend="sec:undo:bisect"/>.</para>
174 <para id="x_88">So when you're reading examples, don't place too much weight
175 on the dates or times you see in the output of commands. But
176 <emphasis>do</emphasis> be confident that the behaviour you're
177 seeing is consistent and reproducible.</para>
179 </sect1>
181 <sect1>
182 <title>Trends in the field</title>
184 <para id="x_89">There has been an unmistakable trend in the development and
185 use of revision control tools over the past four decades, as
186 people have become familiar with the capabilities of their tools
187 and constrained by their limitations.</para>
189 <para id="x_8a">The first generation began by managing single files on
190 individual computers. Although these tools represented a huge
191 advance over ad-hoc manual revision control, their locking model
192 and reliance on a single computer limited them to small,
193 tightly-knit teams.</para>
195 <para id="x_8b">The second generation loosened these constraints by moving
196 to network-centered architectures, and managing entire projects
197 at a time. As projects grew larger, they ran into new problems.
198 With clients needing to talk to servers very frequently, server
199 scaling became an issue for large projects. An unreliable
200 network connection could prevent remote users from being able to
201 talk to the server at all. As open source projects started
202 making read-only access available anonymously to anyone, people
203 without commit privileges found that they could not use the
204 tools to interact with a project in a natural way, as they could
205 not record their changes.</para>
207 <para id="x_8c">The current generation of revision control tools is
208 peer-to-peer in nature. All of these systems have dropped the
209 dependency on a single central server, and allow people to
210 distribute their revision control data to where it's actually
211 needed. Collaboration over the Internet has moved from being
212 constrained by technology to a matter of choice and consensus.
213 Modern tools can operate offline indefinitely and autonomously,
214 with a network connection only needed when syncing changes with
215 another repository.</para>
217 </sect1>
218 <sect1>
219 <title>A few of the advantages of distributed revision
220 control</title>
222 <para id="x_8d">Even though distributed revision control tools have for
223 several years been as robust and usable as their
224 previous-generation counterparts, people using older tools have
225 not necessarily woken up yet to what the newer tools offer. There are a
226 number of ways in which distributed tools shine relative to
227 centralised ones.</para>
229 <para id="x_8e">For an individual developer, distributed tools are almost
230 always much faster than centralised tools. This is for a simple
231 reason: a centralised tool needs to talk over the network for
232 many common operations, because most metadata is stored in a
233 single copy on the central server. A distributed tool stores
234 all of its metadata locally. All else being equal, talking over
235 the network adds overhead to a centralised tool. Don't
236 underestimate the value of a snappy, responsive tool: you're
237 going to spend a lot of time interacting with your revision
238 control software.</para>
240 <para id="x_8f">Distributed tools are indifferent to the vagaries of your
241 server infrastructure, again because they replicate metadata to
242 so many locations. If you use a centralised system and your
243 server catches fire, you'd better hope that your backup media
244 are reliable, and that your last backup was recent and actually
245 worked. With a distributed tool, you have many backups
246 available on every contributor's computer.</para>
248 <para id="x_90">The reliability of your network will affect distributed
249 tools far less than it will centralised tools. You can't even
250 use a centralised tool without a network connection, except for
251 a few highly constrained commands. With a distributed tool, if
252 your network connection goes down while you're working, you may
253 not even notice. The only thing you won't be able to do is talk
254 to repositories on other computers, something that is relatively
255 rare compared with local operations. If you have a far-flung
256 team of collaborators, this may be significant.</para>
258 <sect2>
259 <title>Advantages for open source projects</title>
261 <para id="x_91">If you take a shine to an open source project and decide
262 that you would like to start hacking on it, and that project
263 uses a distributed revision control tool, you are at once a
264 peer with the people who consider themselves the
265 <quote>core</quote> of that project. If they publish their
266 repositories, you can immediately copy their project history,
267 start making changes, and record your work, using the same
268 tools in the same ways as insiders. By contrast, with a
269 centralised tool, you must use the software in a <quote>read
270 only</quote> mode unless someone grants you permission to
271 commit changes to their central server. Until then, you won't
272 be able to record changes, and your local modifications will
273 be at risk of corruption any time you try to update your
274 client's view of the repository.</para>
276 <sect3>
277 <title>The forking non-problem</title>
279 <para id="x_92">It has been suggested that distributed revision control
280 tools pose some sort of risk to open source projects because
281 they make it easy to <quote>fork</quote> the development of
282 a project. A fork happens when there are differences in
283 opinion or attitude between groups of developers that cause
284 them to decide that they can't work together any longer.
285 Each side takes a more or less complete copy of the
286 project's source code, and goes off in its own
287 direction.</para>
289 <para id="x_93">Sometimes the camps in a fork decide to reconcile their
290 differences. With a centralised revision control system, the
291 <emphasis>technical</emphasis> process of reconciliation is
292 painful, and has to be performed largely by hand. You have
293 to decide whose revision history is going to
294 <quote>win</quote>, and graft the other team's changes into
295 the tree somehow. This usually loses some or all of one
296 side's revision history.</para>
298 <para id="x_94">What distributed tools actually do is
299 make forking the <emphasis>only</emphasis> way to
300 develop a project. Every single change that you make is
301 potentially a fork point. The great strength of this
302 approach is that a distributed revision control tool has to
303 be really good at <emphasis>merging</emphasis> forks,
304 because forks are absolutely fundamental: they happen all
305 the time.</para>
307 <para id="x_95">If every piece of work that everybody does, all the
308 time, is framed in terms of forking and merging, then what
309 the open source world refers to as a <quote>fork</quote>
310 becomes <emphasis>purely</emphasis> a social issue. If
311 anything, distributed tools <emphasis>lower</emphasis> the
312 likelihood of a fork:</para>
313 <itemizedlist>
314 <listitem><para id="x_96">They eliminate the social distinction that
315 centralised tools impose: that between insiders (people
316 with commit access) and outsiders (people
317 without).</para></listitem>
318 <listitem><para id="x_97">They make it easier to reconcile after a
319 social fork, because all that's involved from the
320 perspective of the revision control software is just
321 another merge.</para></listitem></itemizedlist>
323 <para id="x_98">Some people resist distributed tools because they want
324 to retain tight control over their projects, and they
325 believe that centralised tools give them this control.
326 However, if you hold this belief, and you publish your CVS
327 or Subversion repositories, there are plenty of
328 tools available that can pull out your entire project's
329 history (albeit slowly) and recreate it somewhere that you
330 don't control. So while your control in this case is
331 illusory, you are forgoing the ability to fluidly
332 collaborate with whatever people feel compelled to mirror
333 and fork your history.</para>
335 </sect3>
336 </sect2>
337 <sect2>
338 <title>Advantages for commercial projects</title>
340 <para id="x_99">Many commercial projects are undertaken by teams that are
341 scattered across the globe. Contributors who are far from a
342 central server will see slower command execution and perhaps
343 less reliability. Commercial revision control systems attempt
344 to ameliorate these problems with remote-site replication
345 add-ons that are typically expensive to buy and cantankerous
346 to administer. A distributed system doesn't suffer from these
347 problems in the first place. Better yet, you can easily set
348 up multiple authoritative servers, say one per site, so that
349 there's no redundant communication between repositories over
350 expensive long-haul network links.</para>
352 <para id="x_9a">Centralised revision control systems tend to have
353 relatively low scalability. It's not unusual for an expensive
354 centralised system to fall over under the combined load of
355 just a few dozen concurrent users. Once again, the typical
356 response tends to be an expensive and clunky replication
357 facility. Since the load on a central server&emdash;if you have
358 one at all&emdash;is many times lower with a distributed tool
359 (because all of the data is replicated everywhere), a single
360 cheap server can handle the needs of a much larger team, and
361 replication to balance load becomes a simple matter of
362 scripting.</para>
364 <para id="x_9b">If you have an employee in the field, troubleshooting a
365 problem at a customer's site, they'll benefit from distributed
366 revision control. The tool will let them generate custom
367 builds, try different fixes in isolation from each other, and
368 search efficiently through history for the sources of bugs and
369 regressions in the customer's environment, all without needing
370 to connect to your company's network.</para>
372 </sect2>
373 </sect1>
374 <sect1>
375 <title>Why choose Mercurial?</title>
377 <para id="x_9c">Mercurial has a unique set of properties that make it a
378 particularly good choice as a revision control system.</para>
379 <itemizedlist>
380 <listitem><para id="x_9d">It is easy to learn and use.</para></listitem>
381 <listitem><para id="x_9e">It is lightweight.</para></listitem>
382 <listitem><para id="x_9f">It scales excellently.</para></listitem>
383 <listitem><para id="x_a0">It is easy to
384 customise.</para></listitem></itemizedlist>
386 <para id="x_a1">If you are at all familiar with revision control systems,
387 you should be able to get up and running with Mercurial in less
388 than five minutes. Even if not, it will take no more than a few
389 minutes longer. Mercurial's command and feature sets are
390 generally uniform and consistent, so you can keep track of a few
391 general rules instead of a host of exceptions.</para>
393 <para id="x_a2">On a small project, you can start working with Mercurial in
394 moments. Creating new changes and branches; transferring changes
395 around (whether locally or over a network); and history and
396 status operations are all fast. Mercurial attempts to stay
397 nimble and largely out of your way by combining low cognitive
398 overhead with blazingly fast operations.</para>
400 <para id="x_a3">The usefulness of Mercurial is not limited to small
401 projects: it is used by projects with hundreds to thousands of
402 contributors, each project containing tens of thousands of files and
403 hundreds of megabytes of source code.</para>
405 <para id="x_a4">If the core functionality of Mercurial is not enough for
406 you, it's easy to build on. Mercurial is well suited to
407 scripting tasks, and its clean internals and implementation in
408 Python make it easy to add features in the form of extensions.
409 There are a number of popular and useful extensions already
410 available, ranging from helping to identify bugs to improving
411 performance.</para>
413 </sect1>
414 <sect1>
415 <title>Mercurial compared with other tools</title>
417 <para id="x_a5">Before you read on, please understand that this section
418 necessarily reflects my own experiences, interests, and (dare I
419 say it) biases. I have used every one of the revision control
420 tools listed below, in most cases for several years at a
421 time.</para>
424 <sect2>
425 <title>Subversion</title>
427 <para id="x_a6">Subversion is a popular revision control tool, developed
428 to replace CVS. It has a centralised client/server
429 architecture.</para>
431 <para id="x_a7">Subversion and Mercurial have similarly named commands for
432 performing the same operations, so if you're familiar with
433 one, it is easy to learn to use the other. Both tools are
434 portable to all popular operating systems.</para>
436 <para id="x_a8">Prior to version 1.5, Subversion had no useful support for
437 merges. At the time of writing, its merge tracking capability
438 is new, and known to be <ulink
439 url="http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword">complicated
440 and buggy</ulink>.</para>
442 <para id="x_a9">Mercurial has a substantial performance advantage over
443 Subversion on every revision control operation I have
444 benchmarked. I have measured its advantage as ranging from a
445 factor of two to a factor of six when compared with Subversion
446 1.4.3's <emphasis>ra_local</emphasis> file store, which is the
447 fastest access method available. In more realistic
448 deployments involving a network-based store, Subversion will
449 be at a substantially larger disadvantage. Because many
450 Subversion commands must talk to the server and Subversion
451 does not have useful replication facilities, server capacity
452 and network bandwidth become bottlenecks for modestly large
453 projects.</para>
455 <para id="x_aa">Additionally, Subversion incurs substantial storage
456 overhead to avoid network transactions for a few common
457 operations, such as finding modified files
458 (<literal>status</literal>) and displaying modifications
459 against the current revision (<literal>diff</literal>). As a
460 result, a Subversion working copy is often the same size as,
461 or larger than, a Mercurial repository and working directory,
462 even though the Mercurial repository contains a complete
463 history of the project.</para>
465 <para id="x_ab">Subversion is widely supported by third party tools.
466 Mercurial currently lags considerably in this area. This gap
467 is closing, however, and indeed some of Mercurial's GUI tools
468 now outshine their Subversion equivalents. Like Mercurial,
469 Subversion has an excellent user manual.</para>
471 <para id="x_ac">Because Subversion doesn't store revision history on the
472 client, it is well suited to managing projects that deal with
473 lots of large, opaque binary files. If you check in fifty
474 revisions to an incompressible 10MB file, Subversion's
475 client-side space usage stays constant. The space used by any
476 distributed SCM will grow rapidly in proportion to the number
477 of revisions, because the differences between successive revisions
478 are large.</para>
480 <para id="x_ad">In addition, it's often difficult or, more usually,
481 impossible to merge different versions of a binary file.
482 Subversion's ability to let a user lock a file, so that they
483 temporarily have the exclusive right to commit changes to it,
484 can be a significant advantage to a project where binary files
485 are widely used.</para>
487 <para id="x_ae">Mercurial can import revision history from a Subversion
488 repository. It can also export revision history to a
489 Subversion repository. This makes it easy to <quote>test the
490 waters</quote> and use Mercurial and Subversion in parallel
491 before deciding to switch. History conversion is incremental,
492 so you can perform an initial conversion, then small
493 additional conversions afterwards to bring in new
494 changes.</para>
497 </sect2>
498 <sect2>
499 <title>Git</title>
501 <para id="x_af">Git is a distributed revision control tool that was
502 developed for managing the Linux kernel source tree. Like
503 Mercurial, its early design was somewhat influenced by
504 Monotone.</para>
506 <para id="x_b0">Git has a very large command set, with version 1.5.0
507 providing 139 individual commands. It has something of a
508 reputation for being difficult to learn. Compared to Git,
509 Mercurial has a strong focus on simplicity.</para>
511 <para id="x_b1">In terms of performance, Git is extremely fast. In
512 several cases, it is faster than Mercurial, at least on Linux,
513 while Mercurial performs better on other operations. However,
514 on Windows, the performance and general level of support that
515 Git provides are, at the time of writing, far behind those of
516 Mercurial.</para>
518 <para id="x_b2">While a Mercurial repository needs no maintenance, a Git
519 repository requires frequent manual <quote>repacks</quote> of
520 its metadata. Without these, performance degrades, while
521 space usage grows rapidly. A server that contains many Git
522 repositories that are not rigorously and frequently repacked
523 will become heavily disk-bound during backups, and there have
524 been instances of daily backups taking far longer than 24
525 hours as a result. A freshly packed Git repository is
526 slightly smaller than a Mercurial repository, but an unpacked
527 repository is several orders of magnitude larger.</para>
529 <para id="x_b3">The core of Git is written in C. Many Git commands are
530 implemented as shell or Perl scripts, and the quality of these
531 scripts varies widely. I have encountered several instances
532 where scripts charged along blindly in the presence of errors
533 that should have been fatal.</para>
535 <para id="x_b4">Mercurial can import revision history from a Git
536 repository.</para>
539 </sect2>
540 <sect2>
541 <title>CVS</title>
543 <para id="x_b5">CVS is probably the most widely used revision control tool
544 in the world. Due to its age and internal untidiness, it has
545 been only lightly maintained for many years.</para>
547 <para id="x_b6">It has a centralised client/server architecture. It does
548 not group related file changes into atomic commits, making it
549 easy for people to <quote>break the build</quote>: one person
550 can successfully commit part of a change and then be blocked
551 by the need for a merge, causing other people to see only a
552 portion of the work that person intended to commit. This also affects
553 how you work with project history. If you want to see all of
554 the modifications someone made as part of a task, you will
555 need to manually inspect the descriptions and timestamps of
556 the changes made to each file involved (if you even know what
557 those files were).</para>
559 <para id="x_b7">CVS has a muddled notion of tags and branches that I will
560 not attempt to even describe. It does not support renaming of
561 files or directories well, making it easy to corrupt a
562 repository. It has almost no internal consistency checking
563 capabilities, so it is usually not even possible to tell
564 whether or how a repository is corrupt. I would not recommend
565 CVS for any project, existing or new.</para>
567 <para id="x_b8">Mercurial can import CVS revision history. However, there
568 are a few caveats that apply; these are true of every other
569 revision control tool's CVS importer, too. Due to CVS's lack
570 of atomic changes and unversioned filesystem hierarchy, it is
571 not possible to reconstruct CVS history completely accurately;
572 some guesswork is involved, and renames will usually not show
573 up. Because a lot of advanced CVS administration has to be
574 done by hand and is hence error-prone, it's common for CVS
575 importers to run into multiple problems with corrupted
576 repositories (completely bogus revision timestamps and files
577 that have remained locked for over a decade are just two of
578 the less interesting problems I can recall from personal
579 experience).</para>
581 <para id="x_b9">Mercurial can import revision history from a CVS
582 repository.</para>
585 </sect2>
586 <sect2>
587 <title>Commercial tools</title>
589 <para id="x_ba">Perforce has a centralised client/server architecture,
590 with no client-side caching of any data. Unlike modern
591 revision control tools, Perforce requires that a user run a
592 command to inform the server about every file they intend to
593 edit.</para>
595 <para id="x_bb">The performance of Perforce is quite good for small teams,
596 but it falls off rapidly as the number of users grows beyond a
597 few dozen. Modestly large Perforce installations require the
598 deployment of proxies to cope with the load their users
599 generate.</para>
602 </sect2>
603 <sect2>
604 <title>Choosing a revision control tool</title>
606 <para id="x_bc">With the exception of CVS, all of the tools listed above
607 have unique strengths that suit them to particular styles of
608 work. There is no single revision control tool that is best
609 in all situations.</para>
611 <para id="x_bd">As an example, Subversion is a good choice for working
612 with frequently edited binary files, due to its centralised
613 nature and support for file locking.</para>
615 <para id="x_be">I personally find Mercurial's properties of simplicity,
616 performance, and good merge support to be a compelling
617 combination that has served me well for several years.</para>
620 </sect2>
621 </sect1>
622 <sect1>
623 <title>Switching from another tool to Mercurial</title>
625 <para id="x_bf">Mercurial is bundled with an extension named <literal
626 role="hg-ext">convert</literal>, which can incrementally
627 import revision history from several other revision control
628 tools. By <quote>incremental</quote>, I mean that you can
629 convert all of a project's history to date in one go, then rerun
630 the conversion later to obtain new changes that happened after
631 the initial conversion.</para>
633 <para id="x_c0">The revision control tools supported by <literal
634 role="hg-ext">convert</literal> are as follows:</para>
635 <itemizedlist>
636 <listitem><para id="x_c1">Subversion</para></listitem>
637 <listitem><para id="x_c2">CVS</para></listitem>
638 <listitem><para id="x_c3">Git</para></listitem>
639 <listitem><para id="x_c4">Darcs</para></listitem></itemizedlist>
641 <para id="x_c5">In addition, <literal role="hg-ext">convert</literal> can
642 export changes from Mercurial to Subversion. This makes it
643 possible to try Subversion and Mercurial in parallel before
644 committing to a switchover, without risking the loss of any
645 work.</para>
647 <para id="x_c6">The <command role="hg-ext-convert">convert</command> command
648 is easy to use. Simply point it at the path or URL of the
649 source repository, optionally give it the name of the
650 destination repository, and it will start working. After the
651 initial conversion, just run the same command again to import
652 new changes.</para>
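<para>As a rough illustration of that workflow, the sketch below wraps
  the <command role="hg-ext-convert">convert</command> command in a
  small shell function. The function name
  <literal>convert_history</literal> and the <literal>HG</literal>
  variable are hypothetical conveniences, and the sketch assumes that
  the extension has not been enabled in your configuration, so it
  switches it on for a single invocation with
  <literal>--config</literal>.</para>

<programlisting>
```shell
# Sketch only. HG defaults to the hg found on PATH; the function name
# convert_history is a hypothetical wrapper, not a Mercurial command.
HG="${HG:-hg}"

# Usage: convert_history SOURCE [DEST]
# The convert extension is enabled for this one invocation through
# --config, so no changes to your hgrc are assumed.
convert_history() {
    if [ "$#" -ge 2 ]; then
        "$HG" --config extensions.convert= convert "$1" "$2"
    else
        "$HG" --config extensions.convert= convert "$1"
    fi
}
```
</programlisting>

<para>With placeholder paths, <literal>convert_history
  /path/to/source-repo project-hg</literal> would perform the initial
  import, and running exactly the same line again later would bring in
  only the changes made since.</para>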
653 </sect1>
655 <sect1>
656 <title>A short history of revision control</title>
658 <para id="x_c7">The best known of the old-time revision control tools is
659 SCCS (Source Code Control System), which Marc Rochkind wrote at
660 Bell Labs in the early 1970s. SCCS operated on individual
661 files, and required every person working on a project to have
662 access to a shared workspace on a single system. Only one
663 person could modify a file at any time; arbitration for access
664 to files was via locks. It was common for people to lock files,
665 and later forget to unlock them, preventing anyone else from
666 modifying those files without the help of an
667 administrator.</para>
669 <para id="x_c8">Walter Tichy developed a free alternative to SCCS in the
670 early 1980s; he called his program RCS (Revision Control System).
671 Like SCCS, RCS required developers to work in a single shared
672 workspace, and to lock files to prevent multiple people from
673 modifying them simultaneously.</para>
675 <para id="x_c9">Later in the 1980s, Dick Grune used RCS as a building block
676 for a set of shell scripts he initially called cmt, but then
677 renamed to CVS (Concurrent Versions System). The big innovation
678 of CVS was that it let developers work simultaneously and
679 somewhat independently in their own personal workspaces. The
680 personal workspaces prevented developers from stepping on each
681 other's toes all the time, as was common with SCCS and RCS. Each
682 developer had a copy of every project file, and could modify
683 their copies independently. They had to merge their edits before
684 to committing changes to the central repository.</para>
686 <para id="x_ca">Brian Berliner took Grune's original scripts and rewrote
687 them in C, releasing in 1989 the code that has since developed
688 into the modern version of CVS. CVS subsequently acquired the
689 ability to operate over a network connection, giving it a
690 client/server architecture. CVS's architecture is centralised;
691 only the server has a copy of the history of the project. Client
692 workspaces just contain copies of recent versions of the
693 project's files, and a little metadata to tell them where the
694 server is. CVS has been enormously successful; it is probably
695 the world's most widely used revision control system.</para>
697 <para id="x_cb">In the early 1990s, Sun Microsystems developed an early
698 distributed revision control system, called TeamWare. A
699 TeamWare workspace contains a complete copy of the project's
700 history. TeamWare has no notion of a central repository. (CVS
701 relied upon RCS for its history storage; TeamWare used
702 SCCS.)</para>
704 <para id="x_cc">As the 1990s progressed, awareness grew of a number of
705 problems with CVS. It records simultaneous changes to multiple
706 files individually, instead of grouping them together as a
707 single logically atomic operation. It does not manage its file
708 hierarchy well; it is easy to make a mess of a repository by
709 renaming files and directories. Worse, its source code is
710 difficult to read and maintain, which makes the <quote>pain
711 level</quote> of fixing these architectural problems
712 prohibitive.</para>
714 <para id="x_cd">In 2000, Jim Blandy and Karl Fogel, two developers who had
715 worked on CVS, started a project to replace it with a tool that
716 would have a better architecture and cleaner code. The result,
717 Subversion, does not stray from CVS's centralised client/server
718 model, but it adds multi-file atomic commits, better namespace
719 management, and a number of other features that make it a
720 generally better tool than CVS. Since its initial release, it
721 has rapidly grown in popularity.</para>
723 <para id="x_ce">More or less simultaneously, Graydon Hoare began working on
724 an ambitious distributed revision control system that he named
725 Monotone. Monotone addresses many of CVS's design flaws and has
726 a peer-to-peer architecture, and it also goes beyond earlier (and
727 subsequent) revision control tools in a number of innovative
728 ways. It uses cryptographic hashes as identifiers, and has an
729 integral notion of <quote>trust</quote> for code from different
730 sources.</para>
732 <para id="x_cf">Mercurial began life in 2005. While a few aspects of its
733 design are influenced by Monotone, Mercurial focuses on ease of
734 use, high performance, and scalability to very large
735 projects.</para>
737 </sect1>
739 <sect1>
740 <title>Colophon&emdash;this book is Free</title>
742 <para id="x_d0">This book is licensed under the Open Publication License,
743 and is produced entirely using Free Software tools. It is
744 typeset with DocBook XML. Illustrations are drawn and rendered with
745 <ulink url="http://www.inkscape.org/">Inkscape</ulink>.</para>
747 <para id="x_d1">The complete source code for this book is published as a
748 Mercurial repository, at <ulink
749 url="http://hg.serpentine.com/mercurial/book">http://hg.serpentine.com/mercurial/book</ulink>.</para>
751 </sect1>
752 </preface>
753 <!--
754 local variables:
755 sgml-parent-document: ("00book.xml" "book" "preface")
756 end:
757 -->