hgbook

view en/ch00-preface.xml @ 584:c838b3975bc6

Add IDs to paragraphs.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu Mar 19 21:18:52 2009 -0700 (2009-03-19)
parents 28b5a5befb08
children 34cb220eb717
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <preface id="chap:preface">
4 <title>Preface</title>
6 <sect1>
7 <title>Why revision control? Why Mercurial?</title>
9 <para id="x_6d">Revision control is the process of managing multiple
10 versions of a piece of information. In its simplest form, this
11 is something that many people do by hand: every time you modify
12 a file, save it under a new name that contains a number, each
13 one higher than the number of the preceding version.</para>
15 <para id="x_6e">Manually managing multiple versions of even a single file is
16 an error-prone task, though, so software tools to help automate
17 this process have long been available. The earliest automated
18 revision control tools were intended to help a single user to
19 manage revisions of a single file. Over the past few decades,
20 the scope of revision control tools has expanded greatly; they
21 now manage multiple files, and help multiple people to work
22 together. The best modern revision control tools have no
23 problem coping with thousands of people working together on
24 projects that consist of hundreds of thousands of files.</para>
26 <para id="x_6f">The arrival of distributed revision control is relatively
27 recent, and so far this new field has grown due to people's
28 willingness to explore ill-charted territory.</para>
30 <para id="x_70">I am writing a book about distributed revision control
31 because I believe that it is an important subject that deserves
32 a field guide. I chose to write about Mercurial because it is
33 the easiest tool to learn the terrain with, and yet it scales to
34 the demands of real, challenging environments where many other
35 revision control tools buckle.</para>
37 <sect2>
38 <title>Why use revision control?</title>
40 <para id="x_71">There are a number of reasons why you or your team might
41 want to use an automated revision control tool for a
42 project.</para>
44 <itemizedlist>
45 <listitem><para id="x_72">It will track the history and evolution of
46 your project, so you don't have to. For every change,
47 you'll have a log of <emphasis>who</emphasis> made it;
48 <emphasis>why</emphasis> they made it;
49 <emphasis>when</emphasis> they made it; and
50 <emphasis>what</emphasis> the change
51 was.</para></listitem>
52 <listitem><para id="x_73">When you're working with other people,
53 revision control software makes it easier for you to
54 collaborate. For example, when people more or less
55 simultaneously make potentially incompatible changes, the
56 software will help you to identify and resolve those
57 conflicts.</para></listitem>
58 <listitem><para id="x_74">It can help you to recover from mistakes. If
59 you make a change that later turns out to be in error, you
60 can revert to an earlier version of one or more files. In
61 fact, a <emphasis>really</emphasis> good revision control
62 tool will even help you to efficiently figure out exactly
63 when a problem was introduced (see section <xref
64 linkend="sec:undo:bisect"/> for details).</para></listitem>
65 <listitem><para id="x_75">It will help you to work simultaneously on,
66 and manage the drift between, multiple versions of your
67 project.</para></listitem>
68 </itemizedlist>
70 <para id="x_76">Most of these reasons are equally valid---at least in
71 theory---whether you're working on a project by yourself, or
72 with a hundred other people.</para>
74 <para id="x_77">A key question about the practicality of revision control
75 at these two different scales (<quote>lone hacker</quote> and
76 <quote>huge team</quote>) is how its
77 <emphasis>benefits</emphasis> compare to its
78 <emphasis>costs</emphasis>. A revision control tool that's
79 difficult to understand or use is going to impose a high
80 cost.</para>
82 <para id="x_78">A five-hundred-person project is likely to collapse under
83 its own weight almost immediately without a revision control
84 tool and process. In this case, the cost of using revision
85 control might hardly seem worth considering, since
86 <emphasis>without</emphasis> it, failure is almost
87 guaranteed.</para>
89 <para id="x_79">On the other hand, a one-person <quote>quick hack</quote>
90 might seem like a poor place to use a revision control tool,
91 because surely the cost of using one must be close to the
92 overall cost of the project. Right?</para>
94 <para id="x_7a">Mercurial uniquely supports <emphasis>both</emphasis> of
95 these scales of development. You can learn the basics in just
96 a few minutes, and due to its low overhead, you can apply
97 revision control to the smallest of projects with ease. Its
98 simplicity means you won't have a lot of abstruse concepts or
99 command sequences competing for mental space with whatever
100 you're <emphasis>really</emphasis> trying to do. At the same
101 time, Mercurial's high performance and peer-to-peer nature let
102 you scale painlessly to handle large projects.</para>
104 <para id="x_7b">No revision control tool can rescue a poorly run project,
105 but a good choice of tools can make a huge difference to the
106 fluidity with which you can work on a project.</para>
108 </sect2>
110 <sect2>
111 <title>The many names of revision control</title>
113 <para id="x_7c">Revision control is a diverse field, so much so that it is
114 referred to by many names and acronyms. Here are a few of the
115 more common variations you'll encounter:</para>
116 <itemizedlist>
117 <listitem><para id="x_7d">Revision control (RCS)</para></listitem>
118 <listitem><para id="x_7e">Software configuration management (SCM), or
119 configuration management</para></listitem>
120 <listitem><para id="x_7f">Source code management</para></listitem>
121 <listitem><para id="x_80">Source code control, or source
122 control</para></listitem>
123 <listitem><para id="x_81">Version control
124 (VCS)</para></listitem></itemizedlist>
125 <para id="x_82">Some people claim that these terms actually have different
126 meanings, but in practice they overlap so much that there's no
127 agreed or even useful way to tease them apart.</para>
129 </sect2>
130 </sect1>
132 <sect1>
133 <title>This book is a work in progress</title>
135 <para id="x_83">I am releasing this book while I am still writing it, in the
136 hope that it will prove useful to others. I am writing under an
137 open license in the hope that you, my readers, will contribute
138 feedback and perhaps content of your own.</para>
140 </sect1>
141 <sect1>
142 <title>About the examples in this book</title>
144 <para id="x_84">This book takes an unusual approach to code samples. Every
145 example is <quote>live</quote>---each one is actually the result
146 of a shell script that executes the Mercurial commands you see.
147 Every time an image of the book is built from its sources, all
148 the example scripts are automatically run, and their current
149 results compared against their expected results.</para>
151 <para id="x_85">The advantage of this approach is that the examples are
152 always accurate; they describe <emphasis>exactly</emphasis> the
153 behaviour of the version of Mercurial that's mentioned at the
154 front of the book. If I update the version of Mercurial that
155 I'm documenting, and the output of some command changes, the
156 build fails.</para>
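<para>The check described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not the book's
actual build machinery: the names <literal>check_example</literal>
and <literal>check_all</literal> are hypothetical, and a stand-in
command takes the place of a real Mercurial invocation so that the
sketch runs without Mercurial installed.</para>

```python
import subprocess
import sys


def check_example(command, expected_output):
    """Run one example command and compare its output with the recorded output.

    Returns True only when the captured standard output matches exactly.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout == expected_output


def check_all(examples):
    """Return the names of the examples whose output no longer matches."""
    return [
        name
        for name, (command, expected) in examples.items()
        if not check_example(command, expected)
    ]


# Hypothetical example table: each entry maps a name to a command and the
# output recorded for it. A real table would list Mercurial commands.
examples = {
    "hello": ([sys.executable, "-c", "print('hello')"], "hello\n"),
}

failed = check_all(examples)
# A build script would treat a non-empty list as a failure.
print("mismatched examples:", failed)
```

<para>A build script built along these lines would stop as soon as
the list of mismatched examples is non-empty, which is how a change
in a command's output would surface as a failed build.</para>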
158 <para id="x_86">There is a small disadvantage to this approach, which is
159 that the dates and times you'll see in examples tend to be
160 <quote>squashed</quote> together in a way that they wouldn't be
161 if the same commands were being typed by a human. Where a human
162 can issue no more than one command every few seconds, with any
163 resulting timestamps correspondingly spread out, my automated
164 example scripts run many commands in one second.</para>
166 <para id="x_87">As an instance of this, several consecutive commits in an
167 example can show up as having occurred during the same second.
168 You can see this occur in the <literal
169 role="hg-ext">bisect</literal> example in section <xref
170 linkend="sec:undo:bisect"/>, for instance.</para>
172 <para id="x_88">So when you're reading examples, don't place too much weight
173 on the dates or times you see in the output of commands. But
174 <emphasis>do</emphasis> be confident that the behaviour you're
175 seeing is consistent and reproducible.</para>
177 </sect1>
179 <sect1>
180 <title>Trends in the field</title>
182 <para id="x_89">There has been an unmistakable trend in the development and
183 use of revision control tools over the past four decades, as
184 people have become familiar with the capabilities of their tools
185 and constrained by their limitations.</para>
187 <para id="x_8a">The first generation began by managing single files on
188 individual computers. Although these tools represented a huge
189 advance over ad-hoc manual revision control, their locking model
190 and reliance on a single computer limited them to small,
191 tightly-knit teams.</para>
193 <para id="x_8b">The second generation loosened these constraints by moving
194 to network-centered architectures, and managing entire projects
195 at a time. As projects grew larger, they ran into new problems.
196 With clients needing to talk to servers very frequently, server
197 scaling became an issue for large projects. An unreliable
198 network connection could prevent remote users from being able to
199 talk to the server at all. As open source projects started
200 making read-only access available anonymously to anyone, people
201 without commit privileges found that they could not use the
202 tools to interact with a project in a natural way, as they could
203 not record their changes.</para>
205 <para id="x_8c">The current generation of revision control tools is
206 peer-to-peer in nature. All of these systems have dropped the
207 dependency on a single central server, and allow people to
208 distribute their revision control data to where it's actually
209 needed. Collaboration over the Internet has moved from being
210 constrained by technology to being a matter of choice and consensus.
211 Modern tools can operate offline indefinitely and autonomously,
212 with a network connection only needed when syncing changes with
213 another repository.</para>
215 </sect1>
216 <sect1>
217 <title>A few of the advantages of distributed revision
218 control</title>
220 <para id="x_8d">Even though distributed revision control tools have for
221 several years been as robust and usable as their
222 previous-generation counterparts, people using older tools have
223 not necessarily woken up to their advantages yet. There are a
224 number of ways in which distributed tools shine relative to
225 centralised ones.</para>
227 <para id="x_8e">For an individual developer, distributed tools are almost
228 always much faster than centralised tools. This is for a simple
229 reason: a centralised tool needs to talk over the network for
230 many common operations, because most metadata is stored in a
231 single copy on the central server. A distributed tool stores
232 all of its metadata locally. All else being equal, talking over
233 the network adds overhead to a centralised tool. Don't
234 underestimate the value of a snappy, responsive tool: you're
235 going to spend a lot of time interacting with your revision
236 control software.</para>
238 <para id="x_8f">Distributed tools are indifferent to the vagaries of your
239 server infrastructure, again because they replicate metadata to
240 so many locations. If you use a centralised system and your
241 server catches fire, you'd better hope that your backup media
242 are reliable, and that your last backup was recent and actually
243 worked. With a distributed tool, you have many backups
244 available on every contributor's computer.</para>
246 <para id="x_90">The reliability of your network will affect distributed
247 tools far less than it will centralised tools. You can't even
248 use a centralised tool without a network connection, except for
249 a few highly constrained commands. With a distributed tool, if
250 your network connection goes down while you're working, you may
251 not even notice. The only thing you won't be able to do is talk
252 to repositories on other computers, something that is relatively
253 rare compared with local operations. If you have a far-flung
254 team of collaborators, this may be significant.</para>
256 <sect2>
257 <title>Advantages for open source projects</title>
259 <para id="x_91">If you take a shine to an open source project and decide
260 that you would like to start hacking on it, and that project
261 uses a distributed revision control tool, you are at once a
262 peer with the people who consider themselves the
263 <quote>core</quote> of that project. If they publish their
264 repositories, you can immediately copy their project history,
265 start making changes, and record your work, using the same
266 tools in the same ways as insiders. By contrast, with a
267 centralised tool, you must use the software in a <quote>read
268 only</quote> mode unless someone grants you permission to
269 commit changes to their central server. Until then, you won't
270 be able to record changes, and your local modifications will
271 be at risk of corruption any time you try to update your
272 client's view of the repository.</para>
274 <sect3>
275 <title>The forking non-problem</title>
277 <para id="x_92">It has been suggested that distributed revision control
278 tools pose some sort of risk to open source projects because
279 they make it easy to <quote>fork</quote> the development of
280 a project. A fork happens when there are differences in
281 opinion or attitude between groups of developers that cause
282 them to decide that they can't work together any longer.
283 Each side takes a more or less complete copy of the
284 project's source code, and goes off in its own
285 direction.</para>
287 <para id="x_93">Sometimes the camps in a fork decide to reconcile their
288 differences. With a centralised revision control system, the
289 <emphasis>technical</emphasis> process of reconciliation is
290 painful, and has to be performed largely by hand. You have
291 to decide whose revision history is going to
292 <quote>win</quote>, and graft the other team's changes into
293 the tree somehow. This usually loses some or all of one
294 side's revision history.</para>
296 <para id="x_94">Distributed tools take a different approach to forking:
297 they make it the <emphasis>only</emphasis> way to
298 develop a project. Every single change that you make is
299 potentially a fork point. The great strength of this
300 approach is that a distributed revision control tool has to
301 be really good at <emphasis>merging</emphasis> forks,
302 because forks are absolutely fundamental: they happen all
303 the time.</para>
305 <para id="x_95">If every piece of work that everybody does, all the
306 time, is framed in terms of forking and merging, then what
307 the open source world refers to as a <quote>fork</quote>
308 becomes <emphasis>purely</emphasis> a social issue. If
309 anything, distributed tools <emphasis>lower</emphasis> the
310 likelihood of a fork:</para>
311 <itemizedlist>
312 <listitem><para id="x_96">They eliminate the social distinction that
313 centralised tools impose: that between insiders (people
314 with commit access) and outsiders (people
315 without).</para></listitem>
316 <listitem><para id="x_97">They make it easier to reconcile after a
317 social fork, because all that's involved from the
318 perspective of the revision control software is just
319 another merge.</para></listitem></itemizedlist>
321 <para id="x_98">Some people resist distributed tools because they want
322 to retain tight control over their projects, and they
323 believe that centralised tools give them this control.
324 However, if you're of this belief, and you publish your CVS
325 or Subversion repositories publicly, there are plenty of
326 tools available that can pull out your entire project's
327 history (albeit slowly) and recreate it somewhere that you
328 don't control. So while your control in this case is
329 illusory, you are forgoing the ability to fluidly
330 collaborate with whatever people feel compelled to mirror
331 and fork your history.</para>
333 </sect3>
334 </sect2>
335 <sect2>
336 <title>Advantages for commercial projects</title>
338 <para id="x_99">Many commercial projects are undertaken by teams that are
339 scattered across the globe. Contributors who are far from a
340 central server will see slower command execution and perhaps
341 less reliability. Commercial revision control systems attempt
342 to ameliorate these problems with remote-site replication
343 add-ons that are typically expensive to buy and cantankerous
344 to administer. A distributed system doesn't suffer from these
345 problems in the first place. Better yet, you can easily set
346 up multiple authoritative servers, say one per site, so that
347 there's no redundant communication between repositories over
348 expensive long-haul network links.</para>
350 <para id="x_9a">Centralised revision control systems tend to have
351 relatively low scalability. It's not unusual for an expensive
352 centralised system to fall over under the combined load of
353 just a few dozen concurrent users. Once again, the typical
354 response tends to be an expensive and clunky replication
355 facility. Since the load on a central server---if you have
356 one at all---is many times lower with a distributed tool
357 (because all of the data is replicated everywhere), a single
358 cheap server can handle the needs of a much larger team, and
359 replication to balance load becomes a simple matter of
360 scripting.</para>
362 <para id="x_9b">If you have an employee in the field, troubleshooting a
363 problem at a customer's site, they'll benefit from distributed
364 revision control. The tool will let them generate custom
365 builds, try different fixes in isolation from each other, and
366 search efficiently through history for the sources of bugs and
367 regressions in the customer's environment, all without needing
368 to connect to your company's network.</para>
370 </sect2>
371 </sect1>
372 <sect1>
373 <title>Why choose Mercurial?</title>
375 <para id="x_9c">Mercurial has a unique set of properties that make it a
376 particularly good choice as a revision control system.</para>
377 <itemizedlist>
378 <listitem><para id="x_9d">It is easy to learn and use.</para></listitem>
379 <listitem><para id="x_9e">It is lightweight.</para></listitem>
380 <listitem><para id="x_9f">It scales excellently.</para></listitem>
381 <listitem><para id="x_a0">It is easy to
382 customise.</para></listitem></itemizedlist>
384 <para id="x_a1">If you are at all familiar with revision control systems,
385 you should be able to get up and running with Mercurial in less
386 than five minutes. Even if not, it will take no more than a few
387 minutes longer. Mercurial's command and feature sets are
388 generally uniform and consistent, so you can keep track of a few
389 general rules instead of a host of exceptions.</para>
391 <para id="x_a2">On a small project, you can start working with Mercurial in
392 moments. Creating new changes and branches; transferring changes
393 around (whether locally or over a network); and history and
394 status operations are all fast. Mercurial attempts to stay
395 nimble and largely out of your way by combining low cognitive
396 overhead with blazingly fast operations.</para>
398 <para id="x_a3">The usefulness of Mercurial is not limited to small
399 projects: it is used by projects with hundreds to thousands of
400 contributors, each project containing tens of thousands of files and
401 hundreds of megabytes of source code.</para>
403 <para id="x_a4">If the core functionality of Mercurial is not enough for
404 you, it's easy to build on. Mercurial is well suited to
405 scripting tasks, and its clean internals and implementation in
406 Python make it easy to add features in the form of extensions.
407 There are a number of popular and useful extensions already
408 available, ranging from helping to identify bugs to improving
409 performance.</para>
411 </sect1>
412 <sect1>
413 <title>Mercurial compared with other tools</title>
415 <para id="x_a5">Before you read on, please understand that this section
416 necessarily reflects my own experiences, interests, and (dare I
417 say it) biases. I have used every one of the revision control
418 tools listed below, in most cases for several years at a
419 time.</para>
422 <sect2>
423 <title>Subversion</title>
425 <para id="x_a6">Subversion is a popular revision control tool, developed
426 to replace CVS. It has a centralised client/server
427 architecture.</para>
429 <para id="x_a7">Subversion and Mercurial have similarly named commands for
430 performing the same operations, so if you're familiar with
431 one, it is easy to learn to use the other. Both tools are
432 portable to all popular operating systems.</para>
434 <para id="x_a8">Prior to version 1.5, Subversion had no useful support for
435 merges. At the time of writing, its merge tracking capability
436 is new, and known to be <ulink
437 url="http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword">complicated
438 and buggy</ulink>.</para>
440 <para id="x_a9">Mercurial has a substantial performance advantage over
441 Subversion on every revision control operation I have
442 benchmarked. I have measured its advantage as ranging from a
443 factor of two to a factor of six when compared with Subversion
444 1.4.3's <emphasis>ra_local</emphasis> file store, which is the
445 fastest access method available. In more realistic
446 deployments involving a network-based store, Subversion will
447 be at a substantially larger disadvantage. Because many
448 Subversion commands must talk to the server and Subversion
449 does not have useful replication facilities, server capacity
450 and network bandwidth become bottlenecks for modestly large
451 projects.</para>
453 <para id="x_aa">Additionally, Subversion incurs substantial storage
454 overhead to avoid network transactions for a few common
455 operations, such as finding modified files
456 (<literal>status</literal>) and displaying modifications
457 against the current revision (<literal>diff</literal>). As a
458 result, a Subversion working copy is often the same size as,
459 or larger than, a Mercurial repository and working directory,
460 even though the Mercurial repository contains a complete
461 history of the project.</para>
463 <para id="x_ab">Subversion is widely supported by third party tools.
464 Mercurial currently lags considerably in this area. This gap
465 is closing, however, and indeed some of Mercurial's GUI tools
466 now outshine their Subversion equivalents. Like Mercurial,
467 Subversion has an excellent user manual.</para>
469 <para id="x_ac">Because Subversion doesn't store revision history on the
470 client, it is well suited to managing projects that deal with
471 lots of large, opaque binary files. If you check in fifty
472 revisions to an incompressible 10MB file, Subversion's
473 client-side space usage stays constant. The space used by any
474 distributed SCM will grow rapidly in proportion to the number
475 of revisions, because the differences between each revision
476 are large.</para>
478 <para id="x_ad">In addition, it's often difficult or, more usually,
479 impossible to merge different versions of a binary file.
480 Subversion's ability to let a user lock a file, so that they
481 temporarily have the exclusive right to commit changes to it,
482 can be a significant advantage to a project where binary files
483 are widely used.</para>
485 <para id="x_ae">Mercurial can import revision history from a Subversion
486 repository. It can also export revision history to a
487 Subversion repository. This makes it easy to <quote>test the
488 waters</quote> and use Mercurial and Subversion in parallel
489 before deciding to switch. History conversion is incremental,
490 so you can perform an initial conversion, then small
491 additional conversions afterwards to bring in new
492 changes.</para>
495 </sect2>
496 <sect2>
497 <title>Git</title>
499 <para id="x_af">Git is a distributed revision control tool that was
500 developed for managing the Linux kernel source tree. Like
501 Mercurial, its early design was somewhat influenced by
502 Monotone.</para>
504 <para id="x_b0">Git has a very large command set, with version 1.5.0
505 providing 139 individual commands. It has something of a
506 reputation for being difficult to learn. Compared to Git,
507 Mercurial has a strong focus on simplicity.</para>
509 <para id="x_b1">In terms of performance, Git is extremely fast. In
510 several cases, it is faster than Mercurial, at least on Linux,
511 while Mercurial performs better on other operations. However,
512 on Windows, the performance and general level of support that
513 Git provides are, at the time of writing, far behind those of
514 Mercurial.</para>
516 <para id="x_b2">While a Mercurial repository needs no maintenance, a Git
517 repository requires frequent manual <quote>repacks</quote> of
518 its metadata. Without these, performance degrades, while
519 space usage grows rapidly. A server that contains many Git
520 repositories that are not rigorously and frequently repacked
521 will become heavily disk-bound during backups, and there have
522 been instances of daily backups taking far longer than 24
523 hours as a result. A freshly packed Git repository is
524 slightly smaller than a Mercurial repository, but an unpacked
525 repository is several orders of magnitude larger.</para>
527 <para id="x_b3">The core of Git is written in C. Many Git commands are
528 implemented as shell or Perl scripts, and the quality of these
529 scripts varies widely. I have encountered several instances
530 where scripts charged along blindly in the presence of errors
531 that should have been fatal.</para>
533 <para id="x_b4">Mercurial can import revision history from a Git
534 repository.</para>
537 </sect2>
538 <sect2>
539 <title>CVS</title>
541 <para id="x_b5">CVS is probably the most widely used revision control tool
542 in the world. Due to its age and internal untidiness, it has
543 been only lightly maintained for many years.</para>
545 <para id="x_b6">It has a centralised client/server architecture. It does
546 not group related file changes into atomic commits, making it
547 easy for people to <quote>break the build</quote>: one person
548 can successfully commit part of a change and then be blocked
549 by the need for a merge, causing other people to see only a
550 portion of the work that person intended to do. This also affects
551 how you work with project history. If you want to see all of
552 the modifications someone made as part of a task, you will
553 need to manually inspect the descriptions and timestamps of
554 the changes made to each file involved (if you even know what
555 those files were).</para>
557 <para id="x_b7">CVS has a muddled notion of tags and branches that I will
558 not attempt to even describe. It does not support renaming of
559 files or directories well, making it easy to corrupt a
560 repository. It has almost no internal consistency checking
561 capabilities, so it is usually not even possible to tell
562 whether or how a repository is corrupt. I would not recommend
563 CVS for any project, existing or new.</para>
565 <para id="x_b8">Mercurial can import CVS revision history. However, there
566 are a few caveats that apply; these are true of every other
567 revision control tool's CVS importer, too. Due to CVS's lack
568 of atomic changes and unversioned filesystem hierarchy, it is
569 not possible to reconstruct CVS history completely accurately;
570 some guesswork is involved, and renames will usually not show
571 up. Because a lot of advanced CVS administration has to be
572 done by hand and is hence error-prone, it's common for CVS
573 importers to run into multiple problems with corrupted
574 repositories (completely bogus revision timestamps and files
575 that have remained locked for over a decade are just two of
576 the less interesting problems I can recall from personal
577 experience).</para>
579 <para id="x_b9">Mercurial can import revision history from a CVS
580 repository.</para>
583 </sect2>
584 <sect2>
585 <title>Commercial tools</title>
587 <para id="x_ba">Perforce has a centralised client/server architecture,
588 with no client-side caching of any data. Unlike modern
589 revision control tools, Perforce requires that a user run a
590 command to inform the server about every file they intend to
591 edit.</para>
593 <para id="x_bb">The performance of Perforce is quite good for small teams,
594 but it falls off rapidly as the number of users grows beyond a
595 few dozen. Modestly large Perforce installations require the
596 deployment of proxies to cope with the load their users
597 generate.</para>
600 </sect2>
601 <sect2>
602 <title>Choosing a revision control tool</title>
604 <para id="x_bc">With the exception of CVS, all of the tools listed above
605 have unique strengths that suit them to particular styles of
606 work. There is no single revision control tool that is best
607 in all situations.</para>
609 <para id="x_bd">As an example, Subversion is a good choice for working
610 with frequently edited binary files, due to its centralised
611 nature and support for file locking.</para>
613 <para id="x_be">I personally find Mercurial's properties of simplicity,
614 performance, and good merge support to be a compelling
615 combination that has served me well for several years.</para>
618 </sect2>
619 </sect1>
620 <sect1>
621 <title>Switching from another tool to Mercurial</title>
623 <para id="x_bf">Mercurial is bundled with an extension named <literal
624 role="hg-ext">convert</literal>, which can incrementally
625 import revision history from several other revision control
626 tools. By <quote>incremental</quote>, I mean that you can
627 convert all of a project's history to date in one go, then rerun
628 the conversion later to obtain new changes that happened after
629 the initial conversion.</para>
631 <para id="x_c0">The revision control tools supported by <literal
632 role="hg-ext">convert</literal> are as follows:</para>
633 <itemizedlist>
634 <listitem><para id="x_c1">Subversion</para></listitem>
635 <listitem><para id="x_c2">CVS</para></listitem>
636 <listitem><para id="x_c3">Git</para></listitem>
637 <listitem><para id="x_c4">Darcs</para></listitem></itemizedlist>
639 <para id="x_c5">In addition, <literal role="hg-ext">convert</literal> can
640 export changes from Mercurial to Subversion. This makes it
641 possible to try Subversion and Mercurial in parallel before
642 committing to a switchover, without risking the loss of any
643 work.</para>
645 <para id="x_c6">The <command role="hg-ext-convert">convert</command> command
646 is easy to use. Simply point it at the path or URL of the
647 source repository, optionally give it the name of the
648 destination repository, and it will start working. After the
649 initial conversion, just run the same command again to import
650 new changes.</para>
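<para>As a rough sketch of that workflow, the following Python
helper builds and runs the conversion command. The function names
and the repository paths in the usage note are hypothetical. The
<literal>--config extensions.convert=</literal> option is used here
on the assumption that the extension is not already enabled in your
configuration file; it turns the bundled extension on for a single
invocation.</para>

```python
import shutil
import subprocess


def convert_command(source, dest=None):
    """Build the argument list for one run of `hg convert`."""
    # Enable the bundled convert extension for this invocation only,
    # so that no configuration file needs to be edited.
    command = ["hg", "--config", "extensions.convert=", "convert", source]
    if dest is not None:
        # The destination repository name is optional.
        command.append(dest)
    return command


def run_conversion(source, dest=None):
    """Run the conversion and return the exit status of hg."""
    if shutil.which("hg") is None:
        raise RuntimeError("Mercurial (hg) was not found on PATH")
    return subprocess.run(convert_command(source, dest), check=False).returncode
```

<para>Calling <literal>run_conversion("some-svn-checkout",
"project-hg")</literal> would perform the initial import; making
the same call again later would bring in only the changes made
since the previous run.</para>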
651 </sect1>
653 <sect1>
654 <title>A short history of revision control</title>
656 <para id="x_c7">The best known of the old-time revision control tools is
657 SCCS (Source Code Control System), which Marc Rochkind wrote at
658 Bell Labs in the early 1970s. SCCS operated on individual
659 files, and required every person working on a project to have
660 access to a shared workspace on a single system. Only one
661 person could modify a file at any time; arbitration for access
662 to files was via locks. It was common for people to lock files,
663 and later forget to unlock them, preventing anyone else from
664 modifying those files without the help of an
665 administrator.</para>
667 <para id="x_c8">Walter Tichy developed a free alternative to SCCS in the
668 early 1980s; he called his program RCS (Revision Control System).
669 Like SCCS, RCS required developers to work in a single shared
670 workspace, and to lock files to prevent multiple people from
671 modifying them simultaneously.</para>
673 <para id="x_c9">Later in the 1980s, Dick Grune used RCS as a building block
674 for a set of shell scripts he initially called cmt, but then
675 renamed to CVS (Concurrent Versions System). The big innovation
676 of CVS was that it let developers work simultaneously and
677 somewhat independently in their own personal workspaces. The
678 personal workspaces prevented developers from stepping on each
679 other's toes all the time, as was common with SCCS and RCS. Each
680 developer had a copy of every project file, and could modify
681 their copies independently. They had to merge their edits prior
682 to committing changes to the central repository.</para>
684 <para id="x_ca">Brian Berliner took Grune's original scripts and rewrote
685 them in C, releasing in 1989 the code that has since developed
686 into the modern version of CVS. CVS subsequently acquired the
687 ability to operate over a network connection, giving it a
688 client/server architecture. CVS's architecture is centralised;
689 only the server has a copy of the history of the project. Client
690 workspaces just contain copies of recent versions of the
691 project's files, and a little metadata to tell them where the
692 server is. CVS has been enormously successful; it is probably
693 the world's most widely used revision control system.</para>
695 <para id="x_cb">In the early 1990s, Sun Microsystems developed an early
696 distributed revision control system, called TeamWare. A
697 TeamWare workspace contains a complete copy of the project's
698 history. TeamWare has no notion of a central repository. (CVS
699 relied upon RCS for its history storage; TeamWare used
700 SCCS.)</para>
702 <para id="x_cc">As the 1990s progressed, awareness grew of a number of
703 problems with CVS. It records simultaneous changes to multiple
704 files individually, instead of grouping them together as a
705 single logically atomic operation. It does not manage its file
706 hierarchy well; it is easy to make a mess of a repository by
707 renaming files and directories. Worse, its source code is
708 difficult to read and maintain, which makes the <quote>pain
709 level</quote> of fixing these architectural problems
710 prohibitive.</para>
712 <para id="x_cd">In 2000, Jim Blandy and Karl Fogel, two developers who had
713 worked on CVS, started a project to replace it with a tool that
714 would have a better architecture and cleaner code. The result,
715 Subversion, does not stray from CVS's centralised client/server
716 model, but it adds multi-file atomic commits, better namespace
717 management, and a number of other features that make it a
718 generally better tool than CVS. Since its initial release, it
719 has rapidly grown in popularity.</para>
721 <para id="x_ce">More or less simultaneously, Graydon Hoare began working on
722 an ambitious distributed revision control system that he named
723 Monotone. Monotone addresses many of CVS's design flaws
724 and has a peer-to-peer architecture, and it also goes beyond earlier (and
725 subsequent) revision control tools in a number of innovative
726 ways. It uses cryptographic hashes as identifiers, and has an
727 integral notion of <quote>trust</quote> for code from different
728 sources.</para>
730 <para id="x_cf">Mercurial began life in 2005. While a few aspects of its
731 design are influenced by Monotone, Mercurial focuses on ease of
732 use, high performance, and scalability to very large
733 projects.</para>
735 </sect1>
737 <sect1>
738 <title>Colophon&emdash;this book is Free</title>
740 <para id="x_d0">This book is licensed under the Open Publication License,
741 and is produced entirely using Free Software tools. It is
742 typeset with DocBook XML. Illustrations are drawn and rendered with
743 <ulink url="http://www.inkscape.org/">Inkscape</ulink>.</para>
745 <para id="x_d1">The complete source code for this book is published as a
746 Mercurial repository, at <ulink
747 url="http://hg.serpentine.com/mercurial/book">http://hg.serpentine.com/mercurial/book</ulink>.</para>
749 </sect1>
750 </preface>
751 <!--
752 local variables:
753 sgml-parent-document: ("00book.xml" "book" "preface")
754 end:
755 -->