hgbook

annotate en/ch00-preface.xml @ 658:433040113eaf

Update file location in po files
author Dongsheng Song <songdongsheng@live.cn>
date Mon Mar 30 21:37:52 2009 +0800 (2009-03-30)
parents 751ee9bf2e8d 4ce9d0754af3
children 3b33dd6aba87
rev   line source
bos@559 1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
bos@26 2
bos@559 3 <preface id="chap:preface">
bos@587 4 <?dbhtml filename="preface.html"?>
bos@559 5 <title>Preface</title>
bos@26 6
bos@583 7 <sect1>
bos@583 8 <title>Why revision control? Why Mercurial?</title>
bos@583 9
bos@584 10 <para id="x_6d">Revision control is the process of managing multiple
bos@583 11 versions of a piece of information. In its simplest form, this
bos@583 12 is something that many people do by hand: every time you modify
bos@583 13 a file, save it under a new name that contains a number, each
bos@583 14 one higher than the number of the preceding version.</para>
bos@583 15
bos@584 16 <para id="x_6e">Manually managing multiple versions of even a single file is
bos@583 17 an error-prone task, though, so software tools to help automate
bos@583 18 this process have long been available. The earliest automated
bos@583 19 revision control tools were intended to help a single user to
bos@583 20 manage revisions of a single file. Over the past few decades,
bos@583 21 the scope of revision control tools has expanded greatly; they
bos@583 22 now manage multiple files, and help multiple people to work
bos@583 23 together. The best modern revision control tools have no
bos@583 24 problem coping with thousands of people working together on
bos@583 25 projects that consist of hundreds of thousands of files.</para>
bos@583 26
bos@584 27 <para id="x_6f">The arrival of distributed revision control is relatively
bos@583 28 recent, and so far this new field has grown due to people's
bos@583 29 willingness to explore ill-charted territory.</para>
bos@583 30
bos@584 31 <para id="x_70">I am writing a book about distributed revision control
bos@583 32 because I believe that it is an important subject that deserves
bos@583 33 a field guide. I chose to write about Mercurial because it is
bos@583 34 the easiest tool to learn the terrain with, and yet it scales to
bos@583 35 the demands of real, challenging environments where many other
bos@583 36 revision control tools buckle.</para>
bos@583 37
bos@583 38 <sect2>
bos@583 39 <title>Why use revision control?</title>
bos@583 40
bos@584 41 <para id="x_71">There are a number of reasons why you or your team might
bos@583 42 want to use an automated revision control tool for a
bos@583 43 project.</para>
bos@583 44
bos@583 45 <itemizedlist>
bos@584 46 <listitem><para id="x_72">It will track the history and evolution of
bos@583 47 your project, so you don't have to. For every change,
bos@583 48 you'll have a log of <emphasis>who</emphasis> made it;
bos@583 49 <emphasis>why</emphasis> they made it;
bos@583 50 <emphasis>when</emphasis> they made it; and
bos@583 51 <emphasis>what</emphasis> the change
bos@583 52 was.</para></listitem>
bos@584 53 <listitem><para id="x_73">When you're working with other people,
bos@583 54 revision control software makes it easier for you to
bos@583 55 collaborate. For example, when people more or less
bos@583 56 simultaneously make potentially incompatible changes, the
bos@583 57 software will help you to identify and resolve those
bos@583 58 conflicts.</para></listitem>
bos@584 59 <listitem><para id="x_74">It can help you to recover from mistakes. If
bos@583 60 you make a change that later turns out to be in error, you
bos@583 61 can revert to an earlier version of one or more files. In
bos@583 62 fact, a <emphasis>really</emphasis> good revision control
bos@583 63 tool will even help you to efficiently figure out exactly
bos@592 64 when a problem was introduced (see <xref
bos@583 65 linkend="sec:undo:bisect"/> for details).</para></listitem>
bos@584 66 <listitem><para id="x_75">It will help you to work simultaneously on,
bos@583 67 and manage the drift between, multiple versions of your
bos@583 68 project.</para></listitem>
bos@583 69 </itemizedlist>
bos@583 70
bos@584 71 <para id="x_76">Most of these reasons are equally valid---at least in
bos@583 72 theory---whether you're working on a project by yourself, or
bos@583 73 with a hundred other people.</para>
bos@583 74
bos@584 75 <para id="x_77">A key question about the practicality of revision control
bos@583 76 at these two different scales (<quote>lone hacker</quote> and
bos@583 77 <quote>huge team</quote>) is how its
bos@583 78 <emphasis>benefits</emphasis> compare to its
bos@583 79 <emphasis>costs</emphasis>. A revision control tool that's
bos@583 80 difficult to understand or use is going to impose a high
bos@583 81 cost.</para>
bos@583 82
bos@584 83 <para id="x_78">A five-hundred-person project is likely to collapse under
bos@583 84 its own weight almost immediately without a revision control
bos@583 85 tool and process. In this case, the cost of using revision
bos@583 86 control might hardly seem worth considering, since
bos@583 87 <emphasis>without</emphasis> it, failure is almost
bos@583 88 guaranteed.</para>
bos@583 89
bos@584 90 <para id="x_79">On the other hand, a one-person <quote>quick hack</quote>
bos@583 91 might seem like a poor place to use a revision control tool,
bos@583 92 because surely the cost of using one must be close to the
bos@583 93 overall cost of the project. Right?</para>
bos@583 94
bos@584 95 <para id="x_7a">Mercurial uniquely supports <emphasis>both</emphasis> of
bos@583 96 these scales of development. You can learn the basics in just
bos@583 97 a few minutes, and due to its low overhead, you can apply
bos@583 98 revision control to the smallest of projects with ease. Its
bos@583 99 simplicity means you won't have a lot of abstruse concepts or
bos@583 100 command sequences competing for mental space with whatever
bos@583 101 you're <emphasis>really</emphasis> trying to do. At the same
bos@583 102 time, Mercurial's high performance and peer-to-peer nature let
bos@583 103 you scale painlessly to handle large projects.</para>
bos@583 104
bos@584 105 <para id="x_7b">No revision control tool can rescue a poorly run project,
bos@583 106 but a good choice of tools can make a huge difference to the
bos@583 107 fluidity with which you can work on a project.</para>
bos@583 108
bos@583 109 </sect2>
bos@583 110
bos@583 111 <sect2>
bos@583 112 <title>The many names of revision control</title>
bos@583 113
bos@584 114 <para id="x_7c">Revision control is a diverse field, so much so that it is
bos@583 115 referred to by many names and acronyms. Here are a few of the
bos@583 116 more common variations you'll encounter:</para>
bos@583 117 <itemizedlist>
bos@584 118 <listitem><para id="x_7d">Revision control (RCS)</para></listitem>
bos@584 119 <listitem><para id="x_7e">Software configuration management (SCM), or
bos@583 120 configuration management</para></listitem>
bos@584 121 <listitem><para id="x_7f">Source code management</para></listitem>
bos@584 122 <listitem><para id="x_80">Source code control, or source
bos@583 123 control</para></listitem>
bos@584 124 <listitem><para id="x_81">Version control
bos@583 125 (VCS)</para></listitem></itemizedlist>
bos@584 126 <para id="x_82">Some people claim that these terms actually have different
bos@583 127 meanings, but in practice they overlap so much that there's no
bos@583 128 agreed or even useful way to tease them apart.</para>
bos@583 129
bos@583 130 </sect2>
bos@583 131 </sect1>
bos@26 132
bos@559 133 <sect1>
bos@559 134 <title>This book is a work in progress</title>
bos@26 135
bos@584 136 <para id="x_83">I am releasing this book while I am still writing it, in the
bos@583 137 hope that it will prove useful to others. I am writing under an
bos@583 138 open license in the hope that you, my readers, will contribute
bos@583 139 feedback and perhaps content of your own.</para>
bos@200 140
bos@559 141 </sect1>
bos@559 142 <sect1>
bos@559 143 <title>About the examples in this book</title>
bos@200 144
bos@584 145 <para id="x_84">This book takes an unusual approach to code samples. Every
bos@559 146 example is <quote>live</quote>---each one is actually the result
bos@559 147 of a shell script that executes the Mercurial commands you see.
bos@559 148 Every time an image of the book is built from its sources, all
bos@559 149 the example scripts are automatically run, and their current
bos@559 150 results compared against their expected results.</para>
bos@200 151
bos@584 152 <para id="x_85">The advantage of this approach is that the examples are
bos@559 153 always accurate; they describe <emphasis>exactly</emphasis> the
bos@559 154 behaviour of the version of Mercurial that's mentioned at the
bos@559 155 front of the book. If I update the version of Mercurial that
bos@559 156 I'm documenting, and the output of some command changes, the
bos@559 157 build fails.</para>
bos@200 158
bos@584 159 <para id="x_86">There is a small disadvantage to this approach, which is
bos@559 160 that the dates and times you'll see in examples tend to be
bos@559 161 <quote>squashed</quote> together in a way that they wouldn't be
bos@559 162 if the same commands were being typed by a human. Where a human
bos@559 163 can issue no more than one command every few seconds, with any
bos@559 164 resulting timestamps correspondingly spread out, my automated
bos@559 165 example scripts run many commands in one second.</para>
bos@200 166
bos@584 167 <para id="x_87">As an instance of this, several consecutive commits in an
bos@559 168 example can show up as having occurred during the same second.
bos@559 169 You can see this occur in the <literal
bos@592 170 role="hg-ext">bisect</literal> example in <xref
bos@589 171 linkend="sec:undo:bisect"/>, for instance.</para>
bos@200 172
bos@584 173 <para id="x_88">So when you're reading examples, don't place too much weight
bos@559 174 on the dates or times you see in the output of commands. But
bos@559 175 <emphasis>do</emphasis> be confident that the behaviour you're
bos@559 176 seeing is consistent and reproducible.</para>
bos@26 177
bos@559 178 </sect1>
bos@583 179
bos@583 180 <sect1>
bos@583 181 <title>Trends in the field</title>
bos@583 182
bos@584 183 <para id="x_89">There has been an unmistakable trend in the development and
bos@583 184 use of revision control tools over the past four decades, as
bos@583 185 people have become familiar with the capabilities of their tools
bos@583 186 and constrained by their limitations.</para>
bos@583 187
bos@584 188 <para id="x_8a">The first generation began by managing single files on
bos@583 189 individual computers. Although these tools represented a huge
bos@583 190 advance over ad-hoc manual revision control, their locking model
bos@583 191 and reliance on a single computer limited them to small,
bos@583 192 tightly-knit teams.</para>
bos@583 193
bos@584 194 <para id="x_8b">The second generation loosened these constraints by moving
bos@583 195 to network-centered architectures, and managing entire projects
bos@583 196 at a time. As projects grew larger, they ran into new problems.
bos@583 197 With clients needing to talk to servers very frequently, server
bos@583 198 scaling became an issue for large projects. An unreliable
bos@583 199 network connection could prevent remote users from being able to
bos@583 200 talk to the server at all. As open source projects started
bos@583 201 making read-only access available anonymously to anyone, people
bos@583 202 without commit privileges found that they could not use the
bos@583 203 tools to interact with a project in a natural way, as they could
bos@583 204 not record their changes.</para>
bos@583 205
bos@584 206 <para id="x_8c">The current generation of revision control tools is
bos@583 207 peer-to-peer in nature. All of these systems have dropped the
bos@583 208 dependency on a single central server, and allow people to
bos@583 209 distribute their revision control data to where it's actually
bos@583 210 needed. Collaboration over the Internet has moved from being
bos@583 211 constrained by technology to being a matter of choice and consensus.
bos@583 212 Modern tools can operate offline indefinitely and autonomously,
bos@583 213 with a network connection only needed when syncing changes with
bos@583 214 another repository.</para>
bos@583 215
bos@583 216 </sect1>
bos@583 217 <sect1>
bos@583 218 <title>A few of the advantages of distributed revision
bos@583 219 control</title>
bos@583 220
bos@584 221 <para id="x_8d">Even though distributed revision control tools have for
bos@583 222 several years been as robust and usable as their
bos@583 223 previous-generation counterparts, people using older tools have
bos@583 224 not yet necessarily woken up to their advantages. There are a
bos@583 225 number of ways in which distributed tools shine relative to
bos@583 226 centralised ones.</para>
bos@583 227
bos@584 228 <para id="x_8e">For an individual developer, distributed tools are almost
bos@583 229 always much faster than centralised tools. This is for a simple
bos@583 230 reason: a centralised tool needs to talk over the network for
bos@583 231 many common operations, because most metadata is stored in a
bos@583 232 single copy on the central server. A distributed tool stores
bos@583 233 all of its metadata locally. All else being equal, talking over
bos@583 234 the network adds overhead to a centralised tool. Don't
bos@583 235 underestimate the value of a snappy, responsive tool: you're
bos@583 236 going to spend a lot of time interacting with your revision
bos@583 237 control software.</para>
bos@583 238
bos@584 239 <para id="x_8f">Distributed tools are indifferent to the vagaries of your
bos@583 240 server infrastructure, again because they replicate metadata to
bos@583 241 so many locations. If you use a centralised system and your
bos@583 242 server catches fire, you'd better hope that your backup media
bos@583 243 are reliable, and that your last backup was recent and actually
bos@583 244 worked. With a distributed tool, you have many backups
bos@583 245 available, one on every contributor's computer.</para>
bos@583 246
bos@584 247 <para id="x_90">The reliability of your network will affect distributed
bos@583 248 tools far less than it will centralised tools. You can't even
bos@583 249 use a centralised tool without a network connection, except for
bos@583 250 a few highly constrained commands. With a distributed tool, if
bos@583 251 your network connection goes down while you're working, you may
bos@583 252 not even notice. The only thing you won't be able to do is talk
bos@583 253 to repositories on other computers, something that is relatively
bos@583 254 rare compared with local operations. If you have a far-flung
bos@583 255 team of collaborators, this may be significant.</para>
bos@583 256
bos@583 257 <sect2>
bos@583 258 <title>Advantages for open source projects</title>
bos@583 259
bos@584 260 <para id="x_91">If you take a shine to an open source project and decide
bos@583 261 that you would like to start hacking on it, and that project
bos@583 262 uses a distributed revision control tool, you are at once a
bos@583 263 peer with the people who consider themselves the
bos@583 264 <quote>core</quote> of that project. If they publish their
bos@583 265 repositories, you can immediately copy their project history,
bos@583 266 start making changes, and record your work, using the same
bos@583 267 tools in the same ways as insiders. By contrast, with a
bos@583 268 centralised tool, you must use the software in a <quote>read
bos@583 269 only</quote> mode unless someone grants you permission to
bos@583 270 commit changes to their central server. Until then, you won't
bos@583 271 be able to record changes, and your local modifications will
bos@583 272 be at risk of corruption any time you try to update your
bos@583 273 client's view of the repository.</para>
bos@583 274
bos@583 275 <sect3>
bos@583 276 <title>The forking non-problem</title>
bos@583 277
bos@584 278 <para id="x_92">It has been suggested that distributed revision control
bos@583 279 tools pose some sort of risk to open source projects because
bos@583 280 they make it easy to <quote>fork</quote> the development of
bos@583 281 a project. A fork happens when there are differences in
bos@583 282 opinion or attitude between groups of developers that cause
bos@583 283 them to decide that they can't work together any longer.
bos@583 284 Each side takes a more or less complete copy of the
bos@583 285 project's source code, and goes off in its own
bos@583 286 direction.</para>
bos@583 287
bos@584 288 <para id="x_93">Sometimes the camps in a fork decide to reconcile their
bos@583 289 differences. With a centralised revision control system, the
bos@583 290 <emphasis>technical</emphasis> process of reconciliation is
bos@583 291 painful, and has to be performed largely by hand. You have
bos@583 292 to decide whose revision history is going to
bos@583 293 <quote>win</quote>, and graft the other team's changes into
bos@583 294 the tree somehow. This usually loses some or all of one
bos@583 295 side's revision history.</para>
bos@583 296
bos@584 297 <para id="x_94">Distributed tools change the nature of forking:
bos@583 298 they make forking the <emphasis>only</emphasis> way to
bos@583 299 develop a project. Every single change that you make is
bos@583 300 potentially a fork point. The great strength of this
bos@583 301 approach is that a distributed revision control tool has to
bos@583 302 be really good at <emphasis>merging</emphasis> forks,
bos@583 303 because forks are absolutely fundamental: they happen all
bos@583 304 the time.</para>
bos@583 305
bos@584 306 <para id="x_95">If every piece of work that everybody does, all the
bos@583 307 time, is framed in terms of forking and merging, then what
bos@583 308 the open source world refers to as a <quote>fork</quote>
bos@583 309 becomes <emphasis>purely</emphasis> a social issue. If
bos@583 310 anything, distributed tools <emphasis>lower</emphasis> the
bos@583 311 likelihood of a fork:</para>
bos@583 312 <itemizedlist>
bos@584 313 <listitem><para id="x_96">They eliminate the social distinction that
bos@583 314 centralised tools impose: that between insiders (people
bos@583 315 with commit access) and outsiders (people
bos@583 316 without).</para></listitem>
bos@584 317 <listitem><para id="x_97">They make it easier to reconcile after a
bos@583 318 social fork, because all that's involved from the
bos@583 319 perspective of the revision control software is just
bos@583 320 another merge.</para></listitem></itemizedlist>
bos@583 321
bos@584 322 <para id="x_98">Some people resist distributed tools because they want
bos@583 323 to retain tight control over their projects, and they
bos@583 324 believe that centralised tools give them this control.
bos@583 325 However, if you're of this belief, and you publish your CVS
bos@583 326 or Subversion repositories publicly, there are plenty of
bos@583 327 tools available that can pull out your entire project's
bos@583 328 history (albeit slowly) and recreate it somewhere that you
bos@583 329 don't control. So while your control in this case is
bos@583 330 illusory, you are forgoing the ability to fluidly
bos@583 331 collaborate with whatever people feel compelled to mirror
bos@583 332 and fork your history.</para>
bos@583 333
bos@583 334 </sect3>
bos@583 335 </sect2>
bos@583 336 <sect2>
bos@583 337 <title>Advantages for commercial projects</title>
bos@583 338
bos@584 339 <para id="x_99">Many commercial projects are undertaken by teams that are
bos@583 340 scattered across the globe. Contributors who are far from a
bos@583 341 central server will see slower command execution and perhaps
bos@583 342 less reliability. Commercial revision control systems attempt
bos@583 343 to ameliorate these problems with remote-site replication
bos@583 344 add-ons that are typically expensive to buy and cantankerous
bos@583 345 to administer. A distributed system doesn't suffer from these
bos@583 346 problems in the first place. Better yet, you can easily set
bos@583 347 up multiple authoritative servers, say one per site, so that
bos@583 348 there's no redundant communication between repositories over
bos@583 349 expensive long-haul network links.</para>
bos@583 350
bos@584 351 <para id="x_9a">Centralised revision control systems tend to have
bos@583 352 relatively low scalability. It's not unusual for an expensive
bos@583 353 centralised system to fall over under the combined load of
bos@583 354 just a few dozen concurrent users. Once again, the typical
bos@583 355 response tends to be an expensive and clunky replication
bos@583 356 facility. Since the load on a central server---if you have
bos@583 357 one at all---is many times lower with a distributed tool
bos@583 358 (because all of the data is replicated everywhere), a single
bos@583 359 cheap server can handle the needs of a much larger team, and
bos@583 360 replication to balance load becomes a simple matter of
bos@583 361 scripting.</para>
bos@583 362
bos@584 363 <para id="x_9b">If you have an employee in the field, troubleshooting a
bos@583 364 problem at a customer's site, they'll benefit from distributed
bos@583 365 revision control. The tool will let them generate custom
bos@583 366 builds, try different fixes in isolation from each other, and
bos@583 367 search efficiently through history for the sources of bugs and
bos@583 368 regressions in the customer's environment, all without needing
bos@583 369 to connect to your company's network.</para>
bos@583 370
bos@583 371 </sect2>
bos@583 372 </sect1>
bos@583 373 <sect1>
bos@583 374 <title>Why choose Mercurial?</title>
bos@583 375
bos@584 376 <para id="x_9c">Mercurial has a unique set of properties that make it a
bos@583 377 particularly good choice as a revision control system.</para>
bos@583 378 <itemizedlist>
bos@584 379 <listitem><para id="x_9d">It is easy to learn and use.</para></listitem>
bos@584 380 <listitem><para id="x_9e">It is lightweight.</para></listitem>
bos@584 381 <listitem><para id="x_9f">It scales excellently.</para></listitem>
bos@584 382 <listitem><para id="x_a0">It is easy to
bos@583 383 customise.</para></listitem></itemizedlist>
bos@583 384
bos@584 385 <para id="x_a1">If you are at all familiar with revision control systems,
bos@583 386 you should be able to get up and running with Mercurial in less
bos@583 387 than five minutes. Even if not, it will take no more than a few
bos@583 388 minutes longer. Mercurial's command and feature sets are
bos@583 389 generally uniform and consistent, so you can keep track of a few
bos@583 390 general rules instead of a host of exceptions.</para>
bos@583 391
bos@584 392 <para id="x_a2">On a small project, you can start working with Mercurial in
bos@583 393 moments. Creating new changes and branches; transferring changes
bos@583 394 around (whether locally or over a network); and history and
bos@583 395 status operations are all fast. Mercurial attempts to stay
bos@583 396 nimble and largely out of your way by combining low cognitive
bos@583 397 overhead with blazingly fast operations.</para>
bos@583 398
bos@584 399 <para id="x_a3">The usefulness of Mercurial is not limited to small
bos@583 400 projects: it is used by projects with hundreds to thousands of
bos@583 401 contributors, where each project contains tens of thousands of files and
bos@583 402 hundreds of megabytes of source code.</para>
bos@583 403
bos@584 404 <para id="x_a4">If the core functionality of Mercurial is not enough for
bos@583 405 you, it's easy to build on. Mercurial is well suited to
bos@583 406 scripting tasks, and its clean internals and implementation in
bos@583 407 Python make it easy to add features in the form of extensions.
bos@583 408 There are a number of popular and useful extensions already
bos@583 409 available, ranging from helping to identify bugs to improving
bos@583 410 performance.</para>
bos@583 411
bos@583 412 </sect1>
bos@583 413 <sect1>
bos@583 414 <title>Mercurial compared with other tools</title>
bos@583 415
bos@584 416 <para id="x_a5">Before you read on, please understand that this section
bos@583 417 necessarily reflects my own experiences, interests, and (dare I
bos@583 418 say it) biases. I have used every one of the revision control
bos@583 419 tools listed below, in most cases for several years at a
bos@583 420 time.</para>
bos@583 421
bos@583 422
bos@583 423 <sect2>
bos@583 424 <title>Subversion</title>
bos@583 425
bos@584 426 <para id="x_a6">Subversion is a popular revision control tool, developed
bos@583 427 to replace CVS. It has a centralised client/server
bos@583 428 architecture.</para>
bos@583 429
bos@584 430 <para id="x_a7">Subversion and Mercurial have similarly named commands for
bos@583 431 performing the same operations, so if you're familiar with
bos@583 432 one, it is easy to learn to use the other. Both tools are
bos@583 433 portable to all popular operating systems.</para>
bos@583 434
bos@584 435 <para id="x_a8">Prior to version 1.5, Subversion had no useful support for
bos@583 436 merges. At the time of writing, its merge tracking capability
bos@583 437 is new, and known to be <ulink
bos@583 438 url="http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword">complicated
bos@583 439 and buggy</ulink>.</para>
bos@583 440
bos@584 441 <para id="x_a9">Mercurial has a substantial performance advantage over
bos@583 442 Subversion on every revision control operation I have
bos@583 443 benchmarked. I have measured its advantage as ranging from a
bos@583 444 factor of two to a factor of six when compared with Subversion
bos@583 445 1.4.3's <emphasis>ra_local</emphasis> file store, which is the
bos@583 446 fastest access method available. In more realistic
bos@583 447 deployments involving a network-based store, Subversion will
bos@583 448 be at a substantially larger disadvantage. Because many
bos@583 449 Subversion commands must talk to the server and Subversion
bos@583 450 does not have useful replication facilities, server capacity
bos@583 451 and network bandwidth become bottlenecks for modestly large
bos@583 452 projects.</para>
bos@583 453
bos@584 454 <para id="x_aa">Additionally, Subversion incurs substantial storage
bos@583 455 overhead to avoid network transactions for a few common
bos@583 456 operations, such as finding modified files
bos@583 457 (<literal>status</literal>) and displaying modifications
bos@583 458 against the current revision (<literal>diff</literal>). As a
bos@583 459 result, a Subversion working copy is often the same size as,
bos@583 460 or larger than, a Mercurial repository and working directory,
bos@583 461 even though the Mercurial repository contains a complete
bos@583 462 history of the project.</para>
bos@583 463
bos@584 464 <para id="x_ab">Subversion is widely supported by third party tools.
bos@583 465 Mercurial currently lags considerably in this area. This gap
bos@583 466 is closing, however, and indeed some of Mercurial's GUI tools
bos@583 467 now outshine their Subversion equivalents. Like Mercurial,
bos@583 468 Subversion has an excellent user manual.</para>
bos@583 469
bos@584 470 <para id="x_ac">Because Subversion doesn't store revision history on the
bos@583 471 client, it is well suited to managing projects that deal with
bos@583 472 lots of large, opaque binary files. If you check in fifty
bos@583 473 revisions to an incompressible 10MB file, Subversion's
bos@583 474 client-side space usage stays constant. The space used by any
bos@583 475 distributed SCM will grow rapidly in proportion to the number
bos@583 476 of revisions, because the differences between each revision
bos@583 477 are large.</para>
bos@583 478
bos@584 479 <para id="x_ad">In addition, it's often difficult or, more usually,
bos@583 480 impossible to merge different versions of a binary file.
bos@583 481 Subversion's ability to let a user lock a file, so that they
bos@583 482 temporarily have the exclusive right to commit changes to it,
bos@583 483 can be a significant advantage to a project where binary files
bos@583 484 are widely used.</para>
bos@583 485
bos@584 486 <para id="x_ae">Mercurial can import revision history from a Subversion
bos@583 487 repository. It can also export revision history to a
bos@583 488 Subversion repository. This makes it easy to <quote>test the
bos@583 489 waters</quote> and use Mercurial and Subversion in parallel
bos@583 490 before deciding to switch. History conversion is incremental,
bos@583 491 so you can perform an initial conversion, then small
bos@583 492 additional conversions afterwards to bring in new
bos@583 493 changes.</para>
bos@583 494
bos@583 495
bos@583 496 </sect2>
bos@583 497 <sect2>
bos@583 498 <title>Git</title>
bos@583 499
bos@584 500 <para id="x_af">Git is a distributed revision control tool that was
bos@583 501 developed for managing the Linux kernel source tree. Like
bos@583 502 Mercurial, its early design was somewhat influenced by
bos@583 503 Monotone.</para>
bos@583 504
bos@584 505 <para id="x_b0">Git has a very large command set, with version 1.5.0
bos@583 506 providing 139 individual commands. It has something of a
bos@583 507 reputation for being difficult to learn. Compared to Git,
bos@583 508 Mercurial has a strong focus on simplicity.</para>
bos@583 509
bos@584 510 <para id="x_b1">In terms of performance, Git is extremely fast. In
bos@583 511 several cases, it is faster than Mercurial, at least on Linux,
bos@583 512 while Mercurial performs better on other operations. However,
bos@583 513 on Windows, the performance and general level of support that
bos@583 514 Git provides are, at the time of writing, far behind those of
bos@583 515 Mercurial.</para>
bos@583 516
bos@584 517 <para id="x_b2">While a Mercurial repository needs no maintenance, a Git
bos@583 518 repository requires frequent manual <quote>repacks</quote> of
bos@583 519 its metadata. Without these, performance degrades, while
bos@583 520 space usage grows rapidly. A server that contains many Git
bos@583 521 repositories that are not rigorously and frequently repacked
bos@583 522 will become heavily disk-bound during backups, and there have
bos@583 523 been instances of daily backups taking far longer than 24
bos@583 524 hours as a result. A freshly packed Git repository is
bos@583 525 slightly smaller than a Mercurial repository, but an unpacked
bos@583 526 repository is several orders of magnitude larger.</para>
bos@583 527
bos@584 528 <para id="x_b3">The core of Git is written in C. Many Git commands are
bos@583 529 implemented as shell or Perl scripts, and the quality of these
bos@583 530 scripts varies widely. I have encountered several instances
bos@583 531 where scripts charged along blindly in the presence of errors
bos@583 532 that should have been fatal.</para>
bos@583 533
bos@584 534 <para id="x_b4">Mercurial can import revision history from a Git
bos@583 535 repository.</para>
bos@583 536
bos@583 537
bos@583 538 </sect2>
bos@583 539 <sect2>
bos@583 540 <title>CVS</title>
bos@583 541
bos@584 542 <para id="x_b5">CVS is probably the most widely used revision control tool
bos@583 543 in the world. Due to its age and internal untidiness, it has
bos@583 544 been only lightly maintained for many years.</para>
bos@583 545
bos@584 546 <para id="x_b6">It has a centralised client/server architecture. It does
bos@583 547 not group related file changes into atomic commits, making it
bos@583 548 easy for people to <quote>break the build</quote>: one person
bos@583 549 can successfully commit part of a change and then be blocked
bos@583 550 by the need for a merge, causing other people to see only a
bos@583 551 portion of the work that person intended to do. This also affects
bos@583 552 how you work with project history. If you want to see all of
bos@583 553 the modifications someone made as part of a task, you will
bos@583 554 need to manually inspect the descriptions and timestamps of
bos@583 555 the changes made to each file involved (if you even know what
bos@583 556 those files were).</para>
bos@583 557
bos@584 558 <para id="x_b7">CVS has a muddled notion of tags and branches that I will
bos@583 559 not even attempt to describe. It does not support renaming of
bos@583 560 files or directories well, making it easy to corrupt a
bos@583 561 repository. It has almost no internal consistency checking
bos@583 562 capabilities, so it is usually not even possible to tell
bos@583 563 whether or how a repository is corrupt. I would not recommend
bos@583 564 CVS for any project, existing or new.</para>
bos@583 565
bos@584 566 <para id="x_b8">Mercurial can import CVS revision history. However, there
bos@583 567 are a few caveats that apply; these are true of every other
bos@583 568 revision control tool's CVS importer, too. Due to CVS's lack
bos@583 569 of atomic changes and unversioned filesystem hierarchy, it is
bos@583 570 not possible to reconstruct CVS history completely accurately;
bos@583 571 some guesswork is involved, and renames will usually not show
bos@583 572 up. Because a lot of advanced CVS administration has to be
bos@583 573 done by hand and is hence error-prone, it's common for CVS
bos@583 574 importers to run into multiple problems with corrupted
bos@583 575 repositories (completely bogus revision timestamps and files
bos@583 576 that have remained locked for over a decade are just two of
bos@583 577 the less interesting problems I can recall from personal
bos@583 578 experience).</para>
bos@583 579
bos@584 580 <para id="x_b9">Mercurial can import revision history from a CVS
bos@583 581 repository.</para>
bos@583 582
bos@583 583
bos@583 584 </sect2>
bos@583 585 <sect2>
bos@583 586 <title>Commercial tools</title>
bos@583 587
bos@584 588 <para id="x_ba">Perforce has a centralised client/server architecture,
bos@583 589 with no client-side caching of any data. Unlike modern
bos@583 590 revision control tools, Perforce requires that a user run a
bos@583 591 command to inform the server about every file they intend to
bos@583 592 edit.</para>
bos@583 593
bos@584 594 <para id="x_bb">The performance of Perforce is quite good for small teams,
bos@583 595 but it falls off rapidly as the number of users grows beyond a
bos@583 596 few dozen. Modestly large Perforce installations require the
bos@583 597 deployment of proxies to cope with the load their users
bos@583 598 generate.</para>
bos@583 599
bos@583 600
bos@583 601 </sect2>
bos@583 602 <sect2>
bos@583 603 <title>Choosing a revision control tool</title>
bos@583 604
bos@584 605 <para id="x_bc">With the exception of CVS, all of the tools listed above
bos@583 606 have unique strengths that suit them to particular styles of
bos@583 607 work. There is no single revision control tool that is best
bos@583 608 in all situations.</para>
bos@583 609
bos@584 610 <para id="x_bd">As an example, Subversion is a good choice for working
bos@583 611 with frequently edited binary files, due to its centralised
bos@583 612 nature and support for file locking.</para>
bos@583 613
bos@584 614 <para id="x_be">I personally find Mercurial's properties of simplicity,
bos@583 615 performance, and good merge support to be a compelling
bos@583 616 combination that has served me well for several years.</para>
bos@583 617
bos@583 618
bos@583 619 </sect2>
bos@583 620 </sect1>
bos@583 621 <sect1>
bos@583 622 <title>Switching from another tool to Mercurial</title>
bos@583 623
bos@584 624 <para id="x_bf">Mercurial is bundled with an extension named <literal
bos@583 625 role="hg-ext">convert</literal>, which can incrementally
bos@583 626 import revision history from several other revision control
bos@583 627 tools. By <quote>incremental</quote>, I mean that you can
bos@583 628 convert all of a project's history to date in one go, then rerun
bos@583 629 the conversion later to obtain new changes that happened after
bos@583 630 the initial conversion.</para>
bos@583 631
bos@584 632 <para id="x_c0">The revision control tools supported by <literal
bos@583 633 role="hg-ext">convert</literal> are as follows:</para>
bos@583 634 <itemizedlist>
bos@584 635 <listitem><para id="x_c1">Subversion</para></listitem>
bos@584 636 <listitem><para id="x_c2">CVS</para></listitem>
bos@584 637 <listitem><para id="x_c3">Git</para></listitem>
bos@584 638 <listitem><para id="x_c4">Darcs</para></listitem></itemizedlist>
bos@584 639
bos@584 640 <para id="x_c5">In addition, <literal role="hg-ext">convert</literal> can
bos@583 641 export changes from Mercurial to Subversion. This makes it
bos@583 642 possible to try Subversion and Mercurial in parallel before
bos@583 643 committing to a switchover, without risking the loss of any
bos@583 644 work.</para>
bos@583 645
bos@584 646 <para id="x_c6">The <command role="hg-ext-convert">convert</command> command
bos@583 647 is easy to use. Simply point it at the path or URL of the
bos@583 648 source repository, optionally give it the name of the
bos@583 649 destination repository, and it will start working. After the
bos@583 650 initial conversion, just run the same command again to import
bos@583 651 new changes.</para>
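<para>As a minimal sketch of that workflow: because <literal
role="hg-ext">convert</literal> is a bundled extension rather than a
core command, it normally has to be enabled before first use. One
way to do this, assuming a per-user configuration file at
<filename>~/.hgrc</filename>, is to add the following lines to it:</para>
<programlisting>[extensions]
convert =</programlisting>
<para>With the extension enabled, the initial import and any later
incremental update use the same command line. The Subversion URL and
the destination name <filename>project-hg</filename> below are
placeholders invented for illustration, not real locations.</para>
<screen>hg convert http://svn.example.com/project/trunk project-hg
# ...later, after new commits have landed in the source repository:
hg convert http://svn.example.com/project/trunk project-hg</screen>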
bos@583 652 </sect1>
bos@583 653
bos@583 654 <sect1>
bos@583 655 <title>A short history of revision control</title>
bos@583 656
bos@584 657 <para id="x_c7">The best known of the old-time revision control tools is
bos@583 658 SCCS (Source Code Control System), which Marc Rochkind wrote at
bos@583 659 Bell Labs in the early 1970s. SCCS operated on individual
bos@583 660 files, and required every person working on a project to have
bos@583 661 access to a shared workspace on a single system. Only one
bos@583 662 person could modify a file at any time; arbitration for access
bos@583 663 to files was via locks. It was common for people to lock files,
bos@583 664 and later forget to unlock them, preventing anyone else from
bos@583 665 modifying those files without the help of an
bos@583 666 administrator.</para>
bos@583 667
bos@584 668 <para id="x_c8">Walter Tichy developed a free alternative to SCCS in the
bos@583 669 early 1980s; he called his program RCS (Revision Control System).
bos@583 670 Like SCCS, RCS required developers to work in a single shared
bos@583 671 workspace, and to lock files to prevent multiple people from
bos@583 672 modifying them simultaneously.</para>
bos@583 673
bos@584 674 <para id="x_c9">Later in the 1980s, Dick Grune used RCS as a building block
bos@583 675 for a set of shell scripts he initially called cmt, but then
bos@583 676 renamed to CVS (Concurrent Versions System). The big innovation
bos@583 677 of CVS was that it let developers work simultaneously and
bos@583 678 somewhat independently in their own personal workspaces. The
bos@583 679 personal workspaces prevented developers from stepping on each
bos@583 680 other's toes all the time, as was common with SCCS and RCS. Each
bos@583 681 developer had a copy of every project file, and could modify
bos@583 682 their copies independently. They had to merge their edits prior
bos@583 683 to committing changes to the central repository.</para>
bos@583 684
bos@584 685 <para id="x_ca">Brian Berliner took Grune's original scripts and rewrote
bos@583 686 them in C, releasing in 1989 the code that has since developed
bos@583 687 into the modern version of CVS. CVS subsequently acquired the
bos@583 688 ability to operate over a network connection, giving it a
bos@583 689 client/server architecture. CVS's architecture is centralised;
bos@583 690 only the server has a copy of the history of the project. Client
bos@583 691 workspaces just contain copies of recent versions of the
bos@583 692 project's files, and a little metadata to tell them where the
bos@583 693 server is. CVS has been enormously successful; it is probably
bos@583 694 the world's most widely used revision control system.</para>
bos@583 695
bos@584 696 <para id="x_cb">In the early 1990s, Sun Microsystems developed an early
bos@583 697 distributed revision control system, called TeamWare. A
bos@583 698 TeamWare workspace contains a complete copy of the project's
bos@583 699 history. TeamWare has no notion of a central repository. (CVS
bos@583 700 relied upon RCS for its history storage; TeamWare used
bos@583 701 SCCS.)</para>
bos@583 702
bos@584 703 <para id="x_cc">As the 1990s progressed, awareness grew of a number of
bos@583 704 problems with CVS. It records simultaneous changes to multiple
bos@583 705 files individually, instead of grouping them together as a
bos@583 706 single logically atomic operation. It does not manage its file
bos@583 707 hierarchy well; it is easy to make a mess of a repository by
bos@583 708 renaming files and directories. Worse, its source code is
bos@583 709 difficult to read and maintain, which made the <quote>pain
bos@583 710 level</quote> of fixing these architectural problems
bos@583 711 prohibitive.</para>
bos@583 712
bos@584 713 <para id="x_cd">In 2000, Jim Blandy and Karl Fogel, two developers who had
bos@583 714 worked on CVS, started a project to replace it with a tool that
bos@583 715 would have a better architecture and cleaner code. The result,
bos@583 716 Subversion, does not stray from CVS's centralised client/server
bos@583 717 model, but it adds multi-file atomic commits, better namespace
bos@583 718 management, and a number of other features that make it a
bos@583 719 generally better tool than CVS. Since its initial release, it
bos@583 720 has rapidly grown in popularity.</para>
bos@583 721
bos@584 722 <para id="x_ce">More or less simultaneously, Graydon Hoare began working on
bos@583 723 an ambitious distributed revision control system that he named
bos@583 724 Monotone. Monotone not only addresses many of CVS's design flaws
bos@583 725 and has a peer-to-peer architecture; it also goes beyond earlier (and
bos@583 726 subsequent) revision control tools in a number of innovative
bos@583 727 ways. It uses cryptographic hashes as identifiers, and has an
bos@583 728 integral notion of <quote>trust</quote> for code from different
bos@583 729 sources.</para>
bos@583 730
bos@584 731 <para id="x_cf">Mercurial began life in 2005. While a few aspects of its
bos@583 732 design are influenced by Monotone, Mercurial focuses on ease of
bos@583 733 use, high performance, and scalability to very large
bos@583 734 projects.</para>
bos@583 735
bos@583 736 </sect1>
bos@583 737
bos@583 738 <sect1>
bos@583 739 <title>Colophon&emdash;this book is Free</title>
bos@26 740
bos@584 741 <para id="x_d0">This book is licensed under the Open Publication License,
bos@559 742 and is produced entirely using Free Software tools. It is
bos@580 743 typeset with DocBook XML. Illustrations are drawn and rendered with
bos@559 744 <ulink url="http://www.inkscape.org/">Inkscape</ulink>.</para>
bos@26 745
bos@584 746 <para id="x_d1">The complete source code for this book is published as a
bos@559 747 Mercurial repository, at <ulink
bos@559 748 url="http://hg.serpentine.com/mercurial/book">http://hg.serpentine.com/mercurial/book</ulink>.</para>
bos@559 749
bos@559 750 </sect1>
bos@559 751 </preface>
bos@559 752 <!--
bos@559 753 local variables:
bos@559 754 sgml-parent-document: ("00book.xml" "book" "preface")
bos@559 755 end:
bos@559 756 -->