hgbook
annotate en/ch01-intro.xml @ 1114:527b86d55d4a
inotify: update installation information
inotify is shipped in Mercurial since 1.0, which greatly simplifies the installation process
author | Nicolas Dumazet <nicdumz.commits@gmail.com> |
---|---|
date | Sun Dec 13 16:35:56 2009 +0900 (2009-12-13) |
parents | b338f5490029 |
children |
rev | line source |
---|---|
bos@559 | 1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : --> |
bos@26 | 2 |
bos@704 | 3 <chapter id="chap:intro"> |
bos@704 | 4 <?dbhtml filename="how-did-we-get-here.html"?> |
bos@704 | 5 <title>How did we get here?</title> |
bos@26 | 6 |
bos@583 | 7 <sect1> |
bos@583 | 8 <title>Why revision control? Why Mercurial?</title> |
bos@583 | 9 |
bos@584 | 10 <para id="x_6d">Revision control is the process of managing multiple |
bos@583 | 11 versions of a piece of information. In its simplest form, this |
bos@583 | 12 is something that many people do by hand: every time you modify |
bos@583 | 13 a file, save it under a new name that contains a number, each |
bos@583 | 14 one higher than the number of the preceding version.</para> |
bos@583 | 15 |
bos@584 | 16 <para id="x_6e">Manually managing multiple versions of even a single file is |
bos@583 | 17 an error-prone task, though, so software tools to help automate |
bos@583 | 18 this process have long been available. The earliest automated |
bos@583 | 19 revision control tools were intended to help a single user to |
bos@583 | 20 manage revisions of a single file. Over the past few decades, |
bos@583 | 21 the scope of revision control tools has expanded greatly; they |
bos@583 | 22 now manage multiple files, and help multiple people to work |
bos@583 | 23 together. The best modern revision control tools have no |
bos@583 | 24 problem coping with thousands of people working together on |
bos@583 | 25 projects that consist of hundreds of thousands of files.</para> |
bos@583 | 26 |
bos@584 | 27 <para id="x_6f">The arrival of distributed revision control is relatively |
bos@583 | 28 recent, and so far this new field has grown due to people's |
bos@583 | 29 willingness to explore ill-charted territory.</para> |
bos@583 | 30 |
bos@584 | 31 <para id="x_70">I am writing a book about distributed revision control |
bos@583 | 32 because I believe that it is an important subject that deserves |
bos@583 | 33 a field guide. I chose to write about Mercurial because it is |
bos@583 | 34 the easiest tool to learn the terrain with, and yet it scales to |
bos@583 | 35 the demands of real, challenging environments where many other |
bos@583 | 36 revision control tools buckle.</para> |
bos@583 | 37 |
bos@583 | 38 <sect2> |
bos@583 | 39 <title>Why use revision control?</title> |
bos@583 | 40 |
bos@584 | 41 <para id="x_71">There are a number of reasons why you or your team might |
bos@583 | 42 want to use an automated revision control tool for a |
bos@583 | 43 project.</para> |
bos@583 | 44 |
bos@583 | 45 <itemizedlist> |
bos@584 | 46 <listitem><para id="x_72">It will track the history and evolution of |
bos@583 | 47 your project, so you don't have to. For every change, |
bos@583 | 48 you'll have a log of <emphasis>who</emphasis> made it; |
bos@583 | 49 <emphasis>why</emphasis> they made it; |
bos@583 | 50 <emphasis>when</emphasis> they made it; and |
bos@583 | 51 <emphasis>what</emphasis> the change |
bos@583 | 52 was.</para></listitem> |
bos@584 | 53 <listitem><para id="x_73">When you're working with other people, |
bos@583 | 54 revision control software makes it easier for you to |
bos@583 | 55 collaborate. For example, when people more or less |
bos@583 | 56 simultaneously make potentially incompatible changes, the |
bos@583 | 57 software will help you to identify and resolve those |
bos@583 | 58 conflicts.</para></listitem> |
bos@584 | 59 <listitem><para id="x_74">It can help you to recover from mistakes. If |
bos@583 | 60 you make a change that later turns out to be in error, you |
bos@583 | 61 can revert to an earlier version of one or more files. In |
bos@583 | 62 fact, a <emphasis>really</emphasis> good revision control |
bos@583 | 63 tool will even help you to efficiently figure out exactly |
bos@592 | 64 when a problem was introduced (see <xref |
bos@583 | 65 linkend="sec:undo:bisect"/> for details).</para></listitem> |
bos@584 | 66 <listitem><para id="x_75">It will help you to work simultaneously on, |
bos@583 | 67 and manage the drift between, multiple versions of your |
bos@583 | 68 project.</para></listitem> |
bos@583 | 69 </itemizedlist> |
bos@583 | 70 |
bos@609 | 71 <para id="x_76">Most of these reasons are equally |
bos@609 | 72 valid&emdash;at least in theory&emdash;whether you're working |
bos@609 | 73 on a project by yourself, or with a hundred other |
bos@609 | 74 people.</para> |
bos@583 | 75 |
bos@584 | 76 <para id="x_77">A key question about the practicality of revision control |
bos@583 | 77 at these two different scales (<quote>lone hacker</quote> and |
bos@583 | 78 <quote>huge team</quote>) is how its |
bos@583 | 79 <emphasis>benefits</emphasis> compare to its |
bos@583 | 80 <emphasis>costs</emphasis>. A revision control tool that's |
bos@583 | 81 difficult to understand or use is going to impose a high |
bos@583 | 82 cost.</para> |
bos@583 | 83 |
bos@584 | 84 <para id="x_78">A five-hundred-person project is likely to collapse under |
bos@583 | 85 its own weight almost immediately without a revision control |
bos@583 | 86 tool and process. In this case, the cost of using revision |
bos@583 | 87 control might hardly seem worth considering, since |
bos@583 | 88 <emphasis>without</emphasis> it, failure is almost |
bos@583 | 89 guaranteed.</para> |
bos@583 | 90 |
bos@584 | 91 <para id="x_79">On the other hand, a one-person <quote>quick hack</quote> |
bos@583 | 92 might seem like a poor place to use a revision control tool, |
bos@583 | 93 because surely the cost of using one must be close to the |
bos@583 | 94 overall cost of the project. Right?</para> |
bos@583 | 95 |
bos@584 | 96 <para id="x_7a">Mercurial uniquely supports <emphasis>both</emphasis> of |
bos@583 | 97 these scales of development. You can learn the basics in just |
bos@583 | 98 a few minutes, and due to its low overhead, you can apply |
bos@583 | 99 revision control to the smallest of projects with ease. Its |
bos@583 | 100 simplicity means you won't have a lot of abstruse concepts or |
bos@583 | 101 command sequences competing for mental space with whatever |
bos@583 | 102 you're <emphasis>really</emphasis> trying to do. At the same |
bos@583 | 103 time, Mercurial's high performance and peer-to-peer nature let |
bos@583 | 104 you scale painlessly to handle large projects.</para> |
bos@583 | 105 |
bos@584 | 106 <para id="x_7b">No revision control tool can rescue a poorly run project, |
bos@583 | 107 but a good choice of tools can make a huge difference to the |
bos@583 | 108 fluidity with which you can work on a project.</para> |
bos@583 | 109 |
bos@583 | 110 </sect2> |
bos@583 | 111 |
bos@583 | 112 <sect2> |
bos@583 | 113 <title>The many names of revision control</title> |
bos@583 | 114 |
bos@584 | 115 <para id="x_7c">Revision control is a diverse field, so much so that it is |
bos@583 | 116 referred to by many names and acronyms. Here are a few of the |
bos@583 | 117 more common variations you'll encounter:</para> |
bos@583 | 118 <itemizedlist> |
bos@584 | 119 <listitem><para id="x_7d">Revision control (RCS)</para></listitem> |
bos@584 | 120 <listitem><para id="x_7e">Software configuration management (SCM), or |
bos@583 | 121 configuration management</para></listitem> |
bos@584 | 122 <listitem><para id="x_7f">Source code management</para></listitem> |
bos@584 | 123 <listitem><para id="x_80">Source code control, or source |
bos@583 | 124 control</para></listitem> |
bos@584 | 125 <listitem><para id="x_81">Version control |
bos@583 | 126 (VCS)</para></listitem></itemizedlist> |
bos@584 | 127 <para id="x_82">Some people claim that these terms actually have different |
bos@583 | 128 meanings, but in practice they overlap so much that there's no |
bos@583 | 129 agreed or even useful way to tease them apart.</para> |
bos@583 | 130 |
bos@583 | 131 </sect2> |
bos@583 | 132 </sect1> |
bos@26 | 133 |
bos@559 | 134 <sect1> |
bos@559 | 135 <title>About the examples in this book</title> |
bos@200 | 136 |
bos@584 | 137 <para id="x_84">This book takes an unusual approach to code samples. Every |
bos@609 | 138 example is <quote>live</quote>&emdash;each one is actually the result |
bos@559 | 139 of a shell script that executes the Mercurial commands you see. |
bos@559 | 140 Every time an image of the book is built from its sources, all |
bos@559 | 141 the example scripts are automatically run, and their current |
bos@559 | 142 results compared against their expected results.</para> |
bos@200 | 143 |
bos@584 | 144 <para id="x_85">The advantage of this approach is that the examples are |
bos@559 | 145 always accurate; they describe <emphasis>exactly</emphasis> the |
bos@672 | 146 behavior of the version of Mercurial that's mentioned at the |
bos@559 | 147 front of the book. If I update the version of Mercurial that |
bos@559 | 148 I'm documenting, and the output of some command changes, the |
bos@559 | 149 build fails.</para> |
bos@200 | 150 |
bos@584 | 151 <para id="x_86">There is a small disadvantage to this approach, which is |
bos@559 | 152 that the dates and times you'll see in examples tend to be |
bos@559 | 153 <quote>squashed</quote> together in a way that they wouldn't be |
bos@559 | 154 if the same commands were being typed by a human. Where a human |
bos@559 | 155 can issue no more than one command every few seconds, with any |
bos@559 | 156 resulting timestamps correspondingly spread out, my automated |
bos@559 | 157 example scripts run many commands in one second.</para> |
bos@200 | 158 |
bos@584 | 159 <para id="x_87">As an instance of this, several consecutive commits in an |
bos@559 | 160 example can show up as having occurred during the same second. |
bos@559 | 161 You can see this occur in the <literal |
bos@592 | 162 role="hg-ext">bisect</literal> example in <xref |
bos@589 | 163 linkend="sec:undo:bisect"/>, for instance.</para> |
bos@200 | 164 |
bos@584 | 165 <para id="x_88">So when you're reading examples, don't place too much weight |
bos@559 | 166 on the dates or times you see in the output of commands. But |
bos@672 | 167 <emphasis>do</emphasis> be confident that the behavior you're |
bos@559 | 168 seeing is consistent and reproducible.</para> |
bos@26 | 169 |
bos@559 | 170 </sect1> |
bos@583 | 171 |
bos@583 | 172 <sect1> |
bos@583 | 173 <title>Trends in the field</title> |
bos@583 | 174 |
bos@584 | 175 <para id="x_89">There has been an unmistakable trend in the development and |
bos@583 | 176 use of revision control tools over the past four decades, as |
bos@583 | 177 people have become familiar with the capabilities of their tools |
bos@583 | 178 and constrained by their limitations.</para> |
bos@583 | 179 |
bos@584 | 180 <para id="x_8a">The first generation began by managing single files on |
bos@583 | 181 individual computers. Although these tools represented a huge |
bos@583 | 182 advance over ad-hoc manual revision control, their locking model |
bos@583 | 183 and reliance on a single computer limited them to small, |
bos@583 | 184 tightly-knit teams.</para> |
bos@583 | 185 |
bos@584 | 186 <para id="x_8b">The second generation loosened these constraints by moving |
bos@583 | 187 to network-centered architectures, and managing entire projects |
bos@583 | 188 at a time. As projects grew larger, they ran into new problems. |
bos@583 | 189 With clients needing to talk to servers very frequently, server |
bos@583 | 190 scaling became an issue for large projects. An unreliable |
bos@583 | 191 network connection could prevent remote users from being able to |
bos@583 | 192 talk to the server at all. As open source projects started |
bos@583 | 193 making read-only access available anonymously to anyone, people |
bos@583 | 194 without commit privileges found that they could not use the |
bos@583 | 195 tools to interact with a project in a natural way, as they could |
bos@583 | 196 not record their changes.</para> |
bos@583 | 197 |
bos@584 | 198 <para id="x_8c">The current generation of revision control tools is |
bos@583 | 199 peer-to-peer in nature. All of these systems have dropped the |
bos@583 | 200 dependency on a single central server, and allow people to |
bos@583 | 201 distribute their revision control data to where it's actually |
bos@583 | 202 needed. Collaboration over the Internet has moved from |
bos@583 | 203 being constrained by technology to a matter of choice and consensus. |
bos@583 | 204 Modern tools can operate offline indefinitely and autonomously, |
bos@583 | 205 with a network connection only needed when syncing changes with |
bos@583 | 206 another repository.</para> |
bos@583 | 207 |
bos@583 | 208 </sect1> |
bos@583 | 209 <sect1> |
bos@583 | 210 <title>A few of the advantages of distributed revision |
bos@583 | 211 control</title> |
bos@583 | 212 |
bos@584 | 213 <para id="x_8d">Even though distributed revision control tools have for |
bos@583 | 214 several years been as robust and usable as their |
bos@583 | 215 previous-generation counterparts, people using older tools have |
bos@583 | 216 not yet necessarily woken up to their advantages. There are a |
bos@583 | 217 number of ways in which distributed tools shine relative to |
bos@583 | 218 centralised ones.</para> |
bos@583 | 219 |
bos@584 | 220 <para id="x_8e">For an individual developer, distributed tools are almost |
bos@583 | 221 always much faster than centralised tools. This is for a simple |
bos@583 | 222 reason: a centralised tool needs to talk over the network for |
bos@583 | 223 many common operations, because most metadata is stored in a |
bos@583 | 224 single copy on the central server. A distributed tool stores |
bos@583 | 225 all of its metadata locally. All else being equal, talking over |
bos@583 | 226 the network adds overhead to a centralised tool. Don't |
bos@583 | 227 underestimate the value of a snappy, responsive tool: you're |
bos@583 | 228 going to spend a lot of time interacting with your revision |
bos@583 | 229 control software.</para> |
bos@583 | 230 |
bos@584 | 231 <para id="x_8f">Distributed tools are indifferent to the vagaries of your |
bos@583 | 232 server infrastructure, again because they replicate metadata to |
bos@583 | 233 so many locations. If you use a centralised system and your |
bos@583 | 234 server catches fire, you'd better hope that your backup media |
bos@583 | 235 are reliable, and that your last backup was recent and actually |
bos@583 | 236 worked. With a distributed tool, you have many backups |
bos@583 | 237 available on every contributor's computer.</para> |
bos@583 | 238 |
bos@584 | 239 <para id="x_90">The reliability of your network will affect distributed |
bos@583 | 240 tools far less than it will centralised tools. You can't even |
bos@583 | 241 use a centralised tool without a network connection, except for |
bos@583 | 242 a few highly constrained commands. With a distributed tool, if |
bos@583 | 243 your network connection goes down while you're working, you may |
bos@583 | 244 not even notice. The only thing you won't be able to do is talk |
bos@583 | 245 to repositories on other computers, something that is relatively |
bos@583 | 246 rare compared with local operations. If you have a far-flung |
bos@583 | 247 team of collaborators, this may be significant.</para> |
bos@583 | 248 |
bos@583 | 249 <sect2> |
bos@583 | 250 <title>Advantages for open source projects</title> |
bos@583 | 251 |
bos@584 | 252 <para id="x_91">If you take a shine to an open source project and decide |
bos@583 | 253 that you would like to start hacking on it, and that project |
bos@583 | 254 uses a distributed revision control tool, you are at once a |
bos@583 | 255 peer with the people who consider themselves the |
bos@583 | 256 <quote>core</quote> of that project. If they publish their |
bos@583 | 257 repositories, you can immediately copy their project history, |
bos@583 | 258 start making changes, and record your work, using the same |
bos@583 | 259 tools in the same ways as insiders. By contrast, with a |
bos@583 | 260 centralised tool, you must use the software in a <quote>read |
bos@583 | 261 only</quote> mode unless someone grants you permission to |
bos@583 | 262 commit changes to their central server. Until then, you won't |
bos@583 | 263 be able to record changes, and your local modifications will |
bos@583 | 264 be at risk of corruption any time you try to update your |
bos@583 | 265 client's view of the repository.</para> |
bos@583 | 266 |
bos@583 | 267 <sect3> |
bos@583 | 268 <title>The forking non-problem</title> |
bos@583 | 269 |
bos@584 | 270 <para id="x_92">It has been suggested that distributed revision control |
bos@583 | 271 tools pose some sort of risk to open source projects because |
bos@583 | 272 they make it easy to <quote>fork</quote> the development of |
bos@583 | 273 a project. A fork happens when there are differences in |
bos@583 | 274 opinion or attitude between groups of developers that cause |
bos@583 | 275 them to decide that they can't work together any longer. |
bos@583 | 276 Each side takes a more or less complete copy of the |
bos@583 | 277 project's source code, and goes off in its own |
bos@583 | 278 direction.</para> |
bos@583 | 279 |
bos@584 | 280 <para id="x_93">Sometimes the camps in a fork decide to reconcile their |
bos@583 | 281 differences. With a centralised revision control system, the |
bos@583 | 282 <emphasis>technical</emphasis> process of reconciliation is |
bos@583 | 283 painful, and has to be performed largely by hand. You have |
bos@583 | 284 to decide whose revision history is going to |
bos@583 | 285 <quote>win</quote>, and graft the other team's changes into |
bos@583 | 286 the tree somehow. This usually loses some or all of one |
bos@583 | 287 side's revision history.</para> |
bos@583 | 288 |
bos@584 | 289 <para id="x_94">What distributed tools do with respect to forking is |
bos@583 | 290 to make forking the <emphasis>only</emphasis> way to |
bos@583 | 291 develop a project. Every single change that you make is |
bos@583 | 292 potentially a fork point. The great strength of this |
bos@583 | 293 approach is that a distributed revision control tool has to |
bos@583 | 294 be really good at <emphasis>merging</emphasis> forks, |
bos@583 | 295 because forks are absolutely fundamental: they happen all |
bos@583 | 296 the time.</para> |
bos@583 | 297 |
bos@584 | 298 <para id="x_95">If every piece of work that everybody does, all the |
bos@583 | 299 time, is framed in terms of forking and merging, then what |
bos@583 | 300 the open source world refers to as a <quote>fork</quote> |
bos@583 | 301 becomes <emphasis>purely</emphasis> a social issue. If |
bos@583 | 302 anything, distributed tools <emphasis>lower</emphasis> the |
bos@583 | 303 likelihood of a fork:</para> |
bos@583 | 304 <itemizedlist> |
bos@584 | 305 <listitem><para id="x_96">They eliminate the social distinction that |
bos@583 | 306 centralised tools impose: that between insiders (people |
bos@583 | 307 with commit access) and outsiders (people |
bos@583 | 308 without).</para></listitem> |
bos@584 | 309 <listitem><para id="x_97">They make it easier to reconcile after a |
bos@583 | 310 social fork, because all that's involved from the |
bos@583 | 311 perspective of the revision control software is just |
bos@583 | 312 another merge.</para></listitem></itemizedlist> |
bos@583 | 313 |
bos@584 | 314 <para id="x_98">Some people resist distributed tools because they want |
bos@583 | 315 to retain tight control over their projects, and they |
bos@583 | 316 believe that centralised tools give them this control. |
bos@583 | 317 However, if you're of this belief, and you publish your CVS |
bos@583 | 318 or Subversion repositories publicly, there are plenty of |
bos@583 | 319 tools available that can pull out your entire project's |
bos@583 | 320 history (albeit slowly) and recreate it somewhere that you |
bos@583 | 321 don't control. So while your control in this case is |
bos@583 | 322 illusory, you are forgoing the ability to fluidly |
bos@583 | 323 collaborate with whatever people feel compelled to mirror |
bos@583 | 324 and fork your history.</para> |
bos@583 | 325 |
bos@583 | 326 </sect3> |
bos@583 | 327 </sect2> |
bos@583 | 328 <sect2> |
bos@583 | 329 <title>Advantages for commercial projects</title> |
bos@583 | 330 |
bos@584 | 331 <para id="x_99">Many commercial projects are undertaken by teams that are |
bos@583 | 332 scattered across the globe. Contributors who are far from a |
bos@583 | 333 central server will see slower command execution and perhaps |
bos@583 | 334 less reliability. Commercial revision control systems attempt |
bos@583 | 335 to ameliorate these problems with remote-site replication |
bos@583 | 336 add-ons that are typically expensive to buy and cantankerous |
bos@583 | 337 to administer. A distributed system doesn't suffer from these |
bos@583 | 338 problems in the first place. Better yet, you can easily set |
bos@583 | 339 up multiple authoritative servers, say one per site, so that |
bos@583 | 340 there's no redundant communication between repositories over |
bos@583 | 341 expensive long-haul network links.</para> |
bos@583 | 342 |
bos@584 | 343 <para id="x_9a">Centralised revision control systems tend to have |
bos@583 | 344 relatively low scalability. It's not unusual for an expensive |
bos@583 | 345 centralised system to fall over under the combined load of |
bos@583 | 346 just a few dozen concurrent users. Once again, the typical |
bos@583 | 347 response tends to be an expensive and clunky replication |
bos@609 | 348 facility. Since the load on a central server&emdash;if you have |
bos@609 | 349 one at all&emdash;is many times lower with a distributed tool |
bos@583 | 350 (because all of the data is replicated everywhere), a single |
bos@583 | 351 cheap server can handle the needs of a much larger team, and |
bos@583 | 352 replication to balance load becomes a simple matter of |
bos@583 | 353 scripting.</para> |
bos@583 | 354 |
bos@584 | 355 <para id="x_9b">If you have an employee in the field, troubleshooting a |
bos@583 | 356 problem at a customer's site, they'll benefit from distributed |
bos@583 | 357 revision control. The tool will let them generate custom |
bos@583 | 358 builds, try different fixes in isolation from each other, and |
bos@583 | 359 search efficiently through history for the sources of bugs and |
bos@583 | 360 regressions in the customer's environment, all without needing |
bos@583 | 361 to connect to your company's network.</para> |
bos@583 | 362 |
bos@583 | 363 </sect2> |
bos@583 | 364 </sect1> |
bos@583 | 365 <sect1> |
bos@583 | 366 <title>Why choose Mercurial?</title> |
bos@583 | 367 |
bos@584 | 368 <para id="x_9c">Mercurial has a unique set of properties that make it a |
bos@583 | 369 particularly good choice as a revision control system.</para> |
bos@583 | 370 <itemizedlist> |
bos@584 | 371 <listitem><para id="x_9d">It is easy to learn and use.</para></listitem> |
bos@584 | 372 <listitem><para id="x_9e">It is lightweight.</para></listitem> |
bos@584 | 373 <listitem><para id="x_9f">It scales excellently.</para></listitem> |
bos@584 | 374 <listitem><para id="x_a0">It is easy to |
bos@583 | 375 customise.</para></listitem></itemizedlist> |
bos@583 | 376 |
bos@584 | 377 <para id="x_a1">If you are at all familiar with revision control systems, |
bos@583 | 378 you should be able to get up and running with Mercurial in less |
bos@583 | 379 than five minutes. Even if not, it will take no more than a few |
bos@583 | 380 minutes longer. Mercurial's command and feature sets are |
bos@583 | 381 generally uniform and consistent, so you can keep track of a few |
bos@583 | 382 general rules instead of a host of exceptions.</para> |
bos@583 | 383 |
bos@584 | 384 <para id="x_a2">On a small project, you can start working with Mercurial in |
bos@583 | 385 moments. Creating new changes and branches; transferring changes |
bos@583 | 386 around (whether locally or over a network); and history and |
bos@583 | 387 status operations are all fast. Mercurial attempts to stay |
bos@583 | 388 nimble and largely out of your way by combining low cognitive |
bos@583 | 389 overhead with blazingly fast operations.</para> |
bos@583 | 390 |
bos@584 | 391 <para id="x_a3">The usefulness of Mercurial is not limited to small |
bos@583 | 392 projects: it is used by projects with hundreds to thousands of |
bos@583 | 393 contributors, each containing tens of thousands of files and |
bos@583 | 394 hundreds of megabytes of source code.</para> |
bos@583 | 395 |
bos@584 | 396 <para id="x_a4">If the core functionality of Mercurial is not enough for |
bos@583 | 397 you, it's easy to build on. Mercurial is well suited to |
bos@583 | 398 scripting tasks, and its clean internals and implementation in |
bos@583 | 399 Python make it easy to add features in the form of extensions. |
bos@583 | 400 There are a number of popular and useful extensions already |
bos@583 | 401 available, ranging from helping to identify bugs to improving |
bos@583 | 402 performance.</para> |
bos@583 | 403 |
bos@583 | 404 </sect1> |
bos@583 | 405 <sect1> |
bos@583 | 406 <title>Mercurial compared with other tools</title> |
bos@583 | 407 |
bos@584 | 408 <para id="x_a5">Before you read on, please understand that this section |
bos@583 | 409 necessarily reflects my own experiences, interests, and (dare I |
bos@583 | 410 say it) biases. I have used every one of the revision control |
bos@583 | 411 tools listed below, in most cases for several years at a |
bos@583 | 412 time.</para> |
bos@583 | 413 |
bos@583 | 414 |
bos@583 | 415 <sect2> |
bos@583 | 416 <title>Subversion</title> |
bos@583 | 417 |
bos@584 | 418 <para id="x_a6">Subversion is a popular revision control tool, developed |
bos@583 | 419 to replace CVS. It has a centralised client/server |
bos@583 | 420 architecture.</para> |
bos@583 | 421 |
bos@584 | 422 <para id="x_a7">Subversion and Mercurial have similarly named commands for |
bos@583 | 423 performing the same operations, so if you're familiar with |
bos@583 | 424 one, it is easy to learn to use the other. Both tools are |
bos@583 | 425 portable to all popular operating systems.</para> |
bos@583 | 426 |
bos@584 | 427 <para id="x_a8">Prior to version 1.5, Subversion had no useful support for |
bos@583 | 428 merges. At the time of writing, its merge tracking capability |
bos@583 | 429 is new, and known to be <ulink |
bos@583 | 430 url="http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword">complicated |
bos@583 | 431 and buggy</ulink>.</para> |
bos@583 | 432 |
bos@584 | 433 <para id="x_a9">Mercurial has a substantial performance advantage over |
bos@583 | 434 Subversion on every revision control operation I have |
bos@583 | 435 benchmarked. I have measured its advantage as ranging from a |
bos@583 | 436 factor of two to a factor of six when compared with Subversion |
bos@583 | 437 1.4.3's <emphasis>ra_local</emphasis> file store, which is the |
bos@583 | 438 fastest access method available. In more realistic |
bos@583 | 439 deployments involving a network-based store, Subversion will |
bos@583 | 440 be at a substantially larger disadvantage. Because many |
bos@583 | 441 Subversion commands must talk to the server and Subversion |
bos@583 | 442 does not have useful replication facilities, server capacity |
bos@583 | 443 and network bandwidth become bottlenecks for modestly large |
bos@583 | 444 projects.</para> |
bos@583 | 445 |
bos@584 | 446 <para id="x_aa">Additionally, Subversion incurs substantial storage |
bos@583 | 447 overhead to avoid network transactions for a few common |
bos@583 | 448 operations, such as finding modified files |
bos@583 | 449 (<literal>status</literal>) and displaying modifications |
bos@583 | 450 against the current revision (<literal>diff</literal>). As a |
bos@583 | 451 result, a Subversion working copy is often the same size as, |
bos@583 | 452 or larger than, a Mercurial repository and working directory, |
bos@583 | 453 even though the Mercurial repository contains a complete |
bos@583 | 454 history of the project.</para> |
bos@583 | 455 |
bos@584 | 456 <para id="x_ab">Subversion is widely supported by third party tools. |
bos@583 | 457 Mercurial currently lags considerably in this area. This gap |
bos@583 | 458 is closing, however, and indeed some of Mercurial's GUI tools |
bos@583 | 459 now outshine their Subversion equivalents. Like Mercurial, |
bos@583 | 460 Subversion has an excellent user manual.</para> |
bos@583 | 461 |
bos@584 | 462 <para id="x_ac">Because Subversion doesn't store revision history on the |
bos@583 | 463 client, it is well suited to managing projects that deal with |
bos@583 | 464 lots of large, opaque binary files. If you check in fifty |
bos@583 | 465 revisions to an incompressible 10MB file, Subversion's |
bos@583 | 466 client-side space usage stays constant. The space used by any |
bos@583 | 467 distributed SCM will grow rapidly in proportion to the number |
bos@583 | 468 of revisions, because the differences between each revision |
bos@583 | 469 are large.</para> |
bos@583 | 470 |
bos@584 | 471 <para id="x_ad">In addition, it's often difficult or, more usually, |
bos@583 | 472 impossible to merge different versions of a binary file. |
bos@583 | 473 Subversion's ability to let a user lock a file, so that they |
bos@583 | 474 temporarily have the exclusive right to commit changes to it, |
bos@583 | 475 can be a significant advantage to a project where binary files |
bos@583 | 476 are widely used.</para> |
bos@583 | 477 |
bos@584 | 478 <para id="x_ae">Mercurial can import revision history from a Subversion |
bos@583 | 479 repository. It can also export revision history to a |
bos@583 | 480 Subversion repository. This makes it easy to <quote>test the |
bos@583 | 481 waters</quote> and use Mercurial and Subversion in parallel |
bos@583 | 482 before deciding to switch. History conversion is incremental, |
bos@583 | 483 so you can perform an initial conversion, then small |
bos@583 | 484 additional conversions afterwards to bring in new |
bos@583 | 485 changes.</para> |
bos@583 | 486 |
bos@583 | 487 |
bos@583 | 488 </sect2> |
bos@583 | 489 <sect2> |
bos@583 | 490 <title>Git</title> |
bos@583 | 491 |
bos@584 | 492 <para id="x_af">Git is a distributed revision control tool that was |
bos@583 | 493 developed for managing the Linux kernel source tree. Like |
bos@583 | 494 Mercurial, its early design was somewhat influenced by |
bos@583 | 495 Monotone.</para> |
bos@583 | 496 |
bos@584 | 497 <para id="x_b0">Git has a very large command set, with version 1.5.0 |
bos@583 | 498 providing 139 individual commands. It has something of a |
bos@583 | 499 reputation for being difficult to learn. Compared to Git, |
bos@583 | 500 Mercurial has a strong focus on simplicity.</para> |
bos@583 | 501 |
bos@584 | 502 <para id="x_b1">In terms of performance, Git is extremely fast. In |
bos@583 | 503 several cases, it is faster than Mercurial, at least on Linux, |
bos@583 | 504 while Mercurial performs better on other operations. However, |
bos@583 | 505 on Windows, the performance and general level of support that |
bos@583 | 506 Git provides are, at the time of writing, far behind those of |
bos@583 | 507 Mercurial.</para> |
bos@583 | 508 |
bos@584 | 509 <para id="x_b2">While a Mercurial repository needs no maintenance, a Git |
bos@583 | 510 repository requires frequent manual <quote>repacks</quote> of |
bos@583 | 511 its metadata. Without these, performance degrades, while |
bos@583 | 512 space usage grows rapidly. A server that contains many Git |
bos@583 | 513 repositories that are not rigorously and frequently repacked |
bos@583 | 514 will become heavily disk-bound during backups, and there have |
bos@583 | 515 been instances of daily backups taking far longer than 24 |
bos@583 | 516 hours as a result. A freshly packed Git repository is |
bos@583 | 517 slightly smaller than a Mercurial repository, but an unpacked |
bos@583 | 518 repository is several orders of magnitude larger.</para> |
bos@583 | 519 |
bos@584 | 520 <para id="x_b3">The core of Git is written in C. Many Git commands are |
bos@583 | 521 implemented as shell or Perl scripts, and the quality of these |
bos@583 | 522 scripts varies widely. I have encountered several instances |
bos@583 | 523 where scripts charged along blindly in the presence of errors |
bos@583 | 524 that should have been fatal.</para> |
bos@583 | 525 |
bos@584 | 526 <para id="x_b4">Mercurial can import revision history from a Git |
bos@583 | 527 repository.</para> |
bos@583 | 528 |
bos@583 | 529 |
bos@583 | 530 </sect2> |
bos@583 | 531 <sect2> |
bos@583 | 532 <title>CVS</title> |
bos@583 | 533 |
bos@584 | 534 <para id="x_b5">CVS is probably the most widely used revision control tool |
bos@583 | 535 in the world. Due to its age and internal untidiness, it has |
bos@583 | 536 been only lightly maintained for many years.</para> |
bos@583 | 537 |
bos@584 | 538 <para id="x_b6">It has a centralised client/server architecture. It does |
bos@583 | 539 not group related file changes into atomic commits, making it |
bos@583 | 540 easy for people to <quote>break the build</quote>: one person |
bos@583 | 541 can successfully commit part of a change and then be blocked |
bos@583 | 542 by the need for a merge, causing other people to see only a |
bos@583 | 543 portion of the work that person intended to commit. This also affects |
bos@583 | 544 how you work with project history. If you want to see all of |
bos@583 | 545 the modifications someone made as part of a task, you will |
bos@583 | 546 need to manually inspect the descriptions and timestamps of |
bos@583 | 547 the changes made to each file involved (if you even know what |
bos@583 | 548 those files were).</para> |
bos@583 | 549 |
bos@584 | 550 <para id="x_b7">CVS has a muddled notion of tags and branches that I will |
bos@583 | 551 not even attempt to describe. It does not support renaming of |
bos@583 | 552 files or directories well, making it easy to corrupt a |
bos@583 | 553 repository. It has almost no internal consistency checking |
bos@583 | 554 capabilities, so it is usually not even possible to tell |
bos@583 | 555 whether or how a repository is corrupt. I would not recommend |
bos@583 | 556 CVS for any project, existing or new.</para> |
bos@583 | 557 |
bos@584 | 558 <para id="x_b8">Mercurial can import CVS revision history. However, there |
bos@583 | 559 are a few caveats that apply; these are true of every other |
bos@583 | 560 revision control tool's CVS importer, too. Due to CVS's lack |
bos@583 | 561 of atomic changes and unversioned filesystem hierarchy, it is |
bos@583 | 562 not possible to reconstruct CVS history completely accurately; |
bos@583 | 563 some guesswork is involved, and renames will usually not show |
bos@583 | 564 up. Because a lot of advanced CVS administration has to be |
bos@583 | 565 done by hand and is hence error-prone, it's common for CVS |
bos@583 | 566 importers to run into multiple problems with corrupted |
bos@583 | 567 repositories (completely bogus revision timestamps and files |
bos@583 | 568 that have remained locked for over a decade are just two of |
bos@583 | 569 the less interesting problems I can recall from personal |
bos@583 | 570 experience).</para> |
bos@583 | 571 |
bos@584 | 572 <para id="x_b9">Mercurial can import revision history from a CVS |
bos@583 | 573 repository.</para> |
bos@583 | 574 |
bos@583 | 575 |
bos@583 | 576 </sect2> |
bos@583 | 577 <sect2> |
bos@583 | 578 <title>Commercial tools</title> |
bos@583 | 579 |
bos@584 | 580 <para id="x_ba">Perforce has a centralised client/server architecture, |
bos@583 | 581 with no client-side caching of any data. Unlike modern |
bos@583 | 582 revision control tools, Perforce requires that a user run a |
bos@583 | 583 command to inform the server about every file they intend to |
bos@583 | 584 edit.</para> |
bos@583 | 585 |
bos@584 | 586 <para id="x_bb">The performance of Perforce is quite good for small teams, |
bos@583 | 587 but it falls off rapidly as the number of users grows beyond a |
bos@583 | 588 few dozen. Modestly large Perforce installations require the |
bos@583 | 589 deployment of proxies to cope with the load their users |
bos@583 | 590 generate.</para> |
bos@583 | 591 |
bos@583 | 592 |
bos@583 | 593 </sect2> |
bos@583 | 594 <sect2> |
bos@583 | 595 <title>Choosing a revision control tool</title> |
bos@583 | 596 |
bos@584 | 597 <para id="x_bc">With the exception of CVS, all of the tools listed above |
bos@583 | 598 have unique strengths that suit them to particular styles of |
bos@583 | 599 work. There is no single revision control tool that is best |
bos@583 | 600 in all situations.</para> |
bos@583 | 601 |
bos@584 | 602 <para id="x_bd">As an example, Subversion is a good choice for working |
bos@583 | 603 with frequently edited binary files, due to its centralised |
bos@583 | 604 nature and support for file locking.</para> |
bos@583 | 605 |
bos@584 | 606 <para id="x_be">I personally find Mercurial's properties of simplicity, |
bos@583 | 607 performance, and good merge support to be a compelling |
bos@583 | 608 combination that has served me well for several years.</para> |
bos@583 | 609 |
bos@583 | 610 |
bos@583 | 611 </sect2> |
bos@583 | 612 </sect1> |
bos@583 | 613 <sect1> |
bos@583 | 614 <title>Switching from another tool to Mercurial</title> |
bos@583 | 615 |
bos@584 | 616 <para id="x_bf">Mercurial is bundled with an extension named <literal |
bos@583 | 617 role="hg-ext">convert</literal>, which can incrementally |
bos@583 | 618 import revision history from several other revision control |
bos@583 | 619 tools. By <quote>incremental</quote>, I mean that you can |
bos@583 | 620 convert all of a project's history to date in one go, then rerun |
bos@583 | 621 the conversion later to obtain new changes that happened after |
bos@583 | 622 the initial conversion.</para> |
bos@583 | 623 |
bos@584 | 624 <para id="x_c0">The revision control tools supported by <literal |
bos@583 | 625 role="hg-ext">convert</literal> are as follows:</para> |
bos@583 | 626 <itemizedlist> |
bos@584 | 627 <listitem><para id="x_c1">Subversion</para></listitem> |
bos@584 | 628 <listitem><para id="x_c2">CVS</para></listitem> |
bos@584 | 629 <listitem><para id="x_c3">Git</para></listitem> |
bos@584 | 630 <listitem><para id="x_c4">Darcs</para></listitem></itemizedlist> |
bos@584 | 631 |
bos@584 | 632 <para id="x_c5">In addition, <literal role="hg-ext">convert</literal> can |
bos@583 | 633 export changes from Mercurial to Subversion. This makes it |
bos@583 | 634 possible to try Subversion and Mercurial in parallel before |
bos@583 | 635 committing to a switchover, without risking the loss of any |
bos@583 | 636 work.</para> |
bos@583 | 637 |
bos@584 | 638 <para id="x_c6">The <command role="hg-ext-convert">convert</command> command |
bos@583 | 639 is easy to use. Simply point it at the path or URL of the |
bos@583 | 640 source repository, optionally give it the name of the |
bos@583 | 641 destination repository, and it will start working. After the |
bos@583 | 642 initial conversion, just run the same command again to import |
bos@583 | 643 new changes.</para> |
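<para>As a rough illustration of the workflow just described, the sketch
below builds the command line for one conversion pass. It assumes that
<command>hg</command> is on the search path and enables the bundled
extension for a single invocation with <literal>--config
extensions.convert=</literal>. The helper names, the source URL, and the
destination name are hypothetical and are not part of Mercurial.</para>

```python
import subprocess
from typing import List, Optional


def build_convert_command(source: str, dest: Optional[str] = None) -> List[str]:
    """Build the argument list for `hg convert SOURCE [DEST]`."""
    # --config extensions.convert= turns on the bundled extension for this
    # one invocation, so no change to an hgrc file is assumed.
    command = ["hg", "--config", "extensions.convert=", "convert", source]
    if dest is not None:
        # The destination repository name is optional.
        command.append(dest)
    return command


def run_conversion(source: str, dest: Optional[str] = None) -> int:
    """Run one conversion pass and return hg's exit status.

    Calling this again later with the same arguments is how the
    incremental import of new changes would be triggered.
    """
    return subprocess.call(build_convert_command(source, dest))


if __name__ == "__main__":
    # Hypothetical locations; substitute a real source and destination.
    print(" ".join(build_convert_command(
        "http://svn.example.com/project/trunk", "project-hg")))
```

<para>Calling the hypothetical <literal>run_conversion</literal> helper
once performs the initial conversion. Calling it again later with the same
arguments runs the same command a second time, which is all that an
incremental update requires.</para>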
bos@583 | 644 </sect1> |
bos@583 | 645 |
bos@583 | 646 <sect1> |
bos@583 | 647 <title>A short history of revision control</title> |
bos@583 | 648 |
bos@584 | 649 <para id="x_c7">The best known of the old-time revision control tools is |
bos@583 | 650 SCCS (Source Code Control System), which Marc Rochkind wrote at |
bos@583 | 651 Bell Labs in the early 1970s. SCCS operated on individual |
bos@583 | 652 files, and required every person working on a project to have |
bos@583 | 653 access to a shared workspace on a single system. Only one |
bos@583 | 654 person could modify a file at any time; arbitration for access |
bos@583 | 655 to files was via locks. It was common for people to lock files, |
bos@583 | 656 and later forget to unlock them, preventing anyone else from |
bos@583 | 657 modifying those files without the help of an |
bos@583 | 658 administrator.</para> |
bos@583 | 659 |
bos@584 | 660 <para id="x_c8">Walter Tichy developed a free alternative to SCCS in the |
bos@583 | 661 early 1980s; he called his program RCS (Revision Control System). |
bos@583 | 662 Like SCCS, RCS required developers to work in a single shared |
bos@583 | 663 workspace, and to lock files to prevent multiple people from |
bos@583 | 664 modifying them simultaneously.</para> |
bos@583 | 665 |
bos@584 | 666 <para id="x_c9">Later in the 1980s, Dick Grune used RCS as a building block |
bos@583 | 667 for a set of shell scripts he initially called cmt, but then |
bos@583 | 668 renamed to CVS (Concurrent Versions System). The big innovation |
bos@583 | 669 of CVS was that it let developers work simultaneously and |
bos@583 | 670 somewhat independently in their own personal workspaces. The |
bos@583 | 671 personal workspaces prevented developers from stepping on each |
bos@583 | 672 other's toes all the time, as was common with SCCS and RCS. Each |
bos@583 | 673 developer had a copy of every project file, and could modify |
bos@583 | 674 their copies independently. They had to merge their edits prior |
bos@583 | 675 to committing changes to the central repository.</para> |
bos@583 | 676 |
bos@584 | 677 <para id="x_ca">Brian Berliner took Grune's original scripts and rewrote |
bos@583 | 678 them in C, releasing in 1989 the code that has since developed |
bos@583 | 679 into the modern version of CVS. CVS subsequently acquired the |
bos@583 | 680 ability to operate over a network connection, giving it a |
bos@583 | 681 client/server architecture. CVS's architecture is centralised; |
bos@583 | 682 only the server has a copy of the history of the project. Client |
bos@583 | 683 workspaces just contain copies of recent versions of the |
bos@583 | 684 project's files, and a little metadata to tell them where the |
bos@583 | 685 server is. CVS has been enormously successful; it is probably |
bos@583 | 686 the world's most widely used revision control system.</para> |
bos@583 | 687 |
bos@584 | 688 <para id="x_cb">In the early 1990s, Sun Microsystems developed an early |
bos@583 | 689 distributed revision control system, called TeamWare. A |
bos@583 | 690 TeamWare workspace contains a complete copy of the project's |
bos@583 | 691 history. TeamWare has no notion of a central repository. (CVS |
bos@583 | 692 relied upon RCS for its history storage; TeamWare used |
bos@583 | 693 SCCS.)</para> |
bos@583 | 694 |
bos@584 | 695 <para id="x_cc">As the 1990s progressed, awareness grew of a number of |
bos@583 | 696 problems with CVS. It records simultaneous changes to multiple |
bos@583 | 697 files individually, instead of grouping them together as a |
bos@583 | 698 single logically atomic operation. It does not manage its file |
bos@583 | 699 hierarchy well; it is easy to make a mess of a repository by |
bos@583 | 700 renaming files and directories. Worse, its source code is |
bos@583 | 701 difficult to read and maintain, which made the <quote>pain |
bos@583 | 702 level</quote> of fixing these architectural problems |
bos@583 | 703 prohibitive.</para> |
bos@583 | 704 |
bos@584 | 705 <para id="x_cd">In 2000, Jim Blandy and Karl Fogel, two developers who had |
bos@583 | 706 worked on CVS, started a project to replace it with a tool that |
bos@583 | 707 would have a better architecture and cleaner code. The result, |
bos@583 | 708 Subversion, does not stray from CVS's centralised client/server |
bos@583 | 709 model, but it adds multi-file atomic commits, better namespace |
bos@583 | 710 management, and a number of other features that make it a |
bos@583 | 711 generally better tool than CVS. Since its initial release, it |
bos@583 | 712 has rapidly grown in popularity.</para> |
bos@583 | 713 |
bos@584 | 714 <para id="x_ce">More or less simultaneously, Graydon Hoare began working on |
bos@583 | 715 an ambitious distributed revision control system that he named |
bos@583 | 716 Monotone. While Monotone addresses many of CVS's design flaws |
bos@583 | 717 and has a peer-to-peer architecture, it goes beyond earlier (and |
bos@583 | 718 subsequent) revision control tools in a number of innovative |
bos@583 | 719 ways. It uses cryptographic hashes as identifiers, and has an |
bos@583 | 720 integral notion of <quote>trust</quote> for code from different |
bos@583 | 721 sources.</para> |
bos@583 | 722 |
bos@584 | 723 <para id="x_cf">Mercurial began life in 2005. While a few aspects of its |
bos@583 | 724 design are influenced by Monotone, Mercurial focuses on ease of |
bos@583 | 725 use, high performance, and scalability to very large |
bos@583 | 726 projects.</para> |
bos@583 | 727 </sect1> |
bos@704 | 728 </chapter> |
bos@704 | 729 |
bos@559 | 730 <!-- |
bos@559 | 731 local variables: |
bos@704 | 732 sgml-parent-document: ("00book.xml" "book" "chapter") |
bos@559 | 733 end: |
bos@559 | 734 --> |