hgbook: es/intro.tex annotate

annotate es/intro.tex @ 402:b05e35d641e4

Copying the files from en to es and taking intro chapter

author	Igor TAmara <igor@tamarapatino.org>
date	Fri Nov 07 21:42:57 2008 -0500 (2008-11-07)
parents	04c08ad7e92e
children	4cdeb830118b

rev	line source
igor@402	1 \chapter{Introduction}
igor@402	2 \label{chap:intro}
igor@402	3
igor@402	4 \section{About revision control}
igor@402	5
igor@402	6 Revision control is the process of managing multiple versions of a
igor@402	7 piece of information. In its simplest form, this is something that
igor@402	8 many people do by hand: every time you modify a file, save it under a
igor@402	9 new name that contains a number, each one higher than the number of
igor@402	10 the preceding version.
igor@402	11
igor@402	12 Manually managing multiple versions of even a single file is an
igor@402	13 error-prone task, though, so software tools to help automate this
igor@402	14 process have long been available. The earliest automated revision
igor@402	15 control tools were intended to help a single user to manage revisions
igor@402	16 of a single file. Over the past few decades, the scope of revision
igor@402	17 control tools has expanded greatly; they now manage multiple files,
igor@402	18 and help multiple people to work together. The best modern revision
igor@402	19 control tools have no problem coping with thousands of people working
igor@402	20 together on projects that consist of hundreds of thousands of files.
igor@402	21
igor@402	22 \subsection{Why use revision control?}
igor@402	23
igor@402	24 There are a number of reasons why you or your team might want to use
igor@402	25 an automated revision control tool for a project.
igor@402	26 \begin{itemize}
igor@402	27 \item It will track the history and evolution of your project, so you
igor@402	28 don't have to. For every change, you'll have a log of \emph{who}
igor@402	29 made it; \emph{why} they made it; \emph{when} they made it; and
igor@402	30 \emph{what} the change was.
igor@402	31 \item When you're working with other people, revision control software
igor@402	32 makes it easier for you to collaborate. For example, when people
igor@402	33 more or less simultaneously make potentially incompatible changes,
igor@402	34 the software will help you to identify and resolve those conflicts.
igor@402	35 \item It can help you to recover from mistakes. If you make a change
igor@402	36 that later turns out to be in error, you can revert to an earlier
igor@402	37 version of one or more files. In fact, a \emph{really} good
igor@402	38 revision control tool will even help you to efficiently figure out
igor@402	39 exactly when a problem was introduced (see
igor@402	40 section~\ref{sec:undo:bisect} for details).
igor@402	41 \item It will help you to work simultaneously on, and manage the drift
igor@402	42 between, multiple versions of your project.
igor@402	43 \end{itemize}
igor@402	44 Most of these reasons are equally valid---at least in theory---whether
igor@402	45 you're working on a project by yourself, or with a hundred other
igor@402	46 people.
igor@402	47
igor@402	48 A key question about the practicality of revision control at these two
igor@402	49 different scales (``lone hacker'' and ``huge team'') is how its
igor@402	50 \emph{benefits} compare to its \emph{costs}. A revision control tool
igor@402	51 that's difficult to understand or use is going to impose a high cost.
igor@402	52
igor@402	53 A five-hundred-person project is likely to collapse under its own
igor@402	54 weight almost immediately without a revision control tool and process.
igor@402	55 In this case, the cost of using revision control might hardly seem
igor@402	56 worth considering, since \emph{without} it, failure is almost
igor@402	57 guaranteed.
igor@402	58
igor@402	59 On the other hand, a one-person ``quick hack'' might seem like a poor
igor@402	60 place to use a revision control tool, because surely the cost of using
igor@402	61 one must be close to the overall cost of the project. Right?
igor@402	62
igor@402	63 Mercurial uniquely supports \emph{both} of these scales of
igor@402	64 development. You can learn the basics in just a few minutes, and due
igor@402	65 to its low overhead, you can apply revision control to the smallest of
igor@402	66 projects with ease. Its simplicity means you won't have a lot of
igor@402	67 abstruse concepts or command sequences competing for mental space with
igor@402	68 whatever you're \emph{really} trying to do. At the same time,
igor@402	69 Mercurial's high performance and peer-to-peer nature let you scale
igor@402	70 painlessly to handle large projects.
igor@402	71
igor@402	72 No revision control tool can rescue a poorly run project, but a good
igor@402	73 choice of tools can make a huge difference to the fluidity with which
igor@402	74 you can work on a project.
igor@402	75
igor@402	76 \subsection{The many names of revision control}
igor@402	77
igor@402	78 Revision control is a diverse field, so much so that it doesn't
igor@402	79 actually have a single name or acronym. Here are a few of the more
igor@402	80 common names and acronyms you'll encounter:
igor@402	81 \begin{itemize}
igor@402	82 \item Revision control (RCS)
igor@402	83 \item Software configuration management (SCM), or configuration management
igor@402	84 \item Source code management
igor@402	85 \item Source code control, or source control
igor@402	86 \item Version control (VCS)
igor@402	87 \end{itemize}
igor@402	88 Some people claim that these terms actually have different meanings,
igor@402	89 but in practice they overlap so much that there's no agreed or even
igor@402	90 useful way to tease them apart.
igor@402	91
igor@402	92 \section{A short history of revision control}
igor@402	93
igor@402	94 The best known of the old-time revision control tools is SCCS (Source
igor@402	95 Code Control System), which Marc Rochkind wrote at Bell Labs in the
igor@402	96 early 1970s. SCCS operated on individual files, and required every
igor@402	97 person working on a project to have access to a shared workspace on a
igor@402	98 single system. Only one person could modify a file at any time;
igor@402	99 arbitration for access to files was via locks. It was common for
igor@402	100 people to lock files, and later forget to unlock them, preventing
igor@402	101 anyone else from modifying those files without the help of an
igor@402	102 administrator.
igor@402	103
igor@402	104 Walter Tichy developed a free alternative to SCCS in the early 1980s;
igor@402	105 he called his program RCS (Revision Control System). Like SCCS, RCS
igor@402	106 required developers to work in a single shared workspace, and to lock
igor@402	107 files to prevent multiple people from modifying them simultaneously.
igor@402	108
igor@402	109 Later in the 1980s, Dick Grune used RCS as a building block for a set
igor@402	110 of shell scripts he initially called cmt, but then renamed to CVS
igor@402	111 (Concurrent Versions System). The big innovation of CVS was that it
igor@402	112 let developers work simultaneously and somewhat independently in their
igor@402	113 own personal workspaces. The personal workspaces prevented developers
igor@402	114 from stepping on each other's toes all the time, as was common with
igor@402	115 SCCS and RCS. Each developer had a copy of every project file, and
igor@402	116 could modify their copies independently. They had to merge their
igor@402	117 edits prior to committing changes to the central repository.
igor@402	118
igor@402	119 Brian Berliner took Grune's original scripts and rewrote them in~C,
igor@402	120 releasing in 1989 the code that has since developed into the modern
igor@402	121 version of CVS. CVS subsequently acquired the ability to operate over
igor@402	122 a network connection, giving it a client/server architecture. CVS's
igor@402	123 architecture is centralised; only the server has a copy of the history
igor@402	124 of the project. Client workspaces just contain copies of recent
igor@402	125 versions of the project's files, and a little metadata to tell them
igor@402	126 where the server is. CVS has been enormously successful; it is
igor@402	127 probably the world's most widely used revision control system.
igor@402	128
igor@402	129 In the early 1990s, Sun Microsystems developed an early distributed
igor@402	130 revision control system, called TeamWare. A TeamWare workspace
igor@402	131 contains a complete copy of the project's history. TeamWare has no
igor@402	132 notion of a central repository. (CVS relied upon RCS for its history
igor@402	133 storage; TeamWare used SCCS.)
igor@402	134
igor@402	135 As the 1990s progressed, awareness grew of a number of problems with
igor@402	136 CVS. It records simultaneous changes to multiple files individually,
igor@402	137 instead of grouping them together as a single logically atomic
igor@402	138 operation. It does not manage its file hierarchy well; it is easy to
igor@402	139 make a mess of a repository by renaming files and directories. Worse,
igor@402	140 its source code is difficult to read and maintain, which made the
igor@402	141 ``pain level'' of fixing these architectural problems prohibitive.
igor@402	142
igor@402	143 In 2001, Jim Blandy and Karl Fogel, two developers who had worked on
igor@402	144 CVS, started a project to replace it with a tool that would have a
igor@402	145 better architecture and cleaner code. The result, Subversion, does
igor@402	146 not stray from CVS's centralised client/server model, but it adds
igor@402	147 multi-file atomic commits, better namespace management, and a number
igor@402	148 of other features that make it a generally better tool than CVS.
igor@402	149 Since its initial release, it has rapidly grown in popularity.
igor@402	150
igor@402	151 More or less simultaneously, Graydon Hoare began working on an
igor@402	152 ambitious distributed revision control system that he named Monotone.
igor@402	153 While Monotone addresses many of CVS's design flaws and has a
igor@402	154 peer-to-peer architecture, it goes beyond earlier (and subsequent)
igor@402	155 revision control tools in a number of innovative ways. It uses
igor@402	156 cryptographic hashes as identifiers, and has an integral notion of
igor@402	157 ``trust'' for code from different sources.
igor@402	158
igor@402	159 Mercurial began life in 2005. While a few aspects of its design are
igor@402	160 influenced by Monotone, Mercurial focuses on ease of use, high
igor@402	161 performance, and scalability to very large projects.
igor@402	162
igor@402	163 \section{Trends in revision control}
igor@402	164
igor@402	165 There has been an unmistakable trend in the development and use of
igor@402	166 revision control tools over the past four decades, as people have
igor@402	167 become familiar with the capabilities of their tools and constrained
igor@402	168 by their limitations.
igor@402	169
igor@402	170 The first generation began by managing single files on individual
igor@402	171 computers. Although these tools represented a huge advance over
igor@402	172 ad-hoc manual revision control, their locking model and reliance on a
igor@402	173 single computer limited them to small, tightly knit teams.
igor@402	174
igor@402	175 The second generation loosened these constraints by moving to
igor@402	176 network-centered architectures, and managing entire projects at a
igor@402	177 time. As projects grew larger, they ran into new problems. With
igor@402	178 clients needing to talk to servers very frequently, server scaling
igor@402	179 became an issue for large projects. An unreliable network connection
igor@402	180 could prevent remote users from being able to talk to the server at
igor@402	181 all. As open source projects started making read-only access
igor@402	182 available anonymously to anyone, people without commit privileges
igor@402	183 found that they could not use the tools to interact with a project in
igor@402	184 a natural way, as they could not record their changes.
igor@402	185
igor@402	186 The current generation of revision control tools is peer-to-peer in
igor@402	187 nature. All of these systems have dropped the dependency on a single
igor@402	188 central server, and allow people to distribute their revision control
igor@402	189 data to where it's actually needed. Collaboration over the Internet
igor@402	190 has moved from being constrained by technology to a matter of choice and
igor@402	191 consensus. Modern tools can operate offline indefinitely and
igor@402	192 autonomously, with a network connection only needed when syncing
igor@402	193 changes with another repository.
igor@402	194
igor@402	195 \section{A few of the advantages of distributed revision control}
igor@402	196
igor@402	197 Even though distributed revision control tools have for several years
igor@402	198 been as robust and usable as their previous-generation counterparts,
igor@402	199 people using older tools have not yet necessarily woken up to their
igor@402	200 advantages. There are a number of ways in which distributed tools
igor@402	201 shine relative to centralised ones.
igor@402	202
igor@402	203 For an individual developer, distributed tools are almost always much
igor@402	204 faster than centralised tools. This is for a simple reason: a
igor@402	205 centralised tool needs to talk over the network for many common
igor@402	206 operations, because most metadata is stored in a single copy on the
igor@402	207 central server. A distributed tool stores all of its metadata
igor@402	208 locally. All else being equal, talking over the network adds overhead
igor@402	209 to a centralised tool. Don't underestimate the value of a snappy,
igor@402	210 responsive tool: you're going to spend a lot of time interacting with
igor@402	211 your revision control software.
igor@402	212
igor@402	213 Distributed tools are indifferent to the vagaries of your server
igor@402	214 infrastructure, again because they replicate metadata to so many
igor@402	215 locations. If you use a centralised system and your server catches
igor@402	216 fire, you'd better hope that your backup media are reliable, and that
igor@402	217 your last backup was recent and actually worked. With a distributed
igor@402	218 tool, you have many backups available on every contributor's computer.
igor@402	219
igor@402	220 The reliability of your network will affect distributed tools far less
igor@402	221 than it will centralised tools. You can't even use a centralised tool
igor@402	222 without a network connection, except for a few highly constrained
igor@402	223 commands. With a distributed tool, if your network connection goes
igor@402	224 down while you're working, you may not even notice. The only thing
igor@402	225 you won't be able to do is talk to repositories on other computers,
igor@402	226 something that is relatively rare compared with local operations. If
igor@402	227 you have a far-flung team of collaborators, this may be significant.
igor@402	228
igor@402	229 \subsection{Advantages for open source projects}
igor@402	230
igor@402	231 If you take a shine to an open source project and decide that you
igor@402	232 would like to start hacking on it, and that project uses a distributed
igor@402	233 revision control tool, you are at once a peer with the people who
igor@402	234 consider themselves the ``core'' of that project. If they publish
igor@402	235 their repositories, you can immediately copy their project history,
igor@402	236 start making changes, and record your work, using the same tools in
igor@402	237 the same ways as insiders. By contrast, with a centralised tool, you
igor@402	238 must use the software in a ``read only'' mode unless someone grants
igor@402	239 you permission to commit changes to their central server. Until then,
igor@402	240 you won't be able to record changes, and your local modifications will
igor@402	241 be at risk of corruption any time you try to update your client's view
igor@402	242 of the repository.
igor@402	243
igor@402	244 \subsubsection{The forking non-problem}
igor@402	245
igor@402	246 It has been suggested that distributed revision control tools pose
igor@402	247 some sort of risk to open source projects because they make it easy to
igor@402	248 ``fork'' the development of a project. A fork happens when there are
igor@402	249 differences in opinion or attitude between groups of developers that
igor@402	250 cause them to decide that they can't work together any longer. Each
igor@402	251 side takes a more or less complete copy of the project's source code,
igor@402	252 and goes off in its own direction.
igor@402	253
igor@402	254 Sometimes the camps in a fork decide to reconcile their differences.
igor@402	255 With a centralised revision control system, the \emph{technical}
igor@402	256 process of reconciliation is painful, and has to be performed largely
igor@402	257 by hand. You have to decide whose revision history is going to
igor@402	258 ``win'', and graft the other team's changes into the tree somehow.
igor@402	259 This usually loses some or all of one side's revision history.
igor@402	260
igor@402	261 What distributed tools do with respect to forking is to make forking
igor@402	262 the \emph{only} way to develop a project. Every single change that
igor@402	263 you make is potentially a fork point. The great strength of this
igor@402	264 approach is that a distributed revision control tool has to be really
igor@402	265 good at \emph{merging} forks, because forks are absolutely
igor@402	266 fundamental: they happen all the time.
igor@402	267
igor@402	268 If every piece of work that everybody does, all the time, is framed in
igor@402	269 terms of forking and merging, then what the open source world refers
igor@402	270 to as a ``fork'' becomes \emph{purely} a social issue. If anything,
igor@402	271 distributed tools \emph{lower} the likelihood of a fork:
igor@402	272 \begin{itemize}
igor@402	273 \item They eliminate the social distinction that centralised tools
igor@402	274 impose: that between insiders (people with commit access) and
igor@402	275 outsiders (people without).
igor@402	276 \item They make it easier to reconcile after a social fork, because
igor@402	277 all that's involved from the perspective of the revision control
igor@402	278 software is just another merge.
igor@402	279 \end{itemize}
igor@402	280
igor@402	281 Some people resist distributed tools because they want to retain tight
igor@402	282 control over their projects, and they believe that centralised tools
igor@402	283 give them this control. However, if you're of this belief, and you
igor@402	284 publish your CVS or Subversion repositories publicly, there are
igor@402	285 plenty of tools available that can pull out your entire project's
igor@402	286 history (albeit slowly) and recreate it somewhere that you don't
igor@402	287 control. So while your control in this case is illusory, you are
igor@402	288 forgoing the ability to fluidly collaborate with whatever people feel
igor@402	289 compelled to mirror and fork your history.
igor@402	290
igor@402	291 \subsection{Advantages for commercial projects}
igor@402	292
igor@402	293 Many commercial projects are undertaken by teams that are scattered
igor@402	294 across the globe. Contributors who are far from a central server will
igor@402	295 see slower command execution and perhaps less reliability. Commercial
igor@402	296 revision control systems attempt to ameliorate these problems with
igor@402	297 remote-site replication add-ons that are typically expensive to buy
igor@402	298 and cantankerous to administer. A distributed system doesn't suffer
igor@402	299 from these problems in the first place. Better yet, you can easily
igor@402	300 set up multiple authoritative servers, say one per site, so that
igor@402	301 there's no redundant communication between repositories over expensive
igor@402	302 long-haul network links.
igor@402	303
igor@402	304 Centralised revision control systems tend to have relatively low
igor@402	305 scalability. It's not unusual for an expensive centralised system to
igor@402	306 fall over under the combined load of just a few dozen concurrent
igor@402	307 users. Once again, the typical response tends to be an expensive and
igor@402	308 clunky replication facility. Since the load on a central server---if
igor@402	309 you have one at all---is many times lower with a distributed
igor@402	310 tool (because all of the data is replicated everywhere), a single
igor@402	311 cheap server can handle the needs of a much larger team, and
igor@402	312 replication to balance load becomes a simple matter of scripting.
igor@402	313
igor@402	314 If you have an employee in the field, troubleshooting a problem at a
igor@402	315 customer's site, they'll benefit from distributed revision control.
igor@402	316 The tool will let them generate custom builds, try different fixes in
igor@402	317 isolation from each other, and search efficiently through history for
igor@402	318 the sources of bugs and regressions in the customer's environment, all
igor@402	319 without needing to connect to your company's network.
igor@402	320
igor@402	321 \section{Why choose Mercurial?}
igor@402	322
igor@402	323 Mercurial has a unique set of properties that make it a particularly
igor@402	324 good choice as a revision control system.
igor@402	325 \begin{itemize}
igor@402	326 \item It is easy to learn and use.
igor@402	327 \item It is lightweight.
igor@402	328 \item It scales excellently.
igor@402	329 \item It is easy to customise.
igor@402	330 \end{itemize}
igor@402	331
igor@402	332 If you are at all familiar with revision control systems, you should
igor@402	333 be able to get up and running with Mercurial in less than five
igor@402	334 minutes. Even if not, it will take no more than a few minutes
igor@402	335 longer. Mercurial's command and feature sets are generally uniform
igor@402	336 and consistent, so you can keep track of a few general rules instead
igor@402	337 of a host of exceptions.
igor@402	338
igor@402	339 On a small project, you can start working with Mercurial in moments.
igor@402	340 Creating new changes and branches; transferring changes around
igor@402	341 (whether locally or over a network); and history and status operations
igor@402	342 are all fast. Mercurial attempts to stay nimble and largely out of
igor@402	343 your way by combining low cognitive overhead with blazingly fast
igor@402	344 operations.
igor@402	345
igor@402	346 The usefulness of Mercurial is not limited to small projects: it is
igor@402	347 used by projects with hundreds to thousands of contributors, each
igor@402	348 project containing tens of thousands of files and hundreds of megabytes of
igor@402	349 source code.
igor@402	350
igor@402	351 If the core functionality of Mercurial is not enough for you, it's
igor@402	352 easy to build on. Mercurial is well suited to scripting tasks, and
igor@402	353 its clean internals and implementation in Python make it easy to add
igor@402	354 features in the form of extensions. There are a number of popular and
igor@402	355 useful extensions already available, ranging from helping to identify
igor@402	356 bugs to improving performance.
igor@402	357
igor@402	358 \section{Mercurial compared with other tools}
igor@402	359
igor@402	360 Before you read on, please understand that this section necessarily
igor@402	361 reflects my own experiences, interests, and (dare I say it) biases. I
igor@402	362 have used every one of the revision control tools listed below, in
igor@402	363 most cases for several years at a time.
igor@402	364
igor@402	365
igor@402	366 \subsection{Subversion}
igor@402	367
igor@402	368 Subversion is a popular revision control tool, developed to replace
igor@402	369 CVS. It has a centralised client/server architecture.
igor@402	370
igor@402	371 Subversion and Mercurial have similarly named commands for performing
igor@402	372 the same operations, so if you're familiar with one, it is easy to
igor@402	373 learn to use the other. Both tools are portable to all popular
igor@402	374 operating systems.
igor@402	375
igor@402	376 Prior to version 1.5, Subversion had no useful support for merges.
igor@402	377 At the time of writing, its merge tracking capability is new, and known to be
igor@402	378 \href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated
igor@402	379 and buggy}.
igor@402	380
igor@402	381 Mercurial has a substantial performance advantage over Subversion on
igor@402	382 every revision control operation I have benchmarked. I have measured
igor@402	383 its advantage as ranging from a factor of two to a factor of six when
igor@402	384 compared with Subversion~1.4.3's \emph{ra\_local} file store, which is
igor@402	385 the fastest access method available. In more realistic deployments
igor@402	386 involving a network-based store, Subversion will be at a substantially
igor@402	387 larger disadvantage. Because many Subversion commands must talk to
igor@402	388 the server and Subversion does not have useful replication facilities,
igor@402	389 server capacity and network bandwidth become bottlenecks for modestly
igor@402	390 large projects.
igor@402	391
igor@402	392 Additionally, Subversion incurs substantial storage overhead to avoid
igor@402	393 network transactions for a few common operations, such as finding
igor@402	394 modified files (\texttt{status}) and displaying modifications against
igor@402	395 the current revision (\texttt{diff}). As a result, a Subversion
igor@402	396 working copy is often the same size as, or larger than, a Mercurial
igor@402	397 repository and working directory, even though the Mercurial repository
igor@402	398 contains a complete history of the project.
igor@402	399
igor@402	400 Subversion is widely supported by third party tools. Mercurial
igor@402	401 currently lags considerably in this area. This gap is closing,
igor@402	402 however, and indeed some of Mercurial's GUI tools now outshine their
igor@402	403 Subversion equivalents. Like Mercurial, Subversion has an excellent
igor@402	404 user manual.
igor@402	405
igor@402	406 Because Subversion doesn't store revision history on the client, it is
igor@402	407 well suited to managing projects that deal with lots of large, opaque
igor@402	408 binary files. If you check in fifty revisions to an incompressible
igor@402	409 10MB file, Subversion's client-side space usage stays constant. The
igor@402	410 space used by any distributed SCM will grow rapidly in proportion to
igor@402	411 the number of revisions, because the differences between each revision
igor@402	412 are large.
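
As a rough, back-of-the-envelope illustration of the example above
(assuming, purely for the sake of the arithmetic, that an
incompressible file gains nothing from delta storage, so that every
revision is kept essentially in full), a distributed repository would
need on the order of
\[
  50 \;\mbox{revisions} \times 10\,\mbox{MB per revision} = 500\,\mbox{MB}
\]
for that one file, while the space Subversion uses on the client does
not depend on the number of revisions. Real figures will vary with
the tool and with the contents of the file.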
igor@402	413
igor@402	414 In addition, it's often difficult or, more usually, impossible to
igor@402	415 merge different versions of a binary file. Subversion's ability to
igor@402	416 let a user lock a file, so that they temporarily have the exclusive
igor@402	417 right to commit changes to it, can be a significant advantage to a
igor@402	418 project where binary files are widely used.
igor@402	419
igor@402	420 Mercurial can import revision history from a Subversion repository.
igor@402	421 It can also export revision history to a Subversion repository. This
igor@402	422 makes it easy to ``test the waters'' and use Mercurial and Subversion
igor@402	423 in parallel before deciding to switch. History conversion is
igor@402	424 incremental, so you can perform an initial conversion, then small
igor@402	425 additional conversions afterwards to bring in new changes.
igor@402	426
igor@402	427
igor@402	428 \subsection{Git}
igor@402	429
igor@402	430 Git is a distributed revision control tool that was developed for
igor@402	431 managing the Linux kernel source tree. Like Mercurial, its early
igor@402	432 design was somewhat influenced by Monotone.
igor@402	433
igor@402	434 Git has a very large command set, with version~1.5.0 providing~139
igor@402	435 individual commands. It has something of a reputation for being
igor@402	436 difficult to learn. Compared to Git, Mercurial has a strong focus on
igor@402	437 simplicity.
igor@402	438
igor@402	439 In terms of performance, Git is extremely fast. In several cases, it
igor@402	440 is faster than Mercurial, at least on Linux, while Mercurial performs
igor@402	441 better on other operations. However, on Windows, the performance and
igor@402	442 general level of support that Git provides are, at the time of writing,
igor@402	443 far behind that of Mercurial.
igor@402	444
igor@402	445 While a Mercurial repository needs no maintenance, a Git repository
igor@402	446 requires frequent manual ``repacks'' of its metadata. Without these,
igor@402	447 performance degrades, while space usage grows rapidly. A server that
igor@402	448 contains many Git repositories that are not rigorously and frequently
igor@402	449 repacked will become heavily disk-bound during backups, and there have
igor@402	450 been instances of daily backups taking far longer than~24 hours as a
igor@402	451 result. A freshly packed Git repository is slightly smaller than a
igor@402	452 Mercurial repository, but an unpacked repository is several orders of
igor@402	453 magnitude larger.
igor@402	454
igor@402	455 The core of Git is written in C. Many Git commands are implemented as
igor@402	456 shell or Perl scripts, and the quality of these scripts varies widely.
igor@402	457 I have encountered several instances where scripts charged along
igor@402	458 blindly in the presence of errors that should have been fatal.
igor@402	459
igor@402	460 Mercurial can import revision history from a Git repository.
igor@402	461
igor@402	462
igor@402	463 \subsection{CVS}
igor@402	464
igor@402	465 CVS is probably the most widely used revision control tool in the
igor@402	466 world. Due to its age and internal untidiness, it has been only
igor@402	467 lightly maintained for many years.
igor@402	468
igor@402	469 It has a centralised client/server architecture. It does not group
igor@402	470 related file changes into atomic commits, making it easy for people to
igor@402	471 ``break the build'': one person can successfully commit part of a
igor@402	472 change and then be blocked by the need for a merge, causing other
igor@402	473 people to see only a portion of the work they intended to do. This
igor@402	474 also affects how you work with project history. If you want to see
igor@402	475 all of the modifications someone made as part of a task, you will need
igor@402	476 to manually inspect the descriptions and timestamps of the changes
igor@402	477 made to each file involved (if you even know what those files were).
igor@402	478
igor@402	479 CVS has a muddled notion of tags and branches that I will not even
igor@402	480 attempt to describe. It does not support renaming of files or
igor@402	481 directories well, making it easy to corrupt a repository. It has
igor@402	482 almost no internal consistency checking capabilities, so it is usually
igor@402	483 not even possible to tell whether or how a repository is corrupt. I
igor@402	484 would not recommend CVS for any project, existing or new.
igor@402	485
igor@402	486 Mercurial can import CVS revision history. However, there are a few
igor@402	487 caveats that apply; these are true of every other revision control
igor@402	488 tool's CVS importer, too. Due to CVS's lack of atomic changes and
igor@402	489 unversioned filesystem hierarchy, it is not possible to reconstruct
igor@402	490 CVS history completely accurately; some guesswork is involved, and
igor@402	491 renames will usually not show up. Because a lot of advanced CVS
igor@402	492 administration has to be done by hand and is hence error-prone, it's
igor@402	493 common for CVS importers to run into multiple problems with corrupted
igor@402	494 repositories (completely bogus revision timestamps and files that have
igor@402	495 remained locked for over a decade are just two of the less interesting
igor@402	496 problems I can recall from personal experience).
igor@402	497
igor@402	499
igor@402	500
igor@402	501 \subsection{Commercial tools}
igor@402	502
igor@402	503 Perforce has a centralised client/server architecture, with no
igor@402	504 client-side caching of any data. Unlike modern revision control
igor@402	505 tools, Perforce requires that a user run a command to inform the
igor@402	506 server about every file they intend to edit.
igor@402	507
igor@402	508 The performance of Perforce is quite good for small teams, but it
igor@402	509 falls off rapidly as the number of users grows beyond a few dozen.
igor@402	510 Modestly large Perforce installations require the deployment of
igor@402	511 proxies to cope with the load their users generate.
igor@402	512
igor@402	513
igor@402	514 \subsection{Choosing a revision control tool}
igor@402	515
igor@402	516 With the exception of CVS, all of the tools listed above have unique
igor@402	517 strengths that suit them to particular styles of work. There is no
igor@402	518 single revision control tool that is best in all situations.
igor@402	519
igor@402	520 As an example, Subversion is a good choice for working with frequently
igor@402	521 edited binary files, due to its centralised nature and support for
igor@402	522 file locking.
igor@402	523
igor@402	524 I personally find Mercurial's properties of simplicity, performance,
igor@402	525 and good merge support to be a compelling combination that has served
igor@402	526 me well for several years.
igor@402	527
igor@402	528
igor@402	529 \section{Switching from another tool to Mercurial}
igor@402	530
igor@402	531 Mercurial is bundled with an extension named \hgext{convert}, which
igor@402	532 can incrementally import revision history from several other revision
igor@402	533 control tools. By ``incremental'', I mean that you can convert all of
igor@402	534 a project's history to date in one go, then rerun the conversion later
igor@402	535 to obtain new changes that happened after the initial conversion.
igor@402	536
igor@402	537 The revision control tools supported by \hgext{convert} are as
igor@402	538 follows:
igor@402	539 \begin{itemize}
igor@402	540 \item Subversion
igor@402	541 \item CVS
igor@402	542 \item Git
igor@402	543 \item Darcs
igor@402	544 \end{itemize}
igor@402	545
igor@402	546 In addition, \hgext{convert} can export changes from Mercurial to
igor@402	547 Subversion. This makes it possible to try Subversion and Mercurial in
igor@402	548 parallel before committing to a switchover, without risking the loss
igor@402	549 of any work.
igor@402	550
igor@402	551 The \hgxcmd{convert}{convert} command is easy to use. Simply point it
igor@402	552 at the path or URL of the source repository, optionally give it the
igor@402	553 name of the destination repository, and it will start working. After
igor@402	554 the initial conversion, just run the same command again to import new
igor@402	555 changes.
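
As a minimal sketch of that workflow, the session below converts a
Subversion repository and later picks up new changes. The source path
\texttt{/path/to/svn-repo} and the destination name
\texttt{project-hg} are hypothetical placeholders, not names taken
from a real project. The sketch also assumes that the extension has
been enabled in your \texttt{hgrc} file; depending on your Mercurial
version, the entry may need to be spelled \texttt{hgext.convert}
instead.
\begin{verbatim}
# In ~/.hgrc (assumed location): enable the bundled extension.
[extensions]
convert =

# Initial conversion: source repository first, destination second.
$ hg convert /path/to/svn-repo project-hg

# Later: rerun the same command to import only the new changes.
$ hg convert /path/to/svn-repo project-hg
\end{verbatim}
If you leave out the destination, \hgext{convert} picks a name for
you, based on the name of the source with \texttt{-hg} appended.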
igor@402	556
igor@402	557
igor@402	558 %%% Local Variables:
igor@402	559 %%% mode: latex
igor@402	560 %%% TeX-master: "00book"
igor@402	561 %%% End: