hgbook

diff es/intro.tex @ 402:b05e35d641e4
Copying the files from en to es and taking intro chapter
author: Igor TAmara <igor@tamarapatino.org>
date: Fri Nov 07 21:42:57 2008 -0500 (2008-11-07)
parents: 04c08ad7e92e
children: 4cdeb830118b
     1.1 --- a/es/intro.tex	Sat Oct 18 07:48:21 2008 -0500
     1.2 +++ b/es/intro.tex	Fri Nov 07 21:42:57 2008 -0500
     1.3 @@ -0,0 +1,561 @@
     1.4 +\chapter{Introduction}
     1.5 +\label{chap:intro}
     1.6 +
     1.7 +\section{About revision control}
     1.8 +
     1.9 +Revision control is the process of managing multiple versions of a
    1.10 +piece of information.  In its simplest form, this is something that
    1.11 +many people do by hand: every time you modify a file, save it under a
    1.12 +new name that contains a number, each one higher than the number of
    1.13 +the preceding version.
    1.14 +
    1.15 +Manually managing multiple versions of even a single file is an
    1.16 +error-prone task, though, so software tools to help automate this
    1.17 +process have long been available.  The earliest automated revision
    1.18 +control tools were intended to help a single user to manage revisions
    1.19 +of a single file.  Over the past few decades, the scope of revision
    1.20 +control tools has expanded greatly; they now manage multiple files,
    1.21 +and help multiple people to work together.  The best modern revision
    1.22 +control tools have no problem coping with thousands of people working
    1.23 +together on projects that consist of hundreds of thousands of files.
    1.24 +
    1.25 +\subsection{Why use revision control?}
    1.26 +
    1.27 +There are a number of reasons why you or your team might want to use
    1.28 +an automated revision control tool for a project.
    1.29 +\begin{itemize}
    1.30 +\item It will track the history and evolution of your project, so you
    1.31 +  don't have to.  For every change, you'll have a log of \emph{who}
    1.32 +  made it; \emph{why} they made it; \emph{when} they made it; and
    1.33 +  \emph{what} the change was.
    1.34 +\item When you're working with other people, revision control software
    1.35 +  makes it easier for you to collaborate.  For example, when people
    1.36 +  more or less simultaneously make potentially incompatible changes,
    1.37 +  the software will help you to identify and resolve those conflicts.
    1.38 +\item It can help you to recover from mistakes.  If you make a change
    1.39 +  that later turns out to be in error, you can revert to an earlier
    1.40 +  version of one or more files.  In fact, a \emph{really} good
    1.41 +  revision control tool will even help you to efficiently figure out
    1.42 +  exactly when a problem was introduced (see
    1.43 +  section~\ref{sec:undo:bisect} for details).
    1.44 +\item It will help you to work simultaneously on, and manage the drift
    1.45 +  between, multiple versions of your project.
    1.46 +\end{itemize}
    1.47 +Most of these reasons are equally valid---at least in theory---whether
    1.48 +you're working on a project by yourself, or with a hundred other
    1.49 +people.
    1.50 +
    1.51 +A key question about the practicality of revision control at these two
    1.52 +different scales (``lone hacker'' and ``huge team'') is how its
    1.53 +\emph{benefits} compare to its \emph{costs}.  A revision control tool
    1.54 +that's difficult to understand or use is going to impose a high cost.
    1.55 +
    1.56 +A five-hundred-person project is likely to collapse under its own
    1.57 +weight almost immediately without a revision control tool and process.
    1.58 +In this case, the cost of using revision control might hardly seem
    1.59 +worth considering, since \emph{without} it, failure is almost
    1.60 +guaranteed.
    1.61 +
    1.62 +On the other hand, a one-person ``quick hack'' might seem like a poor
    1.63 +place to use a revision control tool, because surely the cost of using
    1.64 +one must be close to the overall cost of the project.  Right?
    1.65 +
    1.66 +Mercurial uniquely supports \emph{both} of these scales of
    1.67 +development.  You can learn the basics in just a few minutes, and due
    1.68 +to its low overhead, you can apply revision control to the smallest of
    1.69 +projects with ease.  Its simplicity means you won't have a lot of
    1.70 +abstruse concepts or command sequences competing for mental space with
    1.71 +whatever you're \emph{really} trying to do.  At the same time,
    1.72 +Mercurial's high performance and peer-to-peer nature let you scale
    1.73 +painlessly to handle large projects.
    1.74 +
    1.75 +No revision control tool can rescue a poorly run project, but a good
    1.76 +choice of tools can make a huge difference to the fluidity with which
    1.77 +you can work on a project.
    1.78 +
    1.79 +\subsection{The many names of revision control}
    1.80 +
    1.81 +Revision control is a diverse field, so much so that it doesn't
    1.82 +actually have a single name or acronym.  Here are a few of the more
    1.83 +common names and acronyms you'll encounter:
    1.84 +\begin{itemize}
    1.85 +\item Revision control (RCS)
    1.86 +\item Software configuration management (SCM), or configuration management
    1.87 +\item Source code management
    1.88 +\item Source code control, or source control
    1.89 +\item Version control (VCS)
    1.90 +\end{itemize}
    1.91 +Some people claim that these terms actually have different meanings,
    1.92 +but in practice they overlap so much that there's no agreed or even
    1.93 +useful way to tease them apart.
    1.94 +
    1.95 +\section{A short history of revision control}
    1.96 +
    1.97 +The best known of the old-time revision control tools is SCCS (Source
    1.98 +Code Control System), which Marc Rochkind wrote at Bell Labs, in the
    1.99 +early 1970s.  SCCS operated on individual files, and required every
   1.100 +person working on a project to have access to a shared workspace on a
   1.101 +single system.  Only one person could modify a file at any time;
   1.102 +arbitration for access to files was via locks.  It was common for
   1.103 +people to lock files, and later forget to unlock them, preventing
   1.104 +anyone else from modifying those files without the help of an
   1.105 +administrator.  
   1.106 +
   1.107 +Walter Tichy developed a free alternative to SCCS in the early 1980s;
   1.108 +he called his program RCS (Revision Control System).  Like SCCS, RCS
   1.109 +required developers to work in a single shared workspace, and to lock
   1.110 +files to prevent multiple people from modifying them simultaneously.
   1.111 +
   1.112 +Later in the 1980s, Dick Grune used RCS as a building block for a set
   1.113 +of shell scripts he initially called cmt, but then renamed to CVS
   1.114 +(Concurrent Versions System).  The big innovation of CVS was that it
   1.115 +let developers work simultaneously and somewhat independently in their
   1.116 +own personal workspaces.  The personal workspaces prevented developers
   1.117 +from stepping on each other's toes all the time, as was common with
   1.118 +SCCS and RCS.  Each developer had a copy of every project file, and
   1.119 +could modify their copies independently.  They had to merge their
   1.120 +edits prior to committing changes to the central repository.
   1.121 +
   1.122 +Brian Berliner took Grune's original scripts and rewrote them in~C,
   1.123 +releasing in 1989 the code that has since developed into the modern
   1.124 +version of CVS.  CVS subsequently acquired the ability to operate over
   1.125 +a network connection, giving it a client/server architecture.  CVS's
   1.126 +architecture is centralised; only the server has a copy of the history
   1.127 +of the project.  Client workspaces just contain copies of recent
   1.128 +versions of the project's files, and a little metadata to tell them
   1.129 +where the server is.  CVS has been enormously successful; it is
   1.130 +probably the world's most widely used revision control system.
   1.131 +
   1.132 +In the early 1990s, Sun Microsystems developed an early distributed
   1.133 +revision control system, called TeamWare.  A TeamWare workspace
   1.134 +contains a complete copy of the project's history.  TeamWare has no
   1.135 +notion of a central repository.  (CVS relied upon RCS for its history
   1.136 +storage; TeamWare used SCCS.)
   1.137 +
   1.138 +As the 1990s progressed, awareness grew of a number of problems with
   1.139 +CVS.  It records simultaneous changes to multiple files individually,
   1.140 +instead of grouping them together as a single logically atomic
   1.141 +operation.  It does not manage its file hierarchy well; it is easy to
   1.142 +make a mess of a repository by renaming files and directories.  Worse,
   1.143 +its source code is difficult to read and maintain, which made the
   1.144 +``pain level'' of fixing these architectural problems prohibitive.
   1.145 +
   1.146 +In 2001, Jim Blandy and Karl Fogel, two developers who had worked on
   1.147 +CVS, started a project to replace it with a tool that would have a
   1.148 +better architecture and cleaner code.  The result, Subversion, does
   1.149 +not stray from CVS's centralised client/server model, but it adds
   1.150 +multi-file atomic commits, better namespace management, and a number
   1.151 +of other features that make it a generally better tool than CVS.
   1.152 +Since its initial release, it has rapidly grown in popularity.
   1.153 +
   1.154 +More or less simultaneously, Graydon Hoare began working on an
   1.155 +ambitious distributed revision control system that he named Monotone.
   1.156 +While Monotone addresses many of CVS's design flaws and has a
   1.157 +peer-to-peer architecture, it goes beyond earlier (and subsequent)
   1.158 +revision control tools in a number of innovative ways.  It uses
   1.159 +cryptographic hashes as identifiers, and has an integral notion of
   1.160 +``trust'' for code from different sources.
   1.161 +
   1.162 +Mercurial began life in 2005.  While a few aspects of its design are
   1.163 +influenced by Monotone, Mercurial focuses on ease of use, high
   1.164 +performance, and scalability to very large projects.
   1.165 +
   1.166 +\section{Trends in revision control}
   1.167 +
   1.168 +There has been an unmistakable trend in the development and use of
   1.169 +revision control tools over the past four decades, as people have
   1.170 +become familiar with the capabilities of their tools and constrained
   1.171 +by their limitations.
   1.172 +
   1.173 +The first generation began by managing single files on individual
   1.174 +computers.  Although these tools represented a huge advance over
   1.175 +ad-hoc manual revision control, their locking model and reliance on a
   1.176 +single computer limited them to small, tightly-knit teams.
   1.177 +
   1.178 +The second generation loosened these constraints by moving to
   1.179 +network-centered architectures, and managing entire projects at a
   1.180 +time.  As projects grew larger, they ran into new problems.  With
   1.181 +clients needing to talk to servers very frequently, server scaling
   1.182 +became an issue for large projects.  An unreliable network connection
   1.183 +could prevent remote users from being able to talk to the server at
   1.184 +all.  As open source projects started making read-only access
   1.185 +available anonymously to anyone, people without commit privileges
   1.186 +found that they could not use the tools to interact with a project in
   1.187 +a natural way, as they could not record their changes.
   1.188 +
   1.189 +The current generation of revision control tools is peer-to-peer in
   1.190 +nature.  All of these systems have dropped the dependency on a single
   1.191 +central server, and allow people to distribute their revision control
   1.192 +data to where it's actually needed.  Collaboration over the Internet
   1.193 +has moved from being constrained by technology to a matter of choice and
   1.194 +consensus.  Modern tools can operate offline indefinitely and
   1.195 +autonomously, with a network connection only needed when syncing
   1.196 +changes with another repository.
   1.197 +
   1.198 +\section{A few of the advantages of distributed revision control}
   1.199 +
   1.200 +Even though distributed revision control tools have for several years
   1.201 +been as robust and usable as their previous-generation counterparts,
   1.202 +people using older tools have not yet necessarily woken up to their
   1.203 +advantages.  There are a number of ways in which distributed tools
   1.204 +shine relative to centralised ones.
   1.205 +
   1.206 +For an individual developer, distributed tools are almost always much
   1.207 +faster than centralised tools.  This is for a simple reason: a
   1.208 +centralised tool needs to talk over the network for many common
   1.209 +operations, because most metadata is stored in a single copy on the
   1.210 +central server.  A distributed tool stores all of its metadata
   1.211 +locally.  All else being equal, talking over the network adds overhead
   1.212 +to a centralised tool.  Don't underestimate the value of a snappy,
   1.213 +responsive tool: you're going to spend a lot of time interacting with
   1.214 +your revision control software.
   1.215 +
   1.216 +Distributed tools are indifferent to the vagaries of your server
   1.217 +infrastructure, again because they replicate metadata to so many
   1.218 +locations.  If you use a centralised system and your server catches
   1.219 +fire, you'd better hope that your backup media are reliable, and that
   1.220 +your last backup was recent and actually worked.  With a distributed
   1.221 +tool, you have many backups available on every contributor's computer.
   1.222 +
   1.223 +The reliability of your network will affect distributed tools far less
   1.224 +than it will centralised tools.  You can't even use a centralised tool
   1.225 +without a network connection, except for a few highly constrained
   1.226 +commands.  With a distributed tool, if your network connection goes
   1.227 +down while you're working, you may not even notice.  The only thing
   1.228 +you won't be able to do is talk to repositories on other computers,
   1.229 +something that is relatively rare compared with local operations.  If
   1.230 +you have a far-flung team of collaborators, this may be significant.
   1.231 +
   1.232 +\subsection{Advantages for open source projects}
   1.233 +
   1.234 +If you take a shine to an open source project and decide that you
   1.235 +would like to start hacking on it, and that project uses a distributed
   1.236 +revision control tool, you are at once a peer with the people who
   1.237 +consider themselves the ``core'' of that project.  If they publish
   1.238 +their repositories, you can immediately copy their project history,
   1.239 +start making changes, and record your work, using the same tools in
   1.240 +the same ways as insiders.  By contrast, with a centralised tool, you
   1.241 +must use the software in a ``read only'' mode unless someone grants
   1.242 +you permission to commit changes to their central server.  Until then,
   1.243 +you won't be able to record changes, and your local modifications will
   1.244 +be at risk of corruption any time you try to update your client's view
   1.245 +of the repository.
   1.246 +
   1.247 +\subsubsection{The forking non-problem}
   1.248 +
   1.249 +It has been suggested that distributed revision control tools pose
   1.250 +some sort of risk to open source projects because they make it easy to
   1.251 +``fork'' the development of a project.  A fork happens when there are
   1.252 +differences in opinion or attitude between groups of developers that
   1.253 +cause them to decide that they can't work together any longer.  Each
   1.254 +side takes a more or less complete copy of the project's source code,
   1.255 +and goes off in its own direction.
   1.256 +
   1.257 +Sometimes the camps in a fork decide to reconcile their differences.
   1.258 +With a centralised revision control system, the \emph{technical}
   1.259 +process of reconciliation is painful, and has to be performed largely
   1.260 +by hand.  You have to decide whose revision history is going to
   1.261 +``win'', and graft the other team's changes into the tree somehow.
   1.262 +This usually loses some or all of one side's revision history.
   1.263 +
   1.264 +With respect to forking, what distributed tools do is make forking
   1.265 +the \emph{only} way to develop a project.  Every single change that
   1.266 +you make is potentially a fork point.  The great strength of this
   1.267 +approach is that a distributed revision control tool has to be really
   1.268 +good at \emph{merging} forks, because forks are absolutely
   1.269 +fundamental: they happen all the time.  
   1.270 +
   1.271 +If every piece of work that everybody does, all the time, is framed in
   1.272 +terms of forking and merging, then what the open source world refers
   1.273 +to as a ``fork'' becomes \emph{purely} a social issue.  If anything,
   1.274 +distributed tools \emph{lower} the likelihood of a fork:
   1.275 +\begin{itemize}
   1.276 +\item They eliminate the social distinction that centralised tools
   1.277 +  impose: that between insiders (people with commit access) and
   1.278 +  outsiders (people without).
   1.279 +\item They make it easier to reconcile after a social fork, because
   1.280 +  all that's involved from the perspective of the revision control
   1.281 +  software is just another merge.
   1.282 +\end{itemize}
   1.283 +
   1.284 +Some people resist distributed tools because they want to retain tight
   1.285 +control over their projects, and they believe that centralised tools
   1.286 +give them this control.  However, if you're of this belief, and you
   1.287 +publish your CVS or Subversion repositories publicly, there are
   1.288 +plenty of tools available that can pull out your entire project's
   1.289 +history (albeit slowly) and recreate it somewhere that you don't
   1.290 +control.  So while your control in this case is illusory, you are
   1.291 +forgoing the ability to fluidly collaborate with whatever people feel
   1.292 +compelled to mirror and fork your history.
   1.293 +
   1.294 +\subsection{Advantages for commercial projects}
   1.295 +
   1.296 +Many commercial projects are undertaken by teams that are scattered
   1.297 +across the globe.  Contributors who are far from a central server will
   1.298 +see slower command execution and perhaps less reliability.  Commercial
   1.299 +revision control systems attempt to ameliorate these problems with
   1.300 +remote-site replication add-ons that are typically expensive to buy
   1.301 +and cantankerous to administer.  A distributed system doesn't suffer
   1.302 +from these problems in the first place.  Better yet, you can easily
   1.303 +set up multiple authoritative servers, say one per site, so that
   1.304 +there's no redundant communication between repositories over expensive
   1.305 +long-haul network links.
   1.306 +
   1.307 +Centralised revision control systems tend to have relatively low
   1.308 +scalability.  It's not unusual for an expensive centralised system to
   1.309 +fall over under the combined load of just a few dozen concurrent
   1.310 +users.  Once again, the typical response tends to be an expensive and
   1.311 +clunky replication facility.  Since the load on a central server---if
   1.312 +you have one at all---is many times lower with a distributed
   1.313 +tool (because all of the data is replicated everywhere), a single
   1.314 +cheap server can handle the needs of a much larger team, and
   1.315 +replication to balance load becomes a simple matter of scripting.
   1.316 +
   1.317 +If you have an employee in the field, troubleshooting a problem at a
   1.318 +customer's site, they'll benefit from distributed revision control.
   1.319 +The tool will let them generate custom builds, try different fixes in
   1.320 +isolation from each other, and search efficiently through history for
   1.321 +the sources of bugs and regressions in the customer's environment, all
   1.322 +without needing to connect to your company's network.
   1.323 +
   1.324 +\section{Why choose Mercurial?}
   1.325 +
   1.326 +Mercurial has a unique set of properties that make it a particularly
   1.327 +good choice as a revision control system.
   1.328 +\begin{itemize}
   1.329 +\item It is easy to learn and use.
   1.330 +\item It is lightweight.
   1.331 +\item It scales excellently.
   1.332 +\item It is easy to customise.
   1.333 +\end{itemize}
   1.334 +
   1.335 +If you are at all familiar with revision control systems, you should
   1.336 +be able to get up and running with Mercurial in less than five
   1.337 +minutes.  Even if not, it will take no more than a few minutes
   1.338 +longer.  Mercurial's command and feature sets are generally uniform
   1.339 +and consistent, so you can keep track of a few general rules instead
   1.340 +of a host of exceptions.
   1.341 +
   1.342 +On a small project, you can start working with Mercurial in moments.
   1.343 +Creating new changes and branches; transferring changes around
   1.344 +(whether locally or over a network); and history and status operations
   1.345 +are all fast.  Mercurial attempts to stay nimble and largely out of
   1.346 +your way by combining low cognitive overhead with blazingly fast
   1.347 +operations.
   1.348 +
   1.349 +The usefulness of Mercurial is not limited to small projects: it is
   1.350 +used by projects with hundreds to thousands of contributors, each
   1.351 +containing tens of thousands of files and hundreds of megabytes of
   1.352 +source code.
   1.353 +
   1.354 +If the core functionality of Mercurial is not enough for you, it's
   1.355 +easy to build on.  Mercurial is well suited to scripting tasks, and
   1.356 +its clean internals and implementation in Python make it easy to add
   1.357 +features in the form of extensions.  There are a number of popular and
   1.358 +useful extensions already available, ranging from helping to identify
   1.359 +bugs to improving performance.
   1.360 +
   1.361 +\section{Mercurial compared with other tools}
   1.362 +
   1.363 +Before you read on, please understand that this section necessarily
   1.364 +reflects my own experiences, interests, and (dare I say it) biases.  I
   1.365 +have used every one of the revision control tools listed below, in
   1.366 +most cases for several years at a time.
   1.367 +
   1.368 +
   1.369 +\subsection{Subversion}
   1.370 +
   1.371 +Subversion is a popular revision control tool, developed to replace
   1.372 +CVS.  It has a centralised client/server architecture.
   1.373 +
   1.374 +Subversion and Mercurial have similarly named commands for performing
   1.375 +the same operations, so if you're familiar with one, it is easy to
   1.376 +learn to use the other.  Both tools are portable to all popular
   1.377 +operating systems.
   1.378 +
   1.379 +Prior to version 1.5, Subversion had no useful support for merges.
   1.380 +At the time of writing, its merge tracking capability is new, and known to be
   1.381 +\href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated
   1.382 +  and buggy}.
   1.383 +
   1.384 +Mercurial has a substantial performance advantage over Subversion on
   1.385 +every revision control operation I have benchmarked.  I have measured
   1.386 +its advantage as ranging from a factor of two to a factor of six when
   1.387 +compared with Subversion~1.4.3's \emph{ra\_local} file store, which is
   1.388 +the fastest access method available.  In more realistic deployments
   1.389 +involving a network-based store, Subversion will be at a substantially
   1.390 +larger disadvantage.  Because many Subversion commands must talk to
   1.391 +the server and Subversion does not have useful replication facilities,
   1.392 +server capacity and network bandwidth become bottlenecks for modestly
   1.393 +large projects.
   1.394 +
   1.395 +Additionally, Subversion incurs substantial storage overhead to avoid
   1.396 +network transactions for a few common operations, such as finding
   1.397 +modified files (\texttt{status}) and displaying modifications against
   1.398 +the current revision (\texttt{diff}).  As a result, a Subversion
   1.399 +working copy is often the same size as, or larger than, a Mercurial
   1.400 +repository and working directory, even though the Mercurial repository
   1.401 +contains a complete history of the project.
   1.402 +
   1.403 +Subversion is widely supported by third party tools.  Mercurial
   1.404 +currently lags considerably in this area.  This gap is closing,
   1.405 +however, and indeed some of Mercurial's GUI tools now outshine their
   1.406 +Subversion equivalents.  Like Mercurial, Subversion has an excellent
   1.407 +user manual.
   1.408 +
   1.409 +Because Subversion doesn't store revision history on the client, it is
   1.410 +well suited to managing projects that deal with lots of large, opaque
   1.411 +binary files.  If you check in fifty revisions to an incompressible
   1.412 +10MB file, Subversion's client-side space usage stays constant.  The
   1.413 +space used by any distributed SCM will grow rapidly in proportion to
   1.414 +the number of revisions, because the differences between each revision
   1.415 +are large.
   1.416 +
   1.417 +In addition, it's often difficult or, more usually, impossible to
   1.418 +merge different versions of a binary file.  Subversion's ability to
   1.419 +let a user lock a file, so that they temporarily have the exclusive
   1.420 +right to commit changes to it, can be a significant advantage to a
   1.421 +project where binary files are widely used.
   1.422 +
   1.423 +Mercurial can import revision history from a Subversion repository.
   1.424 +It can also export revision history to a Subversion repository.  This
   1.425 +makes it easy to ``test the waters'' and use Mercurial and Subversion
   1.426 +in parallel before deciding to switch.  History conversion is
   1.427 +incremental, so you can perform an initial conversion, then small
   1.428 +additional conversions afterwards to bring in new changes.
   1.429 +
   1.430 +
   1.431 +\subsection{Git}
   1.432 +
   1.433 +Git is a distributed revision control tool that was developed for
   1.434 +managing the Linux kernel source tree.  Like Mercurial, its early
   1.435 +design was somewhat influenced by Monotone.
   1.436 +
   1.437 +Git has a very large command set, with version~1.5.0 providing~139
   1.438 +individual commands.  It has something of a reputation for being
   1.439 +difficult to learn.  Compared to Git, Mercurial has a strong focus on
   1.440 +simplicity.
   1.441 +
   1.442 +In terms of performance, Git is extremely fast.  In several cases, it
   1.443 +is faster than Mercurial, at least on Linux, while Mercurial performs
   1.444 +better on other operations.  However, on Windows, the performance and
   1.445 +general level of support that Git provides are, at the time of writing,
   1.446 +far behind that of Mercurial.
   1.447 +
   1.448 +While a Mercurial repository needs no maintenance, a Git repository
   1.449 +requires frequent manual ``repacks'' of its metadata.  Without these,
   1.450 +performance degrades, while space usage grows rapidly.  A server that
   1.451 +contains many Git repositories that are not rigorously and frequently
   1.452 +repacked will become heavily disk-bound during backups, and there have
   1.453 +been instances of daily backups taking far longer than~24 hours as a
   1.454 +result.  A freshly packed Git repository is slightly smaller than a
   1.455 +Mercurial repository, but an unpacked repository is several orders of
   1.456 +magnitude larger.
   1.457 +
   1.458 +The core of Git is written in C.  Many Git commands are implemented as
   1.459 +shell or Perl scripts, and the quality of these scripts varies widely.
   1.460 +I have encountered several instances where scripts charged along
   1.461 +blindly in the presence of errors that should have been fatal.
   1.462 +
   1.463 +Mercurial can import revision history from a Git repository.
   1.464 +
   1.465 +
   1.466 +\subsection{CVS}
   1.467 +
   1.468 +CVS is probably the most widely used revision control tool in the
   1.469 +world.  Due to its age and internal untidiness, it has been only
   1.470 +lightly maintained for many years.
   1.471 +
   1.472 +It has a centralised client/server architecture.  It does not group
   1.473 +related file changes into atomic commits, making it easy for people to
   1.474 +``break the build'': one person can successfully commit part of a
   1.475 +change and then be blocked by the need for a merge, causing other
   1.476 +people to see only a portion of the work the committer intended.  This
   1.477 +also affects how you work with project history.  If you want to see
   1.478 +all of the modifications someone made as part of a task, you will need
   1.479 +to manually inspect the descriptions and timestamps of the changes
   1.480 +made to each file involved (if you even know what those files were).
   1.481 +
   1.482 +CVS has a muddled notion of tags and branches that I will not attempt
   1.483 +to even describe.  It does not support renaming of files or
   1.484 +directories well, making it easy to corrupt a repository.  It has
   1.485 +almost no internal consistency checking capabilities, so it is usually
   1.486 +not even possible to tell whether or how a repository is corrupt.  I
   1.487 +would not recommend CVS for any project, existing or new.
   1.488 +
   1.489 +Mercurial can import CVS revision history.  However, there are a few
   1.490 +caveats that apply; these are true of every other revision control
   1.491 +tool's CVS importer, too.  Due to CVS's lack of atomic changes and
   1.492 +unversioned filesystem hierarchy, it is not possible to reconstruct
   1.493 +CVS history completely accurately; some guesswork is involved, and
   1.494 +renames will usually not show up.  Because a lot of advanced CVS
   1.495 +administration has to be done by hand and is hence error-prone, it's
   1.496 +common for CVS importers to run into multiple problems with corrupted
   1.497 +repositories (completely bogus revision timestamps and files that have
   1.498 +remained locked for over a decade are just two of the less interesting
   1.499 +problems I can recall from personal experience).
   1.500 +
   1.501 +Mercurial can import revision history from a CVS repository.
   1.502 +
   1.503 +
   1.504 +\subsection{Commercial tools}
   1.505 +
   1.506 +Perforce has a centralised client/server architecture, with no
   1.507 +client-side caching of any data.  Unlike modern revision control
   1.508 +tools, Perforce requires that a user run a command to inform the
   1.509 +server about every file they intend to edit.
   1.510 +
   1.511 +The performance of Perforce is quite good for small teams, but it
   1.512 +falls off rapidly as the number of users grows beyond a few dozen.
   1.513 +Modestly large Perforce installations require the deployment of
   1.514 +proxies to cope with the load their users generate.
   1.515 +
   1.516 +
   1.517 +\subsection{Choosing a revision control tool}
   1.518 +
   1.519 +With the exception of CVS, all of the tools listed above have unique
   1.520 +strengths that suit them to particular styles of work.  There is no
   1.521 +single revision control tool that is best in all situations.
   1.522 +
   1.523 +As an example, Subversion is a good choice for working with frequently
   1.524 +edited binary files, due to its centralised nature and support for
   1.525 +file locking.
   1.526 +
   1.527 +I personally find Mercurial's properties of simplicity, performance,
   1.528 +and good merge support to be a compelling combination that has served
   1.529 +me well for several years.
   1.530 +
   1.531 +
   1.532 +\section{Switching from another tool to Mercurial}
   1.533 +
   1.534 +Mercurial is bundled with an extension named \hgext{convert}, which
   1.535 +can incrementally import revision history from several other revision
   1.536 +control tools.  By ``incremental'', I mean that you can convert all of
   1.537 +a project's history to date in one go, then rerun the conversion later
   1.538 +to obtain new changes that happened after the initial conversion.
   1.539 +
   1.540 +The revision control tools supported by \hgext{convert} are as
   1.541 +follows:
   1.542 +\begin{itemize}
   1.543 +\item Subversion
   1.544 +\item CVS
   1.545 +\item Git
   1.546 +\item Darcs
   1.547 +\end{itemize}
   1.548 +
   1.549 +In addition, \hgext{convert} can export changes from Mercurial to
   1.550 +Subversion.  This makes it possible to try Subversion and Mercurial in
   1.551 +parallel before committing to a switchover, without risking the loss
   1.552 +of any work.
   1.553 +
   1.554 +The \hgxcmd{convert}{convert} command is easy to use.  Simply point it
   1.555 +at the path or URL of the source repository, optionally give it the
   1.556 +name of the destination repository, and it will start working.  After
   1.557 +the initial conversion, just run the same command again to import new
   1.558 +changes.
   1.559 +
   1.560 +
   1.561 +%%% Local Variables: 
   1.562 +%%% mode: latex
   1.563 +%%% TeX-master: "00book"
   1.564 +%%% End:
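
Note on the final section of the added file ("Switching from another tool to Mercurial"): the convert workflow it describes (point the command at a source repository, optionally name a destination, rerun the same command later to pick up new changes) could be scripted roughly as below. This is only a sketch under stated assumptions, not part of the changeset: the helper names `build_convert_command` and `convert` are hypothetical, and the `--config extensions.convert=` switch assumes the extension is bundled but not already enabled in the user's configuration.

```python
import shutil
import subprocess


def build_convert_command(source, dest=None):
    """Return the argv list for an `hg convert` run (hypothetical helper).

    Assumption: the convert extension is not enabled in the user's hgrc,
    so it is switched on for this single invocation via --config.
    """
    cmd = ["hg", "--config", "extensions.convert=", "convert", source]
    if dest is not None:
        # The destination repository name is optional, as in the text.
        cmd.append(dest)
    return cmd


def convert(source, dest=None):
    """Run the conversion (hypothetical helper).

    Per the text, running the same command again after the initial
    conversion imports changes made since then.
    """
    if shutil.which("hg") is None:
        raise RuntimeError("Mercurial (hg) was not found on PATH")
    return subprocess.run(build_convert_command(source, dest), check=True)
```

Calling `convert("/path/to/source-repo", "project-hg")` with placeholder paths would perform the initial import; calling it again with the same arguments corresponds to the "run the same command again" step in the text.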