\chapter{Introduction}
\label{chap:intro}

\section{About revision control}

Revision control is the process of managing multiple versions of a
piece of information.  In its simplest form, this is something that
many people do by hand: every time you modify a file, you save it
under a new name that contains a number, each one higher than the
number of the preceding version.

Manually managing versions in this way is error-prone, though, so
software tools that address the problem have long been available.  The
earliest revision control tools were intended to help a single user
automate the management of revisions of a single file.  Over the past
few decades, this scope has expanded greatly; the tools now manage
multiple files, and help large numbers of people to work together.
The most modern tools have no difficulty coping with several thousand
people working together on projects that consist of several hundred
thousand files.

\subsection{Why use revision control?}

There are a number of reasons why you or your team might want to use
an automated revision control tool for a project.
\begin{itemize}
\item It will track the evolution of your project, so you don't have
  to.  For every change, you'll have a log of \emph{who} made it,
  \emph{why} they made it, \emph{when} they made it, and \emph{what}
  they changed.
\item When you're working with other people, revision control software
  makes it easier to collaborate.  For example, when several people
  more or less simultaneously make incompatible changes, the software
  will help you to identify and resolve the conflicts.
\item It will help you to recover from mistakes.  If you make a change
  that later turns out to be an error, you can reliably revert to an
  earlier version of a file, or even of a whole set of files.  In
  fact, a \emph{really} good revision control tool will help you to
  identify when the problem was introduced (see
  section~\ref{sec:undo:bisect} for details).
\item It will also help you to work on several different versions of
  your project, and to manage the drift between them.
\end{itemize}
Most of these reasons are equally valid, at least in theory, whether
you're working on a project by yourself or with a hundred other
people.

A key question about revision control tools, whether for a one-person
project or for a large team, is how their \emph{benefits} compare to
their \emph{costs}.  A tool that is difficult to use or to understand
will impose a high cost of adoption.

A five-thousand-person project will almost certainly collapse under
its own weight without a revision control tool and process.  In this
case, the cost of using revision control is negligible, since
\emph{without} it, failure is almost guaranteed.

On the other hand, a one-person ``quick hack'' might seem like a poor
place to use a revision control tool, because surely the cost of using
one exceeds the overall cost of the project.  Right?

Mercurial supports \emph{both} of these scales of work.  You can learn
the basics in just a few minutes, and thanks to its performance, you
can use it with ease on the smallest of projects.  Its simplicity
means that you won't face obscure concepts or baffling command
sequences that have nothing to do with \emph{what you're really trying
to do}.  At the same time, that same performance and Mercurial's
``peer-to-peer'' nature let you scale up painlessly to very large
projects.

No revision control tool can rescue a poorly run project, but a good
tool can make a big difference to how smoothly you can work on one.

\subsection{The many names of revision control}

Revision control is a diverse field, so much so that it doesn't have a
single name or acronym.  Here are a few of the names and acronyms that
you'll encounter most often:
\begin{itemize}
\item \textit{Revision control (RCS)}
\item \textit{Software configuration management (SCM)}, or
  \textit{configuration management}
\item \textit{Source code management}
\item \textit{Source code control}, or \textit{source control}
\item \textit{Version control (VCS)}
\end{itemize}

\notebox {
Translator's note: I have kept the list of names in English for
convenience (they are easier to search for online).  I have chosen to
use ``gestion de sources'' as the single translation of these terms
throughout the document.

In addition, I have chosen to keep the names of all Mercurial
operations (commit, push, pull, ...) in English, again to make it
easier to read other documents in English, and to use the tool itself.
}

Some people claim that these terms actually have different meanings,
but in practice they overlap so much that there is no really useful
way to tell them apart.

\section{A short history of revision control}

The best known of the old-time revision control tools is \textit{SCCS
(Source Code Control System)}, which Marc Rochkind wrote at Bell Labs
in the early 1970s.  \textit{SCCS} operated only on individual files,
and required every person working on a project to have access to a
shared workspace on a single system.  Only one person could modify a
file at any time; access to files was arbitrated using locks.  It was
common for people to lock files and later forget to unlock them,
preventing anyone else from working on those files without the help of
an administrator.

Walter Tichy developed a free alternative to \textit{SCCS} in the
early 1980s, which he called \textit{RCS (Revision Control System)}.
Like \textit{SCCS}, \textit{RCS} required developers to work in a
single shared workspace, and to lock files to prevent conflicts caused
by concurrent modifications.

Later in the 1980s, Dick Grune used \textit{RCS} as a building block
for a set of \textit{shell} scripts that he initially called cmt, but
then renamed to \textit{CVS (Concurrent Versions System)}.  The big
innovation of CVS was that it let developers work simultaneously and
independently in their own personal workspaces.  These personal
workspaces stopped developers from stepping on each other's toes, as
was common with RCS and SCCS.  Each developer had a copy of every
project file, and could modify their copies freely.  They did,
however, have to merge their edits before committing changes to the
central repository.

Brian Berliner took Grune's original scripts and rewrote them in~C,
releasing in 1989 the code that has since developed into the modern
version of CVS.  CVS subsequently acquired the ability to operate over
a network connection, giving it a client/server architecture.  CVS's
architecture is centralised; only the server has a copy of the history
of the project.  Client workspaces just contain copies of recent
versions of the project's files, and a little metadata to tell them
where the server is.  CVS has been enormously successful; it is
probably the world's most widely used revision control system.

In the early 1990s, Sun Microsystems developed an early distributed
revision control system, called TeamWare.  A TeamWare workspace
contains a complete copy of the project's history.  TeamWare has no
notion of a central repository.  (CVS relied upon RCS for its history
storage; TeamWare used SCCS.)

As the 1990s progressed, awareness grew of a number of problems with
CVS.  It records simultaneous changes to multiple files individually,
instead of grouping them together as a single logically atomic
operation.  It does not manage its file hierarchy well; it is easy to
make a mess of a repository by renaming files and directories.  Worse,
its source code is difficult to read and maintain, which made the
``pain level'' of fixing these architectural problems prohibitive.

In 2001, Jim Blandy and Karl Fogel, two developers who had worked on
CVS, started a project to replace it with a tool that would have a
better architecture and cleaner code.  The result, Subversion, does
not stray from CVS's centralised client/server model, but it adds
multi-file atomic commits, better namespace management, and a number
of other features that make it a generally better tool than CVS.
Since its initial release, it has rapidly grown in popularity.

More or less simultaneously, Graydon Hoare began working on an
ambitious distributed revision control system that he named Monotone.
While Monotone addresses many of CVS's design flaws and has a
peer-to-peer architecture, it goes beyond earlier (and subsequent)
revision control tools in a number of innovative ways.  It uses
cryptographic hashes as identifiers, and has an integral notion of
``trust'' for code from different sources.

Mercurial began life in 2005.  While a few aspects of its design are
influenced by Monotone, Mercurial focuses on ease of use, high
performance, and scalability to very large projects.

\section{Trends in revision control}

There has been an unmistakable trend in the development and use of
revision control tools over the past four decades, as people have
become familiar with the capabilities of their tools and constrained
by their limitations.

The first generation began by managing single files on individual
computers.  Although these tools represented a huge advance over
ad-hoc manual revision control, their locking model and reliance on a
single computer limited them to small, tightly-knit teams.

The second generation loosened these constraints by moving to
network-centered architectures, and managing entire projects at a
time.  As projects grew larger, they ran into new problems.  With
clients needing to talk to servers very frequently, server scaling
became an issue for large projects.  An unreliable network connection
could prevent remote users from being able to talk to the server at
all.  As open source projects started making read-only access
available anonymously to anyone, people without commit privileges
found that they could not use the tools to interact with a project in
a natural way, as they could not record their changes.

The current generation of revision control tools is peer-to-peer in
nature.  All of these systems have dropped the dependency on a single
central server, and allow people to distribute their revision control
data to where it's actually needed.  Collaboration over the Internet
has moved from being constrained by technology to being a matter of
choice and consensus.  Modern tools can operate offline indefinitely
and autonomously, with a network connection only needed when syncing
changes with another repository.

\section{A few of the advantages of distributed revision control}

Even though distributed revision control tools have for several years
been as robust and usable as their previous-generation counterparts,
people using older tools have not yet necessarily woken up to their
advantages.  There are a number of ways in which distributed tools
shine relative to centralised ones.

For an individual developer, distributed tools are almost always much
faster than centralised tools.  This is for a simple reason: a
centralised tool needs to talk over the network for many common
operations, because most metadata is stored in a single copy on the
central server.  A distributed tool stores all of its metadata
locally.  All else being equal, talking over the network adds overhead
to a centralised tool.  Don't underestimate the value of a snappy,
responsive tool: you're going to spend a lot of time interacting with
your revision control software.

Distributed tools are indifferent to the vagaries of your server
infrastructure, again because they replicate metadata to so many
locations.  If you use a centralised system and your server catches
fire, you'd better hope that your backup media are reliable, and that
your last backup was recent and actually worked.  With a distributed
tool, you have many backups available on every contributor's computer.

The reliability of your network will affect distributed tools far less
than it will centralised tools.  You can't even use a centralised tool
without a network connection, except for a few highly constrained
commands.  With a distributed tool, if your network connection goes
down while you're working, you may not even notice.  The only thing
you won't be able to do is talk to repositories on other computers,
something that is relatively rare compared with local operations.  If
you have a far-flung team of collaborators, this may be significant.

\subsection{Advantages for open source projects}

If you take a shine to an open source project and decide that you
would like to start hacking on it, and that project uses a distributed
revision control tool, you are at once a peer with the people who
consider themselves the ``core'' of that project.  If they publish
their repositories, you can immediately copy their project history,
start making changes, and record your work, using the same tools in
the same ways as insiders.  By contrast, with a centralised tool, you
must use the software in a ``read only'' mode unless someone grants
you permission to commit changes to their central server.  Until then,
you won't be able to record changes, and your local modifications will
be at risk of corruption any time you try to update your client's view
of the repository.

\subsubsection{The forking non-problem}

It has been suggested that distributed revision control tools pose
some sort of risk to open source projects because they make it easy to
``fork'' the development of a project.  A fork happens when there are
differences in opinion or attitude between groups of developers that
cause them to decide that they can't work together any longer.  Each
side takes a more or less complete copy of the project's source code,
and goes off in its own direction.

Sometimes the camps in a fork decide to reconcile their differences.
With a centralised revision control system, the \emph{technical}
process of reconciliation is painful, and has to be performed largely
by hand.  You have to decide whose revision history is going to
``win'', and graft the other team's changes into the tree somehow.
This usually loses some or all of one side's revision history.

With respect to forking, distributed tools make forking the
\emph{only} way to develop a project.  Every single change that
you make is potentially a fork point.  The great strength of this
approach is that a distributed revision control tool has to be really
good at \emph{merging} forks, because forks are absolutely
fundamental: they happen all the time.

If every piece of work that everybody does, all the time, is framed in
terms of forking and merging, then what the open source world refers
to as a ``fork'' becomes \emph{purely} a social issue.  If anything,
distributed tools \emph{lower} the likelihood of a fork:
\begin{itemize}
\item They eliminate the social distinction that centralised tools
  impose: that between insiders (people with commit access) and
  outsiders (people without).
\item They make it easier to reconcile after a social fork, because
  all that's involved from the perspective of the revision control
  software is just another merge.
\end{itemize}

Some people resist distributed tools because they want to retain tight
control over their projects, and they believe that centralised tools
give them this control.  However, if you're of this belief, and you
publish your CVS or Subversion repositories publicly, there are
plenty of tools available that can pull out your entire project's
history (albeit slowly) and recreate it somewhere that you don't
control.  So while your control in this case is illusory, you are
forgoing the ability to fluidly collaborate with whatever people feel
compelled to mirror and fork your history.

\subsection{Advantages for commercial projects}

Many commercial projects are undertaken by teams that are scattered
across the globe.  Contributors who are far from a central server will
see slower command execution and perhaps less reliability.  Commercial
revision control systems attempt to ameliorate these problems with
remote-site replication add-ons that are typically expensive to buy
and cantankerous to administer.  A distributed system doesn't suffer
from these problems in the first place.  Better yet, you can easily
set up multiple authoritative servers, say one per site, so that
there's no redundant communication between repositories over expensive
long-haul network links.

Centralised revision control systems tend to have relatively low
scalability.  It's not unusual for an expensive centralised system to
fall over under the combined load of just a few dozen concurrent
users.  Once again, the typical response tends to be an expensive and
clunky replication facility.  Since the load on a central server---if
you have one at all---is many times lower with a distributed
tool (because all of the data is replicated everywhere), a single
cheap server can handle the needs of a much larger team, and
replication to balance load becomes a simple matter of scripting.

If you have an employee in the field, troubleshooting a problem at a
customer's site, they'll benefit from distributed revision control.
The tool will let them generate custom builds, try different fixes in
isolation from each other, and search efficiently through history for
the sources of bugs and regressions in the customer's environment, all
without needing to connect to your company's network.

\section{Why choose Mercurial?}

Mercurial has a unique set of properties that make it a particularly
good choice as a revision control system.
\begin{itemize}
\item It is easy to learn and use.
\item It is lightweight.
\item It scales excellently.
\item It is easy to customise.
\end{itemize}

If you are at all familiar with revision control systems, you should
be able to get up and running with Mercurial in less than five
minutes.  Even if not, it will take no more than a few minutes
longer.  Mercurial's command and feature sets are generally uniform
and consistent, so you can keep track of a few general rules instead
of a host of exceptions.

On a small project, you can start working with Mercurial in moments.
Creating new changes and branches; transferring changes around
(whether locally or over a network); and history and status operations
are all fast.  Mercurial attempts to stay nimble and largely out of
your way by combining low cognitive overhead with blazingly fast
operations.

The usefulness of Mercurial is not limited to small projects: it is
used by projects with hundreds to thousands of contributors, each
containing tens of thousands of files and hundreds of megabytes of
source code.

If the core functionality of Mercurial is not enough for you, it's
easy to build on.  Mercurial is well suited to scripting tasks, and
its clean internals and implementation in Python make it easy to add
features in the form of extensions.  There are a number of popular and
useful extensions already available, ranging from helping to identify
bugs to improving performance.

\section{Mercurial compared with other tools}

Before you read on, please understand that this section necessarily
reflects my own experiences, interests, and (dare I say it) biases.  I
have used every one of the revision control tools listed below, in
most cases for several years at a time.


\subsection{Subversion}

Subversion is a popular revision control tool, developed to replace
CVS.  It has a centralised client/server architecture.

Subversion and Mercurial have similarly named commands for performing
the same operations, so if you're familiar with one, it is easy to
learn to use the other.  Both tools are portable to all popular
operating systems.

Prior to version 1.5, Subversion had no useful support for merges.
At the time of writing, its merge tracking capability is new, and known to be
\href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated
  and buggy}.

Mercurial has a substantial performance advantage over Subversion on
every revision control operation I have benchmarked.  I have measured
its advantage as ranging from a factor of two to a factor of six when
compared with Subversion~1.4.3's \emph{ra\_local} file store, which is
the fastest access method available.  In more realistic deployments
involving a network-based store, Subversion will be at a substantially
larger disadvantage.  Because many Subversion commands must talk to
the server and Subversion does not have useful replication facilities,
server capacity and network bandwidth become bottlenecks for modestly
large projects.

Additionally, Subversion incurs substantial storage overhead to avoid
network transactions for a few common operations, such as finding
modified files (\texttt{status}) and displaying modifications against
the current revision (\texttt{diff}).  As a result, a Subversion
working copy is often the same size as, or larger than, a Mercurial
repository and working directory, even though the Mercurial repository
contains a complete history of the project.

Subversion is widely supported by third party tools.  Mercurial
currently lags considerably in this area.  This gap is closing,
however, and indeed some of Mercurial's GUI tools now outshine their
Subversion equivalents.  Like Mercurial, Subversion has an excellent
user manual.

Because Subversion doesn't store revision history on the client, it is
well suited to managing projects that deal with lots of large, opaque
binary files.  If you check in fifty revisions to an incompressible
10MB file, Subversion's client-side space usage stays constant.  The
space used by any distributed SCM will grow rapidly in proportion to
the number of revisions, because the differences between each revision
are large.
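
As a rough worked estimate of this scenario, assume (purely for
illustration) that each revision of the file has to be stored more or
less in full, because deltas between revisions of incompressible data
save almost nothing.  Writing $n$ for the number of revisions and $s$
for the file size (symbols introduced here, not used elsewhere in this
book), the history held in every clone of a distributed repository
grows roughly as
\[
  n \times s \approx 50 \times 10\,\mathrm{MB} = 500\,\mathrm{MB},
\]
whereas a client that keeps only the current revision needs space on
the order of $s$, however large $n$ becomes.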

In addition, it's often difficult or, more usually, impossible to
merge different versions of a binary file.  Subversion's ability to
let a user lock a file, so that they temporarily have the exclusive
right to commit changes to it, can be a significant advantage to a
project where binary files are widely used.

Mercurial can import revision history from a Subversion repository.
It can also export revision history to a Subversion repository.  This
makes it easy to ``test the waters'' and use Mercurial and Subversion
in parallel before deciding to switch.  History conversion is
incremental, so you can perform an initial conversion, then small
additional conversions afterwards to bring in new changes.


\subsection{Git}

Git is a distributed revision control tool that was developed for
managing the Linux kernel source tree.  Like Mercurial, its early
design was somewhat influenced by Monotone.

Git has a very large command set, with version~1.5.0 providing~139
individual commands.  It has something of a reputation for being
difficult to learn.  Compared to Git, Mercurial has a strong focus on
simplicity.

In terms of performance, Git is extremely fast.  In several cases, it
is faster than Mercurial, at least on Linux, while Mercurial performs
better on other operations.  However, on Windows, the performance and
general level of support that Git provides is, at the time of writing,
far behind that of Mercurial.

While a Mercurial repository needs no maintenance, a Git repository
requires frequent manual ``repacks'' of its metadata.  Without these,
performance degrades, while space usage grows rapidly.  A server that
contains many Git repositories that are not rigorously and frequently
repacked will become heavily disk-bound during backups, and there have
been instances of daily backups taking far longer than~24 hours as a
result.  A freshly packed Git repository is slightly smaller than a
Mercurial repository, but an unpacked repository is several orders of
magnitude larger.

The core of Git is written in C.  Many Git commands are implemented as
shell or Perl scripts, and the quality of these scripts varies widely.
I have encountered several instances where scripts charged along
blindly in the presence of errors that should have been fatal.

Mercurial can import revision history from a Git repository.


\subsection{CVS}

CVS is probably the most widely used revision control tool in the
world.  Due to its age and internal untidiness, it has been only
lightly maintained for many years.

It has a centralised client/server architecture.  It does not group
related file changes into atomic commits, making it easy for people to
``break the build'': one person can successfully commit part of a
change and then be blocked by the need for a merge, causing other
people to see only a portion of the work they intended to do.  This
also affects how you work with project history.  If you want to see
all of the modifications someone made as part of a task, you will need
to manually inspect the descriptions and timestamps of the changes
made to each file involved (if you even know what those files were).

CVS has a muddled notion of tags and branches that I will not attempt
to even describe.  It does not support renaming of files or
directories well, making it easy to corrupt a repository.  It has
almost no internal consistency checking capabilities, so it is usually
not even possible to tell whether or how a repository is corrupt.  I
would not recommend CVS for any project, existing or new.

Mercurial can import revision history from a CVS repository.  However,
there are a few caveats that apply; these are true of every other
revision control tool's CVS importer, too.  Due to CVS's lack of
atomic changes and unversioned filesystem hierarchy, it is not
possible to reconstruct CVS history completely accurately; some
guesswork is involved, and renames will usually not show up.  Because
a lot of advanced CVS administration has to be done by hand and is
hence error-prone, it's common for CVS importers to run into multiple
problems with corrupted repositories (completely bogus revision
timestamps and files that have remained locked for over a decade are
just two of the less interesting problems I can recall from personal
experience).


\subsection{Commercial tools}

Perforce has a centralised client/server architecture, with no
client-side caching of any data.  Unlike modern revision control
tools, Perforce requires that a user run a command to inform the
server about every file they intend to edit.

The performance of Perforce is quite good for small teams, but it
falls off rapidly as the number of users grows beyond a few dozen.
Modestly large Perforce installations require the deployment of
proxies to cope with the load their users generate.


\subsection{Choosing a revision control tool}

With the exception of CVS, all of the tools listed above have unique
strengths that suit them to particular styles of work.  There is no
single revision control tool that is best in all situations.

As an example, Subversion is a good choice for working with frequently
edited binary files, due to its centralised nature and support for
file locking.

I personally find Mercurial's properties of simplicity, performance,
and good merge support to be a compelling combination that has served
me well for several years.


\section{Switching from another tool to Mercurial}

Mercurial is bundled with an extension named \hgext{convert}, which
can incrementally import revision history from several other revision
control tools.  By ``incremental'', I mean that you can convert all of
a project's history to date in one go, then rerun the conversion later
to obtain new changes that happened after the initial conversion.

The revision control tools supported by \hgext{convert} are as
follows:
\begin{itemize}
\item Subversion
\item CVS
\item Git
\item Darcs
\end{itemize}

In addition, \hgext{convert} can export changes from Mercurial to
Subversion.  This makes it possible to try Subversion and Mercurial in
parallel before committing to a switchover, without risking the loss
of any work.

The \hgxcmd{convert}{convert} command is easy to use.  Simply point it
at the path or URL of the source repository, optionally give it the
name of the destination repository, and it will start working.  After
the initial conversion, just run the same command again to import new
changes.
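
As a minimal sketch of this workflow, the fragments below assume a
hypothetical Subversion repository at
\texttt{http://svn.example.com/project/trunk} and a hypothetical
destination directory named \texttt{project-hg}; substitute your own
locations.  Because \hgext{convert} is an extension, it must be
enabled before use, for example with an entry such as this in your
\texttt{hgrc} file:
\begin{verbatim}
[extensions]
convert =
\end{verbatim}
The initial conversion and a later incremental update then use the
same command line:
\begin{verbatim}
hg convert http://svn.example.com/project/trunk project-hg
hg convert http://svn.example.com/project/trunk project-hg
\end{verbatim}
The second run should only pick up changes made in the source
repository since the first.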


%%% Local Variables: 
%%% mode: latex
%%% TeX-master: "00book"
%%% End: