hgbook

annotate es/intro.tex @ 403:4cdeb830118b

Starting translating intro to spanish
author Igor TAmara <igor@tamarapatino.org>
date Sat Nov 08 19:31:13 2008 -0500 (2008-11-08)
\chapter{Introduction}
\label{chap:intro}

\section{About revision control}

Revision control is the process of managing multiple versions of a
piece of information.  In its simplest form, this is something that
many people do by hand: every time you modify a file, you save it
under a new name that contains a number, one higher than the number
of the preceding version.
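
The by-hand scheme just described can be sketched in a few lines of Python. This is only an illustration: the naming pattern (`report-3.txt`) and the function name `next_version_name` are assumptions made for the example, not something the text prescribes.

```python
import re


def next_version_name(existing, stem, ext):
    """Return the next numbered file name, e.g. 'report-4.txt'.

    `existing` is an iterable of file names already saved by hand.
    Names that do not match '<stem>-<number><ext>' are ignored.
    """
    pattern = re.compile(
        r"^" + re.escape(stem) + r"-(\d+)" + re.escape(ext) + r"$"
    )
    numbers = []
    for name in existing:
        match = pattern.match(name)
        if match:
            numbers.append(int(match.group(1)))
    # The first saved copy gets number 1; later ones count upward.
    return "{}-{}{}".format(stem, max(numbers, default=0) + 1, ext)
```

A scheme like this keeps the copies apart, but it only works if everyone applies it consistently.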

Manually managing multiple versions of even a single file is an
error-prone task, which is why software tools to help with this
process have long been available.  The earliest automated revision
control tools were intended to help a single user manage revisions of
a single file.  Over the past few decades, the scope of revision
control tools has expanded greatly; they now manage many files and
help several people to work together.  The best modern revision
control tools have no problem coping with thousands of people working
on projects that consist of tens of thousands of files.

\subsection{Why use revision control?}

There are a number of reasons why you or your team might want to use
an automated revision control tool for a project.
\begin{itemize}
\item It will track the history and evolution of your project, so you
  don't have to do it by hand.  For every change, you will have a log
  of \emph{who} made it; \emph{why} they made it; \emph{when} they
  made it; and \emph{what} the change was.
\item When you're working with other people, revision control
  software makes it easier to collaborate.  For example, when several
  people make potentially incompatible changes at almost the same
  time, the software will help you to identify and resolve those
  conflicts.
\item It can help you to recover from mistakes.  If you make a change
  that later turns out to be an error, you can revert one or more
  files to an earlier version.  In fact, a \emph{really} good revision
  control tool will even help you to work out exactly when a problem
  was introduced (see section~\ref{sec:undo:bisect} for details).
\item It will help you to work simultaneously on, and manage the
  differences between, multiple versions of your project.
\end{itemize}
Most of these reasons are equally valid, at least in theory, whether
you're working on a project by yourself or with many other people.

A key question about the practicality of revision control at these
two different scales (``lone hacker'' and ``huge team'') is how its
\emph{benefits} compare to its \emph{costs}.  A revision control tool
that is difficult to understand or use will impose a high cost.

A five-hundred-person project is likely to collapse under its own
weight almost immediately without a revision control tool and
process.  In this case, the cost of using revision control is hardly
worth considering, since \emph{without} it, failure is almost
guaranteed.

On the other hand, a one-person ``quick hack'' might seem to rule out
the need for a revision control tool, because the cost of using one
would surely be close to the overall cost of the project.  Right?

Mercurial supports \emph{both} of these scales of development.  You
can learn the basics in just a few minutes, and due to its low
overhead, you can apply revision control to the smallest of projects
with ease.  Its simplicity means you won't have abstruse concepts or
command sequences competing for mental space with whatever you're
\emph{really} trying to do.  At the same time, Mercurial's high
performance and peer-to-peer nature let you scale painlessly to
handle large projects.

No revision control tool can rescue a poorly run project, but a good
choice of tools can make a huge difference to the fluidity with which
you can work on a project.

\subsection{The many names of revision control}

Revision control is a diverse field, so much so that it doesn't have
a single name or acronym.  Here are a few of the more common names
and acronyms you may encounter:
\begin{itemize}
\item Revision control (RCS)
\item Software configuration management (SCM), or configuration
  management
\item Source code management
\item Source code control, or source control
\item Version control (VCS)
\end{itemize}
Some people claim that these terms have different meanings, but in
practice they overlap so much that there is no agreed or useful way
to tell them apart.

\section{A short history of revision control}

The oldest well-known revision control tool is SCCS (Source Code
Control System), which Marc Rochkind wrote at Bell Labs in the early
1970s.  SCCS operated on individual files, and required every person
working on a project to have access to a shared workspace on a single
system.  Only one person could modify a file at any time; arbitration
for access to files was via locks.  It was common for people to lock
files, and later forget to unlock them, preventing anyone else from
modifying those files without the help of an administrator.

Walter Tichy developed a free alternative to SCCS in the early 1980s;
he called his program RCS (Revision Control System).  Like SCCS, RCS
required developers to work in a single shared workspace, and to lock
files to prevent several people from modifying them simultaneously.

Later in the 1980s, Dick Grune used RCS as a building block for a set
of command-line scripts he originally called cmt, but then renamed to
CVS (Concurrent Versions System).  The big innovation of CVS was that
it let developers work simultaneously and more or less independently
in their own personal workspaces.  The personal workspaces prevented
developers from stepping on each other's toes all the time, as was
common with SCCS and RCS.  Each developer had a copy of every project
file, and could modify their copies independently.  They had to merge
their edits before committing changes to the central repository.

Brian Berliner took Grune's original scripts and rewrote them in~C,
releasing in 1989 the code that has since developed into the modern
version of CVS.  CVS subsequently acquired the ability to operate over
a network connection, giving it a client/server architecture.  CVS's
architecture is centralised; only the central repository holds the
history of the project.  Client workspaces contain just copies of
recent versions of the project's files, and a little metadata to tell
them where the server is.  CVS has been enormously successful; it is
probably the world's most widely used revision control system.

In the early 1990s, Sun Microsystems developed an early distributed
revision control system, called TeamWare.  A TeamWare workspace
contains a complete copy of the project's history.  TeamWare has no
notion of a central repository.  (CVS relied upon RCS for its history
storage; TeamWare used SCCS.)

As the 1990s progressed, awareness grew of a number of problems with
CVS.  It records simultaneous changes to multiple files individually,
instead of grouping them together as a single logically atomic
operation.  It does not manage its file hierarchy well; it is easy to
make a mess of a repository by renaming files and directories.  Worse,
its source code is difficult to read and maintain, which made the
``pain level'' of fixing these architectural problems prohibitive.

In 2001, Jim Blandy and Karl Fogel, two developers who had worked on
CVS, started a project to replace it with a tool that would have a
better architecture and cleaner code.  The result, Subversion, does
not stray from CVS's centralised client/server model, but it adds
multi-file atomic commits, better namespace management, and a number
of other features that make it a generally better tool than CVS.
Since its initial release, it has rapidly grown in popularity.

More or less simultaneously, Graydon Hoare began working on an
ambitious distributed revision control system that he named Monotone.
While Monotone addresses many of CVS's design flaws and has a
peer-to-peer architecture, it goes beyond earlier (and subsequent)
revision control tools in a number of innovative ways.  It uses
cryptographic hashes as identifiers, and has an integral notion of
``trust'' for code from different sources.

Mercurial began life in 2005.  While a few aspects of its design are
influenced by Monotone, Mercurial focuses on ease of use, high
performance, and scalability to very large projects.

\section{Trends in revision control}

There has been an unmistakable trend in the development and use of
revision control tools over the past four decades, as people have
become familiar with the capabilities of their tools and constrained
by their limitations.

The first generation began by managing single files on individual
computers.  Although these tools represented a huge advance over
ad-hoc manual revision control, their locking model and reliance on a
single computer limited them to small, tightly-knit teams.

The second generation loosened these constraints by moving to
network-centered architectures, and managing entire projects at a
time.  As projects grew larger, they ran into new problems.  With
clients needing to talk to servers very frequently, server scaling
became an issue for large projects.  An unreliable network connection
could prevent remote users from being able to talk to the server at
all.  As open source projects started making read-only access
available anonymously to anyone, people without commit privileges
found that they could not use the tools to interact with a project in
a natural way, as they could not record their changes.

The current generation of revision control tools is peer-to-peer in
nature.  All of these systems have dropped the dependency on a single
central server, and allow people to distribute their revision control
data to where it's actually needed.  Collaboration over the Internet
has moved from being constrained by technology to being a matter of
choice and consensus.  Modern tools can operate offline indefinitely
and autonomously, with a network connection only needed when syncing
changes with another repository.

\section{A few of the advantages of distributed revision control}

Even though distributed revision control tools have for several years
been as robust and usable as their previous-generation counterparts,
people using older tools have not yet necessarily woken up to their
advantages.  There are a number of ways in which distributed tools
shine relative to centralised ones.

For an individual developer, distributed tools are almost always much
faster than centralised tools.  This is for a simple reason: a
centralised tool needs to talk over the network for many common
operations, because most metadata is stored in a single copy on the
central server.  A distributed tool stores all of its metadata
locally.  All else being equal, talking over the network adds overhead
to a centralised tool.  Don't underestimate the value of a snappy,
responsive tool: you're going to spend a lot of time interacting with
your revision control software.

Distributed tools are indifferent to the vagaries of your server
infrastructure, again because they replicate metadata to so many
locations.  If you use a centralised system and your server catches
fire, you'd better hope that your backup media are reliable, and that
your last backup was recent and actually worked.  With a distributed
tool, you have many backups available on every contributor's computer.

The reliability of your network will affect distributed tools far less
than it will centralised tools.  You can't even use a centralised tool
without a network connection, except for a few highly constrained
commands.  With a distributed tool, if your network connection goes
down while you're working, you may not even notice.  The only thing
you won't be able to do is talk to repositories on other computers,
something that is relatively rare compared with local operations.  If
you have a far-flung team of collaborators, this may be significant.

\subsection{Advantages for open source projects}

If you take a shine to an open source project and decide that you
would like to start hacking on it, and that project uses a distributed
revision control tool, you are at once a peer with the people who
consider themselves the ``core'' of that project.  If they publish
their repositories, you can immediately copy their project history,
start making changes, and record your work, using the same tools in
the same ways as insiders.  By contrast, with a centralised tool, you
must use the software in a ``read only'' mode unless someone grants
you permission to commit changes to their central server.  Until then,
you won't be able to record changes, and your local modifications will
be at risk of corruption any time you try to update your client's view
of the repository.

\subsubsection{The forking non-problem}

It has been suggested that distributed revision control tools pose
some sort of risk to open source projects because they make it easy to
``fork'' the development of a project.  A fork happens when there are
differences in opinion or attitude between groups of developers that
cause them to decide that they can't work together any longer.  Each
side takes a more or less complete copy of the project's source code,
and goes off in its own direction.

Sometimes the camps in a fork decide to reconcile their differences.
With a centralised revision control system, the \emph{technical}
process of reconciliation is painful, and has to be performed largely
by hand.  You have to decide whose revision history is going to
``win'', and graft the other team's changes into the tree somehow.
This usually loses some or all of one side's revision history.

What distributed tools do with respect to forking is they make forking
the \emph{only} way to develop a project.  Every single change that
you make is potentially a fork point.  The great strength of this
approach is that a distributed revision control tool has to be really
good at \emph{merging} forks, because forks are absolutely
fundamental: they happen all the time.

If every piece of work that everybody does, all the time, is framed in
terms of forking and merging, then what the open source world refers
to as a ``fork'' becomes \emph{purely} a social issue.  If anything,
distributed tools \emph{lower} the likelihood of a fork:
\begin{itemize}
\item They eliminate the social distinction that centralised tools
  impose: that between insiders (people with commit access) and
  outsiders (people without).
\item They make it easier to reconcile after a social fork, because
  all that's involved from the perspective of the revision control
  software is just another merge.
\end{itemize}

Some people resist distributed tools because they want to retain tight
control over their projects, and they believe that centralised tools
give them this control.  However, if you're of this belief, and you
publish your CVS or Subversion repositories publicly, there are
plenty of tools available that can pull out your entire project's
history (albeit slowly) and recreate it somewhere that you don't
control.  So while your control in this case is illusory, you are
forgoing the ability to fluidly collaborate with whatever people feel
compelled to mirror and fork your history.

\subsection{Advantages for commercial projects}

Many commercial projects are undertaken by teams that are scattered
across the globe.  Contributors who are far from a central server will
see slower command execution and perhaps less reliability.  Commercial
revision control systems attempt to ameliorate these problems with
remote-site replication add-ons that are typically expensive to buy
and cantankerous to administer.  A distributed system doesn't suffer
from these problems in the first place.  Better yet, you can easily
set up multiple authoritative servers, say one per site, so that
there's no redundant communication between repositories over expensive
long-haul network links.

Centralised revision control systems tend to have relatively low
scalability.  It's not unusual for an expensive centralised system to
fall over under the combined load of just a few dozen concurrent
users.  Once again, the typical response tends to be an expensive and
clunky replication facility.  Since the load on a central server (if
you have one at all) is many times lower with a distributed tool,
because all of the data is replicated everywhere, a single
cheap server can handle the needs of a much larger team, and
replication to balance load becomes a simple matter of scripting.

If you have an employee in the field, troubleshooting a problem at a
customer's site, they'll benefit from distributed revision control.
The tool will let them generate custom builds, try different fixes in
isolation from each other, and search efficiently through history for
the sources of bugs and regressions in the customer's environment, all
without needing to connect to your company's network.

\section{Why choose Mercurial?}

Mercurial has a unique set of properties that make it a particularly
good choice as a revision control system.
\begin{itemize}
\item It is easy to learn and use.
\item It is lightweight.
\item It scales excellently.
\item It is easy to customise.
\end{itemize}

If you are at all familiar with revision control systems, you should
be able to get up and running with Mercurial in less than five
minutes.  Even if not, it will take no more than a few minutes
longer.  Mercurial's command and feature sets are generally uniform
and consistent, so you can keep track of a few general rules instead
of a host of exceptions.

On a small project, you can start working with Mercurial in moments.
Creating new changes and branches; transferring changes around
(whether locally or over a network); and history and status operations
are all fast.  Mercurial attempts to stay nimble and largely out of
your way by combining low cognitive overhead with blazingly fast
operations.

The usefulness of Mercurial is not limited to small projects: it is
used by projects with hundreds to thousands of contributors, each
containing tens of thousands of files and hundreds of megabytes of
source code.

If the core functionality of Mercurial is not enough for you, it's
easy to build on.  Mercurial is well suited to scripting tasks, and
its clean internals and implementation in Python make it easy to add
features in the form of extensions.  There are a number of popular and
useful extensions already available, ranging from helping to identify
bugs to improving performance.

\section{Mercurial compared with other tools}

Before you read on, please understand that this section necessarily
reflects my own experiences, interests, and (dare I say it) biases.  I
have used every one of the revision control tools listed below, in
most cases for several years at a time.


\subsection{Subversion}

Subversion is a popular revision control tool, developed to replace
CVS.  It has a centralised client/server architecture.

Subversion and Mercurial have similarly named commands for performing
the same operations, so if you're familiar with one, it is easy to
learn to use the other.  Both tools are portable to all popular
operating systems.

Prior to version 1.5, Subversion had no useful support for merges.
At the time of writing, its merge tracking capability is new, and known to be
\href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated
  and buggy}.

Mercurial has a substantial performance advantage over Subversion on
every revision control operation I have benchmarked.  I have measured
its advantage as ranging from a factor of two to a factor of six when
compared with Subversion~1.4.3's \emph{ra\_local} file store, which is
the fastest access method available.  In more realistic deployments
involving a network-based store, Subversion will be at a substantially
larger disadvantage.  Because many Subversion commands must talk to
the server and Subversion does not have useful replication facilities,
server capacity and network bandwidth become bottlenecks for modestly
large projects.

Additionally, Subversion incurs substantial storage overhead to avoid
network transactions for a few common operations, such as finding
modified files (\texttt{status}) and displaying modifications against
the current revision (\texttt{diff}).  As a result, a Subversion
working copy is often the same size as, or larger than, a Mercurial
repository and working directory, even though the Mercurial repository
contains a complete history of the project.

Subversion is widely supported by third party tools.  Mercurial
currently lags considerably in this area.  This gap is closing,
however, and indeed some of Mercurial's GUI tools now outshine their
Subversion equivalents.  Like Mercurial, Subversion has an excellent
user manual.

Because Subversion doesn't store revision history on the client, it is
well suited to managing projects that deal with lots of large, opaque
binary files.  If you check in fifty revisions to an incompressible
10MB file, Subversion's client-side space usage stays constant.  The
space used by any distributed SCM will grow rapidly in proportion to
the number of revisions, because the differences between each revision
are large.
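
To put rough numbers on the fifty-revision example, here is a small Python sketch. It rests on simplifying assumptions that are not from the text: each revision of an incompressible file is taken to cost its full size in history, and a centralised client is taken to hold a fixed number of copies of the current revision only (`copies_on_client` is a hypothetical parameter). Real tools will differ in the details.

```python
def centralised_client_mb(file_mb, revisions, copies_on_client=1):
    """Client-side space when history lives only on the server.

    The revision count does not affect the result: the client holds
    just `copies_on_client` copies of the current revision.
    """
    return file_mb * copies_on_client


def distributed_client_mb(file_mb, revisions):
    """Client-side space when every clone carries the full history.

    Assumes each revision of an incompressible file is stored at
    roughly full size, plus one checked-out working copy.
    """
    return file_mb * revisions + file_mb


if __name__ == "__main__":
    # Columns: revisions, centralised MB, distributed MB.
    for revs in (1, 10, 50):
        print(revs,
              centralised_client_mb(10, revs),
              distributed_client_mb(10, revs))
```

Under these assumptions, fifty revisions of a 10MB file take about 510MB in a distributed clone, against 10MB on a centralised client (or 20MB if the client also caches one pristine copy).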

In addition, it's often difficult or, more usually, impossible to
merge different versions of a binary file.  Subversion's ability to
let a user lock a file, so that they temporarily have the exclusive
right to commit changes to it, can be a significant advantage to a
project where binary files are widely used.

Mercurial can import revision history from a Subversion repository.
It can also export revision history to a Subversion repository.  This
makes it easy to ``test the waters'' and use Mercurial and Subversion
in parallel before deciding to switch.  History conversion is
incremental, so you can perform an initial conversion, then small
additional conversions afterwards to bring in new changes.


\subsection{Git}

Git is a distributed revision control tool that was developed for
managing the Linux kernel source tree.  Like Mercurial, its early
design was somewhat influenced by Monotone.

Git has a very large command set, with version~1.5.0 providing~139
individual commands.  It has something of a reputation for being
difficult to learn.  Compared to Git, Mercurial has a strong focus on
simplicity.

In terms of performance, Git is extremely fast.  In several cases, it
is faster than Mercurial, at least on Linux, while Mercurial performs
better on other operations.  However, on Windows, the performance and
general level of support that Git provides is, at the time of writing,
far behind that of Mercurial.

While a Mercurial repository needs no maintenance, a Git repository
requires frequent manual ``repacks'' of its metadata.  Without these,
performance degrades, while space usage grows rapidly.  A server that
contains many Git repositories that are not rigorously and frequently
repacked will become heavily disk-bound during backups, and there have
been instances of daily backups taking far longer than~24 hours as a
result.  A freshly packed Git repository is slightly smaller than a
Mercurial repository, but an unpacked repository is several orders of
magnitude larger.

The core of Git is written in C.  Many Git commands are implemented as
shell or Perl scripts, and the quality of these scripts varies widely.
I have encountered several instances where scripts charged along
blindly in the presence of errors that should have been fatal.

Mercurial can import revision history from a Git repository.


igor@402 478 \subsection{CVS}
igor@402 479
igor@402 480 CVS is probably the most widely used revision control tool in the
igor@402 481 world. Due to its age and internal untidiness, it has been only
igor@402 482 lightly maintained for many years.
igor@402 483
igor@402 484 It has a centralised client/server architecture. It does not group
igor@402 485 related file changes into atomic commits, making it easy for people to
igor@402 486 ``break the build'': one person can successfully commit part of a
igor@402 487 change and then be blocked by the need for a merge, causing other
igor@402 488 people to see only a portion of the work they intended to do. This
igor@402 489 also affects how you work with project history. If you want to see
igor@402 490 all of the modifications someone made as part of a task, you will need
igor@402 491 to manually inspect the descriptions and timestamps of the changes
igor@402 492 made to each file involved (if you even know what those files were).
igor@402 493
igor@402 494 CVS has a muddled notion of tags and branches that I will not attempt
igor@402 495 to even describe. It does not support renaming of files or
igor@402 496 directories well, making it easy to corrupt a repository. It has
igor@402 497 almost no internal consistency checking capabilities, so it is usually
igor@402 498 not even possible to tell whether or how a repository is corrupt. I
igor@402 499 would not recommend CVS for any project, existing or new.
igor@402 500
igor@402 501 Mercurial can import CVS revision history. However, there are a few
igor@402 502 caveats that apply; these are true of every other revision control
igor@402 503 tool's CVS importer, too. Because CVS lacks atomic changes and does
igor@402 504 not version its filesystem hierarchy, it is not possible to reconstruct
igor@402 505 CVS history completely accurately; some guesswork is involved, and
igor@402 506 renames will usually not show up. Because a lot of advanced CVS
igor@402 507 administration has to be done by hand and is hence error-prone, it's
igor@402 508 common for CVS importers to run into multiple problems with corrupted
igor@402 509 repositories (completely bogus revision timestamps and files that have
igor@402 510 remained locked for over a decade are just two of the less interesting
igor@402 511 problems I can recall from personal experience).
igor@402 512
igor@402 513 Mercurial can import revision history from a CVS repository.
igor@402 514
igor@402 515
igor@402 516 \subsection{Commercial tools}
igor@402 517
igor@402 518 Perforce has a centralised client/server architecture, with no
igor@402 519 client-side caching of any data. Unlike modern revision control
igor@402 520 tools, Perforce requires that a user run a command to inform the
igor@402 521 server about every file they intend to edit.
igor@402 522
igor@402 523 The performance of Perforce is quite good for small teams, but it
igor@402 524 falls off rapidly as the number of users grows beyond a few dozen.
igor@402 525 Modestly large Perforce installations require the deployment of
igor@402 526 proxies to cope with the load their users generate.
igor@402 527
igor@402 528
igor@402 529 \subsection{Choosing a revision control tool}
igor@402 530
igor@402 531 With the exception of CVS, all of the tools listed above have unique
igor@402 532 strengths that suit them to particular styles of work. There is no
igor@402 533 single revision control tool that is best in all situations.
igor@402 534
igor@402 535 As an example, Subversion is a good choice for working with frequently
igor@402 536 edited binary files, due to its centralised nature and support for
igor@402 537 file locking.
igor@402 538
igor@402 539 I personally find Mercurial's properties of simplicity, performance,
igor@402 540 and good merge support to be a compelling combination that has served
igor@402 541 me well for several years.
igor@402 542
igor@402 543
igor@402 544 \section{Switching from another tool to Mercurial}
igor@402 545
igor@402 546 Mercurial is bundled with an extension named \hgext{convert}, which
igor@402 547 can incrementally import revision history from several other revision
igor@402 548 control tools. By ``incremental'', I mean that you can convert all of
igor@402 549 a project's history to date in one go, then rerun the conversion later
igor@402 550 to obtain new changes that happened after the initial conversion.
igor@402 551
igor@402 552 The revision control tools supported by \hgext{convert} are as
igor@402 553 follows:
igor@402 554 \begin{itemize}
igor@402 555 \item Subversion
igor@402 556 \item CVS
igor@402 557 \item Git
igor@402 558 \item Darcs
igor@402 559 \end{itemize}
igor@402 560
igor@402 561 In addition, \hgext{convert} can export changes from Mercurial to
igor@402 562 Subversion. This makes it possible to try Subversion and Mercurial in
igor@402 563 parallel before committing to a switchover, without risking the loss
igor@402 564 of any work.
igor@402 565
igor@402 566 The \hgxcmd{convert}{convert} command is easy to use. Simply point it
igor@402 567 at the path or URL of the source repository, optionally give it the
igor@402 568 name of the destination repository, and it will start working. After
igor@402 569 the initial conversion, just run the same command again to import new
igor@402 570 changes.
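
As a rough sketch of that workflow, the commands below convert a
project and later refresh the result. The source path
\texttt{\~{}/src/myproject-git} and the destination name
\texttt{myproject-hg} are hypothetical placeholders, not names taken
from any real project. The sketch assumes that the \hgext{convert}
extension has already been enabled in the \texttt{[extensions]}
section of your Mercurial configuration file, and that the tools
needed to read the source repository (here, Git) are installed.

\begin{verbatim}
# Initial conversion: source repository first, destination second.
# Both names are hypothetical; substitute your own.
hg convert ~/src/myproject-git myproject-hg

# Some time later: rerun the same command to import only the
# changes made to the source since the previous run.
hg convert ~/src/myproject-git myproject-hg
\end{verbatim}

Keeping the source and destination arguments identical between runs
lets the second invocation continue from where the first one stopped.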
igor@402 571
igor@402 572
igor@402 573 %%% Local Variables:
igor@402 574 %%% mode: latex
igor@402 575 %%% TeX-master: "00book"
igor@402 576 %%% End: