\chapter{Introduction}
\label{chap:intro}

\section{About revision control}

Revision control is the process of managing multiple versions of a
piece of information. In its simplest form, it is something that most
people do by hand: every time you modify a file, you save it under a
new name that contains a number, each one higher than the last.

Managing many versions of a file by hand is an error-prone task, so
tools that help to automate the process have long been available. The
earliest automated revision control tools were designed for a single
user managing a single file. Over the past few decades, the scope of
revision control tools has expanded considerably; they now manage many
files and make it easier for several people to work together. The
best modern revision control tools have no trouble with thousands of
people working on projects that consist of tens of thousands of
files.

\subsection{Why use revision control?}

There are many reasons why you or your team might want to use an
automated revision control tool for a project.
\begin{itemize}
\item It tracks the history and evolution of your project, so that you
  do not have to do so by hand. For every change, you will have a
  log of \emph{who} made it; \emph{why} they made it;
  \emph{when} they made it; and \emph{what} the change was.
\item When you work with other people, revision control software
  makes collaboration easier. For example, when several people make
  incompatible changes at almost the same time, the software helps
  you to identify and resolve those conflicts.
\item It can help you to recover from mistakes. If you make a change
  that later turns out to be an error, you can revert one or many
  files to an earlier version. In fact, a \emph{really} good tool
  can even help you to work out efficiently exactly when the error
  was introduced (see section~\ref{sec:undo:bisect} for details).
\item It lets you work simultaneously on multiple versions of your
  project, and manage the differences between them.
\end{itemize}
Most of these reasons are equally valid, at least in theory, whether
you are working on a project alone or with many other people.

A key question about the practicality of revision control at these
two scales (``lone hacker'' and ``huge team'') is how its
\emph{benefits} compare with its \emph{costs}. A revision control
tool that is difficult to understand or use will impose a high cost.

A five-hundred-person project is likely to collapse under its own
weight almost immediately without a revision control tool and
process. In this case, the cost of using revision control is hardly
worth considering, since \emph{without} it, failure is almost
guaranteed.

On the other hand, a one-person ``quick hack'' might seem like a poor
place to use a revision control tool, because surely the cost of
using one must be close to the overall cost of the project. Right?

Mercurial uniquely supports \emph{both} of these scales of
development. You can learn the basics in a few minutes, and because
of its low overhead, you can easily apply revision control to even the
smallest project. Its simplicity means that you will not have
abstruse concepts or command sequences competing for mental space with
whatever you are \emph{really} trying to do. At the same time,
Mercurial's high performance and peer-to-peer nature let it scale
painlessly to handle large projects.

No revision control tool can rescue a poorly run project, but the
choice of tools can make a big difference to how smoothly you can
work on a project.

\subsection{The many names of revision control}

Revision control is a broad field, so broad that it has no single
name or acronym. Here are some of the common names and acronyms you
may come across:
\begin{itemize}
\item Revision control (RCS)
\item Software configuration management (SCM), or configuration
  management
\item Source code management
\item Source code control, or source control
\item Version control (VCS)
\end{itemize}
Some people claim that these terms have different meanings, but in
practice they overlap so much that there is no agreed or useful way
to tell them apart.

\section{A short history of revision control}

The oldest known revision control tool is SCCS (Source Code Control
System), which Marc Rochkind wrote at Bell Labs in the early 1970s.
SCCS operated on individual files, and required every person working
on a project to have access to a shared workspace on a single system.
Only one person could modify a file at any given time; access to
files was arbitrated with locks. It was common for people to lock
files and later forget to unlock them, preventing anyone else from
modifying those files without the help of an administrator.

Walter Tichy developed a free alternative to SCCS in the early 1980s;
he called his program RCS (Revision Control System). Like SCCS, RCS
required developers to work in a single shared workspace, and to lock
files to prevent several people from modifying them simultaneously.

Later in the 1980s, Dick Grune used RCS as a building block for a set
of shell scripts that he initially called cmt, but then renamed to CVS
(Concurrent Versions System). The big innovation of CVS was that it
let developers work simultaneously and more or less independently in
their own personal workspaces. The personal workspaces prevented
developers from stepping on each other's toes all the time, as was
common with SCCS and RCS. Each developer had a copy of every project
file, and could modify their copies independently. They had to merge
their edits before committing changes to the central repository.

Brian Berliner took Grune's original scripts and rewrote them in~C,
releasing in 1989 the code that has since developed into the modern
version of CVS. CVS subsequently acquired the ability to operate over
a network connection, giving it a client/server architecture. CVS's
architecture is centralised; the history of the project exists only in
the central repository. Client workspaces contain only copies of
recent versions of the project's files, and a little metadata to tell
them where the server is. CVS has been enormously successful; it is
probably the world's most widely used revision control system.

In the early 1990s, Sun Microsystems developed an early distributed
revision control system, called TeamWare. A TeamWare workspace
contains a complete copy of the project's history. TeamWare has no
notion of a central repository. (CVS relied upon RCS for its history
storage; TeamWare used SCCS.)

As the 1990s progressed, awareness grew of a number of problems with
CVS. It records simultaneous changes to multiple files individually,
instead of grouping them together as a single logically atomic
operation. It does not manage its file hierarchy well; it is easy to
make a mess of a repository by renaming files and directories. Worse,
its source code is difficult to read and maintain, which made the
``pain level'' of fixing these architectural problems prohibitive.

In 2001, Jim Blandy and Karl Fogel, two developers who had worked on
CVS, started a project to replace it with a tool that would have a
better architecture and cleaner code. The result, Subversion, does
not stray from CVS's centralised client/server model, but it adds
multi-file atomic commits, better namespace management, and a number
of other features that make it a generally better tool than CVS.
Since its initial release, it has rapidly grown in popularity.

More or less simultaneously, Graydon Hoare began working on an
ambitious distributed revision control system that he named Monotone.
While Monotone addresses many of CVS's design flaws and has a
peer-to-peer architecture, it goes beyond earlier (and subsequent)
revision control tools in a number of innovative ways. It uses
cryptographic hashes as identifiers, and has an integral notion of
``trust'' for code from different sources.

Mercurial began life in 2005. While a few aspects of its design are
influenced by Monotone, Mercurial focuses on ease of use, high
performance, and scalability to very large projects.

\section{Trends in revision control}

There has been an unmistakable trend in the development and use of
revision control tools over the past four decades, as people have
become familiar with the capabilities of their tools and constrained
by their limitations.

The first generation began by managing single files on individual
computers. Although these tools represented a huge advance over
ad-hoc manual revision control, their locking model and reliance on a
single computer limited them to small, tightly-knit teams.

The second generation loosened these constraints by moving to
network-centered architectures, and managing entire projects at a
time. As projects grew larger, they ran into new problems. With
clients needing to talk to servers very frequently, server scaling
became an issue for large projects. An unreliable network connection
could prevent remote users from being able to talk to the server at
all. As open source projects started making read-only access
available anonymously to anyone, people without commit privileges
found that they could not use the tools to interact with a project in
a natural way, as they could not record their changes.

The current generation of revision control tools is peer-to-peer in
nature. All of these systems have dropped the dependency on a single
central server, and allow people to distribute their revision control
data to where it's actually needed. Collaboration over the Internet
has moved from being constrained by technology to being a matter of
choice and consensus. Modern tools can operate offline indefinitely
and autonomously, with a network connection only needed when syncing
changes with another repository.

\section{A few of the advantages of distributed revision control}

Even though distributed revision control tools have for several years
been as robust and usable as their previous-generation counterparts,
people using older tools have not yet necessarily woken up to their
advantages. There are a number of ways in which distributed tools
shine relative to centralised ones.

For an individual developer, distributed tools are almost always much
faster than centralised tools. This is for a simple reason: a
centralised tool needs to talk over the network for many common
operations, because most metadata is stored in a single copy on the
central server. A distributed tool stores all of its metadata
locally. All else being equal, talking over the network adds overhead
to a centralised tool. Don't underestimate the value of a snappy,
responsive tool: you're going to spend a lot of time interacting with
your revision control software.

Distributed tools are indifferent to the vagaries of your server
infrastructure, again because they replicate metadata to so many
locations. If you use a centralised system and your server catches
fire, you'd better hope that your backup media are reliable, and that
your last backup was recent and actually worked. With a distributed
tool, you have many backups available on every contributor's computer.

The reliability of your network will affect distributed tools far less
than it will centralised tools. You can't even use a centralised tool
without a network connection, except for a few highly constrained
commands. With a distributed tool, if your network connection goes
down while you're working, you may not even notice. The only thing
you won't be able to do is talk to repositories on other computers,
something that is relatively rare compared with local operations. If
you have a far-flung team of collaborators, this may be significant.

\subsection{Advantages for open source projects}

If you take a shine to an open source project and decide that you
would like to start hacking on it, and that project uses a distributed
revision control tool, you are at once a peer with the people who
consider themselves the ``core'' of that project. If they publish
their repositories, you can immediately copy their project history,
start making changes, and record your work, using the same tools in
the same ways as insiders. By contrast, with a centralised tool, you
must use the software in a ``read only'' mode unless someone grants
you permission to commit changes to their central server. Until then,
you won't be able to record changes, and your local modifications will
be at risk of corruption any time you try to update your client's view
of the repository.

\subsubsection{The forking non-problem}

It has been suggested that distributed revision control tools pose
some sort of risk to open source projects because they make it easy to
``fork'' the development of a project. A fork happens when there are
differences in opinion or attitude between groups of developers that
cause them to decide that they can't work together any longer. Each
side takes a more or less complete copy of the project's source code,
and goes off in its own direction.

Sometimes the camps in a fork decide to reconcile their differences.
With a centralised revision control system, the \emph{technical}
process of reconciliation is painful, and has to be performed largely
by hand. You have to decide whose revision history is going to
``win'', and graft the other team's changes into the tree somehow.
This usually loses some or all of one side's revision history.

What distributed tools do with respect to forking is make forking
the \emph{only} way to develop a project. Every single change that
you make is potentially a fork point. The great strength of this
approach is that a distributed revision control tool has to be really
good at \emph{merging} forks, because forks are absolutely
fundamental: they happen all the time.

If every piece of work that everybody does, all the time, is framed in
terms of forking and merging, then what the open source world refers
to as a ``fork'' becomes \emph{purely} a social issue. If anything,
distributed tools \emph{lower} the likelihood of a fork:
\begin{itemize}
\item They eliminate the social distinction that centralised tools
  impose: that between insiders (people with commit access) and
  outsiders (people without).
\item They make it easier to reconcile after a social fork, because
  all that's involved from the perspective of the revision control
  software is just another merge.
\end{itemize}

Some people resist distributed tools because they want to retain tight
control over their projects, and they believe that centralised tools
give them this control. However, if you're of this belief, and you
publish your CVS or Subversion repositories publicly, there are
plenty of tools available that can pull out your entire project's
history (albeit slowly) and recreate it somewhere that you don't
control. So while your control in this case is illusory, you are
forgoing the ability to fluidly collaborate with whatever people feel
compelled to mirror and fork your history.

\subsection{Advantages for commercial projects}

Many commercial projects are undertaken by teams that are scattered
across the globe. Contributors who are far from a central server will
see slower command execution and perhaps less reliability. Commercial
revision control systems attempt to ameliorate these problems with
remote-site replication add-ons that are typically expensive to buy
and cantankerous to administer. A distributed system doesn't suffer
from these problems in the first place. Better yet, you can easily
set up multiple authoritative servers, say one per site, so that
there's no redundant communication between repositories over expensive
long-haul network links.

Centralised revision control systems tend to have relatively low
scalability. It's not unusual for an expensive centralised system to
fall over under the combined load of just a few dozen concurrent
users. Once again, the typical response tends to be an expensive and
clunky replication facility. Since the load on a central server (if
you have one at all) is many times lower with a distributed tool,
because all of the data is replicated everywhere, a single cheap
server can handle the needs of a much larger team, and replication to
balance load becomes a simple matter of scripting.

If you have an employee in the field, troubleshooting a problem at a
customer's site, they'll benefit from distributed revision control.
The tool will let them generate custom builds, try different fixes in
isolation from each other, and search efficiently through history for
the sources of bugs and regressions in the customer's environment, all
without needing to connect to your company's network.

\section{Why choose Mercurial?}

Mercurial has a unique set of properties that make it a particularly
good choice as a revision control system.
\begin{itemize}
\item It is easy to learn and use.
\item It is lightweight.
\item It scales excellently.
\item It is easy to customise.
\end{itemize}

If you are at all familiar with revision control systems, you should
be able to get up and running with Mercurial in less than five
minutes. Even if not, it will take no more than a few minutes
longer. Mercurial's command and feature sets are generally uniform
and consistent, so you can keep track of a few general rules instead
of a host of exceptions.

On a small project, you can start working with Mercurial in moments.
Creating new changes and branches; transferring changes around
(whether locally or over a network); and history and status operations
are all fast. Mercurial attempts to stay nimble and largely out of
your way by combining low cognitive overhead with blazingly fast
operations.

The usefulness of Mercurial is not limited to small projects: it is
used by projects with hundreds to thousands of contributors, each
containing tens of thousands of files and hundreds of megabytes of
source code.

If the core functionality of Mercurial is not enough for you, it's
easy to build on. Mercurial is well suited to scripting tasks, and
its clean internals and implementation in Python make it easy to add
features in the form of extensions. There are a number of popular and
useful extensions already available, ranging from helping to identify
bugs to improving performance.

\section{Mercurial compared with other tools}

Before you read on, please understand that this section necessarily
reflects my own experiences, interests, and (dare I say it) biases. I
have used every one of the revision control tools listed below, in
most cases for several years at a time.

\subsection{Subversion}

Subversion is a popular revision control tool, developed to replace
CVS. It has a centralised client/server architecture.

Subversion and Mercurial have similarly named commands for performing
the same operations, so if you're familiar with one, it is easy to
learn to use the other. Both tools are portable to all popular
operating systems.

Prior to version 1.5, Subversion had no useful support for merges.
At the time of writing, its merge tracking capability is new, and known to be
\href{http://svnbook.red-bean.com/nightly/en/svn.branchmerge.advanced.html#svn.branchmerge.advanced.finalword}{complicated
  and buggy}.

Mercurial has a substantial performance advantage over Subversion on
every revision control operation I have benchmarked. I have measured
its advantage as ranging from a factor of two to a factor of six when
compared with Subversion~1.4.3's \emph{ra\_local} file store, which is
the fastest access method available. In more realistic deployments
involving a network-based store, Subversion will be at a substantially
larger disadvantage. Because many Subversion commands must talk to
the server and Subversion does not have useful replication facilities,
server capacity and network bandwidth become bottlenecks for modestly
large projects.

Additionally, Subversion incurs substantial storage overhead to avoid
network transactions for a few common operations, such as finding
modified files (\texttt{status}) and displaying modifications against
the current revision (\texttt{diff}). As a result, a Subversion
working copy is often the same size as, or larger than, a Mercurial
repository and working directory, even though the Mercurial repository
contains a complete history of the project.

Subversion is widely supported by third party tools. Mercurial
currently lags considerably in this area. This gap is closing,
however, and indeed some of Mercurial's GUI tools now outshine their
Subversion equivalents. Like Mercurial, Subversion has an excellent
user manual.

Because Subversion doesn't store revision history on the client, it is
well suited to managing projects that deal with lots of large, opaque
binary files. If you check in fifty revisions to an incompressible
10MB file, Subversion's client-side space usage stays constant. The
space used by any distributed SCM will grow rapidly in proportion to
the number of revisions, because the differences between each revision
are large.

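As a rough worked example of these figures, assume that each of the
fifty revisions of the 10MB file shares almost nothing with its
predecessor, so that storing differences saves no space (an assumption
made here for illustration, not a measurement). A client that keeps
the full history then needs on the order of
\[
  50 \times 10\,\mathrm{MB} = 500\,\mathrm{MB}
\]
for that one file, whereas a client that keeps no history needs an
amount of space that does not grow with the number of revisions
committed.
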
In addition, it's often difficult or, more usually, impossible to
merge different versions of a binary file. Subversion's ability to
let a user lock a file, so that they temporarily have the exclusive
right to commit changes to it, can be a significant advantage to a
project where binary files are widely used.

Mercurial can import revision history from a Subversion repository.
It can also export revision history to a Subversion repository. This
makes it easy to ``test the waters'' and use Mercurial and Subversion
in parallel before deciding to switch. History conversion is
incremental, so you can perform an initial conversion, then small
additional conversions afterwards to bring in new changes.

\subsection{Git}

Git is a distributed revision control tool that was developed for
managing the Linux kernel source tree. Like Mercurial, its early
design was somewhat influenced by Monotone.

Git has a very large command set, with version~1.5.0 providing~139
individual commands. It has something of a reputation for being
difficult to learn. Compared to Git, Mercurial has a strong focus on
simplicity.

In terms of performance, Git is extremely fast. In several cases, it
is faster than Mercurial, at least on Linux, while Mercurial performs
better on other operations. However, on Windows, the performance and
general level of support that Git provides are, at the time of
writing, far behind those of Mercurial.

While a Mercurial repository needs no maintenance, a Git repository
requires frequent manual ``repacks'' of its metadata. Without these,
performance degrades, while space usage grows rapidly. A server that
contains many Git repositories that are not rigorously and frequently
repacked will become heavily disk-bound during backups, and there have
been instances of daily backups taking far longer than~24 hours as a
result. A freshly packed Git repository is slightly smaller than a
Mercurial repository, but an unpacked repository is several orders of
magnitude larger.

The core of Git is written in C. Many Git commands are implemented as
shell or Perl scripts, and the quality of these scripts varies widely.
I have encountered several instances where scripts charged along
blindly in the presence of errors that should have been fatal.

Mercurial can import revision history from a Git repository.

igor@402
|
478 \subsection{CVS}
|
igor@402
|
479
|
igor@402
|
480 CVS is probably the most widely used revision control tool in the
|
igor@402
|
481 world. Due to its age and internal untidiness, it has been only
|
igor@402
|
482 lightly maintained for many years.
|
igor@402
|
483
|
igor@402
|
484 It has a centralised client/server architecture. It does not group
|
igor@402
|
485 related file changes into atomic commits, making it easy for people to
|
igor@402
|
486 ``break the build'': one person can successfully commit part of a
|
igor@402
|
487 change and then be blocked by the need for a merge, causing other
|
igor@402
|
488 people to see only a portion of the work that person intended to do. This
|
igor@402
|
489 also affects how you work with project history. If you want to see
|
igor@402
|
490 all of the modifications someone made as part of a task, you will need
|
igor@402
|
491 to manually inspect the descriptions and timestamps of the changes
|
igor@402
|
492 made to each file involved (if you even know what those files were).
|
igor@402
|
493
|
igor@402
|
494 CVS has a muddled notion of tags and branches that I will not even
|
igor@402
|
495 attempt to describe. It does not support renaming of files or
|
igor@402
|
496 directories well, making it easy to corrupt a repository. It has
|
igor@402
|
497 almost no internal consistency checking capabilities, so it is usually
|
igor@402
|
498 not even possible to tell whether or how a repository is corrupt. I
|
igor@402
|
499 would not recommend CVS for any project, existing or new.
|
igor@402
|
500
|
igor@402
|
501 Mercurial can import CVS revision history. However, there are a few
|
igor@402
|
502 caveats that apply; these are true of every other revision control
|
igor@402
|
503 tool's CVS importer, too. Due to CVS's lack of atomic changes and
|
igor@402
|
504 unversioned filesystem hierarchy, it is not possible to reconstruct
|
igor@402
|
505 CVS history completely accurately; some guesswork is involved, and
|
igor@402
|
506 renames will usually not show up. Because a lot of advanced CVS
|
igor@402
|
507 administration has to be done by hand and is hence error-prone, it's
|
igor@402
|
508 common for CVS importers to run into multiple problems with corrupted
|
igor@402
|
509 repositories (completely bogus revision timestamps and files that have
|
igor@402
|
510 remained locked for over a decade are just two of the less interesting
|
igor@402
|
511 problems I can recall from personal experience).
|
igor@402
|
512
|
igor@402
|
513
|
igor@402
|
514
|
igor@402
|
515
|
igor@402
|
516 \subsection{Commercial tools}
|
igor@402
|
517
|
igor@402
|
518 Perforce has a centralised client/server architecture, with no
|
igor@402
|
519 client-side caching of any data. Unlike modern revision control
|
igor@402
|
520 tools, Perforce requires that a user run a command to inform the
|
igor@402
|
521 server about every file they intend to edit.
|
igor@402
|
522
|
igor@402
|
523 The performance of Perforce is quite good for small teams, but it
|
igor@402
|
524 falls off rapidly as the number of users grows beyond a few dozen.
|
igor@402
|
525 Modestly large Perforce installations require the deployment of
|
igor@402
|
526 proxies to cope with the load their users generate.
|
igor@402
|
527
|
igor@402
|
528
|
igor@402
|
529 \subsection{Choosing a revision control tool}
|
igor@402
|
530
|
igor@402
|
531 With the exception of CVS, all of the tools listed above have unique
|
igor@402
|
532 strengths that suit them to particular styles of work. There is no
|
igor@402
|
533 single revision control tool that is best in all situations.
|
igor@402
|
534
|
igor@402
|
535 As an example, Subversion is a good choice for working with frequently
|
igor@402
|
536 edited binary files, due to its centralised nature and support for
|
igor@402
|
537 file locking.
|
igor@402
|
538
|
igor@402
|
539 I personally find Mercurial's properties of simplicity, performance,
|
igor@402
|
540 and good merge support to be a compelling combination that has served
|
igor@402
|
541 me well for several years.
|
igor@402
|
542
|
igor@402
|
543
|
igor@402
|
544 \section{Switching from another tool to Mercurial}
|
igor@402
|
545
|
igor@402
|
546 Mercurial is bundled with an extension named \hgext{convert}, which
|
igor@402
|
547 can incrementally import revision history from several other revision
|
igor@402
|
548 control tools. By ``incremental'', I mean that you can convert all of
|
igor@402
|
549 a project's history to date in one go, then rerun the conversion later
|
igor@402
|
550 to obtain new changes that happened after the initial conversion.
|
igor@402
|
551
|
igor@402
|
552 The revision control tools supported by \hgext{convert} are as
|
igor@402
|
553 follows:
|
igor@402
|
554 \begin{itemize}
|
igor@402
|
555 \item Subversion
|
igor@402
|
556 \item CVS
|
igor@402
|
557 \item Git
|
igor@402
|
558 \item Darcs
|
igor@402
|
559 \end{itemize}
|
igor@402
|
560
|
igor@402
|
561 In addition, \hgext{convert} can export changes from Mercurial to
|
igor@402
|
562 Subversion. This makes it possible to try Subversion and Mercurial in
|
igor@402
|
563 parallel before committing to a switchover, without risking the loss
|
igor@402
|
564 of any work.
|
igor@402
|
565
|
igor@402
|
566 The \hgxcmd{convert}{convert} command is easy to use. Simply point it
|
igor@402
|
567 at the path or URL of the source repository, optionally give it the
|
igor@402
|
568 name of the destination repository, and it will start working. After
|
igor@402
|
569 the initial conversion, just run the same command again to import new
|
igor@402
|
570 changes.
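
As a rough illustration of an incremental conversion, the session below
sketches how the \hgext{convert} extension might be used. It is a
minimal sketch, not a recipe: the source URL
\texttt{http://svn.example.com/project/trunk} and the destination name
\texttt{project-hg} are hypothetical placeholders, and it assumes the
extension has already been enabled by adding a line reading
\texttt{convert =} to the \texttt{[extensions]} section of your
\texttt{.hgrc} file.
\begin{verbatim}
# Initial conversion: import all history to date from the
# (hypothetical) source repository into a new Mercurial
# repository named project-hg.
hg convert http://svn.example.com/project/trunk project-hg

# Some time later: run the same command again to import only
# the changes made since the previous conversion.
hg convert http://svn.example.com/project/trunk project-hg
\end{verbatim}
Naming the destination explicitly on both runs is a cautious choice
here, since it makes it plain which repository the second run updates.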
|
igor@402
|
571
|
igor@402
|
572
|
igor@402
|
573 %%% Local Variables:
|
igor@402
|
574 %%% mode: latex
|
igor@402
|
575 %%% TeX-master: "00book"
|
igor@402
|
576 %%% End:
|