hgbook

diff en/intro.tex @ 219:15a6fd2ba582

Start talking about the advantages of distributed tools.
author Bryan O'Sullivan <bos@serpentine.com>
date Mon May 14 11:20:34 2007 -0700 (2007-05-14)
parents 75fd236d736b
children 0ca9045035f7
     1.1 --- a/en/intro.tex	Thu May 10 17:21:09 2007 -0700
     1.2 +++ b/en/intro.tex	Mon May 14 11:20:34 2007 -0700
     1.3 @@ -3,37 +3,35 @@
     1.4  
     1.5  \section{About revision control}
     1.6  
     1.7 -Revision control is the management of multiple versions of a piece of
     1.8 -information.  In its simplest form, it's a process that many people
     1.9 -perform by hand: every time you modify a file, save it under a new
    1.10 -name that contains a number, each one higher than the number of the
    1.11 -preceding version.
    1.12 +Revision control is the process of managing multiple versions of a
    1.13 +piece of information.  In its simplest form, this is something that
    1.14 +many people do by hand: every time you modify a file, save it under a
    1.15 +new name that contains a number, each one higher than the number of
    1.16 +the preceding version.
    1.17  
    1.18  Manually managing multiple versions of even a single file is an
    1.19  error-prone task, though, so software tools to help automate this
    1.20  process have long been available.  The earliest automated revision
    1.21  control tools were intended to help a single user to manage revisions
    1.22 -to a single file.  Over the past several decades, the scope of
    1.23 -revision control tools has expanded greatly; they now manage multiple
    1.24 -files, and help multiple people to work together.  The best modern
    1.25 -revision control tools will have no problem coping with thousands of
    1.26 -people working together on a single project, which might consist of
    1.27 -hundreds of thousands of files.
    1.28 +of a single file.  Over the past few decades, the scope of revision
    1.29 +control tools has expanded greatly; they now manage multiple files,
    1.30 +and help multiple people to work together.  The best modern revision
    1.31 +control tools have no problem coping with thousands of people working
    1.32 +together on projects that consist of hundreds of thousands of files.
    1.33  
    1.34  \subsection{Why use revision control?}
    1.35  
    1.36  There are a number of reasons why you or your team might want to use
    1.37  an automated revision control tool for a project.
    1.38  \begin{itemize}
    1.39 -\item The software gives you a unified way of working with your
    1.40 -  project's files.
    1.41 -\item When you're working with other people, it makes it easier for
    1.42 -  you to collaborate.  For example, when people more or less
    1.43 -  simultaneously make potentially incompatible changes, the software
    1.44 -  will help you to identify and resolve those conflicts.
    1.45 -\item It will track the history of your project.  For every change,
    1.46 -  you'll have a log of \emph{who} made it; \emph{why} they made it;
    1.47 -  \emph{when} they made it; and \emph{what} the change was.
    1.48 +\item It will track the history and evolution of your project, so you
    1.49 +  don't have to.  For every change, you'll have a log of \emph{who}
    1.50 +  made it; \emph{why} they made it; \emph{when} they made it; and
    1.51 +  \emph{what} the change was.
    1.52 +\item When you're working with other people, revision control software
    1.53 +  makes it easier for you to collaborate.  For example, when people
    1.54 +  more or less simultaneously make potentially incompatible changes,
    1.55 +  the software will help you to identify and resolve those conflicts.
    1.56  \item It can help you to recover from mistakes.  If you make a change
    1.57    that later turns out to be in error, you can revert to an earlier
    1.58    version of one or more files.  In fact, a \emph{really} good
    1.59 @@ -52,11 +50,11 @@
    1.60  \emph{benefits} compare to its \emph{costs}.  A revision control tool
    1.61  that's difficult to understand or use is going to impose a high cost.
    1.62  
    1.63 -For example, a five-hundred-person project is likely to collapse under
    1.64 -its own weight almost immediately without a revision control tool and
    1.65 -process.  In this case, the cost of using revision control might
    1.66 -hardly seem worth considering, since \emph{without} it, failure is
    1.67 -almost guaranteed.
    1.68 +A five-hundred-person project is likely to collapse under its own
    1.69 +weight almost immediately without a revision control tool and process.
    1.70 +In this case, the cost of using revision control might hardly seem
    1.71 +worth considering, since \emph{without} it, failure is almost
    1.72 +guaranteed.
    1.73  
    1.74  On the other hand, a one-person ``quick hack'' might seem like a poor
    1.75  place to use a revision control tool, because surely the cost of using
    1.76 @@ -71,24 +69,27 @@
    1.77  Mercurial's high performance and peer-to-peer nature let you scale
    1.78  painlessly to handle large projects.
    1.79  
    1.80 +No revision control tool can rescue a poorly run project, but a good
    1.81 +choice of tools can make a huge difference to the fluidity with which
    1.82 +you can work on a project.
    1.83 +
    1.84  \subsection{The many names of revision control}
    1.85  
    1.86  Revision control is a diverse field, so much so that it doesn't
    1.87  actually have a single name or acronym.  Here are a few of the more
    1.88  common names and acronyms you'll encounter:
    1.89  \begin{itemize}
    1.90 -\item Configuration management (CM)
    1.91  \item Revision control (RCS)
    1.92 -\item Software configuration management (SCM)
    1.93 +\item Software configuration management (SCM), or configuration management
    1.94  \item Source code management
    1.95 -\item Source control
    1.96 +\item Source code control, or source control
    1.97  \item Version control (VCS)
    1.98  \end{itemize}
    1.99  Some people claim that these terms actually have different meanings,
   1.100  but in practice they overlap so much that there's no agreed or even
   1.101  useful way to tease them apart.
   1.102  
   1.103 -\section{A short history and hierarchy of revision control}
   1.104 +\section{A short history of revision control}
   1.105  
   1.106  The best known of the old-time revision control tools is SCCS (Source
   1.107  Code Control System), which Marc Rochkind wrote at Bell Labs, in the
   1.108 @@ -159,14 +160,84 @@
   1.109  influenced by Monotone, Mercurial focuses on ease of use, high
   1.110  performance, and scalability to very large projects.
   1.111  
   1.112 -\subsection{On a single system}
   1.113 -
   1.114 -\subsection{Network-based, but centralised}
   1.115 -
   1.116 -\subsection{Fully distributed}
   1.117 -
   1.118 -
   1.119 -\section{Advantages of distributed revision control}
   1.120 +\section{Trends in revision control}
   1.121 +
   1.122 +There has been an unmistakable trend in the development and use of
   1.123 +revision control tools over the past four decades, as people have
   1.124 +become familiar with the capabilities of their tools and constrained
   1.125 +by their limitations.
   1.126 +
   1.127 +The first generation began by managing single files on individual
   1.128 +computers.  Although these tools represented a huge advance over
   1.129 +ad-hoc manual revision control, their locking model and reliance on a
   1.130 +single computer limited them to small, tightly-knit teams.
   1.131 +
   1.132 +The second generation loosened these constraints by moving to
   1.133 +network-centered architectures, and managing entire projects at a
   1.134 +time.  As projects grew larger, they ran into new problems.  With
   1.135 +clients needing to talk to servers very frequently, server scaling
   1.136 +became an issue for large projects.  An unreliable network connection
   1.137 +could prevent remote users from being able to talk to the server at
   1.138 +all.  As open source projects started making read-only access
   1.139 +available anonymously to anyone, people without commit privileges
   1.140 +found that they could not use the tools to interact with a project in
   1.141 +a natural way, as they could not record their changes.
   1.142 +
   1.143 +The current generation of revision control tools is peer-to-peer in
   1.144 +nature.  All of these systems have dropped the dependency on a single
   1.145 +central server, and allow people to distribute their revision control
   1.146 +data to where it's actually needed.  Collaboration over the Internet
   1.147 +has moved from being constrained by technology to being a matter of
   1.148 +choice and consensus.  Modern tools can operate offline indefinitely and
   1.149 +autonomously, with a network connection only needed when syncing
   1.150 +changes with another repository.
   1.151 +
   1.152 +\section{A few of the advantages of distributed revision control}
   1.153 +
   1.154 +Even though distributed revision control tools have for several years
   1.155 +been as robust and usable as their previous-generation counterparts,
   1.156 +people using older tools have not necessarily woken up yet to the
   1.157 +advantages of the newer ones.  There are a number of ways in which
   1.158 +distributed tools shine relative to centralised ones.
   1.159 +
   1.160 +For an individual developer, distributed tools are almost always much
   1.161 +faster than centralised tools.  This is for a simple reason: a
   1.162 +centralised tool needs to talk over the network for many common
   1.163 +operations, because most metadata is stored in a single copy on the
   1.164 +central server.  A distributed tool stores all of its metadata
   1.165 +locally.  All else being equal, talking over the network adds overhead
   1.166 +to a centralised tool.  Don't underestimate the value of a snappy,
   1.167 +responsive tool: you're going to spend a lot of time interacting with
   1.168 +your revision control software.
   1.169 +
   1.170 +Distributed tools are indifferent to the vagaries of your server
   1.171 +infrastructure, again because they replicate metadata to so many
   1.172 +locations.  If you use a centralised system and your server catches
   1.173 +fire, you'd better hope that your backup media are reliable, and that
   1.174 +your last backup was recent and actually worked.  With a distributed
   1.175 +tool, you have many backups available, one on each contributor's computer.
   1.176 +
   1.177 +The reliability of your network will affect distributed tools far less
   1.178 +than it will centralised tools.  You can't even use a centralised tool
   1.179 +without a network connection, except for a few highly constrained
   1.180 +commands.  With a distributed tool, if your network connection goes
   1.181 +down while you're working, you may not even notice.  The only thing
   1.182 +you won't be able to do is talk to repositories on other computers,
   1.183 +something that is relatively rare compared with local operations.  If
   1.184 +you have a far-flung team of collaborators, this may be significant.
   1.185 +
   1.186 +If you take a shine to an open source project and decide that you
   1.187 +would like to start hacking on it, and that project uses a distributed
   1.188 +revision control tool, you are at once a peer with the people who
   1.189 +consider themselves the ``core'' of that project.  If they publish
   1.190 +their repositories, you can immediately copy their project history,
   1.191 +start making changes, and record your work, using the same tools in
   1.192 +the same ways as insiders.  By contrast, with a centralised tool, you
   1.193 +must use the software in a ``read only'' mode unless someone grants
   1.194 +you permission to commit changes to their central server.  Until then,
   1.195 +you won't be able to record changes, and your local modifications will
   1.196 +be at risk of corruption any time you try to update your client's view
   1.197 +of the repository.
   1.198  
   1.199  \subsection{For open source projects}
   1.200  
   1.201 @@ -174,6 +245,8 @@
   1.202  
   1.203  \subsection{Myths about distributed revision control}
   1.204  
   1.205 +\subsubsection{Distributed tools encourage projects to fork}
   1.206 +
   1.207  \section{Why choose Mercurial?}
   1.208  
   1.209