hgbook: en/collab.tex annotate

annotate en/collab.tex @ 184:7b812c428074

Document the ssh protocol, URL syntax, and configuration.

author	Bryan O'Sullivan <bos@serpentine.com>
date	Thu Apr 05 23:28:06 2007 -0700 (2007-04-05)
parents	5fc4a45c069f
children	b60e2de6dbc3

rev	line source
bos@159	1 \chapter{Collaborating with other people}
bos@159	2 \label{cha:collab}
bos@159	3
bos@159	4 As a completely decentralised tool, Mercurial doesn't impose any
bos@159	5 policy on how people ought to work with each other. However, if
bos@159	6 you're new to distributed revision control, it helps to have some
bos@159	7 tools and examples in mind when you're thinking about possible
bos@159	8 workflow models.
bos@159	9
bos@159	10 \section{Collaboration models}
bos@159	11
bos@159	12 With a suitably flexible tool, making decisions about workflow is much
bos@159	13 more of a social engineering challenge than a technical one.
bos@159	14 Mercurial imposes few limitations on how you can structure the flow of
bos@159	15 work in a project, so it's up to you and your group to set up and live
bos@159	16 with a model that matches your own particular needs.
bos@159	17
bos@159	18 \subsection{Factors to keep in mind}
bos@159	19
bos@159	20 The most important aspect of any model that you must keep in mind is
bos@159	21 how well it matches the needs and capabilities of the people who will
bos@159	22 be using it. This might seem self-evident; even so, you still can't
bos@159	23 afford to forget it for a moment.
bos@159	24
bos@159	25 I once put together a workflow model that seemed to make perfect sense
bos@159	26 to me, but that caused a considerable amount of consternation and
bos@159	27 strife within my development team. In spite of my attempts to explain
bos@159	28 why we needed a complex set of branches, and how changes ought to flow
bos@159	29 between them, a few team members revolted. Even though they were
bos@159	30 smart people, they didn't want to pay attention to the constraints we
bos@159	31 were operating under, or face the consequences of those constraints in
bos@159	32 the details of the model that I was advocating.
bos@159	33
bos@159	34 Don't sweep foreseeable social or technical problems under the rug.
bos@159	35 Whatever scheme you put into effect, you should plan for mistakes and
bos@159	36 problem scenarios. Consider adding automated machinery to prevent, or
bos@159	37 quickly recover from, trouble that you can anticipate. As an example,
bos@159	38 if you intend to have a branch with not-for-release changes in it,
bos@159	39 you'd do well to think early about the possibility that someone might
bos@159	40 accidentally merge those changes into a release branch. You could
bos@159	41 avoid this particular problem by writing a hook that prevents changes
bos@159	42 from being merged from an inappropriate branch.
bos@159	43
bos@159	44 \subsection{Informal anarchy}
bos@159	45
bos@159	46 I wouldn't suggest an ``anything goes'' approach as something
bos@159	47 sustainable, but it's a model that's easy to grasp, and it works
bos@159	48 perfectly well in a few unusual situations.
bos@159	49
bos@159	50 As one example, many projects have a loose-knit group of collaborators
bos@159	51 who rarely physically meet each other. Some groups like to overcome
bos@159	52 the isolation of working at a distance by organising occasional
bos@159	53 ``sprints''. In a sprint, a number of people get together in a single
bos@159	54 location (a company's conference room, a hotel meeting room, that kind
bos@159	55 of place) and spend several days more or less locked in there, hacking
bos@159	56 intensely on a handful of projects.
bos@159	57
bos@159	58 A sprint is the perfect place to use the \hgcmd{serve} command, since
bos@159	59 \hgcmd{serve} does not require any fancy server infrastructure. You
bos@159	60 can get started with \hgcmd{serve} in moments, by reading
bos@159	61 section~\ref{sec:collab:serve} below. Then simply tell the person
bos@159	62 next to you that you're running a server, send the URL to them in an
bos@159	63 instant message, and you immediately have a quick-turnaround way to
bos@159	64 work together. They can type your URL into their web browser and
bos@159	65 quickly review your changes; or they can pull a bugfix from you and
bos@159	66 verify it; or they can clone a branch containing a new feature and try
bos@159	67 it out.
bos@159	68
bos@159	69 The charm, and the problem, with doing things in an ad hoc fashion
bos@159	70 like this is that only people who know about your changes, and where
bos@159	71 they are, can see them. Such an informal approach simply doesn't
bos@159	72 scale beyond a handful of people, because each individual needs to know
bos@159	73 about $n$ different repositories to pull from.
bos@159	74
bos@159	75 \subsection{A single central repository}
bos@159	76
bos@179	77 For smaller projects migrating from a centralised revision control
bos@159	78 tool, perhaps the easiest way to get started is to have changes flow
bos@159	79 through a single shared central repository. This is also the
bos@159	80 most common ``building block'' for more ambitious workflow schemes.
bos@159	81
bos@159	82 Contributors start by cloning a copy of this repository. They can
bos@159	83 pull changes from it whenever they need to, and some (perhaps all)
bos@159	84 developers have permission to push a change back when they're ready
bos@159	85 for other people to see it.
bos@159	86
bos@179	87 Under this model, it can still often make sense for people to pull
bos@159	88 changes directly from each other, without going through the central
bos@159	89 repository. Consider a case in which I have a tentative bug fix, but
bos@159	90 I am worried that if I were to publish it to the central repository,
bos@159	91 it might subsequently break everyone else's trees as they pull it. To
bos@159	92 reduce the potential for damage, I can ask you to clone my repository
bos@159	93 into a temporary repository of your own and test it. This lets us put
bos@159	94 off publishing the potentially unsafe change until it has had a little
bos@159	95 testing.
bos@159	96
bos@159	97 In this kind of scenario, people usually use the \command{ssh}
bos@159	98 protocol to securely push changes to the central repository, as
bos@159	99 documented in section~\ref{sec:collab:ssh}. It's also usual to
bos@159	100 publish a read-only copy of the repository over HTTP using CGI, as in
bos@159	101 section~\ref{sec:collab:cgi}. Publishing over HTTP satisfies the
bos@159	102 needs of people who don't have push access, and those who want to use
bos@159	103 web browsers to browse the repository's history.
bos@159	104
bos@179	105 \subsection{Working with multiple branches}
bos@179	106
bos@179	107 Projects of any significant size naturally tend to make progress on
bos@179	108 several fronts simultaneously. In the case of software, it's common
bos@179	109 for a project to go through periodic official releases. A release
bos@179	110 might then go into ``maintenance mode'' for a while after its first
bos@179	111 publication; maintenance releases tend to contain only bug fixes, not
bos@179	112 new features. In parallel with these maintenance releases, one or
bos@179	113 more future releases may be under development. People normally use
bos@179	114 the word ``branch'' to refer to one of these many slightly different
bos@179	115 directions in which development is proceeding.
bos@179	116
bos@179	117 Mercurial is particularly well suited to managing a number of
bos@179	118 simultaneous, but not identical, branches. Each ``development
bos@179	119 direction'' can live in its own central repository, and you can merge
bos@179	120 changes from one to another as the need arises. Because repositories
bos@179	121 are independent of each other, unstable changes in a development
bos@179	122 branch will never affect a stable branch unless someone explicitly
bos@179	123 merges those changes in.
bos@179	124
bos@179	125 Here's an example of how this can work in practice. Let's say you
bos@179	126 have one ``main branch'' on a central server.
bos@179	127 \interaction{branching.init}
bos@179	128 People clone it, make changes locally, test them, and push them back.
bos@179	129
bos@179	130 Once the main branch reaches a release milestone, you can use the
bos@179	131 \hgcmd{tag} command to give a permanent name to the milestone
bos@179	132 revision.
bos@179	133 \interaction{branching.tag}
bos@179	134 Let's say some ongoing development occurs on the main branch.
bos@179	135 \interaction{branching.main}
bos@179	136 Using the tag that was recorded at the milestone, people who clone
bos@179	137 that repository at any time in the future can use \hgcmd{update} to
bos@179	138 get a copy of the working directory exactly as it was when that tagged
bos@179	139 revision was committed.
bos@179	140 \interaction{branching.update}
bos@179	141
bos@179	142 In addition, immediately after the main branch is tagged, someone can
bos@179	143 then clone the main branch on the server to a new ``stable'' branch,
bos@179	144 also on the server.
bos@179	145 \interaction{branching.clone}
bos@179	146
bos@179	147 Someone who needs to make a change to the stable branch can then clone
bos@179	148 \emph{that} repository, make their changes, commit, and push their
bos@179	149 changes back there.
bos@179	150 \interaction{branching.stable}
bos@179	151 Because Mercurial repositories are independent, and Mercurial doesn't
bos@179	152 move changes around automatically, the stable and main branches are
bos@179	153 \emph{isolated} from each other. The changes that you made on the
bos@179	154 main branch don't ``leak'' to the stable branch, and vice versa.
bos@179	155
bos@179	156 You'll often want all of your bugfixes on the stable branch to show up
bos@179	157 on the main branch, too. Rather than rewrite a bugfix on the main
bos@179	158 branch, you can simply pull and merge changes from the stable to the
bos@179	159 main branch, and Mercurial will bring those bugfixes in for you.
bos@179	160 \interaction{branching.merge}
bos@179	161 The main branch will still contain changes that are not on the stable
bos@179	162 branch, but it will also contain all of the bugfixes from the stable
bos@179	163 branch. The stable branch remains unaffected by these changes.
bos@179	164
bos@179	165 \subsection{Feature branches}
bos@179	166
bos@179	167 For larger projects, an effective way to manage change is to break up
bos@179	168 a team into smaller groups. Each group has a shared branch of its
bos@179	169 own, cloned from a single ``master'' branch used by the entire
bos@179	170 project. People working on an individual branch are typically quite
bos@179	171 isolated from developments on other branches.
bos@179	172
bos@179	173 \begin{figure}[ht]
bos@179	174 \centering
bos@179	175 \grafix{feature-branches}
bos@179	176 \caption{Feature branches}
bos@179	177 \label{fig:collab:feature-branches}
bos@179	178 \end{figure}
bos@179	179
bos@179	180 When a particular feature is deemed to be in suitable shape, someone
bos@179	181 on that feature team pulls and merges from the master branch into the
bos@179	182 feature branch, then pushes back up to the master branch.
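
As a rough sketch of that sequence, assume the feature team's
repository is cloned into a local directory called \dirname{feature},
and the master branch lives at the made-up URL
\texttt{ssh://server/hg/master}. The person doing the merge might
then run something like the following.
\begin{codesample2}
  cd feature
  hg pull ssh://server/hg/master
  hg merge
  hg commit -m 'Merge with master'
  hg push ssh://server/hg/master
\end{codesample2}
The \hgcmd{merge} step only applies if the pull actually brought in
changes that the feature branch didn't already have.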
bos@179	183
bos@179	184 \subsection{The release train}
bos@179	185
bos@179	186 Some projects are organised on a ``train'' basis: a release is
bos@179	187 scheduled to happen every few months, and whatever features are ready
bos@179	188 when the ``train'' is ready to leave are allowed in.
bos@179	189
bos@179	190 This model resembles working with feature branches. The difference is
bos@179	191 that when a feature branch misses a train, someone on the feature team
bos@184	192 pulls and merges the changes that went out on that train release into
bos@184	193 the feature branch, and the team continues its work on top of that
bos@184	194 release so that their feature can make the next release.
bos@179	195
bos@159	196 \subsection{The Linux kernel model}
bos@159	197
bos@159	198 The development of the Linux kernel has a shallow hierarchical
bos@159	199 structure, surrounded by a cloud of apparent chaos. Because most
bos@159	200 Linux developers use \command{git}, a distributed revision control
bos@159	201 tool with capabilities similar to Mercurial, it's useful to describe
bos@159	202 the way work flows in that environment; if you like the ideas, the
bos@159	203 approach translates well across tools.
bos@159	204
bos@159	205 At the center of the community sits Linus Torvalds, the creator of
bos@159	206 Linux. He publishes a single source repository that is considered the
bos@159	207 ``authoritative'' current tree by the entire developer community.
bos@159	208 Anyone can clone Linus's tree, but he is very choosy about whose trees
bos@159	209 he pulls from.
bos@159	210
bos@159	211 Linus has a number of ``trusted lieutenants''. As a general rule, he
bos@159	212 pulls whatever changes they publish, in most cases without even
bos@159	213 reviewing those changes. Some of those lieutenants are generally
bos@159	214 agreed to be ``maintainers'', responsible for specific subsystems
bos@159	215 within the kernel. If a random kernel hacker wants to make a change
bos@159	216 to a subsystem that they want to end up in Linus's tree, they must
bos@159	217 find out who the subsystem's maintainer is, and ask that maintainer to
bos@159	218 take their change. If the maintainer reviews their changes and agrees
bos@159	219 to take them, they'll pass them along to Linus in due course.
bos@159	220
bos@159	221 Individual lieutenants have their own approaches to reviewing,
bos@159	222 accepting, and publishing changes; and for deciding when to feed them
bos@159	223 to Linus. In addition, there are several well known branches that
bos@159	224 people use for different purposes. For example, a few people maintain
bos@159	225 ``stable'' repositories of older versions of the kernel, to which they
bos@184	226 apply critical fixes as needed. Some maintainers publish multiple
bos@184	227 trees: one for experimental changes; one for changes that they are
bos@184	228 about to feed upstream; and so on. Others just publish a single
bos@184	229 tree.
bos@159	230
bos@159	231 This model has two notable features. The first is that it's ``pull
bos@159	232 only''. You have to ask, convince, or beg another developer to take a
bos@184	233 change from you, because there are almost no trees to which more than
bos@184	234 one person can push, and there's no way to push changes into a tree
bos@184	235 that someone else controls.
bos@159	236
bos@159	237 The second is that it's based on reputation and acclaim. If you're an
bos@159	238 unknown, Linus will probably ignore changes from you without even
bos@159	239 responding. But a subsystem maintainer will probably review them, and
bos@159	240 will likely take them if they pass their criteria for suitability.
bos@159	241 The more ``good'' changes you contribute to a maintainer, the more
bos@159	242 likely they are to trust your judgment and accept your changes. If
bos@159	243 you're well-known and maintain a long-lived branch for something Linus
bos@159	244 hasn't yet accepted, people with similar interests may pull your
bos@159	245 changes regularly to keep up with your work.
bos@159	246
bos@159	247 Reputation and acclaim don't necessarily cross subsystem or ``people''
bos@159	248 boundaries. If you're a respected but specialised storage hacker, and
bos@159	249 you try to fix a networking bug, that change will receive a level of
bos@159	250 scrutiny from a network maintainer comparable to a change from a
bos@159	251 complete stranger.
bos@159	252
bos@159	253 To people who come from more orderly project backgrounds, the
bos@159	254 comparatively chaotic Linux kernel development process often seems
bos@159	255 completely insane. It's subject to the whims of individuals; people
bos@159	256 make sweeping changes whenever they deem it appropriate; and the pace
bos@159	257 of development is astounding. And yet Linux is a highly successful,
bos@159	258 well-regarded piece of software.
bos@159	259
bos@159	260 \section{The technical side of sharing}
bos@159	261
bos@159	262 \subsection{Informal sharing with \hgcmd{serve}}
bos@159	263 \label{sec:collab:serve}
bos@159	264
bos@159	265 Mercurial's \hgcmd{serve} command is wonderfully suited to small,
bos@159	266 tight-knit, and fast-paced group environments. It also provides a
bos@159	267 great way to get a feel for using Mercurial commands over a network.
bos@159	268
bos@159	269 Run \hgcmd{serve} inside a repository, and in under a second it will
bos@159	270 bring up a specialised HTTP server; this will accept connections from
bos@159	271 any client, and serve up data for that repository until you terminate
bos@159	272 it. Anyone who knows the URL of the server you just started, and can
bos@159	273 talk to your computer over the network, can then use a web browser or
bos@159	274 Mercurial to read data from that repository. A URL for a
bos@159	275 \hgcmd{serve} instance running on a laptop is likely to look something
bos@159	276 like \Verb|http://my-laptop.local:8000/|.
bos@159	277
bos@159	278 The \hgcmd{serve} command is \emph{not} a general-purpose web server.
bos@159	279 It can do only two things:
bos@159	280 \begin{itemize}
bos@159	281 \item Allow people to browse the history of the repository it's
bos@159	282 serving, from their normal web browsers.
bos@159	283 \item Speak Mercurial's wire protocol, so that people can
bos@159	284 \hgcmd{clone} or \hgcmd{pull} changes from that repository.
bos@159	285 \end{itemize}
bos@159	286 In particular, \hgcmd{serve} won't allow remote users to \emph{modify}
bos@159	287 your repository. It's intended for read-only use.
bos@159	288
bos@159	289 If you're getting started with Mercurial, there's nothing to prevent
bos@159	290 you from using \hgcmd{serve} to serve up a repository on your own
bos@159	291 computer, then use commands like \hgcmd{clone}, \hgcmd{incoming}, and
bos@159	292 so on to talk to that server as if the repository were hosted remotely.
bos@159	293 This can help you to quickly get acquainted with using commands on
bos@159	294 network-hosted repositories.
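
Such a practice session might look roughly like this. The repository
names \dirname{myrepo} and \dirname{myrepo-copy} are invented for
illustration, and the server is assumed to be listening on its default
port. In one terminal, start the server.
\begin{codesample2}
  cd myrepo
  hg serve
\end{codesample2}
Then, in a second terminal, talk to it as though it were a remote
machine.
\begin{codesample2}
  hg clone http://localhost:8000/ myrepo-copy
  cd myrepo-copy
  hg incoming
\end{codesample2}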
bos@159	295
bos@159	296 \subsubsection{A few things to keep in mind}
bos@159	297
bos@159	298 Because it provides unauthenticated read access to all clients, you
bos@159	299 should only use \hgcmd{serve} in an environment where you either don't
bos@159	300 care, or have complete control over, who can access your network and
bos@159	301 pull data from your repository.
bos@159	302
bos@159	303 The \hgcmd{serve} command knows nothing about any firewall software
bos@159	304 you might have installed on your system or network. It cannot detect
bos@159	305 or control your firewall software. If other people are unable to talk
bos@159	306 to a running \hgcmd{serve} instance, the second thing you should do
bos@159	307 (\emph{after} you make sure that they're using the correct URL) is
bos@159	308 check your firewall configuration.
bos@159	309
bos@159	310 By default, \hgcmd{serve} listens for incoming connections on
bos@159	311 port~8000. If another process is already listening on the port you
bos@159	312 want to use, you can specify a different port to listen on using the
bos@159	313 \hgopt{serve}{-p} option.
bos@159	314
bos@159	315 Normally, when \hgcmd{serve} starts, it prints no output, which can be
bos@159	316 a bit unnerving. If you'd like to confirm that it is indeed running
bos@159	317 correctly, and find out what URL you should send to your
bos@159	318 collaborators, start it with the \hggopt{-v} option.
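
Combining the two options above, an invocation might look like this.
The port number~8080 is an arbitrary choice for the sake of the
example.
\begin{codesample2}
  hg serve -v -p 8080
\end{codesample2}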
bos@159	319
bos@184	320 \subsection{Using the Secure Shell (ssh) protocol}
bos@159	321 \label{sec:collab:ssh}
bos@159	322
bos@184	323 You can pull and push changes securely over a network connection using
bos@184	324 the Secure Shell (\texttt{ssh}) protocol. To use this successfully,
bos@184	325 you may have to do a little bit of configuration on the client or
bos@184	326 server side.
bos@184	327
bos@184	328 If you're not familiar with ssh, it's a network protocol that lets you
bos@184	329 securely communicate with another computer. To use it with Mercurial,
bos@184	330 you'll be setting up one or more user accounts on a server so that
bos@184	331 remote users can log in and execute commands.
bos@184	332
bos@184	333 (If you \emph{are} familiar with ssh, you'll probably find some of the
bos@184	334 material that follows to be elementary in nature.)
bos@184	335
bos@184	336 \subsubsection{How to read and write ssh URLs}
bos@184	337
bos@184	338 An ssh URL tends to look like this:
bos@184	339 \begin{codesample2}
bos@184	340 ssh://bos@hg.serpentine.com:22/hg/hgbook
bos@184	341 \end{codesample2}
bos@184	342 \begin{enumerate}
bos@184	343 \item The ``\texttt{ssh://}'' part tells Mercurial to use the ssh
bos@184	344 protocol.
bos@184	345 \item The ``\texttt{bos@}'' component indicates what username to log
bos@184	346 into the server as. You can leave this out if the remote username
bos@184	347 is the same as your local username.
bos@184	348 \item The ``\texttt{hg.serpentine.com}'' gives the hostname of the
bos@184	349 server to log into.
bos@184	350 \item The ``\texttt{:22}'' identifies the port number to connect to the server
bos@184	351 on. The default port is~22, so you only need to specify this part
bos@184	352 if you're \emph{not} using port~22.
bos@184	353 \item The remainder of the URL is the local path to the repository on
bos@184	354 the server.
bos@184	355 \end{enumerate}
bos@184	356
bos@184	357 There's plenty of scope for confusion with the path component of ssh
bos@184	358 URLs, as there is no standard way for tools to interpret it. Some
bos@184	359 programs behave differently than others when dealing with these paths.
bos@184	360 This isn't an ideal situation, but it's unlikely to change. Please
bos@184	361 read the following paragraphs carefully.
bos@184	362
bos@184	363 Mercurial treats the path to a repository on the server as relative to
bos@184	364 the remote user's home directory. For example, if user \texttt{foo}
bos@184	365 on the server has a home directory of \dirname{/home/foo}, then an ssh
bos@184	366 URL that contains a path component of \dirname{bar}
bos@184	367 \emph{really} refers to the directory \dirname{/home/foo/bar}.
bos@184	368
bos@184	369 If you want to specify a path relative to another user's home
bos@184	370 directory, you can use a path that starts with a tilde character
bos@184	371 followed by the user's name (let's call them \texttt{otheruser}), like
bos@184	372 this.
bos@184	373 \begin{codesample2}
bos@184	374 ssh://server/~otheruser/hg/repo
bos@184	375 \end{codesample2}
bos@184	376
bos@184	377 And if you really want to specify an \emph{absolute} path on the
bos@184	378 server, begin the path component with two slashes, as in this example.
bos@184	379 \begin{codesample2}
bos@184	380 ssh://server//absolute/path
bos@184	381 \end{codesample2}
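
To put the three forms side by side, here is a sketch of how they
might appear in \hgcmd{clone} commands. The names \texttt{server},
\texttt{foo}, \texttt{otheruser} and the paths are the placeholders
used above, not real locations.
\begin{codesample2}
  hg clone ssh://foo@server/bar
  hg clone ssh://server/~otheruser/hg/repo
  hg clone ssh://server//absolute/path
\end{codesample2}
The first command refers to \dirname{/home/foo/bar} on the server, the
second to a path under \texttt{otheruser}'s home directory, and the
third to \dirname{/absolute/path}.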
bos@184	382
bos@184	383 \subsubsection{Finding an ssh client for your system}
bos@184	384
bos@184	385 Almost every Unix-like system comes with OpenSSH preinstalled. If
bos@184	386 you're using such a system, run \Verb|which ssh| to find out if
bos@184	387 the \command{ssh} command is installed (it's usually in
bos@184	388 \dirname{/usr/bin}). In the unlikely event that it isn't present,
bos@184	389 take a look at your system documentation to figure out how to install
bos@184	390 it.
bos@184	391
bos@184	392 On Windows, you'll first need to download a suitable ssh
bos@184	393 client. There are two alternatives.
bos@184	394 \begin{itemize}
bos@184	395 \item Simon Tatham's excellent PuTTY package~\cite{web:putty} provides
bos@184	396 a complete suite of ssh client commands.
bos@184	397 \item If you have a high tolerance for pain, you can use the Cygwin
bos@184	398 port of OpenSSH.
bos@184	399 \end{itemize}
bos@184	400 In either case, you'll need to edit your \hgini\ file to tell
bos@184	401 Mercurial where to find the actual client command. For example, if
bos@184	402 you're using PuTTY, you'll need to use the \command{plink} command as
bos@184	403 a command-line ssh client.
bos@184	404 \begin{codesample2}
bos@184	405 [ui]
bos@184	406 ssh = C:/path/to/plink.exe -ssh -i "C:/path/to/my/private/key"
bos@184	407 \end{codesample2}
bos@184	408
bos@184	409 \begin{note}
bos@184	410 The path to \command{plink} shouldn't contain any whitespace
bos@184	411 characters, or Mercurial may not be able to run it correctly (so
bos@184	412 putting it in \dirname{C:\\Program Files} is probably not a good
bos@184	413 idea).
bos@184	414 \end{note}
bos@184	415
bos@184	416 \subsubsection{Generating a key pair}
bos@184	417
bos@184	418 To avoid the need to repetitively type a password every time you need
bos@184	419 to use your ssh client, I recommend generating a key pair. On a
bos@184	420 Unix-like system, the \command{ssh-keygen} command will do the trick.
bos@184	421 On Windows, if you're using PuTTY, the \command{puttygen} command is
bos@184	422 what you'll need.
bos@184	423
bos@184	424 When you generate a key pair, it's usually \emph{highly} advisable to
bos@184	425 protect it with a passphrase. (The only time that you might not want
bos@184	426 to do this is when you're using the ssh protocol for automated tasks
bos@184	427 on a secure network.)
bos@184	428
bos@184	429 Simply generating a key pair isn't enough, however. You'll need to
bos@184	430 add the public key to the set of authorised keys for whatever user
bos@184	431 you're logging in remotely as. For servers using OpenSSH (the vast
bos@184	432 majority), this will mean adding the public key to a list in a file
bos@184	433 called \sfilename{authorized\_keys} in their \sdirname{.ssh}
bos@184	434 directory.
bos@184	435
bos@184	436 On a Unix-like system, your public key will have a \filename{.pub}
bos@184	437 extension. If you're using \command{puttygen} on Windows, you can
bos@184	438 save the public key to a file of your choosing, or paste it from the
bos@184	439 window it's displayed in straight into the
bos@184	440 \sfilename{authorized\_keys} file.
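
On a Unix-like client with a reasonably recent OpenSSH, the whole
procedure might look something like the following sketch. The key
type, the resulting file name \filename{id\_ed25519.pub}, and the
account \texttt{foo@server} are assumptions made for illustration;
older installations may need a different key type, which changes the
file name accordingly.
\begin{codesample2}
  ssh-keygen -t ed25519
  ssh foo@server 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' \
      < ~/.ssh/id_ed25519.pub
\end{codesample2}
The second command appends the public key to the remote user's
\sfilename{authorized\_keys} file; you will be asked for the remote
user's password this one time.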
bos@184	441
bos@184	442 \subsubsection{Using an authentication agent}
bos@184	443
bos@184	444 An authentication agent is a daemon that stores passphrases in memory
bos@184	445 (so it will forget passphrases if you log out and log back in again).
bos@184	446 An ssh client will notice if it's running, and query it for a
bos@184	447 passphrase. If there's no authentication agent running, or the agent
bos@184	448 doesn't store the necessary passphrase, you'll have to type your
bos@184	449 passphrase every time Mercurial tries to communicate with a server on
bos@184	450 your behalf (e.g.~whenever you pull or push changes).
bos@184	451
bos@184	452 The downside of storing passphrases in an agent is that it's possible
bos@184	453 for a well-prepared attacker to recover the plain text of your
bos@184	454 passphrases, in some cases even if your system has been power-cycled.
bos@184	455 You should make your own judgment as to whether this is an acceptable
bos@184	456 risk. It certainly saves a lot of repeated typing.
bos@184	457
bos@184	458 On Unix-like systems, the agent is called \command{ssh-agent}, and
bos@184	459 it's often run automatically for you when you log in. You'll need to
bos@184	460 use the \command{ssh-add} command to add passphrases to the agent's
bos@184	461 store. On Windows, if you're using PuTTY, the \command{pageant}
bos@184	462 command acts as the agent. It adds an icon to your system tray that
bos@184	463 will let you manage stored passphrases.
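
With \command{ssh-agent} already running, a typical session on a
Unix-like system might begin like this. Run with no arguments,
\command{ssh-add} prompts for the passphrase of your default key; the
\texttt{-l} option then lists what the agent currently holds, so you
can check that the right key was added.
\begin{codesample2}
  ssh-add
  ssh-add -l
\end{codesample2}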
bos@184	464
bos@184	465 \subsubsection{Configuring the server side properly}
bos@184	466
bos@184	467 Because ssh can be fiddly to set up if you're new to it, there's a
bos@184	468 variety of things that can go wrong. Add Mercurial on top, and
bos@184	469 there's plenty more scope for head-scratching. Most of these
bos@184	470 potential problems occur on the server side, not the client side. The
bos@184	471 good news is that once you've gotten a configuration working, it will
bos@184	472 usually continue to work indefinitely.
bos@184	473
bos@184	474 Before you try using Mercurial to talk to an ssh server, it's best to
bos@184	475 make sure that you can use the normal \command{ssh} or \command{putty}
bos@184	476 command to talk to the server first. If you run into problems with
bos@184	477 using these commands directly, Mercurial surely won't work. Worse, it
bos@184	478 will obscure the underlying problem. Any time you want to debug
bos@184	479 ssh-related Mercurial problems, you should drop back to making sure
bos@184	480 that plain ssh client commands work first, \emph{before} you worry
bos@184	481 about whether there's a problem with Mercurial.
bos@184	482
bos@184	483 The first thing to be sure of on the server side is that you can
bos@184	484 actually log in from another machine at all. If you can't use
bos@184	485 \command{ssh} or \command{putty} to log in, the error message you get
bos@184	486 may give you a few hints as to what's wrong. The most common problems
bos@184	487 are as follows.
bos@184	488 \begin{itemize}
bos@184	489 \item If you get a ``connection refused'' error, either there isn't an
bos@184	490 ssh daemon running on the server at all, or it's inaccessible due to
bos@184	491 firewall configuration.
bos@184	492 \item If you get a ``no route to host'' error, you either have an
bos@184	493 incorrect address for the server or a seriously locked down firewall
bos@184	494 that won't admit its existence at all.
bos@184	495 \item If you get a ``permission denied'' error, you may have mistyped
bos@184	496 the username on the server, or you could have mistyped your key's
bos@184	497 passphrase or the remote user's password.
bos@184	498 \end{itemize}
bos@184	499 In summary, if you're having trouble talking to the server's ssh
bos@184	500 daemon, first make sure that one is running at all. On many systems
bos@184	501 it will be installed, but disabled, by default. Once you're done with
bos@184	502 this step, you should then check that the server's firewall is
bos@184	503 configured to allow incoming connections on the port the ssh daemon is
bos@184	504 listening on (usually~22). Don't worry about more exotic
bos@184	505 possibilities for misconfiguration until you've checked these two
bos@184	506 first.
bos@184	507
bos@184	508 If you're using an authentication agent on the client side to store
bos@184	509 passphrases for your keys, you ought to be able to log into the server
bos@184	510 without being prompted for a passphrase or a password. If you're
bos@184	511 prompted for a passphrase, there are a few possible culprits.
bos@184	512 \begin{itemize}
bos@184	513 \item You might have forgotten to use \command{ssh-add} or
bos@184	514 \command{pageant} to store the passphrase.
bos@184	515 \item You might have stored the passphrase for the wrong key.
bos@184	516 \end{itemize}
bos@184	517 If you're being prompted for the remote user's password, there are
bos@184	518 another few possible problems to check.
bos@184	519 \begin{itemize}
bos@184	520 \item Either the user's home directory or their \sdirname{.ssh}
bos@184	521 directory might have excessively liberal permissions. As a result,
bos@184	522 the ssh daemon will not trust or read their
bos@184	523 \sfilename{authorized\_keys} file. For example, a group-writable
bos@184	524 home or \sdirname{.ssh} directory will often cause this symptom.
bos@184	525 \item The user's \sfilename{authorized\_keys} file may have a problem.
bos@184	526 If anyone other than the user owns or can write to that file, the
bos@184	527 ssh daemon will not trust or read it.
bos@184	528 \end{itemize}
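
One plausible way to tighten things up is to run commands like these
on the server, logged in as the affected user. The exact modes are a
conservative choice rather than the only ones the ssh daemon will
accept.
\begin{codesample2}
  chmod go-w ~
  chmod 700 ~/.ssh
  chmod 600 ~/.ssh/authorized_keys
\end{codesample2}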
bos@184	529
bos@184	530 In an ideal world, you should be able to run the following command
bos@184	531 successfully, and it should print exactly one line of output, the
bos@184	532 current date and time.
bos@184	533 \begin{codesample2}
bos@184	534 ssh myserver date
bos@184	535 \end{codesample2}
bos@184	536
bos@184	537 If on your server you have login scripts that print banners or other
bos@184	538 junk even when running non-interactive commands like this, you should
bos@184	539 fix them before you continue, so that they only print output if
bos@184	540 they're run interactively. Otherwise these banners will at least
bos@184	541 clutter up Mercurial's output. Worse, they could potentially cause
bos@184	542 problems with running Mercurial commands remotely. (The usual way to
bos@184	543 see if a login script is running in an interactive shell is to check
bos@184	544 the return code from the command \Verb|tty -s|.)
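
For instance, a login script written for a Bourne-style shell could
guard its banner as in the sketch below, so that the message only
appears when a terminal is attached. The banner text is invented for
illustration.
\begin{codesample2}
  if tty -s; then
      echo "Welcome to myserver"
  fi
\end{codesample2}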
bos@184	545
bos@184	546 Once you've verified that plain old ssh is working with your server,
bos@184	547 the next step is to ensure that Mercurial runs on the server. The
bos@184	548 following command should run successfully:
bos@184	549 \begin{codesample2}
bos@184	550 ssh myserver hg version
bos@184	551 \end{codesample2}
bos@184	552 If you see an error message instead of normal \hgcmd{version} output,
bos@184	553 this is usually because you haven't installed Mercurial to
bos@184	554 \dirname{/usr/bin}. Don't worry if this is the case; you don't need
bos@184	555 to do that. But you should check for a few possible problems.
bos@184	556 \begin{itemize}
bos@184	557 \item Is Mercurial really installed on the server at all? I know this
bos@184	558 sounds trivial, but it's worth checking!
bos@184	559 \item Maybe your shell's search path (usually set via the \envar{PATH}
bos@184	560 environment variable) is simply misconfigured.
bos@184	561 \item Perhaps your \envar{PATH} environment variable is only being set
bos@184	562 to point to the location of the \command{hg} executable if the login
bos@184	563 session is interactive. This can happen if you're setting the path
bos@184	564 in the wrong shell login script. See your shell's documentation for
bos@184	565 details.
bos@184	566 \item The \envar{PYTHONPATH} environment variable may need to contain
bos@184	567 the path to the Mercurial Python modules. It might not be set at
bos@184	568 all; it could be incorrect; or it may be set only if the login is
bos@184	569 interactive.
bos@184	570 \end{itemize}
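
A quick way to investigate the search path problems above is to ask
the server what a non-interactive session actually sees. This is only
a sketch, with \texttt{myserver} standing in for your own server as
before. The single quotes stop your local shell from expanding
\envar{PATH} before the command is sent.
\begin{codesample2}
  ssh myserver 'echo $PATH'
  ssh myserver 'which hg'
\end{codesample2}
If the directory containing \command{hg} is missing from the first
command's output, the path is probably being set in a script that only
runs for interactive logins.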
bos@184	571
bos@184	572 If you can run \hgcmd{version} over an ssh connection, well done!
bos@184	573 You've got the server and client sorted out. You should now be able
bos@184	574 to use Mercurial to access repositories hosted by that username on
bos@184	575 that server. If you run into problems with Mercurial and ssh at this
bos@184	576 point, try using the \hggopt{--debug} option to get a clearer picture
bos@184	577 of what's going on.
bos@184	578
bos@184	579 \subsubsection{Using compression with ssh}
bos@184	580
bos@184	581 Mercurial does not compress data when it uses the ssh protocol,
bos@184	582 because the ssh protocol can transparently compress data. However,
bos@184	583 the default behaviour of ssh clients is \emph{not} to request
bos@184	584 compression.
bos@184	585
bos@184	586 Over any network other than a fast LAN (even a wireless network),
bos@184	587 using compression is likely to significantly speed up Mercurial's
bos@184	588 network operations. For example, over a WAN, someone measured
bos@184	589 compression as reducing the amount of time required to clone a
bos@184	590 particularly large repository from~51 minutes to~17 minutes.
bos@184	591
bos@184	592 Both \command{ssh} and \command{plink} accept a \cmdopt{ssh}{-C}
bos@184	593 option which turns on compression. You can easily edit your \hgrc\ to
bos@184	594 enable compression for all of Mercurial's uses of the ssh protocol.
bos@184	595 \begin{codesample2}
bos@184	596 [ui]
bos@184	597 ssh = ssh -C
bos@184	598 \end{codesample2}
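
If you'd rather not change the behaviour of every ssh connection that
Mercurial makes, OpenSSH also lets you turn on compression for
individual hosts. As a sketch, an entry like the following in the
\sfilename{config} file in your \sdirname{.ssh} directory should have
the same effect for one server only. The hostname is borrowed from the
earlier URL example.
\begin{codesample2}
  Host hg.serpentine.com
      Compression yes
\end{codesample2}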
bos@184	599
bos@184	600 \subsection{Serving over HTTP with a CGI script}
bos@159	601 \label{sec:collab:cgi}
bos@159	602
bos@159	603
bos@159	604
bos@159	605 %%% Local Variables:
bos@159	606 %%% mode: latex
bos@159	607 %%% TeX-master: "00book"
bos@159	608 %%% End: