igor@402
|
1 \chapter{Collaborating with other people}
|
igor@402
|
2 \label{cha:collab}
|
igor@402
|
3
|
igor@402
|
4 As a completely decentralised tool, Mercurial doesn't impose any
|
igor@402
|
5 policy on how people ought to work with each other. However, if
|
igor@402
|
6 you're new to distributed revision control, it helps to have some
|
igor@402
|
7 tools and examples in mind when you're thinking about possible
|
igor@402
|
8 workflow models.
|
igor@402
|
9
|
igor@402
|
10 \section{Mercurial's web interface}
|
igor@402
|
11
|
igor@402
|
12 Mercurial has a powerful web interface that provides several
|
igor@402
|
13 useful capabilities.
|
igor@402
|
14
|
igor@402
|
15 For interactive use, the web interface lets you browse a single
|
igor@402
|
16 repository or a collection of repositories. You can view the history
|
igor@402
|
17 of a repository, examine each change (comments and diffs), and view
|
igor@402
|
18 the contents of each directory and file.
|
igor@402
|
19
|
igor@402
|
20 Also for human consumption, the web interface provides an RSS feed of
|
igor@402
|
21 the changes in a repository. This lets you ``subscribe'' to a
|
igor@402
|
22 repository using your favourite feed reader, and be automatically
|
igor@402
|
23 notified of activity in that repository as soon as it happens. I find
|
igor@402
|
24 this capability much more convenient than the model of subscribing to
|
igor@402
|
25 a mailing list to which notifications are sent, as it requires no
|
igor@402
|
26 additional configuration on the part of whoever is serving the
|
igor@402
|
27 repository.
|
igor@402
|
28
|
igor@402
|
29 The web interface also lets remote users clone a repository, pull
|
igor@402
|
30 changes from it, and (when the server is configured to permit it) push
|
igor@402
|
31 changes back to it. Mercurial's HTTP tunneling protocol aggressively
|
igor@402
|
32 compresses data, so that it works efficiently even over low-bandwidth
|
igor@402
|
33 network connections.
|
igor@402
|
34
|
igor@402
|
35 The easiest way to get started with the web interface is to use your
|
igor@402
|
36 web browser to visit an existing repository, such as the master
|
igor@402
|
37 Mercurial repository at
|
igor@402
|
38 \url{http://www.selenic.com/repo/hg?style=gitweb}.
|
igor@402
|
39
|
igor@402
|
40 If you're interested in providing a web interface to your own
|
igor@402
|
41 repositories, Mercurial provides two ways to do this. The first is
|
igor@402
|
42 using the \hgcmd{serve} command, which is best suited to short-term
|
igor@402
|
43 ``lightweight'' serving. See section~\ref{sec:collab:serve} below for
|
igor@402
|
44 details of how to use this command. If you have a long-lived
|
igor@402
|
45 repository that you'd like to make permanently available, Mercurial
|
igor@402
|
46 has built-in support for the CGI (Common Gateway Interface) standard,
|
igor@402
|
47 which all common web servers support. See
|
igor@402
|
48 section~\ref{sec:collab:cgi} for details of CGI configuration.
|
igor@402
|
49
|
igor@402
|
50 \section{Collaboration models}
|
igor@402
|
51
|
igor@402
|
52 With a suitably flexible tool, making decisions about workflow is much
|
igor@402
|
53 more of a social engineering challenge than a technical one.
|
igor@402
|
54 Mercurial imposes few limitations on how you can structure the flow of
|
igor@402
|
55 work in a project, so it's up to you and your group to set up and live
|
igor@402
|
56 with a model that matches your own particular needs.
|
igor@402
|
57
|
igor@402
|
58 \subsection{Factors to keep in mind}
|
igor@402
|
59
|
igor@402
|
60 The most important aspect of any model that you must keep in mind is
|
igor@402
|
61 how well it matches the needs and capabilities of the people who will
|
igor@402
|
62 be using it. This might seem self-evident; even so, you still can't
|
igor@402
|
63 afford to forget it for a moment.
|
igor@402
|
64
|
igor@402
|
65 I once put together a workflow model that seemed to make perfect sense
|
igor@402
|
66 to me, but that caused a considerable amount of consternation and
|
igor@402
|
67 strife within my development team. In spite of my attempts to explain
|
igor@402
|
68 why we needed a complex set of branches, and how changes ought to flow
|
igor@402
|
69 between them, a few team members revolted. Even though they were
|
igor@402
|
70 smart people, they didn't want to pay attention to the constraints we
|
igor@402
|
71 were operating under, or face the consequences of those constraints in
|
igor@402
|
72 the details of the model that I was advocating.
|
igor@402
|
73
|
igor@402
|
74 Don't sweep foreseeable social or technical problems under the rug.
|
igor@402
|
75 Whatever scheme you put into effect, you should plan for mistakes and
|
igor@402
|
76 problem scenarios. Consider adding automated machinery to prevent, or
|
igor@402
|
77 quickly recover from, trouble that you can anticipate. As an example,
|
igor@402
|
78 if you intend to have a branch with not-for-release changes in it,
|
igor@402
|
79 you'd do well to think early about the possibility that someone might
|
igor@402
|
80 accidentally merge those changes into a release branch. You could
|
igor@402
|
81 avoid this particular problem by writing a hook that prevents changes
|
igor@402
|
82 from being merged from an inappropriate branch.
|
igor@402
|
83
|
igor@402
|
84 \subsection{Informal anarchy}
|
igor@402
|
85
|
igor@402
|
86 I wouldn't suggest an ``anything goes'' approach as something
|
igor@402
|
87 sustainable, but it's a model that's easy to grasp, and it works
|
igor@402
|
88 perfectly well in a few unusual situations.
|
igor@402
|
89
|
igor@402
|
90 As one example, many projects have a loose-knit group of collaborators
|
igor@402
|
91 who rarely physically meet each other. Some groups like to overcome
|
igor@402
|
92 the isolation of working at a distance by organising occasional
|
igor@402
|
93 ``sprints''. In a sprint, a number of people get together in a single
|
igor@402
|
94 location (a company's conference room, a hotel meeting room, that kind
|
igor@402
|
95 of place) and spend several days more or less locked in there, hacking
|
igor@402
|
96 intensely on a handful of projects.
|
igor@402
|
97
|
igor@402
|
98 A sprint is the perfect place to use the \hgcmd{serve} command, since
|
igor@402
|
99 \hgcmd{serve} does not require any fancy server infrastructure. You
|
igor@402
|
100 can get started with \hgcmd{serve} in moments, by reading
|
igor@402
|
101 section~\ref{sec:collab:serve} below. Then simply tell the person
|
igor@402
|
102 next to you that you're running a server, send the URL to them in an
|
igor@402
|
103 instant message, and you immediately have a quick-turnaround way to
|
igor@402
|
104 work together. They can type your URL into their web browser and
|
igor@402
|
105 quickly review your changes; or they can pull a bugfix from you and
|
igor@402
|
106 verify it; or they can clone a branch containing a new feature and try
|
igor@402
|
107 it out.
|
igor@402
|
108
|
igor@402
|
109 The charm, and the problem, with doing things in an ad hoc fashion
|
igor@402
|
110 like this is that only people who know about your changes, and where
|
igor@402
|
111 they are, can see them. Such an informal approach simply doesn't
|
igor@402
|
112 scale beyond a handful of people, because each individual needs to know
|
igor@402
|
113 about $n$ different repositories to pull from.
|
igor@402
|
114
|
igor@402
|
115 \subsection{A single central repository}
|
igor@402
|
116
|
igor@402
|
117 For smaller projects migrating from a centralised revision control
|
igor@402
|
118 tool, perhaps the easiest way to get started is to have changes flow
|
igor@402
|
119 through a single shared central repository. This is also the
|
igor@402
|
120 most common ``building block'' for more ambitious workflow schemes.
|
igor@402
|
121
|
igor@402
|
122 Contributors start by cloning a copy of this repository. They can
|
igor@402
|
123 pull changes from it whenever they need to, and some (perhaps all)
|
igor@402
|
124 developers have permission to push a change back when they're ready
|
igor@402
|
125 for other people to see it.
|
igor@402
|
126
|
igor@402
|
127 Under this model, it can still often make sense for people to pull
|
igor@402
|
128 changes directly from each other, without going through the central
|
igor@402
|
129 repository. Consider a case in which I have a tentative bug fix, but
|
igor@402
|
130 I am worried that if I were to publish it to the central repository,
|
igor@402
|
131 it might subsequently break everyone else's trees as they pull it. To
|
igor@402
|
132 reduce the potential for damage, I can ask you to clone my repository
|
igor@402
|
133 into a temporary repository of your own and test it. This lets us put
|
igor@402
|
134 off publishing the potentially unsafe change until it has had a little
|
igor@402
|
135 testing.
|
igor@402
|
136
|
igor@402
|
137 In this kind of scenario, people usually use the \command{ssh}
|
igor@402
|
138 protocol to securely push changes to the central repository, as
|
igor@402
|
139 documented in section~\ref{sec:collab:ssh}. It's also usual to
|
igor@402
|
140 publish a read-only copy of the repository over HTTP using CGI, as in
|
igor@402
|
141 section~\ref{sec:collab:cgi}. Publishing over HTTP satisfies the
|
igor@402
|
142 needs of people who don't have push access, and those who want to use
|
igor@402
|
143 web browsers to browse the repository's history.
|
igor@402
|
144
|
igor@402
|
145 \subsection{Working with multiple branches}
|
igor@402
|
146
|
igor@402
|
147 Projects of any significant size naturally tend to make progress on
|
igor@402
|
148 several fronts simultaneously. In the case of software, it's common
|
igor@402
|
149 for a project to go through periodic official releases. A release
|
igor@402
|
150 might then go into ``maintenance mode'' for a while after its first
|
igor@402
|
151 publication; maintenance releases tend to contain only bug fixes, not
|
igor@402
|
152 new features. In parallel with these maintenance releases, one or
|
igor@402
|
153 more future releases may be under development. People normally use
|
igor@402
|
154 the word ``branch'' to refer to one of these many slightly different
|
igor@402
|
155 directions in which development is proceeding.
|
igor@402
|
156
|
igor@402
|
157 Mercurial is particularly well suited to managing a number of
|
igor@402
|
158 simultaneous, but not identical, branches. Each ``development
|
igor@402
|
159 direction'' can live in its own central repository, and you can merge
|
igor@402
|
160 changes from one to another as the need arises. Because repositories
|
igor@402
|
161 are independent of each other, unstable changes in a development
|
igor@402
|
162 branch will never affect a stable branch unless someone explicitly
|
igor@402
|
163 merges those changes in.
|
igor@402
|
164
|
igor@402
|
165 Here's an example of how this can work in practice. Let's say you
|
igor@402
|
166 have one ``main branch'' on a central server.
|
igor@402
|
167 \interaction{branching.init}
|
igor@402
|
168 People clone it, make changes locally, test them, and push them back.
|
igor@402
|
169
|
igor@402
|
170 Once the main branch reaches a release milestone, you can use the
|
igor@402
|
171 \hgcmd{tag} command to give a permanent name to the milestone
|
igor@402
|
172 revision.
|
igor@402
|
173 \interaction{branching.tag}
|
igor@402
|
174 Let's say some ongoing development occurs on the main branch.
|
igor@402
|
175 \interaction{branching.main}
|
igor@402
|
176 Using the tag that was recorded at the milestone, people who clone
|
igor@402
|
177 that repository at any time in the future can use \hgcmd{update} to
|
igor@402
|
178 get a copy of the working directory exactly as it was when that tagged
|
igor@402
|
179 revision was committed.
|
igor@402
|
180 \interaction{branching.update}
|
igor@402
|
181
|
igor@402
|
182 In addition, immediately after the main branch is tagged, someone can
|
igor@402
|
183 then clone the main branch on the server to a new ``stable'' branch,
|
igor@402
|
184 also on the server.
|
igor@402
|
185 \interaction{branching.clone}
|
igor@402
|
186
|
igor@402
|
187 Someone who needs to make a change to the stable branch can then clone
|
igor@402
|
188 \emph{that} repository, make their changes, commit, and push their
|
igor@402
|
189 changes back there.
|
igor@402
|
190 \interaction{branching.stable}
|
igor@402
|
191 Because Mercurial repositories are independent, and Mercurial doesn't
|
igor@402
|
192 move changes around automatically, the stable and main branches are
|
igor@402
|
193 \emph{isolated} from each other. The changes that you made on the
|
igor@402
|
194 main branch don't ``leak'' to the stable branch, and vice versa.
|
igor@402
|
195
|
igor@402
|
196 You'll often want all of your bugfixes on the stable branch to show up
|
igor@402
|
197 on the main branch, too. Rather than rewrite a bugfix on the main
|
igor@402
|
198 branch, you can simply pull and merge changes from the stable to the
|
igor@402
|
199 main branch, and Mercurial will bring those bugfixes in for you.
|
igor@402
|
200 \interaction{branching.merge}
|
igor@402
|
201 The main branch will still contain changes that are not on the stable
|
igor@402
|
202 branch, but it will also contain all of the bugfixes from the stable
|
igor@402
|
203 branch. The stable branch remains unaffected by these changes.
|
igor@402
|
204
|
igor@402
|
205 \subsection{Feature branches}
|
igor@402
|
206
|
igor@402
|
207 For larger projects, an effective way to manage change is to break up
|
igor@402
|
208 a team into smaller groups. Each group has a shared branch of its
|
igor@402
|
209 own, cloned from a single ``master'' branch used by the entire
|
igor@402
|
210 project. People working on an individual branch are typically quite
|
igor@402
|
211 isolated from developments on other branches.
|
igor@402
|
212
|
igor@402
|
213 \begin{figure}[ht]
|
igor@402
|
214 \centering
|
igor@402
|
215 \grafix{feature-branches}
|
igor@402
|
216 \caption{Feature branches}
|
igor@402
|
217 \label{fig:collab:feature-branches}
|
igor@402
|
218 \end{figure}
|
igor@402
|
219
|
igor@402
|
220 When a particular feature is deemed to be in suitable shape, someone
|
igor@402
|
221 on that feature team pulls and merges from the master branch into the
|
igor@402
|
222 feature branch, then pushes back up to the master branch.
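As a rough sketch of that last step, here is what the person doing the
merge might type. The local clone \dirname{feature}, the host name
\texttt{hg.example.com}, and the commit message are all made up for
illustration; substitute the locations your own project uses.
\begin{codesample2}
  $ cd feature
  $ hg pull ssh://hg.example.com/master
  $ hg merge
  $ hg commit -m 'Merge with master'
  $ hg push ssh://hg.example.com/master
\end{codesample2}
If someone else pushes to the master branch between the pull and the
push, Mercurial will by default refuse the push rather than create a
new head there, and the pull and merge simply need to be repeated.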
|
igor@402
|
223
|
igor@402
|
224 \subsection{The release train}
|
igor@402
|
225
|
igor@402
|
226 Some projects are organised on a ``train'' basis: a release is
|
igor@402
|
227 scheduled to happen every few months, and whatever features are ready
|
igor@402
|
228 when the ``train'' is ready to leave are allowed in.
|
igor@402
|
229
|
igor@402
|
230 This model resembles working with feature branches. The difference is
|
igor@402
|
231 that when a feature branch misses a train, someone on the feature team
|
igor@402
|
232 pulls and merges the changes that went out on that train release into
|
igor@402
|
233 the feature branch, and the team continues its work on top of that
|
igor@402
|
234 release so that their feature can make the next release.
|
igor@402
|
235
|
igor@402
|
236 \subsection{The Linux kernel model}
|
igor@402
|
237
|
igor@402
|
238 The development of the Linux kernel has a shallow hierarchical
|
igor@402
|
239 structure, surrounded by a cloud of apparent chaos. Because most
|
igor@402
|
240 Linux developers use \command{git}, a distributed revision control
|
igor@402
|
241 tool with capabilities similar to Mercurial, it's useful to describe
|
igor@402
|
242 the way work flows in that environment; if you like the ideas, the
|
igor@402
|
243 approach translates well across tools.
|
igor@402
|
244
|
igor@402
|
245 At the center of the community sits Linus Torvalds, the creator of
|
igor@402
|
246 Linux. He publishes a single source repository that is considered the
|
igor@402
|
247 ``authoritative'' current tree by the entire developer community.
|
igor@402
|
248 Anyone can clone Linus's tree, but he is very choosy about whose trees
|
igor@402
|
249 he pulls from.
|
igor@402
|
250
|
igor@402
|
251 Linus has a number of ``trusted lieutenants''. As a general rule, he
|
igor@402
|
252 pulls whatever changes they publish, in most cases without even
|
igor@402
|
253 reviewing those changes. Some of those lieutenants are generally
|
igor@402
|
254 agreed to be ``maintainers'', responsible for specific subsystems
|
igor@402
|
255 within the kernel. If a random kernel hacker wants to make a change
|
igor@402
|
256 to a subsystem that they want to end up in Linus's tree, they must
|
igor@402
|
257 find out who the subsystem's maintainer is, and ask that maintainer to
|
igor@402
|
258 take their change. If the maintainer reviews their changes and agrees
|
igor@402
|
259 to take them, they'll pass them along to Linus in due course.
|
igor@402
|
260
|
igor@402
|
261 Individual lieutenants have their own approaches to reviewing,
|
igor@402
|
262 accepting, and publishing changes; and for deciding when to feed them
|
igor@402
|
263 to Linus. In addition, there are several well known branches that
|
igor@402
|
264 people use for different purposes. For example, a few people maintain
|
igor@402
|
265 ``stable'' repositories of older versions of the kernel, to which they
|
igor@402
|
266 apply critical fixes as needed. Some maintainers publish multiple
|
igor@402
|
267 trees: one for experimental changes; one for changes that they are
|
igor@402
|
268 about to feed upstream; and so on. Others just publish a single
|
igor@402
|
269 tree.
|
igor@402
|
270
|
igor@402
|
271 This model has two notable features. The first is that it's ``pull
|
igor@402
|
272 only''. You have to ask, convince, or beg another developer to take a
|
igor@402
|
273 change from you, because there are almost no trees to which more than
|
igor@402
|
274 one person can push, and there's no way to push changes into a tree
|
igor@402
|
275 that someone else controls.
|
igor@402
|
276
|
igor@402
|
277 The second is that it's based on reputation and acclaim. If you're an
|
igor@402
|
278 unknown, Linus will probably ignore changes from you without even
|
igor@402
|
279 responding. But a subsystem maintainer will probably review them, and
|
igor@402
|
280 will likely take them if they pass their criteria for suitability.
|
igor@402
|
281 The more ``good'' changes you contribute to a maintainer, the more
|
igor@402
|
282 likely they are to trust your judgment and accept your changes. If
|
igor@402
|
283 you're well-known and maintain a long-lived branch for something Linus
|
igor@402
|
284 hasn't yet accepted, people with similar interests may pull your
|
igor@402
|
285 changes regularly to keep up with your work.
|
igor@402
|
286
|
igor@402
|
287 Reputation and acclaim don't necessarily cross subsystem or ``people''
|
igor@402
|
288 boundaries. If you're a respected but specialised storage hacker, and
|
igor@402
|
289 you try to fix a networking bug, that change will receive a level of
|
igor@402
|
290 scrutiny from a network maintainer comparable to a change from a
|
igor@402
|
291 complete stranger.
|
igor@402
|
292
|
igor@402
|
293 To people who come from more orderly project backgrounds, the
|
igor@402
|
294 comparatively chaotic Linux kernel development process often seems
|
igor@402
|
295 completely insane. It's subject to the whims of individuals; people
|
igor@402
|
296 make sweeping changes whenever they deem it appropriate; and the pace
|
igor@402
|
297 of development is astounding. And yet Linux is a highly successful,
|
igor@402
|
298 well-regarded piece of software.
|
igor@402
|
299
|
igor@402
|
300 \subsection{Pull-only versus shared-push collaboration}
|
igor@402
|
301
|
igor@402
|
302 A perpetual source of heat in the open source community is whether a
|
igor@402
|
303 development model in which people only ever pull changes from others
|
igor@402
|
304 is ``better than'' one in which multiple people can push changes to a
|
igor@402
|
305 shared repository.
|
igor@402
|
306
|
igor@402
|
307 Typically, the backers of the shared-push model use tools that
|
igor@402
|
308 actively enforce this approach. If you're using a centralised
|
igor@402
|
309 revision control tool such as Subversion, there's no way to make a
|
igor@402
|
310 choice over which model you'll use: the tool gives you shared-push,
|
igor@402
|
311 and if you want to do anything else, you'll have to roll your own
|
igor@402
|
312 approach on top (such as applying a patch by hand).
|
igor@402
|
313
|
igor@402
|
314 A good distributed revision control tool, such as Mercurial, will
|
igor@402
|
315 support both models. You and your collaborators can then structure
|
igor@402
|
316 how you work together based on your own needs and preferences, not on
|
igor@402
|
317 what contortions your tools force you into.
|
igor@402
|
318
|
igor@402
|
319 \subsection{Where collaboration meets branch management}
|
igor@402
|
320
|
igor@402
|
321 Once you and your team set up some shared repositories and start
|
igor@402
|
322 propagating changes back and forth between local and shared repos, you
|
igor@402
|
323 begin to face a related, but slightly different challenge: that of
|
igor@402
|
324 managing the multiple directions in which your team may be moving at
|
igor@402
|
325 once. Even though this subject is intimately related to how your team
|
igor@402
|
326 collaborates, it's dense enough to merit treatment of its own, in
|
igor@402
|
327 chapter~\ref{chap:branch}.
|
igor@402
|
328
|
igor@402
|
329 \section{The technical side of sharing}
|
igor@402
|
330
|
igor@402
|
331 The remainder of this chapter is devoted to the question of serving
|
igor@402
|
332 data to your collaborators.
|
igor@402
|
333
|
igor@402
|
334 \section{Informal sharing with \hgcmd{serve}}
|
igor@402
|
335 \label{sec:collab:serve}
|
igor@402
|
336
|
igor@402
|
337 Mercurial's \hgcmd{serve} command is wonderfully suited to small,
|
igor@402
|
338 tight-knit, and fast-paced group environments. It also provides a
|
igor@402
|
339 great way to get a feel for using Mercurial commands over a network.
|
igor@402
|
340
|
igor@402
|
341 Run \hgcmd{serve} inside a repository, and in under a second it will
|
igor@402
|
342 bring up a specialised HTTP server; this will accept connections from
|
igor@402
|
343 any client, and serve up data for that repository until you terminate
|
igor@402
|
344 it. Anyone who knows the URL of the server you just started, and can
|
igor@402
|
345 talk to your computer over the network, can then use a web browser or
|
igor@402
|
346 Mercurial to read data from that repository. A URL for a
|
igor@402
|
347 \hgcmd{serve} instance running on a laptop is likely to look something
|
igor@402
|
348 like \Verb|http://my-laptop.local:8000/|.
|
igor@402
|
349
|
igor@402
|
350 The \hgcmd{serve} command is \emph{not} a general-purpose web server.
|
igor@402
|
351 It can do only two things:
|
igor@402
|
352 \begin{itemize}
|
igor@402
|
353 \item Allow people to browse the history of the repository it's
|
igor@402
|
354 serving, from their normal web browsers.
|
igor@402
|
355 \item Speak Mercurial's wire protocol, so that people can
|
igor@402
|
356 \hgcmd{clone} or \hgcmd{pull} changes from that repository.
|
igor@402
|
357 \end{itemize}
|
igor@402
|
358 In particular, \hgcmd{serve} won't allow remote users to \emph{modify}
|
igor@402
|
359 your repository. It's intended for read-only use.
|
igor@402
|
360
|
igor@402
|
361 If you're getting started with Mercurial, there's nothing to prevent
|
igor@402
|
362 you from using \hgcmd{serve} to serve up a repository on your own
|
igor@402
|
363 computer, then using commands like \hgcmd{clone}, \hgcmd{incoming}, and
|
igor@402
|
364 so on to talk to that server as if the repository were hosted remotely.
|
igor@402
|
365 This can help you to quickly get acquainted with using commands on
|
igor@402
|
366 network-hosted repositories.
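A session along the following lines is one way to try this out. It
assumes a repository in a directory called \dirname{myrepo} (a
hypothetical name), and that \hgcmd{serve} is left running in one
terminal while you work in another.
\begin{codesample2}
  $ cd myrepo
  $ hg serve
\end{codesample2}
Then, in a second terminal:
\begin{codesample2}
  $ hg clone http://localhost:8000/ myrepo-copy
  $ cd myrepo-copy
  $ hg incoming
\end{codesample2}
Because the clone remembers where it came from, \hgcmd{incoming} needs
no URL here; it asks the local server whether there is anything new to
pull.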
|
igor@402
|
367
|
igor@402
|
368 \subsection{A few things to keep in mind}
|
igor@402
|
369
|
igor@402
|
370 Because it provides unauthenticated read access to all clients, you
|
igor@402
|
371 should only use \hgcmd{serve} in an environment where you either don't
|
igor@402
|
372 care, or have complete control over, who can access your network and
|
igor@402
|
373 pull data from your repository.
|
igor@402
|
374
|
igor@402
|
375 The \hgcmd{serve} command knows nothing about any firewall software
|
igor@402
|
376 you might have installed on your system or network. It cannot detect
|
igor@402
|
377 or control your firewall software. If other people are unable to talk
|
igor@402
|
378 to a running \hgcmd{serve} instance, the second thing you should do
|
igor@402
|
379 (\emph{after} you make sure that they're using the correct URL) is
|
igor@402
|
380 check your firewall configuration.
|
igor@402
|
381
|
igor@402
|
382 By default, \hgcmd{serve} listens for incoming connections on
|
igor@402
|
383 port~8000. If another process is already listening on the port you
|
igor@402
|
384 want to use, you can specify a different port to listen on using the
|
igor@402
|
385 \hgopt{serve}{-p} option.
|
igor@402
|
386
|
igor@402
|
387 Normally, when \hgcmd{serve} starts, it prints no output, which can be
|
igor@402
|
388 a bit unnerving. If you'd like to confirm that it is indeed running
|
igor@402
|
389 correctly, and find out what URL you should send to your
|
igor@402
|
390 collaborators, start it with the \hggopt{-v} option.
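For instance, to listen on port~8080 (an arbitrary choice) and have
\hgcmd{serve} report what it is doing, you could start it like this.
\begin{codesample2}
  $ hg serve -p 8080 -v
\end{codesample2}
Your collaborators would then use a URL ending in \texttt{:8080/}
instead of \texttt{:8000/}.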
|
igor@402
|
391
|
igor@402
|
392 \section{Using the Secure Shell (ssh) protocol}
|
igor@402
|
393 \label{sec:collab:ssh}
|
igor@402
|
394
|
igor@402
|
395 You can pull and push changes securely over a network connection using
|
igor@402
|
396 the Secure Shell (\texttt{ssh}) protocol. To use this successfully,
|
igor@402
|
397 you may have to do a little bit of configuration on the client or
|
igor@402
|
398 server sides.
|
igor@402
|
399
|
igor@402
|
400 If you're not familiar with ssh, it's a network protocol that lets you
|
igor@402
|
401 securely communicate with another computer. To use it with Mercurial,
|
igor@402
|
402 you'll be setting up one or more user accounts on a server so that
|
igor@402
|
403 remote users can log in and execute commands.
|
igor@402
|
404
|
igor@402
|
405 (If you \emph{are} familiar with ssh, you'll probably find some of the
|
igor@402
|
406 material that follows to be elementary in nature.)
|
igor@402
|
407
|
igor@402
|
408 \subsection{How to read and write ssh URLs}
|
igor@402
|
409
|
igor@402
|
410 An ssh URL tends to look like this:
|
igor@402
|
411 \begin{codesample2}
|
igor@402
|
412 ssh://bos@hg.serpentine.com:22/hg/hgbook
|
igor@402
|
413 \end{codesample2}
|
igor@402
|
414 \begin{enumerate}
|
igor@402
|
415 \item The ``\texttt{ssh://}'' part tells Mercurial to use the ssh
|
igor@402
|
416 protocol.
|
igor@402
|
417 \item The ``\texttt{bos@}'' component indicates what username to log
|
igor@402
|
418 into the server as. You can leave this out if the remote username
|
igor@402
|
419 is the same as your local username.
|
igor@402
|
420 \item The ``\texttt{hg.serpentine.com}'' gives the hostname of the
|
igor@402
|
421 server to log into.
|
igor@402
|
422 \item The ``\texttt{:22}'' identifies the port number to connect to the server
|
igor@402
|
423 on. The default port is~22, so you only need to specify this part
|
igor@402
|
424 if you're \emph{not} using port~22.
|
igor@402
|
425 \item The remainder of the URL is the local path to the repository on
|
igor@402
|
426 the server.
|
igor@402
|
427 \end{enumerate}
|
igor@402
|
428
|
igor@402
|
429 There's plenty of scope for confusion with the path component of ssh
|
igor@402
|
430 URLs, as there is no standard way for tools to interpret it. Some
|
igor@402
|
431 programs behave differently than others when dealing with these paths.
|
igor@402
|
432 This isn't an ideal situation, but it's unlikely to change. Please
|
igor@402
|
433 read the following paragraphs carefully.
|
igor@402
|
434
|
igor@402
|
435 Mercurial treats the path to a repository on the server as relative to
|
igor@402
|
436 the remote user's home directory. For example, if user \texttt{foo}
|
igor@402
|
437 on the server has a home directory of \dirname{/home/foo}, then an ssh
|
igor@402
|
438 URL that contains a path component of \dirname{bar}
|
igor@402
|
439 \emph{really} refers to the directory \dirname{/home/foo/bar}.
|
igor@402
|
440
|
igor@402
|
441 If you want to specify a path relative to another user's home
|
igor@402
|
442 directory, you can use a path that starts with a tilde character
|
igor@402
|
443 followed by the user's name (let's call them \texttt{otheruser}), like
|
igor@402
|
444 this.
|
igor@402
|
445 \begin{codesample2}
|
igor@402
|
446 ssh://server/~otheruser/hg/repo
|
igor@402
|
447 \end{codesample2}
|
igor@402
|
448
|
igor@402
|
449 And if you really want to specify an \emph{absolute} path on the
|
igor@402
|
450 server, begin the path component with two slashes, as in this example.
|
igor@402
|
451 \begin{codesample2}
|
igor@402
|
452 ssh://server//absolute/path
|
igor@402
|
453 \end{codesample2}
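To put the three forms side by side, here is how each might look in a
\hgcmd{clone} command. The host name \texttt{server}, the paths, and
the destination directory names are placeholders chosen for
illustration.
\begin{codesample2}
  $ hg clone ssh://server/hg/repo home-copy
  $ hg clone ssh://server/~otheruser/hg/repo other-copy
  $ hg clone ssh://server//srv/hg/repo abs-copy
\end{codesample2}
The first names \dirname{hg/repo} under your own home directory on the
server, the second names the same path under \texttt{otheruser}'s home
directory, and the third names the absolute path
\dirname{/srv/hg/repo}.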
|
igor@402
|
454
|
igor@402
|
455 \subsection{Finding an ssh client for your system}
|
igor@402
|
456
|
igor@402
|
457 Almost every Unix-like system comes with OpenSSH preinstalled. If
|
igor@402
|
458 you're using such a system, run \Verb|which ssh| to find out if
|
igor@402
|
459 the \command{ssh} command is installed (it's usually in
|
igor@402
|
460 \dirname{/usr/bin}). In the unlikely event that it isn't present,
|
igor@402
|
461 take a look at your system documentation to figure out how to install
|
igor@402
|
462 it.
|
igor@402
|
463
|
igor@402
|
464 On Windows, you'll first need to choose and download a suitable ssh
|
igor@402
|
465 client. There are two alternatives.
|
igor@402
|
466 \begin{itemize}
|
igor@402
|
467 \item Simon Tatham's excellent PuTTY package~\cite{web:putty} provides
|
igor@402
|
468 a complete suite of ssh client commands.
|
igor@402
|
469 \item If you have a high tolerance for pain, you can use the Cygwin
|
igor@402
|
470 port of OpenSSH.
|
igor@402
|
471 \end{itemize}
|
igor@402
|
472 In either case, you'll need to edit your \hgini\ file to tell
|
igor@402
|
473 Mercurial where to find the actual client command. For example, if
|
igor@402
|
474 you're using PuTTY, you'll need to use the \command{plink} command as
|
igor@402
|
475 a command-line ssh client.
|
igor@402
|
476 \begin{codesample2}
|
igor@402
|
477 [ui]
|
igor@402
|
478 ssh = C:/path/to/plink.exe -ssh -i "C:/path/to/my/private/key"
|
igor@402
|
479 \end{codesample2}
|
igor@402
|
480
|
igor@402
|
481 \begin{note}
|
igor@402
|
482 The path to \command{plink} shouldn't contain any whitespace
|
igor@402
|
483 characters, or Mercurial may not be able to run it correctly (so
|
igor@402
|
484 putting it in \dirname{C:\textbackslash{}Program Files} is probably not a good
|
igor@402
|
485 idea).
|
igor@402
|
486 \end{note}
|
igor@402
|
487
|
igor@402
|
488 \subsection{Generating a key pair}
|
igor@402
|
489
|
igor@402
|
490 To avoid the need to repetitively type a password every time you need
|
igor@402
|
491 to use your ssh client, I recommend generating a key pair. On a
|
igor@402
|
492 Unix-like system, the \command{ssh-keygen} command will do the trick.
|
igor@402
|
493 On Windows, if you're using PuTTY, the \command{puttygen} command is
|
igor@402
|
494 what you'll need.
|
igor@402
|
495
|
igor@402
|
496 When you generate a key pair, it's usually \emph{highly} advisable to
|
igor@402
|
497 protect it with a passphrase. (The only time that you might not want
|
igor@402
|
498 to do this is when you're using the ssh protocol for automated tasks
|
igor@402
|
499 on a secure network.)
|
igor@402
|
500
|
igor@402
|
501 Simply generating a key pair isn't enough, however. You'll need to
|
igor@402
|
502 add the public key to the set of authorised keys for whatever user
|
igor@402
|
503 you're logging in remotely as. For servers using OpenSSH (the vast
|
igor@402
|
504 majority), this will mean adding the public key to a list in a file
|
igor@402
|
505 called \sfilename{authorized\_keys} in their \sdirname{.ssh}
|
igor@402
|
506 directory.
|
igor@402
|
507
|
igor@402
|
508 On a Unix-like system, your public key will have a \filename{.pub}
|
igor@402
|
509 extension. If you're using \command{puttygen} on Windows, you can
|
igor@402
|
510 save the public key to a file of your choosing, or paste it from the
|
igor@402
|
511 window it's displayed in straight into the
|
igor@402
|
512 \sfilename{authorized\_keys} file.
|
igor@402
|
513
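As a rough sketch of the steps above, on a Unix-like client talking to
an OpenSSH server, the following commands generate a key pair and
append the public key to the remote \sfilename{authorized\_keys} file.
The key type, the \sfilename{id\_rsa} file name, and the
\texttt{user@myserver} address are illustrative assumptions;
substitute your own.
\begin{codesample2}
  ssh-keygen -t rsa
  cat ~/.ssh/id_rsa.pub | ssh user@myserver 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'
\end{codesample2}
The \command{ssh-keygen} command prompts for a passphrase as it runs.
Until the key is installed, the second command will ask for the remote
user's password.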
|
igor@402
|
514 \subsection{Using an authentication agent}
|
igor@402
|
515
|
igor@402
|
516 An authentication agent is a daemon that stores passphrases in memory
|
igor@402
|
517 (so it will forget passphrases if you log out and log back in again).
|
igor@402
|
518 An ssh client will notice if it's running, and query it for a
|
igor@402
|
519 passphrase. If there's no authentication agent running, or the agent
|
igor@402
|
520 doesn't store the necessary passphrase, you'll have to type your
|
igor@402
|
521 passphrase every time Mercurial tries to communicate with a server on
|
igor@402
|
522 your behalf (e.g.~whenever you pull or push changes).
|
igor@402
|
523
|
igor@402
|
524 The downside of storing passphrases in an agent is that it's possible
|
igor@402
|
525 for a well-prepared attacker to recover the plain text of your
|
igor@402
|
526 passphrases, in some cases even if your system has been power-cycled.
|
igor@402
|
527 You should make your own judgment as to whether this is an acceptable
|
igor@402
|
528 risk. It certainly saves a lot of repeated typing.
|
igor@402
|
529
|
igor@402
|
530 On Unix-like systems, the agent is called \command{ssh-agent}, and
|
igor@402
|
531 it's often run automatically for you when you log in. You'll need to
|
igor@402
|
532 use the \command{ssh-add} command to add passphrases to the agent's
|
igor@402
|
533 store. On Windows, if you're using PuTTY, the \command{pageant}
|
igor@402
|
534 command acts as the agent. It adds an icon to your system tray that
|
igor@402
|
535 will let you manage stored passphrases.
|
igor@402
|
536
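On a Unix-like system where an agent isn't already running, a session
might look something like the sketch below. The key path is an
assumption; use whichever private key you generated.
\begin{codesample2}
  eval "$(ssh-agent -s)"
  ssh-add ~/.ssh/id_rsa
  ssh-add -l
\end{codesample2}
The first command starts an agent and sets the environment variables
that ssh clients use to find it. The last one lists the keys that the
agent currently holds, which is a quick way to confirm that
\command{ssh-add} did what you expected.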
|
igor@402
|
537 \subsection{Configuring the server side properly}
|
igor@402
|
538
|
igor@402
|
539 Because ssh can be fiddly to set up if you're new to it, there's a
|
igor@402
|
540 variety of things that can go wrong. Add Mercurial on top, and
|
igor@402
|
541 there's plenty more scope for head-scratching. Most of these
|
igor@402
|
542 potential problems occur on the server side, not the client side. The
|
igor@402
|
543 good news is that once you've gotten a configuration working, it will
|
igor@402
|
544 usually continue to work indefinitely.
|
igor@402
|
545
|
igor@402
|
546 Before you try using Mercurial to talk to an ssh server, it's best to
|
igor@402
|
547 make sure that you can use the normal \command{ssh} or \command{putty}
|
igor@402
|
548 command to talk to the server first. If you run into problems with
|
igor@402
|
549 using these commands directly, Mercurial surely won't work. Worse, it
|
igor@402
|
550 will obscure the underlying problem. Any time you want to debug
|
igor@402
|
551 ssh-related Mercurial problems, you should drop back to making sure
|
igor@402
|
552 that plain ssh client commands work first, \emph{before} you worry
|
igor@402
|
553 about whether there's a problem with Mercurial.
|
igor@402
|
554
|
igor@402
|
555 The first thing to be sure of on the server side is that you can
|
igor@402
|
556 actually log in from another machine at all. If you can't use
|
igor@402
|
557 \command{ssh} or \command{putty} to log in, the error message you get
|
igor@402
|
558 may give you a few hints as to what's wrong. The most common problems
|
igor@402
|
559 are as follows.
|
igor@402
|
560 \begin{itemize}
|
igor@402
|
561 \item If you get a ``connection refused'' error, either there isn't an
|
igor@402
|
562 SSH daemon running on the server at all, or it's inaccessible due to
|
igor@402
|
563 firewall configuration.
|
igor@402
|
564 \item If you get a ``no route to host'' error, you either have an
|
igor@402
|
565 incorrect address for the server or a seriously locked down firewall
|
igor@402
|
566 that won't admit its existence at all.
|
igor@402
|
567 \item If you get a ``permission denied'' error, you may have mistyped
|
igor@402
|
568 the username on the server, or you could have mistyped your key's
|
igor@402
|
569 passphrase or the remote user's password.
|
igor@402
|
570 \end{itemize}
|
igor@402
|
571 In summary, if you're having trouble talking to the server's ssh
|
igor@402
|
572 daemon, first make sure that one is running at all. On many systems
|
igor@402
|
573 it will be installed, but disabled, by default. Once you're done with
|
igor@402
|
574 this step, you should then check that the server's firewall is
|
igor@402
|
575 configured to allow incoming connections on the port the ssh daemon is
|
igor@402
|
576 listening on (usually~22). Don't worry about more exotic
|
igor@402
|
577 possibilities for misconfiguration until you've checked these two
|
igor@402
|
578 first.
|
igor@402
|
579
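When a connection attempt fails and the error message alone isn't
enough to go on, running the client in verbose mode usually gives more
detail about how far the connection got. Here, \texttt{myserver} is a
placeholder for your server's name.
\begin{codesample2}
  ssh -v myserver date
\end{codesample2}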
|
igor@402
|
580 If you're using an authentication agent on the client side to store
|
igor@402
|
581 passphrases for your keys, you ought to be able to log into the server
|
igor@402
|
582 without being prompted for a passphrase or a password. If you're
|
igor@402
|
583 prompted for a passphrase, there are a few possible culprits.
|
igor@402
|
584 \begin{itemize}
|
igor@402
|
585 \item You might have forgotten to use \command{ssh-add} or
|
igor@402
|
586 \command{pageant} to store the passphrase.
|
igor@402
|
587 \item You might have stored the passphrase for the wrong key.
|
igor@402
|
588 \end{itemize}
|
igor@402
|
589 If you're being prompted for the remote user's password, there are
|
igor@402
|
590 another few possible problems to check.
|
igor@402
|
591 \begin{itemize}
|
igor@402
|
592 \item Either the user's home directory or their \sdirname{.ssh}
|
igor@402
|
593 directory might have excessively liberal permissions. As a result,
|
igor@402
|
594 the ssh daemon will not trust or read their
|
igor@402
|
595 \sfilename{authorized\_keys} file. For example, a group-writable
|
igor@402
|
596 home or \sdirname{.ssh} directory will often cause this symptom.
|
igor@402
|
597 \item The user's \sfilename{authorized\_keys} file may have a problem.
|
igor@402
|
598 If anyone other than the user owns or can write to that file, the
|
igor@402
|
599 ssh daemon will not trust or read it.
|
igor@402
|
600 \end{itemize}
|
igor@402
|
601
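If you suspect a permissions problem of this kind, commands along the
following lines, run on the server as the user in question, are
usually enough to satisfy an OpenSSH daemon with its default settings.
Treat this as a sketch rather than an exhaustive fix; the daemon's own
log is the authoritative source for why a key was rejected.
\begin{codesample2}
  chmod go-w ~
  chmod 700 ~/.ssh
  chmod 600 ~/.ssh/authorized_keys
\end{codesample2}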
|
igor@402
|
602 In the ideal world, you should be able to run the following command
|
igor@402
|
603 successfully, and it should print exactly one line of output, the
|
igor@402
|
604 current date and time.
|
igor@402
|
605 \begin{codesample2}
|
igor@402
|
606 ssh myserver date
|
igor@402
|
607 \end{codesample2}
|
igor@402
|
608
|
igor@402
|
609 If, on your server, you have login scripts that print banners or other
|
igor@402
|
610 junk even when running non-interactive commands like this, you should
|
igor@402
|
611 fix them before you continue, so that they only print output if
|
igor@402
|
612 they're run interactively. Otherwise these banners will at least
|
igor@402
|
613 clutter up Mercurial's output. Worse, they could potentially cause
|
igor@402
|
614 problems with running Mercurial commands remotely. Mercurial
|
igor@402
|
615 tries to detect and ignore banners in non-interactive \command{ssh}
|
igor@402
|
616 sessions, but it is not foolproof. (If you're editing your login
|
igor@402
|
617 scripts on your server, the usual way to see if a login script is
|
igor@402
|
618 running in an interactive shell is to check the return code from the
|
igor@402
|
619 command \Verb|tty -s|.)
|
igor@402
|
620
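For a Bourne-style shell, a guard of roughly the following shape keeps
a banner out of non-interactive sessions. The \texttt{echo} line is a
stand-in for whatever your login script actually prints.
\begin{codesample2}
  if tty -s; then
      echo "Welcome to myserver"
  fi
\end{codesample2}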
|
igor@402
|
621 Once you've verified that plain old ssh is working with your server,
|
igor@402
|
622 the next step is to ensure that Mercurial runs on the server. The
|
igor@402
|
623 following command should run successfully:
|
igor@402
|
624 \begin{codesample2}
|
igor@402
|
625 ssh myserver hg version
|
igor@402
|
626 \end{codesample2}
|
igor@402
|
627 If you see an error message instead of normal \hgcmd{version} output,
|
igor@402
|
628 this is usually because you haven't installed Mercurial to
|
igor@402
|
629 \dirname{/usr/bin}. Don't worry if this is the case; you don't need
|
igor@402
|
630 to do that. But you should check for a few possible problems.
|
igor@402
|
631 \begin{itemize}
|
igor@402
|
632 \item Is Mercurial really installed on the server at all? I know this
|
igor@402
|
633 sounds trivial, but it's worth checking!
|
igor@402
|
634 \item Maybe your shell's search path (usually set via the \envar{PATH}
|
igor@402
|
635 environment variable) is simply misconfigured.
|
igor@402
|
636 \item Perhaps your \envar{PATH} environment variable is only being set
|
igor@402
|
637 to point to the location of the \command{hg} executable if the login
|
igor@402
|
638 session is interactive. This can happen if you're setting the path
|
igor@402
|
639 in the wrong shell login script. See your shell's documentation for
|
igor@402
|
640 details.
|
igor@402
|
641 \item The \envar{PYTHONPATH} environment variable may need to contain
|
igor@402
|
642 the path to the Mercurial Python modules. It might not be set at
|
igor@402
|
643 all; it could be incorrect; or it may be set only if the login is
|
igor@402
|
644 interactive.
|
igor@402
|
645 \end{itemize}
|
igor@402
|
646
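To see what a non-interactive session really gets, you can ask the
server directly. The single quotes stop your local shell from
expanding the variables before they reach the server.
\begin{codesample2}
  ssh myserver 'echo $PATH'
  ssh myserver 'echo $PYTHONPATH'
\end{codesample2}
If \command{hg} lives somewhere that you can't easily add to the
non-interactive search path, another option is to tell your client
where to find it, using the \rcitem{ui}{remotecmd} setting in your
\hgrc. The path below is a made-up example.
\begin{codesample2}
  [ui]
  remotecmd = /home/myuser/bin/hg
\end{codesample2}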
|
igor@402
|
647 If you can run \hgcmd{version} over an ssh connection, well done!
|
igor@402
|
648 You've got the server and client sorted out. You should now be able
|
igor@402
|
649 to use Mercurial to access repositories hosted by that username on
|
igor@402
|
650 that server. If you run into problems with Mercurial and ssh at this
|
igor@402
|
651 point, try using the \hggopt{--debug} option to get a clearer picture
|
igor@402
|
652 of what's going on.
|
igor@402
|
653
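For example, assuming a repository at the hypothetical location
\dirname{path/to/repo} under the remote user's home directory, you
could run something like this from inside a local clone.
\begin{codesample2}
  hg --debug pull ssh://myserver/path/to/repo
\end{codesample2}
Among other things, the extra output should show the command line that
Mercurial uses to invoke the ssh client, which you can then try by
hand.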
|
igor@402
|
654 \subsection{Using compression with ssh}
|
igor@402
|
655
|
igor@402
|
656 Mercurial does not compress data when it uses the ssh protocol,
|
igor@402
|
657 because the ssh protocol can transparently compress data. However,
|
igor@402
|
658 the default behaviour of ssh clients is \emph{not} to request
|
igor@402
|
659 compression.
|
igor@402
|
660
|
igor@402
|
661 Over any network other than a fast LAN (even a wireless network),
|
igor@402
|
662 using compression is likely to significantly speed up Mercurial's
|
igor@402
|
663 network operations. For example, over a WAN, someone measured
|
igor@402
|
664 compression as reducing the amount of time required to clone a
|
igor@402
|
665 particularly large repository from~51 minutes to~17 minutes.
|
igor@402
|
666
|
igor@402
|
667 Both \command{ssh} and \command{plink} accept a \cmdopt{ssh}{-C}
|
igor@402
|
668 option which turns on compression. You can easily edit your \hgrc\ to
|
igor@402
|
669 enable compression for all of Mercurial's uses of the ssh protocol.
|
igor@402
|
670 \begin{codesample2}
|
igor@402
|
671 [ui]
|
igor@402
|
672 ssh = ssh -C
|
igor@402
|
673 \end{codesample2}
|
igor@402
|
674
|
igor@402
|
675 If you use \command{ssh}, you can configure it to always use
|
igor@402
|
676 compression when talking to your server. To do this, edit your
|
igor@402
|
677 \sfilename{.ssh/config} file (which may not yet exist), as follows.
|
igor@402
|
678 \begin{codesample2}
|
igor@402
|
679 Host hg
|
igor@402
|
680 Compression yes
|
igor@402
|
681 HostName hg.example.com
|
igor@402
|
682 \end{codesample2}
|
igor@402
|
683 This defines an alias, \texttt{hg}. When you use it on the
|
igor@402
|
684 \command{ssh} command line or in a Mercurial \texttt{ssh}-protocol
|
igor@402
|
685 URL, it will cause \command{ssh} to connect to \texttt{hg.example.com}
|
igor@402
|
686 and use compression. This gives you both a shorter name to type and
|
igor@402
|
687 compression, each of which is a good thing in its own right.
|
igor@402
|
688
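With the alias defined, both of the following commands should go
through \texttt{hg.example.com} with compression enabled. The
repository path is a placeholder, interpreted relative to the remote
user's home directory.
\begin{codesample2}
  ssh hg date
  hg clone ssh://hg/path/to/repo
\end{codesample2}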
|
igor@402
|
689 \section{Serving over HTTP using CGI}
|
igor@402
|
690 \label{sec:collab:cgi}
|
igor@402
|
691
|
igor@402
|
692 Depending on how ambitious you are, configuring Mercurial's CGI
|
igor@402
|
693 interface can take anything from a few moments to several hours.
|
igor@402
|
694
|
igor@402
|
695 We'll begin with the simplest of examples, and work our way towards a
|
igor@402
|
696 more complex configuration. Even for the most basic case, you're
|
igor@402
|
697 almost certainly going to need to read and modify your web server's
|
igor@402
|
698 configuration.
|
igor@402
|
699
|
igor@402
|
700 \begin{note}
|
igor@402
|
701 Configuring a web server is a complex, fiddly, and highly
|
igor@402
|
702 system-dependent activity. I can't possibly give you instructions
|
igor@402
|
703 that will cover anything like all of the cases you will encounter.
|
igor@402
|
704 Please use your discretion and judgment in following the sections
|
igor@402
|
705 below. Be prepared to make plenty of mistakes, and to spend a lot
|
igor@402
|
706 of time reading your server's error logs.
|
igor@402
|
707 \end{note}
|
igor@402
|
708
|
igor@402
|
709 \subsection{Web server configuration checklist}
|
igor@402
|
710
|
igor@402
|
711 Before you continue, do take a few moments to check a few aspects of
|
igor@402
|
712 your system's setup.
|
igor@402
|
713
|
igor@402
|
714 \begin{enumerate}
|
igor@402
|
715 \item Do you have a web server installed at all? Mac OS X ships with
|
igor@402
|
716 Apache, but many other systems may not have a web server installed.
|
igor@402
|
717 \item If you have a web server installed, is it actually running? On
|
igor@402
|
718 most systems, even if one is present, it will be disabled by
|
igor@402
|
719 default.
|
igor@402
|
720 \item Is your server configured to allow you to run CGI programs in
|
igor@402
|
721 the directory where you plan to do so? Most servers default to
|
igor@402
|
722 explicitly disabling the ability to run CGI programs.
|
igor@402
|
723 \end{enumerate}
|
igor@402
|
724
|
igor@402
|
725 If you don't have a web server installed, and don't have substantial
|
igor@402
|
726 experience configuring Apache, you should consider using the
|
igor@402
|
727 \texttt{lighttpd} web server instead of Apache. Apache has a
|
igor@402
|
728 well-deserved reputation for baroque and confusing configuration.
|
igor@402
|
729 While \texttt{lighttpd} is less capable in some ways than Apache, most
|
igor@402
|
730 of these capabilities are not relevant to serving Mercurial
|
igor@402
|
731 repositories. And \texttt{lighttpd} is undeniably \emph{much} easier
|
igor@402
|
732 to get started with than Apache.
|
igor@402
|
733
|
igor@402
|
734 \subsection{Basic CGI configuration}
|
igor@402
|
735
|
igor@402
|
736 On Unix-like systems, it's common for users to have a subdirectory
|
igor@402
|
737 named something like \dirname{public\_html} in their home directory,
|
igor@402
|
738 from which they can serve up web pages. A file named \filename{foo}
|
igor@402
|
739 in this directory will be accessible at a URL of the form
|
igor@402
|
740 \texttt{http://www.example.com/\~{}username/foo}.
|
igor@402
|
741
|
igor@402
|
742 To get started, find the \sfilename{hgweb.cgi} script that should be
|
igor@402
|
743 present in your Mercurial installation. If you can't quickly find a
|
igor@402
|
744 local copy on your system, simply download one from the master
|
igor@402
|
745 Mercurial repository at
|
igor@402
|
746 \url{http://www.selenic.com/repo/hg/raw-file/tip/hgweb.cgi}.
|
igor@402
|
747
|
igor@402
|
748 You'll need to copy this script into your \dirname{public\_html}
|
igor@402
|
749 directory, and ensure that it's executable.
|
igor@402
|
750 \begin{codesample2}
|
igor@402
|
751 cp .../hgweb.cgi ~/public_html
|
igor@402
|
752 chmod 755 ~/public_html/hgweb.cgi
|
igor@402
|
753 \end{codesample2}
|
igor@402
|
754 The \texttt{755} argument to \command{chmod} is a little more general
|
igor@402
|
755 than just making the script executable: it ensures that the script is
|
igor@402
|
756 executable by anyone, and that ``group'' and ``other'' write
|
igor@402
|
757 permissions are \emph{not} set. If you were to leave those write
|
igor@402
|
758 permissions enabled, Apache's \texttt{suexec} subsystem would likely
|
igor@402
|
759 refuse to execute the script. In fact, \texttt{suexec} also insists
|
igor@402
|
760 that the \emph{directory} in which the script resides must not be
|
igor@402
|
761 writable by others.
|
igor@402
|
762 \begin{codesample2}
|
igor@402
|
763 chmod 755 ~/public_html
|
igor@402
|
764 \end{codesample2}
|
igor@402
|
765
|
igor@402
|
766 \subsubsection{What could \emph{possibly} go wrong?}
|
igor@402
|
767 \label{sec:collab:wtf}
|
igor@402
|
768
|
igor@402
|
769 Once you've copied the CGI script into place, go into a web browser,
|
igor@402
|
770 and try to open the URL \url{http://myhostname/~myuser/hgweb.cgi},
|
igor@402
|
771 \emph{but} brace yourself for instant failure. There's a high
|
igor@402
|
772 probability that trying to visit this URL will fail, and there are
|
igor@402
|
773 many possible reasons for this. In fact, you're likely to stumble
|
igor@402
|
774 over almost every one of the possible errors below, so please read
|
igor@402
|
775 carefully. The following are all of the problems I ran into on a
|
igor@402
|
776 system running Fedora~7, with a fresh installation of Apache, and a
|
igor@402
|
777 user account that I created specially to perform this exercise.
|
igor@402
|
778
|
igor@402
|
779 Your web server may have per-user directories disabled. If you're
|
igor@402
|
780 using Apache, search your config file for a \texttt{UserDir}
|
igor@402
|
781 directive. If there's none present, per-user directories will be
|
igor@402
|
782 disabled. If one exists, but its value is \texttt{disabled}, then
|
igor@402
|
783 per-user directories will be disabled. Otherwise, the string after
|
igor@402
|
784 \texttt{UserDir} gives the name of the subdirectory that Apache will
|
igor@402
|
785 look in under your home directory, for example \dirname{public\_html}.
|
igor@402
|
786
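A configuration that enables per-user directories might contain
something like the following. The exact layout varies between
distributions, so treat this as a sketch of what to look for rather
than text to paste in.
\begin{codesample2}
  <IfModule mod_userdir.c>
      UserDir public_html
  </IfModule>
\end{codesample2}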
|
igor@402
|
787 Your file access permissions may be too restrictive. The web server
|
igor@402
|
788 must be able to traverse your home directory and directories under
|
igor@402
|
789 your \dirname{public\_html} directory, and read files under the latter
|
igor@402
|
790 too. Here's a quick recipe to help you to make your permissions more
|
igor@402
|
791 appropriate.
|
igor@402
|
792 \begin{codesample2}
|
igor@402
|
793 chmod 755 ~
|
igor@402
|
794 find ~/public_html -type d -print0 | xargs -0r chmod 755
|
igor@402
|
795 find ~/public_html -type f -print0 | xargs -0r chmod 644
|
igor@402
|
796 \end{codesample2}
|
igor@402
|
797
|
igor@402
|
798 The other possibility with permissions is that you might get a
|
igor@402
|
799 completely empty window when you try to load the script. In this
|
igor@402
|
800 case, it's likely that your access permissions are \emph{too
|
igor@402
|
801 permissive}. Apache's \texttt{suexec} subsystem won't execute a
|
igor@402
|
802 script that's group-~or world-writable, for example.
|
igor@402
|
803
|
igor@402
|
804 Your web server may be configured to disallow execution of CGI
|
igor@402
|
805 programs in your per-user web directory. Here's Apache's
|
igor@402
|
806 default per-user configuration from my Fedora system.
|
igor@402
|
807 \begin{codesample2}
|
igor@402
|
808 <Directory /home/*/public_html>
|
igor@402
|
809 AllowOverride FileInfo AuthConfig Limit
|
igor@402
|
810 Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
|
igor@402
|
811 <Limit GET POST OPTIONS>
|
igor@402
|
812 Order allow,deny
|
igor@402
|
813 Allow from all
|
igor@402
|
814 </Limit>
|
igor@402
|
815 <LimitExcept GET POST OPTIONS>
|
igor@402
|
816 Order deny,allow
|
igor@402
|
817 Deny from all
|
igor@402
|
818 </LimitExcept>
|
igor@402
|
819 </Directory>
|
igor@402
|
820 \end{codesample2}
|
igor@402
|
821 If you find a similar-looking \texttt{Directory} group in your Apache
|
igor@402
|
822 configuration, the directive to look at inside it is \texttt{Options}.
|
igor@402
|
823 Add \texttt{ExecCGI} to the end of this list if it's missing, and
|
igor@402
|
824 restart the web server.
|
igor@402
|
825
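Applied to the Fedora example above, the edited \texttt{Options} line
would read as follows.
\begin{codesample2}
  Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec ExecCGI
\end{codesample2}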
|
igor@402
|
826 If you find that Apache serves you the text of the CGI script instead
|
igor@402
|
827 of executing it, you may need to either uncomment (if already present)
|
igor@402
|
828 or add a directive like this.
|
igor@402
|
829 \begin{codesample2}
|
igor@402
|
830 AddHandler cgi-script .cgi
|
igor@402
|
831 \end{codesample2}
|
igor@402
|
832
|
igor@402
|
833 The next possibility is that you might be served with a colourful
|
igor@402
|
834 Python backtrace claiming that it can't import a
|
igor@402
|
835 \texttt{mercurial}-related module. This is actually progress! The
|
igor@402
|
836 server is now capable of executing your CGI script. This error is
|
igor@402
|
837 only likely to occur if you're running a private installation of
|
igor@402
|
838 Mercurial, instead of a system-wide version. Remember that the web
|
igor@402
|
839 server runs the CGI program without any of the environment variables
|
igor@402
|
840 that you take for granted in an interactive session. If this error
|
igor@402
|
841 happens to you, edit your copy of \sfilename{hgweb.cgi} and follow the
|
igor@402
|
842 directions inside it to correctly set your \envar{PYTHONPATH}
|
igor@402
|
843 environment variable.
|
igor@402
|
844
|
igor@402
|
845 Finally, you are \emph{certain} to be served with another colourful
|
igor@402
|
846 Python backtrace: this one will complain that it can't find
|
igor@402
|
847 \dirname{/path/to/repository}. Edit your \sfilename{hgweb.cgi} script
|
igor@402
|
848 and replace the \dirname{/path/to/repository} string with the complete
|
igor@402
|
849 path to the repository you want to serve up.
|
igor@402
|
850
|
igor@402
|
851 At this point, when you try to reload the page, you should be
|
igor@402
|
852 presented with a nice HTML view of your repository's history. Whew!
|
igor@402
|
853
|
igor@402
|
854 \subsubsection{Configuring lighttpd}
|
igor@402
|
855
|
igor@402
|
856 To be exhaustive in my experiments, I tried configuring the
|
igor@402
|
857 increasingly popular \texttt{lighttpd} web server to serve the same
|
igor@402
|
858 repository as I described with Apache above. I had already overcome
|
igor@402
|
859 all of the problems I outlined with Apache, many of which are not
|
igor@402
|
860 server-specific. As a result, I was fairly sure that my file and
|
igor@402
|
861 directory permissions were good, and that my \sfilename{hgweb.cgi}
|
igor@402
|
862 script was properly edited.
|
igor@402
|
863
|
igor@402
|
864 Once I had Apache running, getting \texttt{lighttpd} to serve the
|
igor@402
|
865 repository was a snap (in other words, even if you're trying to use
|
igor@402
|
866 \texttt{lighttpd}, you should read the Apache section). I first had
|
igor@402
|
867 to edit the \texttt{mod\_access} section of its config file to enable
|
igor@402
|
868 \texttt{mod\_cgi} and \texttt{mod\_userdir}, both of which were
|
igor@402
|
869 disabled by default on my system. I then added a few lines to the end
|
igor@402
|
870 of the config file, to configure these modules.
|
igor@402
|
871 \begin{codesample2}
|
igor@402
|
872 userdir.path = "public_html"
|
igor@402
|
873 cgi.assign = ( ".cgi" => "" )
|
igor@402
|
874 \end{codesample2}
|
igor@402
|
875 With this done, \texttt{lighttpd} ran immediately for me. If I had
|
igor@402
|
876 configured \texttt{lighttpd} before Apache, I'd almost certainly have
|
igor@402
|
877 run into many of the same system-level configuration problems as I did
|
igor@402
|
878 with Apache. However, I found \texttt{lighttpd} to be noticeably
|
igor@402
|
879 easier to configure than Apache, even though I've used Apache for over
|
igor@402
|
880 a decade, and this was my first exposure to \texttt{lighttpd}.
|
igor@402
|
881
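For reference, the list of enabled modules in a \texttt{lighttpd}
config file is an array named \texttt{server.modules}. After enabling
the two modules mentioned above, the list might look roughly like the
sketch below; the other entries will differ from system to system.
\begin{codesample2}
  server.modules = (
      "mod_access",
      "mod_userdir",
      "mod_cgi"
  )
\end{codesample2}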
|
igor@402
|
882 \subsection{Sharing multiple repositories with one CGI script}
|
igor@402
|
883
|
igor@402
|
884 The \sfilename{hgweb.cgi} script only lets you publish a single
|
igor@402
|
885 repository, which is an annoying restriction. If you want to publish
|
igor@402
|
886 more than one without burdening yourself with multiple copies of the
|
igor@402
|
887 same script, each with different names, a better choice is to use the
|
igor@402
|
888 \sfilename{hgwebdir.cgi} script.
|
igor@402
|
889
|
igor@402
|
890 The procedure to configure \sfilename{hgwebdir.cgi} is only a little
|
igor@402
|
891 more involved than for \sfilename{hgweb.cgi}. First, you must obtain
|
igor@402
|
892 a copy of the script. If you don't have one handy, you can download a
|
igor@402
|
893 copy from the master Mercurial repository at
|
igor@402
|
894 \url{http://www.selenic.com/repo/hg/raw-file/tip/hgwebdir.cgi}.
|
igor@402
|
895
|
igor@402
|
896 You'll need to copy this script into your \dirname{public\_html}
|
igor@402
|
897 directory, and ensure that it's executable.
|
igor@402
|
898 \begin{codesample2}
|
igor@402
|
899 cp .../hgwebdir.cgi ~/public_html
|
igor@402
|
900 chmod 755 ~/public_html ~/public_html/hgwebdir.cgi
|
igor@402
|
901 \end{codesample2}
|
igor@402
|
902 With basic configuration out of the way, try to visit
|
igor@402
|
903 \url{http://myhostname/~myuser/hgwebdir.cgi} in your browser. It
|
igor@402
|
904 should display an empty list of repositories. If you get a blank
|
igor@402
|
905 window or error message, try walking through the list of potential
|
igor@402
|
906 problems in section~\ref{sec:collab:wtf}.
|
igor@402
|
907
|
igor@402
|
908 The \sfilename{hgwebdir.cgi} script relies on an external
|
igor@402
|
909 configuration file. By default, it searches for a file named
|
igor@402
|
910 \sfilename{hgweb.config} in the same directory as itself. You'll need
|
igor@402
|
911 to create this file, and make it world-readable. The format of the
|
igor@402
|
912 file is similar to a Windows ``ini'' file, as understood by Python's
|
igor@402
|
913 \texttt{ConfigParser}~\cite{web:configparser} module.
|
igor@402
|
914
|
igor@402
|
915 The easiest way to configure \sfilename{hgwebdir.cgi} is with a
|
igor@402
|
916 section named \texttt{collections}. This will automatically publish
|
igor@402
|
917 \emph{every} repository under the directories you name. The section
|
igor@402
|
918 should look like this:
|
igor@402
|
919 \begin{codesample2}
|
igor@402
|
920 [collections]
|
igor@402
|
921 /my/root = /my/root
|
igor@402
|
922 \end{codesample2}
|
igor@402
|
923 Mercurial interprets this by looking at the directory name on the
|
igor@402
|
924 \emph{right} hand side of the ``\texttt{=}'' sign; finding
|
igor@402
|
925 repositories in that directory hierarchy; and using the text on the
|
igor@402
|
926 \emph{left} to strip off matching text from the names it will actually
|
igor@402
|
927 list in the web interface. The remaining component of a path after
|
igor@402
|
928 this stripping has occurred is called a ``virtual path''.
|
igor@402
|
929
|
igor@402
|
930 Given the example above, if we have a repository whose local path is
|
igor@402
|
931 \dirname{/my/root/this/repo}, the CGI script will strip the leading
|
igor@402
|
932 \dirname{/my/root} from the name, and publish the repository with a
|
igor@402
|
933 virtual path of \dirname{this/repo}. If the base URL for our CGI
|
igor@402
|
934 script is \url{http://myhostname/~myuser/hgwebdir.cgi}, the complete
|
igor@402
|
935 URL for that repository will be
|
igor@402
|
936 \url{http://myhostname/~myuser/hgwebdir.cgi/this/repo}.
|
igor@402
|
937
|
igor@402
|
938 If we replace \dirname{/my/root} on the left hand side of this example
|
igor@402
|
939 with \dirname{/my}, then \sfilename{hgwebdir.cgi} will only strip off
|
igor@402
|
940 \dirname{/my} from the repository name, and will give us a virtual
|
igor@402
|
941 path of \dirname{root/this/repo} instead of \dirname{this/repo}.
|
igor@402
|
942
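To make the stripping rule concrete, here is the modified
configuration just described, using the same example directories.
\begin{codesample2}
  [collections]
  /my = /my/root
\end{codesample2}
With this in place, the repository at \dirname{/my/root/this/repo}
would be published at
\url{http://myhostname/~myuser/hgwebdir.cgi/root/this/repo}.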
|
igor@402
|
943 The \sfilename{hgwebdir.cgi} script will recursively search each
|
igor@402
|
944 directory listed in the \texttt{collections} section of its
|
igor@402
|
945 configuration file, but it will \emph{not} recurse into the
|
igor@402
|
946 repositories it finds.
|
igor@402
|
947
|
igor@402
|
948 The \texttt{collections} mechanism makes it easy to publish many
|
igor@402
|
949 repositories in a ``fire and forget'' manner. You only need to set up
|
igor@402
|
950 the CGI script and configuration file one time. Afterwards, you can
|
igor@402
|
951 publish or unpublish a repository at any time by simply moving it
|
igor@402
|
952 into, or out of, the directory hierarchy in which you've configured
|
igor@402
|
953 \sfilename{hgwebdir.cgi} to look.
|
igor@402
|
954
|
igor@402
|
955 \subsubsection{Explicitly specifying which repositories to publish}
|
igor@402
|
956
|
igor@402
|
957 In addition to the \texttt{collections} mechanism, the
|
igor@402
|
958 \sfilename{hgwebdir.cgi} script allows you to publish a specific list
|
igor@402
|
959 of repositories. To do so, create a \texttt{paths} section, with
|
igor@402
|
960 contents of the following form.
|
igor@402
|
961 \begin{codesample2}
|
igor@402
|
962 [paths]
|
igor@402
|
963 repo1 = /my/path/to/some/repo
|
igor@402
|
964 repo2 = /some/path/to/another
|
igor@402
|
965 \end{codesample2}
|
igor@402
|
966 In this case, the virtual path (the component that will appear in a
|
igor@402
|
967 URL) is on the left hand side of each definition, while the path to
|
igor@402
|
968 the repository is on the right. Notice that there does not need to be
|
igor@402
|
969 any relationship between the virtual path you choose and the location
|
igor@402
|
970 of a repository in your filesystem.
|
igor@402
|
971
|
igor@402
|
972 If you wish, you can use both the \texttt{collections} and
|
igor@402
|
973 \texttt{paths} mechanisms simultaneously in a single configuration
|
igor@402
|
974 file.
|
igor@402
|
975
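A combined \sfilename{hgweb.config}, reusing the example paths from
above, might look like this.
\begin{codesample2}
  [collections]
  /my/root = /my/root

  [paths]
  repo1 = /my/path/to/some/repo
  repo2 = /some/path/to/another
\end{codesample2}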
|
igor@402
|
976 \begin{note}
|
igor@402
|
977 If multiple repositories have the same virtual path,
|
igor@402
|
978 \sfilename{hgwebdir.cgi} will not report an error. Instead, it will
|
igor@402
|
979 behave unpredictably.
|
igor@402
|
980 \end{note}
|
igor@402
|
981
|
igor@402
|
982 \subsection{Downloading source archives}
|
igor@402
|
983
|
igor@402
|
984 Mercurial's web interface lets users download an archive of any
|
igor@402
|
985 revision. This archive will contain a snapshot of the working
|
igor@402
|
986 directory as of that revision, but it will not contain a copy of the
|
igor@402
|
987 repository data.
|
igor@402
|
988
|
igor@402
|
989 By default, this feature is not enabled. To enable it, you'll need to
|
igor@402
|
990 add an \rcitem{web}{allow\_archive} item to the \rcsection{web}
|
igor@402
|
991 section of your \hgrc.
|
igor@402
|
992
|
igor@402
|
993 \subsection{Web configuration options}
|
igor@402
|
994
|
igor@402
|
995 Mercurial's web interfaces (the \hgcmd{serve} command, and the
|
igor@402
|
996 \sfilename{hgweb.cgi} and \sfilename{hgwebdir.cgi} scripts) have a
|
igor@402
|
997 number of configuration options that you can set. These belong in a
|
igor@402
|
998 section named \rcsection{web}.
\begin{itemize}
\item[\rcitem{web}{allow\_archive}] Determines which (if any) archive
  download mechanisms Mercurial supports. If you enable this
  feature, users of the web interface will be able to download an
  archive of whatever revision of a repository they are viewing.
  To enable the archive feature, this item must take the form of a
  sequence of words drawn from the list below.
  \begin{itemize}
  \item[\texttt{bz2}] A \command{tar} archive, compressed using
    \texttt{bzip2} compression. This has the best compression ratio,
    but uses the most CPU time on the server.
  \item[\texttt{gz}] A \command{tar} archive, compressed using
    \texttt{gzip} compression.
  \item[\texttt{zip}] A \command{zip} archive, compressed using
    DEFLATE compression. This format has the worst compression ratio,
    but is widely used in the Windows world.
  \end{itemize}
  If you provide an empty list, or don't have an
  \rcitem{web}{allow\_archive} entry at all, this feature will be
  disabled. Here is an example of how to enable all three supported
  formats.
  \begin{codesample4}
    [web]
    allow_archive = bz2 gz zip
  \end{codesample4}
\item[\rcitem{web}{allowpull}] Boolean. Determines whether the web
  interface allows remote users to \hgcmd{pull} and \hgcmd{clone} this
  repository over~HTTP. If set to \texttt{no} or \texttt{false}, only
  the ``human-oriented'' portion of the web interface is available.
\item[\rcitem{web}{contact}] String. A free-form (but preferably
  brief) string identifying the person or group in charge of the
  repository. This often contains the name and email address of a
  person or mailing list. It often makes sense to place this entry in
  a repository's own \sfilename{.hg/hgrc} file, but it can make sense
  to use in a global \hgrc\ if every repository has a single
  maintainer.
\item[\rcitem{web}{maxchanges}] Integer. The default maximum number
  of changesets to display in a single page of output.
\item[\rcitem{web}{maxfiles}] Integer. The default maximum number
  of modified files to display in a single page of output.
\item[\rcitem{web}{stripes}] Integer. If the web interface displays
  alternating ``stripes'' to make it easier to visually align rows
  when you are looking at a table, this number controls the number of
  rows in each stripe.
\item[\rcitem{web}{style}] Controls the template Mercurial uses to
  display the web interface. Mercurial ships with two web templates,
  named \texttt{default} and \texttt{gitweb} (the latter is much more
  visually attractive). You can also specify a custom template of
  your own; see chapter~\ref{chap:template} for details. Here, you
  can see how to enable the \texttt{gitweb} style.
  \begin{codesample4}
    [web]
    style = gitweb
  \end{codesample4}
\item[\rcitem{web}{templates}] Path. The directory in which to search
  for template files. By default, Mercurial searches in the directory
  in which it was installed.
\end{itemize}
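As a rough illustration of how several of the items above fit
together, here is a sketch of a \rcsection{web} section. The contact
address and the numeric values are invented for this example; they
are not Mercurial's defaults.
\begin{codesample4}
    [web]
    # Hypothetical values, chosen only for illustration.
    allowpull = true
    contact = Project maintainers <hg-admin@example.com>
    maxchanges = 20
    maxfiles = 15
    stripes = 1
\end{codesample4}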
If you are using \sfilename{hgwebdir.cgi}, you can place a few
configuration items in a \rcsection{web} section of the
\sfilename{hgweb.config} file instead of a \hgrc\ file, for
convenience. These items are \rcitem{web}{motd} and
\rcitem{web}{style}.

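A minimal sketch of such a \rcsection{web} section in
\sfilename{hgweb.config} might look like the following. The message
text is made up for the example, and any other sections the file
needs are omitted.
\begin{codesample4}
    [web]
    # Hypothetical message text, for illustration only.
    motd = Scheduled maintenance this Sunday
    style = gitweb
\end{codesample4}
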
\subsubsection{Options specific to an individual repository}

A few \rcsection{web} configuration items ought to be placed in a
repository's local \sfilename{.hg/hgrc}, rather than a user's or
global \hgrc.
\begin{itemize}
\item[\rcitem{web}{description}] String. A free-form (but preferably
  brief) string that describes the contents or purpose of the
  repository.
\item[\rcitem{web}{name}] String. The name to use for the repository
  in the web interface. This overrides the default name, which is the
  last component of the repository's path.
\end{itemize}

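For instance, a repository's \sfilename{.hg/hgrc} could carry entries
along these lines. The description and name shown here are
hypothetical and stand in for whatever suits your project.
\begin{codesample4}
    [web]
    # Hypothetical repository details, for illustration only.
    description = Widget library, stable branch
    name = widgets-stable
\end{codesample4}
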
\subsubsection{Options specific to the \hgcmd{serve} command}

Some of the items in the \rcsection{web} section of a \hgrc\ file are
only for use with the \hgcmd{serve} command.
\begin{itemize}
\item[\rcitem{web}{accesslog}] Path. The name of a file into which to
  write an access log. By default, the \hgcmd{serve} command writes
  this information to standard output, not to a file. Log entries are
  written in the standard ``combined'' file format used by almost all
  web servers.
\item[\rcitem{web}{address}] String. The local address on which the
  server should listen for incoming connections. By default, the
  server listens on all addresses.
\item[\rcitem{web}{errorlog}] Path. The name of a file into which to
  write an error log. By default, the \hgcmd{serve} command writes this
  information to standard error, not to a file.
\item[\rcitem{web}{ipv6}] Boolean. Whether to use the IPv6 protocol.
  By default, IPv6 is not used.
\item[\rcitem{web}{port}] Integer. The TCP~port number on which the
  server should listen. The default port number used is~8000.
\end{itemize}

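The sketch below shows one way these items might be combined so that
\hgcmd{serve} listens only on the loopback address, on a non-default
port, and logs to files. The log file paths and the port number are
assumptions made for the example; substitute locations that exist and
are writable on your system.
\begin{codesample4}
    [web]
    # Hypothetical paths and port, for illustration only.
    accesslog = /var/log/hg/access.log
    errorlog = /var/log/hg/error.log
    address = 127.0.0.1
    ipv6 = no
    port = 8080
\end{codesample4}
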
\subsubsection{Choosing the right \hgrc\ file to add \rcsection{web}
  items to}

It is important to remember that a web server like Apache or
\texttt{lighttpd} will run under a user~ID that is different to yours.
CGI scripts run by your server, such as \sfilename{hgweb.cgi}, will
usually also run under that user~ID.

If you add \rcsection{web} items to your own personal \hgrc\ file, CGI
scripts won't read that \hgrc\ file. Those settings will thus only
affect the behaviour of the \hgcmd{serve} command when you run it. To
cause CGI scripts to see your settings, either create a \hgrc\ file in
the home directory of the user~ID that runs your web server, or add
those settings to a system-wide \hgrc\ file.


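As an example of the system-wide approach, on many Unix-like systems
the system-wide file is \sfilename{/etc/mercurial/hgrc}. The exact
location depends on how Mercurial was installed, so treat the path
below as an assumption and check your own installation. The settings
themselves are only illustrative.
\begin{codesample4}
    # /etc/mercurial/hgrc (assumed location; verify on your system)
    [web]
    allow_archive = gz zip
    style = gitweb
\end{codesample4}
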
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: