hgbook: 0eda2936ef77 es/collab.tex

view es/collab.tex @ 415:0eda2936ef77

translated a couple of paragraphs

author	Javier Rojas <jerojasro@devnull.li>
date	Tue Nov 11 23:21:00 2008 -0500 (2008-11-11)
parents	a9ea523446cc
children	7e838acf7350

1 \chapter{Collaborating with other people}

2 \label{cha:collab}

4 As a completely decentralised tool, Mercurial doesn't impose any

5 policy on how groups of people ought to work together. However, if

6 you're new to distributed revision control, it helps to have some

7 tools and examples in mind when you're thinking about possible

8 workflow models.

10 \section{Mercurial's web interface}

12 Mercurial has a powerful web interface that provides several useful

13 capabilities.

15 For interactive use, the web interface lets you browse a single

16 repository or a collection of repositories. You can view the history

17 of a repository, examine each change (comments and diffs), and view

18 the contents of each directory and file.

20 In addition, the web interface provides RSS feeds of the changes in

21 its repositories. This lets you ``subscribe'' to a repository using

22 your favourite feed reader, and be notified automatically of

23 activity in that repository as soon as it happens. I find this much

24 more convenient than subscribing to a mailing list to which

25 notifications are sent, as it requires no additional configuration

26 on the part of whoever is administering

27 the repository.

29 The web interface also lets remote users clone a repository, pull

30 changes from it, and (when the server is configured to permit it)

31 push changes back to it. Mercurial's tunneling protocol

32 compresses data aggressively, so that it works efficiently even

33 over low-bandwidth network connections.

35 The easiest way to get started with the web interface is to use your

36 web browser to visit an existing repository, such as the master

37 Mercurial repository at \url{http://www.selenic.com/repo/hg?style=gitweb}.

39 If you're interested in providing a web interface to your own

40 repositories, Mercurial provides two ways to do this. The first is

41 the \hgcmd{serve} command, which is best suited to short-term,

42 ``lightweight'' serving. See section~\ref{sec:collab:serve} below for

43 details of how to use this command. If you have a long-lived

44 repository that you'd like to make permanently available, Mercurial

45 has built-in support for \command{ssh}, which lets people push changes

46 securely to a central repository, as documented in

47 section~\ref{sec:collab:ssh}. It's also common to publish a read-only

48 copy of the repository over HTTP using CGI, as described in

49 section~\ref{sec:collab:cgi}. Publishing over HTTP meets the needs

50 of people who don't have push access, and of those who want to use

51 web browsers to browse the history of the

52 repository.

54 \subsection{Working with multiple branches}

56 Projects of any significant size naturally tend to make progress on

57 several fronts simultaneously. In the case of software, it's common

58 for a project to go through periodic official releases. A release

59 might then go into ``maintenance mode'' for a while after its first

60 publication; maintenance releases tend to contain only bug fixes,

61 not new features. In parallel with these maintenance releases, one

62 or more future releases may be under development. People normally

63 use the word ``branch'' to refer to one of these many slightly

64 different directions in which development

65 is proceeding.

67 Mercurial is particularly well suited to managing a number of

68 simultaneous, but not identical, branches. Each ``development

69 direction'' can live in its own central repository, and you can

70 merge changes from one to another as the need arises. Because

71 repositories are independent of each other, unstable changes in a

72 development branch will never affect a stable branch unless someone

73 explicitly merges those changes in.

75 Here's an example of how this can work in practice. Let's say you

76 have one ``main branch'' on a central

77 server.

78 \interaction{branching.init}

79 People clone it, make changes locally, test them, and push them

80 back.

82 Once the main branch reaches a release milestone, you can use the

83 \hgcmd{tag} command to give a permanent name to the milestone revision.

84 \interaction{branching.tag}

85 Let's say some ongoing development occurs on the main branch.

86 \interaction{branching.main}

87 Using the tag that was recorded at the milestone, people who clone

88 that repository at any time in the future can use

89 \hgcmd{update} to get a copy of the working directory exactly as it

90 was when that tagged revision was

91 committed.

92 \interaction{branching.update}

94 In addition, immediately after the main branch is tagged, someone

95 can clone the main branch on the server to a new ``stable'' branch,

96 also on the server.

97 \interaction{branching.clone}

99 Someone who needs to make a change to the stable branch can then clone

100 \emph{that} repository, make their changes, commit, and push their

101 changes back there.

102 \interaction{branching.stable}

103 Because Mercurial repositories are independent, and Mercurial

104 doesn't move changes around automatically, the stable and main

105 branches are \emph{isolated} from each other.

106 The changes that you make on the main branch don't ``leak'' to the

107 stable branch, or vice versa.

108

109 You'll often want all of your bug fixes on the stable branch to show

110 up on the main branch, too. Rather than rewrite a bug fix on the

111 main branch, you can simply pull and merge changes from the stable

112 branch into the main branch, and Mercurial will bring those bug

113 fixes in for you.

114 \interaction{branching.merge}

115 The main branch will still contain changes that are not on the

116 stable branch, but it will also contain all of the bug fixes from

117 the stable branch. The stable branch remains unaffected by these changes.

118

119 \subsection{Feature branches}

120

121 For larger projects, an effective way to manage change is to break up

122 a team into smaller groups. Each group has a shared branch of its

123 own, cloned from a single ``master'' branch used by the entire

124 project. People working on an individual branch are typically quite

125 isolated from developments on other branches.

126

127 \begin{figure}[ht]

128 \centering

129 \grafix{feature-branches}

130 \caption{Feature branches}

131 \label{fig:collab:feature-branches}

132 \end{figure}

133

134 When a particular feature is deemed to be in suitable shape, someone

135 on that feature team pulls and merges from the master branch into the

136 feature branch, then pushes back up to the master branch.

137

138 \subsection{The release train}

139

140 Some projects are organised on a ``train'' basis: a release is

141 scheduled to happen every few months, and whatever features are ready

142 when the ``train'' is ready to leave are allowed in.

143

144 This model resembles working with feature branches. The difference is

145 that when a feature branch misses a train, someone on the feature team

146 pulls and merges the changes that went out on that train release into

147 the feature branch, and the team continues its work on top of that

148 release so that their feature can make the next release.

149

150 \subsection{The Linux kernel model}

151

152 The development of the Linux kernel has a shallow hierarchical

153 structure, surrounded by a cloud of apparent chaos. Because most

154 Linux developers use \command{git}, a distributed revision control

155 tool with capabilities similar to Mercurial, it's useful to describe

156 the way work flows in that environment; if you like the ideas, the

157 approach translates well across tools.

158

159 At the center of the community sits Linus Torvalds, the creator of

160 Linux. He publishes a single source repository that is considered the

161 ``authoritative'' current tree by the entire developer community.

162 Anyone can clone Linus's tree, but he is very choosy about whose trees

163 he pulls from.

164

165 Linus has a number of ``trusted lieutenants''. As a general rule, he

166 pulls whatever changes they publish, in most cases without even

167 reviewing those changes. Some of those lieutenants are generally

168 agreed to be ``maintainers'', responsible for specific subsystems

169 within the kernel. If a random kernel hacker wants to make a change

170 to a subsystem that they want to end up in Linus's tree, they must

171 find out who the subsystem's maintainer is, and ask that maintainer to

172 take their change. If the maintainer reviews their changes and agrees

173 to take them, they'll pass them along to Linus in due course.

174

175 Individual lieutenants have their own approaches to reviewing,

176 accepting, and publishing changes; and for deciding when to feed them

177 to Linus. In addition, there are several well known branches that

178 people use for different purposes. For example, a few people maintain

179 ``stable'' repositories of older versions of the kernel, to which they

180 apply critical fixes as needed. Some maintainers publish multiple

181 trees: one for experimental changes; one for changes that they are

182 about to feed upstream; and so on. Others just publish a single

183 tree.

184

185 This model has two notable features. The first is that it's ``pull

186 only''. You have to ask, convince, or beg another developer to take a

187 change from you, because there are almost no trees to which more than

188 one person can push, and there's no way to push changes into a tree

189 that someone else controls.

190

191 The second is that it's based on reputation and acclaim. If you're an

192 unknown, Linus will probably ignore changes from you without even

193 responding. But a subsystem maintainer will probably review them, and

194 will likely take them if they pass their criteria for suitability.

195 The more ``good'' changes you contribute to a maintainer, the more

196 likely they are to trust your judgment and accept your changes. If

197 you're well-known and maintain a long-lived branch for something Linus

198 hasn't yet accepted, people with similar interests may pull your

199 changes regularly to keep up with your work.

200

201 Reputation and acclaim don't necessarily cross subsystem or ``people''

202 boundaries. If you're a respected but specialised storage hacker, and

203 you try to fix a networking bug, that change will receive a level of

204 scrutiny from a network maintainer comparable to a change from a

205 complete stranger.

206

207 To people who come from more orderly project backgrounds, the

208 comparatively chaotic Linux kernel development process often seems

209 completely insane. It's subject to the whims of individuals; people

210 make sweeping changes whenever they deem it appropriate; and the pace

211 of development is astounding. And yet Linux is a highly successful,

212 well-regarded piece of software.

213

214 \subsection{Pull-only versus shared-push collaboration}

215

216 A perpetual source of heat in the open source community is whether a

217 development model in which people only ever pull changes from others

218 is ``better than'' one in which multiple people can push changes to a

219 shared repository.

220

221 Typically, the backers of the shared-push model use tools that

222 actively enforce this approach. If you're using a centralised

223 revision control tool such as Subversion, there's no way to make a

224 choice over which model you'll use: the tool gives you shared-push,

225 and if you want to do anything else, you'll have to roll your own

226 approach on top (such as applying a patch by hand).

227

228 A good distributed revision control tool, such as Mercurial, will

229 support both models. You and your collaborators can then structure

230 how you work together based on your own needs and preferences, not on

231 what contortions your tools force you into.

232

233 \subsection{Where collaboration meets branch management}

234

235 Once you and your team set up some shared repositories and start

236 propagating changes back and forth between local and shared repos, you

237 begin to face a related, but slightly different challenge: that of

238 managing the multiple directions in which your team may be moving at

239 once. Even though this subject is intimately related to how your team

240 collaborates, it's dense enough to merit treatment of its own, in

241 chapter~\ref{chap:branch}.

242

243 \section{The technical side of sharing}

244

245 The remainder of this chapter is devoted to the question of serving

246 data to your collaborators.

247

248 \section{Informal sharing with \hgcmd{serve}}

249 \label{sec:collab:serve}

250

251 Mercurial's \hgcmd{serve} command is wonderfully suited to small,

252 tight-knit, and fast-paced group environments. It also provides a

253 great way to get a feel for using Mercurial commands over a network.

254

255 Run \hgcmd{serve} inside a repository, and in under a second it will

256 bring up a specialised HTTP server; this will accept connections from

257 any client, and serve up data for that repository until you terminate

258 it. Anyone who knows the URL of the server you just started, and can

259 talk to your computer over the network, can then use a web browser or

260 Mercurial to read data from that repository. A URL for a

261 \hgcmd{serve} instance running on a laptop is likely to look something

262 like \Verb|http://my-laptop.local:8000/|.

263

264 The \hgcmd{serve} command is \emph{not} a general-purpose web server.

265 It can do only two things:

266 \begin{itemize}

267 \item Allow people to browse the history of the repository it's

268 serving, from their normal web browsers.

269 \item Speak Mercurial's wire protocol, so that people can

270 \hgcmd{clone} or \hgcmd{pull} changes from that repository.

271 \end{itemize}

272 In particular, \hgcmd{serve} won't allow remote users to \emph{modify}

273 your repository. It's intended for read-only use.

274

275 If you're getting started with Mercurial, there's nothing to prevent

276 you from using \hgcmd{serve} to serve up a repository on your own

277 computer, then using commands like \hgcmd{clone}, \hgcmd{incoming}, and

278 so on to talk to that server as if the repository were hosted remotely.

279 This can help you to quickly get acquainted with using commands on

280 network-hosted repositories.

281

282 \subsection{A few things to keep in mind}

283

284 Because it provides unauthenticated read access to all clients, you

285 should only use \hgcmd{serve} in an environment where you either don't

286 care, or have complete control over, who can access your network and

287 pull data from your repository.

288

289 The \hgcmd{serve} command knows nothing about any firewall software

290 you might have installed on your system or network. It cannot detect

291 or control your firewall software. If other people are unable to talk

292 to a running \hgcmd{serve} instance, the second thing you should do

293 (\emph{after} you make sure that they're using the correct URL) is

294 check your firewall configuration.

295

296 By default, \hgcmd{serve} listens for incoming connections on

297 port~8000. If another process is already listening on the port you

298 want to use, you can specify a different port to listen on using the

299 \hgopt{serve}{-p} option.

300

301 Normally, when \hgcmd{serve} starts, it prints no output, which can be

302 a bit unnerving. If you'd like to confirm that it is indeed running

303 correctly, and find out what URL you should send to your

304 collaborators, start it with the \hggopt{-v} option.
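If you'd rather find out whether port~8000 is already taken before starting \hgcmd{serve}, a small script can probe for you. The following is only a sketch in Python, not something Mercurial provides; the helper names \texttt{port\_is\_free} and \texttt{pick\_port} are invented for this illustration, and it assumes you are checking the loopback address of your own machine.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, which means
        # some process is already listening on that port.
        return probe.connect_ex((host, port)) != 0


def pick_port(preferred=8000, attempts=10):
    """Return the first free port at or above `preferred`, or None."""
    for port in range(preferred, preferred + attempts):
        if port_is_free(port):
            return port
    return None
```

The number that \texttt{pick\_port} returns is the value you would then hand to the \hgopt{serve}{-p} option.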

305

306 \section{Using the Secure Shell (ssh) protocol}

307 \label{sec:collab:ssh}

308

309 You can pull and push changes securely over a network connection using

310 the Secure Shell (\texttt{ssh}) protocol. To use this successfully,

311 you may have to do a little bit of configuration on the client or

312 server sides.

313

314 If you're not familiar with ssh, it's a network protocol that lets you

315 securely communicate with another computer. To use it with Mercurial,

316 you'll be setting up one or more user accounts on a server so that

317 remote users can log in and execute commands.

318

319 (If you \emph{are} familiar with ssh, you'll probably find some of the

320 material that follows to be elementary in nature.)

321

322 \subsection{How to read and write ssh URLs}

323

324 An ssh URL tends to look like this:

325 \begin{codesample2}

326 ssh://bos@hg.serpentine.com:22/hg/hgbook

327 \end{codesample2}

328 \begin{enumerate}

329 \item The ``\texttt{ssh://}'' part tells Mercurial to use the ssh

330 protocol.

331 \item The ``\texttt{bos@}'' component indicates what username to log

332 into the server as. You can leave this out if the remote username

333 is the same as your local username.

334 \item The ``\texttt{hg.serpentine.com}'' gives the hostname of the

335 server to log into.

336 \item The ``\texttt{:22}'' identifies the port number to connect to the server

337 on. The default port is~22, so you only need to specify this part

338 if you're \emph{not} using port~22.

339 \item The remainder of the URL is the local path to the repository on

340 the server.

341 \end{enumerate}

342

343 There's plenty of scope for confusion with the path component of ssh

344 URLs, as there is no standard way for tools to interpret it. Some

345 programs behave differently than others when dealing with these paths.

346 This isn't an ideal situation, but it's unlikely to change. Please

347 read the following paragraphs carefully.

348

349 Mercurial treats the path to a repository on the server as relative to

350 the remote user's home directory. For example, if user \texttt{foo}

351 on the server has a home directory of \dirname{/home/foo}, then an ssh

352 URL that contains a path component of \dirname{bar}

353 \emph{really} refers to the directory \dirname{/home/foo/bar}.

354

355 If you want to specify a path relative to another user's home

356 directory, you can use a path that starts with a tilde character

357 followed by the user's name (let's call them \texttt{otheruser}), like

358 this.

359 \begin{codesample2}

360 ssh://server/~otheruser/hg/repo

361 \end{codesample2}

362

363 And if you really want to specify an \emph{absolute} path on the

364 server, begin the path component with two slashes, as in this example.

365 \begin{codesample2}

366 ssh://server//absolute/path

367 \end{codesample2}
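To make the three path forms above concrete, here is a small Python sketch that splits an ssh URL and reports where its path would point on the server. It is not Mercurial's own parser; the function name \texttt{describe\_ssh\_url} and the \texttt{local\_user} default are assumptions made for this example, and it only encodes the rules stated in this section.

```python
from urllib.parse import urlsplit


def describe_ssh_url(url, local_user="me"):
    """Split an ssh:// URL following the rules described in the text.

    Returns a dict with the user, host, port, and the location that
    the path component refers to on the server.
    """
    parts = urlsplit(url)
    if parts.scheme != "ssh":
        raise ValueError("not an ssh URL: %r" % url)
    path = parts.path
    if path.startswith("//"):
        # Two leading slashes: an absolute path on the server.
        location = path[1:]
    elif path.startswith("/~"):
        # Tilde form: relative to another user's home directory.
        location = path[1:]
    else:
        # Default: relative to the remote user's home directory.
        location = "~/" + path.lstrip("/")
    return {
        "user": parts.username or local_user,
        "host": parts.hostname,
        "port": parts.port or 22,
        "location": location,
    }
```

For instance, \texttt{describe\_ssh\_url("ssh://server//absolute/path")} reports the location \texttt{/absolute/path}, while a path of \texttt{/hg/hgbook} is reported as \texttt{\~{}/hg/hgbook}, that is, inside the remote user's home directory.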

368

369 \subsection{Finding an ssh client for your system}

370

371 Almost every Unix-like system comes with OpenSSH preinstalled. If

372 you're using such a system, run \Verb|which ssh| to find out if

373 the \command{ssh} command is installed (it's usually in

374 \dirname{/usr/bin}). In the unlikely event that it isn't present,

375 take a look at your system documentation to figure out how to install

376 it.

377

378 On Windows, you'll first need to download a suitable ssh

379 client. There are two alternatives.

380 \begin{itemize}

381 \item Simon Tatham's excellent PuTTY package~\cite{web:putty} provides

382 a complete suite of ssh client commands.

383 \item If you have a high tolerance for pain, you can use the Cygwin

384 port of OpenSSH.

385 \end{itemize}

386 In either case, you'll need to edit your \hgini\ file to tell

387 Mercurial where to find the actual client command. For example, if

388 you're using PuTTY, you'll need to use the \command{plink} command as

389 a command-line ssh client.

390 \begin{codesample2}

391 [ui]

392 ssh = C:/path/to/plink.exe -ssh -i "C:/path/to/my/private/key"

393 \end{codesample2}

394

395 \begin{note}

396 The path to \command{plink} shouldn't contain any whitespace

397 characters, or Mercurial may not be able to run it correctly (so

398 putting it in \dirname{C:\textbackslash{}Program Files} is probably not a good

399 idea).

400 \end{note}

401

402 \subsection{Generating a key pair}

403

404 To avoid the need to repetitively type a password every time you need

405 to use your ssh client, I recommend generating a key pair. On a

406 Unix-like system, the \command{ssh-keygen} command will do the trick.

407 On Windows, if you're using PuTTY, the \command{puttygen} command is

408 what you'll need.

409

410 When you generate a key pair, it's usually \emph{highly} advisable to

411 protect it with a passphrase. (The only time that you might not want

412 to do this is when you're using the ssh protocol for automated tasks

413 on a secure network.)

414

415 Simply generating a key pair isn't enough, however. You'll need to

416 add the public key to the set of authorised keys for whatever user

417 you're logging in remotely as. For servers using OpenSSH (the vast

418 majority), this will mean adding the public key to a list in a file

419 called \sfilename{authorized\_keys} in their \sdirname{.ssh}

420 directory.

421

422 On a Unix-like system, your public key will have a \filename{.pub}

423 extension. If you're using \command{puttygen} on Windows, you can

424 save the public key to a file of your choosing, or paste it from the

425 window it's displayed in straight into the

426 \sfilename{authorized\_keys} file.
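Adding a key by hand is easy to get subtly wrong (a duplicated entry, or a key glued onto the end of the previous line). As an illustration only, the Python sketch below appends a public key to an \sfilename{authorized\_keys} file if it isn't already listed. The function name \texttt{authorize\_key} is made up for this example, and the restrictive permissions it sets are my assumption of a sensible default for an OpenSSH server.

```python
import os


def authorize_key(public_key_line, ssh_dir):
    """Append a public key to ssh_dir/authorized_keys unless already there.

    Returns True if the key was added, False if it was already present.
    """
    key = public_key_line.strip()
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, "authorized_keys")
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if key in (line.strip() for line in content.splitlines()):
        return False
    with open(path, "a") as handle:
        # Make sure the new key starts on a line of its own.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(key + "\n")
    # Keep the file writable by its owner only.
    os.chmod(path, 0o600)
    return True
```

You would run something like this on the server, as the user you intend to log in as, passing the contents of your \filename{.pub} file.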

427

428 \subsection{Using an authentication agent}

429

430 An authentication agent is a daemon that stores passphrases in memory

431 (so it will forget passphrases if you log out and log back in again).

432 An ssh client will notice if it's running, and query it for a

433 passphrase. If there's no authentication agent running, or the agent

434 doesn't store the necessary passphrase, you'll have to type your

435 passphrase every time Mercurial tries to communicate with a server on

436 your behalf (e.g.~whenever you pull or push changes).

437

438 The downside of storing passphrases in an agent is that it's possible

439 for a well-prepared attacker to recover the plain text of your

440 passphrases, in some cases even if your system has been power-cycled.

441 You should make your own judgment as to whether this is an acceptable

442 risk. It certainly saves a lot of repeated typing.

443

444 On Unix-like systems, the agent is called \command{ssh-agent}, and

445 it's often run automatically for you when you log in. You'll need to

446 use the \command{ssh-add} command to add passphrases to the agent's

447 store. On Windows, if you're using PuTTY, the \command{pageant}

448 command acts as the agent. It adds an icon to your system tray that

449 will let you manage stored passphrases.

450

451 \subsection{Configuring the server side properly}

452

453 Because ssh can be fiddly to set up if you're new to it, there's a

454 variety of things that can go wrong. Add Mercurial on top, and

455 there's plenty more scope for head-scratching. Most of these

456 potential problems occur on the server side, not the client side. The

457 good news is that once you've gotten a configuration working, it will

458 usually continue to work indefinitely.

459

460 Before you try using Mercurial to talk to an ssh server, it's best to

461 make sure that you can use the normal \command{ssh} or \command{putty}

462 command to talk to the server first. If you run into problems with

463 using these commands directly, Mercurial surely won't work. Worse, it

464 will obscure the underlying problem. Any time you want to debug

465 ssh-related Mercurial problems, you should drop back to making sure

466 that plain ssh client commands work first, \emph{before} you worry

467 about whether there's a problem with Mercurial.

468

469 The first thing to be sure of on the server side is that you can

470 actually log in from another machine at all. If you can't use

471 \command{ssh} or \command{putty} to log in, the error message you get

472 may give you a few hints as to what's wrong. The most common problems

473 are as follows.

474 \begin{itemize}

475 \item If you get a ``connection refused'' error, either there isn't an

476 SSH daemon running on the server at all, or it's inaccessible due to

477 firewall configuration.

478 \item If you get a ``no route to host'' error, you either have an

479 incorrect address for the server or a seriously locked down firewall

480 that won't admit its existence at all.

481 \item If you get a ``permission denied'' error, you may have mistyped

482 the username on the server, or you could have mistyped your key's

483 passphrase or the remote user's password.

484 \end{itemize}

485 In summary, if you're having trouble talking to the server's ssh

486 daemon, first make sure that one is running at all. On many systems

487 it will be installed, but disabled, by default. Once you're done with

488 this step, you should then check that the server's firewall is

489 configured to allow incoming connections on the port the ssh daemon is

490 listening on (usually~22). Don't worry about more exotic

491 possibilities for misconfiguration until you've checked these two

492 first.
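When it isn't obvious which of these failures you're facing, it can help to test the TCP connection on its own, without ssh in the picture. The Python sketch below does that and names the failure mode. The function \texttt{probe\_ssh\_port} is hypothetical, and its mapping from operating system errors to the messages above is a simplification.

```python
import errno
import socket


def probe_ssh_port(host, port=22, timeout=5.0):
    """Try a plain TCP connection and name the failure mode, if any."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening, or a firewall actively rejects us.
        return "connection refused"
    except socket.timeout:
        # Packets are probably being dropped silently.
        return "timed out"
    except socket.gaierror:
        # The host name could not be resolved at all.
        return "unknown host"
    except OSError as err:
        if err.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no route to host"
        return "error: %s" % err
```

A result of \texttt{"open"} only tells you that something accepted the connection on that port; it doesn't prove that the listener is an ssh daemon.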

493

494 If you're using an authentication agent on the client side to store

495 passphrases for your keys, you ought to be able to log into the server

496 without being prompted for a passphrase or a password. If you're

497 prompted for a passphrase, there are a few possible culprits.

498 \begin{itemize}

499 \item You might have forgotten to use \command{ssh-add} or

500 \command{pageant} to store the passphrase.

501 \item You might have stored the passphrase for the wrong key.

502 \end{itemize}

503 If you're being prompted for the remote user's password, there are

504 another few possible problems to check.

505 \begin{itemize}

506 \item Either the user's home directory or their \sdirname{.ssh}

507 directory might have excessively liberal permissions. As a result,

508 the ssh daemon will not trust or read their

509 \sfilename{authorized\_keys} file. For example, a group-writable

510 home or \sdirname{.ssh} directory will often cause this symptom.

511 \item The user's \sfilename{authorized\_keys} file may have a problem.

512 If anyone other than the user owns or can write to that file, the

513 ssh daemon will not trust or read it.

514 \end{itemize}
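The permission problems above are tedious to check by eye. The following Python sketch, meant to be run on the server, lists which of the three relevant paths are writable by group or others. It is a rough approximation of what the ssh daemon cares about, not a reimplementation of its checks; in particular it ignores file ownership, and the function names are invented for this example.

```python
import os
import stat


def too_permissive(path):
    """Return True if group or others may write to `path`."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))


def check_ssh_permissions(home):
    """List the paths under `home` that the ssh daemon may distrust."""
    candidates = [
        home,
        os.path.join(home, ".ssh"),
        os.path.join(home, ".ssh", "authorized_keys"),
    ]
    return [p for p in candidates if os.path.exists(p) and too_permissive(p)]
```

An empty list means none of the three paths is group-writable or world-writable; any path that does appear is a good candidate for a \command{chmod}.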

515

516 In an ideal world, you should be able to run the following command

517 successfully, and it should print exactly one line of output, the

518 current date and time.

519 \begin{codesample2}

520 ssh myserver date

521 \end{codesample2}

522

523 If, on your server, you have login scripts that print banners or other

524 junk even when running non-interactive commands like this, you should

525 fix them before you continue, so that they only print output if

526 they're run interactively. Otherwise these banners will at least

527 clutter up Mercurial's output. Worse, they could potentially cause

528 problems with running Mercurial commands remotely. Mercurial

529 tries to detect and ignore banners in non-interactive \command{ssh}

530 sessions, but it is not foolproof. (If you're editing your login

531 scripts on your server, the usual way to see if a login script is

532 running in an interactive shell is to check the return code from the

533 command \Verb|tty -s|.)
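The ``exactly one line of output'' test is easy to automate. Here is a Python sketch that runs a command and checks that it succeeded and printed a single line. The function name \texttt{prints\_exactly\_one\_line} is made up for this example, and it deliberately counts anything written to standard error as output too, so a stray warning or banner makes the check fail.

```python
import subprocess


def prints_exactly_one_line(argv, timeout=30):
    """Run argv and report whether it succeeded with one line of output."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        # Fold stderr into stdout so that banners printed there count.
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        timeout=timeout,
    )
    lines = result.stdout.splitlines()
    return result.returncode == 0 and len(lines) == 1
```

With a server alias of \texttt{myserver}, as in the command above, you would call it as \texttt{prints\_exactly\_one\_line(["ssh", "myserver", "date"])}.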

534

535 Once you've verified that plain old ssh is working with your server,

536 the next step is to ensure that Mercurial runs on the server. The

537 following command should run successfully:

538 \begin{codesample2}

539 ssh myserver hg version

540 \end{codesample2}

541 If you see an error message instead of normal \hgcmd{version} output,

542 this is usually because you haven't installed Mercurial to

543 \dirname{/usr/bin}. Don't worry if this is the case; you don't need

544 to do that. But you should check for a few possible problems.

545 \begin{itemize}

546 \item Is Mercurial really installed on the server at all? I know this

547 sounds trivial, but it's worth checking!

548 \item Maybe your shell's search path (usually set via the \envar{PATH}

549 environment variable) is simply misconfigured.

550 \item Perhaps your \envar{PATH} environment variable is only being set

551 to point to the location of the \command{hg} executable if the login

552 session is interactive. This can happen if you're setting the path

553 in the wrong shell login script. See your shell's documentation for

554 details.

555 \item The \envar{PYTHONPATH} environment variable may need to contain

556 the path to the Mercurial Python modules. It might not be set at

557 all; it could be incorrect; or it may be set only if the login is

558 interactive.

559 \end{itemize}
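One way to track down a search path problem is to capture the value of \envar{PATH} in both an interactive and a non-interactive session on the server, and compare the two. The Python sketch below shows the comparison; how you capture the two values is up to you, and the function names are invented for this illustration.

```python
import os
import shutil


def find_command(name, search_path):
    """Return the full path of `name` on `search_path`, or None."""
    return shutil.which(name, path=search_path)


def missing_directories(interactive_path, noninteractive_path):
    """Directories in the interactive PATH that the other one lacks."""
    seen = set(noninteractive_path.split(os.pathsep))
    return [d for d in interactive_path.split(os.pathsep) if d not in seen]
```

If \texttt{find\_command("hg", ...)} succeeds with the interactive value but returns \texttt{None} with the non-interactive one, the directories reported by \texttt{missing\_directories} are the ones being set in the wrong login script.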

560

561 If you can run \hgcmd{version} over an ssh connection, well done!

562 You've got the server and client sorted out. You should now be able

563 to use Mercurial to access repositories hosted by that username on

564 that server. If you run into problems with Mercurial and ssh at this

565 point, try using the \hggopt{--debug} option to get a clearer picture

566 of what's going on.

567

568 \subsection{Using compression with ssh}

569

570 Mercurial does not compress data when it uses the ssh protocol,

571 because the ssh protocol can transparently compress data. However,

572 the default behaviour of ssh clients is \emph{not} to request

573 compression.

574

575 Over any network other than a fast LAN (even a wireless network),

576 using compression is likely to significantly speed up Mercurial's

577 network operations. For example, over a WAN, someone measured

578 compression as reducing the amount of time required to clone a

579 particularly large repository from~51 minutes to~17 minutes.

580

581 Both \command{ssh} and \command{plink} accept a \cmdopt{ssh}{-C}

582 option which turns on compression. You can easily edit your \hgrc\ to

583 enable compression for all of Mercurial's uses of the ssh protocol.

584 \begin{codesample2}

585 [ui]

586 ssh = ssh -C

587 \end{codesample2}

588

589 If you use \command{ssh}, you can configure it to always use

590 compression when talking to your server. To do this, edit your

591 \sfilename{.ssh/config} file (which may not yet exist), as follows.

592 \begin{codesample2}

593 Host hg

594 Compression yes

595 HostName hg.example.com

596 \end{codesample2}

597 This defines an alias, \texttt{hg}. When you use it on the

598 \command{ssh} command line or in a Mercurial \texttt{ssh}-protocol

599 URL, it will cause \command{ssh} to connect to \texttt{hg.example.com}

600 and use compression. This gives you both a shorter name to type and

601 compression, each of which is a good thing in its own right.
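To get a feel for why compression pays off on source code, you can measure how well a sample of text compresses. The Python sketch below uses the standard \texttt{zlib} module, on the assumption that it is a reasonable stand-in for the compression your ssh client applies. The function name \texttt{compression\_ratio} is made up, and the numbers it gives are only indicative, since real repository traffic will compress differently.

```python
import zlib


def compression_ratio(data, level=6):
    """Return the original size divided by the zlib-compressed size."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


if __name__ == "__main__":
    # A deliberately repetitive sample, loosely resembling source code.
    sample = b"def frobnicate(x):\n    return x + 1\n" * 200
    print("ratio: %.1f" % compression_ratio(sample))
```

A ratio well above~1 suggests that the \cmdopt{ssh}{-C} option is worth enabling on a slow link.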

602

603 \section{Serving over HTTP using CGI}

604 \label{sec:collab:cgi}

605

606 Depending on how ambitious you are, configuring Mercurial's CGI

607 interface can take anything from a few moments to several hours.

608

609 We'll begin with the simplest of examples, and work our way towards a

610 more complex configuration. Even for the most basic case, you're

611 almost certainly going to need to read and modify your web server's

612 configuration.

613

614 \begin{note}

615 Configuring a web server is a complex, fiddly, and highly

616 system-dependent activity. I can't possibly give you instructions

617 that will cover anything like all of the cases you will encounter.

618 Please use your discretion and judgment in following the sections

619 below. Be prepared to make plenty of mistakes, and to spend a lot

620 of time reading your server's error logs.

621 \end{note}

622

623 \subsection{Web server configuration checklist}

624

625 Before you continue, do take a few moments to check a few aspects of

626 your system's setup.

627

628 \begin{enumerate}

629 \item Do you have a web server installed at all? Mac OS X ships with

630 Apache, but many other systems may not have a web server installed.

631 \item If you have a web server installed, is it actually running? On

632 most systems, even if one is present, it will be disabled by

633 default.

634 \item Is your server configured to allow you to run CGI programs in

635 the directory where you plan to do so? Most servers default to

636 explicitly disabling the ability to run CGI programs.

637 \end{enumerate}

638

639 If you don't have a web server installed, and don't have substantial

640 experience configuring Apache, you should consider using the

641 \texttt{lighttpd} web server instead of Apache. Apache has a

642 well-deserved reputation for baroque and confusing configuration.

643 While \texttt{lighttpd} is less capable in some ways than Apache, most

644 of these capabilities are not relevant to serving Mercurial

645 repositories. And \texttt{lighttpd} is undeniably \emph{much} easier

646 to get started with than Apache.

647

648 \subsection{Basic CGI configuration}

649

650 On Unix-like systems, it's common for users to have a subdirectory

651 named something like \dirname{public\_html} in their home directory,

652 from which they can serve up web pages. A file named \filename{foo}

653 in this directory will be accessible at a URL of the form

654 \texttt{http://www.example.com/\~{}username/foo}.

655

656 To get started, find the \sfilename{hgweb.cgi} script that should be

657 present in your Mercurial installation. If you can't quickly find a

658 local copy on your system, simply download one from the master

659 Mercurial repository at

660 \url{http://www.selenic.com/repo/hg/raw-file/tip/hgweb.cgi}.

661

662 You'll need to copy this script into your \dirname{public\_html}

663 directory, and ensure that it's executable.

664 \begin{codesample2}

665 cp .../hgweb.cgi ~/public_html

666 chmod 755 ~/public_html/hgweb.cgi

667 \end{codesample2}

668 The \texttt{755} argument to \command{chmod} is a little more general

669 than just making the script executable: it ensures that the script is

670 executable by anyone, and that ``group'' and ``other'' write

671 permissions are \emph{not} set. If you were to leave those write

672 permissions enabled, Apache's \texttt{suexec} subsystem would likely

673 refuse to execute the script. In fact, \texttt{suexec} also insists

674 that the \emph{directory} in which the script resides must not be

675 writable by others.

676 \begin{codesample2}

677 chmod 755 ~/public_html

678 \end{codesample2}

679

680 \subsubsection{What could \emph{possibly} go wrong?}

681 \label{sec:collab:wtf}

682

683 Once you've copied the CGI script into place, go into a web browser,

684 and try to open the URL \url{http://myhostname/~myuser/hgweb.cgi},

685 \emph{but} brace yourself for instant failure. There's a high

686 probability that trying to visit this URL will fail, and there are

687 many possible reasons for this. In fact, you're likely to stumble

688 over almost every one of the possible errors below, so please read

689 carefully. The following are all of the problems I ran into on a

690 system running Fedora~7, with a fresh installation of Apache, and a

691 user account that I created specially to perform this exercise.

692

693 Your web server may have per-user directories disabled. If you're

694 using Apache, search your config file for a \texttt{UserDir}

695 directive. If there's none present, per-user directories will be

696 disabled. If one exists, but its value is \texttt{disabled}, then

697 per-user directories will be disabled. Otherwise, the string after

698 \texttt{UserDir} gives the name of the subdirectory that Apache will

699 look in under your home directory, for example \dirname{public\_html}.
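
For reference, an enabled per-user directory setting usually amounts
to a single line along the following lines. Which file it lives in,
and whether the \texttt{mod\_userdir} module is loaded at all, varies
from one distribution to the next, so treat this as a sketch rather
than something to paste in blindly.
\begin{codesample2}
UserDir public_html
\end{codesample2}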

700

701 Your file access permissions may be too restrictive. The web server

702 must be able to traverse your home directory and directories under

703 your \dirname{public\_html} directory, and read files under the latter

704 too. Here's a quick recipe to help you to make your permissions more

705 appropriate.

706 \begin{codesample2}

707 chmod 755 ~

708 find ~/public_html -type d -print0 | xargs -0r chmod 755

709 find ~/public_html -type f -print0 | xargs -0r chmod 644

710 \end{codesample2}

711

712 The other possibility with permissions is that you might get a

713 completely empty window when you try to load the script. In this

714 case, it's likely that your access permissions are \emph{too

715 permissive}. Apache's \texttt{suexec} subsystem won't execute a

716 script that's group-~or world-writable, for example.

717

718 Your web server may be configured to disallow execution of CGI

719 programs in your per-user web directory. Here's Apache's

720 default per-user configuration from my Fedora system.

721 \begin{codesample2}

722 <Directory /home/*/public_html>

723 AllowOverride FileInfo AuthConfig Limit

724 Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec

725 <Limit GET POST OPTIONS>

726 Order allow,deny

727 Allow from all

728 </Limit>

729 <LimitExcept GET POST OPTIONS>

730 Order deny,allow

731 Deny from all

732 </LimitExcept>

733 </Directory>

734 \end{codesample2}

735 If you find a similar-looking \texttt{Directory} group in your Apache

736 configuration, the directive to look at inside it is \texttt{Options}.

737 Add \texttt{ExecCGI} to the end of this list if it's missing, and

738 restart the web server.
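
Applied to the Fedora configuration shown above, the edited
\texttt{Options} line would read as follows. Your own list of options
may well differ; the only part that matters here is the trailing
\texttt{ExecCGI}.
\begin{codesample2}
Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec ExecCGI
\end{codesample2}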

739

740 If you find that Apache serves you the text of the CGI script instead

741 of executing it, you may need to either uncomment (if already present)

742 or add a directive like this.

743 \begin{codesample2}

744 AddHandler cgi-script .cgi

745 \end{codesample2}

746

747 The next possibility is that you might be served with a colourful

748 Python backtrace claiming that it can't import a

749 \texttt{mercurial}-related module. This is actually progress! The

750 server is now capable of executing your CGI script. This error is

751 only likely to occur if you're running a private installation of

752 Mercurial, instead of a system-wide version. Remember that the web

753 server runs the CGI program without any of the environment variables

754 that you take for granted in an interactive session. If this error

755 happens to you, edit your copy of \sfilename{hgweb.cgi} and follow the

756 directions inside it to correctly set your \envar{PYTHONPATH}

757 environment variable.

758

759 Finally, you are \emph{certain} to be served with another colourful

760 Python backtrace: this one will complain that it can't find

761 \dirname{/path/to/repository}. Edit your \sfilename{hgweb.cgi} script

762 and replace the \dirname{/path/to/repository} string with the complete

763 path to the repository you want to serve up.

764

765 At this point, when you try to reload the page, you should be

766 presented with a nice HTML view of your repository's history. Whew!

767

768 \subsubsection{Configuring lighttpd}

769

770 To be exhaustive in my experiments, I tried configuring the

771 increasingly popular \texttt{lighttpd} web server to serve the same

772 repository as I described with Apache above. I had already overcome

773 all of the problems I outlined with Apache, many of which are not

774 server-specific. As a result, I was fairly sure that my file and

775 directory permissions were good, and that my \sfilename{hgweb.cgi}

776 script was properly edited.

777

778 Once I had Apache running, getting \texttt{lighttpd} to serve the

779 repository was a snap (in other words, even if you're trying to use

780 \texttt{lighttpd}, you should read the Apache section). I first had

781 to edit the \texttt{mod\_access} section of its config file to enable

782 \texttt{mod\_cgi} and \texttt{mod\_userdir}, both of which were

783 disabled by default on my system. I then added a few lines to the end

784 of the config file, to configure these modules.

785 \begin{codesample2}

786 userdir.path = "public_html"

787 cgi.assign = ( ".cgi" => "" )

788 \end{codesample2}

789 With this done, \texttt{lighttpd} ran immediately for me. If I had

790 configured \texttt{lighttpd} before Apache, I'd almost certainly have

791 run into many of the same system-level configuration problems as I did

792 with Apache. However, I found \texttt{lighttpd} to be noticeably

793 easier to configure than Apache, even though I've used Apache for over

794 a decade, and this was my first exposure to \texttt{lighttpd}.
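
As an illustration only, enabling the two modules mentioned above
generally means making sure they appear, uncommented, in the
\texttt{server.modules} list of the \texttt{lighttpd} config file.
The exact contents and ordering of that list on your system will
almost certainly differ from this sketch.
\begin{codesample2}
server.modules = (
    "mod_access",
    "mod_userdir",
    "mod_cgi"
)
\end{codesample2}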

795

796 \subsection{Sharing multiple repositories with one CGI script}

797

798 The \sfilename{hgweb.cgi} script only lets you publish a single

799 repository, which is an annoying restriction. If you want to publish

800 more than one without burdening yourself with multiple copies of the

801 same script, each with a different name, a better choice is to use the

802 \sfilename{hgwebdir.cgi} script.

803

804 The procedure to configure \sfilename{hgwebdir.cgi} is only a little

805 more involved than for \sfilename{hgweb.cgi}. First, you must obtain

806 a copy of the script. If you don't have one handy, you can download a

807 copy from the master Mercurial repository at

808 \url{http://www.selenic.com/repo/hg/raw-file/tip/hgwebdir.cgi}.

809

810 You'll need to copy this script into your \dirname{public\_html}

811 directory, and ensure that it's executable.

812 \begin{codesample2}

813 cp .../hgwebdir.cgi ~/public_html

814 chmod 755 ~/public_html ~/public_html/hgwebdir.cgi

815 \end{codesample2}

816 With basic configuration out of the way, try to visit

817 \url{http://myhostname/~myuser/hgwebdir.cgi} in your browser. It

818 should display an empty list of repositories. If you get a blank

819 window or error message, try walking through the list of potential

820 problems in section~\ref{sec:collab:wtf}.

821

822 The \sfilename{hgwebdir.cgi} script relies on an external

823 configuration file. By default, it searches for a file named

824 \sfilename{hgweb.config} in the same directory as itself. You'll need

825 to create this file, and make it world-readable. The format of the

826 file is similar to a Windows ``ini'' file, as understood by Python's

827 \texttt{ConfigParser}~\cite{web:configparser} module.

828

829 The easiest way to configure \sfilename{hgwebdir.cgi} is with a

830 section named \texttt{collections}. This will automatically publish

831 \emph{every} repository under the directories you name. The section

832 should look like this:

833 \begin{codesample2}

834 [collections]

835 /my/root = /my/root

836 \end{codesample2}

837 Mercurial interprets this by looking at the directory name on the

838 \emph{right} hand side of the ``\texttt{=}'' sign; finding

839 repositories in that directory hierarchy; and using the text on the

840 \emph{left} to strip off matching text from the names it will actually

841 list in the web interface. The remaining component of a path after

842 this stripping has occurred is called a ``virtual path''.

843

844 Given the example above, if we have a repository whose local path is

845 \dirname{/my/root/this/repo}, the CGI script will strip the leading

846 \dirname{/my/root} from the name, and publish the repository with a

847 virtual path of \dirname{this/repo}. If the base URL for our CGI

848 script is \url{http://myhostname/~myuser/hgwebdir.cgi}, the complete

849 URL for that repository will be

850 \url{http://myhostname/~myuser/hgwebdir.cgi/this/repo}.

851

852 If we replace \dirname{/my/root} on the left hand side of this example

853 with \dirname{/my}, then \sfilename{hgwebdir.cgi} will only strip off

854 \dirname{/my} from the repository name, and will give us a virtual

855 path of \dirname{root/this/repo} instead of \dirname{this/repo}.
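
Written out, that variation of the configuration would look like
this. The paths are the same illustrative ones used in the example
above.
\begin{codesample2}
[collections]
/my = /my/root
\end{codesample2}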

856

857 The \sfilename{hgwebdir.cgi} script will recursively search each

858 directory listed in the \texttt{collections} section of its

859 configuration file, but it will \emph{not} recurse into the

860 repositories it finds.

861

862 The \texttt{collections} mechanism makes it easy to publish many

863 repositories in a ``fire and forget'' manner. You only need to set up

864 the CGI script and configuration file one time. Afterwards, you can

865 publish or unpublish a repository at any time by simply moving it

866 into, or out of, the directory hierarchy in which you've configured

867 \sfilename{hgwebdir.cgi} to look.

868

869 \subsubsection{Explicitly specifying which repositories to publish}

870

871 In addition to the \texttt{collections} mechanism, the

872 \sfilename{hgwebdir.cgi} script allows you to publish a specific list

873 of repositories. To do so, create a \texttt{paths} section, with

874 contents of the following form.

875 \begin{codesample2}

876 [paths]

877 repo1 = /my/path/to/some/repo

878 repo2 = /some/path/to/another

879 \end{codesample2}

880 In this case, the virtual path (the component that will appear in a

881 URL) is on the left hand side of each definition, while the path to

882 the repository is on the right. Notice that there does not need to be

883 any relationship between the virtual path you choose and the location

884 of a repository in your filesystem.

885

886 If you wish, you can use both the \texttt{collections} and

887 \texttt{paths} mechanisms simultaneously in a single configuration

888 file.
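
A combined \sfilename{hgweb.config} might look something like the
following sketch, which simply reuses the illustrative paths from the
two earlier examples. When you mix the mechanisms like this, it is
worth checking by hand that no two repositories end up with the same
virtual path.
\begin{codesample2}
[collections]
/my/root = /my/root

[paths]
repo1 = /my/path/to/some/repo
repo2 = /some/path/to/another
\end{codesample2}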

889

890 \begin{note}

891 If multiple repositories have the same virtual path,

892 \sfilename{hgwebdir.cgi} will not report an error. Instead, it will

893 behave unpredictably.

894 \end{note}

895

896 \subsection{Downloading source archives}

897

898 Mercurial's web interface lets users download an archive of any

899 revision. This archive will contain a snapshot of the working

900 directory as of that revision, but it will not contain a copy of the

901 repository data.

902

903 By default, this feature is not enabled. To enable it, you'll need to

904 add an \rcitem{web}{allow\_archive} item to the \rcsection{web}

905 section of your \hgrc.

906

907 \subsection{Web configuration options}

908

909 Mercurial's web interfaces (the \hgcmd{serve} command, and the

910 \sfilename{hgweb.cgi} and \sfilename{hgwebdir.cgi} scripts) have a

911 number of configuration options that you can set. These belong in a

912 section named \rcsection{web}.

913 \begin{itemize}

914 \item[\rcitem{web}{allow\_archive}] Determines which (if any) archive

915 download mechanisms Mercurial supports. If you enable this

916 feature, users of the web interface will be able to download an

917 archive of whatever revision of a repository they are viewing.

918 To enable the archive feature, this item must take the form of a

919 sequence of words drawn from the list below.

920 \begin{itemize}

921 \item[\texttt{bz2}] A \command{tar} archive, compressed using

922 \texttt{bzip2} compression. This has the best compression ratio,

923 but uses the most CPU time on the server.

924 \item[\texttt{gz}] A \command{tar} archive, compressed using

925 \texttt{gzip} compression.

926 \item[\texttt{zip}] A \command{zip} archive, compressed using DEFLATE

927 compression. This format has the worst compression ratio, but is

928 widely used in the Windows world.

929 \end{itemize}

930 If you provide an empty list, or don't have an

931 \rcitem{web}{allow\_archive} entry at all, this feature will be

932 disabled. Here is an example of how to enable all three supported

933 formats.

934 \begin{codesample4}

935 [web]

936 allow_archive = bz2 gz zip

937 \end{codesample4}

938 \item[\rcitem{web}{allowpull}] Boolean. Determines whether the web

939 interface allows remote users to \hgcmd{pull} and \hgcmd{clone} this

940 repository over~HTTP. If set to \texttt{no} or \texttt{false}, only

941 the ``human-oriented'' portion of the web interface is available.

942 \item[\rcitem{web}{contact}] String. A free-form (but preferably

943 brief) string identifying the person or group in charge of the

944 repository. This often contains the name and email address of a

945 person or mailing list. It often makes sense to place this entry in

946 a repository's own \sfilename{.hg/hgrc} file, but it can make sense

947 to use in a global \hgrc\ if every repository has a single

948 maintainer.

949 \item[\rcitem{web}{maxchanges}] Integer. The default maximum number

950 of changesets to display in a single page of output.

951 \item[\rcitem{web}{maxfiles}] Integer. The default maximum number

952 of modified files to display in a single page of output.

953 \item[\rcitem{web}{stripes}] Integer. If the web interface displays

954 alternating ``stripes'' to make it easier to visually align rows

955 when you are looking at a table, this number controls the number of

956 rows in each stripe.

957 \item[\rcitem{web}{style}] Controls the template Mercurial uses to

958 display the web interface. Mercurial ships with two web templates,

959 named \texttt{default} and \texttt{gitweb} (the latter is much more

960 visually attractive). You can also specify a custom template of

961 your own; see chapter~\ref{chap:template} for details. Here, you

962 can see how to enable the \texttt{gitweb} style.

963 \begin{codesample4}

964 [web]

965 style = gitweb

966 \end{codesample4}

967 \item[\rcitem{web}{templates}] Path. The directory in which to search

968 for template files. By default, Mercurial searches in the directory

969 in which it was installed.

970 \end{itemize}
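
To tie several of these items together, here is a sketch of a
\rcsection{web} section that turns the web interface into a
browse-only view and adjusts its paging. The numeric values are
arbitrary choices for illustration, not recommendations.
\begin{codesample2}
[web]
allowpull = no
maxchanges = 20
maxfiles = 20
stripes = 2
\end{codesample2}
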

971 If you are using \sfilename{hgwebdir.cgi}, you can place a few

972 configuration items in a \rcsection{web} section of the

973 \sfilename{hgweb.config} file instead of a \hgrc\ file, for

974 convenience. These items are \rcitem{web}{motd} and

975 \rcitem{web}{style}.
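
For example, a \sfilename{hgweb.config} could carry a section like
the one below alongside its \texttt{collections} or \texttt{paths}
entries. The message text is invented for this sketch; I am assuming
you want a short ``message of the day'' shown by the web interface.
\begin{codesample2}
[web]
style = gitweb
motd = Scheduled maintenance this weekend
\end{codesample2}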

976

977 \subsubsection{Options specific to an individual repository}

978

979 A few \rcsection{web} configuration items ought to be placed in a

980 repository's local \sfilename{.hg/hgrc}, rather than a user's or

981 global \hgrc.

982 \begin{itemize}

983 \item[\rcitem{web}{description}] String. A free-form (but preferably

984 brief) string that describes the contents or purpose of the

985 repository.

986 \item[\rcitem{web}{name}] String. The name to use for the repository

987 in the web interface. This overrides the default name, which is the

988 last component of the repository's path.

989 \end{itemize}
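
A repository's \sfilename{.hg/hgrc} using these items might look
like the following sketch. The project name, description, and contact
details are all made up for the example.
\begin{codesample2}
[web]
name = myproject
description = Source code for the myproject tools
contact = Jane Doe <jane@example.com>
\end{codesample2}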

990

991 \subsubsection{Options specific to the \hgcmd{serve} command}

992

993 Some of the items in the \rcsection{web} section of a \hgrc\ file are

994 only for use with the \hgcmd{serve} command.

995 \begin{itemize}

996 \item[\rcitem{web}{accesslog}] Path. The name of a file into which to

997 write an access log. By default, the \hgcmd{serve} command writes

998 this information to standard output, not to a file. Log entries are

999 written in the standard ``combined'' file format used by almost all

1000 web servers.

1001 \item[\rcitem{web}{address}] String. The local address on which the

1002 server should listen for incoming connections. By default, the

1003 server listens on all addresses.

1004 \item[\rcitem{web}{errorlog}] Path. The name of a file into which to

1005 write an error log. By default, the \hgcmd{serve} command writes this

1006 information to standard error, not to a file.

1007 \item[\rcitem{web}{ipv6}] Boolean. Whether to use the IPv6 protocol.

1008 By default, IPv6 is not used.

1009 \item[\rcitem{web}{port}] Integer. The TCP~port number on which the

1010 server should listen. The default port number used is~8000.

1011 \end{itemize}
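
As a sketch, the following settings would make \hgcmd{serve} listen
only on the local machine, on a non-default port, and write its logs
to files. The port number and log file locations are arbitrary
examples, so substitute whatever suits your system.
\begin{codesample2}
[web]
address = 127.0.0.1
port = 8080
accesslog = /home/myuser/hg-access.log
errorlog = /home/myuser/hg-error.log
\end{codesample2}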

1012

1013 \subsubsection{Choosing the right \hgrc\ file to add \rcsection{web}

1014 items to}

1015

1016 It is important to remember that a web server like Apache or

1017 \texttt{lighttpd} will run under a user~ID that is different to yours.

1018 CGI scripts run by your server, such as \sfilename{hgweb.cgi}, will

1019 usually also run under that user~ID.

1020

1021 If you add \rcsection{web} items to your own personal \hgrc\ file, CGI

1022 scripts won't read that \hgrc\ file. Those settings will thus only

1023 affect the behaviour of the \hgcmd{serve} command when you run it. To

1024 cause CGI scripts to see your settings, either create a \hgrc\ file in

1025 the home directory of the user ID that runs your web server, or add

1026 those settings to a system-wide \hgrc\ file.
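
On most Unix-like systems, the system-wide file is
\sfilename{/etc/mercurial/hgrc}, though your installation may look
elsewhere. A minimal sketch of settings placed there, with an
invented contact address, could be:
\begin{codesample2}
[web]
allow_archive = gz zip
contact = hg-admin@example.com
\end{codesample2}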

1027

1028

1029 %%% Local Variables:

1030 %%% mode: latex

1031 %%% TeX-master: "00book"

1032 %%% End: