%% hgbook: en/collab.tex @ 180:6413f88338df
%% description: Point to chapter on undoing mistakes.
%% author: Bryan O'Sullivan <bos@serpentine.com>
%% date: Fri Mar 30 23:20:27 2007 -0700 (2007-03-30)
%% parents: 7355af913937
%% children: 7b812c428074

\chapter{Collaborating with other people}
\label{cha:collab}

As a completely decentralised tool, Mercurial doesn't impose any
policy on how people ought to work with each other.  However, if
you're new to distributed revision control, it helps to have some
tools and examples in mind when you're thinking about possible
workflow models.

\section{Collaboration models}

With a suitably flexible tool, making decisions about workflow is much
more of a social engineering challenge than a technical one.
Mercurial imposes few limitations on how you can structure the flow of
work in a project, so it's up to you and your group to set up and live
with a model that matches your own particular needs.

\subsection{Factors to keep in mind}

The most important aspect of any model that you must keep in mind is
how well it matches the needs and capabilities of the people who will
be using it.  This might seem self-evident; even so, you still can't
afford to forget it for a moment.

I once put together a workflow model that seemed to make perfect sense
to me, but that caused a considerable amount of consternation and
strife within my development team.  In spite of my attempts to explain
why we needed a complex set of branches, and how changes ought to flow
between them, a few team members revolted.  Even though they were
smart people, they didn't want to pay attention to the constraints we
were operating under, or face the consequences of those constraints in
the details of the model that I was advocating.

Don't sweep foreseeable social or technical problems under the rug.
Whatever scheme you put into effect, you should plan for mistakes and
problem scenarios.  Consider adding automated machinery to prevent, or
quickly recover from, trouble that you can anticipate.  As an example,
if you intend to have a branch with not-for-release changes in it,
you'd do well to think early about the possibility that someone might
accidentally merge those changes into a release branch.  You could
avoid this particular problem by writing a hook that prevents changes
from being merged from an inappropriate branch.
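
One way to sketch such a safeguard, assuming that not-for-release work
is committed on a named branch called \texttt{unstable} (a hypothetical
name) and that you are running a reasonably recent Mercurial, is a
\texttt{pretxnchangegroup} hook in the release repository.  The script
path below is likewise an assumption, not part of any standard
installation.
\begin{verbatim}
# In the release repository's .hg/hgrc:
[hooks]
pretxnchangegroup.no-unstable = /usr/local/bin/reject-unstable

# /usr/local/bin/reject-unstable (illustrative shell script):
#!/bin/sh
# HG_NODE names the first incoming changeset; everything from it
# up to tip is new.  Fail if any of it is on the unstable branch.
if hg log -r "$HG_NODE:tip" --template '{branch}\n' \
        | grep -qx unstable; then
    echo "refusing changes from the unstable branch" >&2
    exit 1
fi
exit 0
\end{verbatim}
When the script exits with a non-zero status, Mercurial rolls back
the incoming changes instead of recording them.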

\subsection{Informal anarchy}

I wouldn't suggest an ``anything goes'' approach as something
sustainable, but it's a model that's easy to grasp, and it works
perfectly well in a few unusual situations.

As one example, many projects have a loose-knit group of collaborators
who rarely physically meet each other.  Some groups like to overcome
the isolation of working at a distance by organising occasional
``sprints''.  In a sprint, a number of people get together in a single
location (a company's conference room, a hotel meeting room, that kind
of place) and spend several days more or less locked in there, hacking
intensely on a handful of projects.

A sprint is the perfect place to use the \hgcmd{serve} command, since
\hgcmd{serve} does not require any fancy server infrastructure.  You
can get started with \hgcmd{serve} in moments, by reading
section~\ref{sec:collab:serve} below.  Then simply tell the person
next to you that you're running a server, send the URL to them in an
instant message, and you immediately have a quick-turnaround way to
work together.  They can type your URL into their web browser and
quickly review your changes; or they can pull a bugfix from you and
verify it; or they can clone a branch containing a new feature and try
it out.
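
As a rough sketch of that exchange, assuming your machine is reachable
as \Verb|my-laptop.local| (a made-up host name) and \hgcmd{serve} is
using its default port, the two sides might type something like this:
\begin{verbatim}
# On your machine, inside the repository you want to share:
hg serve

# On your neighbour's machine, to fetch a bugfix into an
# existing clone of the same project:
hg pull http://my-laptop.local:8000/

# Or, to try out a new feature in a fresh clone:
hg clone http://my-laptop.local:8000/ try-feature
\end{verbatim}
The directory name \Verb|try-feature| is only illustrative.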

The charm, and the problem, with doing things in an ad hoc fashion
like this is that only people who know about your changes, and where
they are, can see them.  Such an informal approach simply doesn't
scale beyond a handful of people, because each individual needs to know
about $n$ different repositories to pull from.
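
To put a rough number on this, suppose each of $n$ collaborators
publishes one repository, and everyone wants to see everyone else's
work.  Under that assumption, each person tracks $n-1$ other
repositories, so the group as a whole maintains
\[
  n(n-1)
\]
pull relationships: $20$ for a team of $5$, but $380$ for a team of
$20$.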

\subsection{A single central repository}

For smaller projects migrating from a centralised revision control
tool, perhaps the easiest way to get started is to have changes flow
through a single shared central repository.  This is also the
most common ``building block'' for more ambitious workflow schemes.

Contributors start by cloning a copy of this repository.  They can
pull changes from it whenever they need to, and some (perhaps all)
developers have permission to push a change back when they're ready
for other people to see it.

Under this model, it can still often make sense for people to pull
changes directly from each other, without going through the central
repository.  Consider a case in which I have a tentative bug fix, but
I am worried that if I were to publish it to the central repository,
it might subsequently break everyone else's trees as they pull it.  To
reduce the potential for damage, I can ask you to clone my repository
into a temporary repository of your own and test it.  This lets us put
off publishing the potentially unsafe change until it has had a little
testing.
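
A sketch of what that testing might look like on your side, assuming
I am serving my repository at \Verb|http://my-laptop.local:8000/| and
that the project has a \Verb|make test| target (both are assumptions
made for illustration):
\begin{verbatim}
# Clone the tentative fix into a throwaway repository.
hg clone http://my-laptop.local:8000/ tentative-fix
cd tentative-fix

# Build and test as you normally would for this project.
make test

# When you're done, the temporary clone can simply be deleted.
cd ..
rm -r tentative-fix
\end{verbatim}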

In this kind of scenario, people usually use the \command{ssh}
protocol to securely push changes to the central repository, as
documented in section~\ref{sec:collab:ssh}.  It's also usual to
publish a read-only copy of the repository over HTTP using CGI, as in
section~\ref{sec:collab:cgi}.  Publishing over HTTP satisfies the
needs of people who don't have push access, and those who want to use
web browsers to browse the repository's history.
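
In practice, the two kinds of access might look like the following
sketch.  The host name \Verb|hg.example.com|, the user name, and the
repository paths are placeholders; substitute whatever your own
server uses.
\begin{verbatim}
# A developer with push access clones over ssh.  Later pushes go
# back to the same place, since the clone remembers where it
# came from.
hg clone ssh://alice@hg.example.com/repos/project
cd project
hg push

# Anyone else can clone the read-only copy over HTTP.
hg clone http://hg.example.com/project
\end{verbatim}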

\subsection{Working with multiple branches}

Projects of any significant size naturally tend to make progress on
several fronts simultaneously.  In the case of software, it's common
for a project to go through periodic official releases.  A release
might then go into ``maintenance mode'' for a while after its first
publication; maintenance releases tend to contain only bug fixes, not
new features.  In parallel with these maintenance releases, one or
more future releases may be under development.  People normally use
the word ``branch'' to refer to one of these many slightly different
directions in which development is proceeding.

Mercurial is particularly well suited to managing a number of
simultaneous, but not identical, branches.  Each ``development
direction'' can live in its own central repository, and you can merge
changes from one to another as the need arises.  Because repositories
are independent of each other, unstable changes in a development
branch will never affect a stable branch unless someone explicitly
merges those changes in.

Here's an example of how this can work in practice.  Let's say you
have one ``main branch'' on a central server.
\interaction{branching.init}
People clone it, make changes locally, test them, and push them back.

Once the main branch reaches a release milestone, you can use the
\hgcmd{tag} command to give a permanent name to the milestone
revision.
\interaction{branching.tag}
Let's say some ongoing development occurs on the main branch.
\interaction{branching.main}
Using the tag that was recorded at the milestone, people who clone
that repository at any time in the future can use \hgcmd{update} to
get a copy of the working directory exactly as it was when that tagged
revision was committed.
\interaction{branching.update}

In addition, immediately after the main branch is tagged, someone can
then clone the main branch on the server to a new ``stable'' branch,
also on the server.
\interaction{branching.clone}

Someone who needs to make a change to the stable branch can then clone
\emph{that} repository, make their changes, commit, and push their
changes back there.
\interaction{branching.stable}
Because Mercurial repositories are independent, and Mercurial doesn't
move changes around automatically, the stable and main branches are
\emph{isolated} from each other.  The changes that you made on the
main branch don't ``leak'' to the stable branch, and vice versa.

You'll often want all of your bugfixes on the stable branch to show up
on the main branch, too.  Rather than rewrite a bugfix on the main
branch, you can simply pull and merge changes from the stable to the
main branch, and Mercurial will bring those bugfixes in for you.
\interaction{branching.merge}
The main branch will still contain changes that are not on the stable
branch, but it will also contain all of the bugfixes from the stable
branch.  The stable branch remains unaffected by these changes.

\subsection{Feature branches}

For larger projects, an effective way to manage change is to break up
a team into smaller groups.  Each group has a shared branch of its
own, cloned from a single ``master'' branch used by the entire
project.  People working on an individual branch are typically quite
isolated from developments on other branches.

\begin{figure}[ht]
  \centering
  \grafix{feature-branches}
  \caption{Feature branches}
  \label{fig:collab:feature-branches}
\end{figure}

When a particular feature is deemed to be in suitable shape, someone
on that feature team pulls and merges from the master branch into the
feature branch, then pushes back up to the master branch.
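
Assuming the master branch lives at the placeholder URL
\Verb|ssh://hg.example.com/repos/master|, that hand-off might look
roughly like this when run inside a clone of the feature branch:
\begin{verbatim}
# Bring in everything that has landed on master.
hg pull ssh://hg.example.com/repos/master

# Merge it with the feature work, then test the result.
hg merge
hg commit -m "Merge with master"

# Publish the merged result back to master.
hg push ssh://hg.example.com/repos/master
\end{verbatim}
If someone else pushes to master in the meantime, Mercurial will
normally refuse the push, and the pull-and-merge step has to be
repeated.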

\subsection{The release train}

Some projects are organised on a ``train'' basis: a release is
scheduled to happen every few months, and whatever features are ready
when the ``train'' is ready to leave are allowed in.

This model resembles working with feature branches.  The difference is
that when a feature branch misses a train, someone on the feature team
pulls and merges the changes that went out on that train release, and
the team continues its work on top of that release so that their
feature can make the next release.

\subsection{The Linux kernel model}

The development of the Linux kernel has a shallow hierarchical
structure, surrounded by a cloud of apparent chaos.  Because most
Linux developers use \command{git}, a distributed revision control
tool with capabilities similar to Mercurial, it's useful to describe
the way work flows in that environment; if you like the ideas, the
approach translates well across tools.

At the center of the community sits Linus Torvalds, the creator of
Linux.  He publishes a single source repository that is considered the
``authoritative'' current tree by the entire developer community.
Anyone can clone Linus's tree, but he is very choosy about whose trees
he pulls from.

Linus has a number of ``trusted lieutenants''.  As a general rule, he
pulls whatever changes they publish, in most cases without even
reviewing those changes.  Some of those lieutenants are generally
agreed to be ``maintainers'', responsible for specific subsystems
within the kernel.  If a random kernel hacker wants to make a change
to a subsystem that they want to end up in Linus's tree, they must
find out who the subsystem's maintainer is, and ask that maintainer to
take their change.  If the maintainer reviews their changes and agrees
to take them, they'll pass them along to Linus in due course.

Individual lieutenants have their own approaches to reviewing,
accepting, and publishing changes; and for deciding when to feed them
to Linus.  In addition, there are several well-known branches that
people use for different purposes.  For example, a few people maintain
``stable'' repositories of older versions of the kernel, to which they
apply critical fixes as needed.

This model has two notable features.  The first is that it's ``pull
only''.  You have to ask, convince, or beg another developer to take a
change from you, because there are no shared trees, and there's no way
to push changes into a tree that someone else controls.

The second is that it's based on reputation and acclaim.  If you're an
unknown, Linus will probably ignore changes from you without even
responding.  But a subsystem maintainer will probably review them, and
will likely take them if they pass their criteria for suitability.
The more ``good'' changes you contribute to a maintainer, the more
likely they are to trust your judgment and accept your changes.  If
you're well-known and maintain a long-lived branch for something Linus
hasn't yet accepted, people with similar interests may pull your
changes regularly to keep up with your work.

Reputation and acclaim don't necessarily cross subsystem or ``people''
boundaries.  If you're a respected but specialised storage hacker, and
you try to fix a networking bug, that change will receive a level of
scrutiny from a network maintainer comparable to a change from a
complete stranger.

To people who come from more orderly project backgrounds, the
comparatively chaotic Linux kernel development process often seems
completely insane.  It's subject to the whims of individuals; people
make sweeping changes whenever they deem it appropriate; and the pace
of development is astounding.  And yet Linux is a highly successful,
well-regarded piece of software.

\section{The technical side of sharing}

\subsection{Informal sharing with \hgcmd{serve}}
\label{sec:collab:serve}

Mercurial's \hgcmd{serve} command is wonderfully suited to small,
tight-knit, and fast-paced group environments.  It also provides a
great way to get a feel for using Mercurial commands over a network.

Run \hgcmd{serve} inside a repository, and in under a second it will
bring up a specialised HTTP server; this will accept connections from
any client, and serve up data for that repository until you terminate
it.  Anyone who knows the URL of the server you just started, and can
talk to your computer over the network, can then use a web browser or
Mercurial to read data from that repository.  A URL for a
\hgcmd{serve} instance running on a laptop is likely to look something
like \Verb|http://my-laptop.local:8000/|.

The \hgcmd{serve} command is \emph{not} a general-purpose web server.
It can do only two things:
\begin{itemize}
\item Allow people to browse the history of the repository it's
  serving, from their normal web browsers.
\item Speak Mercurial's wire protocol, so that people can
  \hgcmd{clone} or \hgcmd{pull} changes from that repository.
\end{itemize}
In particular, \hgcmd{serve} won't allow remote users to \emph{modify}
your repository.  It's intended for read-only use.

If you're getting started with Mercurial, there's nothing to prevent
you from using \hgcmd{serve} to serve up a repository on your own
computer, and then using commands like \hgcmd{clone}, \hgcmd{incoming},
and so on to talk to that server as if the repository were hosted
remotely.  This can help you to quickly get acquainted with using
commands on network-hosted repositories.
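
For instance, a first experiment along these lines might go as
follows.  The directory names are invented, and the URL assumes that
\hgcmd{serve} is listening on its default port on the same machine.
\begin{verbatim}
# Terminal 1: serve an existing repository.
cd my-project
hg serve

# Terminal 2: treat it as if it were remote.
hg clone http://localhost:8000/ my-project-copy
cd my-project-copy
hg incoming
\end{verbatim}
Immediately after cloning, \hgcmd{incoming} should report that there
is nothing new to pull.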

\subsubsection{A few things to keep in mind}

Because \hgcmd{serve} provides unauthenticated read access to all
clients, you should only use it in an environment where you either
don't care, or have complete control over, who can access your network
and pull data from your repository.

The \hgcmd{serve} command knows nothing about any firewall software
you might have installed on your system or network.  It cannot detect
or control your firewall software.  If other people are unable to talk
to a running \hgcmd{serve} instance, the second thing you should do
(\emph{after} you make sure that they're using the correct URL) is
check your firewall configuration.

By default, \hgcmd{serve} listens for incoming connections on
port~8000.  If another process is already listening on the port you
want to use, you can specify a different port to listen on using the
\hgopt{serve}{-p} option.

Normally, when \hgcmd{serve} starts, it prints no output, which can be
a bit unnerving.  If you'd like to confirm that it is indeed running
correctly, and find out what URL you should send to your
collaborators, start it with the \hggopt{-v} option.
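
For example, to serve on a different port and have \hgcmd{serve}
announce where it is listening (the port number~8080 is an arbitrary
choice):
\begin{verbatim}
hg -v serve -p 8080
\end{verbatim}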

\subsection{Using \command{ssh} as a tunnel}
\label{sec:collab:ssh}

\subsection{Serving HTTP with a CGI script}
\label{sec:collab:cgi}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: