% hgbook: en/hgext.tex @ 244:980393101109
% Fix up some incorrect and stale directory names.
% The .hg/data name problem was reported by Tim Hatch.
% author: Bryan O'Sullivan <bos@serpentine.com>
% date: Wed May 30 09:02:19 2007 -0700 (2007-05-30)
% parents: 09d5897ad935
% children: 7a6bd93174bd

\chapter{Adding functionality with extensions}
\label{chap:hgext}

While the core of Mercurial is quite complete from a functionality
standpoint, it's deliberately shorn of fancy features. This approach
of preserving simplicity keeps the software easy to deal with for both
maintainers and users.

However, Mercurial doesn't box you in with an inflexible command set:
you can add features to it as \emph{extensions} (sometimes known as
\emph{plugins}). We've already discussed a few of these extensions in
earlier chapters.
\begin{itemize}
\item Section~\ref{sec:tour-merge:fetch} covers the \hgext{fetch}
  extension; this combines pulling new changes and merging them with
  local changes into a single command, \hgxcmd{fetch}{fetch}.
\item The \hgext{bisect} extension adds an efficient pruning search
  for changes that introduced bugs, and we documented it in
  section~\ref{sec:undo:bisect}.
\item In chapter~\ref{chap:hook}, we covered several extensions that
  are useful for hook-related functionality: \hgext{acl} adds access
  control lists; \hgext{bugzilla} adds integration with the Bugzilla
  bug tracking system; and \hgext{notify} sends notification emails on
  new changes.
\item The Mercurial Queues patch management extension is so invaluable
  that it merits two chapters and an appendix all to itself.
  Chapter~\ref{chap:mq} covers the basics;
  chapter~\ref{chap:mq-collab} discusses advanced topics; and
  appendix~\ref{chap:mqref} goes into detail on each command.
\end{itemize}

In this chapter, we'll cover some of the other extensions that are
available for Mercurial, and briefly touch on some of the machinery
you'll need to know about if you want to write an extension of your
own.
\begin{itemize}
\item In section~\ref{sec:hgext:inotify}, we'll discuss the
  possibility of \emph{huge} performance improvements using the
  \hgext{inotify} extension.
\end{itemize}

\section{Improve performance with the \hgext{inotify} extension}
\label{sec:hgext:inotify}

Are you interested in having some of the most common Mercurial
operations run as much as a hundred times faster? Read on!

Mercurial has great performance under normal circumstances. For
example, when you run the \hgcmd{status} command, Mercurial has to
scan almost every directory and file in your repository so that it can
display file status. Many other Mercurial commands need to do the
same work behind the scenes; for example, the \hgcmd{diff} command
uses the status machinery to avoid doing an expensive comparison
operation on files that obviously haven't changed.

Because obtaining file status is crucial to good performance, the
authors of Mercurial have optimised this code to within an inch of its
life. However, there's no avoiding the fact that when you run
\hgcmd{status}, Mercurial is going to have to perform at least one
expensive system call for each managed file to determine whether it's
changed since the last time Mercurial checked. For a sufficiently
large repository, this can take a long time.

To put a number on the magnitude of this effect, I created a
repository containing 150,000 managed files. I timed \hgcmd{status}
as taking ten seconds to run, even when \emph{none} of those files had
been modified.

Many modern operating systems contain a file notification facility.
If a program signs up to an appropriate service, the operating system
will notify it every time a file of interest is created, modified, or
deleted. On Linux systems, the kernel component that does this is
called \texttt{inotify}.

Mercurial's \hgext{inotify} extension talks to the kernel's
\texttt{inotify} component to optimise \hgcmd{status} commands. The
extension has two components. A daemon sits in the background and
receives notifications from the \texttt{inotify} subsystem. It also
listens for connections from a regular Mercurial command. The
extension modifies Mercurial's behaviour so that instead of scanning
the filesystem, it queries the daemon. Since the daemon has perfect
information about the state of the repository, it can respond with a
result instantaneously, avoiding the need to scan every directory and
file in the repository.

Recall the ten seconds that I measured plain Mercurial as taking to
run \hgcmd{status} on a 150,000 file repository. With the
\hgext{inotify} extension enabled, the time dropped to 0.1~seconds, a
factor of \emph{one hundred} faster.

Before we continue, please pay attention to some caveats.
\begin{itemize}
\item The \hgext{inotify} extension is Linux-specific. Because it
  interfaces directly to the Linux kernel's \texttt{inotify}
  subsystem, it does not work on other operating systems.
\item It should work on any Linux distribution that was released after
  early~2005. Older distributions are likely to have a kernel that
  lacks \texttt{inotify}, or a version of \texttt{glibc} that does not
  have the necessary interfacing support.
\item Not all filesystems are suitable for use with the
  \hgext{inotify} extension. Network filesystems such as NFS are a
  non-starter, for example, particularly if you're running Mercurial
  on several systems, all mounting the same network filesystem. The
  kernel's \texttt{inotify} system has no way of knowing about changes
  made on another system. Most local filesystems (e.g.~ext3, XFS,
  ReiserFS) should work fine.
\end{itemize}

The \hgext{inotify} extension is not yet shipped with Mercurial as of
May~2007, so it's a little more involved to set up than other
extensions. But the performance improvement is worth it!

The extension currently comes in two parts: a set of patches to the
Mercurial source code, and a library of Python bindings to the
\texttt{inotify} subsystem.
\begin{note}
  There are \emph{two} Python \texttt{inotify} binding libraries. One
  of them is called \texttt{pyinotify}, and is packaged by some Linux
  distributions as \texttt{python-inotify}. This is \emph{not} the
  one you'll need, as it is too buggy and inefficient to be practical.
\end{note}
To get going, it's best to already have a functioning copy of
Mercurial installed.
\begin{note}
  If you follow the instructions below, you'll be \emph{replacing} and
  overwriting any existing installation of Mercurial that you might
  already have, using the latest ``bleeding edge'' Mercurial code.
  Don't say you weren't warned!
\end{note}
\begin{enumerate}
\item Clone the Python \texttt{inotify} binding repository. Build and
  install it.
  \begin{codesample4}
    hg clone http://hg.kublai.com/python/inotify
    cd inotify
    python setup.py build --force
    sudo python setup.py install --skip-build
  \end{codesample4}
\item Clone the \dirname{crew} Mercurial repository. Clone the
  \hgext{inotify} patch repository so that Mercurial Queues will be
  able to apply patches to your copy of the \dirname{crew} repository.
  \begin{codesample4}
    hg clone http://hg.intevation.org/mercurial/crew
    hg clone crew inotify
    hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches
  \end{codesample4}
\item Make sure that you have the Mercurial Queues extension,
  \hgext{mq}, enabled. If you've never used MQ, read
  section~\ref{sec:mq:start} to get started quickly.
\item Go into the \dirname{inotify} repo, and apply all of the
  \hgext{inotify} patches using the \hgxopt{mq}{qpush}{-a} option to
  the \hgxcmd{mq}{qpush} command.
  \begin{codesample4}
    cd inotify
    hg qpush -a
  \end{codesample4}
  If you get an error message from \hgxcmd{mq}{qpush}, you should not
  continue. Instead, ask for help.
\item Build and install the patched version of Mercurial.
  \begin{codesample4}
    python setup.py build --force
    sudo python setup.py install --skip-build
  \end{codesample4}
\end{enumerate}

Once you've built a suitably patched version of Mercurial, all you
need to do to enable the \hgext{inotify} extension is add an entry to
your \hgrc.
\begin{codesample2}
  [extensions]
  inotify =
\end{codesample2}
When the \hgext{inotify} extension is enabled, Mercurial will
automatically and transparently start the status daemon the first time
you run a command that needs status in a repository. It runs one
status daemon per repository.

The status daemon is started silently, and runs in the background. If
you look at a list of running processes after you've enabled the
\hgext{inotify} extension and run a few commands in different
repositories, you'll thus see a few \texttt{hg} processes sitting
around, waiting for updates from the kernel and queries from
Mercurial.

The first time you run a Mercurial command in a repository when you
have the \hgext{inotify} extension enabled, it will run with about the
same performance as a normal Mercurial command. This is because the
status daemon needs to perform a normal status scan so that it has a
baseline against which to apply later updates from the kernel.
However, \emph{every} subsequent command that does any kind of status
check should be noticeably faster on repositories of even fairly
modest size. Better yet, the bigger your repository is, the greater a
performance advantage you'll see. The \hgext{inotify} daemon makes
status operations almost instantaneous on repositories of all sizes!
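
One rough way to observe this for yourself is to time the same status
query twice in a row. This is only a sketch: the repository path is
hypothetical, and the numbers you see will depend on your hardware and
on the size of your repository.
\begin{codesample2}
  cd /path/to/large-repo
  time hg status
  time hg status
\end{codesample2}
The first run pays for the daemon's baseline scan; the second should
be answered by the daemon without a fresh scan of the filesystem.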

If you like, you can manually start a status daemon using the
\hgxcmd{inotify}{inserve} command. This gives you slightly finer
control over how the daemon ought to run. This command will of course
only be available when the \hgext{inotify} extension is enabled.

When you're using the \hgext{inotify} extension, you should notice
\emph{no difference at all} in Mercurial's behaviour, with the sole
exception of status-related commands running a whole lot faster than
they used to. You should specifically expect that commands will not
print different output; neither should they give different results.
If either of these situations occurs, please report a bug.

\section{Flexible diff support with the \hgext{extdiff} extension}
\label{sec:hgext:extdiff}

Mercurial's built-in \hgcmd{diff} command outputs plaintext unified
diffs.
\interaction{extdiff.diff}
If you would like to use an external tool to display modifications,
you'll want to use the \hgext{extdiff} extension. This will let you
use, for example, a graphical diff tool.

The \hgext{extdiff} extension is bundled with Mercurial, so it's easy
to set up. In the \rcsection{extensions} section of your \hgrc,
simply add a one-line entry to enable the extension.
\begin{codesample2}
  [extensions]
  extdiff =
\end{codesample2}
This introduces a command named \hgxcmd{extdiff}{extdiff}, which by
default uses your system's \command{diff} command to generate a
unified diff in the same form as the built-in \hgcmd{diff} command.
\interaction{extdiff.extdiff}
The result won't be exactly the same as the output of the built-in
\hgcmd{diff} command, because the output of \command{diff} varies from
one system to another, even when passed the same options.

As the ``\texttt{making snapshot}'' lines of output above imply, the
\hgxcmd{extdiff}{extdiff} command works by creating two snapshots of
your source tree. The first snapshot is of the source revision; the
second, of the target revision or working directory. The
\hgxcmd{extdiff}{extdiff} command generates these snapshots in a
temporary directory, passes the name of each directory to an external
diff viewer, then deletes the temporary directory. For efficiency, it
only snapshots the directories and files that have changed between the
two revisions.

Snapshot directory names have the same base name as your repository.
If your repository path is \dirname{/quux/bar/foo}, then \dirname{foo}
will be the name of each snapshot directory. Each snapshot directory
name has its changeset ID appended, if appropriate. If a snapshot is
of revision \texttt{a631aca1083f}, the directory will be named
\dirname{foo.a631aca1083f}. A snapshot of the working directory won't
have a changeset ID appended, so it would just be \dirname{foo} in
this example. To see what this looks like in practice, look again at
the \hgxcmd{extdiff}{extdiff} example above. Notice that the diff has
the snapshot directory names embedded in its header.

The \hgxcmd{extdiff}{extdiff} command accepts two important options.
The \hgxopt{extdiff}{extdiff}{-p} option lets you choose a program to
view differences with, instead of \command{diff}. With the
\hgxopt{extdiff}{extdiff}{-o} option, you can change the options that
\hgxcmd{extdiff}{extdiff} passes to the program (by default, these
options are ``\texttt{-Npru}'', which only make sense if you're
running \command{diff}). In other respects, the
\hgxcmd{extdiff}{extdiff} command acts similarly to the built-in
\hgcmd{diff} command: you use the same option names, syntax, and
arguments to specify the revisions you want, the files you want, and
so on.
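
As a small illustration of that last point, the following sketch
selects revisions and files in the same way you would with
\hgcmd{diff}. The revision number and the file name are hypothetical;
substitute ones that exist in your own repository.
\begin{codesample2}
  hg extdiff -r 7 -r tip
  hg extdiff -r 7 src/main.c
\end{codesample2}
The first command compares two revisions; the second compares
revision~7 of a single file against the working directory.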

As an example, here's how to run the normal system \command{diff}
command, getting it to generate context diffs (using the
\cmdopt{diff}{-c} option) instead of unified diffs, and five lines of
context instead of the default three (passing \texttt{5} as the
argument to the \cmdopt{diff}{-C} option).
\interaction{extdiff.extdiff-ctx}

Launching a visual diff tool is just as easy. Here's how to launch
the \command{kdiff3} viewer.
\begin{codesample2}
  hg extdiff -p kdiff3 -o ''
\end{codesample2}

If your diff viewing command can't deal with directories, you can
easily work around this with a little scripting. For an example of
such scripting in action with the \hgext{mq} extension and the
\command{interdiff} command, see
section~\ref{mq-collab:tips:interdiff}.
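
To give a flavour of what such a wrapper might look like, here is a
minimal sketch of a shell script that walks both snapshot directories
and runs a file-only tool once per file. It is not the script from
the section referenced above. The script name
\filename{dirwrap.sh} is hypothetical, and \command{diff} stands in
for whatever viewer you actually want to run. It assumes that file
names contain no newlines.
\begin{codesample2}
  #!/bin/sh
  # dirwrap.sh (hypothetical name): usage: dirwrap.sh OLDDIR NEWDIR
  old=$1
  new=$2
  # List every file present in either snapshot, without duplicates.
  ( (cd "$old" && find . -type f); (cd "$new" && find . -type f) ) |
  sort -u |
  while read -r f; do
    a="$old/$f"
    b="$new/$f"
    # A file missing from one side is compared against /dev/null.
    test -f "$a" || a=/dev/null
    test -f "$b" || b=/dev/null
    # Replace diff with your file-only viewer.
    diff -u "$a" "$b"
  done
\end{codesample2}
After making the script executable, you could try it with the
\hgxopt{extdiff}{extdiff}{-p} option, passing an empty option string
as in the \command{kdiff3} example so that the script receives only
the two directory names.
\begin{codesample2}
  hg extdiff -p /path/to/dirwrap.sh -o ''
\end{codesample2}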

\subsection{Defining command aliases}

It can be cumbersome to remember the options to both the
\hgxcmd{extdiff}{extdiff} command and the diff viewer you want to use,
so the \hgext{extdiff} extension lets you define \emph{new} commands
that will invoke your diff viewer with exactly the right options.

All you need to do is edit your \hgrc, and add a section named
\rcsection{extdiff}. Inside this section, you can define multiple
commands. Here's how to add a \texttt{kdiff3} command. Once you've
defined this, you can type ``\texttt{hg kdiff3}'' and the
\hgext{extdiff} extension will run \command{kdiff3} for you.
\begin{codesample2}
  [extdiff]
  cmd.kdiff3 =
\end{codesample2}
If you leave the right hand side of the definition empty, as above,
the \hgext{extdiff} extension uses the name of the command you defined
as the name of the external program to run. But these names don't
have to be the same. Here, we define a command named ``\texttt{hg
  wibble}'', which runs \command{kdiff3}.
\begin{codesample2}
  [extdiff]
  cmd.wibble = kdiff3
\end{codesample2}

You can also specify the default options that you want to invoke your
diff viewing program with. The prefix to use is ``\texttt{opts.}'',
followed by the name of the command to which the options apply. This
example defines a ``\texttt{hg vimdiff}'' command that runs the
\command{vim} editor's \texttt{DirDiff} extension.
\begin{codesample2}
  [extdiff]
  cmd.vimdiff = vim
  opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)'
\end{codesample2}
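
The same two prefixes can be combined for non-graphical tools too.
As a sketch, the following defines a command that produces context
diffs with five lines of context, much like the context diff example
earlier in this section. The command name \texttt{ctxdiff} is made up
for illustration, and the options assume a \command{diff} that accepts
\cmdopt{diff}{-C} with a separate numeric argument, as GNU
\command{diff} does.
\begin{codesample2}
  [extdiff]
  cmd.ctxdiff = diff
  opts.ctxdiff = -Npr -C 5
\end{codesample2}
With this in place, typing ``\texttt{hg ctxdiff}'' should behave like
\hgxcmd{extdiff}{extdiff} with those options supplied for you.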

\section{Cherrypicking changes with the \hgext{transplant} extension}
\label{sec:hgext:transplant}

Need to have a long chat with Brendan about this.

\section{Send changes via email with the \hgext{patchbomb} extension}
\label{sec:hgext:patchbomb}

Many projects have a culture of ``change review'', in which people
send their modifications to a mailing list for others to read and
comment on before they commit the final version to a shared
repository. Some projects have people who act as gatekeepers; they
apply changes from other people to a repository to which those others
don't have access.

Mercurial makes it easy to send changes over email for review or
application, via its \hgext{patchbomb} extension. The extension is so
named because changes are formatted as patches, and it's usual to send
one changeset per email message. Sending a long series of changes by
email is thus much like ``bombing'' the recipient's inbox, hence
``patchbomb''.

As usual, the basic configuration of the \hgext{patchbomb} extension
takes just one or two lines in your \hgrc.
\begin{codesample2}
  [extensions]
  patchbomb =
\end{codesample2}
Once you've enabled the extension, you will have a new command
available, named \hgxcmd{patchbomb}{email}.

The safest and best way to invoke the \hgxcmd{patchbomb}{email}
command is to \emph{always} run it first with the
\hgxopt{patchbomb}{email}{-n} option. This will show you what the
command \emph{would} send, without actually sending anything. Once
you've had a quick glance over the changes and verified that you are
sending the right ones, you can rerun the same command, with the
\hgxopt{patchbomb}{email}{-n} option removed.

The \hgxcmd{patchbomb}{email} command accepts the same kind of
revision syntax as every other Mercurial command. For example, this
command will send every revision between 7 and \texttt{tip},
inclusive.
\begin{codesample2}
  hg email -n 7:tip
\end{codesample2}
You can also specify a \emph{repository} to compare with. If you
provide a repository but no revisions, the \hgxcmd{patchbomb}{email}
command will send all revisions in the local repository that are not
present in the remote repository. If you additionally specify
revisions or a branch name (the latter using the
\hgxopt{patchbomb}{email}{-b} option), this will constrain the
revisions sent.

It's perfectly safe to run the \hgxcmd{patchbomb}{email} command
without the names of the people you want to send to: if you do this,
it will just prompt you for those values interactively. (If you're
using a Linux or Unix-like system, you should have enhanced
\texttt{readline}-style editing capabilities when entering those
headers, too, which is useful.)
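
If you find yourself typing the same addresses at those prompts every
time, it may be possible to record them in your \hgrc\ instead. The
fragment below is a sketch that assumes your version of the
\hgext{patchbomb} extension reads defaults from an
\rcsection{email} section; check the extension's built-in help before
relying on it. The addresses are placeholders.
\begin{codesample2}
  [email]
  from = Your Name <you@example.com>
  to = project-devel@example.com
\end{codesample2}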

When you are sending just one revision, the \hgxcmd{patchbomb}{email}
command will by default use the first line of the changeset
description as the subject of the single email message it sends.

If you send multiple revisions, the \hgxcmd{patchbomb}{email} command
will usually send one message per changeset. It will preface the
series with an introductory message, in which you should describe the
purpose of the series of changes you're sending.

\subsection{Changing the behaviour of patchbombs}

Not every project has exactly the same conventions for sending changes
in email; the \hgext{patchbomb} extension tries to accommodate a
number of variations through command line options.
\begin{itemize}
\item You can write a subject for the introductory message on the
  command line using the \hgxopt{patchbomb}{email}{-s} option. This
  takes one argument, the text of the subject to use.
\item To change the email address from which the messages originate,
  use the \hgxopt{patchbomb}{email}{-f} option. This takes one
  argument, the email address to use.
\item The default behaviour is to send unified diffs (see
  section~\ref{sec:mq:patch} for a description of the format), one per
  message. You can send a binary bundle instead with the
  \hgxopt{patchbomb}{email}{-b} option.
\item Unified diffs are normally prefaced with a metadata header. You
  can omit this, and send unadorned diffs, with the
  \hgxopt{patchbomb}{email}{--plain} option.
\item Diffs are normally sent ``inline'', in the same body part as the
  description of a patch. This makes it easiest for the largest
  number of readers to quote and respond to parts of a diff, as some
  mail clients will only quote the first MIME body part in a message.
  If you'd prefer to send the description and the diff in separate
  body parts, use the \hgxopt{patchbomb}{email}{-a} option.
\item Instead of sending mail messages, you can write them to an
  \texttt{mbox}-format mail folder using the
  \hgxopt{patchbomb}{email}{-m} option. That option takes one
  argument, the name of the file to write to.
\item If you would like to add a \command{diffstat}-format summary to
  each patch, and one to the introductory message, use the
  \hgxopt{patchbomb}{email}{-d} option. The \command{diffstat}
  command displays a table containing the name of each file patched,
  the number of lines affected, and a histogram showing how much each
  file is modified. This gives readers a qualitative glance at how
  complex a patch is.
\end{itemize}
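
These options can be combined. As a sketch, the first command below
does a dry run of a series with a custom subject, sender address, and
\command{diffstat} summaries; the second writes the same series to an
\texttt{mbox} file instead of sending it. The subject text, the
address, and the file name are all placeholders.
\begin{codesample2}
  hg email -n -d -s 'Directory name fixes' -f 'you@example.com' 7:tip
  hg email -m patches.mbox 7:tip
\end{codesample2}
Writing to an \texttt{mbox} file first is another way to inspect
exactly what would be sent, since you can open the file in a mail
reader.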

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: