annotate en/hgext.tex @ 228:50223e198614

Update output.
author Bryan O'Sullivan <bos@serpentine.com>
date Sat May 26 12:04:52 2007 -0700 (2007-05-26)
parents 34943a3d50d6
children 28ddbf9f3729
rev   line source
bos@223 1 \chapter{Adding functionality with extensions}
bos@223 2 \label{chap:hgext}
bos@223 3
bos@224 4 While the core of Mercurial is quite complete from a functionality
bos@224 5 standpoint, it's deliberately shorn of fancy features. This approach
bos@224 6 of preserving simplicity keeps the software easy to deal with for both
bos@224 7 maintainers and users.
bos@224 8
bos@224 9 However, Mercurial doesn't box you in with an inflexible command set:
bos@224 10 you can add features to it as \emph{extensions} (sometimes known as
bos@224 11 \emph{plugins}). We've already discussed a few of these extensions in
bos@224 12 earlier chapters.
bos@224 13 \begin{itemize}
bos@224 14 \item Section~\ref{sec:tour-merge:fetch} covers the \hgext{fetch}
bos@224 15 extension; this combines pulling new changes and merging them with
bos@224 16 local changes into a single command, \hgcmd{fetch}.
bos@224 17 \item The \hgext{bisect} extension adds an efficient pruning search
bos@224 18 for changes that introduced bugs, and we documented it in
bos@224 19 section~\ref{sec:undo:bisect}.
bos@224 20 \item In chapter~\ref{chap:hook}, we covered several extensions that
bos@224 21 are useful for hook-related functionality: \hgext{acl} adds access
bos@224 22 control lists; \hgext{bugzilla} adds integration with the Bugzilla
bos@224 23 bug tracking system; and \hgext{notify} sends notification emails on
bos@224 24 new changes.
bos@224 25 \item The Mercurial Queues patch management extension is so invaluable
bos@224 26 that it merits two chapters and an appendix all to itself.
bos@224 27 Chapter~\ref{chap:mq} covers the basics;
bos@224 28 chapter~\ref{chap:mq-collab} discusses advanced topics; and
bos@224 29 appendix~\ref{chap:mqref} goes into detail on each command.
bos@224 30 \end{itemize}
bos@224 31
bos@224 32 In this chapter, we'll cover some of the other extensions that are
bos@224 33 available for Mercurial, and briefly touch on some of the machinery
bos@224 34 you'll need to know about if you want to write an extension of your
bos@224 35 own.
bos@224 36 \begin{itemize}
bos@224 37 \item In section~\ref{sec:hgext:inotify}, we'll discuss the
bos@224 38 possibility of \emph{huge} performance improvements using the
bos@224 39 \hgext{inotify} extension.
bos@224 40 \end{itemize}
bos@224 41
bos@224 42 \section{Improve performance with the \hgext{inotify} extension}
bos@224 43 \label{sec:hgext:inotify}
bos@224 44
bos@224 45 Are you interested in having some of the most common Mercurial
bos@224 46 operations run as much as a hundred times faster? Read on!
bos@224 47
bos@224 48 Mercurial has great performance under normal circumstances, but some
bos@224 49 of its work is inherently costly. When you run the \hgcmd{status} command, Mercurial has to
bos@224 50 scan almost every directory and file in your repository so that it can
bos@224 51 display file status. Many other Mercurial commands need to do the
bos@224 52 same work behind the scenes; for example, the \hgcmd{diff} command
bos@224 53 uses the status machinery to avoid doing an expensive comparison
bos@224 54 operation on files that obviously haven't changed.
bos@224 55
bos@224 56 Because obtaining file status is crucial to good performance, the
bos@224 57 authors of Mercurial have optimised this code to within an inch of its
bos@224 58 life. However, there's no avoiding the fact that when you run
bos@224 59 \hgcmd{status}, Mercurial is going to have to perform at least one
bos@224 60 expensive system call for each managed file to determine whether it's
bos@224 61 changed since the last time Mercurial checked. For a sufficiently
bos@224 62 large repository, this can take a long time.
bos@224 63
bos@224 64 To put a number on the magnitude of this effect, I created a
bos@224 65 repository containing 150,000 managed files. I timed \hgcmd{status}
bos@224 66 as taking ten seconds to run, even when \emph{none} of those files had
bos@224 67 been modified.
bos@224 68
bos@224 69 Many modern operating systems contain a file notification facility.
bos@224 70 If a program signs up to an appropriate service, the operating system
bos@224 71 will notify it every time a file of interest is created, modified, or
bos@224 72 deleted. On Linux systems, the kernel component that does this is
bos@224 73 called \texttt{inotify}.
bos@224 74
bos@224 75 Mercurial's \hgext{inotify} extension talks to the kernel's
bos@224 76 \texttt{inotify} component to optimise \hgcmd{status} commands. The
bos@224 77 extension has two components. A daemon sits in the background and
bos@224 78 receives notifications from the \texttt{inotify} subsystem. It also
bos@224 79 listens for connections from a regular Mercurial command. The
bos@224 80 extension modifies Mercurial's behaviour so that instead of scanning
bos@224 81 the filesystem, it queries the daemon. Since the daemon has perfect
bos@224 82 information about the state of the repository, it can respond with a
bos@224 83 result instantaneously, avoiding the need to scan every directory and
bos@224 84 file in the repository.
bos@224 85
bos@224 86 Recall the ten seconds that I measured plain Mercurial as taking to
bos@224 87 run \hgcmd{status} on a 150,000 file repository. With the
bos@224 88 \hgext{inotify} extension enabled, the time dropped to 0.1~seconds, a
bos@224 89 factor of \emph{one hundred} faster.
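
As a rough sanity check on these figures (assuming, as a simplification,
that the cost is spread evenly over all managed files):
\[
  \frac{10\,\mbox{s}}{150{,}000\ \mbox{files}} \approx 67\,\mu\mbox{s per file},
  \qquad
  \frac{10\,\mbox{s}}{0.1\,\mbox{s}} = 100 .
\]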
bos@224 90
bos@224 91 Before we continue, please pay attention to some caveats.
bos@224 92 \begin{itemize}
bos@224 93 \item The \hgext{inotify} extension is Linux-specific. Because it
bos@224 94 interfaces directly to the Linux kernel's \texttt{inotify}
bos@224 95 subsystem, it does not work on other operating systems.
bos@224 96 \item It should work on any Linux distribution that was released after
bos@224 97 early~2005. Older distributions are likely to have a kernel that
bos@224 98 lacks \texttt{inotify}, or a version of \texttt{glibc} that does not
bos@224 99 have the necessary interfacing support.
bos@224 100 \item Not all filesystems are suitable for use with the
bos@224 101 \hgext{inotify} extension. Network filesystems such as NFS are a
bos@224 102 non-starter, for example, particularly if you're running Mercurial
bos@224 103 on several systems, all mounting the same network filesystem. The
bos@224 104 kernel's \texttt{inotify} system has no way of knowing about changes
bos@224 105 made on another system. Most local filesystems (e.g.~ext3, XFS,
bos@224 106 ReiserFS) should work fine.
bos@224 107 \end{itemize}
bos@224 108
bos@224 109 The \hgext{inotify} extension is not yet shipped with Mercurial as of
bos@224 110 May~2007, so it's a little more involved to set up than other
bos@224 111 extensions. But the performance improvement is worth it!
bos@224 112
bos@224 113 The extension currently comes in two parts: a set of patches to the
bos@224 114 Mercurial source code, and a library of Python bindings to the
bos@224 115 \texttt{inotify} subsystem.
bos@224 116 \begin{note}
bos@224 117 There are \emph{two} Python \texttt{inotify} binding libraries. One
bos@224 118 of them is called \texttt{pyinotify}, and is packaged by some Linux
bos@224 119 distributions as \texttt{python-inotify}. This is \emph{not} the
bos@224 120 one you'll need, as it is too buggy and inefficient to be practical.
bos@224 121 \end{note}
bos@224 122 To get going, it's best to already have a functioning copy of
bos@224 123 Mercurial installed.
bos@224 124 \begin{note}
bos@224 125 If you follow the instructions below, you'll be \emph{replacing} and
bos@224 126 overwriting any existing installation of Mercurial that you might
bos@224 127 already have, using the latest ``bleeding edge'' Mercurial code.
bos@224 128 Don't say you weren't warned!
bos@224 129 \end{note}
bos@224 130 \begin{enumerate}
bos@224 131 \item Clone the Python \texttt{inotify} binding repository. Build and
bos@224 132 install it.
bos@224 133 \begin{codesample4}
bos@224 134 hg clone http://hg.kublai.com/python/inotify
bos@224 135 cd inotify
bos@224 136 python setup.py build --force
bos@224 137 sudo python setup.py install --skip-build
bos@224 138 \end{codesample4}
bos@224 139 \item Clone the \dirname{crew} Mercurial repository. Clone the
bos@224 140 \hgext{inotify} patch repository so that Mercurial Queues will be
bos@224 141 able to apply patches to your copy of the \dirname{crew} repository.
bos@224 142 \begin{codesample4}
bos@224 143 hg clone http://hg.intevation.org/mercurial/crew
bos@224 144 hg clone crew inotify
bos@224 145 hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches
bos@224 146 \end{codesample4}
bos@224 147 \item Make sure that you have the Mercurial Queues extension,
bos@224 148 \hgext{mq}, enabled. If you've never used MQ, read
bos@224 149 section~\ref{sec:mq:start} to get started quickly.
bos@224 150 \item Go into the \dirname{inotify} repo, and apply all of the
bos@224 151 \hgext{inotify} patches using the \hgopt{qpush}{-a} option to the
bos@224 152 \hgcmd{qpush} command.
bos@224 153 \begin{codesample4}
bos@224 154 cd inotify
bos@224 155 hg qpush -a
bos@224 156 \end{codesample4}
bos@224 157 If you get an error message from \hgcmd{qpush}, you should not
bos@224 158 continue. Instead, ask for help.
bos@224 159 \item Build and install the patched version of Mercurial.
bos@224 160 \begin{codesample4}
bos@224 161 python setup.py build --force
bos@224 162 sudo python setup.py install --skip-build
bos@224 163 \end{codesample4}
bos@224 164 \end{enumerate}
bos@224 165 Once you've built a suitably patched version of Mercurial, all you
bos@224 166 need to do to enable the \hgext{inotify} extension is add an entry to
bos@224 167 your \hgrc.
bos@224 168 \begin{codesample2}
bos@224 169 [extensions]
bos@224 170 inotify =
bos@224 171 \end{codesample2}
bos@224 172 When the \hgext{inotify} extension is enabled, Mercurial will
bos@224 173 automatically and transparently start the status daemon the first time
bos@224 174 you run a command that needs status in a repository. It runs one
bos@224 175 status daemon per repository.
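
Since the installation steps above also rely on \hgext{mq}, the
\rcsection{extensions} section of your \hgrc\ might end up looking
something like the following. This is only a sketch: it assumes your
version of Mercurial accepts the short name \texttt{mq} (older
documentation often spells it \texttt{hgext.mq}).
\begin{codesample2}
  [extensions]
  mq =
  inotify =
\end{codesample2}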
bos@224 176
bos@224 177 The status daemon is started silently, and runs in the background. If
bos@224 178 you look at a list of running processes after you've enabled the
bos@224 179 \hgext{inotify} extension and run a few commands in different
bos@224 180 repositories, you'll thus see a few \texttt{hg} processes sitting
bos@224 181 around, waiting for updates from the kernel and queries from
bos@224 182 Mercurial.
bos@224 183
bos@224 184 The first time you run a Mercurial command in a repository when you
bos@224 185 have the \hgext{inotify} extension enabled, it will run with about the
bos@224 186 same performance as a normal Mercurial command. This is because the
bos@224 187 status daemon needs to perform a normal status scan so that it has a
bos@224 188 baseline against which to apply later updates from the kernel.
bos@224 189 However, \emph{every} subsequent command that does any kind of status
bos@224 190 check should be noticeably faster on repositories of even fairly
bos@224 191 modest size. Better yet, the bigger your repository is, the greater a
bos@224 192 performance advantage you'll see. The \hgext{inotify} daemon makes
bos@224 193 status operations almost instantaneous on repositories of all sizes!
bos@224 194
bos@224 195 If you like, you can manually start a status daemon using the
bos@224 196 \hgcmd{inserve} command. This gives you slightly finer control over
bos@224 197 how the daemon ought to run. This command will of course only be
bos@224 198 available when the \hgext{inotify} extension is enabled.
bos@224 199
bos@224 200 When you're using the \hgext{inotify} extension, you should notice
bos@224 201 \emph{no difference at all} in Mercurial's behaviour, with the sole
bos@224 202 exception of status-related commands running a whole lot faster than
bos@224 203 they used to. You should specifically expect that commands will not
bos@224 204 print different output; neither should they give different results.
bos@224 205 If either of these situations occurs, please report a bug.
bos@223 206
bos@226 207 \section{Flexible diff support with the \hgext{extdiff} extension}
bos@226 208 \label{sec:hgext:extdiff}
bos@226 209
bos@226 210 Mercurial's built-in \hgcmd{diff} command outputs plaintext unified
bos@226 211 diffs.
bos@226 212 \interaction{extdiff.diff}
bos@226 213 If you would like to use an external tool to display modifications,
bos@226 214 you'll want to use the \hgext{extdiff} extension. This will let you
bos@226 215 use, for example, a graphical diff tool.
bos@226 216
bos@226 217 The \hgext{extdiff} extension is bundled with Mercurial, so it's easy
bos@226 218 to set up. In the \rcsection{extensions} section of your \hgrc,
bos@226 219 simply add a one-line entry to enable the extension.
bos@226 220 \begin{codesample2}
bos@226 221 [extensions]
bos@226 222 extdiff =
bos@226 223 \end{codesample2}
bos@226 224 This introduces a command named \hgcmd{extdiff}, which by default uses
bos@226 225 your system's \command{diff} command to generate a unified diff in the
bos@226 226 same form as the built-in \hgcmd{diff} command.
bos@226 227 \interaction{extdiff.extdiff}
bos@226 228 The result won't be exactly the same as with the built-in \hgcmd{diff}
bos@226 229 variations, because the output of \command{diff} varies from one
bos@226 230 system to another, even when passed the same options.
bos@226 231
bos@226 232 As the ``\texttt{making snapshot}'' lines of output above imply, the
bos@226 233 \hgcmd{extdiff} command works by creating two snapshots of your source
bos@226 234 tree. The first snapshot is of the source revision; the second, of
bos@226 235 the target revision or working directory. The \hgcmd{extdiff} command
bos@226 236 generates these snapshots in a temporary directory, passes the name of
bos@226 237 each directory to an external diff viewer, then deletes the temporary
bos@226 238 directory. For efficiency, it only snapshots the directories and
bos@226 239 files that have changed between the two revisions.
bos@226 240
bos@226 241 Snapshot directory names have the same base name as your repository.
bos@226 242 If your repository path is \dirname{/quux/bar/foo}, then \dirname{foo}
bos@226 243 will be the name of each snapshot directory. Each snapshot directory
bos@226 244 name has its changeset ID appended, if appropriate. If a snapshot is
bos@226 245 of revision \texttt{a631aca1083f}, the directory will be named
bos@226 246 \dirname{foo.a631aca1083f}. A snapshot of the working directory won't
bos@226 247 have a changeset ID appended, so it would just be \dirname{foo} in
bos@226 248 this example. To see what this looks like in practice, look again at
bos@226 249 the \hgcmd{extdiff} example above. Notice that the diff has the
bos@226 250 snapshot directory names embedded in its header.
bos@226 251
bos@226 252 The \hgcmd{extdiff} command accepts two important options. The
bos@226 253 \hgopt{extdiff}{-p} option lets you choose a program to view
bos@226 254 differences with, instead of \command{diff}. With the
bos@226 255 \hgopt{extdiff}{-o} option, you can change the options that
bos@226 256 \hgcmd{extdiff} passes to the program (by default, these options are
bos@226 257 ``\texttt{-Npru}'', which only make sense if you're running
bos@226 258 \command{diff}). In other respects, the \hgcmd{extdiff} acts
bos@226 259 similarly to the built-in \hgcmd{diff} command: you use the same
bos@226 260 option names, syntax, and arguments to specify the revisions you want,
bos@226 261 the files you want, and so on.
bos@226 262
bos@226 263 As an example, here's how to run the normal system \command{diff}
bos@226 264 command, getting it to generate context diffs (using the
bos@226 265 \cmdopt{diff}{-c} option) instead of unified diffs, and five lines of
bos@226 266 context instead of the default three (passing \texttt{5} as the
bos@226 267 argument to the \cmdopt{diff}{-C} option).
bos@226 268 \interaction{extdiff.extdiff-ctx}
bos@226 269
bos@226 270 Launching a visual diff tool is just as easy. Here's how to launch
bos@226 271 the \command{kdiff3} viewer.
bos@226 272 \begin{codesample2}
bos@226 273 hg extdiff -p kdiff3 -o ''
bos@226 274 \end{codesample2}
bos@226 275
bos@226 276 If your diff viewing command can't deal with directories, you can
bos@226 277 easily work around this with a little scripting. For an example of
bos@226 278 such scripting in action with the \hgext{mq} extension and the
bos@226 279 \command{interdiff} command, see
bos@226 280 section~\ref{mq-collab:tips:interdiff}.
bos@226 281
bos@226 282 \subsection{Defining command aliases}
bos@226 283
bos@226 284 It can be cumbersome to remember the options to both the
bos@226 285 \hgcmd{extdiff} command and the diff viewer you want to use, so the
bos@226 286 \hgext{extdiff} extension lets you define \emph{new} commands that
bos@226 287 will invoke your diff viewer with exactly the right options.
bos@226 288
bos@226 289 All you need to do is edit your \hgrc, and add a section named
bos@226 290 \rcsection{extdiff}. Inside this section, you can define multiple
bos@226 291 commands. Here's how to add a \texttt{kdiff3} command. Once you've
bos@226 292 defined this, you can type ``\texttt{hg kdiff3}'' and the
bos@226 293 \hgext{extdiff} extension will run \command{kdiff3} for you.
bos@226 294 \begin{codesample2}
bos@226 295 [extdiff]
bos@226 296 cmd.kdiff3 =
bos@226 297 \end{codesample2}
bos@226 298 If you leave the right hand side of the definition empty, as above,
bos@226 299 the \hgext{extdiff} extension uses the name of the command you defined
bos@226 300 as the name of the external program to run. But these names don't
bos@226 301 have to be the same. Here, we define a command named ``\texttt{hg
bos@226 302 wibble}'', which runs \command{kdiff3}.
bos@226 303 \begin{codesample2}
bos@226 304 [extdiff]
bos@226 305 cmd.wibble = kdiff3
bos@226 306 \end{codesample2}
bos@226 307
bos@226 308 You can also specify the default options that you want to invoke your
bos@226 309 diff viewing program with. The prefix to use is ``\texttt{opts.}'',
bos@226 310 followed by the name of the command to which the options apply. This
bos@226 311 example defines a ``\texttt{hg vimdiff}'' command that runs the
bos@226 312 \command{vim} editor's \texttt{DirDiff} extension.
bos@226 313 \begin{codesample2}
bos@226 314 [extdiff]
bos@226 315 cmd.vimdiff = vim
bos@226 316 opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)'
bos@226 317 \end{codesample2}
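
Along the same lines, the context-diff invocation shown earlier could
be wrapped up as a command of its own. This is a sketch only: the
command name \texttt{ctxdiff} is made up for illustration, and the
options assume a \command{diff} that accepts \texttt{-N}, \texttt{-p},
\texttt{-r}, \cmdopt{diff}{-c} and \cmdopt{diff}{-C}, as GNU
\command{diff} does.
\begin{codesample2}
  [extdiff]
  cmd.ctxdiff = diff
  opts.ctxdiff = -N -p -r -c -C 5
\end{codesample2}
With this in place, ``\texttt{hg ctxdiff}'' should behave like
\hgcmd{extdiff} run with those options.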
bos@226 318
bos@223 319 %%% Local Variables:
bos@223 320 %%% mode: latex
bos@223 321 %%% TeX-master: "00book"
bos@223 322 %%% End: