\chapter{Adding functionality with extensions}
\label{chap:hgext}

While the core of Mercurial is quite complete from a functionality
standpoint, it's deliberately shorn of fancy features.  This approach
of preserving simplicity keeps the software easy to deal with for both
maintainers and users.

However, Mercurial doesn't box you in with an inflexible command set:
you can add features to it as \emph{extensions} (sometimes known as
\emph{plugins}).  We've already discussed a few of these extensions in
earlier chapters.
\begin{itemize}
\item Section~\ref{sec:tour-merge:fetch} covers the \hgext{fetch}
  extension; this combines pulling new changes and merging them with
  local changes into a single command, \hgxcmd{fetch}{fetch}.
\item The \hgext{bisect} extension adds an efficient pruning search
  for changes that introduced bugs, and we documented it in
  section~\ref{sec:undo:bisect}.
\item In chapter~\ref{chap:hook}, we covered several extensions that
  are useful for hook-related functionality: \hgext{acl} adds access
  control lists; \hgext{bugzilla} adds integration with the Bugzilla
  bug tracking system; and \hgext{notify} sends notification emails on
  new changes.
\item The Mercurial Queues patch management extension is so invaluable
  that it merits two chapters and an appendix all to itself.
  Chapter~\ref{chap:mq} covers the basics;
  chapter~\ref{chap:mq-collab} discusses advanced topics; and
  appendix~\ref{chap:mqref} goes into detail on each command.
\end{itemize}

In this chapter, we'll cover some of the other extensions that are
available for Mercurial, and briefly touch on some of the machinery
you'll need to know about if you want to write an extension of your
own.
\begin{itemize}
\item In section~\ref{sec:hgext:inotify}, we'll discuss the
  possibility of \emph{huge} performance improvements using the
  \hgext{inotify} extension.
\end{itemize}

\section{Improve performance with the \hgext{inotify} extension}
\label{sec:hgext:inotify}

Are you interested in having some of the most common Mercurial
operations run as much as a hundred times faster?  Read on!

Mercurial performs well under normal circumstances, but some of its
work grows with the size of your repository.  For example, when you
run the \hgcmd{status} command, Mercurial has to scan almost every
directory and file in your repository so that it can display file
status.  Many other Mercurial commands need to do the same work
behind the scenes; for example, the \hgcmd{diff} command uses the
status machinery to avoid doing an expensive comparison operation on
files that obviously haven't changed.

Because obtaining file status is crucial to good performance, the
authors of Mercurial have optimised this code to within an inch of its
life.  However, there's no avoiding the fact that when you run
\hgcmd{status}, Mercurial is going to have to perform at least one
expensive system call for each managed file to determine whether it's
changed since the last time Mercurial checked.  For a sufficiently
large repository, this can take a long time.

To put a number on the magnitude of this effect, I created a
repository containing 150,000 managed files.
I timed \hgcmd{status}
as taking ten seconds to run, even when \emph{none} of those files had
been modified.

Many modern operating systems contain a file notification facility.
If a program signs up to an appropriate service, the operating system
will notify it every time a file of interest is created, modified, or
deleted.  On Linux systems, the kernel component that does this is
called \texttt{inotify}.

Mercurial's \hgext{inotify} extension talks to the kernel's
\texttt{inotify} component to optimise \hgcmd{status} commands.  The
extension has two components.  A daemon sits in the background and
receives notifications from the \texttt{inotify} subsystem.  It also
listens for connections from a regular Mercurial command.  The
extension modifies Mercurial's behaviour so that instead of scanning
the filesystem, it queries the daemon.  Since the daemon has perfect
information about the state of the repository, it can respond with a
result instantaneously, avoiding the need to scan every directory and
file in the repository.

Recall the ten seconds that I measured plain Mercurial as taking to
run \hgcmd{status} on a 150,000 file repository.  With the
\hgext{inotify} extension enabled, the time dropped to 0.1~seconds, a
factor of \emph{one hundred} faster.

Before we continue, please pay attention to some caveats.
\begin{itemize}
\item The \hgext{inotify} extension is Linux-specific.  Because it
  interfaces directly to the Linux kernel's \texttt{inotify}
  subsystem, it does not work on other operating systems.
\item It should work on any Linux distribution that was released after
  early~2005.
  Older distributions are likely to have a kernel that
  lacks \texttt{inotify}, or a version of \texttt{glibc} that does not
  have the necessary interfacing support.
\item Not all filesystems are suitable for use with the
  \hgext{inotify} extension.  Network filesystems such as NFS are a
  non-starter, for example, particularly if you're running Mercurial
  on several systems, all mounting the same network filesystem.  The
  kernel's \texttt{inotify} system has no way of knowing about changes
  made on another system.  Most local filesystems (e.g.~ext3, XFS,
  ReiserFS) should work fine.
\end{itemize}

The \hgext{inotify} extension is not yet shipped with Mercurial as of
May~2007, so it's a little more involved to set up than other
extensions.  But the performance improvement is worth it!

The extension currently comes in two parts: a set of patches to the
Mercurial source code, and a library of Python bindings to the
\texttt{inotify} subsystem.
\begin{note}
  There are \emph{two} Python \texttt{inotify} binding libraries.  One
  of them is called \texttt{pyinotify}, and is packaged by some Linux
  distributions as \texttt{python-inotify}.  This is \emph{not} the
  one you'll need, as it is too buggy and inefficient to be practical.
\end{note}
To get going, it's best to already have a functioning copy of
Mercurial installed.
\begin{note}
  If you follow the instructions below, you'll be \emph{replacing} and
  overwriting any existing installation of Mercurial that you might
  already have, using the latest ``bleeding edge'' Mercurial code.
  Don't say you weren't warned!
\end{note}
\begin{enumerate}
\item Clone the Python \texttt{inotify} binding repository.  Build and
  install it.
\begin{codesample4}
hg clone http://hg.kublai.com/python/inotify
cd inotify
python setup.py build --force
sudo python setup.py install --skip-build
\end{codesample4}
\item Clone the \dirname{crew} Mercurial repository.  Clone the
  \hgext{inotify} patch repository so that Mercurial Queues will be
  able to apply patches to your copy of the \dirname{crew} repository.
\begin{codesample4}
hg clone http://hg.intevation.org/mercurial/crew
hg clone crew inotify
hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches
\end{codesample4}
\item Make sure that you have the Mercurial Queues extension,
  \hgext{mq}, enabled.  If you've never used MQ, read
  section~\ref{sec:mq:start} to get started quickly.
\item Go into the \dirname{inotify} repo, and apply all of the
  \hgext{inotify} patches using the \hgxopt{mq}{qpush}{-a} option to
  the \hgxcmd{mq}{qpush} command.
\begin{codesample4}
cd inotify
hg qpush -a
\end{codesample4}
  If you get an error message from \hgxcmd{mq}{qpush}, you should not
  continue.  Instead, ask for help.
\item Build and install the patched version of Mercurial.
\begin{codesample4}
python setup.py build --force
sudo python setup.py install --skip-build
\end{codesample4}
\end{enumerate}
Once you've built a suitably patched version of Mercurial, all you
need to do to enable the \hgext{inotify} extension is add an entry to
your \hgrc.
\begin{codesample2}
[extensions]
inotify =
\end{codesample2}
When the \hgext{inotify} extension is enabled, Mercurial will
automatically and transparently start the status daemon the first time
you run a command that needs status in a repository.
It runs one
status daemon per repository.

The status daemon is started silently, and runs in the background.  If
you look at a list of running processes after you've enabled the
\hgext{inotify} extension and run a few commands in different
repositories, you'll thus see a few \texttt{hg} processes sitting
around, waiting for updates from the kernel and queries from
Mercurial.

The first time you run a Mercurial command in a repository when you
have the \hgext{inotify} extension enabled, it will run with about the
same performance as a normal Mercurial command.  This is because the
status daemon needs to perform a normal status scan so that it has a
baseline against which to apply later updates from the kernel.
However, \emph{every} subsequent command that does any kind of status
check should be noticeably faster on repositories of even fairly
modest size.  Better yet, the bigger your repository is, the greater a
performance advantage you'll see.  The \hgext{inotify} daemon makes
status operations almost instantaneous on repositories of all sizes!

If you like, you can manually start a status daemon using the
\hgxcmd{inotify}{inserve} command.  This gives you slightly finer
control over how the daemon ought to run.  This command will of course
only be available when the \hgext{inotify} extension is enabled.

When you're using the \hgext{inotify} extension, you should notice
\emph{no difference at all} in Mercurial's behaviour, with the sole
exception of status-related commands running a whole lot faster than
they used to.  You should specifically expect that commands will not
print different output; neither should they give different results.
If either of these situations occurs, please report a bug.
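
If you'd like to check the speedup on your own repository, a rough
timing script along the following lines may help.  This is only a
sketch, and is not part of Mercurial or of the \hgext{inotify}
extension: the \texttt{time\_command} helper is a name invented for
this example, and the script assumes Python~3, an \command{hg}
executable on your path, and that you run it from inside a repository.
With the extension enabled, the first run should take about as long as
a normal \hgcmd{status}, since the daemon has to perform its baseline
scan; later runs are the ones to compare.

```python
import shutil
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the output: only the elapsed time is of interest here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Time the command given on the command line, or "hg status" by default.
    command = sys.argv[1:] or ["hg", "status"]
    if shutil.which(command[0]) is None:
        print("{} not found on PATH".format(command[0]))
    else:
        for number, seconds in enumerate(time_command(command), start=1):
            print("run {}: {:.2f} seconds".format(number, seconds))
```

Run the script once with the \hgext{inotify} entry commented out of
your \hgrc, and once with it enabled, to see how the two compare on
your own hardware and filesystem.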

\section{Flexible diff support with the \hgext{extdiff} extension}
\label{sec:hgext:extdiff}

Mercurial's built-in \hgcmd{diff} command outputs plaintext unified
diffs.
\interaction{extdiff.diff}
If you would like to use an external tool to display modifications,
you'll want to use the \hgext{extdiff} extension.  This will let you
use, for example, a graphical diff tool.

The \hgext{extdiff} extension is bundled with Mercurial, so it's easy
to set up.  In the \rcsection{extensions} section of your \hgrc,
simply add a one-line entry to enable the extension.
\begin{codesample2}
[extensions]
extdiff =
\end{codesample2}
This introduces a command named \hgxcmd{extdiff}{extdiff}, which by
default uses your system's \command{diff} command to generate a
unified diff in the same form as the built-in \hgcmd{diff} command.
\interaction{extdiff.extdiff}
The result won't be exactly the same as with the built-in \hgcmd{diff}
command, because the output of \command{diff} varies from one
system to another, even when passed the same options.

As the ``\texttt{making snapshot}'' lines of output above imply, the
\hgxcmd{extdiff}{extdiff} command works by creating two snapshots of
your source tree.  The first snapshot is of the source revision; the
second, of the target revision or working directory.  The
\hgxcmd{extdiff}{extdiff} command generates these snapshots in a
temporary directory, passes the name of each directory to an external
diff viewer, then deletes the temporary directory.  For efficiency, it
only snapshots the directories and files that have changed between the
two revisions.

Snapshot directory names have the same base name as your repository.
If your repository path is \dirname{/quux/bar/foo}, then \dirname{foo}
will be the name of each snapshot directory.  Each snapshot directory
name has its changeset ID appended, if appropriate.  If a snapshot is
of revision \texttt{a631aca1083f}, the directory will be named
\dirname{foo.a631aca1083f}.  A snapshot of the working directory won't
have a changeset ID appended, so it would just be \dirname{foo} in
this example.  To see what this looks like in practice, look again at
the \hgxcmd{extdiff}{extdiff} example above.  Notice that the diff has
the snapshot directory names embedded in its header.

The \hgxcmd{extdiff}{extdiff} command accepts two important options.
The \hgxopt{extdiff}{extdiff}{-p} option lets you choose a program to
view differences with, instead of \command{diff}.  With the
\hgxopt{extdiff}{extdiff}{-o} option, you can change the options that
\hgxcmd{extdiff}{extdiff} passes to the program (by default, these
options are ``\texttt{-Npru}'', which only make sense if you're
running \command{diff}).  In other respects, the
\hgxcmd{extdiff}{extdiff} command acts similarly to the built-in
\hgcmd{diff} command: you use the same option names, syntax, and
arguments to specify the revisions you want, the files you want, and
so on.

As an example, here's how to run the normal system \command{diff}
command, getting it to generate context diffs (using the
\cmdopt{diff}{-c} option) instead of unified diffs, and five lines of
context instead of the default three (passing \texttt{5} as the
argument to the \cmdopt{diff}{-C} option).
\interaction{extdiff.extdiff-ctx}

Launching a visual diff tool is just as easy.  Here's how to launch
the \command{kdiff3} viewer.
\begin{codesample2}
hg extdiff -p kdiff3 -o ''
\end{codesample2}

If your diff viewing command can't deal with directories, you can
easily work around this with a little scripting.  For an example of
such scripting in action with the \hgext{mq} extension and the
\command{interdiff} command, see
section~\ref{mq-collab:tips:interdiff}.

\subsection{Defining command aliases}

It can be cumbersome to remember the options to both the
\hgxcmd{extdiff}{extdiff} command and the diff viewer you want to use,
so the \hgext{extdiff} extension lets you define \emph{new} commands
that will invoke your diff viewer with exactly the right options.

All you need to do is edit your \hgrc, and add a section named
\rcsection{extdiff}.  Inside this section, you can define multiple
commands.  Here's how to add a \texttt{kdiff3} command.  Once you've
defined this, you can type ``\texttt{hg kdiff3}'' and the
\hgext{extdiff} extension will run \command{kdiff3} for you.
\begin{codesample2}
[extdiff]
cmd.kdiff3 =
\end{codesample2}
If you leave the right hand side of the definition empty, as above,
the \hgext{extdiff} extension uses the name of the command you defined
as the name of the external program to run.  But these names don't
have to be the same.  Here, we define a command named ``\texttt{hg
  wibble}'', which runs \command{kdiff3}.
\begin{codesample2}
[extdiff]
cmd.wibble = kdiff3
\end{codesample2}

You can also specify the default options that you want to invoke your
diff viewing program with.  The prefix to use is ``\texttt{opts.}'',
followed by the name of the command to which the options apply.
This
example defines a ``\texttt{hg vimdiff}'' command that runs the
\command{vim} editor's \texttt{DirDiff} extension.
\begin{codesample2}
[extdiff]
cmd.vimdiff = vim
opts.vimdiff = -f '+next' '+execute "DirDiff" argv(0) argv(1)'
\end{codesample2}

\section{Cherrypicking changes with the \hgext{transplant} extension}
\label{sec:hgext:transplant}

Need to have a long chat with Brendan about this.

\section{Send changes via email with the \hgext{patchbomb} extension}
\label{sec:hgext:patchbomb}

Many projects have a culture of ``change review'', in which people
send their modifications to a mailing list for others to read and
comment on before they commit the final version to a shared
repository.  Some projects have people who act as gatekeepers; they
apply changes from other people to a repository to which those others
don't have access.

Mercurial makes it easy to send changes over email for review or
application, via its \hgext{patchbomb} extension.  The extension is so
named because changes are formatted as patches, and it's usual to send
one changeset per email message.  Sending a long series of changes by
email is thus much like ``bombing'' the recipient's inbox, hence
``patchbomb''.

As usual, the basic configuration of the \hgext{patchbomb} extension
takes just one or two lines in your \hgrc.
\begin{codesample2}
[extensions]
patchbomb =
\end{codesample2}
Once you've enabled the extension, you will have a new command
available, named \hgxcmd{patchbomb}{email}.


%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: