\chapter{Adding functionality with extensions}
\label{chap:hgext}

While the core of Mercurial is quite complete from a functionality
standpoint, it's deliberately shorn of fancy features.  This approach
of preserving simplicity keeps the software easy to deal with for both
maintainers and users.

However, Mercurial doesn't box you in with an inflexible command set:
you can add features to it as \emph{extensions} (sometimes known as
\emph{plugins}).  We've already discussed a few of these extensions in
earlier chapters.
\begin{itemize}
\item Section~\ref{sec:tour-merge:fetch} covers the \hgext{fetch}
  extension; this combines pulling new changes and merging them with
  local changes into a single command, \hgcmd{fetch}.
\item The \hgext{bisect} extension adds an efficient pruning search
  for changes that introduced bugs, and we documented it in
  section~\ref{sec:undo:bisect}.
\item In chapter~\ref{chap:hook}, we covered several extensions that
  are useful for hook-related functionality: \hgext{acl} adds access
  control lists; \hgext{bugzilla} adds integration with the Bugzilla
  bug tracking system; and \hgext{notify} sends notification emails on
  new changes.
\item The Mercurial Queues patch management extension is so invaluable
  that it merits two chapters and an appendix all to itself.
  Chapter~\ref{chap:mq} covers the basics;
  chapter~\ref{chap:mq-collab} discusses advanced topics; and
  appendix~\ref{chap:mqref} goes into detail on each command.
\end{itemize}

In this chapter, we'll cover some of the other extensions that are
available for Mercurial, and briefly touch on some of the machinery
you'll need to know about if you want to write an extension of your
own.
\begin{itemize}
\item In section~\ref{sec:hgext:inotify}, we'll discuss the
  possibility of \emph{huge} performance improvements using the
  \hgext{inotify} extension.
\end{itemize}

\section{Improve performance with the \hgext{inotify} extension}
\label{sec:hgext:inotify}

Are you interested in having some of the most common Mercurial
operations run as much as a hundred times faster?  Read on!

Mercurial has great performance under normal circumstances, but some
of its work is unavoidably expensive.  For example, when you run the
\hgcmd{status} command, Mercurial has to scan almost every directory
and file in your repository so that it can display file status.  Many
other Mercurial commands need to do the same work behind the scenes;
for example, the \hgcmd{diff} command uses the status machinery to
avoid doing an expensive comparison operation on files that obviously
haven't changed.

Because obtaining file status is crucial to good performance, the
authors of Mercurial have optimised this code to within an inch of its
life.  However, there's no avoiding the fact that when you run
\hgcmd{status}, Mercurial is going to have to perform at least one
expensive system call for each managed file to determine whether it's
changed since the last time Mercurial checked.  For a sufficiently
large repository, this can take a long time.

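To make the per-file cost concrete, here is a minimal sketch of a
status scan.  It is a simplified illustration and \emph{not}
Mercurial's actual code: the function names \texttt{record} and
\texttt{scan\_status} are hypothetical, and comparing only size and
modification time is an assumption made to keep the example short.
The point to notice is that \texttt{os.lstat} is called once for every
managed file, whether or not that file has changed.

```python
import os


def record(root, relpaths):
    """Build a baseline mapping each path to its (size, mtime_ns) pair."""
    baseline = {}
    for relpath in relpaths:
        st = os.lstat(os.path.join(root, relpath))
        baseline[relpath] = (st.st_size, st.st_mtime_ns)
    return baseline


def scan_status(root, baseline):
    """Return the paths that differ from the recorded baseline.

    One lstat() system call is made per managed file, so the cost grows
    with the number of files even when nothing has been modified.
    """
    changed = []
    for relpath, recorded in baseline.items():
        try:
            st = os.lstat(os.path.join(root, relpath))
        except FileNotFoundError:
            changed.append(relpath)  # the file has been deleted
            continue
        if (st.st_size, st.st_mtime_ns) != recorded:
            changed.append(relpath)  # size or timestamp differs
    return changed
```

Calling \texttt{scan\_status} on an untouched tree returns an empty
list, but only after it has visited every entry in the baseline.
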
To put a number on the magnitude of this effect, I created a
repository containing 150,000 managed files.  I timed \hgcmd{status}
as taking ten seconds to run, even when \emph{none} of those files had
been modified.

Many modern operating systems contain a file notification facility.
If a program signs up to an appropriate service, the operating system
will notify it every time a file of interest is created, modified, or
deleted.  On Linux systems, the kernel component that does this is
called \texttt{inotify}.

Mercurial's \hgext{inotify} extension talks to the kernel's
\texttt{inotify} component to optimise \hgcmd{status} commands.  The
extension has two components.  The first is a daemon that sits in the
background and receives notifications from the \texttt{inotify}
subsystem.  It also listens for connections from regular Mercurial
commands.  The second modifies Mercurial's behaviour so that instead
of scanning the filesystem, a command queries the daemon.  Since the
daemon has perfect information about the state of the repository, it
can respond with a result almost instantaneously, avoiding the need to
scan every directory and file in the repository.

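The division of labour between the daemon and a command can be
sketched with a toy in-memory model.  This is an illustration under
stated assumptions, not the extension's implementation: the class
\texttt{StatusCache} and its methods are hypothetical names, and the
sketch leaves out both the kernel interface and the connection between
a command and the daemon.  It shows only why a query is cheap: the
answer is already in memory and is kept up to date one notification at
a time.

```python
class StatusCache:
    """Toy model of the state a status daemon could keep in memory."""

    def __init__(self, baseline):
        # `baseline` holds the paths found modified by one initial
        # full scan of the repository.
        self.dirty = set(baseline)

    def on_event(self, path):
        """Record a change notification for `path`."""
        self.dirty.add(path)

    def query(self):
        """Answer a status query without touching the filesystem."""
        return sorted(self.dirty)


cache = StatusCache(baseline=[])
cache.on_event("src/a.c")
cache.on_event("src/a.c")  # repeated events for one file collapse
cache.on_event("README")
print(cache.query())
```

The cost of \texttt{query} depends on the number of changed files, not
on the total number of files in the repository.
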
Recall the ten seconds that I measured plain Mercurial as taking to
run \hgcmd{status} on a 150,000-file repository.  With the
\hgext{inotify} extension enabled, the time dropped to 0.1~seconds, a
factor of \emph{one hundred} faster.

Before we continue, please pay attention to some caveats.
\begin{itemize}
\item The \hgext{inotify} extension is Linux-specific.  Because it
  interfaces directly with the Linux kernel's \texttt{inotify}
  subsystem, it does not work on other operating systems.
\item It should work on any Linux distribution that was released after
  early~2005.  Older distributions are likely to have a kernel that
  lacks \texttt{inotify}, or a version of \texttt{glibc} that does not
  have the necessary interfacing support.
\item Not all filesystems are suitable for use with the
  \hgext{inotify} extension.  Network filesystems such as NFS are a
  non-starter, for example, particularly if you're running Mercurial
  on several systems, all mounting the same network filesystem.  The
  kernel's \texttt{inotify} system has no way of knowing about changes
  made on another system.  Most local filesystems (e.g.~ext3, XFS,
  ReiserFS) should work fine.
\end{itemize}

The \hgext{inotify} extension is not yet shipped with Mercurial as of
May~2007, so it's a little more involved to set up than other
extensions.  But the performance improvement is worth it!

The extension currently comes in two parts: a set of patches to the
Mercurial source code, and a library of Python bindings to the
\texttt{inotify} subsystem.
\begin{note}
  There are \emph{two} Python \texttt{inotify} binding libraries.  One
  of them is called \texttt{pyinotify}, and is packaged by some Linux
  distributions as \texttt{python-inotify}.  This is \emph{not} the
  one you'll need, as it is too buggy and inefficient to be practical.
\end{note}
To get going, it's best to already have a functioning copy of
Mercurial installed.
\begin{note}
  If you follow the instructions below, you'll be \emph{replacing} and
  overwriting any existing installation of Mercurial that you might
  already have, using the latest ``bleeding edge'' Mercurial code.
  Don't say you weren't warned!
\end{note}
\begin{enumerate}
\item Clone the Python \texttt{inotify} binding repository.  Build and
  install it.
  \begin{codesample4}
    hg clone http://hg.kublai.com/python/inotify
    cd inotify
    python setup.py build --force
    sudo python setup.py install --skip-build
  \end{codesample4}
\item Clone the \dirname{crew} Mercurial repository.  Clone the
  \hgext{inotify} patch repository so that Mercurial Queues will be
  able to apply patches to your copy of the \dirname{crew} repository.
  \begin{codesample4}
    hg clone http://hg.intevation.org/mercurial/crew
    hg clone crew inotify
    hg clone http://hg.kublai.com/mercurial/patches/inotify inotify/.hg/patches
  \end{codesample4}
\item Make sure that you have the Mercurial Queues extension,
  \hgext{mq}, enabled.  If you've never used MQ, read
  section~\ref{sec:mq:start} to get started quickly.
\item Go into the \dirname{inotify} repo, and apply all of the
  \hgext{inotify} patches using the \hgopt{qpush}{-a} option to the
  \hgcmd{qpush} command.
  \begin{codesample4}
    cd inotify
    hg qpush -a
  \end{codesample4}
  If you get an error message from \hgcmd{qpush}, you should not
  continue.  Instead, ask for help.
\item Build and install the patched version of Mercurial.
  \begin{codesample4}
    python setup.py build --force
    sudo python setup.py install --skip-build
  \end{codesample4}
\end{enumerate}
Once you've built a suitably patched version of Mercurial, all you
need to do to enable the \hgext{inotify} extension is add an entry to
your \hgrc.
\begin{codesample2}
  [extensions]
  inotify =
\end{codesample2}
When the \hgext{inotify} extension is enabled, Mercurial will
automatically and transparently start the status daemon the first time
you run a command that needs status in a repository.  It runs one
status daemon per repository.

The status daemon is started silently, and runs in the background.  If
you look at a list of running processes after you've enabled the
\hgext{inotify} extension and run a few commands in different
repositories, you'll see a few \texttt{hg} processes sitting around,
waiting for updates from the kernel and queries from Mercurial.

The first time you run a Mercurial command in a repository when you
have the \hgext{inotify} extension enabled, it will run with about the
same performance as a normal Mercurial command.  This is because the
status daemon needs to perform a normal status scan so that it has a
baseline against which to apply later updates from the kernel.
However, \emph{every} subsequent command that does any kind of status
check should be noticeably faster on repositories of even fairly
modest size.  Better yet, the bigger your repository is, the greater a
performance advantage you'll see.  The \hgext{inotify} daemon makes
status operations almost instantaneous on repositories of all sizes!

If you like, you can manually start a status daemon using the
\hgcmd{inserve} command.  This gives you slightly finer control over
how the daemon ought to run.  This command will of course only be
available when the \hgext{inotify} extension is enabled.

When you're using the \hgext{inotify} extension, you should notice
\emph{no difference at all} in Mercurial's behaviour, with the sole
exception of status-related commands running a whole lot faster than
they used to.  You should specifically expect that commands will not
print different output; neither should they give different results.
If either of these situations occurs, please report a bug.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: