hgbook: en/mq-collab.tex annotate

annotate en/mq-collab.tex @ 307:fb5c0d56d7f1

Fix test 'tour'.

Executing 'tour' test now creates some files in /tmp to store the
revision numbers as they are created on the fly and appear in the output
files. When SVG files are to be converted to PNG or EPS files within the
Makefile, a tool 'fixsvg' will be invoked to substitute some placeholder
markup by the real version number which fits to the test output, before
the final conversion takes place.

author	Guido Ostkamp <hg@ostkamp.fastmail.fm>
date	Wed Aug 20 22:15:35 2008 +0200 (2008-08-20)
parents	4119e57679f7
children	5561812fc5c9

rev	line source
bos@104	1 \chapter{Advanced uses of Mercurial Queues}
bos@224	2 \label{chap:mq-collab}
bos@104	3
bos@104	4 While it's easy to pick up straightforward uses of Mercurial Queues,
bos@104	5 use of a little discipline and some of MQ's less frequently used
bos@104	6 capabilities makes it possible to work in complicated development
bos@104	7 environments.
bos@104	8
bos@105	9 In this chapter, I will use as an example a technique I have used to
bos@105	10 manage the development of an Infiniband device driver for the Linux
bos@105	11 kernel. The driver in question is large (at least as drivers go),
bos@105	12 with 25,000 lines of code spread across 35 source files. It is
bos@105	13 maintained by a small team of developers.
bos@104	14
bos@104	15 While much of the material in this chapter is specific to Linux, the
bos@104	16 same principles apply to any code base for which you're not the
bos@104	17 primary owner, and upon which you need to do a lot of development.
bos@104	18
bos@104	19 \section{The problem of many targets}
bos@104	20
bos@104	21 The Linux kernel changes rapidly, and has never been internally
bos@104	22 stable; developers frequently make drastic changes between releases.
bos@104	23 This means that a version of the driver that works well with a
bos@104	24 particular released version of the kernel will not even \emph{compile}
bos@104	25 correctly against, typically, any other version.
bos@104	26
bos@104	27 To maintain a driver, we have to keep a number of distinct versions of
bos@104	28 Linux in mind.
bos@104	29 \begin{itemize}
bos@104	30 \item One target is the main Linux kernel development tree.
bos@104	31 Maintenance of the code is in this case partly shared by other
bos@104	32 developers in the kernel community, who make ``drive-by''
bos@104	33 modifications to the driver as they develop and refine kernel
bos@104	34 subsystems.
bos@104	35 \item We also maintain a number of ``backports'' to older versions of
bos@104	36 the Linux kernel, to support the needs of customers who are running
bos@105	37 older Linux distributions that do not incorporate our drivers. (To
bos@105	38 \emph{backport} a piece of code is to modify it to work in an older
bos@105	39 version of its target environment than the version it was developed
bos@105	40 for.)
bos@104	41 \item Finally, we make software releases on a schedule that is
bos@104	42 necessarily not aligned with those used by Linux distributors and
bos@104	43 kernel developers, so that we can deliver new features to customers
bos@104	44 without forcing them to upgrade their entire kernels or
bos@104	45 distributions.
bos@104	46 \end{itemize}
bos@104	47
bos@104	48 \subsection{Tempting approaches that don't work well}
bos@104	49
bos@104	50 There are two ``standard'' ways to maintain a piece of software that
bos@104	51 has to target many different environments.
bos@104	52
bos@104	53 The first is to maintain a number of branches, each intended for a
bos@104	54 single target. The trouble with this approach is that you must
bos@104	55 maintain iron discipline in the flow of changes between repositories.
bos@104	56 A new feature or bug fix must start life in a ``pristine'' repository,
bos@104	57 then percolate out to every backport repository. Backport changes are
bos@104	58 more limited in the branches they should propagate to; a backport
bos@104	59 change that is applied to a branch where it doesn't belong will
bos@104	60 probably stop the driver from compiling.
bos@104	61
bos@104	62 The second is to maintain a single source tree filled with conditional
bos@104	63 statements that turn chunks of code on or off depending on the
bos@104	64 intended target. Because these ``ifdefs'' are not allowed in the
bos@104	65 Linux kernel tree, a manual or automatic process must be followed to
bos@104	66 strip them out and yield a clean tree. A code base maintained in this
bos@104	67 fashion rapidly becomes a rat's nest of conditional blocks that are
bos@104	68 difficult to understand and maintain.
bos@104	69
bos@104	70 Neither of these approaches is well suited to a situation where you
bos@104	71 don't ``own'' the canonical copy of a source tree. In the case of a
bos@104	72 Linux driver that is distributed with the standard kernel, Linus's
bos@104	73 tree contains the copy of the code that will be treated by the world
bos@104	74 as canonical. The upstream version of ``my'' driver can be modified
bos@104	75 by people I don't know, without me even finding out about it until
bos@104	76 after the changes show up in Linus's tree.
bos@104	77
bos@104	78 These approaches have the added weakness of making it difficult to
bos@104	79 generate well-formed patches to submit upstream.
bos@104	80
bos@104	81 In principle, Mercurial Queues seems like a good candidate to manage a
bos@104	82 development scenario such as the above. While this is indeed the
bos@104	83 case, MQ contains a few added features that make the job more
bos@104	84 pleasant.
bos@104	85
bos@105	86 \section{Conditionally applying patches with
bos@105	87 guards}
bos@104	88
bos@104	89 Perhaps the best way to maintain sanity with so many targets is to be
bos@104	90 able to choose specific patches to apply for a given situation. MQ
bos@104	91 provides a feature called ``guards'' (which originates with quilt's
bos@104	92 \texttt{guards} command) that does just this. To start off, let's
bos@104	93 create a simple repository for experimenting in.
bos@104	94 \interaction{mq.guards.init}
bos@104	95 This gives us a tiny repository that contains two patches that don't
bos@104	96 have any dependencies on each other, because they touch different files.
bos@104	97
bos@104	98 The idea behind conditional application is that you can ``tag'' a
bos@104	99 patch with a \emph{guard}, which is simply a text string of your
bos@104	100 choosing, then tell MQ to select specific guards to use when applying
bos@104	101 patches. MQ will then either apply, or skip over, a guarded patch,
bos@104	102 depending on the guards that you have selected.
bos@104	103
bos@104	104 A patch can have an arbitrary number of guards;
bos@104	105 each one is \emph{positive} (``apply this patch if this guard is
bos@104	106 selected'') or \emph{negative} (``skip this patch if this guard is
bos@104	107 selected''). A patch with no guards is always applied.
bos@104	108
bos@104	109 \section{Controlling the guards on a patch}
bos@104	110
bos@233	111 The \hgxcmd{mq}{qguard} command lets you determine which guards should
bos@104	112 apply to a patch, or display the guards that are already in effect.
bos@104	113 Without any arguments, it displays the guards on the current topmost
bos@104	114 patch.
bos@104	115 \interaction{mq.guards.qguard}
bos@104	116 To set a positive guard on a patch, prefix the name of the guard with
bos@104	117 a ``\texttt{+}''.
bos@104	118 \interaction{mq.guards.qguard.pos}
bos@104	119 To set a negative guard on a patch, prefix the name of the guard with
bos@104	120 a ``\texttt{-}''.
bos@104	121 \interaction{mq.guards.qguard.neg}
bos@104	122
bos@104	123 \begin{note}
bos@233	124 The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it
bos@104	125 doesn't \emph{modify} them. What this means is that if you run
bos@104	126 \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
bos@104	127 the same patch, the \emph{only} guard that will be set on it
bos@104	128 afterwards is \texttt{+c}.
bos@104	129 \end{note}
bos@104	130
bos@104	131 Mercurial stores guards in the \sfilename{series} file; the form in
bos@104	132 which they are stored is easy both to understand and to edit by hand.
bos@233	133 (In other words, you don't have to use the \hgxcmd{mq}{qguard} command if
bos@104	134 you don't want to; it's okay to simply edit the \sfilename{series}
bos@104	135 file.)
bos@104	136 \interaction{mq.guards.series}
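As a rough illustration of how simple this format is, the following
Python sketch splits one line of a \sfilename{series} file into a patch
name and its guards. It assumes that guards follow the patch name as
\texttt{\#+name} or \texttt{\#-name} tokens, and that a line beginning
with \texttt{\#} is a comment. The function name
\texttt{parse\_series\_line} is made up for this example; it is not part
of MQ, and MQ's own parser may differ in its details.
\begin{codesample2}
def parse_series_line(line):
    """Return (patch_name, guards) for one series line.

    Blank lines and comment lines yield None.
    """
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    fields = text.split()
    name = fields[0]
    guards = []
    for field in fields[1:]:
        # A guard token looks like "#+foo" (positive) or "#-bar" (negative).
        if len(field) > 2 and field[0] == "#" and field[1] in "+-":
            guards.append(field[1:])
    return name, guards

print(parse_series_line("hello.patch #+foo"))
print(parse_series_line("goodbye.patch #-quux #+bar"))
\end{codesample2}
The patch names \texttt{hello.patch} and \texttt{goodbye.patch} are
placeholders chosen for the illustration.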
bos@104	137
bos@104	138 \section{Selecting the guards to use}
bos@104	139
bos@233	140 The \hgxcmd{mq}{qselect} command determines which guards are active at a
bos@104	141 given time. The effect of this is to determine which patches MQ will
bos@233	142 apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in
bos@104	143 particular, it doesn't do anything to patches that are already
bos@104	144 applied.
bos@104	145
bos@233	146 With no arguments, the \hgxcmd{mq}{qselect} command lists the guards
bos@104	147 currently in effect, one per line of output. Each argument is treated
bos@104	148 as the name of a guard to apply.
bos@104	149 \interaction{mq.guards.qselect.foo}
bos@104	150 In case you're interested, the currently selected guards are stored in
bos@104	151 the \sfilename{guards} file.
bos@104	152 \interaction{mq.guards.qselect.cat}
bos@104	153 We can see the effect the selected guards have when we run
bos@233	154 \hgxcmd{mq}{qpush}.
bos@104	155 \interaction{mq.guards.qselect.qpush}
bos@104	156
bos@104	157 A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
bos@106	158 character. The name of a guard must not contain white space, but most
bos@106	159 other characters are acceptable. If you try to use a guard with an
bos@106	160 invalid name, MQ will complain:
bos@106	161 \interaction{mq.guards.qselect.error}
bos@104	162 Changing the selected guards changes the patches that are applied.
bos@106	163 \interaction{mq.guards.qselect.quux}
bos@105	164 You can see in the example below that negative guards take precedence
bos@105	165 over positive guards.
bos@104	166 \interaction{mq.guards.qselect.foobar}
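The two naming rules given earlier in this section (no leading
``\texttt{+}'' or ``\texttt{-}'', and no white space) can be sketched as
a small Python check. This covers only those two rules; MQ may reject
further characters, and treating an empty name as invalid is an
assumption of this sketch. The function name
\texttt{is\_valid\_guard\_name} is hypothetical.
\begin{codesample2}
def is_valid_guard_name(name):
    """Check a guard name against the two rules stated in the text."""
    if not name:
        return False              # assumption: an empty name is not a guard
    if name[0] in "+-":
        return False              # the sign is a prefix, not part of the name
    # Reject any white space character anywhere in the name.
    return not any(ch.isspace() for ch in name)

print(is_valid_guard_name("2.6.9"))
print(is_valid_guard_name("+foo"))
\end{codesample2}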
bos@104	167
bos@105	168 \section{MQ's rules for applying patches}
bos@105	169
bos@105	170 The rules that MQ uses when deciding whether to apply a patch
bos@105	171 are as follows.
bos@105	172 \begin{itemize}
bos@105	173 \item A patch that has no guards is always applied.
bos@105	174 \item If the patch has any negative guard that matches any currently
bos@105	175 selected guard, the patch is skipped.
bos@105	176 \item If the patch has any positive guard that matches any currently
bos@105	177 selected guard, the patch is applied.
bos@105	178 \item If the patch has positive guards, but none matches any currently
bos@105	179 selected guard, the patch is skipped. A patch with only negative guards, none of which is selected, is applied.
bos@105	180 \end{itemize}
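These rules are compact enough to model in a few lines of Python. The
sketch below is an illustration, not MQ's implementation; the name
\texttt{should\_apply} and its argument layout are invented here. The
last branch assumes that a patch carrying only negative guards, none of
which is selected, is applied, in line with the earlier description of a
negative guard as ``skip this patch if this guard is selected''.
\begin{codesample2}
def should_apply(patch_guards, selected):
    """Decide whether a patch is applied.

    patch_guards: strings such as "+foo" or "-bar".
    selected: names of the currently selected guards, without a sign.
    """
    if not patch_guards:
        return True                   # no guards: always applied
    negative = [g[1:] for g in patch_guards if g.startswith("-")]
    positive = [g[1:] for g in patch_guards if g.startswith("+")]
    if any(name in selected for name in negative):
        return False                  # a matching negative guard wins
    if positive:
        # At least one positive guard must be selected.
        return any(name in selected for name in positive)
    # Assumption: only negative guards, none of them selected.
    return True

print(should_apply(["+foo", "-bar"], ["foo", "bar"]))
\end{codesample2}
Checking the negative guards first is what gives them precedence over
positive guards when both kinds match.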
bos@105	181
bos@105	182 \section{Trimming the work environment}
bos@105	183
bos@105	184 In working on the device driver I mentioned earlier, I don't apply the
bos@105	185 patches to a normal Linux kernel tree. Instead, I use a repository
bos@105	186 that contains only a snapshot of the source files and headers that are
bos@105	187 relevant to Infiniband development. This repository is~1\% the size
bos@105	188 of a kernel repository, so it's easier to work with.
bos@105	189
bos@105	190 I then choose a ``base'' version on top of which the patches are
bos@105	191 applied. This is a snapshot of the Linux kernel tree as of a revision
bos@105	192 of my choosing. When I take the snapshot, I record the changeset ID
bos@105	193 from the kernel repository in the commit message. Since the snapshot
bos@105	194 preserves the ``shape'' and content of the relevant parts of the
bos@105	195 kernel tree, I can apply my patches on top of either my tiny
bos@105	196 repository or a normal kernel tree.
bos@105	197
bos@105	198 Normally, the base tree atop which the patches apply should be a
bos@105	199 snapshot of a very recent upstream tree. This best facilitates the
bos@105	200 development of patches that can easily be submitted upstream with few
bos@105	201 or no modifications.
bos@105	202
bos@105	203 \section{Dividing up the \sfilename{series} file}
bos@105	204
bos@105	205 I categorise the patches in the \sfilename{series} file into a number
bos@105	206 of logical groups. Each section of like patches begins with a block
bos@105	207 of comments that describes the purpose of the patches that follow.
bos@105	208
bos@105	209 The sequence of patch groups that I maintain follows. The ordering of
bos@105	210 these groups is important; I'll describe why after I introduce the
bos@105	211 groups.
bos@105	212 \begin{itemize}
bos@105	213 \item The ``accepted'' group. Patches that the development team has
bos@105	214 submitted to the maintainer of the Infiniband subsystem, and which
bos@105	215 he has accepted, but which are not present in the snapshot that the
bos@105	216 tiny repository is based on. These are ``read only'' patches,
bos@105	217 present only to bring the tree into a state similar to that of
bos@105	218 the upstream maintainer's repository.
bos@105	219 \item The ``rework'' group. Patches that I have submitted, but that
bos@105	220 the upstream maintainer has requested modifications to before he
bos@105	221 will accept them.
bos@105	222 \item The ``pending'' group. Patches that I have not yet submitted to
bos@105	223 the upstream maintainer, but which we have finished working on.
bos@105	224 These will be ``read only'' for a while. If the upstream maintainer
bos@105	225 accepts them upon submission, I'll move them to the end of the
bos@105	226 ``accepted'' group. If he requests that I modify any, I'll move
bos@105	227 them to the beginning of the ``rework'' group.
bos@105	228 \item The ``in progress'' group. Patches that are actively being
bos@105	229 developed, and should not be submitted anywhere yet.
bos@105	230 \item The ``backport'' group. Patches that adapt the source tree to
bos@105	231 older versions of the kernel tree.
bos@105	232 \item The ``do not ship'' group. Patches that for some reason should
bos@105	233 never be submitted upstream. For example, one such patch might
bos@105	234 change embedded driver identification strings to make it easier to
bos@105	235 distinguish, in the field, between an out-of-tree version of the
bos@105	236 driver and a version shipped by a distribution vendor.
bos@105	237 \end{itemize}
bos@105	238
bos@105	239 Now to return to the reasons for ordering groups of patches in this
bos@105	240 way. We would like the lowest patches in the stack to be as stable as
bos@105	241 possible, so that we will not need to rework higher patches due to
bos@105	242 changes in context. Putting patches that will never be changed first
bos@105	243 in the \sfilename{series} file serves this purpose.
bos@105	244
bos@105	245 We would also like the patches that we know we'll need to modify to be
bos@105	246 applied on top of a source tree that resembles the upstream tree as
bos@105	247 closely as possible. This is why we keep accepted patches around for
bos@105	248 a while.
bos@105	249
bos@105	250 The ``backport'' and ``do not ship'' patches float at the end of the
bos@106	251 \sfilename{series} file. The backport patches must be applied on top
bos@106	252 of all other patches, and the ``do not ship'' patches might as well
bos@106	253 stay out of harm's way.
bos@106	254
bos@106	255 \section{Maintaining the patch series}
bos@106	256
bos@106	257 In my work, I use a number of guards to control which patches are to
bos@106	258 be applied.
bos@106	259
bos@106	260 \begin{itemize}
bos@106	261 \item ``Accepted'' patches are guarded with \texttt{accepted}. I
bos@106	262 enable this guard most of the time. When I'm applying the patches
bos@106	263 on top of a tree where the patches are already present, I can turn
max@271	264 this guard off, and the patches that follow will apply cleanly.
bos@106	265 \item Patches that are ``finished'', but not yet submitted, have no
bos@106	266 guards. If I'm applying the patch stack to a copy of the upstream
bos@106	267 tree, I don't need to enable any guards in order to get a reasonably
bos@106	268 safe source tree.
bos@106	269 \item Those patches that need reworking before being resubmitted are
bos@106	270 guarded with \texttt{rework}.
bos@106	271 \item For those patches that are still under development, I use
bos@106	272 \texttt{devel}.
bos@106	273 \item A backport patch may have several guards, one for each version
bos@106	274 of the kernel to which it applies. For example, a patch that
bos@106	275 backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
bos@106	276 \end{itemize}
bos@106	277 This variety of guards gives me considerable flexibility in
bos@106	278 determining what kind of source tree I want to end up with. For most
bos@106	279 situations, the selection of appropriate guards is automated during
bos@106	280 the build process, but I can manually tune the guards to use for less
bos@106	281 common circumstances.
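To give a feel for what such automation might look like, here is a
hypothetical Python sketch that works out which guards to select for a
given target and builds the corresponding \hgxcmd{mq}{qpop},
\hgxcmd{mq}{qselect} and \hgxcmd{mq}{qpush} command lines. It only
builds the commands; it does not run them. The function name
\texttt{qselect\_commands} and its three parameters are assumptions made
for this example, as is the choice to enable \texttt{rework} and
\texttt{devel} together. It is not the build process described above.
\begin{codesample2}
def qselect_commands(kernel_version=None, upstream_has_accepted=False,
                     in_development=False):
    """Return the hg command lines that set up the patch stack for one target."""
    guards = []
    if not upstream_has_accepted:
        guards.append("accepted")     # tree lacks the accepted patches
    if in_development:
        guards.extend(["rework", "devel"])
    if kernel_version is not None:
        guards.append(kernel_version) # backport guard such as "2.6.9"
    if guards:
        select = ["hg", "qselect"] + guards
    else:
        select = ["hg", "qselect", "--none"]
    # Pop everything, change the selection, then push what still applies.
    return [["hg", "qpop", "--all"], select, ["hg", "qpush", "--all"]]

for cmd in qselect_commands(kernel_version="2.6.9"):
    print(" ".join(cmd))
\end{codesample2}
Popping all patches before changing the selection matters because
\hgxcmd{mq}{qselect} does nothing to patches that are already applied.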
bos@106	282
bos@106	283 \subsection{The art of writing backport patches}
bos@106	284
bos@106	285 Using MQ, writing a backport patch is a simple process. All such a
bos@106	286 patch has to do is modify a piece of code that uses a kernel feature
bos@106	287 not present in the older version of the kernel, so that the driver
bos@106	288 continues to work correctly under that older version.
bos@106	289
bos@106	290 A useful goal when writing a good backport patch is to make your code
bos@106	291 look as if it was written for the older version of the kernel you're
bos@106	292 targeting. The less obtrusive the patch, the easier it will be to
bos@106	293 understand and maintain. If you're writing a collection of backport
bos@106	294 patches to avoid the ``rat's nest'' effect of lots of
bos@106	295 \texttt{\#ifdef}s (hunks of source code that are only used
bos@106	296 conditionally) in your code, don't introduce version-dependent
bos@106	297 \texttt{\#ifdef}s into the patches. Instead, write several patches,
bos@106	298 each of which makes unconditional changes, and control their
bos@106	299 application using guards.
bos@106	300
bos@106	301 There are two reasons to divide backport patches into a distinct
bos@106	302 group, away from the ``regular'' patches whose effects they modify.
bos@106	303 The first is that intermingling the two makes it more difficult to use
bos@106	304 a tool like the \hgext{patchbomb} extension to automate the process of
bos@106	305 submitting the patches to an upstream maintainer. The second is that
bos@106	306 a backport patch could perturb the context in which a subsequent
bos@106	307 regular patch is applied, making it impossible to apply the regular
bos@106	308 patch cleanly \emph{without} the earlier backport patch already being
bos@106	309 applied.
bos@106	310
bos@106	311 \section{Useful tips for developing with MQ}
bos@106	312
bos@106	313 \subsection{Organising patches in directories}
bos@106	314
bos@106	315 If you're working on a substantial project with MQ, it's not difficult
bos@106	316 to accumulate a large number of patches. For example, I have one
bos@106	317 patch repository that contains over 250 patches.
bos@106	318
bos@106	319 If you can group these patches into separate logical categories, you
bos@106	320 can, if you like, store them in different directories; MQ has no
bos@106	321 problems with patch names that contain path separators.
bos@106	322
bos@106	323 \subsection{Viewing the history of a patch}
bos@106	324 \label{mq-collab:tips:interdiff}
bos@106	325
bos@106	326 If you're developing a set of patches over a long time, it's a good
bos@106	327 idea to maintain them in a repository, as discussed in
bos@106	328 section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that
bos@106	329 using the \hgcmd{diff} command to look at the history of changes to a
bos@106	330 patch is unworkable. This is in part because you're looking at the
bos@106	331 second derivative of the real code (a diff of a diff), but also
bos@106	332 because MQ adds noise to the process by modifying time stamps and
bos@106	333 directory names when it updates a patch.
bos@106	334
bos@106	335 However, you can use the \hgext{extdiff} extension, which is bundled
bos@106	336 with Mercurial, to turn a diff of two versions of a patch into
bos@106	337 something readable. To do this, you will need a third-party package
bos@106	338 called \package{patchutils}~\cite{web:patchutils}. This provides a
bos@106	339 command named \command{interdiff}, which shows the differences between
bos@106	340 two diffs as a diff. Used on two versions of the same diff, it
bos@106	341 generates a diff that represents the diff from the first to the second
bos@106	342 version.
bos@106	343
bos@106	344 You can enable the \hgext{extdiff} extension in the usual way, by
bos@106	345 adding a line to the \rcsection{extensions} section of your \hgrc.
bos@106	346 \begin{codesample2}
bos@106	347 [extensions]
bos@106	348 extdiff =
bos@106	349 \end{codesample2}
bos@106	350 The \command{interdiff} command expects to be passed the names of two
bos@106	351 files, but the \hgext{extdiff} extension passes the program it runs a
bos@106	352 pair of directories, each of which can contain an arbitrary number of
bos@106	353 files. We thus need a small program that will run \command{interdiff}
bos@106	354 on each pair of files in these two directories. This program is
bos@106	355 available as \sfilename{hg-interdiff} in the \dirname{examples}
bos@106	356 directory of the source code repository that accompanies this book.
bos@106	357 \excode{hg-interdiff}
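For readers who want to see the idea in isolation, the following Python
sketch shows one way to pair up the files in two directories and build
an \command{interdiff} command line for each pair. It is an
illustration, not the \sfilename{hg-interdiff} program itself: the
function names are invented, both arguments are assumed to be
directories, and a file missing on one side is paired with the null
device on the assumption that \command{interdiff} accepts it as an empty
input.
\begin{codesample2}
import os

def paired_files(dir_a, dir_b):
    """Return sorted (path_a, path_b) pairs for files under either directory.

    A file missing on one side is paired with os.devnull on that side.
    """
    names = set()
    for base in (dir_a, dir_b):
        for root, dirs, files in os.walk(base):
            for f in files:
                names.add(os.path.relpath(os.path.join(root, f), base))
    pairs = []
    for name in sorted(names):
        a = os.path.join(dir_a, name)
        b = os.path.join(dir_b, name)
        pairs.append((a if os.path.exists(a) else os.devnull,
                      b if os.path.exists(b) else os.devnull))
    return pairs

def interdiff_commands(dir_a, dir_b):
    """Build one interdiff command line per pair; the caller runs them."""
    return [["interdiff", a, b] for a, b in paired_files(dir_a, dir_b)]
\end{codesample2}
A wrapper script would pass its two directory arguments to
\texttt{interdiff\_commands} and run each resulting command in turn.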
bos@106	358
bos@106	359 With the \sfilename{hg-interdiff} program in your shell's search path,
bos@106	360 you can run it as follows, from inside an MQ patch directory:
bos@106	361 \begin{codesample2}
bos@106	362 hg extdiff -p hg-interdiff -r A:B my-change.patch
bos@106	363 \end{codesample2}
bos@106	364 Since you'll probably want to use this long-winded command a lot, you
bos@106	365 can get \hgext{extdiff} to make it available as a normal Mercurial
bos@106	366 command, again by editing your \hgrc.
bos@106	367 \begin{codesample2}
bos@106	368 [extdiff]
bos@106	369 cmd.interdiff = hg-interdiff
bos@106	370 \end{codesample2}
bos@106	371 This directs \hgext{extdiff} to make an \texttt{interdiff} command
bos@106	372 available, so you can now shorten the previous invocation of
bos@238	373 \hgxcmd{extdiff}{extdiff} to something a little more wieldy.
bos@106	374 \begin{codesample2}
bos@106	375 hg interdiff -r A:B my-change.patch
bos@106	376 \end{codesample2}
bos@105	377
bos@107	378 \begin{note}
bos@107	379 The \command{interdiff} command works well only if the underlying
bos@107	380 files against which versions of a patch are generated remain the
bos@107	381 same. If you create a patch, modify the underlying files, and then
bos@107	382 regenerate the patch, \command{interdiff} may not produce useful
bos@107	383 output.
bos@107	384 \end{note}
bos@107	385
bos@240	386 The \hgext{extdiff} extension is useful for more than merely improving
bos@239	387 the presentation of MQ~patches. To read more about it, go to
bos@239	388 section~\ref{sec:hgext:extdiff}.
bos@239	389
bos@104	390 %%% Local Variables:
bos@104	391 %%% mode: latex
bos@104	392 %%% TeX-master: "00book"
bos@104	393 %%% End: