% Source: hgbook, es/mq-collab.tex @ 377:d5f1049a79dd
% Changeset summary: roll back untranslatable and more on undo.tex
% Author: Igor TAmara <igor@tamarapatino.org>
% Date: Mon Oct 27 17:14:02 2008 -0500 (2008-10-27)
% Parents: 04c08ad7e92e; children: 6cf30b3ed48f

\chapter{Advanced uses of Mercurial Queues}
\label{chap:mq-collab}

While it's easy to pick up straightforward uses of Mercurial Queues,
use of a little discipline and some of MQ's less frequently used
capabilities makes it possible to work in complicated development
environments.

In this chapter, I will use as an example a technique I have used to
manage the development of an Infiniband device driver for the Linux
kernel. The driver in question is large (at least as drivers go),
with 25,000 lines of code spread across 35 source files. It is
maintained by a small team of developers.

While much of the material in this chapter is specific to Linux, the
same principles apply to any code base for which you're not the
primary owner, and upon which you need to do a lot of development.

\section{The problem of many targets}

The Linux kernel changes rapidly, and has never been internally
stable; developers frequently make drastic changes between releases.
This means that a version of the driver that works well with a
particular released version of the kernel will typically not even
\emph{compile} correctly against any other version.

To maintain a driver, we have to keep a number of distinct versions of
Linux in mind.
\begin{itemize}
\item One target is the main Linux kernel development tree.
Maintenance of the code is in this case partly shared by other
developers in the kernel community, who make ``drive-by''
modifications to the driver as they develop and refine kernel
subsystems.
\item We also maintain a number of ``backports'' to older versions of
the Linux kernel, to support the needs of customers who are running
older Linux distributions that do not incorporate our drivers. (To
\emph{backport} a piece of code is to modify it to work in an older
version of its target environment than the version it was developed
for.)
\item Finally, we make software releases on a schedule that is
necessarily not aligned with those used by Linux distributors and
kernel developers, so that we can deliver new features to customers
without forcing them to upgrade their entire kernels or
distributions.
\end{itemize}

\subsection{Tempting approaches that don't work well}

There are two ``standard'' ways to maintain a piece of software that
has to target many different environments.

The first is to maintain a number of branches, each intended for a
single target. The trouble with this approach is that you must
maintain iron discipline in the flow of changes between repositories.
A new feature or bug fix must start life in a ``pristine'' repository,
then percolate out to every backport repository. Backport changes are
more limited in the branches they should propagate to; a backport
change that is applied to a branch where it doesn't belong will
probably stop the driver from compiling.

The second is to maintain a single source tree filled with conditional
statements that turn chunks of code on or off depending on the
intended target. Because these ``ifdefs'' are not allowed in the
Linux kernel tree, a manual or automatic process must be followed to
strip them out and yield a clean tree. A code base maintained in this
fashion rapidly becomes a rat's nest of conditional blocks that are
difficult to understand and maintain.

Neither of these approaches is well suited to a situation where you
don't ``own'' the canonical copy of a source tree. In the case of a
Linux driver that is distributed with the standard kernel, Linus's
tree contains the copy of the code that will be treated by the world
as canonical. The upstream version of ``my'' driver can be modified
by people I don't know, without me even finding out about it until
after the changes show up in Linus's tree.

These approaches have the added weakness of making it difficult to
generate well-formed patches to submit upstream.

In principle, Mercurial Queues seems like a good candidate to manage a
development scenario such as the above. While this is indeed the
case, MQ contains a few added features that make the job more
pleasant.

\section{Conditionally applying patches with guards}

Perhaps the best way to maintain sanity with so many targets is to be
able to choose specific patches to apply for a given situation. MQ
provides a feature called ``guards'' (which originates with quilt's
\texttt{guards} command) that does just this. To start off, let's
create a simple repository for experimenting in.
\interaction{mq.guards.init}
This gives us a tiny repository that contains two patches that don't
have any dependencies on each other, because they touch different files.

The idea behind conditional application is that you can ``tag'' a
patch with a \emph{guard}, which is simply a text string of your
choosing, then tell MQ to select specific guards to use when applying
patches. MQ will then either apply, or skip over, a guarded patch,
depending on the guards that you have selected.

A patch can have an arbitrary number of guards;
each one is \emph{positive} (``apply this patch if this guard is
selected'') or \emph{negative} (``skip this patch if this guard is
selected''). A patch with no guards is always applied.

\section{Controlling the guards on a patch}

The \hgxcmd{mq}{qguard} command lets you determine which guards should
apply to a patch, or display the guards that are already in effect.
Without any arguments, it displays the guards on the current topmost
patch.
\interaction{mq.guards.qguard}
To set a positive guard on a patch, prefix the name of the guard with
a ``\texttt{+}''.
\interaction{mq.guards.qguard.pos}
To set a negative guard on a patch, prefix the name of the guard with
a ``\texttt{-}''.
\interaction{mq.guards.qguard.neg}

\begin{note}
The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it
doesn't \emph{modify} them. What this means is that if you run
\hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
the same patch, the \emph{only} guard that will be set on it
afterwards is \texttt{+c}.
\end{note}

Mercurial stores guards in the \sfilename{series} file; the form in
which they are stored is easy both to understand and to edit by hand.
(In other words, you don't have to use the \hgxcmd{mq}{qguard} command if
you don't want to; it's okay to simply edit the \sfilename{series}
file.)
\interaction{mq.guards.series}

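As a rough sketch of the format (the patch and guard names below are
invented for illustration, and are not taken from the repository used
in this chapter), each line of the \sfilename{series} file names one
patch, optionally followed by its guards, each written after a
``\texttt{\#}'' character.
\begin{codesample2}
  first.patch
  second.patch #+foo
  third.patch #+foo #-bar
\end{codesample2}
In this sketch, \texttt{first.patch} has no guards, \texttt{second.patch}
has one positive guard, and \texttt{third.patch} has one positive and
one negative guard.
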
\section{Selecting the guards to use}

The \hgxcmd{mq}{qselect} command determines which guards are active at a
given time. The effect of this is to determine which patches MQ will
apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in
particular, it doesn't do anything to patches that are already
applied.

With no arguments, the \hgxcmd{mq}{qselect} command lists the guards
currently in effect, one per line of output. Each argument is treated
as the name of a guard to apply.
\interaction{mq.guards.qselect.foo}
In case you're interested, the currently selected guards are stored in
the \sfilename{guards} file.
\interaction{mq.guards.qselect.cat}
We can see the effect the selected guards have when we run
\hgxcmd{mq}{qpush}.
\interaction{mq.guards.qselect.qpush}

A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
character. The name of a guard must not contain white space, but most
other characters are acceptable. If you try to use a guard with an
invalid name, MQ will complain:
\interaction{mq.guards.qselect.error}
Changing the selected guards changes the patches that are applied.
\interaction{mq.guards.qselect.quux}
You can see in the example below that negative guards take precedence
over positive guards.
\interaction{mq.guards.qselect.foobar}

\section{MQ's rules for applying patches}

The rules that MQ uses when deciding whether to apply a patch
are as follows.
\begin{itemize}
\item A patch that has no guards is always applied.
\item If the patch has any negative guard that matches any currently
selected guard, the patch is skipped.
\item If the patch has any positive guard that matches any currently
selected guard, the patch is applied.
\item If the patch has positive guards, but none matches any currently
selected guard, the patch is skipped.
\item If the patch has only negative guards, and none matches any
currently selected guard, the patch is applied.
\end{itemize}

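The rules above can be read as a small decision procedure. The
following Python sketch is an illustration only, not MQ's actual code;
the function name \texttt{patch\_is\_applied} and its arguments are
invented for this example. One case is handled by assumption: a patch
that carries only negative guards, none of which is selected, is
treated as applied, which is how I understand MQ to behave.

```python
def patch_is_applied(patch_guards, selected):
    """Decide whether a patch would be applied under the rules above.

    patch_guards: guards as written on the patch, e.g. ["+foo", "-bar"].
    selected: names of the currently selected guards, e.g. ["foo"].
    (Both names are hypothetical; this is not MQ's own API.)
    """
    selected = set(selected)

    # A patch with no guards is always applied.
    if not patch_guards:
        return True

    negative = [g[1:] for g in patch_guards if g.startswith("-")]
    positive = [g[1:] for g in patch_guards if g.startswith("+")]

    # Any matching negative guard wins over everything else.
    if any(name in selected for name in negative):
        return False

    # With positive guards present, at least one must be selected.
    if positive:
        return any(name in selected for name in positive)

    # Assumption: only negative guards, none selected, so apply.
    return True
```

Checking negative guards before positive ones is what gives negative
guards precedence when a patch carries both kinds.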
\section{Trimming the work environment}

In working on the device driver I mentioned earlier, I don't apply the
patches to a normal Linux kernel tree. Instead, I use a repository
that contains only a snapshot of the source files and headers that are
relevant to Infiniband development. This repository is~1\% the size
of a kernel repository, so it's easier to work with.

I then choose a ``base'' version on top of which the patches are
applied. This is a snapshot of the Linux kernel tree as of a revision
of my choosing. When I take the snapshot, I record the changeset ID
from the kernel repository in the commit message. Since the snapshot
preserves the ``shape'' and content of the relevant parts of the
kernel tree, I can apply my patches on top of either my tiny
repository or a normal kernel tree.

Normally, the base tree atop which the patches apply should be a
snapshot of a very recent upstream tree. This best facilitates the
development of patches that can easily be submitted upstream with few
or no modifications.

\section{Dividing up the \sfilename{series} file}

I categorise the patches in the \sfilename{series} file into a number
of logical groups. Each section of like patches begins with a block
of comments that describes the purpose of the patches that follow.

The sequence of patch groups that I maintain follows. The ordering of
these groups is important; I'll describe why after I introduce the
groups.
\begin{itemize}
\item The ``accepted'' group. Patches that the development team has
submitted to the maintainer of the Infiniband subsystem, and which
he has accepted, but which are not present in the snapshot that the
tiny repository is based on. These are ``read only'' patches,
present only to transform the tree into a state similar to the one
it is in the upstream maintainer's repository.
\item The ``rework'' group. Patches that I have submitted, but that
the upstream maintainer has requested modifications to before he
will accept them.
\item The ``pending'' group. Patches that I have not yet submitted to
the upstream maintainer, but which we have finished working on.
These will be ``read only'' for a while. If the upstream maintainer
accepts them upon submission, I'll move them to the end of the
``accepted'' group. If he requests that I modify any, I'll move
them to the beginning of the ``rework'' group.
\item The ``in progress'' group. Patches that are actively being
developed, and should not be submitted anywhere yet.
\item The ``backport'' group. Patches that adapt the source tree to
older versions of the kernel tree.
\item The ``do not ship'' group. Patches that for some reason should
never be submitted upstream. For example, one such patch might
change embedded driver identification strings to make it easier to
distinguish, in the field, between an out-of-tree version of the
driver and a version shipped by a distribution vendor.
\end{itemize}

Now to return to the reasons for ordering groups of patches in this
way. We would like the lowest patches in the stack to be as stable as
possible, so that we will not need to rework higher patches due to
changes in context. Putting patches that will never be changed first
in the \sfilename{series} file serves this purpose.

We would also like the patches that we know we'll need to modify to be
applied on top of a source tree that resembles the upstream tree as
closely as possible. This is why we keep accepted patches around for
a while.

The ``backport'' and ``do not ship'' patches float at the end of the
\sfilename{series} file. The backport patches must be applied on top
of all other patches, and the ``do not ship'' patches might as well
stay out of harm's way.

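To make the layout concrete, here is a sketch of how a
\sfilename{series} file divided in this way might look. The patch
names are hypothetical; only the grouping and the ordering follow the
description above. Lines that begin with ``\texttt{\#}'' are comments.
\begin{codesample2}
  # accepted: taken upstream, not yet in our base snapshot (read only)
  fix-port-counters.patch
  # rework: submitted, changes requested by the upstream maintainer
  cleanup-locking.patch
  # pending: finished, not yet submitted
  add-sysfs-stats.patch
  # in progress: still under development
  new-chip-support.patch
  # backport: adapt the tree to older kernels
  backport-2.6.9-workqueue.patch
  # do not ship: never to be submitted upstream
  driver-version-string.patch
\end{codesample2}
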
\section{Maintaining the patch series}

In my work, I use a number of guards to control which patches are to
be applied.

\begin{itemize}
\item ``Accepted'' patches are guarded with \texttt{accepted}. I
enable this guard most of the time. When I'm applying the patches
on top of a tree where the patches are already present, I can turn
this guard off, and the patches that follow it will apply cleanly.
\item Patches that are ``finished'', but not yet submitted, have no
guards. If I'm applying the patch stack to a copy of the upstream
tree, I don't need to enable any guards in order to get a reasonably
safe source tree.
\item Those patches that need reworking before being resubmitted are
guarded with \texttt{rework}.
\item For those patches that are still under development, I use
\texttt{devel}.
\item A backport patch may have several guards, one for each version
of the kernel to which it applies. For example, a patch that
backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
\end{itemize}
This variety of guards gives me considerable flexibility in
determining what kind of source tree I want to end up with. For most
situations, the selection of appropriate guards is automated during
the build process, but I can manually tune the guards to use for less
common circumstances.

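The build automation itself is not shown in this chapter. As a purely
hypothetical sketch of what such a step might do, the Python below
chooses guard names for a target and builds the matching
\hgxcmd{mq}{qselect} command line. The function names
\texttt{guards\_for\_target} and \texttt{qselect\_argv}, and all of
their parameters, are invented for this example; only the guard names
come from the list above.

```python
def guards_for_target(kernel_version=None, accepted_in_base=False,
                      include_rework=False, include_devel=False):
    """Return the guard names to select for one build target.

    All parameters are hypothetical knobs for this sketch:
    kernel_version   -- e.g. "2.6.9" for a backport build, else None
    accepted_in_base -- True if the base tree already contains the
                        "accepted" patches, so that guard stays off
    """
    guards = []
    if not accepted_in_base:
        guards.append("accepted")
    if include_rework:
        guards.append("rework")
    if include_devel:
        guards.append("devel")
    if kernel_version is not None:
        guards.append(kernel_version)
    return guards


def qselect_argv(guards):
    """Build an argument list for running "hg qselect"."""
    if guards:
        return ["hg", "qselect"] + list(guards)
    # With nothing to select, ask qselect to disable all guards.
    return ["hg", "qselect", "--none"]
```

For example, \texttt{guards\_for\_target("2.6.9")} yields the guards
\texttt{accepted} and \texttt{2.6.9}. Because \hgxcmd{mq}{qselect} does
nothing to patches that are already applied, a real script would pop
all patches before selecting guards and push them again afterwards.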
\subsection{The art of writing backport patches}

Using MQ, writing a backport patch is a simple process. All such a
patch has to do is modify a piece of code that uses a kernel feature
not present in the older version of the kernel, so that the driver
continues to work correctly under that older version.

A useful goal when writing a good backport patch is to make your code
look as if it was written for the older version of the kernel you're
targeting. The less obtrusive the patch, the easier it will be to
understand and maintain. If you're writing a collection of backport
patches to avoid the ``rat's nest'' effect of lots of
\texttt{\#ifdef}s (hunks of source code that are only used
conditionally) in your code, don't introduce version-dependent
\texttt{\#ifdef}s into the patches. Instead, write several patches,
each of which makes unconditional changes, and control their
application using guards.

There are two reasons to divide backport patches into a distinct
group, away from the ``regular'' patches whose effects they modify.
The first is that intermingling the two makes it more difficult to use
a tool like the \hgext{patchbomb} extension to automate the process of
submitting the patches to an upstream maintainer. The second is that
a backport patch could perturb the context in which a subsequent
regular patch is applied, making it impossible to apply the regular
patch cleanly \emph{without} the earlier backport patch already being
applied.

\section{Useful tips for developing with MQ}

\subsection{Organising patches in directories}

If you're working on a substantial project with MQ, it's not difficult
to accumulate a large number of patches. For example, I have one
patch repository that contains over 250 patches.

If you can group these patches into separate logical categories, you
can, if you like, store them in different directories; MQ has no
problems with patch names that contain path separators.

\subsection{Viewing the history of a patch}
\label{mq-collab:tips:interdiff}

If you're developing a set of patches over a long time, it's a good
idea to maintain them in a repository, as discussed in
section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that
using the \hgcmd{diff} command to look at the history of changes to a
patch is unworkable. This is in part because you're looking at the
second derivative of the real code (a diff of a diff), but also
because MQ adds noise to the process by modifying time stamps and
directory names when it updates a patch.

However, you can use the \hgext{extdiff} extension, which is bundled
with Mercurial, to turn a diff of two versions of a patch into
something readable. To do this, you will need a third-party package
called \package{patchutils}~\cite{web:patchutils}. This provides a
command named \command{interdiff}, which shows the differences between
two diffs as a diff. Used on two versions of the same diff, it
generates a diff that represents the diff from the first to the second
version.

You can enable the \hgext{extdiff} extension in the usual way, by
adding a line to the \rcsection{extensions} section of your \hgrc.
\begin{codesample2}
  [extensions]
  extdiff =
\end{codesample2}
The \command{interdiff} command expects to be passed the names of two
files, but the \hgext{extdiff} extension passes the program it runs a
pair of directories, each of which can contain an arbitrary number of
files. We thus need a small program that will run \command{interdiff}
on each pair of files in these two directories. This program is
available as \sfilename{hg-interdiff} in the \dirname{examples}
directory of the source code repository that accompanies this book.
\excode{hg-interdiff}

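To give a rough idea of what such a wrapper involves, here is a
minimal Python sketch. It is not the \sfilename{hg-interdiff} script
from the \dirname{examples} directory; the function names are invented
for this illustration, and it assumes that \command{interdiff} is
installed and can be found on your search path.

```python
import os
import subprocess
import sys


def list_files(base):
    """Return the paths of all files under base, relative to base."""
    found = []
    for dirpath, dirnames, filenames in os.walk(base):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, base))
    return sorted(found)


def paired_files(old_dir, new_dir):
    """Return (old_path, new_path) pairs for files present in both trees."""
    # In case the caller hands us two plain files, compare them directly.
    if os.path.isfile(old_dir) and os.path.isfile(new_dir):
        return [(old_dir, new_dir)]
    new_names = set(list_files(new_dir))
    return [(os.path.join(old_dir, rel), os.path.join(new_dir, rel))
            for rel in list_files(old_dir) if rel in new_names]


def main(argv):
    """Run interdiff on each pair of files; argv is [prog, old, new]."""
    if len(argv) != 3:
        sys.stderr.write("usage: %s OLD NEW\n" % argv[0])
        return 2
    for old, new in paired_files(argv[1], argv[2]):
        # Assumes the interdiff program from patchutils is installed.
        subprocess.call(["interdiff", old, new])
    return 0
```

To use the sketch as a standalone script, call
\texttt{sys.exit(main(sys.argv))} at the bottom of the file. Files
that exist on only one side are skipped here, since there is no second
version of the patch to compare against.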
With the \sfilename{hg-interdiff} program in your shell's search path,
you can run it as follows, from inside an MQ patch directory:
\begin{codesample2}
  hg extdiff -p hg-interdiff -r A:B my-change.patch
\end{codesample2}
Since you'll probably want to use this long-winded command a lot, you
can get \hgext{extdiff} to make it available as a normal Mercurial
command, again by editing your \hgrc.
\begin{codesample2}
  [extdiff]
  cmd.interdiff = hg-interdiff
\end{codesample2}
This directs \hgext{extdiff} to make an \texttt{interdiff} command
available, so you can now shorten the previous invocation of
\hgxcmd{extdiff}{extdiff} to something a little more wieldy.
\begin{codesample2}
  hg interdiff -r A:B my-change.patch
\end{codesample2}

\begin{note}
The \command{interdiff} command works well only if the underlying
files against which versions of a patch are generated remain the
same. If you create a patch, modify the underlying files, and then
regenerate the patch, \command{interdiff} may not produce useful
output.
\end{note}

The \hgext{extdiff} extension is useful for more than merely improving
the presentation of MQ~patches. To read more about it, go to
section~\ref{sec:hgext:extdiff}.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: