hgbook

annotate en/mq-collab.tex @ 307:fb5c0d56d7f1

Fix test 'tour'.

Executing 'tour' test now creates some files in /tmp to store the
revision numbers as they are created on the fly and appear in the output
files. When SVG files are to be converted to PNG or EPS files within the
Makefile, a tool 'fixsvg' will be invoked to substitute some placeholder
markup by the real version number which fits to the test output, before
the final conversion takes place.
author Guido Ostkamp <hg@ostkamp.fastmail.fm>
date Wed Aug 20 22:15:35 2008 +0200 (2008-08-20)
parents 4119e57679f7
children 5561812fc5c9
rev   line source
bos@104 1 \chapter{Advanced uses of Mercurial Queues}
bos@224 2 \label{chap:mq-collab}
bos@104 3
bos@104 4 While it's easy to pick up straightforward uses of Mercurial Queues,
bos@104 5 use of a little discipline and some of MQ's less frequently used
bos@104 6 capabilities makes it possible to work in complicated development
bos@104 7 environments.
bos@104 8
bos@105 9 In this chapter, I will use as an example a technique I have used to
bos@105 10 manage the development of an Infiniband device driver for the Linux
bos@105 11 kernel. The driver in question is large (at least as drivers go),
bos@105 12 with 25,000 lines of code spread across 35 source files. It is
bos@105 13 maintained by a small team of developers.
bos@104 14
bos@104 15 While much of the material in this chapter is specific to Linux, the
bos@104 16 same principles apply to any code base for which you're not the
bos@104 17 primary owner, and upon which you need to do a lot of development.
bos@104 18
bos@104 19 \section{The problem of many targets}
bos@104 20
bos@104 21 The Linux kernel changes rapidly, and has never been internally
bos@104 22 stable; developers frequently make drastic changes between releases.
bos@104 23 This means that a version of the driver that works well with a
bos@104 24 particular released version of the kernel will not even \emph{compile}
bos@104 25 correctly against, typically, any other version.
bos@104 26
bos@104 27 To maintain a driver, we have to keep a number of distinct versions of
bos@104 28 Linux in mind.
bos@104 29 \begin{itemize}
bos@104 30 \item One target is the main Linux kernel development tree.
bos@104 31 Maintenance of the code is in this case partly shared by other
bos@104 32 developers in the kernel community, who make ``drive-by''
bos@104 33 modifications to the driver as they develop and refine kernel
bos@104 34 subsystems.
bos@104 35 \item We also maintain a number of ``backports'' to older versions of
bos@104 36 the Linux kernel, to support the needs of customers who are running
bos@105 37 older Linux distributions that do not incorporate our drivers. (To
bos@105 38 \emph{backport} a piece of code is to modify it to work in an older
bos@105 39 version of its target environment than the version it was developed
bos@105 40 for.)
bos@104 41 \item Finally, we make software releases on a schedule that is
bos@104 42 necessarily not aligned with those used by Linux distributors and
bos@104 43 kernel developers, so that we can deliver new features to customers
bos@104 44 without forcing them to upgrade their entire kernels or
bos@104 45 distributions.
bos@104 46 \end{itemize}
bos@104 47
bos@104 48 \subsection{Tempting approaches that don't work well}
bos@104 49
bos@104 50 There are two ``standard'' ways to maintain a piece of software that
bos@104 51 has to target many different environments.
bos@104 52
bos@104 53 The first is to maintain a number of branches, each intended for a
bos@104 54 single target. The trouble with this approach is that you must
bos@104 55 maintain iron discipline in the flow of changes between repositories.
bos@104 56 A new feature or bug fix must start life in a ``pristine'' repository,
bos@104 57 then percolate out to every backport repository. Backport changes are
bos@104 58 more limited in the branches they should propagate to; a backport
bos@104 59 change that is applied to a branch where it doesn't belong will
bos@104 60 probably stop the driver from compiling.
bos@104 61
bos@104 62 The second is to maintain a single source tree filled with conditional
bos@104 63 statements that turn chunks of code on or off depending on the
bos@104 64 intended target. Because these ``ifdefs'' are not allowed in the
bos@104 65 Linux kernel tree, a manual or automatic process must be followed to
bos@104 66 strip them out and yield a clean tree. A code base maintained in this
bos@104 67 fashion rapidly becomes a rat's nest of conditional blocks that are
bos@104 68 difficult to understand and maintain.
bos@104 69
bos@104 70 Neither of these approaches is well suited to a situation where you
bos@104 71 don't ``own'' the canonical copy of a source tree. In the case of a
bos@104 72 Linux driver that is distributed with the standard kernel, Linus's
bos@104 73 tree contains the copy of the code that will be treated by the world
bos@104 74 as canonical. The upstream version of ``my'' driver can be modified
bos@104 75 by people I don't know, without me even finding out about it until
bos@104 76 after the changes show up in Linus's tree.
bos@104 77
bos@104 78 These approaches have the added weakness of making it difficult to
bos@104 79 generate well-formed patches to submit upstream.
bos@104 80
bos@104 81 In principle, Mercurial Queues seems like a good candidate to manage a
bos@104 82 development scenario such as the above. While this is indeed the
bos@104 83 case, MQ contains a few added features that make the job more
bos@104 84 pleasant.
bos@104 85
bos@105 86 \section{Conditionally applying patches with
bos@105 87 guards}
bos@104 88
bos@104 89 Perhaps the best way to maintain sanity with so many targets is to be
bos@104 90 able to choose specific patches to apply for a given situation. MQ
bos@104 91 provides a feature called ``guards'' (which originates with quilt's
bos@104 92 \texttt{guards} command) that does just this. To start off, let's
bos@104 93 create a simple repository for experimenting in.
bos@104 94 \interaction{mq.guards.init}
bos@104 95 This gives us a tiny repository that contains two patches that don't
bos@104 96 have any dependencies on each other, because they touch different files.
bos@104 97
bos@104 98 The idea behind conditional application is that you can ``tag'' a
bos@104 99 patch with a \emph{guard}, which is simply a text string of your
bos@104 100 choosing, then tell MQ to select specific guards to use when applying
bos@104 101 patches. MQ will then either apply, or skip over, a guarded patch,
bos@104 102 depending on the guards that you have selected.
bos@104 103
bos@104 104 A patch can have an arbitrary number of guards;
bos@104 105 each one is \emph{positive} (``apply this patch if this guard is
bos@104 106 selected'') or \emph{negative} (``skip this patch if this guard is
bos@104 107 selected''). A patch with no guards is always applied.
bos@104 108
bos@104 109 \section{Controlling the guards on a patch}
bos@104 110
bos@233 111 The \hgxcmd{mq}{qguard} command lets you determine which guards should
bos@104 112 apply to a patch, or display the guards that are already in effect.
bos@104 113 Without any arguments, it displays the guards on the current topmost
bos@104 114 patch.
bos@104 115 \interaction{mq.guards.qguard}
bos@104 116 To set a positive guard on a patch, prefix the name of the guard with
bos@104 117 a ``\texttt{+}''.
bos@104 118 \interaction{mq.guards.qguard.pos}
bos@104 119 To set a negative guard on a patch, prefix the name of the guard with
bos@104 120 a ``\texttt{-}''.
bos@104 121 \interaction{mq.guards.qguard.neg}
bos@104 122
bos@104 123 \begin{note}
bos@233 124 The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it
bos@104 125 doesn't \emph{modify} them. What this means is that if you run
bos@104 126 \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
bos@104 127 the same patch, the \emph{only} guard that will be set on it
bos@104 128 afterwards is \texttt{+c}.
bos@104 129 \end{note}
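
So to add a guard to a patch without losing the guards it already
has, restate the complete list. In this sketch, \texttt{hello.patch}
is only a stand-in patch name, assumed to carry the guards
\texttt{+a} and \texttt{+b} already.
\begin{codesample2}
  hg qguard hello.patch +a +b +c
\end{codesample2}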
bos@104 130
bos@104 131 Mercurial stores guards in the \sfilename{series} file; the form in
bos@104 132 which they are stored is easy both to understand and to edit by hand.
bos@233 133 (In other words, you don't have to use the \hgxcmd{mq}{qguard} command if
bos@104 134 you don't want to; it's okay to simply edit the \sfilename{series}
bos@104 135 file.)
bos@104 136 \interaction{mq.guards.series}
bos@104 137
bos@104 138 \section{Selecting the guards to use}
bos@104 139
bos@233 140 The \hgxcmd{mq}{qselect} command determines which guards are active at a
bos@104 141 given time. The effect of this is to determine which patches MQ will
bos@233 142 apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in
bos@104 143 particular, it doesn't do anything to patches that are already
bos@104 144 applied.
bos@104 145
bos@233 146 With no arguments, the \hgxcmd{mq}{qselect} command lists the guards
bos@104 147 currently in effect, one per line of output. Each argument is treated
bos@104 148 as the name of a guard to apply.
bos@104 149 \interaction{mq.guards.qselect.foo}
bos@104 150 In case you're interested, the currently selected guards are stored in
bos@104 151 the \sfilename{guards} file.
bos@104 152 \interaction{mq.guards.qselect.cat}
bos@104 153 We can see the effect the selected guards have when we run
bos@233 154 \hgxcmd{mq}{qpush}.
bos@104 155 \interaction{mq.guards.qselect.qpush}
bos@104 156
bos@104 157 A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
bos@106 158 character. The name of a guard must not contain white space, but most
bos@106 159 other characters are acceptable. If you try to use a guard with an
bos@106 160 invalid name, MQ will complain:
bos@106 161 \interaction{mq.guards.qselect.error}
bos@104 162 Changing the selected guards changes the patches that are applied.
bos@106 163 \interaction{mq.guards.qselect.quux}
bos@105 164 You can see in the example below that negative guards take precedence
bos@105 165 over positive guards.
bos@104 166 \interaction{mq.guards.qselect.foobar}
bos@104 167
bos@105 168 \section{MQ's rules for applying patches}
bos@105 169
bos@105 170 The rules that MQ uses when deciding whether to apply a patch
bos@105 171 are as follows.
bos@105 172 \begin{itemize}
bos@105 173 \item A patch that has no guards is always applied.
bos@105 174 \item If the patch has any negative guard that matches any currently
bos@105 175 selected guard, the patch is skipped.
bos@105 176 \item If the patch has any positive guard that matches any currently
bos@105 177 selected guard, the patch is applied.
bos@105 178 \item If the patch has positive guards, but none matches any currently
bos@105 179 selected guard, the patch is skipped. A patch that has only negative guards, none of which matches a selected guard, is applied.
bos@105 180 \end{itemize}
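
As a worked illustration of these rules, suppose a \sfilename{series}
file contains the following three entries. The patch and guard names
are made up for this example.
\begin{codesample2}
  always.patch
  feature.patch #+foo
  mixed.patch #+foo #-bar
\end{codesample2}
With no guards selected, only \texttt{always.patch} is applied. After
\hgcmdargs{qselect}{foo}, all three patches are applied. After
\hgcmdargs{qselect}{foo bar}, \texttt{always.patch} and
\texttt{feature.patch} are applied, but \texttt{mixed.patch} is
skipped, because its negative guard \texttt{-bar} matches a selected
guard and takes precedence.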
bos@105 181
bos@105 182 \section{Trimming the work environment}
bos@105 183
bos@105 184 In working on the device driver I mentioned earlier, I don't apply the
bos@105 185 patches to a normal Linux kernel tree. Instead, I use a repository
bos@105 186 that contains only a snapshot of the source files and headers that are
bos@105 187 relevant to Infiniband development. This repository is~1\% the size
bos@105 188 of a kernel repository, so it's easier to work with.
bos@105 189
bos@105 190 I then choose a ``base'' version on top of which the patches are
bos@105 191 applied. This is a snapshot of the Linux kernel tree as of a revision
bos@105 192 of my choosing. When I take the snapshot, I record the changeset ID
bos@105 193 from the kernel repository in the commit message. Since the snapshot
bos@105 194 preserves the ``shape'' and content of the relevant parts of the
bos@105 195 kernel tree, I can apply my patches on top of either my tiny
bos@105 196 repository or a normal kernel tree.
bos@105 197
bos@105 198 Normally, the base tree atop which the patches apply should be a
bos@105 199 snapshot of a very recent upstream tree. This best facilitates the
bos@105 200 development of patches that can easily be submitted upstream with few
bos@105 201 or no modifications.
bos@105 202
bos@105 203 \section{Dividing up the \sfilename{series} file}
bos@105 204
bos@105 205 I categorise the patches in the \sfilename{series} file into a number
bos@105 206 of logical groups. Each section of like patches begins with a block
bos@105 207 of comments that describes the purpose of the patches that follow.
bos@105 208
bos@105 209 The sequence of patch groups that I maintain follows. The ordering of
bos@105 210 these groups is important; I'll describe why after I introduce the
bos@105 211 groups.
bos@105 212 \begin{itemize}
bos@105 213 \item The ``accepted'' group. Patches that the development team has
bos@105 214 submitted to the maintainer of the Infiniband subsystem, and which
bos@105 215 he has accepted, but which are not present in the snapshot that the
bos@105 216 tiny repository is based on. These are ``read only'' patches,
bos@105 217 present only to transform the tree into a state similar to that of
bos@105 218 the upstream maintainer's repository.
bos@105 219 \item The ``rework'' group. Patches that I have submitted, but that
bos@105 220 the upstream maintainer has requested modifications to before he
bos@105 221 will accept them.
bos@105 222 \item The ``pending'' group. Patches that I have not yet submitted to
bos@105 223 the upstream maintainer, but which we have finished working on.
bos@105 224 These will be ``read only'' for a while. If the upstream maintainer
bos@105 225 accepts them upon submission, I'll move them to the end of the
bos@105 226 ``accepted'' group. If he requests that I modify any, I'll move
bos@105 227 them to the beginning of the ``rework'' group.
bos@105 228 \item The ``in progress'' group. Patches that are actively being
bos@105 229 developed, and should not be submitted anywhere yet.
bos@105 230 \item The ``backport'' group. Patches that adapt the source tree to
bos@105 231 older versions of the kernel tree.
bos@105 232 \item The ``do not ship'' group. Patches that for some reason should
bos@105 233 never be submitted upstream. For example, one such patch might
bos@105 234 change embedded driver identification strings to make it easier to
bos@105 235 distinguish, in the field, between an out-of-tree version of the
bos@105 236 driver and a version shipped by a distribution vendor.
bos@105 237 \end{itemize}
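
A \sfilename{series} file laid out this way might look roughly like
the following sketch. The patch names are invented and guards are
omitted; lines that begin with ``\texttt{\#}'' are comments.
\begin{codesample2}
  # Accepted upstream, not yet in the base snapshot
  accepted-1.patch
  # Rework requested by the upstream maintainer
  rework-1.patch
  # Pending: finished, not yet submitted
  pending-1.patch
  # In progress
  wip-1.patch
  # Backports to older kernels
  backport-2.6.9.patch
  # Do not ship upstream
  driver-ident.patch
\end{codesample2}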
bos@105 238
bos@105 239 Now to return to the reasons for ordering groups of patches in this
bos@105 240 way. We would like the lowest patches in the stack to be as stable as
bos@105 241 possible, so that we will not need to rework higher patches due to
bos@105 242 changes in context. Putting patches that will never be changed first
bos@105 243 in the \sfilename{series} file serves this purpose.
bos@105 244
bos@105 245 We would also like the patches that we know we'll need to modify to be
bos@105 246 applied on top of a source tree that resembles the upstream tree as
bos@105 247 closely as possible. This is why we keep accepted patches around for
bos@105 248 a while.
bos@105 249
bos@105 250 The ``backport'' and ``do not ship'' patches float at the end of the
bos@106 251 \sfilename{series} file. The backport patches must be applied on top
bos@106 252 of all other patches, and the ``do not ship'' patches might as well
bos@106 253 stay out of harm's way.
bos@106 254
bos@106 255 \section{Maintaining the patch series}
bos@106 256
bos@106 257 In my work, I use a number of guards to control which patches are to
bos@106 258 be applied.
bos@106 259
bos@106 260 \begin{itemize}
bos@106 261 \item ``Accepted'' patches are guarded with \texttt{accepted}. I
bos@106 262 enable this guard most of the time. When I'm applying the patches
bos@106 263 on top of a tree where the patches are already present, I can turn
max@271 264 this guard off, and the patches that follow them will apply cleanly.
bos@106 265 \item Patches that are ``finished'', but not yet submitted, have no
bos@106 266 guards. If I'm applying the patch stack to a copy of the upstream
bos@106 267 tree, I don't need to enable any guards in order to get a reasonably
bos@106 268 safe source tree.
bos@106 269 \item Those patches that need reworking before being resubmitted are
bos@106 270 guarded with \texttt{rework}.
bos@106 271 \item For those patches that are still under development, I use
bos@106 272 \texttt{devel}.
bos@106 273 \item A backport patch may have several guards, one for each version
bos@106 274 of the kernel to which it applies. For example, a patch that
bos@106 275 backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
bos@106 276 \end{itemize}
bos@106 277 This variety of guards gives me considerable flexibility in
bos@106 278 determining what kind of source tree I want to end up with. For most
bos@106 279 situations, the selection of appropriate guards is automated during
bos@106 280 the build process, but I can manually tune the guards to use for less
bos@106 281 common circumstances.
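
As a sketch of the manual case, you could switch to a tree for
a~2.6.9 backport as follows. Because \hgxcmd{mq}{qselect} does not
touch patches that are already applied, the applied patches are popped
first. The particular combination of guards is only an illustration.
\begin{codesample2}
  hg qpop -a
  hg qselect accepted 2.6.9
  hg qpush -a
\end{codesample2}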
bos@106 282
bos@106 283 \subsection{The art of writing backport patches}
bos@106 284
bos@106 285 Using MQ, writing a backport patch is a simple process. All such a
bos@106 286 patch has to do is modify a piece of code that uses a kernel feature
bos@106 287 not present in the older version of the kernel, so that the driver
bos@106 288 continues to work correctly under that older version.
bos@106 289
bos@106 290 A useful goal when writing a good backport patch is to make your code
bos@106 291 look as if it was written for the older version of the kernel you're
bos@106 292 targeting. The less obtrusive the patch, the easier it will be to
bos@106 293 understand and maintain. If you're writing a collection of backport
bos@106 294 patches to avoid the ``rat's nest'' effect of lots of
bos@106 295 \texttt{\#ifdef}s (hunks of source code that are only used
bos@106 296 conditionally) in your code, don't introduce version-dependent
bos@106 297 \texttt{\#ifdef}s into the patches. Instead, write several patches,
bos@106 298 each of which makes unconditional changes, and control their
bos@106 299 application using guards.
bos@106 300
bos@106 301 There are two reasons to divide backport patches into a distinct
bos@106 302 group, away from the ``regular'' patches whose effects they modify.
bos@106 303 The first is that intermingling the two makes it more difficult to use
bos@106 304 a tool like the \hgext{patchbomb} extension to automate the process of
bos@106 305 submitting the patches to an upstream maintainer. The second is that
bos@106 306 a backport patch could perturb the context in which a subsequent
bos@106 307 regular patch is applied, making it impossible to apply the regular
bos@106 308 patch cleanly \emph{without} the earlier backport patch already being
bos@106 309 applied.
bos@106 310
bos@106 311 \section{Useful tips for developing with MQ}
bos@106 312
bos@106 313 \subsection{Organising patches in directories}
bos@106 314
bos@106 315 If you're working on a substantial project with MQ, it's not difficult
bos@106 316 to accumulate a large number of patches. For example, I have one
bos@106 317 patch repository that contains over 250 patches.
bos@106 318
bos@106 319 If you can group these patches into separate logical categories, you
bos@106 320 can, if you like, store them in different directories; MQ has no
bos@106 321 problems with patch names that contain path separators.
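
For instance, a \sfilename{series} file could refer to patches kept in
subdirectories of the patch directory like this. The directory and
patch names are hypothetical.
\begin{codesample2}
  accepted/fix-locking.patch
  backport/2.6.9-compat.patch
  noship/driver-ident.patch
\end{codesample2}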
bos@106 322
bos@106 323 \subsection{Viewing the history of a patch}
bos@106 324 \label{mq-collab:tips:interdiff}
bos@106 325
bos@106 326 If you're developing a set of patches over a long time, it's a good
bos@106 327 idea to maintain them in a repository, as discussed in
bos@106 328 section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that
bos@106 329 using the \hgcmd{diff} command to look at the history of changes to a
bos@106 330 patch is unworkable. This is in part because you're looking at the
bos@106 331 second derivative of the real code (a diff of a diff), but also
bos@106 332 because MQ adds noise to the process by modifying time stamps and
bos@106 333 directory names when it updates a patch.
bos@106 334
bos@106 335 However, you can use the \hgext{extdiff} extension, which is bundled
bos@106 336 with Mercurial, to turn a diff of two versions of a patch into
bos@106 337 something readable. To do this, you will need a third-party package
bos@106 338 called \package{patchutils}~\cite{web:patchutils}. This provides a
bos@106 339 command named \command{interdiff}, which shows the differences between
bos@106 340 two diffs as a diff. Used on two versions of the same diff, it
bos@106 341 generates a diff that represents the diff from the first to the second
bos@106 342 version.
bos@106 343
bos@106 344 You can enable the \hgext{extdiff} extension in the usual way, by
bos@106 345 adding a line to the \rcsection{extensions} section of your \hgrc.
bos@106 346 \begin{codesample2}
bos@106 347 [extensions]
bos@106 348 extdiff =
bos@106 349 \end{codesample2}
bos@106 350 The \command{interdiff} command expects to be passed the names of two
bos@106 351 files, but the \hgext{extdiff} extension passes the program it runs a
bos@106 352 pair of directories, each of which can contain an arbitrary number of
bos@106 353 files. We thus need a small program that will run \command{interdiff}
bos@106 354 on each pair of files in these two directories. This program is
bos@106 355 available as \sfilename{hg-interdiff} in the \dirname{examples}
bos@106 356 directory of the source code repository that accompanies this book.
bos@106 357 \excode{hg-interdiff}
bos@106 358
bos@106 359 With the \sfilename{hg-interdiff} program in your shell's search path,
bos@106 360 you can run it as follows, from inside an MQ patch directory:
bos@106 361 \begin{codesample2}
bos@106 362 hg extdiff -p hg-interdiff -r A:B my-change.patch
bos@106 363 \end{codesample2}
bos@106 364 Since you'll probably want to use this long-winded command a lot, you
bos@106 365 can get \hgext{extdiff} to make it available as a normal Mercurial
bos@106 366 command, again by editing your \hgrc.
bos@106 367 \begin{codesample2}
bos@106 368 [extdiff]
bos@106 369 cmd.interdiff = hg-interdiff
bos@106 370 \end{codesample2}
bos@106 371 This directs \hgext{extdiff} to make an \texttt{interdiff} command
bos@106 372 available, so you can now shorten the previous invocation of
bos@238 373 \hgxcmd{extdiff}{extdiff} to something a little more wieldy.
bos@106 374 \begin{codesample2}
bos@106 375 hg interdiff -r A:B my-change.patch
bos@106 376 \end{codesample2}
bos@105 377
bos@107 378 \begin{note}
bos@107 379 The \command{interdiff} command works well only if the underlying
bos@107 380 files against which versions of a patch are generated remain the
bos@107 381 same. If you create a patch, modify the underlying files, and then
bos@107 382 regenerate the patch, \command{interdiff} may not produce useful
bos@107 383 output.
bos@107 384 \end{note}
bos@107 385
bos@240 386 The \hgext{extdiff} extension is useful for more than merely improving
bos@239 387 the presentation of MQ~patches. To read more about it, go to
bos@239 388 section~\ref{sec:hgext:extdiff}.
bos@239 389
bos@104 390 %%% Local Variables:
bos@104 391 %%% mode: latex
bos@104 392 %%% TeX-master: "00book"
bos@104 393 %%% End: