hgbook

view en/ch13-mq-collab.xml @ 559:b90b024729f1

WIP DocBook snapshot that all compiles. Mirabile dictu!
author Bryan O'Sullivan <bos@serpentine.com>
date Wed Feb 18 00:22:09 2009 -0800 (2009-02-18)
parents en/ch13-mq-collab.tex@f72b7e6cbe90
children 8fcd44708f41
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <chapter id="chap:mq-collab">
4 <title>Advanced uses of Mercurial Queues</title>
6 <para>While it's easy to pick up straightforward uses of Mercurial
7 Queues, use of a little discipline and some of MQ's less
8 frequently used capabilities makes it possible to work in
9 complicated development environments.</para>
11 <para>In this chapter, I will use as an example a technique I have
12 used to manage the development of an Infiniband device driver for
13 the Linux kernel. The driver in question is large (at least as
14 drivers go), with 25,000 lines of code spread across 35 source
15 files. It is maintained by a small team of developers.</para>
17 <para>While much of the material in this chapter is specific to
18 Linux, the same principles apply to any code base for which you're
19 not the primary owner, and upon which you need to do a lot of
20 development.</para>
22 <sect1>
23 <title>The problem of many targets</title>
25 <para>The Linux kernel changes rapidly, and has never been
26 internally stable; developers frequently make drastic changes
27 between releases. This means that a version of the driver that
28 works well with a particular released version of the kernel will
29 typically not even <emphasis>compile</emphasis> correctly against
30 any other version.</para>
32 <para>To maintain a driver, we have to keep a number of distinct
33 versions of Linux in mind.</para>
34 <itemizedlist>
35 <listitem><para>One target is the main Linux kernel development
36 tree. Maintenance of the code is in this case partly shared
37 by other developers in the kernel community, who make
38 <quote>drive-by</quote> modifications to the driver as they
39 develop and refine kernel subsystems.</para>
40 </listitem>
41 <listitem><para>We also maintain a number of
42 <quote>backports</quote> to older versions of the Linux
43 kernel, to support the needs of customers who are running
44 older Linux distributions that do not incorporate our
45 drivers. (To <emphasis>backport</emphasis> a piece of code
46 is to modify it to work in an older version of its target
47 environment than the version it was developed for.)</para>
48 </listitem>
49 <listitem><para>Finally, we make software releases on a schedule
50 that is necessarily not aligned with those used by Linux
51 distributors and kernel developers, so that we can deliver
52 new features to customers without forcing them to upgrade
53 their entire kernels or distributions.</para>
54 </listitem></itemizedlist>
56 <sect2>
57 <title>Tempting approaches that don't work well</title>
59 <para>There are two <quote>standard</quote> ways to maintain a
60 piece of software that has to target many different
61 environments.</para>
63 <para>The first is to maintain a number of branches, each
64 intended for a single target. The trouble with this approach
65 is that you must maintain iron discipline in the flow of
66 changes between repositories. A new feature or bug fix must
67 start life in a <quote>pristine</quote> repository, then
68 percolate out to every backport repository. Backport changes
69 are more limited in the branches they should propagate to; a
70 backport change that is applied to a branch where it doesn't
71 belong will probably stop the driver from compiling.</para>
73 <para>The second is to maintain a single source tree filled with
74 conditional statements that turn chunks of code on or off
75 depending on the intended target. Because these
76 <quote>ifdefs</quote> are not allowed in the Linux kernel
77 tree, a manual or automatic process must be followed to strip
78 them out and yield a clean tree. A code base maintained in
79 this fashion rapidly becomes a rat's nest of conditional
80 blocks that are difficult to understand and maintain.</para>
82 <para>Neither of these approaches is well suited to a situation
83 where you don't <quote>own</quote> the canonical copy of a
84 source tree. In the case of a Linux driver that is
85 distributed with the standard kernel, Linus's tree contains
86 the copy of the code that will be treated by the world as
87 canonical. The upstream version of <quote>my</quote> driver
88 can be modified by people I don't know, without me even
89 finding out about it until after the changes show up in
90 Linus's tree.</para>
92 <para>These approaches have the added weakness of making it
93 difficult to generate well-formed patches to submit
94 upstream.</para>
96 <para>In principle, Mercurial Queues seems like a good candidate
97 to manage a development scenario such as the above. While
98 this is indeed the case, MQ contains a few added features that
99 make the job more pleasant.</para>
101 </sect2>
102 </sect1>
103 <sect1>
104 <title>Conditionally applying patches with guards</title>
106 <para>Perhaps the best way to maintain sanity with so many targets
107 is to be able to choose specific patches to apply for a given
108 situation. MQ provides a feature called <quote>guards</quote>
109 (which originates with quilt's <literal>guards</literal>
110 command) that does just this. To start off, let's create a
111 simple repository for experimenting in. <!--
112 &interaction.mq.guards.init; --> This gives us a tiny repository
113 that contains two patches that don't have any dependencies on
114 each other, because they touch different files.</para>
116 <para>The idea behind conditional application is that you can
117 <quote>tag</quote> a patch with a <emphasis>guard</emphasis>,
118 which is simply a text string of your choosing, then tell MQ to
119 select specific guards to use when applying patches. MQ will
120 then either apply, or skip over, a guarded patch, depending on
121 the guards that you have selected.</para>
123 <para>A patch can have an arbitrary number of guards; each one is
124 <emphasis>positive</emphasis> (<quote>apply this patch if this
125 guard is selected</quote>) or <emphasis>negative</emphasis>
126 (<quote>skip this patch if this guard is selected</quote>). A
127 patch with no guards is always applied.</para>
129 </sect1>
130 <sect1>
131 <title>Controlling the guards on a patch</title>
133 <para>The <command role="hg-ext-mq">qguard</command> command lets
134 you determine which guards should apply to a patch, or display
135 the guards that are already in effect. Without any arguments, it
136 displays the guards on the current topmost patch. <!--
137 &interaction.mq.guards.qguard; --> To set a positive guard on a
138 patch, prefix the name of the guard with a
139 <quote><literal>+</literal></quote>. <!--
140 &interaction.mq.guards.qguard.pos; --> To set a negative guard
141 on a patch, prefix the name of the guard with a
142 <quote><literal>-</literal></quote>. <!--
143 &interaction.mq.guards.qguard.neg; --></para>
145 <note>
146 <para> The <command role="hg-ext-mq">qguard</command> command
147 <emphasis>sets</emphasis> the guards on a patch; it doesn't
148 <emphasis>modify</emphasis> them. What this means is that if
149 you run <command role="hg-cmd">hg qguard +a +b</command> on a
150 patch, then <command role="hg-cmd">hg qguard +c</command> on
151 the same patch, the <emphasis>only</emphasis> guard that will
152 be set on it afterwards is <literal>+c</literal>.</para>
153 </note>
155 <para>Mercurial stores guards in the <filename
156 role="special">series</filename> file; the form in which they
157 are stored is easy both to understand and to edit by hand. (In
158 other words, you don't have to use the <command
159 role="hg-ext-mq">qguard</command> command if you don't want
160 to; it's okay to simply edit the <filename
161 role="special">series</filename> file.) <!--
162 &interaction.mq.guards.series; --></para>
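<para>As a rough illustration of how simple this format is, the
following Python sketch splits one line of a <filename
role="special">series</filename> file into a patch name and its
guards. It assumes that guards follow the patch name as
<literal>#+name</literal> or <literal>#-name</literal> and that any
other text after a <literal>#</literal> is an ordinary comment. The
function <literal>parse_series_line</literal> is a hypothetical
helper written for this example; it is not part of Mercurial.</para>
<programlisting language="python">def parse_series_line(line):
    """Split one series line into (patch_name, guards).

    Returns None for blank lines and comment-only lines.
    """
    text, _, rest = line.partition("#")
    name = text.strip()
    if not name:
        return None
    guards = []
    for chunk in rest.split("#"):
        token = chunk.strip()
        # Assumption: a chunk such as "+foo" or "-quux" is a guard;
        # anything else after "#" is treated as a comment.
        if (len(token) > 1 and token[0] in "+-"
                and not any(c.isspace() for c in token)):
            guards.append(token)
    return name, guards</programlisting>
<para>Under these assumptions, a line such as <literal>hello.patch
#-quux</literal> yields the name <literal>hello.patch</literal> and
the single guard <literal>-quux</literal>.</para>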
164 </sect1>
165 <sect1>
166 <title>Selecting the guards to use</title>
168 <para>The <command role="hg-ext-mq">qselect</command> command
169 determines which guards are active at a given time. The effect
170 of this is to determine which patches MQ will apply the next
171 time you run <command role="hg-ext-mq">qpush</command>. It has
172 no other effect; in particular, it doesn't do anything to
173 patches that are already applied.</para>
175 <para>With no arguments, the <command
176 role="hg-ext-mq">qselect</command> command lists the guards
177 currently in effect, one per line of output. Each argument is
178 treated as the name of a guard to apply. <!--
179 &interaction.mq.guards.qselect.foo; --> In case you're
180 interested, the currently selected guards are stored in the
181 <filename role="special">guards</filename> file. <!--
182 &interaction.mq.guards.qselect.cat; --> We can see the effect
183 the selected guards have when we run <command
184 role="hg-ext-mq">qpush</command>. <!--
185 &interaction.mq.guards.qselect.qpush; --></para>
187 <para>A guard cannot start with a
188 <quote><literal>+</literal></quote> or
189 <quote><literal>-</literal></quote> character. The name of a
190 guard must not contain white space, but most other characters
191 are acceptable. If you try to use a guard with an invalid name,
192 MQ will complain: <!-- &interaction.mq.guards.qselect.error; -->
193 Changing the selected guards changes the patches that are
194 applied. <!-- &interaction.mq.guards.qselect.quux; --> You can
195 see in the example below that negative guards take precedence
196 over positive guards. <!--
197 &interaction.mq.guards.qselect.foobar; --></para>
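<para>The naming rules above are easy to check before handing a name
to MQ. The sketch below is an illustration only, not MQ's own
validation code. The helper name
<literal>guard_name_problem</literal> is hypothetical, and rejecting
<literal>#</literal> is an assumption that the text above does not
state.</para>
<programlisting language="python">def guard_name_problem(name):
    """Return a description of what is wrong with a guard name, or None."""
    if not name:
        return "guard name is empty"
    if name[0] in "+-":
        return "guard name starts with %r" % name[0]
    if any(ch.isspace() for ch in name):
        return "guard name contains white space"
    # Assumption: "#" is also rejected, since it is the character
    # that introduces a guard in the series file.
    if "#" in name:
        return "guard name contains '#'"
    return None</programlisting>
<para>A name such as <literal>2.6.9</literal> passes this check,
while <literal>+foo</literal> does not.</para>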
199 </sect1>
200 <sect1>
201 <title>MQ's rules for applying patches</title>
203 <para>The rules that MQ uses when deciding whether to apply a
204 patch are as follows.</para>
205 <itemizedlist>
206 <listitem><para>A patch that has no guards is always
207 applied.</para>
208 </listitem>
209 <listitem><para>If the patch has any negative guard that matches
210 any currently selected guard, the patch is skipped.</para>
211 </listitem>
212 <listitem><para>If the patch has any positive guard that matches
213 any currently selected guard, the patch is applied.</para>
214 </listitem>
215 <listitem><para>If the patch has positive guards, but none matches any
216 currently selected guard, the patch is skipped. If it has only negative
217 guards, and none matches, the patch is applied.</para>
218 </listitem></itemizedlist>
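<para>These rules can be written down as a small function. The
following Python sketch is my own restatement, not MQ's source code,
and <literal>should_apply</literal> is a hypothetical name. For the
case where no guard matches, it assumes that a patch with positive
guards is skipped, and that a patch carrying only negative guards is
applied.</para>
<programlisting language="python">def should_apply(patch_guards, selected):
    """Decide whether a patch is applied, given its guards.

    patch_guards: strings such as "+foo" or "-quux" (may be empty).
    selected: names of the currently selected guards, without a sign.
    """
    selected = set(selected)
    if not patch_guards:
        return True       # an unguarded patch is always applied
    negative = [g[1:] for g in patch_guards if g.startswith("-")]
    positive = [g[1:] for g in patch_guards if g.startswith("+")]
    if any(name in selected for name in negative):
        return False      # a selected negative guard takes precedence
    if any(name in selected for name in positive):
        return True
    # Nothing matched. Assumption: positive guards then mean "skip",
    # while a patch with only negative guards is applied.
    return not positive</programlisting>
<para>For example, with <literal>foo</literal> and
<literal>bar</literal> both selected, a patch guarded with
<literal>+foo</literal> and <literal>-bar</literal> is skipped,
because the negative guard is checked first.</para>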
220 </sect1>
221 <sect1>
222 <title>Trimming the work environment</title>
224 <para>In working on the device driver I mentioned earlier, I don't
225 apply the patches to a normal Linux kernel tree. Instead, I use
226 a repository that contains only a snapshot of the source files
227 and headers that are relevant to Infiniband development. This
228 repository is 1% the size of a kernel repository, so it's easier
229 to work with.</para>
231 <para>I then choose a <quote>base</quote> version on top of which
232 the patches are applied. This is a snapshot of the Linux kernel
233 tree as of a revision of my choosing. When I take the snapshot,
234 I record the changeset ID from the kernel repository in the
235 commit message. Since the snapshot preserves the
236 <quote>shape</quote> and content of the relevant parts of the
237 kernel tree, I can apply my patches on top of either my tiny
238 repository or a normal kernel tree.</para>
240 <para>Normally, the base tree atop which the patches apply should
241 be a snapshot of a very recent upstream tree. This best
242 facilitates the development of patches that can easily be
243 submitted upstream with few or no modifications.</para>
245 </sect1>
246 <sect1>
247 <title>Dividing up the <filename role="special">series</filename>
248 file</title>
250 <para>I categorise the patches in the <filename
251 role="special">series</filename> file into a number of logical
252 groups. Each section of like patches begins with a block of
253 comments that describes the purpose of the patches that
254 follow.</para>
256 <para>The sequence of patch groups that I maintain follows. The
257 ordering of these groups is important; I'll describe why after I
258 introduce the groups.</para>
259 <itemizedlist>
260 <listitem><para>The <quote>accepted</quote> group. Patches that
261 the development team has submitted to the maintainer of the
262 Infiniband subsystem, and which he has accepted, but which
263 are not present in the snapshot that the tiny repository is
264 based on. These are <quote>read only</quote> patches,
265 present only to bring the tree into a state similar to that of
266 the upstream maintainer's repository.</para>
267 </listitem>
268 <listitem><para>The <quote>rework</quote> group. Patches that I
269 have submitted, but that the upstream maintainer has
270 requested modifications to before he will accept
271 them.</para>
272 </listitem>
273 <listitem><para>The <quote>pending</quote> group. Patches that
274 I have not yet submitted to the upstream maintainer, but
275 which we have finished working on. These will be <quote>read
276 only</quote> for a while. If the upstream maintainer
277 accepts them upon submission, I'll move them to the end of
278 the <quote>accepted</quote> group. If he requests that I
279 modify any, I'll move them to the beginning of the
280 <quote>rework</quote> group.</para>
281 </listitem>
282 <listitem><para>The <quote>in progress</quote> group. Patches
283 that are actively being developed, and should not be
284 submitted anywhere yet.</para>
285 </listitem>
286 <listitem><para>The <quote>backport</quote> group. Patches that
287 adapt the source tree to older versions of the kernel
288 tree.</para>
289 </listitem>
290 <listitem><para>The <quote>do not ship</quote> group. Patches
291 that for some reason should never be submitted upstream.
292 For example, one such patch might change embedded driver
293 identification strings to make it easier to distinguish, in
294 the field, between an out-of-tree version of the driver and
295 a version shipped by a distribution vendor.</para>
296 </listitem></itemizedlist>
298 <para>Now to return to the reasons for ordering groups of patches
299 in this way. We would like the lowest patches in the stack to
300 be as stable as possible, so that we will not need to rework
301 higher patches due to changes in context. Putting patches that
302 will never be changed first in the <filename
303 role="special">series</filename> file serves this
304 purpose.</para>
306 <para>We would also like the patches that we know we'll need to
307 modify to be applied on top of a source tree that resembles the
308 upstream tree as closely as possible. This is why we keep
309 accepted patches around for a while.</para>
311 <para>The <quote>backport</quote> and <quote>do not ship</quote>
312 patches float at the end of the <filename
313 role="special">series</filename> file. The backport patches
314 must be applied on top of all other patches, and the <quote>do
315 not ship</quote> patches might as well stay out of harm's
316 way.</para>
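<para>To make the ordering concrete, here is a minimal Python sketch
that writes out the text of a <filename
role="special">series</filename> file with one commented section per
group, always in the order described above. It is an illustration
under my own assumptions: <literal>GROUP_ORDER</literal> and
<literal>render_series</literal> are hypothetical names, and the
comment lines are simply the group names.</para>
<programlisting language="python">GROUP_ORDER = ["accepted", "rework", "pending", "in progress",
               "backport", "do not ship"]

def render_series(patches_by_group):
    """Build the text of a series file, one commented section per group.

    patches_by_group maps a group name to a list of patch file names.
    """
    unknown = sorted(set(patches_by_group) - set(GROUP_ORDER))
    if unknown:
        raise ValueError("unknown groups: %s" % ", ".join(unknown))
    lines = []
    for group in GROUP_ORDER:
        patches = patches_by_group.get(group, [])
        if not patches:
            continue
        lines.append("# %s" % group)   # comment block opening the section
        lines.extend(patches)          # patches keep their given order
    return "\n".join(lines) + "\n"</programlisting>
<para>Whatever order the groups are supplied in, the accepted patches
come out first and the <quote>do not ship</quote> patches
last.</para>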
318 </sect1>
319 <sect1>
320 <title>Maintaining the patch series</title>
322 <para>In my work, I use a number of guards to control which
323 patches are to be applied.</para>
325 <itemizedlist>
326 <listitem><para><quote>Accepted</quote> patches are guarded with
327 <literal>accepted</literal>. I enable this guard most of
328 the time. When I'm applying the patches on top of a tree
329 where the patches are already present, I can turn this guard
330 off, and the patches that follow it will apply
331 cleanly.</para>
332 </listitem>
333 <listitem><para>Patches that are <quote>finished</quote>, but
334 not yet submitted, have no guards. If I'm applying the
335 patch stack to a copy of the upstream tree, I don't need to
336 enable any guards in order to get a reasonably safe source
337 tree.</para>
338 </listitem>
339 <listitem><para>Those patches that need reworking before being
340 resubmitted are guarded with
341 <literal>rework</literal>.</para>
342 </listitem>
343 <listitem><para>For those patches that are still under
344 development, I use <literal>devel</literal>.</para>
345 </listitem>
346 <listitem><para>A backport patch may have several guards, one
347 for each version of the kernel to which it applies. For
348 example, a patch that backports a piece of code to 2.6.9
349 will have a <literal>2.6.9</literal> guard.</para>
350 </listitem></itemizedlist>
351 <para>This variety of guards gives me considerable flexibility in
352 determining what kind of source tree I want to end up with. For
353 most situations, the selection of appropriate guards is
354 automated during the build process, but I can manually tune the
355 guards to use for less common circumstances.</para>
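<para>The kind of automation I have in mind can be sketched in a few
lines of Python. This is a hypothetical policy, not the build script
I actually use: the function name
<literal>guards_for_target</literal> and its parameters are invented
for the example, and only the guard names come from the list
above.</para>
<programlisting language="python">def guards_for_target(kernel_version=None, include_unfinished=False,
                      tree_has_accepted=False):
    """Return the guard names to select for one kind of build."""
    guards = []
    if not tree_has_accepted:
        guards.append("accepted")      # base tree lacks these patches
    if include_unfinished:
        guards.extend(["rework", "devel"])
    if kernel_version is not None:
        guards.append(kernel_version)  # e.g. "2.6.9" enables its backports
    return guards</programlisting>
<para>The resulting names are what you would pass as arguments to
<command role="hg-ext-mq">qselect</command>. An empty list
corresponds to selecting no guards at all.</para>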
357 <sect2>
358 <title>The art of writing backport patches</title>
360 <para>Using MQ, writing a backport patch is a simple process.
361 All such a patch has to do is modify a piece of code that uses
362 a kernel feature not present in the older version of the
363 kernel, so that the driver continues to work correctly under
364 that older version.</para>
366 <para>A useful goal when writing a good backport patch is to
367 make your code look as if it was written for the older version
368 of the kernel you're targeting. The less obtrusive the patch,
369 the easier it will be to understand and maintain. If you're
370 writing a collection of backport patches to avoid the
371 <quote>rat's nest</quote> effect of lots of
372 <literal>#ifdef</literal>s (hunks of source code that are only
373 used conditionally) in your code, don't introduce
374 version-dependent <literal>#ifdef</literal>s into the patches.
375 Instead, write several patches, each of which makes
376 unconditional changes, and control their application using
377 guards.</para>
379 <para>There are two reasons to divide backport patches into a
380 distinct group, away from the <quote>regular</quote> patches
381 whose effects they modify. The first is that intermingling the
382 two makes it more difficult to use a tool like the <literal
383 role="hg-ext">patchbomb</literal> extension to automate the
384 process of submitting the patches to an upstream maintainer.
385 The second is that a backport patch could perturb the context
386 in which a subsequent regular patch is applied, making it
387 impossible to apply the regular patch cleanly
388 <emphasis>without</emphasis> the earlier backport patch
389 already being applied.</para>
391 </sect2>
392 </sect1>
393 <sect1>
394 <title>Useful tips for developing with MQ</title>
396 <sect2>
397 <title>Organising patches in directories</title>
399 <para>If you're working on a substantial project with MQ, it's
400 not difficult to accumulate a large number of patches. For
401 example, I have one patch repository that contains over 250
402 patches.</para>
404 <para>If you can group these patches into separate logical
405 categories, you can, if you like, store them in different
406 directories; MQ has no problems with patch names that contain
407 path separators.</para>
409 </sect2>
410 <sect2 id="mq-collab:tips:interdiff">
411 <title>Viewing the history of a patch</title>
413 <para>If you're developing a set of patches over a long time,
414 it's a good idea to maintain them in a repository, as
415 discussed in section <xref linkend="sec:mq:repo"/>. If you do
416 so, you'll quickly
417 discover that using the <command role="hg-cmd">hg
418 diff</command> command to look at the history of changes to
419 a patch is unworkable. This is in part because you're looking
420 at the second derivative of the real code (a diff of a diff),
421 but also because MQ adds noise to the process by modifying
422 time stamps and directory names when it updates a
423 patch.</para>
425 <para>However, you can use the <literal
426 role="hg-ext">extdiff</literal> extension, which is bundled
427 with Mercurial, to turn a diff of two versions of a patch into
428 something readable. To do this, you will need a third-party
429 package called <literal role="package">patchutils</literal>
430 <citation>web:patchutils</citation>. This provides a command
431 named <command>interdiff</command>, which shows the
432 differences between two diffs as a diff. Used on two versions
433 of the same patch, it generates a diff that represents the changes
434 from the first version to the second.</para>
436 <para>You can enable the <literal
437 role="hg-ext">extdiff</literal> extension in the usual way,
438 by adding a line to the <literal
439 role="rc-extensions">extensions</literal> section of your
440 <filename role="special">~/.hgrc</filename>.</para>
441 <programlisting>[extensions]
extdiff =</programlisting>
442 <para>The <command>interdiff</command> command expects to be
443 passed the names of two files, but the <literal
444 role="hg-ext">extdiff</literal> extension passes the program
445 it runs a pair of directories, each of which can contain an
446 arbitrary number of files. We thus need a small program that
447 will run <command>interdiff</command> on each pair of files in
448 these two directories. This program is available as <filename
449 role="special">hg-interdiff</filename> in the <filename
450 class="directory">examples</filename> directory of the
451 source code repository that accompanies this book. <!--
452 &example.hg-interdiff; --></para>
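<para>The listing is not reproduced here. As a rough idea of what
such an adapter has to do, the following Python sketch pairs up the
files in the two directories and runs
<command>interdiff</command> on each pair. It is not the program
shipped with the book: the function names are my own, and it assumes
that <command>interdiff</command> is on your search path and returns
a non-zero status on failure.</para>
<programlisting language="python">import os
import subprocess

def walk(base):
    """Map each file's path relative to base onto its full path."""
    found = {}
    if os.path.isfile(base):
        found[""] = base               # a single file was passed directly
    for root, _dirs, files in os.walk(base):
        for name in files:
            path = os.path.join(root, name)
            found[os.path.relpath(path, base)] = path
    return found

def pairs(old_dir, new_dir):
    """List (old, new) paths; a missing side becomes the null device."""
    old, new = walk(old_dir), walk(new_dir)
    names = sorted(set(old) | set(new))
    return [(old.get(n, os.devnull), new.get(n, os.devnull))
            for n in names]

def main(argv):
    """Run interdiff on each pair; return 1 if any invocation failed."""
    status = 0
    for old, new in pairs(argv[1], argv[2]):
        if subprocess.call(["interdiff", old, new]) != 0:
            status = 1
    return status</programlisting>
<para>To use the sketch as a standalone program, you would end the
file with a call such as
<literal>sys.exit(main(sys.argv))</literal>, after importing
<literal>sys</literal>.</para>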
454 <para>With the <filename role="special">hg-interdiff</filename>
455 program in your shell's search path, you can run it as
456 follows, from inside an MQ patch directory:</para>
457 <programlisting>hg extdiff -p hg-interdiff -r A:B my-change.patch</programlisting>
459 <para>Since you'll probably want to use this long-winded command
460 a lot, you can get <literal role="hg-ext">extdiff</literal> to
461 make it available as a normal Mercurial command, again by
462 editing your <filename role="special">~/.hgrc</filename>.</para>
464 <programlisting>[extdiff]
465 cmd.interdiff = hg-interdiff</programlisting>
466 <para>This directs <literal role="hg-ext">extdiff</literal> to
467 make an <literal>interdiff</literal> command available, so you
468 can now shorten the previous invocation of <command
469 role="hg-ext-extdiff">extdiff</command> to something a
470 little more manageable.</para>
471 <programlisting>hg interdiff -r A:B my-change.patch</programlisting>
474 <note>
475 <para> The <command>interdiff</command> command works well
476 only if the underlying files against which versions of a
477 patch are generated remain the same. If you create a patch,
478 modify the underlying files, and then regenerate the patch,
479 <command>interdiff</command> may not produce useful
480 output.</para>
481 </note>
483 <para>The <literal role="hg-ext">extdiff</literal> extension is
484 useful for more than merely improving the presentation of MQ
485 patches. To read more about it, go to section <xref
486 linkend="sec:hgext:extdiff"/>.</para>
488 </sect2>
489 </sect1>
490 </chapter>
492 <!--
493 local variables:
494 sgml-parent-document: ("00book.xml" "book" "chapter")
495 end:
496 -->