Advanced uses of Mercurial Queues

\label{chap:mq-collab}

While it's easy to pick up straightforward uses of Mercurial Queues, a little discipline and some of MQ's less frequently used capabilities make it possible to work in complicated development environments.

In this chapter, I will use as an example a technique I have used to manage the development of an InfiniBand device driver for the Linux kernel. The driver in question is large (at least as drivers go), with 25,000 lines of code spread across 35 source files. It is maintained by a small team of developers.

While much of the material in this chapter is specific to Linux, the same principles apply to any code base for which you're not the primary owner, and upon which you need to do a lot of development.

The problem of many targets

The Linux kernel changes rapidly, and has never been internally stable; developers frequently make drastic changes between releases. This means that a version of the driver that works well with a particular released version of the kernel will typically not even compile correctly against any other version.

To maintain a driver, we have to keep a number of distinct versions of Linux in mind.

One target is the main Linux kernel development tree. In this case, maintenance of the code is partly shared by other developers in the kernel community, who make drive-by modifications to the driver as they develop and refine kernel subsystems.

We also maintain a number of backports to older versions of the Linux kernel, to support the needs of customers who are running older Linux distributions that do not incorporate our drivers. (To backport a piece of code is to modify it to work in an older version of its target environment than the version it was developed for.)

Finally, we make software releases on a schedule that is necessarily not aligned with those used by Linux distributors and kernel developers, so that we can deliver new features to customers without forcing them to upgrade their entire kernels or distributions.

Tempting approaches that don't work well

There are two standard ways to maintain a piece of software that has to target many different environments.

The first is to maintain a number of branches, each intended for a single target. The trouble with this approach is that you must maintain iron discipline in the flow of changes between repositories. A new feature or bug fix must start life in a pristine repository, then percolate out to every backport repository. Backport changes are more limited in the branches they should propagate to; a backport change that is applied to a branch where it doesn't belong will probably stop the driver from compiling.

The second is to maintain a single source tree filled with conditional statements that turn chunks of code on or off depending on the intended target.
Because these ifdefs are not allowed in the Linux kernel tree, a manual or automatic process must be followed to strip them out and yield a clean tree. A code base maintained in this fashion rapidly becomes a rat's nest of conditional blocks that are difficult to understand and maintain.

Neither of these approaches is well suited to a situation where you don't own the canonical copy of a source tree. In the case of a Linux driver that is distributed with the standard kernel, Linus's tree contains the copy of the code that will be treated by the world as canonical. The upstream version of my driver can be modified by people I don't know, without me even finding out about it until after the changes show up in Linus's tree.

These approaches have the added weakness of making it difficult to generate well-formed patches to submit upstream.

In principle, Mercurial Queues seems like a good candidate to manage a development scenario such as the above. While this is indeed the case, MQ contains a few added features that make the job more pleasant.

\section{Conditionally applying patches with guards}

Perhaps the best way to maintain sanity with so many targets is to be able to choose specific patches to apply for a given situation. MQ provides a feature called guards (which originates with quilt's guards command) that does just this. To start off, let's create a simple repository for experimenting in.

This gives us a tiny repository that contains two patches that don't have any dependencies on each other, because they touch different files.

The idea behind conditional application is that you can tag a patch with a guard, which is simply a text string of your choosing, then tell MQ to select specific guards to use when applying patches. MQ will then either apply, or skip over, a guarded patch, depending on the guards that you have selected.

A patch can have an arbitrary number of guards; each one is positive (apply this patch if this guard is selected) or negative (skip this patch if this guard is selected). A patch with no guards is always applied.

Controlling the guards on a patch

The qguard command lets you determine which guards should apply to a patch, or display the guards that are already in effect. Without any arguments, it displays the guards on the current topmost patch.

To set a positive guard on a patch, prefix the name of the guard with a +.

To set a negative guard on a patch, prefix the name of the guard with a -.

The qguard command replaces the full set of guards on a patch; it does not add to the guards already there. What this means is that if you run hg qguard +a +b on a patch, then hg qguard +c on the same patch, the only guard that will be set on it afterwards is +c.

Mercurial stores guards in the series file; the form in which they are stored is easy both to understand and to edit by hand. (In other words, you don't have to use the qguard command if you don't want to; it's okay to simply edit the series file.)

Selecting the guards to use

The qselect command determines which guards are active at a given time. The effect of this is to determine which patches MQ will apply the next time you run qpush. It has no other effect; in particular, it doesn't do anything to patches that are already applied.

With no arguments, the qselect command lists the guards currently in effect, one per line of output. Each argument is treated as the name of a guard to apply.

In case you're interested, the currently selected guards are stored in the guards file.

We can see the effect the selected guards have when we run qpush.

A guard cannot start with a + or - character. The name of a guard must not contain white space, but most other characters are acceptable. If you try to use a guard with an invalid name, MQ will complain.

Changing the selected guards changes the patches that are applied.

Negative guards take precedence over positive guards.
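
The precedence of negative guards over positive ones can be modelled in a few lines of Python. This is a simplified sketch, not MQ's implementation: the function name should_apply and the sample guard names are hypothetical, and it assumes that a patch carrying only negative guards, none of which is selected, is applied.

```python
def should_apply(patch_guards, selected):
    """Model whether a patch would be pushed.

    patch_guards: guards set on the patch, e.g. ["+foo", "-quux"].
    selected: bare guard names currently selected, e.g. ["foo"].
    """
    selected = set(selected)
    positive = {g[1:] for g in patch_guards if g.startswith("+")}
    negative = {g[1:] for g in patch_guards if g.startswith("-")}

    # A matching negative guard always wins.
    if negative & selected:
        return False
    # With positive guards present, at least one must be selected.
    if positive:
        return bool(positive & selected)
    # No guards at all, or only unmatched negative guards.
    return True


print(should_apply([], ["foo"]))                         # True
print(should_apply(["+foo"], []))                        # False
print(should_apply(["+foo"], ["foo"]))                   # True
print(should_apply(["+foo", "-quux"], ["foo", "quux"]))  # False
```

The last call is the interesting one: the patch has a selected positive guard, but the selected negative guard still causes it to be skipped.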

MQ's rules for applying patches

The rules that MQ uses when deciding whether to apply a patch are as follows.

A patch that has no guards is always applied.

If the patch has any negative guard that matches any currently selected guard, the patch is skipped.

If the patch has any positive guard that matches any currently selected guard, the patch is applied.

If the patch has positive guards, but none matches any currently selected guard, the patch is skipped. (A patch that has only negative guards, none of which is selected, is applied.)

Trimming the work environment

In working on the device driver I mentioned earlier, I don't apply the patches to a normal Linux kernel tree. Instead, I use a repository that contains only a snapshot of the source files and headers that are relevant to InfiniBand development. This repository is 1% of the size of a kernel repository, so it's easier to work with.

I then choose a base version on top of which the patches are applied. This is a snapshot of the Linux kernel tree as of a revision of my choosing. When I take the snapshot, I record the changeset ID from the kernel repository in the commit message. Since the snapshot preserves the shape and content of the relevant parts of the kernel tree, I can apply my patches on top of either my tiny repository or a normal kernel tree.

Normally, the base tree atop which the patches apply should be a snapshot of a very recent upstream tree. This best facilitates the development of patches that can easily be submitted upstream with few or no modifications.

Dividing up the <filename role="special">series</filename> file

I categorise the patches in the series file into a number of logical groups. Each section of like patches begins with a block of comments that describes the purpose of the patches that follow.

The sequence of patch groups that I maintain follows. The ordering of these groups is important; I'll describe why after I introduce the groups.

The accepted group. Patches that the development team has submitted to the maintainer of the InfiniBand subsystem, and which he has accepted, but which are not present in the snapshot that the tiny repository is based on. These are read-only patches, present only to transform the tree into a state similar to that of the upstream maintainer's repository.

The rework group. Patches that I have submitted, but that the upstream maintainer has requested modifications to before he will accept them.

The pending group. Patches that I have not yet submitted to the upstream maintainer, but which we have finished working on. These will be read-only for a while. If the upstream maintainer accepts them upon submission, I'll move them to the end of the accepted group.
If he requests that I modify any, I'll move them to the beginning of the rework group.

The in progress group. Patches that are actively being developed, and should not be submitted anywhere yet.

The backport group. Patches that adapt the source tree to older versions of the kernel tree.

The do not ship group. Patches that for some reason should never be submitted upstream. For example, one such patch might change embedded driver identification strings to make it easier to distinguish, in the field, between an out-of-tree version of the driver and a version shipped by a distribution vendor.

Now to return to the reasons for ordering groups of patches in this way. We would like the lowest patches in the stack to be as stable as possible, so that we will not need to rework higher patches due to changes in context. Putting patches that will never be changed first in the series file serves this purpose.

We would also like the patches that we know we'll need to modify to be applied on top of a source tree that resembles the upstream tree as closely as possible. This is why we keep accepted patches around for a while.

The backport and do not ship patches float at the end of the series file. The backport patches must be applied on top of all other patches, and the do not ship patches might as well stay out of harm's way.
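
To make the grouping concrete, here is a small Python sketch that reads a series file laid out in this way. The patch names, directory names, and guard names in the sample are hypothetical, and the parser is a simplified model rather than MQ's own code. It assumes that a line starting with # is a comment and that guards are written after the patch name as #+name or #-name.

```python
import re

# A guard annotation looks like "#+name" or "#-name".
GUARD_RE = re.compile(r"#([-+][^-+#\s][^#\s]*)")

# Hypothetical series file: one comment block per group, in stack order.
SERIES = """\
# Accepted upstream, not yet in the base snapshot.
accepted/fix-locking.patch #+accepted

# Sent upstream; the maintainer asked for changes.
rework/new-ioctl.patch #+rework

# Finished, not yet submitted: no guards.
pending/cleanup-headers.patch

# Still being developed.
devel/experimental-dma.patch #+devel

# Backports for older kernels.
backport/workqueue-api.patch #+2.6.9 #+2.6.10
"""


def parse_series(text):
    """Return (patch name, guards) pairs in the order they appear."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or group comment
        name = line.split("#", 1)[0].strip()
        entries.append((name, GUARD_RE.findall(line)))
    return entries


for name, guards in parse_series(SERIES):
    print(name, guards)
```

Because the parser preserves file order, the output also shows the stacking order: the stable groups come first and the backports come last.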

Maintaining the patch series

In my work, I use a number of guards to control which patches are to be applied.

Accepted patches are guarded with accepted. I enable this guard most of the time. When I'm applying the patches on top of a tree where the accepted patches are already present, I can turn this guard off, and the patches that follow will apply cleanly.

Patches that are finished, but not yet submitted, have no guards. If I'm applying the patch stack to a copy of the upstream tree, I don't need to enable any guards in order to get a reasonably safe source tree.

Those patches that need reworking before being resubmitted are guarded with rework.

For those patches that are still under development, I use devel.

A backport patch may have several guards, one for each version of the kernel to which it applies. For example, a patch that backports a piece of code to 2.6.9 will have a 2.6.9 guard.

This variety of guards gives me considerable flexibility in determining what kind of source tree I want to end up with. For most situations, the selection of appropriate guards is automated during the build process, but I can manually tune the guards to use for less common circumstances.

The art of writing backport patches

Using MQ, writing a backport patch is a simple process.
All such a patch has to do is modify a piece of code that uses a kernel feature not present in the older version of the kernel, so that the driver continues to work correctly under that older version.

A useful goal when writing a good backport patch is to make your code look as if it were written for the older version of the kernel you're targeting. The less obtrusive the patch, the easier it will be to understand and maintain. If you're writing a collection of backport patches to avoid the rat's nest effect of lots of #ifdefs (hunks of source code that are only used conditionally) in your code, don't introduce version-dependent #ifdefs into the patches. Instead, write several patches, each of which makes unconditional changes, and control their application using guards.

There are two reasons to divide backport patches into a distinct group, away from the regular patches whose effects they modify. The first is that intermingling the two makes it more difficult to use a tool like the patchbomb extension to automate the process of submitting the patches to an upstream maintainer. The second is that a backport patch could perturb the context in which a subsequent regular patch is applied, making it impossible to apply the regular patch cleanly without the earlier backport patch already being applied.

Useful tips for developing with MQ

Organising patches in directories

If you're working on a substantial project with MQ, it's not difficult to accumulate a large number of patches.
For example, I have one patch repository that contains over 250 patches.

If you can group these patches into separate logical categories, you can if you like store them in different directories; MQ has no problems with patch names that contain path separators.

Viewing the history of a patch
\label{mq-collab:tips:interdiff}

If you're developing a set of patches over a long time, it's a good idea to maintain them in a repository, as discussed in section . If you do so, you'll quickly discover that using the hg diff command to look at the history of changes to a patch is unworkable. This is in part because you're looking at the second derivative of the real code (a diff of a diff), but also because MQ adds noise to the process by modifying time stamps and directory names when it updates a patch.

However, you can use the extdiff extension, which is bundled with Mercurial, to turn a diff of two versions of a patch into something readable. To do this, you will need a third-party package called patchutils [web:patchutils]. This provides a command named interdiff, which shows the differences between two diffs as a diff. Used on two versions of the same diff, it generates a diff that represents the diff from the first to the second version.

You can enable the extdiff extension in the usual way, by adding a line to the extensions section of your ~/.hgrc.

    [extensions]
    extdiff =

The interdiff command expects to be passed the names of two files, but the extdiff extension passes the program it runs a pair of directories, each of which can contain an arbitrary number of files. We thus need a small program that will run interdiff on each pair of files in these two directories. This program is available as hg-interdiff in the examples directory of the source code repository that accompanies this book.

With the hg-interdiff program in your shell's search path, you can run it as follows, from inside an MQ patch directory:

    hg extdiff -p hg-interdiff -r A:B my-change.patch

Since you'll probably want to use this long-winded command a lot, you can get extdiff to make it available as a normal Mercurial command, again by editing your ~/.hgrc.

    [extdiff]
    cmd.interdiff = hg-interdiff

This directs extdiff to make an interdiff command available, so you can now shorten the previous invocation of extdiff to something a little more wieldy.

    hg interdiff -r A:B my-change.patch

The interdiff command works well only if the underlying files against which versions of a patch are generated remain the same. If you create a patch, modify the underlying files, and then regenerate the patch, interdiff may not produce useful output.
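
For readers who want to see what such an adapter involves, the following Python sketch pairs up the files in the two directories and runs interdiff on each pair. It is an illustration written for this section, not the hg-interdiff program shipped with the book. The helper names are hypothetical, and it assumes that interdiff is on the search path and that a file missing on one side can be represented by the null device.

```python
import os
import subprocess
import sys


def relative_files(base):
    """Return the set of file paths under base, relative to base."""
    found = set()
    for root, _dirs, files in os.walk(base):
        for name in files:
            found.add(os.path.relpath(os.path.join(root, name), base))
    return found


def paired_files(dir_a, dir_b):
    """Pair files by relative path; a missing side becomes os.devnull."""
    pairs = []
    for name in sorted(relative_files(dir_a) | relative_files(dir_b)):
        a = os.path.join(dir_a, name)
        b = os.path.join(dir_b, name)
        pairs.append((a if os.path.isfile(a) else os.devnull,
                      b if os.path.isfile(b) else os.devnull))
    return pairs


def main(old, new):
    """Run interdiff on every pair and return a combined exit status."""
    if os.path.isdir(old) and os.path.isdir(new):
        pairs = paired_files(old, new)
    else:
        pairs = [(old, new)]  # two plain files: compare them directly
    status = 0
    for a, b in pairs:
        status |= subprocess.call(["interdiff", a, b])
    return status


# Only act when invoked with the two paths that extdiff supplies.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

Pairing by relative path means a patch that exists in only one of the two revisions is compared against an empty file instead of being silently dropped.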

The extdiff extension is useful for more than merely improving the presentation of MQ patches. To read more about it, go to section .