\chapter{Advanced uses of Mercurial Queues}
\label{chap:mq-collab}

While it's easy to pick up straightforward uses of Mercurial Queues,
use of a little discipline and some of MQ's less frequently used
capabilities makes it possible to work in complicated development
environments.

In this chapter, I will use as an example a technique I have used to
manage the development of an Infiniband device driver for the Linux
kernel. The driver in question is large (at least as drivers go),
with 25,000 lines of code spread across 35 source files. It is
maintained by a small team of developers.

While much of the material in this chapter is specific to Linux, the
same principles apply to any code base for which you're not the
primary owner, and upon which you need to do a lot of development.

\section{The problem of many targets}

The Linux kernel changes rapidly, and has never been internally
stable; developers frequently make drastic changes between releases.
This means that a version of the driver that works well with a
particular released version of the kernel will not even \emph{compile}
correctly against, typically, any other version.

To maintain a driver, we have to keep a number of distinct versions of
Linux in mind.
\begin{itemize}
\item One target is the main Linux kernel development tree.
  Maintenance of the code is in this case partly shared by other
  developers in the kernel community, who make ``drive-by''
  modifications to the driver as they develop and refine kernel
  subsystems.
\item We also maintain a number of ``backports'' to older versions of
  the Linux kernel, to support the needs of customers who are running
  older Linux distributions that do not incorporate our drivers. (To
  \emph{backport} a piece of code is to modify it to work in an older
  version of its target environment than the version it was developed
  for.)
\item Finally, we make software releases on a schedule that is
  necessarily not aligned with those used by Linux distributors and
  kernel developers, so that we can deliver new features to customers
  without forcing them to upgrade their entire kernels or
  distributions.
\end{itemize}

\subsection{Tempting approaches that don't work well}

There are two ``standard'' ways to maintain a piece of software that
has to target many different environments.

The first is to maintain a number of branches, each intended for a
single target. The trouble with this approach is that you must
maintain iron discipline in the flow of changes between repositories.
A new feature or bug fix must start life in a ``pristine'' repository,
then percolate out to every backport repository. Backport changes are
more limited in the branches they should propagate to; a backport
change that is applied to a branch where it doesn't belong will
probably stop the driver from compiling.

The second is to maintain a single source tree filled with conditional
statements that turn chunks of code on or off depending on the
intended target. Because these ``ifdefs'' are not allowed in the
Linux kernel tree, a manual or automatic process must be followed to
strip them out and yield a clean tree. A code base maintained in this
fashion rapidly becomes a rat's nest of conditional blocks that are
difficult to understand and maintain.

Neither of these approaches is well suited to a situation where you
don't ``own'' the canonical copy of a source tree. In the case of a
Linux driver that is distributed with the standard kernel, Linus's
tree contains the copy of the code that will be treated by the world
as canonical. The upstream version of ``my'' driver can be modified
by people I don't know, without me even finding out about it until
after the changes show up in Linus's tree.

These approaches have the added weakness of making it difficult to
generate well-formed patches to submit upstream.

In principle, Mercurial Queues seems like a good candidate to manage a
development scenario such as the above. While this is indeed the
case, MQ contains a few added features that make the job more
pleasant.

\section{Conditionally applying patches with guards}

Perhaps the best way to maintain sanity with so many targets is to be
able to choose specific patches to apply for a given situation.
MQ provides a feature called ``guards'' (which originates with quilt's
\texttt{guards} command) that does just this. To start off, let's
create a simple repository for experimenting in.
\interaction{mq.guards.init}
This gives us a tiny repository that contains two patches that don't
have any dependencies on each other, because they touch different files.

The idea behind conditional application is that you can ``tag'' a
patch with a \emph{guard}, which is simply a text string of your
choosing, then tell MQ to select specific guards to use when applying
patches. MQ will then either apply, or skip over, a guarded patch,
depending on the guards that you have selected.

A patch can have an arbitrary number of guards;
each one is \emph{positive} (``apply this patch if this guard is
selected'') or \emph{negative} (``skip this patch if this guard is
selected''). A patch with no guards is always applied.

\section{Controlling the guards on a patch}

The \hgxcmd{mq}{qguard} command lets you determine which guards should
apply to a patch, or display the guards that are already in effect.
Without any arguments, it displays the guards on the current topmost
patch.
\interaction{mq.guards.qguard}
To set a positive guard on a patch, prefix the name of the guard with
a ``\texttt{+}''.
\interaction{mq.guards.qguard.pos}
To set a negative guard on a patch, prefix the name of the guard with
a ``\texttt{-}''.
\interaction{mq.guards.qguard.neg}

\begin{note}
  The \hgxcmd{mq}{qguard} command \emph{sets} the guards on a patch; it
  doesn't \emph{modify} them. What this means is that if you run
  \hgcmdargs{qguard}{+a +b} on a patch, then \hgcmdargs{qguard}{+c} on
  the same patch, the \emph{only} guard that will be set on it
  afterwards is \texttt{+c}.
\end{note}

Mercurial stores guards in the \sfilename{series} file; the form in
which they are stored is easy both to understand and to edit by hand.
(In other words, you don't have to use the \hgxcmd{mq}{qguard} command if
you don't want to; it's okay to simply edit the \sfilename{series}
file.)
\interaction{mq.guards.series}

\section{Selecting the guards to use}

The \hgxcmd{mq}{qselect} command determines which guards are active at a
given time. The effect of this is to determine which patches MQ will
apply the next time you run \hgxcmd{mq}{qpush}. It has no other effect; in
particular, it doesn't do anything to patches that are already
applied.

With no arguments, the \hgxcmd{mq}{qselect} command lists the guards
currently in effect, one per line of output. Each argument is treated
as the name of a guard to apply.
\interaction{mq.guards.qselect.foo}
In case you're interested, the currently selected guards are stored in
the \sfilename{guards} file.
\interaction{mq.guards.qselect.cat}
We can see the effect the selected guards have when we run
\hgxcmd{mq}{qpush}.
\interaction{mq.guards.qselect.qpush}

A guard cannot start with a ``\texttt{+}'' or ``\texttt{-}''
character. The name of a guard must not contain white space, but most
other characters are acceptable. If you try to use a guard with an
invalid name, MQ will complain:
\interaction{mq.guards.qselect.error}
Changing the selected guards changes the patches that are applied.
\interaction{mq.guards.qselect.quux}
You can see in the example below that negative guards take precedence
over positive guards.
\interaction{mq.guards.qselect.foobar}

\section{MQ's rules for applying patches}

The rules that MQ uses when deciding whether to apply a patch
are as follows; they are checked in order.
\begin{itemize}
\item A patch that has no guards is always applied.
\item If the patch has any negative guard that matches any currently
  selected guard, the patch is skipped.
\item If the patch has any positive guard that matches any currently
  selected guard, the patch is applied.
\item If the patch has positive guards, but none matches any currently
  selected guard, the patch is skipped.
\item If the patch has only negative guards, and none matches any
  currently selected guard, the patch is applied.
\end{itemize}

\section{Trimming the work environment}

In working on the device driver I mentioned earlier, I don't apply the
patches to a normal Linux kernel tree. Instead, I use a repository
that contains only a snapshot of the source files and headers that are
relevant to Infiniband development. This repository is~1\% the size
of a kernel repository, so it's easier to work with.
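
The rules for applying patches given in the previous section can also
be read as a short decision procedure. The Python sketch below is
only an illustration, not MQ's actual code: the name
\texttt{pushable} and its two arguments are assumptions made for this
example. It also assumes that a patch carrying only negative guards,
none of which is selected, is applied; check that against the version
of MQ you use.
\begin{codesample2}
  def pushable(patch_guards, selected):
      """Say whether a patch would be applied (illustrative sketch only).

      patch_guards: guards on the patch, e.g. ['+foo', '-bar']
      selected:     names of the currently selected guards, e.g. ['foo']
      """
      # A patch that has no guards is always applied.
      if not patch_guards:
          return True
      negative = [g[1:] for g in patch_guards if g.startswith('-')]
      positive = [g[1:] for g in patch_guards if g.startswith('+')]
      # Negative guards take precedence: any selected one skips the patch.
      if any(name in selected for name in negative):
          return False
      # With positive guards present, at least one must be selected.
      if positive:
          return any(name in selected for name in positive)
      # Assumption: only negative guards, none selected, so apply.
      return True
\end{codesample2}
With only \texttt{foo} selected, this sketch applies a patch guarded
by \texttt{+foo} and skips one guarded by \texttt{+bar}; selecting
both \texttt{foo} and \texttt{bar} skips a patch guarded by
\texttt{+foo -bar}, because the negative guard wins.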

I then choose a ``base'' version on top of which the patches are
applied. This is a snapshot of the Linux kernel tree as of a revision
of my choosing. When I take the snapshot, I record the changeset ID
from the kernel repository in the commit message. Since the snapshot
preserves the ``shape'' and content of the relevant parts of the
kernel tree, I can apply my patches on top of either my tiny
repository or a normal kernel tree.

Normally, the base tree atop which the patches apply should be a
snapshot of a very recent upstream tree. This best facilitates the
development of patches that can easily be submitted upstream with few
or no modifications.

\section{Dividing up the \sfilename{series} file}

I categorise the patches in the \sfilename{series} file into a number
of logical groups. Each section of like patches begins with a block
of comments that describes the purpose of the patches that follow.

The sequence of patch groups that I maintain follows. The ordering of
these groups is important; I'll describe why after I introduce the
groups.
\begin{itemize}
\item The ``accepted'' group. Patches that the development team has
  submitted to the maintainer of the Infiniband subsystem, and which
  he has accepted, but which are not present in the snapshot that the
  tiny repository is based on. These are ``read only'' patches,
  present only to bring the tree into a state similar to that of
  the upstream maintainer's repository.
\item The ``rework'' group.
  Patches that I have submitted, but that
  the upstream maintainer has requested modifications to before he
  will accept them.
\item The ``pending'' group. Patches that I have not yet submitted to
  the upstream maintainer, but which we have finished working on.
  These will be ``read only'' for a while. If the upstream maintainer
  accepts them upon submission, I'll move them to the end of the
  ``accepted'' group. If he requests that I modify any, I'll move
  them to the beginning of the ``rework'' group.
\item The ``in progress'' group. Patches that are actively being
  developed, and should not be submitted anywhere yet.
\item The ``backport'' group. Patches that adapt the source tree to
  older versions of the kernel tree.
\item The ``do not ship'' group. Patches that for some reason should
  never be submitted upstream. For example, one such patch might
  change embedded driver identification strings to make it easier to
  distinguish, in the field, between an out-of-tree version of the
  driver and a version shipped by a distribution vendor.
\end{itemize}

Now to return to the reasons for ordering groups of patches in this
way. We would like the lowest patches in the stack to be as stable as
possible, so that we will not need to rework higher patches due to
changes in context. Putting patches that will never be changed first
in the \sfilename{series} file serves this purpose.

We would also like the patches that we know we'll need to modify to be
applied on top of a source tree that resembles the upstream tree as
closely as possible.
This is why we keep accepted patches around for
a while.

The ``backport'' and ``do not ship'' patches float at the end of the
\sfilename{series} file. The backport patches must be applied on top
of all other patches, and the ``do not ship'' patches might as well
stay out of harm's way.

\section{Maintaining the patch series}

In my work, I use a number of guards to control which patches are to
be applied.

\begin{itemize}
\item ``Accepted'' patches are guarded with \texttt{accepted}. I
  enable this guard most of the time. When I'm applying the patches
  on top of a tree where the patches are already present, I can turn
  this guard off, and the patches that follow it will apply cleanly.
\item Patches that are ``finished'', but not yet submitted, have no
  guards. If I'm applying the patch stack to a copy of the upstream
  tree, I don't need to enable any guards in order to get a reasonably
  safe source tree.
\item Those patches that need reworking before being resubmitted are
  guarded with \texttt{rework}.
\item For those patches that are still under development, I use
  \texttt{devel}.
\item A backport patch may have several guards, one for each version
  of the kernel to which it applies. For example, a patch that
  backports a piece of code to~2.6.9 will have a~\texttt{2.6.9} guard.
\end{itemize}
This variety of guards gives me considerable flexibility in
determining what kind of source tree I want to end up with.
For most
situations, the selection of appropriate guards is automated during
the build process, but I can manually tune the guards to use for less
common circumstances.

\subsection{The art of writing backport patches}

Using MQ, writing a backport patch is a simple process. All such a
patch has to do is modify a piece of code that uses a kernel feature
not present in the older version of the kernel, so that the driver
continues to work correctly under that older version.

A useful goal when writing a good backport patch is to make your code
look as if it was written for the older version of the kernel you're
targeting. The less obtrusive the patch, the easier it will be to
understand and maintain. If you're writing a collection of backport
patches to avoid the ``rat's nest'' effect of lots of
\texttt{\#ifdef}s (hunks of source code that are only used
conditionally) in your code, don't introduce version-dependent
\texttt{\#ifdef}s into the patches. Instead, write several patches,
each of which makes unconditional changes, and control their
application using guards.

There are two reasons to divide backport patches into a distinct
group, away from the ``regular'' patches whose effects they modify.
The first is that intermingling the two makes it more difficult to use
a tool like the \hgext{patchbomb} extension to automate the process of
submitting the patches to an upstream maintainer.
The second is that
a backport patch could perturb the context in which a subsequent
regular patch is applied, making it impossible to apply the regular
patch cleanly \emph{without} the earlier backport patch already being
applied.

\section{Useful tips for developing with MQ}

\subsection{Organising patches in directories}

If you're working on a substantial project with MQ, it's not difficult
to accumulate a large number of patches. For example, I have one
patch repository that contains over 250 patches.

If you can group these patches into separate logical categories, you
can, if you like, store them in different directories; MQ has no
problems with patch names that contain path separators.

\subsection{Viewing the history of a patch}
\label{mq-collab:tips:interdiff}

If you're developing a set of patches over a long time, it's a good
idea to maintain them in a repository, as discussed in
section~\ref{sec:mq:repo}. If you do so, you'll quickly discover that
using the \hgcmd{diff} command to look at the history of changes to a
patch is unworkable. This is in part because you're looking at the
second derivative of the real code (a diff of a diff), but also
because MQ adds noise to the process by modifying time stamps and
directory names when it updates a patch.

However, you can use the \hgext{extdiff} extension, which is bundled
with Mercurial, to turn a diff of two versions of a patch into
something readable.
To do this, you will need a third-party package
called \package{patchutils}~\cite{web:patchutils}. This provides a
command named \command{interdiff}, which shows the differences between
two diffs as a diff. Used on two versions of the same diff, it
generates a diff that represents the diff from the first to the second
version.

You can enable the \hgext{extdiff} extension in the usual way, by
adding a line to the \rcsection{extensions} section of your \hgrc.
\begin{codesample2}
  [extensions]
  extdiff =
\end{codesample2}
The \command{interdiff} command expects to be passed the names of two
files, but the \hgext{extdiff} extension passes the program it runs a
pair of directories, each of which can contain an arbitrary number of
files. We thus need a small program that will run \command{interdiff}
on each pair of files in these two directories. This program is
available as \sfilename{hg-interdiff} in the \dirname{examples}
directory of the source code repository that accompanies this book.
\excode{hg-interdiff}

With the \sfilename{hg-interdiff} program in your shell's search path,
you can run it as follows, from inside an MQ patch directory:
\begin{codesample2}
  hg extdiff -p hg-interdiff -r A:B my-change.patch
\end{codesample2}
Since you'll probably want to use this long-winded command a lot, you
can get \hgext{extdiff} to make it available as a normal Mercurial
command, again by editing your \hgrc.
\begin{codesample2}
  [extdiff]
  cmd.interdiff = hg-interdiff
\end{codesample2}
This directs \hgext{extdiff} to make an \texttt{interdiff} command
available, so you can now shorten the previous invocation of
\hgxcmd{extdiff}{extdiff} to something a little more wieldy.
\begin{codesample2}
  hg interdiff -r A:B my-change.patch
\end{codesample2}

\begin{note}
  The \command{interdiff} command works well only if the underlying
  files against which versions of a patch are generated remain the
  same. If you create a patch, modify the underlying files, and then
  regenerate the patch, \command{interdiff} may not produce useful
  output.
\end{note}

The \hgext{extdiff} extension is useful for more than merely improving
the presentation of MQ~patches. To read more about it, go to
section~\ref{sec:hgext:extdiff}.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: