\chapter{Managing change with Mercurial Queues}
\label{chap:mq}

\section{The patch management problem}
\label{sec:mq:patch-mgmt}

Here is a common scenario: you need to install a software package from
source, but you find a bug that you must fix in the source before you
can start using the package. You make your changes, forget about the
package for a while, and a few months later you need to upgrade to a
newer version of the package. If the newer version of the package
still has the bug, you must extract your fix from the older source
tree and apply it against the newer version. This is a tedious task,
and it's easy to make mistakes.

This is a simple case of the ``patch management'' problem. You have
an ``upstream'' source tree that you can't change; you need to make
some local changes on top of the upstream tree; and you'd like to be
able to keep those changes separate, so that you can apply them to
newer versions of the upstream source.

The patch management problem arises in many situations. Probably the
most visible is that a user of an open source software project will
contribute a bug fix or new feature to the project's maintainers in the
form of a patch.

Distributors of operating systems that include open source software
often need to make changes to the packages they distribute so that
they will build properly in their environments.

When you have few changes to maintain, it is easy to manage a single
patch using the standard \texttt{diff} and \texttt{patch} programs.
Once the number of changes grows, it starts to make sense to maintain
patches as discrete ``chunks of work,'' so that for example a single
patch will contain only one bug fix (the patch might modify several
files, but it's doing ``only one thing''), and you may have a number
of such patches for different bugs you need fixed and local changes
you require. In this situation, if you submit a bug fix patch to the
upstream maintainers of a package and they include your fix in a
subsequent release, you can simply drop that single patch when you're
updating to the newer release.

Maintaining a single patch against an upstream tree is a little
tedious and error-prone, but not difficult. However, the complexity
of the problem grows rapidly as the number of patches you have to
maintain increases. With more than a tiny number of patches in hand,
understanding which ones you have applied and maintaining them moves
from messy to overwhelming.

Fortunately, Mercurial includes a powerful extension, Mercurial Queues
(or simply ``MQ''), that massively simplifies the patch management
problem.

\section{The prehistory of Mercurial Queues}
\label{sec:mq:history}

During the late 1990s, several Linux kernel developers started to
maintain ``patch series'' that modified the behaviour of the Linux
kernel. Some of these series were focused on stability, some on
feature coverage, and others were more speculative.

The sizes of these patch series grew rapidly. In 2002, Andrew Morton
published some shell scripts he had been using to automate the task of
managing his patch queues. Andrew was successfully using these
scripts to manage hundreds (sometimes thousands) of patches on top of
the Linux kernel.

\subsection{A patchwork quilt}
\label{sec:mq:quilt}

In early 2003, Andreas Gruenbacher and Martin Quinson borrowed the
approach of Andrew's scripts and published a tool called ``patchwork
quilt''~\cite{web:quilt}, or simply ``quilt''
(see~\cite{gruenbacher:2005} for a paper describing it). Because
quilt substantially automated patch management, it rapidly gained a
large following among open source software developers.

Quilt manages a \emph{stack of patches} on top of a directory tree.
To begin, you tell quilt to manage a directory tree; it stores away
the names and contents of all files in the tree. To fix a bug, you
create a new patch (using a single command), edit the files you need
to fix, then ``refresh'' the patch.

The refresh step causes quilt to scan the directory tree; it updates
the patch with all of the changes you have made. You can create
another patch on top of the first, which will track the changes
required to modify the tree from ``tree with one patch applied'' to
``tree with two patches applied''.

You can \emph{change} which patches are applied to the tree. If you
``pop'' a patch, the changes made by that patch will vanish from the
directory tree. Quilt remembers which patches you have popped,
though, so you can ``push'' a popped patch again, and the directory
tree will be restored to contain the modifications in the patch. Most
importantly, you can run the ``refresh'' command at any time, and the
topmost applied patch will be updated. This means that you can, at
any time, change both which patches are applied and what
modifications those patches make.

Quilt knows nothing about revision control tools, so it works equally
well on top of an unpacked tarball or a Subversion repository.
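
The stack behaviour described above can be sketched as a toy model in
Python. This is only an illustration, not how quilt is implemented:
the \texttt{PatchStack} class and its method names are hypothetical,
and to keep the sketch short each patch stores whole replacement file
contents instead of a diff.

```python
class PatchStack:
    """Toy model of a quilt-style patch stack (hypothetical, not quilt's code)."""

    def __init__(self, base):
        self.base = dict(base)   # pristine tree: filename -> contents
        self.series = []         # every known patch name, bottom to top
        self.patches = {}        # patch name -> {filename: new contents}
        self.napplied = 0        # how many patches in series are applied

    def new(self, name):
        """Create an empty patch on top of the currently applied ones."""
        self.series.insert(self.napplied, name)
        self.patches[name] = {}
        self.napplied += 1

    def refresh(self, changes):
        """Fold working-tree changes into the topmost applied patch."""
        if self.napplied == 0:
            raise RuntimeError("no patch is applied")
        self.patches[self.series[self.napplied - 1]].update(changes)

    def pop(self):
        """Unapply the topmost applied patch; it stays in the series."""
        if self.napplied == 0:
            raise RuntimeError("no patch is applied")
        self.napplied -= 1

    def push(self):
        """Reapply the next unapplied patch in the series."""
        if self.napplied == len(self.series):
            raise RuntimeError("all patches are already applied")
        self.napplied += 1

    def applied(self):
        """Names of the applied patches, bottom to top."""
        return self.series[:self.napplied]

    def tree(self):
        """The visible tree: the base plus every applied patch, in order."""
        result = dict(self.base)
        for name in self.applied():
            result.update(self.patches[name])
        return result


stack = PatchStack({"hello.c": "original"})
stack.new("fix-bug")
stack.refresh({"hello.c": "bug fixed"})
stack.new("add-feature")
stack.refresh({"feature.c": "new file"})

stack.pop()
print(stack.applied())   # ['fix-bug']
print(stack.tree())      # {'hello.c': 'bug fixed'}

stack.push()
print(stack.tree())      # {'hello.c': 'bug fixed', 'feature.c': 'new file'}
```

Popping changes only the count of applied patches; the popped patch
stays in the series, which is why a later push can restore its
changes.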

\subsection{From patchwork quilt to Mercurial Queues}
\label{sec:mq:quilt-mq}

In mid-2005, Chris Mason took the features of quilt and wrote an
extension that he called Mercurial Queues, which added quilt-like
behaviour to Mercurial.

The key difference between quilt and MQ is that quilt knows nothing
about revision control systems, while MQ is \emph{integrated} into
Mercurial. Each patch that you push is represented as a Mercurial
changeset. Pop a patch, and the changeset goes away.

This integration makes understanding patches and debugging their
effects \emph{enormously} easier. Since every applied patch has an
associated changeset, you can use \hgcmdargs{log}{\emph{filename}} to
see which changesets and patches affected a file. You can use the
\hgext{bisect} extension to binary-search through all changesets and
applied patches to see where a bug got introduced or fixed. You can
use the \hgcmd{annotate} command to see which changeset or patch
modified a particular line of a source file. And so on.

Because quilt does not care about revision control tools, it is still
a tremendously useful piece of software to know about for situations
where you cannot use Mercurial and MQ.

\section{Getting started with Mercurial Queues}
\label{sec:mq:start}

Because MQ is implemented as an extension, you must explicitly enable
it before you can use it. (You don't need to download anything; MQ
ships with the standard Mercurial distribution.) To enable MQ, edit
your \tildefile{.hgrc} file, and add the lines in
figure~\ref{ex:mq:config}.

\begin{figure}[ht]
  \begin{codesample4}
    [extensions]
    hgext.mq =
  \end{codesample4}
  \caption{Contents to add to \tildefile{.hgrc} to enable the MQ extension}
  \label{ex:mq:config}
\end{figure}

Once the extension is enabled, it will make a number of new commands
available. To verify that the extension is working, you can use
\hgcmd{help} to see if the \hgcmd{qinit} command is now available; see
the example in figure~\ref{ex:mq:enabled}.

\begin{figure}[ht]
  \interaction{mq.qinit-help.help}
  \caption{How to verify that MQ is enabled}
  \label{ex:mq:enabled}
\end{figure}

You can use MQ with \emph{any} Mercurial repository, and its commands
only operate within that repository. To get started, simply prepare
the repository using the \hgcmd{qinit} command (see
figure~\ref{ex:mq:qinit}). This command creates an empty directory
called \filename{.hg/patches}, where MQ will keep its metadata. As
with many Mercurial commands, the \hgcmd{qinit} command prints nothing
if it succeeds.

\begin{figure}[ht]
  \interaction{mq.tutorial.qinit}
  \caption{Preparing a repository for use with MQ}
  \label{ex:mq:qinit}
\end{figure}

\begin{figure}[ht]
  \interaction{mq.tutorial.qnew}
  \caption{Creating a new patch}
  \label{ex:mq:qnew}
\end{figure}

\subsection{Creating a new patch}

To begin work on a new patch, use the \hgcmd{qnew} command. This
command takes one argument, the name of the patch to create. MQ will
use this as the name of an actual file in the \filename{.hg/patches}
directory, as you can see in figure~\ref{ex:mq:qnew}.

Also newly present in the \filename{.hg/patches} directory are two
other files, \filename{series} and \filename{status}.
The \filename{series} file lists all of the patches that MQ knows about
for this repository, with one patch per line. Mercurial uses the
\filename{status} file for internal book-keeping; it tracks all of the
patches that MQ has \emph{applied} in this repository.

\begin{note}
  You may sometimes want to edit the \filename{series} file by hand;
  for example, to change the sequence in which some patches are
  applied. However, manually editing the \filename{status} file is
  almost always a bad idea, as it's easy to corrupt MQ's idea of what
  is happening.
\end{note}

Once you have created your new patch, you can edit files in the
working directory as you usually would. All of the normal Mercurial
commands, such as \hgcmd{diff} and \hgcmd{annotate}, work exactly as
they did before.

\subsection{Refreshing a patch}

When you reach a point where you want to save your work, use the
\hgcmd{qrefresh} command (figure~\ref{ex:mq:qrefresh}) to update the
patch you are working on. This command folds the changes you have
made in the working directory into your patch, and updates its
corresponding changeset to contain those changes.

\begin{figure}[ht]
  \interaction{mq.tutorial.qrefresh}
  \caption{Refreshing a patch}
  \label{ex:mq:qrefresh}
\end{figure}

You can run \hgcmd{qrefresh} as often as you like, so it's a good way
to ``checkpoint'' your work. Refresh your patch at an opportune
time; try an experiment; and if the experiment doesn't work out,
\hgcmd{revert} your modifications back to the last time you refreshed.

\begin{figure}[ht]
  \interaction{mq.tutorial.qrefresh2}
  \caption{Refresh a patch many times to accumulate changes}
  \label{ex:mq:qrefresh2}
\end{figure}

\subsection{Stacking and tracking patches}

Once you have finished working on a patch, or need to work on another,
you can use the \hgcmd{qnew} command again to create a new patch.
Mercurial will apply this patch on top of your existing patch. See
figure~\ref{ex:mq:qnew2} for an example. Notice that the patch
contains the changes in our prior patch as part of its context (you
can see this more clearly in the output of \hgcmd{annotate}).

\begin{figure}[ht]
  \interaction{mq.tutorial.qnew2}
  \caption{Stacking a second patch on top of the first}
  \label{ex:mq:qnew2}
\end{figure}

So far, with the exception of \hgcmd{qnew} and \hgcmd{qrefresh}, we've
been careful to only use regular Mercurial commands. However, there
are more ``natural'' commands you can use when thinking about patches
with MQ, as illustrated in figure~\ref{ex:mq:qseries}:

\begin{itemize}
\item The \hgcmd{qseries} command lists every patch that MQ knows
  about in this repository, from oldest to newest (most recently
  \emph{created}).
\item The \hgcmd{qapplied} command lists every patch that MQ has
  \emph{applied} in this repository, again from oldest to newest (most
  recently applied).
\end{itemize}

\begin{figure}[ht]
  \interaction{mq.tutorial.qseries}
  \caption{Understanding the patch stack with \hgcmd{qseries} and
    \hgcmd{qapplied}}
  \label{ex:mq:qseries}
\end{figure}

\subsection{Manipulating the patch stack}

The previous discussion implied that there must be a difference
between ``known'' and ``applied'' patches, and there is.
MQ can manage a patch without it being applied in the repository.

An \emph{applied} patch has a corresponding changeset in the
repository, and the effects of the patch and changeset are visible in
the working directory. You can undo the application of a patch using
the \hgcmd{qpop} command. MQ still \emph{knows about}, or manages, a
popped patch, but the patch no longer has a corresponding changeset in
the repository, and the working directory does not contain the changes
made by the patch. Figure~\ref{fig:mq:stack} illustrates the
difference between applied and tracked patches.

\begin{figure}[ht]
  \centering
  \grafix{mq-stack}
  \caption{Applied and unapplied patches in the MQ patch stack}
  \label{fig:mq:stack}
\end{figure}

You can reapply an unapplied, or popped, patch using the \hgcmd{qpush}
command. This creates a new changeset to correspond to the patch, and
the patch's changes once again become present in the working
directory. See figure~\ref{ex:mq:qpop} for examples of \hgcmd{qpop}
and \hgcmd{qpush} in action. Notice that once we have popped one or
two patches, the output of \hgcmd{qseries} remains the same, while
that of \hgcmd{qapplied} has changed.

\begin{figure}[ht]
  \interaction{mq.tutorial.qpop}
  \caption{Modifying the stack of applied patches}
  \label{ex:mq:qpop}
\end{figure}

MQ does not limit you to pushing or popping one patch. You can have
no patches, all of them, or any number in between applied at some
point in time.

\subsection{Working on several patches at once}

The \hgcmd{qrefresh} command always refreshes the \emph{topmost}
applied patch.
This means that you can suspend work on one patch (by
refreshing it), pop or push to make a different patch the top, and
work on \emph{that} patch for a while.

Here's an example that illustrates how you can use this ability.
Let's say you're developing a new feature as two patches. The first
is a change to the core of your software, and the second--layered on
top of the first--changes the user interface to use the code you just
added to the core. If you notice a bug in the core while you're
working on the UI patch, it's easy to fix the core. Simply
\hgcmd{qrefresh} the UI patch to save your in-progress changes, and
\hgcmd{qpop} down to the core patch. Fix the core bug,
\hgcmd{qrefresh} the core patch, and \hgcmd{qpush} back to the UI
patch to continue where you left off.

\section{Mercurial Queues and GNU patch}

MQ uses the GNU \command{patch} command to apply patches. It will
help you to understand the data that MQ and \command{patch} work with,
and a few aspects of how \command{patch} operates.

A patch file can start with arbitrary text; MQ uses this text as the
commit message when creating changesets. It treats the first line
that starts with the string ``\texttt{diff~-}'' as the separator
between header and content.

MQ works with \emph{unified diffs} (\command{patch} can accept several
other kinds of diff, but MQ doesn't). A unified diff contains two
kinds of header. The \emph{file header} describes the file being
modified; it contains the name of the file to modify. When
\command{patch} sees a new file header, it looks for a file of that
name to start modifying.

After the file header come a series of \emph{hunks}.
Each hunk starts
with a header; this identifies the range of line numbers within the
file that the hunk should modify. Following the header, a hunk starts
and ends with a few lines of text from the unmodified file; these are
called the \emph{context} for the hunk. Each unmodified line begins
with a space character. Within the hunk, a line that begins with
``\texttt{-}'' means ``remove this line,'' while a line that begins
with ``\texttt{+}'' means ``insert this line.'' For example, a line
that is modified is represented by one deletion and one insertion.

When \command{patch} applies a hunk, it tries a handful of
successively less accurate strategies to try to make the hunk apply.
This falling-back technique often makes it possible to take a patch
that was generated against an old version of a file, and apply it
against a newer version of that file.

First, \command{patch} tries an exact match, where the line numbers,
the context, and the text to be modified must apply exactly. If it
cannot make an exact match, it tries to find an exact match for the
context, without honouring the line numbering information. If this
succeeds, it prints a line of output saying that the hunk was applied,
but at some \emph{offset} from the original line number.

If a context-only match fails, \command{patch} removes the first and
last lines of the context, and tries a \emph{reduced} context-only
match. If the hunk with reduced context succeeds, it prints a message
saying that it applied the hunk with a \emph{fuzz factor} (the number
after the fuzz factor indicates how many lines of context
\command{patch} had to trim before the patch applied).
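
The fallback sequence just described (exact position, then any
position, then trimmed context) can be sketched in a few lines of
Python. This is a simplified illustration under stated assumptions,
not the algorithm that GNU \command{patch} actually implements: the
function name \texttt{locate} and its parameters are hypothetical, the
hunk is assumed to carry the same number of context lines at each end,
and the default limit of two trimmed lines is an assumption chosen for
the example.

```python
def locate(file_lines, old_lines, start, context=3, maxfuzz=2):
    """Find where a hunk's "old" side matches in file_lines.

    old_lines -- the hunk's context and removed lines, in order
    start     -- zero-based index at which the hunk header says it starts
    context   -- context lines at each end of old_lines (assumed symmetric)
    maxfuzz   -- most context lines to trim from each end (an assumption)

    Returns (position, offset, fuzz), or None if the hunk is rejected.
    """
    for fuzz in range(min(maxfuzz, context) + 1):
        # Trim `fuzz` context lines from each end of the hunk.
        wanted = old_lines[fuzz:len(old_lines) - fuzz]
        if not wanted:
            break
        expected = start + fuzz
        last = len(file_lines) - len(wanted)
        # Try the expected position first, then positions further away.
        for pos in sorted(range(last + 1), key=lambda p: abs(p - expected)):
            if file_lines[pos:pos + len(wanted)] == wanted:
                return pos, pos - expected, fuzz
    return None


old = ["b", "c", "d"]   # "b" and "d" are context, "c" is the line to remove

print(locate(list("abcde"), old, 1, context=1))    # (1, 0, 0): exact match
print(locate(list("xyabcde"), old, 1, context=1))  # (3, 2, 0): offset of 2
print(locate(list("aBcde"), old, 1, context=1))    # (2, 0, 1): fuzz of 1
print(locate(list("qr"), old, 1, context=1))       # None: rejected
```

In the third call the trimmed hunk is a single line, so it would match
any line containing the same text; the less context that survives
trimming, the less specific the match becomes.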

When none of these techniques works, \command{patch} prints a
message saying that the hunk in question was rejected. It saves
rejected hunks to a file with the same name, and an added
\filename{.rej} extension. If \hgcmd{qpush} fails to apply a patch,
it will print an error message and exit.

\subsection{Beware the fuzz}

While applying a hunk at an offset, or with a fuzz factor, will often
be completely successful, these inexact techniques naturally leave
open the possibility of corrupting the patched file. The most common
cases typically involve applying a patch twice, or at an incorrect
location in the file. If \command{patch} or \hgcmd{qpush} ever
mentions an offset or fuzz factor, you should make sure that the
modified files are correct afterwards.

It's often a good idea to refresh a patch that has applied with an
offset or fuzz factor; refreshing the patch generates new context
information that will make it apply cleanly. I say ``often,'' not
``always,'' because sometimes refreshing a patch will make it fail to
apply against a different revision of the underlying files. In some
cases, such as when you're maintaining a patch that must sit on top of
multiple versions of a source tree, it's acceptable to have a patch
apply with some fuzz, provided you've verified the results of the
patching process in such cases.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: