annotate en/mq.tex @ 19:187702df428b

Piles of new content for MQ chapter - cookbook stuff.

author	Bryan O'Sullivan <bos@serpentine.com>
date	Fri Jul 07 19:56:53 2006 -0700 (2006-07-07)
parents	e6f4088ebe52
children	9d5b6d303ef5

rev	line source
bos@1	1 \chapter{Managing change with Mercurial Queues}
bos@1	2 \label{chap:mq}
bos@1	3
bos@1	4 \section{The patch management problem}
bos@1	5 \label{sec:mq:patch-mgmt}
bos@1	6
bos@1	7 Here is a common scenario: you need to install a software package from
bos@1	8 source, but you find a bug that you must fix in the source before you
bos@1	9 can start using the package. You make your changes, forget about the
bos@1	10 package for a while, and a few months later you need to upgrade to a
bos@1	11 newer version of the package. If the newer version of the package
bos@1	12 still has the bug, you must extract your fix from the older source
bos@1	13 tree and apply it against the newer version. This is a tedious task,
bos@1	14 and it's easy to make mistakes.
bos@1	15
bos@1	16 This is a simple case of the ``patch management'' problem. You have
bos@1	17 an ``upstream'' source tree that you can't change; you need to make
bos@1	18 some local changes on top of the upstream tree; and you'd like to be
bos@1	19 able to keep those changes separate, so that you can apply them to
bos@1	20 newer versions of the upstream source.
bos@1	21
bos@1	22 The patch management problem arises in many situations. Probably the
bos@1	23 most visible is that a user of an open source software project will
bos@3	24 contribute a bug fix or new feature to the project's maintainers in the
bos@1	25 form of a patch.
bos@1	26
bos@1	27 Distributors of operating systems that include open source software
bos@1	28 often need to make changes to the packages they distribute so that
bos@1	29 they will build properly in their environments.
bos@1	30
bos@1	31 When you have few changes to maintain, it is easy to manage a single
bos@15	32 patch using the standard \texttt{diff} and \texttt{patch} programs
bos@15	33 (see section~\ref{sec:mq:patch} for a discussion of these tools).
bos@1	34 Once the number of changes grows, it starts to make sense to maintain
bos@1	35 patches as discrete ``chunks of work,'' so that for example a single
bos@1	36 patch will contain only one bug fix (the patch might modify several
bos@1	37 files, but it's doing ``only one thing''), and you may have a number
bos@1	38 of such patches for different bugs you need fixed and local changes
bos@3	39 you require. In this situation, if you submit a bug fix patch to the
bos@1	40 upstream maintainers of a package and they include your fix in a
bos@1	41 subsequent release, you can simply drop that single patch when you're
bos@1	42 updating to the newer release.
bos@1	43
bos@1	44 Maintaining a single patch against an upstream tree is a little
bos@1	45 tedious and error-prone, but not difficult. However, the complexity
bos@1	46 of the problem grows rapidly as the number of patches you have to
bos@1	47 maintain increases. With more than a tiny number of patches in hand,
bos@1	48 understanding which ones you have applied and maintaining them moves
bos@1	49 from messy to overwhelming.
bos@1	50
bos@1	51 Fortunately, Mercurial includes a powerful extension, Mercurial Queues
bos@1	52 (or simply ``MQ''), that massively simplifies the patch management
bos@1	53 problem.
bos@1	54
bos@1	55 \section{The prehistory of Mercurial Queues}
bos@1	56 \label{sec:mq:history}
bos@1	57
bos@1	58 During the late 1990s, several Linux kernel developers started to
bos@1	59 maintain ``patch series'' that modified the behaviour of the Linux
bos@1	60 kernel. Some of these series were focused on stability, some on
bos@1	61 feature coverage, and others were more speculative.
bos@1	62
bos@1	63 The sizes of these patch series grew rapidly. In 2002, Andrew Morton
bos@1	64 published some shell scripts he had been using to automate the task of
bos@1	65 managing his patch queues. Andrew was successfully using these
bos@1	66 scripts to manage hundreds (sometimes thousands) of patches on top of
bos@1	67 the Linux kernel.
bos@1	68
bos@1	69 \subsection{A patchwork quilt}
bos@1	70 \label{sec:mq:quilt}
bos@1	71
bos@1	72
bos@1	73 In early 2003, Andreas Gruenbacher and Martin Quinson borrowed the
bos@2	74 approach of Andrew's scripts and published a tool called ``patchwork
bos@2	75 quilt''~\cite{web:quilt}, or simply ``quilt''
bos@2	76 (see~\cite{gruenbacher:2005} for a paper describing it). Because
bos@2	77 quilt substantially automated patch management, it rapidly gained a
bos@2	78 large following among open source software developers.
bos@1	79
bos@1	80 Quilt manages a \emph{stack of patches} on top of a directory tree.
bos@1	81 To begin, you tell quilt to manage a directory tree; it stores away
bos@1	82 the names and contents of all files in the tree. To fix a bug, you
bos@1	83 create a new patch (using a single command), edit the files you need
bos@1	84 to fix, then ``refresh'' the patch.
bos@1	85
bos@1	86 The refresh step causes quilt to scan the directory tree; it updates
bos@1	87 the patch with all of the changes you have made. You can create
bos@1	88 another patch on top of the first, which will track the changes
bos@1	89 required to modify the tree from ``tree with one patch applied'' to
bos@1	90 ``tree with two patches applied''.
bos@1	91
bos@1	92 You can \emph{change} which patches are applied to the tree. If you
bos@1	93 ``pop'' a patch, the changes made by that patch will vanish from the
bos@1	94 directory tree. Quilt remembers which patches you have popped,
bos@1	95 though, so you can ``push'' a popped patch again, and the directory
bos@1	96 tree will be restored to contain the modifications in the patch. Most
bos@1	97 importantly, you can run the ``refresh'' command at any time, and the
bos@1	98 topmost applied patch will be updated. This means that you can, at
bos@1	99 any time, change both which patches are applied and what
bos@1	100 modifications those patches make.
bos@1	101
bos@1	102 Quilt knows nothing about revision control tools, so it works equally
bos@3	103 well on top of an unpacked tarball or a Subversion repository.
bos@1	104
bos@1	105 \subsection{From patchwork quilt to Mercurial Queues}
bos@1	106 \label{sec:mq:quilt-mq}
bos@1	107
bos@1	108 In mid-2005, Chris Mason took the features of quilt and wrote an
bos@1	109 extension that he called Mercurial Queues, which added quilt-like
bos@1	110 behaviour to Mercurial.
bos@1	111
bos@1	112 The key difference between quilt and MQ is that quilt knows nothing
bos@1	113 about revision control systems, while MQ is \emph{integrated} into
bos@1	114 Mercurial. Each patch that you push is represented as a Mercurial
bos@1	115 changeset. Pop a patch, and the changeset goes away.
bos@1	116
bos@1	117 This integration makes understanding patches and debugging their
bos@1	118 effects \emph{enormously} easier. Since every applied patch has an
bos@1	119 associated changeset, you can use \hgcmdargs{log}{\emph{filename}} to
bos@1	120 see which changesets and patches affected a file. You can use the
bos@1	121 \hgext{bisect} extension to binary-search through all changesets and
bos@1	122 applied patches to see where a bug got introduced or fixed. You can
bos@1	123 use the \hgcmd{annotate} command to see which changeset or patch
bos@1	124 modified a particular line of a source file. And so on.
bos@1	125
bos@1	126 Because quilt does not care about revision control tools, it is still
bos@1	127 a tremendously useful piece of software to know about for situations
bos@1	128 where you cannot use Mercurial and MQ.
bos@19	129
bos@19	130 \section{Understanding patches}\label{sec:mq:patch}
bos@19	131
bos@19	132 Because MQ doesn't hide its patch-oriented nature, it is helpful to
bos@19	133 understand what patches are, and a little about the tools that work
bos@19	134 with them.
bos@19	135
bos@19	136 The traditional Unix \command{diff} command compares two files, and
bos@19	137 prints a list of differences between them. The \command{patch} command
bos@19	138 understands these differences as \emph{modifications} to make to a
bos@19	139 file. Take a look at figure~\ref{ex:mq:diff} for a simple example of
bos@19	140 these commands in action.
bos@19	141
bos@19	142 \begin{figure}[ht]
bos@19	143 \interaction{mq.diff.diff}
bos@19	144 \caption{Simple uses of the \command{diff} and \command{patch} commands}
bos@19	145 \label{ex:mq:diff}
bos@19	146 \end{figure}
bos@19	147
bos@19	148 The type of file that \command{diff} generates (and \command{patch}
bos@19	149 takes as input) is called a ``patch'' or a ``diff''; there is no
bos@19	150 difference between a patch and a diff. (We'll use the term ``patch'',
bos@19	151 since it's more commonly used.)
bos@19	152
bos@19	153 A patch file can start with arbitrary text; the \command{patch}
bos@19	154 command ignores this text, but MQ uses it as the commit message when
bos@19	155 creating changesets. To find the beginning of the patch content,
bos@19	156 \command{patch} searches for the first line that starts with the
bos@19	157 string ``\texttt{diff~-}''.
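As a rough illustration of this split, here is a minimal Python sketch (not MQ's actual code) that separates the leading text from the patch content at the first line starting with ``\texttt{diff~-}''. The function name \texttt{split\_patch}, the file name \texttt{count.c}, and the revision identifiers in the sample patch are all made up for the example.

```python
def split_patch(text):
    """Split patch text into (message, diff).

    The diff begins at the first line that starts with "diff -";
    everything before it is treated as free-form leading text.
    """
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("diff -"):
            return "".join(lines[:i]).strip(), "".join(lines[i:])
    # No "diff -" line found: treat the whole text as leading text.
    return text.strip(), ""


# Hypothetical sample patch; the revision identifiers are placeholders.
example = """\
Fix an off-by-one error in the loop bound.

diff -r 000000000000 -r 111111111111 count.c
--- a/count.c
+++ b/count.c
@@ -1,3 +1,3 @@
 int i;
-for (i = 0; i <= n; i++)
+for (i = 0; i < n; i++)
     total += i;
"""

message, diff = split_patch(example)
print("message:", message)
print("first diff line:", diff.splitlines()[0])
```

In this sketch the leading text would serve as the commit message, and only the second part would be handed to \command{patch}.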
bos@19	158
bos@19	159 MQ works with \emph{unified} diffs (\command{patch} can accept several
bos@19	160 other diff formats, but MQ doesn't). A unified diff contains two
bos@19	161 kinds of header. The \emph{file header} describes the file being
bos@19	162 modified; it contains the name of the file to modify. When
bos@19	163 \command{patch} sees a new file header, it looks for a file with that
bos@19	164 name to start modifying.
bos@19	165
bos@19	166 After the file header comes a series of \emph{hunks}. Each hunk
bos@19	167 starts with a header; this identifies the range of line numbers within
bos@19	168 the file that the hunk should modify. Following the header, a hunk
bos@19	169 starts and ends with a few (usually three) lines of text from the
bos@19	170 unmodified file; these are called the \emph{context} for the hunk. If
bos@19	171 there's only a small amount of context between successive hunks,
bos@19	172 \command{diff} doesn't print a new hunk header; it just runs the hunks
bos@19	173 together, with a few lines of context between modifications.
bos@19	174
bos@19	175 Each line of context begins with a space character. Within the hunk,
bos@19	176 a line that begins with ``\texttt{-}'' means ``remove this line,''
bos@19	177 while a line that begins with ``\texttt{+}'' means ``insert this
bos@19	178 line.'' For example, a line that is modified is represented by one
bos@19	179 deletion and one insertion.
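To make these pieces concrete, the following Python sketch generates a small unified diff with the standard \texttt{difflib} module and labels each line by its role. The file names and contents are invented for the example, and the \texttt{classify} helper is a deliberately naive illustration, not a real patch parser.

```python
import difflib

# Invented file contents: the third line is modified.
old = ["one\n", "two\n", "three\n", "four\n", "five\n"]
new = ["one\n", "two\n", "3\n", "four\n", "five\n"]

patch = list(difflib.unified_diff(
    old, new, fromfile="a/numbers.txt", tofile="b/numbers.txt"))


def classify(line):
    """Name the role of one line of a unified diff.

    Naive on purpose: a removed line whose text begins with "-- "
    would be mistaken for a file header. A real parser tracks the
    line counts given in each hunk header instead.
    """
    if line.startswith("--- ") or line.startswith("+++ "):
        return "file header"
    if line.startswith("@@"):
        return "hunk header"
    if line.startswith("-"):
        return "remove"
    if line.startswith("+"):
        return "insert"
    return "context"


for line in patch:
    print(f"{classify(line):12} {line}", end="")
```

The modified line shows up as one ``remove'' line followed by one ``insert'' line, surrounded by context lines.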
bos@19	180
bos@19	181 We will return to some of the more subtle aspects of patches later (in
bos@19	182 section~\ref{sec:mq:adv-patch}), but you should have enough information
bos@19	183 now to use MQ.
bos@19	184
bos@2	185 \section{Getting started with Mercurial Queues}
bos@2	186 \label{sec:mq:start}
bos@1	187
bos@3	188 Because MQ is implemented as an extension, you must explicitly enable
bos@3	189 it before you can use it. (You don't need to download anything; MQ ships
bos@3	190 with the standard Mercurial distribution.) To enable MQ, edit your
bos@4	191 \tildefile{.hgrc} file, and add the lines in figure~\ref{ex:mq:config}.
bos@2	192
bos@12	193 \begin{figure}[ht]
bos@4	194 \begin{codesample4}
bos@4	195 [extensions]
bos@4	196 hgext.mq =
bos@4	197 \end{codesample4}
bos@4	198 \caption{Contents to add to \tildefile{.hgrc} to enable the MQ extension}
bos@4	199 \label{ex:mq:config}
bos@4	200 \end{figure}
bos@3	201
bos@3	202 Once the extension is enabled, it will make a number of new commands
bos@7	203 available. To verify that the extension is working, you can use
bos@7	204 \hgcmd{help} to see if the \hgcmd{qinit} command is now available; see
bos@7	205 the example in figure~\ref{ex:mq:enabled}.
bos@3	206
bos@12	207 \begin{figure}[ht]
bos@4	208 \interaction{mq.qinit-help.help}
bos@4	209 \caption{How to verify that MQ is enabled}
bos@4	210 \label{ex:mq:enabled}
bos@4	211 \end{figure}
bos@1	212
bos@8	213 You can use MQ with \emph{any} Mercurial repository, and its commands
bos@8	214 only operate within that repository. To get started, simply prepare
bos@8	215 the repository using the \hgcmd{qinit} command (see
bos@7	216 figure~\ref{ex:mq:qinit}). This command creates an empty directory
bos@16	217 called \sdirname{.hg/patches}, where MQ will keep its metadata. As
bos@7	218 with many Mercurial commands, the \hgcmd{qinit} command prints nothing
bos@7	219 if it succeeds.
bos@7	220
bos@12	221 \begin{figure}[ht]
bos@7	222 \interaction{mq.tutorial.qinit}
bos@7	223 \caption{Preparing a repository for use with MQ}
bos@7	224 \label{ex:mq:qinit}
bos@7	225 \end{figure}
bos@7	226
bos@12	227 \begin{figure}[ht]
bos@7	228 \interaction{mq.tutorial.qnew}
bos@7	229 \caption{Creating a new patch}
bos@7	230 \label{ex:mq:qnew}
bos@7	231 \end{figure}
bos@7	232
bos@8	233 \subsection{Creating a new patch}
bos@8	234
bos@8	235 To begin work on a new patch, use the \hgcmd{qnew} command. This
bos@7	236 command takes one argument, the name of the patch to create. MQ will
bos@16	237 use this as the name of an actual file in the \sdirname{.hg/patches}
bos@7	238 directory, as you can see in figure~\ref{ex:mq:qnew}.
bos@7	239
bos@16	240 Also newly present in the \sdirname{.hg/patches} directory are two
bos@16	241 other files, \sfilename{series} and \sfilename{status}. The
bos@16	242 \sfilename{series} file lists all of the patches that MQ knows about
bos@8	243 for this repository, with one patch per line. MQ uses the
bos@16	244 \sfilename{status} file for internal book-keeping; it tracks all of the
bos@7	245 patches that MQ has \emph{applied} in this repository.
bos@7	246
bos@7	247 \begin{note}
bos@16	248 You may sometimes want to edit the \sfilename{series} file by hand;
bos@7	249 for example, to change the sequence in which some patches are
bos@16	250 applied. However, manually editing the \sfilename{status} file is
bos@7	251 almost always a bad idea, as it's easy to corrupt MQ's idea of what
bos@7	252 is happening.
bos@7	253 \end{note}
bos@7	254
bos@8	255 Once you have created your new patch, you can edit files in the
bos@8	256 working directory as you usually would. All of the normal Mercurial
bos@8	257 commands, such as \hgcmd{diff} and \hgcmd{annotate}, work exactly as
bos@8	258 they did before.
bos@19	259
bos@8	260 \subsection{Refreshing a patch}
bos@8	261
bos@8	262 When you reach a point where you want to save your work, use the
bos@8	263 \hgcmd{qrefresh} command (figure~\ref{ex:mq:qrefresh}) to update the patch
bos@8	264 you are working on. This command folds the changes you have made in
bos@8	265 the working directory into your patch, and updates its corresponding
bos@8	266 changeset to contain those changes.
bos@8	267
bos@12	268 \begin{figure}[ht]
bos@8	269 \interaction{mq.tutorial.qrefresh}
bos@8	270 \caption{Refreshing a patch}
bos@8	271 \label{ex:mq:qrefresh}
bos@8	272 \end{figure}
bos@8	273
bos@8	274 You can run \hgcmd{qrefresh} as often as you like, so it's a good way
bos@13	275 to ``checkpoint'' your work. Refresh your patch at an opportune
bos@8	276 time; try an experiment; and if the experiment doesn't work out,
bos@8	277 \hgcmd{revert} your modifications back to the last time you refreshed.
bos@8	278
bos@12	279 \begin{figure}[ht]
bos@8	280 \interaction{mq.tutorial.qrefresh2}
bos@8	281 \caption{Refresh a patch many times to accumulate changes}
bos@8	282 \label{ex:mq:qrefresh2}
bos@8	283 \end{figure}
bos@8	284
bos@8	285 \subsection{Stacking and tracking patches}
bos@8	286
bos@8	287 Once you have finished working on a patch, or need to work on another,
bos@8	288 you can use the \hgcmd{qnew} command again to create a new patch.
bos@8	289 Mercurial will apply this patch on top of your existing patch. See
bos@8	290 figure~\ref{ex:mq:qnew2} for an example. Notice that the patch
bos@8	291 contains the changes in our prior patch as part of its context (you
bos@8	292 can see this more clearly in the output of \hgcmd{annotate}).
bos@8	293
bos@12	294 \begin{figure}[ht]
bos@8	295 \interaction{mq.tutorial.qnew2}
bos@8	296 \caption{Stacking a second patch on top of the first}
bos@8	297 \label{ex:mq:qnew2}
bos@8	298 \end{figure}
bos@8	299
bos@8	300 So far, with the exception of \hgcmd{qnew} and \hgcmd{qrefresh}, we've
bos@8	301 been careful to only use regular Mercurial commands. However, there
bos@8	302 are more ``natural'' commands you can use when thinking about patches
bos@8	303 with MQ, as illustrated in figure~\ref{ex:mq:qseries}:
bos@8	304
bos@8	305 \begin{itemize}
bos@8	306 \item The \hgcmd{qseries} command lists every patch that MQ knows
bos@8	307 about in this repository, from oldest to newest (most recently
bos@8	308 \emph{created}).
bos@8	309 \item The \hgcmd{qapplied} command lists every patch that MQ has
bos@8	310 \emph{applied} in this repository, again from oldest to newest (most
bos@8	311 recently applied).
bos@8	312 \end{itemize}
bos@8	313
bos@12	314 \begin{figure}[ht]
bos@8	315 \interaction{mq.tutorial.qseries}
bos@8	316 \caption{Understanding the patch stack with \hgcmd{qseries} and
bos@8	317 \hgcmd{qapplied}}
bos@8	318 \label{ex:mq:qseries}
bos@8	319 \end{figure}
bos@8	320
bos@8	321 \subsection{Manipulating the patch stack}
bos@8	322
bos@8	323 The previous discussion implied that there must be a difference
bos@11	324 between ``known'' and ``applied'' patches, and there is. MQ can
bos@11	325 manage a patch without it being applied in the repository.
bos@8	326
bos@8	327 An \emph{applied} patch has a corresponding changeset in the
bos@8	328 repository, and the effects of the patch and changeset are visible in
bos@8	329 the working directory. You can undo the application of a patch using
bos@12	330 the \hgcmd{qpop} command. MQ still \emph{knows about}, or manages, a
bos@12	331 popped patch, but the patch no longer has a corresponding changeset in
bos@12	332 the repository, and the working directory does not contain the changes
bos@12	333 made by the patch. Figure~\ref{fig:mq:stack} illustrates the
bos@12	334 difference between applied and tracked patches.
bos@12	335
bos@12	336 \begin{figure}[ht]
bos@12	337 \centering
bos@12	338 \grafix{mq-stack}
bos@12	339 \caption{Applied and unapplied patches in the MQ patch stack}
bos@12	340 \label{fig:mq:stack}
bos@8	341 \end{figure}
bos@8	342
bos@8	343 You can reapply an unapplied, or popped, patch using the \hgcmd{qpush}
bos@8	344 command. This creates a new changeset to correspond to the patch, and
bos@8	345 the patch's changes once again become present in the working
bos@8	346 directory. See figure~\ref{ex:mq:qpop} for examples of \hgcmd{qpop}
bos@8	347 and \hgcmd{qpush} in action. Notice that once we have popped one or
bos@8	348 two patches, the output of \hgcmd{qseries} remains the same, while
bos@8	349 that of \hgcmd{qapplied} has changed.
bos@8	350
bos@12	351 \begin{figure}[ht]
bos@12	352 \interaction{mq.tutorial.qpop}
bos@12	353 \caption{Modifying the stack of applied patches}
bos@12	354 \label{ex:mq:qpop}
bos@11	355 \end{figure}
bos@11	356
bos@8	357 MQ does not limit you to pushing or popping one patch at a time. You
bos@8	358 can have no patches applied, all of them, or any number in between at
bos@8	359 any given time.
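The distinction between known and applied patches can be modelled in a few lines of Python. This is only a toy model of the bookkeeping described in this section, not how MQ is implemented: the class name \texttt{PatchQueue}, its attributes, and the patch names are hypothetical, and the rule that a new patch lands directly above the applied ones is an assumption of the sketch.

```python
class PatchQueue:
    """Toy model of a patch stack: known patches, some of them applied."""

    def __init__(self):
        self.series = []        # every known patch, in series order
        self.applied_count = 0  # the first applied_count patches are applied

    def qnew(self, name):
        # Assumption of this sketch: a new patch goes directly on top
        # of the applied patches and is itself applied.
        self.series.insert(self.applied_count, name)
        self.applied_count += 1

    def qpop(self):
        if self.applied_count == 0:
            raise RuntimeError("no patches applied")
        self.applied_count -= 1  # the patch stays known, but is unapplied

    def qpush(self):
        if self.applied_count == len(self.series):
            raise RuntimeError("all patches are already applied")
        self.applied_count += 1

    def qseries(self):
        return list(self.series)

    def qapplied(self):
        return self.series[:self.applied_count]


queue = PatchQueue()
queue.qnew("first.patch")
queue.qnew("second.patch")
queue.qpop()

print("known:  ", queue.qseries())
print("applied:", queue.qapplied())
```

After the pop, the list of known patches is unchanged while the list of applied patches has shrunk by one; pushing again restores it.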
bos@8	360
bos@13	361 \subsection{Working on several patches at once}
bos@13	362
bos@13	363 The \hgcmd{qrefresh} command always refreshes the \emph{topmost}
bos@13	364 applied patch. This means that you can suspend work on one patch (by
bos@13	365 refreshing it), pop or push to make a different patch the top, and
bos@13	366 work on \emph{that} patch for a while.
bos@13	367
bos@13	368 Here's an example that illustrates how you can use this ability.
bos@13	369 Let's say you're developing a new feature as two patches. The first
bos@18	370 is a change to the core of your software, and the second---layered on
bos@18	371 top of the first---changes the user interface to use the code you just
bos@13	372 added to the core. If you notice a bug in the core while you're
bos@13	373 working on the UI patch, it's easy to fix the core. Simply
bos@13	374 \hgcmd{qrefresh} the UI patch to save your in-progress changes, and
bos@13	375 \hgcmd{qpop} down to the core patch. Fix the core bug,
bos@13	376 \hgcmd{qrefresh} the core patch, and \hgcmd{qpush} back to the UI
bos@13	377 patch to continue where you left off.
bos@13	378
bos@19	379 \section{More about patches}
bos@19	380 \label{sec:mq:adv-patch}
bos@19	381
bos@19	382 MQ uses the GNU \command{patch} command to apply patches, so it's
bos@19	383 helpful to know about a few more detailed aspects of how
bos@19	384 \command{patch} works.
bos@14	385
bos@14	386 When \command{patch} applies a hunk, it tries a handful of
bos@14	387 successively less accurate strategies to make the hunk apply.
bos@14	388 This falling-back technique often makes it possible to take a patch
bos@14	389 that was generated against an old version of a file, and apply it
bos@14	390 against a newer version of that file.
bos@14	391
bos@14	392 First, \command{patch} tries an exact match, where the line numbers,
bos@14	393 the context, and the text to be modified must apply exactly. If it
bos@14	394 cannot make an exact match, it tries to find an exact match for the
bos@14	395 context, without honouring the line numbering information. If this
bos@14	396 succeeds, it prints a line of output saying that the hunk was applied,
bos@14	397 but at some \emph{offset} from the original line number.
bos@14	398
bos@14	399 If a context-only match fails, \command{patch} removes the first and
bos@14	400 last lines of the context, and tries a \emph{reduced} context-only
bos@14	401 match. If the hunk with reduced context succeeds, it prints a message
bos@14	402 saying that it applied the hunk with a \emph{fuzz factor} (the number
bos@14	403 after the fuzz factor indicates how many lines of context
bos@14	404 \command{patch} had to trim before the patch applied).
bos@14	405
bos@14	406 When none of these techniques works, \command{patch} prints a
bos@14	407 message saying that the hunk in question was rejected. It saves
bos@17	408 rejected hunks (also simply called ``rejects'') to a file with the
bos@17	409 same name, and an added \sfilename{.rej} extension. It also saves an
bos@17	410 unmodified copy of the file with a \sfilename{.orig} extension; the
bos@17	411 copy of the file without any extensions will contain any changes made
bos@17	412 by hunks that \emph{did} apply cleanly. If you have a patch that
bos@17	413 modifies \filename{foo} with six hunks, and one of them fails to
bos@17	414 apply, you will have: an unmodified \filename{foo.orig}, a
bos@17	415 \filename{foo.rej} containing one hunk, and \filename{foo}, containing
bos@17	416 the changes made by the five successful hunks.
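The fallback strategies described above can be sketched in Python. This is a simplified model written for illustration, not the algorithm that \command{patch} actually uses: a hunk is represented as a list of \texttt{(tag, text)} pairs, all function names are hypothetical, and the maximum fuzz of 2 is an assumption of the sketch.

```python
def old_side(hunk):
    """Lines the hunk expects to find: context plus removals."""
    return [text for tag, text in hunk if tag in " -"]


def new_side(hunk):
    """Lines the hunk leaves behind: context plus insertions."""
    return [text for tag, text in hunk if tag in " +"]


def trim_context(hunk, fuzz):
    """Drop up to `fuzz` context lines from each end of the hunk.

    Returns the trimmed hunk and the number of leading lines dropped.
    """
    lead = 0
    while lead < fuzz and lead < len(hunk) and hunk[lead][0] == " ":
        lead += 1
    trail = 0
    while (trail < fuzz and trail < len(hunk) - lead
           and hunk[-1 - trail][0] == " "):
        trail += 1
    return hunk[lead:len(hunk) - trail], lead


def find_match(lines, wanted, expected):
    """Search outwards from `expected` for an exact match of `wanted`."""
    last = len(lines) - len(wanted)
    for distance in range(len(lines) + 1):
        for pos in (expected - distance, expected + distance):
            if 0 <= pos <= last and lines[pos:pos + len(wanted)] == wanted:
                return pos
    return None


def apply_hunk(lines, hunk, start, max_fuzz=2):
    """Apply one hunk; return (new_lines, offset, fuzz), or None if rejected.

    `start` is the zero-based line where the hunk header says it begins.
    """
    for fuzz in range(max_fuzz + 1):
        trimmed, lead = trim_context(hunk, fuzz)
        wanted = old_side(trimmed)
        if not wanted:
            return None  # nothing left to anchor the hunk on
        pos = find_match(lines, wanted, start + lead)
        if pos is not None:
            patched = lines[:pos] + new_side(trimmed) + lines[pos + len(wanted):]
            return patched, pos - (start + lead), fuzz
    return None


original = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta"]
hunk = [(" ", "beta"), (" ", "gamma"), ("-", "delta"),
        ("+", "DELTA"), (" ", "epsilon"), (" ", "zeta")]

exact = apply_hunk(original, hunk, start=1)
shifted = apply_hunk(["new1", "new2"] + original, hunk, start=1)
edited = original.copy()
edited[1] = "BETA"          # damage one line of context
fuzzy = apply_hunk(edited, hunk, start=1)
broken = original.copy()
broken[3] = "changed"       # the line to be removed is gone
rejected = apply_hunk(broken, hunk, start=1)

for label, result in [("exact", exact), ("shifted", shifted),
                      ("fuzzy", fuzzy), ("broken", rejected)]:
    if result is None:
        print(label, "-> rejected")
    else:
        print(label, "-> offset", result[1], "fuzz", result[2])
```

The four sample runs correspond to the four outcomes in the text: a clean application, an application at an offset, an application with fuzz, and a rejected hunk.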
bos@14	417
bos@14	418 \subsection{Beware the fuzz}
bos@14	419
bos@14	420 While applying a hunk at an offset, or with a fuzz factor, will often
bos@14	421 be completely successful, these inexact techniques naturally leave
bos@14	422 open the possibility of corrupting the patched file. The most common
bos@14	423 cases typically involve applying a patch twice, or at an incorrect
bos@14	424 location in the file. If \command{patch} or \hgcmd{qpush} ever
bos@14	425 mentions an offset or fuzz factor, you should make sure that the
bos@14	426 modified files are correct afterwards.
bos@14	427
bos@14	428 It's often a good idea to refresh a patch that has applied with an
bos@14	429 offset or fuzz factor; refreshing the patch generates new context
bos@14	430 information that will make it apply cleanly. I say ``often,'' not
bos@14	431 ``always,'' because sometimes refreshing a patch will make it fail to
bos@14	432 apply against a different revision of the underlying files. In some
bos@14	433 cases, such as when you're maintaining a patch that must sit on top of
bos@14	434 multiple versions of a source tree, it's acceptable to have a patch
bos@14	435 apply with some fuzz, provided you've verified the results of the
bos@14	436 patching process in such cases.
bos@14	437
bos@15	438 \subsection{Handling rejection}
bos@15	439
bos@15	440 If \hgcmd{qpush} fails to apply a patch, it will print an error
bos@16	441 message and exit. If it has left \sfilename{.rej} files behind, it is
bos@15	442 usually best to fix up the rejected hunks before you push more patches
bos@15	443 or do any further work.
bos@15	444
bos@15	445 If your patch \emph{used to} apply cleanly, and no longer does because
bos@15	446 you've changed the underlying code that your patches are based on,
bos@17	447 Mercurial Queues can help; see section~\ref{sec:mq:merge} for details.
bos@15	448
bos@15	449 Unfortunately, there aren't any great techniques for dealing with
bos@16	450 rejected hunks. Most often, you'll need to view the \sfilename{.rej}
bos@15	451 file and edit the target file, applying the rejected hunks by hand.
bos@15	452
bos@16	453 If you're feeling adventurous, Neil Brown, a Linux kernel hacker,
bos@16	454 wrote a tool called \command{wiggle}~\cite{web:wiggle}, which is more
bos@16	455 vigorous than \command{patch} in its attempts to make a patch apply.
bos@15	456
bos@15	457 Another Linux kernel hacker, Chris Mason (the author of Mercurial
bos@15	458 Queues), wrote a similar tool called \command{rej}~\cite{web:rej},
bos@15	459 which takes a simple approach to automating the application of hunks
bos@15	460 rejected by \command{patch}. \command{rej} can help with four common
bos@15	461 reasons that a hunk may be rejected:
bos@15	462
bos@15	463 \begin{itemize}
bos@15	464 \item The context in the middle of a hunk has changed.
bos@15	465 \item A hunk is missing some context at the beginning or end.
bos@18	466 \item A large hunk might apply better---either entirely or in
bos@18	467 part---if it was broken up into smaller hunks.
bos@15	468 \item A hunk removes lines with slightly different content than those
bos@15	469 currently present in the file.
bos@15	470 \end{itemize}
bos@15	471
bos@15	472 If you use \command{wiggle} or \command{rej}, you should be doubly
bos@15	473 careful to check your results when you're done.
bos@15	474
bos@17	475 \section{Getting the best performance out of MQ}
bos@17	476
bos@17	477 MQ is very efficient at handling a large number of patches. I ran
bos@17	478 some performance experiments in mid-2006 for a talk that I gave at the
bos@17	479 2006 EuroPython conference~\cite{web:europython}. I used as my data
bos@17	480 set the Linux 2.6.17-mm1 patch series, which consists of 1,738
bos@17	481 patches. I applied these on top of a Linux kernel repository
bos@17	482 containing all 27,472 revisions between Linux 2.6.12-rc2 and Linux
bos@17	483 2.6.17.
bos@17	484
bos@17	485 On my old, slow laptop, I was able to
bos@17	486 \hgcmdargs{qpush}{\hgopt{qpush}{-a}} all 1,738 patches in 3.5 minutes,
bos@17	487 and \hgcmdargs{qpop}{\hgopt{qpop}{-a}} them all in 30 seconds. I
bos@17	488 could \hgcmd{qrefresh} one of the biggest patches (which made 22,779
bos@17	489 lines of changes to 287 files) in 6.6 seconds.
bos@17	490
bos@17	491 Clearly, MQ is well suited to working in large trees, but there are a
bos@17	492 few tricks you can use to get the best performance out of it.
bos@17	493
bos@17	494 First of all, try to ``batch'' operations together. Every time you
bos@17	495 run \hgcmd{qpush} or \hgcmd{qpop}, these commands scan the working
bos@17	496 directory once to make sure you haven't made some changes and then
bos@17	497 forgotten to run \hgcmd{qrefresh}. On a small tree, the time that
bos@17	498 this scan takes is unnoticeable. However, on a medium-sized tree
bos@17	499 (containing tens of thousands of files), it can take a second or more.
bos@17	500
bos@17	501 The \hgcmd{qpush} and \hgcmd{qpop} commands allow you to push and pop
bos@17	502 multiple patches at a time. You can identify the ``destination
bos@17	503 patch'' that you want to end up at. When you \hgcmd{qpush} with a
bos@17	504 destination specified, it will push patches until that patch is at the
bos@17	505 top of the applied stack. When you \hgcmd{qpop} to a destination, MQ
bos@17	506 will pop patches until the destination patch \emph{is no longer}
bos@17	507 applied.
bos@17	508
bos@17	509 You can identify a destination patch using either the name of the
bos@17	510 patch, or by number. If you use numeric addressing, patches are
bos@17	511 counted from zero; this means that the first patch is zero, the second
bos@17	512 is one, and so on.
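The addressing rules for a push can be sketched as follows. This is an illustrative Python model rather than MQ's own lookup code: the function names and patch names are hypothetical, and letting a matching patch name take precedence over a number is an assumption of the sketch.

```python
def resolve_destination(series, dest):
    """Return the index in `series` of a patch given by name or number."""
    # Assumption of this sketch: an exact name match wins over a number.
    if dest in series:
        return series.index(dest)
    try:
        index = int(dest)
    except ValueError:
        raise LookupError(f"unknown patch: {dest!r}")
    if not 0 <= index < len(series):
        raise LookupError(f"patch number out of range: {index}")
    return index  # numbering starts at zero


def patches_to_push(series, applied_count, dest):
    """List the patches a push applies so that `dest` ends up on top."""
    target = resolve_destination(series, dest)
    return series[applied_count:target + 1]


series = ["core.patch", "ui.patch", "docs.patch"]

# Nothing applied yet; push up to patch number 1 (the second patch).
print(patches_to_push(series, 0, "1"))
# One patch applied; push up to a patch identified by name.
print(patches_to_push(series, 1, "docs.patch"))
```

Naming a destination this way lets one command do the work of several single pushes, so the working directory is scanned once instead of once per patch.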
bos@17	513
bos@15	514 \section{Updating your patches when the underlying code changes}
bos@15	515 \label{sec:mq:merge}
bos@15	516
bos@17	517 It's common to have a stack of patches on top of an underlying
bos@17	518 repository that you don't modify directly. If you're working on
bos@17	519 changes to third-party code, or on a feature that is taking longer to
bos@17	520 develop than the rate of change of the code beneath, you will often
bos@17	521 need to sync up with the underlying code, and fix up any hunks in your
bos@17	522 patches that no longer apply. This is called \emph{rebasing} your
bos@17	523 patch series.
bos@17	524
bos@17	525 The simplest way to do this is to \hgcmdargs{qpop}{\hgopt{qpop}{-a}}
bos@17	526 your patches, then \hgcmd{pull} changes into the underlying
bos@17	527 repository, and finally \hgcmdargs{qpush}{\hgopt{qpush}{-a}} your
bos@17	528 patches again. MQ will stop pushing any time it runs across a patch
bos@17	529 that fails to apply because of conflicts, allowing you to fix your
bos@17	530 conflicts, \hgcmd{qrefresh} the affected patch, and continue pushing
bos@17	531 until you have fixed your entire stack.
bos@17	532
bos@17	533 This approach is easy to use and works well if you don't expect
bos@17	534 changes to the underlying code to affect how well your patches apply.
bos@17	535 If your patch stack touches code that is modified frequently or
bos@17	536 invasively in the underlying repository, however, fixing up rejected
bos@17	537 hunks by hand quickly becomes tiresome.
bos@17	538
bos@17	539 It's possible to partially automate the rebasing process. If your
bos@17	540 patches apply cleanly against some revision of the underlying repo, MQ
bos@17	541 can use this information to help you to resolve conflicts between your
bos@17	542 patches and a different revision.
bos@17	543
bos@17	544 The process is a little involved.
bos@17	545 \begin{enumerate}
bos@17	546 \item To begin, \hgcmdargs{qpush}{\hgopt{qpush}{-a}} all of your patches on top of
bos@17	547 the revision where you know that they apply cleanly.
bos@17	548 \item Save a backup copy of your patch directory using
bos@17	549 \hgcmdargs{qsave}{\hgopt{qsave}{-e} \hgopt{qsave}{-c}}. This prints
bos@17	550 the name of the directory that it has saved the patches in. It will
bos@17	551 save the patches to a directory called
bos@17	552 \sdirname{.hg/patches.\emph{N}}, where \texttt{\emph{N}} is a small
bos@17	553 integer. It also commits a ``save changeset'' on top of your
bos@17	554 applied patches; this is for internal book-keeping, and records the
bos@17	555 states of the \sfilename{series} and \sfilename{status} files.
bos@17	556 \item Use \hgcmd{pull} to bring new changes into the underlying
bos@17	557 repository. (Don't run \hgcmdargs{pull}{\hgopt{pull}{-u}}; see below for why.)
bos@17	558 \item Update to the new tip revision, using
bos@17	559 \hgcmdargs{update}{\hgopt{update}{-C}} to override the patches you
bos@17	560 have pushed.
bos@17	561 \item Merge all patches using \hgcmdargs{qpush}{\hgopt{qpush}{-m}
bos@17	562 \hgopt{qpush}{-a}}. The \hgopt{qpush}{-m} option to \hgcmd{qpush}
bos@17	563 tells MQ to perform a three-way merge if the patch fails to apply.
bos@17	564 \end{enumerate}
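The five steps above can be collected into a small Python driver. The commands are exactly those named in the list; the names \texttt{REBASE\_STEPS} and \texttt{run\_steps} are hypothetical, and the script assumes that the working directory is already at the revision where the patches apply cleanly. By default it only prints the commands, so you can check that each one is available in your version of Mercurial before running anything.

```python
import subprocess

# The command sequence from the steps above, one entry per step.
REBASE_STEPS = [
    ["hg", "qpush", "-a"],        # 1. apply all patches where they apply cleanly
    ["hg", "qsave", "-e", "-c"],  # 2. back up the patch directory
    ["hg", "pull"],               # 3. bring in new changes (note: no -u)
    ["hg", "update", "-C"],       # 4. update to the new tip
    ["hg", "qpush", "-m", "-a"],  # 5. re-push, merging where needed
]


def run_steps(steps, repo, dry_run=True):
    """Print each command; run it in `repo` unless dry_run is set.

    Returns the command lines as strings. With dry_run=False, the
    first failing command raises CalledProcessError and stops the run,
    which is where you would step in to resolve a merge by hand.
    """
    shown = []
    for argv in steps:
        line = " ".join(argv)
        print("$", line)
        shown.append(line)
        if not dry_run:
            subprocess.run(argv, cwd=repo, check=True)
    return shown


commands = run_steps(REBASE_STEPS, repo=".")
```

Stopping at the first failure is a deliberate choice in this sketch: each step depends on the previous one having succeeded.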
bos@17	565
bos@17	566 During the \hgcmdargs{qpush}{\hgopt{qpush}{-m}}, each patch in the
bos@17	567 \sfilename{series} file is applied normally. If a patch applies with
bos@17	568 fuzz or rejects, MQ looks at the queue you \hgcmd{qsave}d, and
bos@17	569 performs a three-way merge with the corresponding changeset. This
bos@17	570 merge uses Mercurial's normal merge machinery, so it may pop up a GUI
bos@17	571 merge tool to help you to resolve problems.
bos@17	572
bos@17	573 When you finish resolving the effects of a patch, MQ refreshes your
bos@17	574 patch based on the result of the merge.
bos@17	575
bos@17	576 At the end of this process, your repository will have one extra head
bos@17	577 from the old patch queue, and a copy of the old patch queue will be in
bos@17	578 \sdirname{.hg/patches.\emph{N}}. You can remove the extra head using
bos@17	579 \hgcmdargs{qpop}{\hgopt{qpop}{-a} \hgopt{qpop}{-n} patches.\emph{N}}
bos@17	580 or \hgcmd{strip}. You can delete \sdirname{.hg/patches.\emph{N}} once
bos@17	581 you are sure that you no longer need it as a backup.
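
For example, if \hgcmd{qsave} had reported the directory name
\sdirname{.hg/patches.1} (a hypothetical value, used here only for
illustration), the cleanup might look like this. The \command{rm} step
is permanent, so run it only once you no longer need the backup.
\begin{codesample4}
    hg qpop -a -n patches.1    # remove the head left by the old queue
    rm -r .hg/patches.1        # discard the backup copy of the queue
\end{codesample4}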
bos@13	582
bos@16	583 \section{Managing patches in a repository}
bos@16	584
bos@16	585 Because MQ's \sdirname{.hg/patches} directory resides outside a
bos@16	586 Mercurial repository's working directory, the ``underlying'' Mercurial
bos@16	587 repository knows nothing about the management or presence of patches.
bos@16	588
bos@16	589 This presents the interesting possibility of managing the contents of
bos@16	590 the patch directory as a Mercurial repository in its own right. This
bos@16	591 can be a useful way to work. For example, you can work on a patch for
bos@16	592 a while, \hgcmd{qrefresh} it, then \hgcmd{commit} the current state of
bos@16	593 the patch. This lets you ``roll back'' to that version of the patch
bos@16	594 later on.
bos@16	595
bos@16	596 In addition, you can then share different versions of the same patch
bos@16	597 stack among multiple underlying repositories. I use this when I am
bos@16	598 developing a Linux kernel feature. I have a pristine copy of my
bos@16	599 kernel sources for each of several CPU architectures, and a cloned
bos@16	600 repository under each that contains the patches I am working on. When
bos@16	601 I want to test a change on a different architecture, I push my current
bos@16	602 patches to the patch repository associated with that kernel tree, pop
bos@16	603 and push all of my patches, and build and test that kernel.
bos@16	604
bos@16	605 Managing patches in a repository makes it possible for multiple
bos@16	606 developers to work on the same patch series without colliding with
bos@16	607 each other, all on top of an underlying source base that they may or
bos@16	608 may not control.
bos@16	609
bos@17	610 \subsection{MQ support for patch repositories}
bos@16	611
bos@16	612 MQ helps you to work with the \sdirname{.hg/patches} directory as a
bos@16	613 repository; when you prepare a repository for working with patches
bos@17	614 using \hgcmd{qinit}, you can pass the \hgopt{qinit}{-c} option to
bos@16	615 create the \sdirname{.hg/patches} directory as a Mercurial repository.
bos@16	616
bos@16	617 \begin{note}
bos@16	618 If you forget to use the \hgopt{qinit}{-c} option, you can simply go
bos@16	619 into the \sdirname{.hg/patches} directory at any time and run
bos@16	620 \hgcmd{init}. Don't forget to add an entry for the
bos@17	621 \sfilename{status} file to the \sfilename{.hgignore} file, though
bos@17	622 (\hgcmdargs{qinit}{\hgopt{qinit}{-c}} does this for you
bos@17	623 automatically); you \emph{really} don't want to manage the
bos@17	624 \sfilename{status} file.
bos@16	625 \end{note}
bos@16	626
bos@16	627 As a convenience, if MQ notices that the \dirname{.hg/patches}
bos@16	628 directory is a repository, it will automatically \hgcmd{add} every
bos@16	629 patch that you create and import.
bos@16	630
bos@16	631 Finally, MQ provides a shortcut command, \hgcmd{qcommit}, that runs
bos@16	632 \hgcmd{commit} in the \sdirname{.hg/patches} directory. This saves
bos@16	633 some cumbersome typing.
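
A minimal sketch of this way of working might look as follows. The
patch name and the commit message are only illustrations, and the
sketch assumes that \hgcmd{qcommit} accepts the same \texttt{-m}
option as \hgcmd{commit}.
\begin{codesample4}
    hg qinit -c                        # create .hg/patches as a repository
    hg qnew rework-device-alloc.patch  # start a patch; MQ adds it for you
    # ... edit files in the working directory ...
    hg qrefresh                        # record the edits in the patch
    hg qcommit -m 'First version'      # commit inside .hg/patches
\end{codesample4}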
bos@16	634
bos@16	635 \subsection{A few things to watch out for}
bos@16	636
bos@16	637 MQ's support for working with a repository full of patches is limited
bos@16	638 in a few small respects.
bos@16	639
bos@16	640 MQ cannot automatically detect changes that you make to the patch
bos@16	641 directory. If you \hgcmd{pull}, manually edit, or \hgcmd{update}
bos@16	642 changes to patches or the \sfilename{series} file, you will have to
bos@17	643 \hgcmdargs{qpop}{\hgopt{qpop}{-a}} and then
bos@17	644 \hgcmdargs{qpush}{\hgopt{qpush}{-a}} in the underlying repository to
bos@17	645 see those changes show up there. If you forget to do this, you can
bos@17	646 confuse MQ's idea of which patches are applied.
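
One cautious way to pick up changes from elsewhere is sketched below.
Popping before touching the patch directory is a choice made for this
example, not something MQ requires, and the sketch assumes that the
patch repository has a default path to pull from. The \texttt{-R}
option simply tells \command{hg} which repository to operate on.
\begin{codesample4}
    hg qpop -a                  # unapply all patches first
    hg -R .hg/patches pull      # fetch new versions of the patches
    hg -R .hg/patches update    # bring the patch files up to date
    hg qpush -a                 # reapply the updated patches
\end{codesample4}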
bos@16	647
bos@16	648 \section{Commands for working with patches}
bos@19	649 \label{sec:mq:tools}
bos@16	650
bos@16	651 Once you've been working with patches for a while, you'll find
bos@16	652 yourself hungry for tools that will help you to understand and
bos@16	653 manipulate the patches you're dealing with.
bos@16	654
bos@16	655 The \command{diffstat} command~\cite{web:diffstat} generates a
bos@16	656 histogram of the modifications made to each file in a patch. It
bos@18	657 provides a good way to ``get a sense of'' a patch---which files it
bos@16	658 affects, and how much change it introduces to each file and as a
bos@16	659 whole. (I find that it's a good idea to use \command{diffstat}'s
bos@16	660 \texttt{-p} option as a matter of course, as otherwise it will try to
bos@16	661 do clever things with prefixes of file names that inevitably confuse
bos@16	662 at least me.)
bos@16	663
bos@19	664 \begin{figure}[ht]
bos@19	665 \interaction{mq.tools.tools}
bos@19	666 \caption{The \command{diffstat}, \command{filterdiff}, and \command{lsdiff} commands}
bos@19	667 \label{ex:mq:tools}
bos@19	668 \end{figure}
bos@19	669
bos@16	670 The \package{patchutils} package~\cite{web:patchutils} is invaluable.
bos@16	671 It provides a set of small utilities that follow the ``Unix
bos@16	672 philosophy;'' each does one useful thing with a patch. The
bos@16	673 \package{patchutils} command I use most is \command{filterdiff}, which
bos@16	674 extracts subsets from a patch file. For example, given a patch that
bos@16	675 modifies hundreds of files across dozens of directories, a single
bos@16	676 invocation of \command{filterdiff} can generate a smaller patch that
bos@16	677 only touches files whose names match a particular glob pattern.
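
As a hedged illustration, the commands below summarise a patch and
then extract part of it. The file names \filename{big.patch} and
\filename{net.patch} and the glob pattern are hypothetical, and the
\texttt{-p1} options assume that file names in the patch carry one
leading path component.
\begin{codesample4}
    # print a per-file histogram of the changes
    diffstat -p1 big.patch
    # keep only the changes to files matching the pattern
    filterdiff -p1 -i 'drivers/net/*' big.patch > net.patch
\end{codesample4}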
bos@16	678
bos@19	679 \section{Good ways to work with patches}
bos@19	680
bos@19	681 Whether you are working on a patch series to submit to a free software
bos@19	682 or open source project, or a series that you intend to treat as a
bos@19	683 sequence of regular changesets when you're done, you can use some
bos@19	684 simple techniques to keep your work well organised.
bos@19	685
bos@19	686 Give your patches descriptive names. A good name for a patch might be
bos@19	687 \filename{rework-device-alloc.patch}, because it will immediately give
bos@19	688 you a hint as to the purpose of the patch. Long names shouldn't be
bos@19	689 a problem; you won't be typing the names often, but you \emph{will} be
bos@19	690 running commands like \hgcmd{qapplied} and \hgcmd{qtop} over and over.
bos@19	691 Good naming becomes especially important when you have a number of
bos@19	692 patches to work with, or if you are juggling a number of different
bos@19	693 tasks and your patches only get a fraction of your attention.
bos@19	694
bos@19	695 Be aware of what patch you're working on. Use the \hgcmd{qtop}
bos@19	696 command and skim over the text of your patches frequently---for
bos@19	697 example, using \hgcmdargs{tip}{\hgopt{tip}{-p}}---to be sure of where
bos@19	698 you stand. I have several times worked on and \hgcmd{qrefresh}ed a
bos@19	699 patch other than the one I intended, and it's often tricky to migrate
bos@19	700 changes into the right patch after making them in the wrong one.
bos@19	701
bos@19	702 For this reason, it is very much worth investing a little time to
bos@19	703 learn how to use some of the third-party tools I described in
bos@19	704 section~\ref{sec:mq:tools}, particularly \command{diffstat} and
bos@19	705 \command{filterdiff}. The former will give you a quick idea of what
bos@19	706 changes your patch is making, while the latter makes it easy to splice
bos@19	707 hunks selectively out of one patch and into another.
bos@19	708
bos@19	709 \section{MQ cookbook}
bos@19	710
bos@19	711 \subsection{Manage ``trivial'' patches}
bos@19	712
bos@19	713 Because the overhead of dropping files into a new Mercurial repository
bos@19	714 is so low, it makes a lot of sense to manage patches this way even if
bos@19	715 you simply want to make a few changes to a source tarball that you
bos@19	716 downloaded.
bos@19	717
bos@19	718 Begin by downloading and unpacking the source tarball,
bos@19	719 and turning it into a Mercurial repository.
bos@19	720 \interaction{mq.tarball.download}
bos@19	721
bos@19	722 Continue by creating a patch stack and making your changes.
bos@19	723 \interaction{mq.tarball.qinit}
bos@19	724
bos@19	725 Let's say a few weeks or months pass, and your package author releases
bos@19	726 a new version. First, bring their changes into the repository.
bos@19	727 \interaction{mq.tarball.newsource}
bos@19	728 The pipeline starting with \hgcmd{locate} above deletes all files in
bos@19	729 the working directory, so that \hgcmd{commit}'s
bos@19	730 \hgopt{commit}{--addremove} option can actually tell which files have
bos@19	731 really been removed in the newer version of the source.
bos@19	732
bos@19	733 Finally, you can apply your patches on top of the new tree.
bos@19	734 \interaction{mq.tarball.repush}
bos@19	735
bos@19	736 \subsection{Combining entire patches}
bos@19	737 \label{sec:mq:combine}
bos@19	738
bos@19	739 It's easy to combine entire patches.
bos@19	740
bos@19	741 \begin{enumerate}
bos@19	742 \item \hgcmd{qpop} your applied patches until neither patch is
bos@19	743 applied.
bos@19	744 \item Concatenate the patches that you want to combine together:
bos@19	745 \begin{codesample4}
bos@19	746 cat patch-to-drop.patch >> patch-to-augment.patch
bos@19	747 \end{codesample4}
bos@19	748 The description from the first patch (if you have one) will be used
bos@19	749 as the commit comment when you \hgcmd{qpush} the combined patch.
bos@19	750 Edit the patch description if you need to.
bos@19	751 \item Use the \hgcmd{qdel} command to delete the patch you're dropping
bos@19	752 from the \sfilename{series} file.
bos@19	753 \item \hgcmd{qpush} the combined patch. Fix up any rejects.
bos@19	754 \item \hgcmd{qrefresh} the combined patch to tidy it up.
bos@19	755 \end{enumerate}
bos@19	756
bos@19	757 \subsection{Merging part of one patch into another}
bos@19	758
bos@19	759 Merging \emph{part} of one patch into another is more difficult than
bos@19	760 combining entire patches.
bos@19	761
bos@19	762 If you want to move changes to entire files, you can use
bos@19	763 \command{filterdiff}'s \cmdopt{filterdiff}{-i} and
bos@19	764 \cmdopt{filterdiff}{-x} options to choose the modifications to snip
bos@19	765 out of one patch, concatenating its output onto the end of the patch
bos@19	766 you want to merge into. You usually won't need to modify the patch
bos@19	767 you've merged the changes from. Instead, MQ will report some rejected
bos@19	768 hunks when you \hgcmd{qpush} it (from the hunks you moved into the
bos@19	769 other patch), and you can simply \hgcmd{qrefresh} the patch to drop
bos@19	770 the duplicate hunks.
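
For instance, moving every change under a directory from one patch to
another could be sketched like this. The patch names and the pattern
are hypothetical, and both patches are assumed to be unapplied while
you edit them.
\begin{codesample4}
    cd .hg/patches
    # append the changes to matching files to the destination patch
    filterdiff -p1 -i 'docs/*' source.patch >> dest.patch
\end{codesample4}
Swapping \texttt{-i} for \texttt{-x} selects everything \emph{except}
the matching files.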
bos@19	771
bos@19	772 If you have a patch that has multiple hunks modifying a file, and you
bos@19	773 only want to move a few of those hunks, the job becomes messier,
bos@19	774 but you can still partly automate it. Use \cmdargs{lsdiff}{-nvv} to
bos@19	775 print some metadata about the patch.
bos@19	776 \interaction{mq.tools.lsdiff}
bos@19	777
bos@19	778 This command prints three different kinds of number:
bos@19	779 \begin{itemize}
bos@19	780 \item a \emph{file number} to identify each file modified in the patch;
bos@19	781 \item the line number within a modified file that a hunk starts at; and
bos@19	782 \item a \emph{hunk number} to identify that hunk.
bos@19	783 \end{itemize}
bos@19	784
bos@19	785 You'll have to use some visual inspection, and reading of the patch,
bos@19	786 to identify the file and hunk numbers you'll want, but you can then
bos@19	787 pass them to \command{filterdiff}'s \cmdopt{filterdiff}{--files}
bos@19	788 and \cmdopt{filterdiff}{--hunks} options, to select exactly the file
bos@19	789 and hunk you want to extract.
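
As a sketch, extracting the first hunk of the second file in a patch
might look like the following. The file number, the hunk number, and
the patch names are hypothetical; you would read the real numbers off
the \command{lsdiff} output.
\begin{codesample4}
    lsdiff -nvv source.patch    # note the file and hunk numbers
    # select one hunk of one file and append it to the destination
    filterdiff --files=2 --hunks=1 source.patch >> dest.patch
\end{codesample4}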
bos@19	790
bos@19	791 Once you have this hunk, you can concatenate it onto the end of your
bos@19	792 destination patch and continue with the remainder of
bos@19	793 section~\ref{sec:mq:combine}.
bos@19	794
bos@1	795 %%% Local Variables:
bos@1	796 %%% mode: latex
bos@1	797 %%% TeX-master: "00book"
bos@1	798 %%% End: