hgbook

view en/ch09-undo.xml @ 563:44d1363234d2

Move example output files into examples/results
author Bryan O'Sullivan <bos@serpentine.com>
date Mon Mar 09 21:37:47 2009 -0700 (2009-03-09)
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <chapter id="chap:undo">
4 <title>Finding and fixing your mistakes</title>
6 <para>To err might be human, but to really handle the consequences
7 well takes a top-notch revision control system. In this chapter,
8 we'll discuss some of the techniques you can use when you find
9 that a problem has crept into your project. Mercurial has some
10 highly capable features that will help you to isolate the sources
11 of problems, and to handle them appropriately.</para>
13 <sect1>
14 <title>Erasing local history</title>
16 <sect2>
17 <title>The accidental commit</title>
19 <para>I have the occasional but persistent problem of typing
20 rather more quickly than I can think, which sometimes results
21 in me committing a changeset that is either incomplete or
22 plain wrong. In my case, the usual kind of incomplete
23 changeset is one in which I've created a new source file, but
24 forgotten to <command role="hg-cmd">hg add</command> it. A
25 <quote>plain wrong</quote> changeset is not as common, but no
26 less annoying.</para>
28 </sect2>
29 <sect2 id="sec:undo:rollback">
30 <title>Rolling back a transaction</title>
32 <para>In section <xref linkend="sec:concepts:txn"/>, I mentioned
33 that Mercurial treats each modification of a repository as a
34 <emphasis>transaction</emphasis>. Every time you commit a
35 changeset or pull changes from another repository, Mercurial
36 remembers what you did. You can undo, or <emphasis>roll
37 back</emphasis>, exactly one of these actions using the
38 <command role="hg-cmd">hg rollback</command> command. (See
39 section <xref linkend="sec:undo:rollback-after-push"/> for an
40 important caveat about the use of this command.)</para>
42 <para>Here's a mistake that I often find myself making:
43 committing a change in which I've created a new file, but
44 forgotten to <command role="hg-cmd">hg add</command> it. <!--
45 &interaction.rollback.commit; --> Looking at the output of
46 <command role="hg-cmd">hg status</command> after the commit
47 immediately confirms the error. <!--
48 &interaction.rollback.status; --> The commit captured the
49 changes to the file <filename>a</filename>, but not the new
50 file <filename>b</filename>. If I were to push this changeset
51 to a repository that I shared with a colleague, the chances
52 are high that something in <filename>a</filename> would refer
53 to <filename>b</filename>, which would not be present in their
54 repository when they pulled my changes. I would thus become
55 the object of some indignation.</para>
57 <para>However, luck is with me&emdash;I've caught my error
58 before I pushed the changeset. I use the <command
59 role="hg-cmd">hg rollback</command> command, and Mercurial
60 makes that last changeset vanish. <!--
61 &interaction.rollback.rollback; --> Notice that the changeset
62 is no longer present in the repository's history, and the
63 working directory once again thinks that the file
64 <filename>a</filename> is modified. The commit and rollback
65 have left the working directory exactly as it was prior to the
66 commit; the changeset has been completely erased. I can now
67 safely <command role="hg-cmd">hg add</command> the file
68 <filename>b</filename>, and rerun my commit. <!--
69 &interaction.rollback.add; --></para>
71 </sect2>
72 <sect2>
73 <title>The erroneous pull</title>
75 <para>It's common practice with Mercurial to maintain separate
76 development branches of a project in different repositories.
77 Your development team might have one shared repository for
78 your project's <quote>0.9</quote> release, and another,
79 containing different changes, for the <quote>1.0</quote>
80 release.</para>
82 <para>Given this, you can imagine that the consequences could be
83 messy if you had a local <quote>0.9</quote> repository, and
84 accidentally pulled changes from the shared <quote>1.0</quote>
85 repository into it. At worst, you could be paying
86 insufficient attention, and push those changes into the shared
87 <quote>0.9</quote> tree, confusing your entire team (but don't
88 worry, we'll return to this horror scenario later). However,
89 it's more likely that you'll notice immediately, because
90 Mercurial will display the URL it's pulling from, or you will
91 see it pull a suspiciously large number of changes into the
92 repository.</para>
94 <para>The <command role="hg-cmd">hg rollback</command> command
95 will work nicely to expunge all of the changesets that you
96 just pulled. Mercurial groups all changes from one <command
97 role="hg-cmd">hg pull</command> into a single transaction,
98 so one <command role="hg-cmd">hg rollback</command> is all you
99 need to undo this mistake.</para>
101 </sect2>
102 <sect2 id="sec:undo:rollback-after-push">
103 <title>Rolling back is useless once you've pushed</title>
105 <para>The value of the <command role="hg-cmd">hg
106 rollback</command> command drops to zero once you've pushed
107 your changes to another repository. Rolling back a change
108 makes it disappear entirely, but <emphasis>only</emphasis> in
109 the repository in which you perform the <command
110 role="hg-cmd">hg rollback</command>. Because a rollback
111 eliminates history, there's no way for the disappearance of a
112 change to propagate between repositories.</para>
114 <para>If you've pushed a change to another
115 repository&emdash;particularly if it's a shared
116 repository&emdash;it has essentially <quote>escaped into the
117 wild,</quote> and you'll have to recover from your mistake
118 in a different way. If you push a changeset somewhere, then
119 roll it back, then pull from the repository you pushed to,
120 the changeset will reappear in your
121 repository.</para>
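<para>To make the reappearing changeset concrete, here is a toy
model in Python. It is only a sketch: a repository is modelled as a
plain list of changeset names, and <literal>pull</literal> is a
hypothetical helper of mine, not anything Mercurial provides.</para>

```python
def pull(destination, source):
    """Append every changeset that destination lacks, in source order."""
    for changeset in source:
        if changeset not in destination:
            destination.append(changeset)


local = ["c0", "c1"]      # c1 is the mistaken commit
shared = ["c0"]

pull(shared, local)       # a push, modelled as the shared repo pulling from us
local.pop()               # stand-in for rolling back the last commit locally
after_rollback = list(local)

pull(local, shared)       # the next pull brings c1 straight back
print(after_rollback, local)   # prints ['c0'] ['c0', 'c1']
```

<para>The rollback only ever touched <literal>local</literal>, so
the copy of the changeset in <literal>shared</literal> survives and
returns with the next pull.</para>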
123 <para>(If you absolutely know for sure that the change you want
124 to roll back is the most recent change in the repository that
125 you pushed to, <emphasis>and</emphasis> you know that nobody
126 else could have pulled it from that repository, you can roll
127 back the changeset there, too, but you really should
128 not rely on this working reliably. If you do this, sooner or
129 later a change really will make it into a repository that you
130 don't directly control (or have forgotten about), and come
131 back to bite you.)</para>
133 </sect2>
134 <sect2>
135 <title>You can only roll back once</title>
137 <para>Mercurial stores exactly one transaction in its
138 transaction log; that transaction is the most recent one that
139 occurred in the repository. This means that you can only roll
140 back one transaction. If you expect to be able to roll back
141 one transaction, then its predecessor, this is not the
142 behaviour you will get. <!-- &interaction.rollback.twice; -->
143 Once you've rolled back one transaction in a repository, you
144 can't roll back again in that repository until you perform
145 another commit or pull.</para>
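<para>The one-slot behaviour can be sketched as a small Python
class. This is a toy model under my own assumptions, not
Mercurial's implementation: <literal>ToyRepo</literal> and its
methods are hypothetical names, and a transaction is reduced to
appending names to a list.</para>

```python
class ToyRepo:
    """Toy model: a history list plus a single undo record."""

    def __init__(self):
        self.history = []          # changeset names, oldest first
        self._undo_length = None   # history length before the last transaction

    def transaction(self, changesets):
        """Add one or more changesets as a single transaction."""
        self._undo_length = len(self.history)   # overwrites any older record
        self.history.extend(changesets)

    def rollback(self):
        """Undo the most recent transaction; return False if none is recorded."""
        if self._undo_length is None:
            return False
        del self.history[self._undo_length:]
        self._undo_length = None   # the single record is now used up
        return True


repo = ToyRepo()
repo.transaction(["commit-1"])
repo.transaction(["pulled-1", "pulled-2"])   # one pull, one transaction
first = repo.rollback()     # removes both pulled changesets
second = repo.rollback()    # nothing left to undo
print(first, second, repo.history)   # prints True False ['commit-1']
```

<para>Because each new transaction overwrites the single undo
record, there is never an older transaction to fall back
to.</para>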
147 </sect2>
148 </sect1>
149 <sect1>
150 <title>Reverting the mistaken change</title>
152 <para>If you make a modification to a file, and decide that you
153 really didn't want to change the file at all, and you haven't
154 yet committed your changes, the <command role="hg-cmd">hg
155 revert</command> command is the one you'll need. It looks at
156 the changeset that's the parent of the working directory, and
157 restores the contents of the file to their state as of that
158 changeset. (That's a long-winded way of saying that, in the
159 normal case, it undoes your modifications.)</para>
161 <para>Let's illustrate how the <command role="hg-cmd">hg
162 revert</command> command works with yet another small example.
163 We'll begin by modifying a file that Mercurial is already
164 tracking. <!-- &interaction.daily.revert.modify; --> If we don't
165 want that change, we can simply <command role="hg-cmd">hg
166 revert</command> the file. <!--
167 &interaction.daily.revert.unmodify; --> The <command
168 role="hg-cmd">hg revert</command> command provides us with an
169 extra degree of safety by saving our modified file with a
170 <filename>.orig</filename> extension. <!--
171 &interaction.daily.revert.status; --></para>
173 <para>Here is a summary of the cases that the <command
174 role="hg-cmd">hg revert</command> command can deal with. We
175 will describe each of these in more detail in the section that
176 follows.</para>
177 <itemizedlist>
178 <listitem><para>If you modify a file, it will restore the file
179 to its unmodified state.</para>
180 </listitem>
181 <listitem><para>If you <command role="hg-cmd">hg add</command> a
182 file, it will undo the <quote>added</quote> state of the
183 file, but leave the file itself untouched.</para>
184 </listitem>
185 <listitem><para>If you delete a file without telling Mercurial,
186 it will restore the file to its unmodified contents.</para>
187 </listitem>
188 <listitem><para>If you use the <command role="hg-cmd">hg
189 remove</command> command to remove a file, it will undo
190 the <quote>removed</quote> state of the file, and restore
191 the file to its unmodified contents.</para>
192 </listitem></itemizedlist>
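<para>The four cases above can be condensed into a few lines of
Python. Treat this as a sketch of the summary only: the function
<literal>revert</literal> and its arguments are hypothetical, it
handles a single file, and it leaves out the
<filename>.orig</filename> backup described earlier.</para>

```python
def revert(status, parent_contents, working_contents):
    """Return (tracked, contents) for one file after a revert.

    status is one of "modified", "added", "missing" or "removed".
    parent_contents is what the file held in the parent changeset.
    working_contents is what is on disk now (None if the file is gone).
    """
    if status == "added":
        # Only the "added" mark is undone; the file itself is untouched.
        return False, working_contents
    if status in ("modified", "missing", "removed"):
        # The file is tracked again with its unmodified contents.
        return True, parent_contents
    raise ValueError("unhandled status: " + status)


print(revert("modified", "old\n", "new\n"))   # prints (True, 'old\n')
print(revert("added", None, "draft\n"))       # prints (False, 'draft\n')
print(revert("missing", "old\n", None))       # prints (True, 'old\n')
print(revert("removed", "old\n", None))       # prints (True, 'old\n')
```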
194 <sect2 id="sec:undo:mgmt">
195 <title>File management errors</title>
197 <para>The <command role="hg-cmd">hg revert</command> command is
198 useful for more than just modified files. It lets you reverse
199 the results of all of Mercurial's file management
200 commands&emdash;<command role="hg-cmd">hg add</command>,
201 <command role="hg-cmd">hg remove</command>, and so on.</para>
203 <para>If you <command role="hg-cmd">hg add</command> a file,
204 then decide that in fact you don't want Mercurial to track it,
205 use <command role="hg-cmd">hg revert</command> to undo the
206 add. Don't worry; Mercurial will not modify the file in any
207 way. It will just <quote>unmark</quote> the file. <!--
208 &interaction.daily.revert.add; --></para>
210 <para>Similarly, if you ask Mercurial to <command
211 role="hg-cmd">hg remove</command> a file, you can use
212 <command role="hg-cmd">hg revert</command> to restore it to
213 the contents it had as of the parent of the working directory.
214 <!-- &interaction.daily.revert.remove; --> This works just as
215 well for a file that you deleted by hand, without telling
216 Mercurial (recall that in Mercurial terminology, this kind of
217 file is called <quote>missing</quote>). <!--
218 &interaction.daily.revert.missing; --></para>
220 <para>If you revert a <command role="hg-cmd">hg copy</command>,
221 the copied-to file remains in your working directory
222 afterwards, untracked. Since a copy doesn't affect the
223 copied-from file in any way, Mercurial doesn't do anything
224 with the copied-from file. <!--
225 &interaction.daily.revert.copy; --></para>
227 <sect3>
228 <title>A slightly special case: reverting a rename</title>
230 <para>If you <command role="hg-cmd">hg rename</command> a
231 file, there is one small detail that you should remember.
232 When you <command role="hg-cmd">hg revert</command> a
233 rename, it's not enough to provide the name of the
234 renamed-to file, as you can see here. <!--
235 &interaction.daily.revert.rename; --> As you can see from
236 the output of <command role="hg-cmd">hg status</command>,
237 the renamed-to file is no longer identified as added, but
238 the renamed-<emphasis>from</emphasis> file is still removed!
239 This is counter-intuitive (at least to me), but at least
240 it's easy to deal with. <!--
241 &interaction.daily.revert.rename-orig; --> So remember, to
242 revert a <command role="hg-cmd">hg rename</command>, you
243 must provide <emphasis>both</emphasis> the source and
244 destination names.</para>
246 <!-- TODO: the output doesn't look like it will be
247 removed! -->
249 <para>(By the way, if you rename a file, then modify the
250 renamed-to file, then revert both components of the rename,
251 when Mercurial restores the file that was removed as part of
252 the rename, it will be unmodified. If you need the
253 modifications in the renamed-to file to show up in the
254 renamed-from file, don't forget to copy them over.)</para>
256 <para>These fiddly aspects of reverting a rename arguably
257 constitute a small bug in Mercurial.</para>
259 </sect3>
260 </sect2>
261 </sect1>
262 <sect1>
263 <title>Dealing with committed changes</title>
265 <para>Consider a case where you have committed a change <emphasis>a</emphasis>, and
266 another change <emphasis>b</emphasis> on top of it; you then realise that change
267 <emphasis>a</emphasis> was incorrect. Mercurial lets you <quote>back out</quote>
268 an entire changeset automatically, and provides building blocks that let
269 you reverse part of a changeset by hand.</para>
271 <para>Before you read this section, here's something to keep in
272 mind: the <command role="hg-cmd">hg backout</command> command
273 undoes changes by <emphasis>adding</emphasis> history, not by
274 modifying or erasing it. It's the right tool to use if you're
275 fixing bugs, but not if you're trying to undo some change that
276 has catastrophic consequences. To deal with those, see section
277 <xref linkend="sec:undo:aaaiiieee"/>.</para>
279 <sect2>
280 <title>Backing out a changeset</title>
282 <para>The <command role="hg-cmd">hg backout</command> command
283 lets you <quote>undo</quote> the effects of an entire
284 changeset in an automated fashion. Because Mercurial's
285 history is immutable, this command <emphasis>does
286 not</emphasis> get rid of the changeset you want to undo.
287 Instead, it creates a new changeset that
288 <emphasis>reverses</emphasis> the effect of the to-be-undone
289 changeset.</para>
291 <para>The operation of the <command role="hg-cmd">hg
292 backout</command> command is a little intricate, so let's
293 illustrate it with some examples. First, we'll create a
294 repository with some simple changes. <!--
295 &interaction.backout.init; --></para>
297 <para>The <command role="hg-cmd">hg backout</command> command
298 takes a single changeset ID as its argument; this is the
299 changeset to back out. Normally, <command role="hg-cmd">hg
300 backout</command> will drop you into a text editor to write
301 a commit message, so you can record why you're backing the
302 change out. In this example, we provide a commit message on
303 the command line using the <option
304 role="hg-opt-backout">-m</option> option.</para>
306 </sect2>
307 <sect2>
308 <title>Backing out the tip changeset</title>
310 <para>We're going to start by backing out the last changeset we
311 committed. <!-- &interaction.backout.simple; --> You can see
312 that the second line from <filename>myfile</filename> is no
313 longer present. Taking a look at the output of <command
314 role="hg-cmd">hg log</command> gives us an idea of what the
315 <command role="hg-cmd">hg backout</command> command has done.
316 <!-- &interaction.backout.simple.log; --> Notice that the new
317 changeset that <command role="hg-cmd">hg backout</command> has
318 created is a child of the changeset we backed out. It's
319 easier to see this in figure <xref
320 linkend="fig:undo:backout"/>, which presents a graphical
321 view of the change history. As you can see, the history is
322 nice and linear.</para>
324 <informalfigure id="fig:undo:backout">
325 <mediaobject><imageobject><imagedata
326 fileref="undo-simple"/></imageobject><textobject><phrase>XXX
327 add text</phrase></textobject><caption><para>Backing out
328 a change using the <command role="hg-cmd">hg
329 backout</command>
330 command</para></caption></mediaobject>
332 </informalfigure>
334 </sect2>
335 <sect2>
336 <title>Backing out a non-tip change</title>
338 <para>If you want to back out a change other than the last one
339 you committed, pass the <option
340 role="hg-opt-backout">--merge</option> option to the
341 <command role="hg-cmd">hg backout</command> command. <!--
342 &interaction.backout.non-tip.clone; --> This makes backing out
343 any changeset a <quote>one-shot</quote> operation that's
344 usually simple and fast. <!--
345 &interaction.backout.non-tip.backout; --></para>
347 <para>If you take a look at the contents of
348 <filename>myfile</filename> after the backout finishes, you'll
349 see that the first and third changes are present, but not the
350 second. <!-- &interaction.backout.non-tip.cat; --></para>
352 <para>As the graphical history in figure <xref
353 linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial
354 actually commits <emphasis>two</emphasis> changes in this kind
355 of situation (the box-shaped nodes are the ones that Mercurial
356 commits automatically). Before Mercurial begins the backout
357 process, it first remembers what the current parent of the
358 working directory is. It then backs out the target changeset,
359 and commits that as a changeset. Finally, it merges back to
360 the previous parent of the working directory, and commits the
361 result of the merge.</para>
363 <!-- TODO: to me it looks like mercurial doesn't commit the
364 second merge automatically! -->
366 <informalfigure id="fig:undo:backout-non-tip">
367 <mediaobject><imageobject><imagedata
368 fileref="undo-non-tip"/></imageobject><textobject><phrase>XXX
369 add text</phrase></textobject><caption><para>Automated
370 backout of a non-tip change using the <command
371 role="hg-cmd">hg backout</command>
372 command</para></caption></mediaobject>
373 </informalfigure>
375 <para>The result is that you end up <quote>back where you
376 were</quote>, only with some extra history that undoes the
377 effect of the changeset you wanted to back out.</para>
379 <sect3>
380 <title>Always use the <option
381 role="hg-opt-backout">--merge</option> option</title>
383 <para>In fact, since the <option
384 role="hg-opt-backout">--merge</option> option will do the
385 <quote>right thing</quote> whether or not the changeset
386 you're backing out is the tip (i.e. it won't try to merge if
387 it's backing out the tip, since there's no need), you should
388 <emphasis>always</emphasis> use this option when you run the
389 <command role="hg-cmd">hg backout</command> command.</para>
391 </sect3>
392 </sect2>
393 <sect2>
394 <title>Gaining more control of the backout process</title>
396 <para>While I've recommended that you always use the <option
397 role="hg-opt-backout">--merge</option> option when backing
398 out a change, the <command role="hg-cmd">hg backout</command>
399 command lets you decide how to merge a backout changeset.
400 Taking control of the backout process by hand is something you
401 will rarely need to do, but it can be useful to understand
402 what the <command role="hg-cmd">hg backout</command> command
403 is doing for you automatically. To illustrate this, let's
404 clone our first repository, but omit the backout change that
405 it contains.</para>
407 <para><!-- &interaction.backout.manual.clone; --> As with our
408 earlier example, we'll commit a third changeset, then back out
409 its parent, and see what happens. <!--
410 &interaction.backout.manual.backout; --> Our new changeset is
411 again a descendant of the changeset we backed out; it's thus
412 a new head, <emphasis>not</emphasis> a descendant of the
413 changeset that was the tip. The <command role="hg-cmd">hg
414 backout</command> command was quite explicit in telling us
415 this. <!-- &interaction.backout.manual.log; --></para>
417 <para>Again, it's easier to see what has happened by looking at
418 a graph of the revision history, in figure <xref
419 linkend="fig:undo:backout-manual"/>. This makes it clear
420 that when we use <command role="hg-cmd">hg backout</command>
421 to back out a change other than the tip, Mercurial adds a new
422 head to the repository (the change it committed is
423 box-shaped).</para>
425 <informalfigure id="fig:undo:backout-manual">
426 <mediaobject><imageobject><imagedata
427 fileref="undo-manual"/></imageobject><textobject><phrase>XXX
428 add text</phrase></textobject><caption><para>Backing out
429 a change using the <command role="hg-cmd">hg
430 backout</command>
431 command</para></caption></mediaobject>
433 </informalfigure>
435 <para>After the <command role="hg-cmd">hg backout</command>
436 command has completed, it leaves the new
437 <quote>backout</quote> changeset as the parent of the working
438 directory. <!-- &interaction.backout.manual.parents; --> Now
439 we have two isolated sets of changes. <!--
440 &interaction.backout.manual.heads; --></para>
442 <para>Let's think about what we expect to see as the contents of
443 <filename>myfile</filename> now. The first change should be
444 present, because we've never backed it out. The second change
445 should be missing, as that's the change we backed out. Since
446 the history graph shows the third change as a separate head,
447 we <emphasis>don't</emphasis> expect to see the third change
448 present in <filename>myfile</filename>. <!--
449 &interaction.backout.manual.cat; --> To get the third change
450 back into the file, we just do a normal merge of our two
451 heads. <!-- &interaction.backout.manual.merge; --> Afterwards,
452 the graphical history of our repository looks like figure
453 <xref linkend="fig:undo:backout-manual-merge"/>.</para>
455 <informalfigure id="fig:undo:backout-manual-merge">
456 <mediaobject><imageobject><imagedata
457 fileref="undo-manual-merge"/></imageobject><textobject><phrase>XXX
458 add text</phrase></textobject><caption><para>Manually
459 merging a backout change</para></caption></mediaobject>
461 </informalfigure>
463 </sect2>
464 <sect2>
465 <title>Why <command role="hg-cmd">hg backout</command> works as
466 it does</title>
468 <para>Here's a brief description of how the <command
469 role="hg-cmd">hg backout</command> command works.</para>
470 <orderedlist>
471 <listitem><para>It ensures that the working directory is
472 <quote>clean</quote>, i.e. that the output of <command
473 role="hg-cmd">hg status</command> would be empty.</para>
474 </listitem>
475 <listitem><para>It remembers the current parent of the working
476 directory. Let's call this changeset
477 <literal>orig</literal>.</para>
478 </listitem>
479 <listitem><para>It does the equivalent of a <command
480 role="hg-cmd">hg update</command> to sync the working
481 directory to the changeset you want to back out. Let's
482 call this changeset <literal>backout</literal>.</para>
483 </listitem>
484 <listitem><para>It finds the parent of that changeset. Let's
485 call that changeset <literal>parent</literal>.</para>
486 </listitem>
487 <listitem><para>For each file that the
488 <literal>backout</literal> changeset affected, it does the
489 equivalent of a <command role="hg-cmd">hg revert -r
490 parent</command> on that file, to restore it to the
491 contents it had before that changeset was
492 committed.</para>
493 </listitem>
494 <listitem><para>It commits the result as a new changeset.
495 This changeset has <literal>backout</literal> as its
496 parent.</para>
497 </listitem>
498 <listitem><para>If you specify <option
499 role="hg-opt-backout">--merge</option> on the command
500 line, it merges with <literal>orig</literal>, and commits
501 the result of the merge.</para>
502 </listitem></orderedlist>
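<para>The middle steps of this list (finding
<literal>parent</literal>, reverting the affected files, and
committing) can be sketched in Python. This is a toy model under
loud assumptions: a changeset is reduced to a dictionary mapping
file names to contents, <literal>backout_snapshot</literal> is a
hypothetical name, and the final merge with
<literal>orig</literal> is not modelled at all.</para>

```python
def backout_snapshot(parent_files, target_files):
    """Return the files of the new changeset committed on top of the target.

    parent_files: snapshot of the changeset called "parent" above.
    target_files: snapshot of the changeset called "backout" above.
    """
    names = set(parent_files) | set(target_files)
    # The files the target changeset affected are those that differ.
    affected = {name for name in names
                if parent_files.get(name) != target_files.get(name)}
    result = dict(target_files)
    for name in affected:
        if name in parent_files:
            result[name] = parent_files[name]   # restore the older contents
        else:
            del result[name]                    # the target added this file
    return result


parent = {"myfile": "first change\n"}
target = {"myfile": "first change\nsecond change\n", "notes": "new file\n"}
print(backout_snapshot(parent, target) == parent)   # prints True
```

<para>In this model the new changeset always ends up with the same
files as <literal>parent</literal>. The interesting work happens
afterwards, when that changeset is merged with
<literal>orig</literal>.</para>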
504 <para>An alternative way to implement the <command
505 role="hg-cmd">hg backout</command> command would be to
506 <command role="hg-cmd">hg export</command> the
507 to-be-backed-out changeset as a diff, then use the <option
508 role="cmd-opt-patch">--reverse</option> option to the
509 <command>patch</command> command to reverse the effect of the
510 change without fiddling with the working directory. This
511 sounds much simpler, but it would not work nearly as
512 well.</para>
514 <para>The reason that <command role="hg-cmd">hg
515 backout</command> does an update, a commit, a merge, and
516 another commit is to give the merge machinery the best chance
517 to do a good job when dealing with all the changes
518 <emphasis>between</emphasis> the change you're backing out and
519 the current tip.</para>
521 <para>If you're backing out a changeset that's 100 revisions
522 back in your project's history, the chances that the
523 <command>patch</command> command will be able to apply a
524 reverse diff cleanly are not good, because intervening changes
525 are likely to have <quote>broken the context</quote> that
526 <command>patch</command> uses to determine whether it can
527 apply a patch (if this sounds like gibberish, see <xref
528 linkend="sec:mq:patch"/> for a
529 discussion of the <command>patch</command> command). Also,
530 Mercurial's merge machinery will handle files and directories
531 being renamed, permission changes, and modifications to binary
532 files, none of which <command>patch</command> can deal
533 with.</para>
535 </sect2>
536 </sect1>
537 <sect1 id="sec:undo:aaaiiieee">
538 <title>Changes that should never have been</title>
540 <para>Most of the time, the <command role="hg-cmd">hg
541 backout</command> command is exactly what you need if you want
542 to undo the effects of a change. It leaves a permanent record
543 of exactly what you did, both when committing the original
544 changeset and when you cleaned up after it.</para>
546 <para>On rare occasions, though, you may find that you've
547 committed a change that really should not be present in the
548 repository at all. For example, it would be very unusual, and
549 usually considered a mistake, to commit a software project's
550 object files as well as its source files. Object files have
551 almost no intrinsic value, and they're <emphasis>big</emphasis>,
552 so they increase the size of the repository and the amount of
553 time it takes to clone or pull changes.</para>
555 <para>Before I discuss the options that you have if you commit a
556 <quote>brown paper bag</quote> change (the kind that's so bad
557 that you want to pull a brown paper bag over your head), let me
558 first discuss some approaches that probably won't work.</para>
560 <para>Since Mercurial treats history as accumulative&emdash;every
561 change builds on top of all changes that preceded it&emdash;you
562 generally can't just make disastrous changes disappear. The one
563 exception is when you've just committed a change, and it hasn't
564 been pushed or pulled into another repository. That's when you
565 can safely use the <command role="hg-cmd">hg rollback</command>
566 command, as I detailed in section <xref
567 linkend="sec:undo:rollback"/>.</para>
569 <para>After you've pushed a bad change to another repository, you
570 <emphasis>could</emphasis> still use <command role="hg-cmd">hg
571 rollback</command> to make your local copy of the change
572 disappear, but it won't have the consequences you want. The
573 change will still be present in the remote repository, so it
574 will reappear in your local repository the next time you
575 pull.</para>
577 <para>If a situation like this arises, and you know which
578 repositories your bad change has propagated into, you can
579 <emphasis>try</emphasis> to get rid of the change from
580 <emphasis>every</emphasis> one of those repositories. This is,
581 of course, not a satisfactory solution: if you miss even a
582 single repository while you're expunging, the change is still
583 <quote>in the wild</quote>, and could propagate further.</para>
585 <para>If you've committed one or more changes
586 <emphasis>after</emphasis> the change that you'd like to see
587 disappear, your options are further reduced. Mercurial doesn't
588 provide a way to <quote>punch a hole</quote> in history, leaving
589 changesets intact.</para>
591 <para>XXX This needs filling out. The
592 <literal>hg-replay</literal> script in the
593 <literal>examples</literal> directory works, but doesn't handle
594 merge changesets. Kind of an important omission.</para>
596 <sect2>
597 <title>Protect yourself from <quote>escaped</quote>
598 changes</title>
600 <para>If you've committed some changes to your local repository
601 and they've been pushed or pulled somewhere else, this isn't
602 necessarily a disaster. You can protect yourself ahead of
603 time against some classes of bad changeset. This is
604 particularly easy if your team usually pulls changes from a
605 central repository.</para>
607 <para>By configuring some hooks on that repository to validate
608 incoming changesets (see chapter <xref linkend="chap:hook"/>),
609 you can
610 automatically prevent some kinds of bad changeset from being
611 pushed to the central repository at all. With such a
612 configuration in place, some kinds of bad changeset will
613 naturally tend to <quote>die out</quote> because they can't
614 propagate into the central repository. Better yet, this
615 happens without any need for explicit intervention.</para>
617 <para>For instance, an incoming change hook that verifies that a
618 changeset will actually compile can prevent people from
619 inadvertently <quote>breaking the build</quote>.</para>
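<para>The check at the heart of such a hook might look like the
following Python sketch. It only shows how to turn a build command
into an accept or reject decision; the function names are
hypothetical, I'm assuming the usual convention that an exit status
of zero means <quote>accept</quote>, and wiring the check into a
hook is a topic for the chapter on hooks.</para>

```python
import subprocess
import sys


def build_succeeds(command, cwd=None):
    """Return True when the build command exits with status 0."""
    completed = subprocess.run(command, cwd=cwd)
    return completed.returncode == 0


def exit_status(command, cwd=None):
    """Map the build result to an exit status: 0 accepts, 1 rejects."""
    return 0 if build_succeeds(command, cwd) else 1


# Stand-ins for a real build: tiny Python processes that exit with 0 or 1.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
print(exit_status(passing), exit_status(failing))   # prints 0 1
```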
621 </sect2>
622 </sect1>
623 <sect1 id="sec:undo:bisect">
624 <title>Finding the source of a bug</title>
626 <para>While it's all very well to be able to back out a changeset
627 that introduced a bug, this requires that you know which
628 changeset to back out. Mercurial provides an invaluable
629 command, called <command role="hg-cmd">hg bisect</command>, that
630 helps you to automate this process and accomplish it very
631 efficiently.</para>
633 <para>The idea behind the <command role="hg-cmd">hg
634 bisect</command> command is that a changeset has introduced
635 some change of behaviour that you can identify with a simple
636 binary test. You don't know which piece of code introduced the
637 change, but you know how to test for the presence of the bug.
638 The <command role="hg-cmd">hg bisect</command> command uses your
639 test to direct its search for the changeset that introduced the
640 code that caused the bug.</para>
642 <para>Here are a few scenarios to help you understand how you
643 might apply this command.</para>
644 <itemizedlist>
645 <listitem><para>The most recent version of your software has a
646 bug that you remember wasn't present a few weeks ago, but
647 you don't know when it was introduced. Here, your binary
648 test checks for the presence of that bug.</para>
649 </listitem>
650 <listitem><para>You fixed a bug in a rush, and now it's time to
651 close the entry in your team's bug database. The bug
652 database requires a changeset ID when you close an entry,
653 but you don't remember which changeset you fixed the bug in.
654 Once again, your binary test checks for the presence of the
655 bug.</para>
656 </listitem>
657 <listitem><para>Your software works correctly, but runs 15%
658 slower than the last time you measured it. You want to know
659 which changeset introduced the performance regression. In
660 this case, your binary test measures the performance of your
661 software, to see whether it's <quote>fast</quote> or
662 <quote>slow</quote>.</para>
663 </listitem>
664 <listitem><para>The sizes of the components of your project that
665 you ship exploded recently, and you suspect that something
666 changed in the way you build your project.</para>
667 </listitem></itemizedlist>
669 <para>From these examples, it should be clear that the <command
670 role="hg-cmd">hg bisect</command> command is useful for more
671 than just finding the sources of bugs. You can use it to find any
672 <quote>emergent property</quote> of a repository (anything that
673 you can't find from a simple text search of the files in the
674 tree) for which you can write a binary test.</para>
676 <para>We'll introduce a little bit of terminology here, just to
677 make it clear which parts of the search process are your
678 responsibility, and which are Mercurial's. A
679 <emphasis>test</emphasis> is something that
680 <emphasis>you</emphasis> run when <command role="hg-cmd">hg
681 bisect</command> chooses a changeset. A
682 <emphasis>probe</emphasis> is what <command role="hg-cmd">hg
683 bisect</command> runs to tell whether a revision is good.
684 Finally, we'll use the word <quote>bisect</quote>, as both a
685 noun and a verb, to stand in for the phrase <quote>search using
686 the <command role="hg-cmd">hg bisect</command>
687 command</quote>.</para>
<para>One way to automate the search would be simply to probe
every changeset. However, this scales poorly.
691 If it took ten minutes to test a single changeset, and you had
692 10,000 changesets in your repository, the exhaustive approach
693 would take on average 35 <emphasis>days</emphasis> to find the
694 changeset that introduced a bug. Even if you knew that the bug
695 was introduced by one of the last 500 changesets, and limited
696 your search to those, you'd still be looking at over 40 hours to
697 find the changeset that introduced your bug.</para>
699 <para>What the <command role="hg-cmd">hg bisect</command> command
700 does is use its knowledge of the <quote>shape</quote> of your
701 project's revision history to perform a search in time
702 proportional to the <emphasis>logarithm</emphasis> of the number
703 of changesets to check (the kind of search it performs is called
704 a dichotomic search). With this approach, searching through
705 10,000 changesets will take less than three hours, even at ten
706 minutes per test (the search will require about 14 tests).
707 Limit your search to the last hundred changesets, and it will
708 take only about an hour (roughly seven tests).</para>
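<para>To check the arithmetic behind these estimates, here is a
small Python sketch. It is only an approximation: it assumes a
strictly linear history and a fixed ten minutes per test, and the
function names are made up for illustration rather than taken from
Mercurial.</para>
<programlisting><![CDATA[
import math

MINUTES_PER_TEST = 10  # the figure used in the text


def exhaustive_minutes(changesets: int) -> float:
    """Average cost of testing changesets one at a time.

    On average the culprit turns up halfway through the range.
    """
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets: int) -> int:
    """Approximate number of tests for a binary search over a linear range."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets: int) -> int:
    """Time spent testing when the range is searched by bisection."""
    return bisect_tests(changesets) * MINUTES_PER_TEST


if __name__ == "__main__":
    for n in (100, 500, 10_000):
        print(f"{n:>6} changesets: "
              f"exhaustive ~{exhaustive_minutes(n) / 60:.0f} h, "
              f"bisect ~{bisect_tests(n)} tests ({bisect_minutes(n)} min)")
]]></programlisting>
<para>For 10,000 changesets the sketch gives 14 tests, or 140
minutes, against roughly 35 days for the exhaustive search; for 100
changesets it gives seven tests.</para>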
710 <para>The <command role="hg-cmd">hg bisect</command> command is
711 aware of the <quote>branchy</quote> nature of a Mercurial
712 project's revision history, so it has no problems dealing with
713 branches, merges, or multiple heads in a repository. It can
714 prune entire branches of history with a single probe, which is
715 how it operates so efficiently.</para>
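<para>The following Python sketch illustrates the idea of pruning on
a toy history with two branches and a merge. It is a simplified
model, not Mercurial's implementation: it assumes that once a bug is
introduced, every descendant changeset also has it, and the names
<literal>ancestors</literal>, <literal>prune</literal> and
<literal>PARENTS</literal> are hypothetical.</para>
<programlisting><![CDATA[
from typing import Dict, List, Set


def ancestors(rev: int, parents: Dict[int, List[int]]) -> Set[int]:
    """Return `rev` together with every changeset it descends from."""
    seen: Set[int] = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def prune(candidates: Set[int], probe: int, probe_is_bad: bool,
          parents: Dict[int, List[int]]) -> Set[int]:
    """Narrow the set of changesets that could have introduced the bug."""
    history = ancestors(probe, parents)
    if probe_is_bad:
        # The culprit is the probe itself or one of its ancestors.
        return candidates & history
    # A good probe clears itself and everything it is built on.
    return candidates - history


# A toy history: 0 is the root, 1-2 and 3-4 are two branches, 5 merges them.
PARENTS = {1: [0], 2: [1], 3: [0], 4: [3], 5: [2, 4]}
]]></programlisting>
<para>With revision 0 known to be good and revision 5 known to be
bad, the candidates are revisions 1 through 5. Calling
<literal>prune({1, 2, 3, 4, 5}, 2, False, PARENTS)</literal> leaves
only revisions 3, 4 and 5: a single good result has cleared the
whole branch made up of revisions 1 and 2.</para>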
717 <sect2>
718 <title>Using the <command role="hg-cmd">hg bisect</command>
719 command</title>
721 <para>Here's an example of <command role="hg-cmd">hg
722 bisect</command> in action.</para>
724 <note>
725 <para> In versions 0.9.5 and earlier of Mercurial, <command
726 role="hg-cmd">hg bisect</command> was not a core command:
727 it was distributed with Mercurial as an extension. This
728 section describes the built-in command, not the old
729 extension.</para>
730 </note>
732 <para>Now let's create a repository, so that we can try out the
733 <command role="hg-cmd">hg bisect</command> command in
734 isolation. <!-- &interaction.bisect.init; --> We'll simulate a
735 project that has a bug in it in a simple-minded way: create
736 trivial changes in a loop, and nominate one specific change
737 that will have the <quote>bug</quote>. This loop creates 35
738 changesets, each adding a single file to the repository.
739 We'll represent our <quote>bug</quote> with a file that
740 contains the text <quote>i have a gub</quote>. <!--
741 &interaction.bisect.commits; --></para>
743 <para>The next thing that we'd like to do is figure out how to
744 use the <command role="hg-cmd">hg bisect</command> command.
745 We can use Mercurial's normal built-in help mechanism for
746 this. <!-- &interaction.bisect.help; --></para>
748 <para>The <command role="hg-cmd">hg bisect</command> command
749 works in steps. Each step proceeds as follows.</para>
750 <orderedlist>
751 <listitem><para>You run your binary test.</para>
752 <itemizedlist>
<listitem><para>If the test succeeded, you tell <command
role="hg-cmd">hg bisect</command> by running the
<command role="hg-cmd">hg bisect --good</command>
command.</para>
</listitem>
<listitem><para>If it failed, run the <command
role="hg-cmd">hg bisect --bad</command>
command.</para></listitem></itemizedlist>
761 </listitem>
762 <listitem><para>The command uses your information to decide
763 which changeset to test next.</para>
764 </listitem>
765 <listitem><para>It updates the working directory to that
766 changeset, and the process begins again.</para>
767 </listitem></orderedlist>
768 <para>The process ends when <command role="hg-cmd">hg
769 bisect</command> identifies a unique changeset that marks
770 the point where your test transitioned from
771 <quote>succeeding</quote> to <quote>failing</quote>.</para>
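<para>The loop below is a minimal Python sketch of these steps for
the simplest possible case: a linear history in which every revision
after the first bad one is also bad. It is not how Mercurial
implements the command, which also has to cope with branches and
merges, and the names <literal>bisect_linear</literal> and
<literal>is_bad</literal> are hypothetical.</para>
<programlisting><![CDATA[
from typing import Callable, Tuple


def bisect_linear(good: int, bad: int,
                  is_bad: Callable[[int], bool]) -> Tuple[int, int]:
    """Find the first bad revision in a linear history.

    `good` is a revision known to pass the test and `bad` a later one
    known to fail it.  Returns (first bad revision, number of tests run).
    """
    tests = 0
    while bad - good > 1:
        probe = (good + bad) // 2   # the changeset to test next
        tests += 1
        if is_bad(probe):           # the test failed: mark the probe bad
            bad = probe
        else:                       # the test succeeded: mark the probe good
            good = probe
    return bad, tests
]]></programlisting>
<para>Take 35 revisions numbered 0 to 34, a known-good revision 10,
a known-bad revision 34, and a bug that (purely for illustration)
first appears in revision 22. Then <literal>bisect_linear(10, 34,
lambda rev: rev &gt;= 22)</literal> returns <literal>(22,
5)</literal>: the culprit, found in five tests.</para>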
773 <para>To start the search, we must run the <command
774 role="hg-cmd">hg bisect --reset</command> command. <!--
775 &interaction.bisect.search.init; --></para>
777 <para>In our case, the binary test we use is simple: we check to
778 see if any file in the repository contains the string <quote>i
779 have a gub</quote>. If it does, this changeset contains the
780 change that <quote>caused the bug</quote>. By convention, a
781 changeset that has the property we're searching for is
782 <quote>bad</quote>, while one that doesn't is
783 <quote>good</quote>.</para>
785 <para>Most of the time, the revision to which the working
786 directory is synced (usually the tip) already exhibits the
787 problem introduced by the buggy change, so we'll mark it as
788 <quote>bad</quote>. <!-- &interaction.bisect.search.bad-init;
789 --></para>
791 <para>Our next task is to nominate a changeset that we know
792 <emphasis>doesn't</emphasis> have the bug; the <command
793 role="hg-cmd">hg bisect</command> command will
794 <quote>bracket</quote> its search between the first pair of
795 good and bad changesets. In our case, we know that revision
10 didn't have the bug. (I'll say more about choosing
the first <quote>good</quote> changeset later.) <!--
798 &interaction.bisect.search.good-init; --></para>
800 <para>Notice that this command printed some output.</para>
801 <itemizedlist>
802 <listitem><para>It told us how many changesets it must
803 consider before it can identify the one that introduced
804 the bug, and how many tests that will require.</para>
805 </listitem>
806 <listitem><para>It updated the working directory to the next
807 changeset to test, and told us which changeset it's
808 testing.</para>
809 </listitem></itemizedlist>
811 <para>We now run our test in the working directory. We use the
812 <command>grep</command> command to see if our
813 <quote>bad</quote> file is present in the working directory.
814 If it is, this revision is bad; if not, this revision is good.
815 <!-- &interaction.bisect.search.step1; --></para>
817 <para>This test looks like a perfect candidate for automation,
818 so let's turn it into a shell function. <!--
819 &interaction.bisect.search.mytest; --> We can now run an
820 entire test step with a single command,
821 <literal>mytest</literal>. <!--
822 &interaction.bisect.search.step2; --> A few more invocations
823 of our canned test step command, and we're done. <!--
824 &interaction.bisect.search.rest; --></para>
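<para>If you would rather script the test in Python than in the
shell, the following sketch performs the same check: it looks for
the marker text in every file under a directory and turns the result
into a verdict. The function names are hypothetical, and the step of
reporting the verdict to Mercurial is deliberately left out.</para>
<programlisting><![CDATA[
import os

MARKER = "i have a gub"  # the text that stands for the bug in the tutorial


def has_gub(root: str) -> bool:
    """Return True if any file under `root` contains the marker text."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Mercurial's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                if MARKER in f.read():
                    return True
    return False


def verdict(root: str) -> str:
    """Map the test result to the word that should be reported."""
    return "bad" if has_gub(root) else "good"
]]></programlisting>
<para>Calling <literal>verdict(".")</literal> in the working
directory returns <literal>"bad"</literal> if the marker is present
and <literal>"good"</literal> otherwise, so the same question always
produces the same answer.</para>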
<para>Even though we had 35 changesets to search through, the
827 <command role="hg-cmd">hg bisect</command> command let us find
828 the changeset that introduced our <quote>bug</quote> with only
829 five tests. Because the number of tests that the <command
830 role="hg-cmd">hg bisect</command> command performs grows
831 logarithmically with the number of changesets to search, the
832 advantage that it has over the <quote>brute force</quote>
833 search approach increases with every changeset you add.</para>
835 </sect2>
836 <sect2>
837 <title>Cleaning up after your search</title>
<para>When you're finished using the <command role="hg-cmd">hg
bisect</command> command in a repository, you can use the
<command role="hg-cmd">hg bisect --reset</command> command to
drop the information it was using to drive your search. That
information doesn't take up much space, so it doesn't matter if you
forget to run this command. However, <command
role="hg-cmd">hg bisect</command> won't let you start a new
search in that repository until you run <command
role="hg-cmd">hg bisect --reset</command>. <!--
&interaction.bisect.search.reset; --></para>
850 </sect2>
851 </sect1>
852 <sect1>
853 <title>Tips for finding bugs effectively</title>
855 <sect2>
856 <title>Give consistent input</title>
858 <para>The <command role="hg-cmd">hg bisect</command> command
859 requires that you correctly report the result of every test
860 you perform. If you tell it that a test failed when it really
861 succeeded, it <emphasis>might</emphasis> be able to detect the
862 inconsistency. If it can identify an inconsistency in your
863 reports, it will tell you that a particular changeset is both
864 good and bad. However, it can't do this perfectly; it's about
865 as likely to report the wrong changeset as the source of the
866 bug.</para>
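<para>To see how a single wrong answer can derail a search without
being noticed, consider this Python sketch of a search over a linear
history. It is a simplified model rather than Mercurial's actual
algorithm, and the revision numbers are invented for the
illustration.</para>
<programlisting><![CDATA[
from typing import Callable


def first_bad(good: int, bad: int, report: Callable[[int], bool]) -> int:
    """Binary search between a good and a bad revision.

    `report(rev)` is what the user tells the search: True means "bad".
    """
    while bad - good > 1:
        probe = (good + bad) // 2
        if report(probe):
            bad = probe
        else:
            good = probe
    return bad


def truthful(rev: int) -> bool:
    """The bug really first appears in revision 22 (an invented number)."""
    return rev >= 22


def careless(rev: int) -> bool:
    """Same as truthful(), except that revision 22 is misreported as good."""
    return False if rev == 22 else truthful(rev)


if __name__ == "__main__":
    print(first_bad(10, 34, truthful))  # prints 22
    print(first_bad(10, 34, careless))  # prints 23
]]></programlisting>
<para>The careless run never contradicts itself, so nothing flags
the mistake; the search simply blames revision 23 instead of
revision 22.</para>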
868 </sect2>
869 <sect2>
870 <title>Automate as much as possible</title>
872 <para>When I started using the <command role="hg-cmd">hg
873 bisect</command> command, I tried a few times to run my
874 tests by hand, on the command line. This is an approach that
875 I, at least, am not suited to. After a few tries, I found
876 that I was making enough mistakes that I was having to restart
877 my searches several times before finally getting correct
878 results.</para>
880 <para>My initial problems with driving the <command
881 role="hg-cmd">hg bisect</command> command by hand occurred
882 even with simple searches on small repositories; if the
883 problem you're looking for is more subtle, or the number of
884 tests that <command role="hg-cmd">hg bisect</command> must
885 perform increases, the likelihood of operator error ruining
886 the search is much higher. Once I started automating my
887 tests, I had much better results.</para>
889 <para>The key to automated testing is twofold:</para>
890 <itemizedlist>
891 <listitem><para>always test for the same symptom, and</para>
892 </listitem>
893 <listitem><para>always feed consistent input to the <command
894 role="hg-cmd">hg bisect</command> command.</para>
895 </listitem></itemizedlist>
896 <para>In my tutorial example above, the <command>grep</command>
897 command tests for the symptom, and the <literal>if</literal>
898 statement takes the result of this check and ensures that we
899 always feed the same input to the <command role="hg-cmd">hg
900 bisect</command> command. The <literal>mytest</literal>
901 function marries these together in a reproducible way, so that
902 every test is uniform and consistent.</para>
904 </sect2>
905 <sect2>
906 <title>Check your results</title>
908 <para>Because the output of a <command role="hg-cmd">hg
909 bisect</command> search is only as good as the input you
910 give it, don't take the changeset it reports as the absolute
911 truth. A simple way to cross-check its report is to manually
912 run your test at each of the following changesets:</para>
913 <itemizedlist>
914 <listitem><para>The changeset that it reports as the first bad
915 revision. Your test should still report this as
916 bad.</para>
917 </listitem>
918 <listitem><para>The parent of that changeset (either parent,
919 if it's a merge). Your test should report this changeset
920 as good.</para>
921 </listitem>
922 <listitem><para>A child of that changeset. Your test should
923 report this changeset as bad.</para>
924 </listitem></itemizedlist>
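<para>These three checks are easy to script. The Python sketch below
assumes a hypothetical <literal>is_bad</literal> function that
updates the working directory to a given revision and runs your
test; you supply the parents and children you want to check.
Passing both parents of a merge is stricter than the list above
requires.</para>
<programlisting><![CDATA[
from typing import Callable, Iterable, List


def cross_check(first_bad: int,
                parents: Iterable[int],
                children: Iterable[int],
                is_bad: Callable[[int], bool]) -> List[str]:
    """Return a list of complaints; an empty list means the report holds up."""
    complaints = []
    if not is_bad(first_bad):
        complaints.append(f"reported changeset {first_bad} tests good")
    for rev in parents:
        if is_bad(rev):
            complaints.append(f"parent {rev} tests bad")
    for rev in children:
        if not is_bad(rev):
            complaints.append(f"child {rev} tests good")
    return complaints
]]></programlisting>
<para>If the bug really first appears in revision 22 but the search
reported revision 23, then <literal>cross_check(23, [22], [24],
is_bad)</literal> returns one complaint, about the parent, which
tells you to repeat the search.</para>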
926 </sect2>
927 <sect2>
928 <title>Beware interference between bugs</title>
930 <para>It's possible that your search for one bug could be
931 disrupted by the presence of another. For example, let's say
932 your software crashes at revision 100, and worked correctly at
933 revision 50. Unknown to you, someone else introduced a
934 different crashing bug at revision 60, and fixed it at
935 revision 80. This could distort your results in one of
936 several ways.</para>
938 <para>It is possible that this other bug completely
939 <quote>masks</quote> yours, which is to say that it occurs
940 before your bug has a chance to manifest itself. If you can't
941 avoid that other bug (for example, it prevents your project
942 from building), and so can't tell whether your bug is present
943 in a particular changeset, the <command role="hg-cmd">hg
944 bisect</command> command cannot help you directly. Instead,
945 you can mark a changeset as untested by running <command
946 role="hg-cmd">hg bisect --skip</command>.</para>
948 <para>A different problem could arise if your test for a bug's
949 presence is not specific enough. If you check for <quote>my
950 program crashes</quote>, then both your crashing bug and an
951 unrelated crashing bug that masks it will look like the same
952 thing, and mislead <command role="hg-cmd">hg
953 bisect</command>.</para>
955 <para>Another useful situation in which to use <command
956 role="hg-cmd">hg bisect --skip</command> is if you can't
957 test a revision because your project was in a broken and hence
958 untestable state at that revision, perhaps because someone
959 checked in a change that prevented the project from
960 building.</para>
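<para>One way to keep a test specific is to give it three possible
answers instead of two. The Python sketch below is only an
illustration: the idea of identifying a crash by the place where it
happens, and all of the names used, are assumptions rather than
anything Mercurial provides.</para>
<programlisting><![CDATA[
from typing import Optional


def classify(built: bool, crash_site: Optional[str],
             my_crash_site: str) -> str:
    """Decide what to report for one changeset: "good", "bad" or "skip".

    built          -- did the project build at this changeset?
    crash_site     -- where the program crashed, or None if it ran cleanly
    my_crash_site  -- where the bug being hunted is known to crash
    """
    if not built:
        return "skip"   # untestable: nothing can be said about our bug
    if crash_site is None:
        return "good"   # ran cleanly, so our bug is absent
    if crash_site == my_crash_site:
        return "bad"    # this is the crash we are looking for
    return "skip"       # some other crash got in first and masks ours
]]></programlisting>
<para>A changeset is reported as bad only when the crash matches the
one you are hunting. A changeset that fails to build, or that
crashes somewhere else, is reported as one to skip instead of being
counted as bad.</para>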
962 </sect2>
963 <sect2>
964 <title>Bracket your search lazily</title>
966 <para>Choosing the first <quote>good</quote> and
967 <quote>bad</quote> changesets that will mark the end points of
968 your search is often easy, but it bears a little discussion
969 nevertheless. From the perspective of <command
970 role="hg-cmd">hg bisect</command>, the <quote>newest</quote>
971 changeset is conventionally <quote>bad</quote>, and the older
972 changeset is <quote>good</quote>.</para>
974 <para>If you're having trouble remembering when a suitable
975 <quote>good</quote> change was, so that you can tell <command
976 role="hg-cmd">hg bisect</command>, you could do worse than
977 testing changesets at random. Just remember to eliminate
978 contenders that can't possibly exhibit the bug (perhaps
979 because the feature with the bug isn't present yet) and those
980 where another problem masks the bug (as I discussed
981 above).</para>
983 <para>Even if you end up <quote>early</quote> by thousands of
984 changesets or months of history, you will only add a handful
985 of tests to the total number that <command role="hg-cmd">hg
986 bisect</command> must perform, thanks to its logarithmic
987 behaviour.</para>
989 </sect2>
990 </sect1>
991 </chapter>
993 <!--
994 local variables:
995 sgml-parent-document: ("00book.xml" "book" "chapter")
996 end:
997 -->