hgbook

view en/ch09-undo.xml @ 569:60ee738fdc0e

Fix quoting of entity declarations.
author Bryan O'Sullivan <bos@serpentine.com>
date Mon Mar 09 23:32:15 2009 -0700 (2009-03-09)
parents b90b024729f1
children 13513d2a128d
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <chapter id="chap:undo">
4 <title>Finding and fixing your mistakes</title>
6 <para>To err might be human, but to really handle the consequences
7 well takes a top-notch revision control system. In this chapter,
8 we'll discuss some of the techniques you can use when you find
9 that a problem has crept into your project. Mercurial has some
10 highly capable features that will help you to isolate the sources
11 of problems, and to handle them appropriately.</para>
13 <sect1>
14 <title>Erasing local history</title>
16 <sect2>
17 <title>The accidental commit</title>
19 <para>I have the occasional but persistent problem of typing
20 rather more quickly than I can think, which sometimes results
21 in me committing a changeset that is either incomplete or
22 plain wrong. In my case, the usual kind of incomplete
23 changeset is one in which I've created a new source file, but
24 forgotten to <command role="hg-cmd">hg add</command> it. A
25 <quote>plain wrong</quote> changeset is not as common, but no
26 less annoying.</para>
28 </sect2>
29 <sect2 id="sec:undo:rollback">
30 <title>Rolling back a transaction</title>
32 <para>In section <xref linkend="sec:concepts:txn"/>, I mentioned
33 that Mercurial treats each modification of a repository as a
34 <emphasis>transaction</emphasis>. Every time you commit a
35 changeset or pull changes from another repository, Mercurial
36 remembers what you did. You can undo, or <emphasis>roll
37 back</emphasis>, exactly one of these actions using the
38 <command role="hg-cmd">hg rollback</command> command. (See
39 section <xref linkend="sec:undo:rollback-after-push"/> for an
40 important caveat about the use of this command.)</para>
42 <para>Here's a mistake that I often find myself making:
43 committing a change in which I've created a new file, but
44 forgotten to <command role="hg-cmd">hg add</command>
45 it.</para>
47 &interaction.rollback.commit;
49 <para>Looking at the output of <command role="hg-cmd">hg
50 status</command> after the commit immediately confirms the
51 error.</para>
53 &interaction.rollback.status;
55 <para>The commit captured the changes to the file
56 <filename>a</filename>, but not the new file
57 <filename>b</filename>. If I were to push this changeset to a
58 repository that I shared with a colleague, the chances are
59 high that something in <filename>a</filename> would refer to
60 <filename>b</filename>, which would not be present in their
61 repository when they pulled my changes. I would thus become
62 the object of some indignation.</para>
64 <para>However, luck is with me&emdash;I've caught my error
65 before I pushed the changeset. I use the <command
66 role="hg-cmd">hg rollback</command> command, and Mercurial
67 makes that last changeset vanish.</para>
69 &interaction.rollback.rollback;
71 <para>Notice that the changeset is no longer present in the
72 repository's history, and the working directory once again
73 thinks that the file <filename>a</filename> is modified. The
74 commit and rollback have left the working directory exactly as
75 it was prior to the commit; the changeset has been completely
76 erased. I can now safely <command role="hg-cmd">hg
77 add</command> the file <filename>b</filename>, and rerun my
78 commit.</para>
80 &interaction.rollback.add;
82 </sect2>
83 <sect2>
84 <title>The erroneous pull</title>
86 <para>It's common practice with Mercurial to maintain separate
87 development branches of a project in different repositories.
88 Your development team might have one shared repository for
89 your project's <quote>0.9</quote> release, and another,
90 containing different changes, for the <quote>1.0</quote>
91 release.</para>
93 <para>Given this, you can imagine that the consequences could be
94 messy if you had a local <quote>0.9</quote> repository, and
95 accidentally pulled changes from the shared <quote>1.0</quote>
96 repository into it. At worst, you could be paying
97 insufficient attention, and push those changes into the shared
98 <quote>0.9</quote> tree, confusing your entire team (but don't
99 worry, we'll return to this horror scenario later). However,
100 it's more likely that you'll notice immediately, because
101 Mercurial will display the URL it's pulling from, or you will
102 see it pull a suspiciously large number of changes into the
103 repository.</para>
105 <para>The <command role="hg-cmd">hg rollback</command> command
106 will work nicely to expunge all of the changesets that you
107 just pulled. Mercurial groups all changes from one <command
108 role="hg-cmd">hg pull</command> into a single transaction,
109 so one <command role="hg-cmd">hg rollback</command> is all you
110 need to undo this mistake.</para>
112 </sect2>
113 <sect2 id="sec:undo:rollback-after-push">
114 <title>Rolling back is useless once you've pushed</title>
116 <para>The value of the <command role="hg-cmd">hg
117 rollback</command> command drops to zero once you've pushed
118 your changes to another repository. Rolling back a change
119 makes it disappear entirely, but <emphasis>only</emphasis> in
120 the repository in which you perform the <command
121 role="hg-cmd">hg rollback</command>. Because a rollback
122 eliminates history, there's no way for the disappearance of a
123 change to propagate between repositories.</para>
125 <para>If you've pushed a change to another
126 repository&emdash;particularly if it's a shared
127 repository&emdash;it has essentially <quote>escaped into the
128 wild,</quote> and you'll have to recover from your mistake
129 in a different way. If you push a changeset somewhere, then
130 roll it back, then pull from the repository you pushed to,
131 the changeset will reappear in your
132 repository.</para>
134 <para>(If you absolutely know for sure that the change you want
135 to roll back is the most recent change in the repository that
136 you pushed to, <emphasis>and</emphasis> you know that nobody
137 else could have pulled it from that repository, you can roll
138 back the changeset there, too, but you really should
139 not rely on this working reliably. If you do this, sooner or
140 later a change really will make it into a repository that you
141 don't directly control (or have forgotten about), and come
142 back to bite you.)</para>
144 </sect2>
145 <sect2>
146 <title>You can only roll back once</title>
148 <para>Mercurial stores exactly one transaction in its
149 transaction log; that transaction is the most recent one that
150 occurred in the repository. This means that you can only roll
151 back one transaction. If you expect to be able to roll back
152 one transaction, then its predecessor, this is not the
153 behaviour you will get.</para>
155 &interaction.rollback.twice;
157 <para>Once you've rolled back one transaction in a repository,
158 you can't roll back again in that repository until you perform
159 another commit or pull.</para>
161 </sect2>
162 </sect1>
163 <sect1>
164 <title>Reverting the mistaken change</title>
166 <para>If you make a modification to a file, and decide that you
167 really didn't want to change the file at all, and you haven't
168 yet committed your changes, the <command role="hg-cmd">hg
169 revert</command> command is the one you'll need. It looks at
170 the changeset that's the parent of the working directory, and
171 restores the contents of the file to their state as of that
172 changeset. (That's a long-winded way of saying that, in the
173 normal case, it undoes your modifications.)</para>
175 <para>Let's illustrate how the <command role="hg-cmd">hg
176 revert</command> command works with yet another small example.
177 We'll begin by modifying a file that Mercurial is already
178 tracking.</para>
180 &interaction.daily.revert.modify;
182 <para>If we don't
183 want that change, we can simply <command role="hg-cmd">hg
184 revert</command> the file.</para>
186 &interaction.daily.revert.unmodify;
188 <para>The <command role="hg-cmd">hg revert</command> command
189 provides us with an extra degree of safety by saving our
190 modified file with a <filename>.orig</filename>
191 extension.</para>
193 &interaction.daily.revert.status;
195 <para>Here is a summary of the cases that the <command
196 role="hg-cmd">hg revert</command> command can deal with. We
197 will describe each of these in more detail in the section that
198 follows.</para>
199 <itemizedlist>
200 <listitem><para>If you modify a file, it will restore the file
201 to its unmodified state.</para>
202 </listitem>
203 <listitem><para>If you <command role="hg-cmd">hg add</command> a
204 file, it will undo the <quote>added</quote> state of the
205 file, but leave the file itself untouched.</para>
206 </listitem>
207 <listitem><para>If you delete a file without telling Mercurial,
208 it will restore the file to its unmodified contents.</para>
209 </listitem>
210 <listitem><para>If you use the <command role="hg-cmd">hg
211 remove</command> command to remove a file, it will undo
212 the <quote>removed</quote> state of the file, and restore
213 the file to its unmodified contents.</para>
214 </listitem></itemizedlist>
216 <sect2 id="sec:undo:mgmt">
217 <title>File management errors</title>
219 <para>The <command role="hg-cmd">hg revert</command> command is
220 useful for more than just modified files. It lets you reverse
221 the results of all of Mercurial's file management
222 commands&emdash;<command role="hg-cmd">hg add</command>,
223 <command role="hg-cmd">hg remove</command>, and so on.</para>
225 <para>If you <command role="hg-cmd">hg add</command> a file,
226 then decide that in fact you don't want Mercurial to track it,
227 use <command role="hg-cmd">hg revert</command> to undo the
228 add. Don't worry; Mercurial will not modify the file in any
229 way. It will just <quote>unmark</quote> the file.</para>
231 &interaction.daily.revert.add;
233 <para>Similarly, if you ask Mercurial to <command
234 role="hg-cmd">hg remove</command> a file, you can use
235 <command role="hg-cmd">hg revert</command> to restore it to
236 the contents it had as of the parent of the working directory.
237 &interaction.daily.revert.remove; This works just as
238 well for a file that you deleted by hand, without telling
239 Mercurial (recall that in Mercurial terminology, this kind of
240 file is called <quote>missing</quote>).</para>
242 &interaction.daily.revert.missing;
244 <para>If you revert a <command role="hg-cmd">hg copy</command>,
245 the copied-to file remains in your working directory
246 afterwards, untracked. Since a copy doesn't affect the
247 copied-from file in any way, Mercurial doesn't do anything
248 with the copied-from file.</para>
250 &interaction.daily.revert.copy;
252 <sect3>
253 <title>A slightly special case: reverting a rename</title>
255 <para>If you <command role="hg-cmd">hg rename</command> a
256 file, there is one small detail that you should remember.
257 When you <command role="hg-cmd">hg revert</command> a
258 rename, it's not enough to provide the name of the
259 renamed-to file, as you can see here.</para>
261 &interaction.daily.revert.rename;
263 <para>As you can see from the output of <command
264 role="hg-cmd">hg status</command>, the renamed-to file is
265 no longer identified as added, but the
266 renamed-<emphasis>from</emphasis> file is still removed!
267 This is counter-intuitive (at least to me), but at least
268 it's easy to deal with.</para>
270 &interaction.daily.revert.rename-orig;
272 <para>So remember, to revert a <command role="hg-cmd">hg
273 rename</command>, you must provide
274 <emphasis>both</emphasis> the source and destination
275 names.</para>
277 <para>% TODO: the output doesn't look like it will be
278 removed!</para>
280 <para>(By the way, if you rename a file, then modify the
281 renamed-to file, then revert both components of the rename,
282 when Mercurial restores the file that was removed as part of
283 the rename, it will be unmodified. If you need the
284 modifications in the renamed-to file to show up in the
285 renamed-from file, don't forget to copy them over.)</para>
287 <para>These fiddly aspects of reverting a rename arguably
288 constitute a small bug in Mercurial.</para>
290 </sect3>
291 </sect2>
292 </sect1>
293 <sect1>
294 <title>Dealing with committed changes</title>
296 <para>Consider a case where you have committed a change
297 <emphasis>a</emphasis>, and another change <emphasis>b</emphasis> on top
298 of it; you then realise that change <emphasis>a</emphasis> was incorrect.
299 Mercurial lets you <quote>back out</quote> an entire changeset automatically,
300 and gives you building blocks that let you reverse part of a changeset by hand.</para>
302 <para>Before you read this section, here's something to keep in
303 mind: the <command role="hg-cmd">hg backout</command> command
304 undoes changes by <emphasis>adding</emphasis> history, not by
305 modifying or erasing it. It's the right tool to use if you're
306 fixing bugs, but not if you're trying to undo some change that
307 has catastrophic consequences. To deal with those, see section
308 <xref linkend="sec:undo:aaaiiieee"/>.</para>
310 <sect2>
311 <title>Backing out a changeset</title>
313 <para>The <command role="hg-cmd">hg backout</command> command
314 lets you <quote>undo</quote> the effects of an entire
315 changeset in an automated fashion. Because Mercurial's
316 history is immutable, this command <emphasis>does
317 not</emphasis> get rid of the changeset you want to undo.
318 Instead, it creates a new changeset that
319 <emphasis>reverses</emphasis> the effect of the to-be-undone
320 changeset.</para>
322 <para>The operation of the <command role="hg-cmd">hg
323 backout</command> command is a little intricate, so let's
324 illustrate it with some examples. First, we'll create a
325 repository with some simple changes.</para>
327 &interaction.backout.init;
329 <para>The <command role="hg-cmd">hg backout</command> command
330 takes a single changeset ID as its argument; this is the
331 changeset to back out. Normally, <command role="hg-cmd">hg
332 backout</command> will drop you into a text editor to write
333 a commit message, so you can record why you're backing the
334 change out. In this example, we provide a commit message on
335 the command line using the <option
336 role="hg-opt-backout">-m</option> option.</para>
338 </sect2>
339 <sect2>
340 <title>Backing out the tip changeset</title>
342 <para>We're going to start by backing out the last changeset we
343 committed.</para>
345 &interaction.backout.simple;
347 <para>You can see that the second line from
348 <filename>myfile</filename> is no longer present. Taking a
349 look at the output of <command role="hg-cmd">hg log</command>
350 gives us an idea of what the <command role="hg-cmd">hg
351 backout</command> command has done.
352 &interaction.backout.simple.log; Notice that the new changeset
353 that <command role="hg-cmd">hg backout</command> has created
354 is a child of the changeset we backed out. It's easier to see
355 this in figure <xref
356 linkend="fig:undo:backout"/>, which presents a graphical
357 view of the change history. As you can see, the history is
358 nice and linear.</para>
360 <informalfigure id="fig:undo:backout">
361 <mediaobject><imageobject><imagedata
362 fileref="undo-simple"/></imageobject><textobject><phrase>XXX
363 add text</phrase></textobject><caption><para>Backing out
364 a change using the <command role="hg-cmd">hg
365 backout</command>
366 command</para></caption></mediaobject>
368 </informalfigure>
370 </sect2>
371 <sect2>
372 <title>Backing out a non-tip change</title>
374 <para>If you want to back out a change other than the last one
375 you committed, pass the <option
376 role="hg-opt-backout">--merge</option> option to the
377 <command role="hg-cmd">hg backout</command> command.</para>
379 &interaction.backout.non-tip.clone;
381 <para>This makes backing out any changeset a
382 <quote>one-shot</quote> operation that's usually simple and
383 fast.</para>
385 &interaction.backout.non-tip.backout;
387 <para>If you take a look at the contents of
388 <filename>myfile</filename> after the backout finishes, you'll
389 see that the first and third changes are present, but not the
390 second.</para>
392 &interaction.backout.non-tip.cat;
394 <para>As the graphical history in figure <xref
395 linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial
396 actually commits <emphasis>two</emphasis> changes in this kind
397 of situation (the box-shaped nodes are the ones that Mercurial
398 commits automatically). Before Mercurial begins the backout
399 process, it first remembers what the current parent of the
400 working directory is. It then backs out the target changeset,
401 and commits that as a changeset. Finally, it merges back to
402 the previous parent of the working directory, and commits the
403 result of the merge.</para>
405 <para>% TODO: to me it looks like mercurial doesn't commit the
406 second merge automatically!</para>
408 <informalfigure id="fig:undo:backout-non-tip">
409 <mediaobject><imageobject><imagedata
410 fileref="undo-non-tip"/></imageobject><textobject><phrase>XXX
411 add text</phrase></textobject><caption><para>Automated
412 backout of a non-tip change using the <command
413 role="hg-cmd">hg backout</command>
414 command</para></caption></mediaobject>
415 </informalfigure>
417 <para>The result is that you end up <quote>back where you
418 were</quote>, only with some extra history that undoes the
419 effect of the changeset you wanted to back out.</para>
421 <sect3>
422 <title>Always use the <option
423 role="hg-opt-backout">--merge</option> option</title>
425 <para>In fact, since the <option
426 role="hg-opt-backout">--merge</option> option will do the
427 <quote>right thing</quote> whether or not the changeset
428 you're backing out is the tip (i.e. it won't try to merge if
429 it's backing out the tip, since there's no need), you should
430 <emphasis>always</emphasis> use this option when you run the
431 <command role="hg-cmd">hg backout</command> command.</para>
433 </sect3>
434 </sect2>
435 <sect2>
436 <title>Gaining more control of the backout process</title>
438 <para>While I've recommended that you always use the <option
439 role="hg-opt-backout">--merge</option> option when backing
440 out a change, the <command role="hg-cmd">hg backout</command>
441 command lets you decide how to merge a backout changeset.
442 Taking control of the backout process by hand is something you
443 will rarely need to do, but it can be useful to understand
444 what the <command role="hg-cmd">hg backout</command> command
445 is doing for you automatically. To illustrate this, let's
446 clone our first repository, but omit the backout change that
447 it contains.</para>
449 &interaction.backout.manual.clone;
451 <para>As with our
452 earlier example, we'll commit a third changeset, then back out
453 its parent, and see what happens.</para>
455 &interaction.backout.manual.backout;
457 <para>Our new changeset is again a descendant of the changeset
458 we backed out; it's thus a new head, <emphasis>not</emphasis>
459 a descendant of the changeset that was the tip. The <command
460 role="hg-cmd">hg backout</command> command was quite
461 explicit in telling us this.</para>
463 &interaction.backout.manual.log;
465 <para>Again, it's easier to see what has happened by looking at
466 a graph of the revision history, in figure <xref
467 linkend="fig:undo:backout-manual"/>. This makes it clear
468 that when we use <command role="hg-cmd">hg backout</command>
469 to back out a change other than the tip, Mercurial adds a new
470 head to the repository (the change it committed is
471 box-shaped).</para>
473 <informalfigure id="fig:undo:backout-manual">
474 <mediaobject><imageobject><imagedata
475 fileref="undo-manual"/></imageobject><textobject><phrase>XXX
476 add text</phrase></textobject><caption><para>Backing out
477 a change using the <command role="hg-cmd">hg
478 backout</command>
479 command</para></caption></mediaobject>
481 </informalfigure>
483 <para>After the <command role="hg-cmd">hg backout</command>
484 command has completed, it leaves the new
485 <quote>backout</quote> changeset as the parent of the working
486 directory.</para>
488 &interaction.backout.manual.parents;
490 <para>Now we have two isolated sets of changes.</para>
492 &interaction.backout.manual.heads;
494 <para>Let's think about what we expect to see as the contents of
495 <filename>myfile</filename> now. The first change should be
496 present, because we've never backed it out. The second change
497 should be missing, as that's the change we backed out. Since
498 the history graph shows the third change as a separate head,
499 we <emphasis>don't</emphasis> expect to see the third change
500 present in <filename>myfile</filename>.</para>
502 &interaction.backout.manual.cat;
504 <para>To get the third change back into the file, we just do a
505 normal merge of our two heads.</para>
507 &interaction.backout.manual.merge;
509 <para>Afterwards, the graphical history of our repository looks
510 like figure
511 <xref linkend="fig:undo:backout-manual-merge"/>.</para>
513 <informalfigure id="fig:undo:backout-manual-merge">
514 <mediaobject><imageobject><imagedata
515 fileref="undo-manual-merge"/></imageobject><textobject><phrase>XXX
516 add text</phrase></textobject><caption><para>Manually
517 merging a backout change</para></caption></mediaobject>
519 </informalfigure>
521 </sect2>
522 <sect2>
523 <title>Why <command role="hg-cmd">hg backout</command> works as
524 it does</title>
526 <para>Here's a brief description of how the <command
527 role="hg-cmd">hg backout</command> command works.</para>
528 <orderedlist>
529 <listitem><para>It ensures that the working directory is
530 <quote>clean</quote>, i.e. that the output of <command
531 role="hg-cmd">hg status</command> would be empty.</para>
532 </listitem>
533 <listitem><para>It remembers the current parent of the working
534 directory. Let's call this changeset
535 <literal>orig</literal>.</para>
536 </listitem>
537 <listitem><para>It does the equivalent of a <command
538 role="hg-cmd">hg update</command> to sync the working
539 directory to the changeset you want to back out. Let's
540 call this changeset <literal>backout</literal>.</para>
541 </listitem>
542 <listitem><para>It finds the parent of that changeset. Let's
543 call that changeset <literal>parent</literal>.</para>
544 </listitem>
545 <listitem><para>For each file that the
546 <literal>backout</literal> changeset affected, it does the
547 equivalent of a <command role="hg-cmd">hg revert -r
548 parent</command> on that file, to restore it to the
549 contents it had before that changeset was
550 committed.</para>
551 </listitem>
552 <listitem><para>It commits the result as a new changeset.
553 This changeset has <literal>backout</literal> as its
554 parent.</para>
555 </listitem>
556 <listitem><para>If you specify <option
557 role="hg-opt-backout">--merge</option> on the command
558 line, it merges with <literal>orig</literal>, and commits
559 the result of the merge.</para>
560 </listitem></orderedlist>
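<para>To make these steps concrete, here is a toy model in Python.
It is only a sketch, not Mercurial's implementation. It assumes that a
history is a list of <literal>(parent index, snapshot)</literal> pairs
and that a snapshot is a plain dictionary from file names to contents;
the names <literal>changed_files</literal> and
<literal>backout</literal> are invented for this illustration. It
covers the update, revert, and commit steps, and leaves out the
cleanliness check and the optional merge.</para>

<programlisting>
```python
def changed_files(parent, child):
    """Return the names of files whose contents differ between two snapshots."""
    names = set(parent) | set(child)
    return {name for name in names if parent.get(name) != child.get(name)}


def backout(history, target):
    """Append a changeset that reverses history[target]; return its index.

    history is a list of (parent_index, snapshot) pairs, and a snapshot
    maps file names to contents. This is a toy model for illustration.
    """
    parent_index, target_snapshot = history[target]
    parent_snapshot = history[parent_index][1] if parent_index is not None else {}

    # Start from the changeset being backed out (the "hg update" step).
    new_snapshot = dict(target_snapshot)

    # Restore every file the target touched to its contents in the parent
    # (the "hg revert -r parent" step).
    for name in changed_files(parent_snapshot, target_snapshot):
        if name in parent_snapshot:
            new_snapshot[name] = parent_snapshot[name]
        else:
            del new_snapshot[name]  # the target changeset added this file

    # Commit the result, with the backed-out changeset as its parent.
    history.append((target, new_snapshot))
    return len(history) - 1


if __name__ == "__main__":
    history = [
        (None, {"myfile": "first change\n"}),
        (0, {"myfile": "first change\nsecond change\n"}),
    ]
    new = backout(history, 1)
    print(new, history[new])
```
</programlisting>

<para>In this model the new changeset always has the backed-out
changeset as its parent, which is why backing out anything other than
the tip produces a new head that still has to be merged.</para>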
562 <para>An alternative way to implement the <command
563 role="hg-cmd">hg backout</command> command would be to
564 <command role="hg-cmd">hg export</command> the
565 to-be-backed-out changeset as a diff, then use the <option
566 role="cmd-opt-patch">--reverse</option> option to the
567 <command>patch</command> command to reverse the effect of the
568 change without fiddling with the working directory. This
569 sounds much simpler, but it would not work nearly as
570 well.</para>
572 <para>The reason that <command role="hg-cmd">hg
573 backout</command> does an update, a commit, a merge, and
574 another commit is to give the merge machinery the best chance
575 to do a good job when dealing with all the changes
576 <emphasis>between</emphasis> the change you're backing out and
577 the current tip.</para>
579 <para>If you're backing out a changeset that's 100 revisions
580 back in your project's history, the chances that the
581 <command>patch</command> command will be able to apply a
582 reverse diff cleanly are not good, because intervening changes
583 are likely to have <quote>broken the context</quote> that
584 <command>patch</command> uses to determine whether it can
585 apply a patch (if this sounds like gibberish, see <xref
586 linkend="sec:mq:patch"/> for a
587 discussion of the <command>patch</command> command). Also,
588 Mercurial's merge machinery will handle files and directories
589 being renamed, permission changes, and modifications to binary
590 files, none of which <command>patch</command> can deal
591 with.</para>
593 </sect2>
594 </sect1>
595 <sect1 id="sec:undo:aaaiiieee">
596 <title>Changes that should never have been</title>
598 <para>Most of the time, the <command role="hg-cmd">hg
599 backout</command> command is exactly what you need if you want
600 to undo the effects of a change. It leaves a permanent record
601 of exactly what you did, both when committing the original
602 changeset and when you cleaned up after it.</para>
604 <para>On rare occasions, though, you may find that you've
605 committed a change that really should not be present in the
606 repository at all. For example, it would be very unusual, and
607 usually considered a mistake, to commit a software project's
608 object files as well as its source files. Object files have
609 almost no intrinsic value, and they're <emphasis>big</emphasis>,
610 so they increase the size of the repository and the amount of
611 time it takes to clone or pull changes.</para>
613 <para>Before I discuss the options that you have if you commit a
614 <quote>brown paper bag</quote> change (the kind that's so bad
615 that you want to pull a brown paper bag over your head), let me
616 first discuss some approaches that probably won't work.</para>
618 <para>Since Mercurial treats history as accumulative&emdash;every
619 change builds on top of all changes that preceded it&emdash;you
620 generally can't just make disastrous changes disappear. The one
621 exception is when you've just committed a change, and it hasn't
622 been pushed or pulled into another repository. That's when you
623 can safely use the <command role="hg-cmd">hg rollback</command>
624 command, as I detailed in section <xref
625 linkend="sec:undo:rollback"/>.</para>
627 <para>After you've pushed a bad change to another repository, you
628 <emphasis>could</emphasis> still use <command role="hg-cmd">hg
629 rollback</command> to make your local copy of the change
630 disappear, but it won't have the consequences you want. The
631 change will still be present in the remote repository, so it
632 will reappear in your local repository the next time you
633 pull.</para>
635 <para>If a situation like this arises, and you know which
636 repositories your bad change has propagated into, you can
637 <emphasis>try</emphasis> to get rid of the change from
638 <emphasis>every</emphasis> one of those repositories. This is,
639 of course, not a satisfactory solution: if you miss even a
640 single repository while you're expunging, the change is still
641 <quote>in the wild</quote>, and could propagate further.</para>
643 <para>If you've committed one or more changes
644 <emphasis>after</emphasis> the change that you'd like to see
645 disappear, your options are further reduced. Mercurial doesn't
646 provide a way to <quote>punch a hole</quote> in history while leaving
647 the changesets around it intact.</para>
649 <para>XXX This needs filling out. The
650 <literal>hg-replay</literal> script in the
651 <literal>examples</literal> directory works, but doesn't handle
652 merge changesets. Kind of an important omission.</para>
654 <sect2>
655 <title>Protect yourself from <quote>escaped</quote>
656 changes</title>
658 <para>If you've committed some changes to your local repository
659 and they've been pushed or pulled somewhere else, this isn't
660 necessarily a disaster. You can protect yourself ahead of
661 time against some classes of bad changeset. This is
662 particularly easy if your team usually pulls changes from a
663 central repository.</para>
665 <para>By configuring some hooks on that repository to validate
666 incoming changesets (see chapter <xref linkend="chap:hook"/>),
667 you can
668 automatically prevent some kinds of bad changeset from being
669 pushed to the central repository at all. With such a
670 configuration in place, some kinds of bad changeset will
671 naturally tend to <quote>die out</quote> because they can't
672 propagate into the central repository. Better yet, this
673 happens without any need for explicit intervention.</para>
675 <para>For instance, an incoming change hook that verifies that a
676 changeset will actually compile can prevent people from
677 inadvertently <quote>breaking the build</quote>.</para>
679 </sect2>
680 </sect1>
681 <sect1 id="sec:undo:bisect">
682 <title>Finding the source of a bug</title>
684 <para>While it's all very well to be able to back out a changeset
685 that introduced a bug, this requires that you know which
686 changeset to back out. Mercurial provides an invaluable
687 command, called <command role="hg-cmd">hg bisect</command>, that
688 helps you to automate this process and accomplish it very
689 efficiently.</para>
691 <para>The idea behind the <command role="hg-cmd">hg
692 bisect</command> command is that a changeset has introduced
693 some change of behaviour that you can identify with a simple
694 binary test. You don't know which piece of code introduced the
695 change, but you know how to test for the presence of the bug.
696 The <command role="hg-cmd">hg bisect</command> command uses your
697 test to direct its search for the changeset that introduced the
698 code that caused the bug.</para>
700 <para>Here are a few scenarios to help you understand how you
701 might apply this command.</para>
702 <itemizedlist>
703 <listitem><para>The most recent version of your software has a
704 bug that you remember wasn't present a few weeks ago, but
705 you don't know when it was introduced. Here, your binary
706 test checks for the presence of that bug.</para>
707 </listitem>
708 <listitem><para>You fixed a bug in a rush, and now it's time to
709 close the entry in your team's bug database. The bug
710 database requires a changeset ID when you close an entry,
711 but you don't remember which changeset you fixed the bug in.
712 Once again, your binary test checks for the presence of the
713 bug.</para>
714 </listitem>
715 <listitem><para>Your software works correctly, but runs 15%
716 slower than the last time you measured it. You want to know
717 which changeset introduced the performance regression. In
718 this case, your binary test measures the performance of your
719 software, to see whether it's <quote>fast</quote> or
720 <quote>slow</quote>.</para>
721 </listitem>
722 <listitem><para>The sizes of the components of your project that
723 you ship exploded recently, and you suspect that something
724 changed in the way you build your project.</para>
725 </listitem></itemizedlist>
727 <para>From these examples, it should be clear that the <command
728 role="hg-cmd">hg bisect</command> command is useful for more
729 than finding the sources of bugs. You can use it to find any
730 <quote>emergent property</quote> of a repository (anything that
731 you can't find from a simple text search of the files in the
732 tree) for which you can write a binary test.</para>
734 <para>We'll introduce a little bit of terminology here, just to
735 make it clear which parts of the search process are your
736 responsibility, and which are Mercurial's. A
737 <emphasis>test</emphasis> is something that
738 <emphasis>you</emphasis> run when <command role="hg-cmd">hg
739 bisect</command> chooses a changeset. A
740 <emphasis>probe</emphasis> is what <command role="hg-cmd">hg
741 bisect</command> runs to tell whether a revision is good.
742 Finally, we'll use the word <quote>bisect</quote>, as both a
743 noun and a verb, to stand in for the phrase <quote>search using
744 the <command role="hg-cmd">hg bisect</command>
745 command</quote>.</para>
747 <para>One simple way to automate the searching process would be
748 to probe every changeset. However, this scales poorly.
749 If it took ten minutes to test a single changeset, and you had
750 10,000 changesets in your repository, the exhaustive approach
751 would take on average 35 <emphasis>days</emphasis> to find the
752 changeset that introduced a bug. Even if you knew that the bug
753 was introduced by one of the last 500 changesets, and limited
754 your search to those, you'd still be looking at over 40 hours to
755 find the changeset that introduced your bug.</para>
757 <para>What the <command role="hg-cmd">hg bisect</command> command
758 does is use its knowledge of the <quote>shape</quote> of your
759 project's revision history to perform a search in time
760 proportional to the <emphasis>logarithm</emphasis> of the number
761 of changesets to check (the kind of search it performs is called
762 a dichotomic search). With this approach, searching through
763 10,000 changesets will take less than three hours, even at ten
764 minutes per test (the search will require about 14 tests).
765 Limit your search to the last hundred changesets, and it will
766 take only about an hour (roughly seven tests).</para>
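<para>The arithmetic behind these estimates is easy to check. The
sketch below is only an illustration: it assumes a linear history
and the ten-minute test used above, and its function names are
made up for this example.</para>

<programlisting>
```python
import math

MINUTES_PER_TEST = 10  # the per-test cost assumed in the text


def exhaustive_minutes(changesets):
    """Average cost of probing one by one: we expect to test half of them."""
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets):
    """Rough upper bound on the tests a binary search needs."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets):
    """Cost of a binary search at MINUTES_PER_TEST per test."""
    return bisect_tests(changesets) * MINUTES_PER_TEST


if __name__ == "__main__":
    for n in (10000, 500, 100):
        print(n, "changesets:",
              round(exhaustive_minutes(n) / 60, 1), "hours exhaustively,",
              bisect_tests(n), "tests /",
              round(bisect_minutes(n) / 60, 1), "hours by bisecting")
```
</programlisting>

<para>Multiplying the number of changesets by a hundred adds only
about seven tests to the binary search, while it multiplies the
cost of the exhaustive search by a hundred.</para>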
768 <para>The <command role="hg-cmd">hg bisect</command> command is
769 aware of the <quote>branchy</quote> nature of a Mercurial
770 project's revision history, so it has no problems dealing with
771 branches, merges, or multiple heads in a repository. It can
772 prune entire branches of history with a single probe, which is
773 how it operates so efficiently.</para>
775 <sect2>
776 <title>Using the <command role="hg-cmd">hg bisect</command>
777 command</title>
779 <para>Here's an example of <command role="hg-cmd">hg
780 bisect</command> in action.</para>
782 <note>
783 <para> In versions 0.9.5 and earlier of Mercurial, <command
784 role="hg-cmd">hg bisect</command> was not a core command:
785 it was distributed with Mercurial as an extension. This
786 section describes the built-in command, not the old
787 extension.</para>
788 </note>
790 <para>Now let's create a repository, so that we can try out the
791 <command role="hg-cmd">hg bisect</command> command in
792 isolation.</para>
794 &interaction.bisect.init;
796 <para>We'll simulate a project that has a bug in it in a
797 simple-minded way: create trivial changes in a loop, and
798 nominate one specific change that will have the
799 <quote>bug</quote>. This loop creates 35 changesets, each
800 adding a single file to the repository. We'll represent our
801 <quote>bug</quote> with a file that contains the text <quote>i
802 have a gub</quote>.</para>
804 &interaction.bisect.commits;
806 <para>The next thing that we'd like to do is figure out how to
807 use the <command role="hg-cmd">hg bisect</command> command.
808 We can use Mercurial's normal built-in help mechanism for
809 this.</para>
811 &interaction.bisect.help;
813 <para>The <command role="hg-cmd">hg bisect</command> command
814 works in steps. Each step proceeds as follows.</para>
815 <orderedlist>
816 <listitem><para>You run your binary test.</para>
817 <itemizedlist>
818 <listitem><para>If the test succeeded, you tell <command
819 role="hg-cmd">hg bisect</command> by running the
820 <command role="hg-cmd">hg bisect --good</command>
821 command.</para>
822 </listitem>
823 <listitem><para>If it failed, run the <command
824 role="hg-cmd">hg bisect --bad</command>
825 command.</para></listitem></itemizedlist>
826 </listitem>
827 <listitem><para>The command uses your information to decide
828 which changeset to test next.</para>
829 </listitem>
830 <listitem><para>It updates the working directory to that
831 changeset, and the process begins again.</para>
832 </listitem></orderedlist>
833 <para>The process ends when <command role="hg-cmd">hg
834 bisect</command> identifies a unique changeset that marks
835 the point where your test transitioned from
836 <quote>succeeding</quote> to <quote>failing</quote>.</para>
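<para>To make the loop concrete, here is a minimal sketch of the
same idea in Python. It is not how Mercurial implements the
command: it assumes a strictly linear history in which every
revision after the first bad one is also bad, and the names
<literal>find_first_bad</literal> and <literal>is_bad</literal>
are invented for this illustration.</para>

<programlisting>
```python
def find_first_bad(good, bad, is_bad):
    """Return (first bad revision, number of tests run).

    Assumes a linear history: `good` is known to pass the test, `bad`
    is known to fail it, and every revision from the first bad one
    onwards fails too.
    """
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2   # the next changeset to test
        tests += 1
        if is_bad(middle):
            bad = middle             # the culprit is middle or earlier
        else:
            good = middle            # the culprit is later than middle
    return bad, tests


if __name__ == "__main__":
    # Pretend the bug arrived in revision 22 (an arbitrary choice), that
    # revision 10 is known good, and that revision 34 is known bad.
    print(find_first_bad(10, 34, lambda rev: rev >= 22))
```
</programlisting>

<para>Each pass through the loop corresponds to one step in the
list above: run the test, report the result, and move to the next
changeset. The real command has more work to do, because it must
also cope with branches and merges.</para>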
838 <para>To start the search, we must run the <command
839 role="hg-cmd">hg bisect --reset</command> command.</para>
841 &interaction.bisect.search.init;
843 <para>In our case, the binary test we use is simple: we check to
844 see if any file in the repository contains the string <quote>i
845 have a gub</quote>. If it does, this changeset contains the
846 change that <quote>caused the bug</quote>. By convention, a
847 changeset that has the property we're searching for is
848 <quote>bad</quote>, while one that doesn't is
849 <quote>good</quote>.</para>
851 <para>Most of the time, the revision to which the working
852 directory is synced (usually the tip) already exhibits the
853 problem introduced by the buggy change, so we'll mark it as
854 <quote>bad</quote>.</para>
856 &interaction.bisect.search.bad-init;
858 <para>Our next task is to nominate a changeset that we know
859 <emphasis>doesn't</emphasis> have the bug; the <command
860 role="hg-cmd">hg bisect</command> command will
861 <quote>bracket</quote> its search between the first pair of
862 good and bad changesets. In our case, we know that revision
863 10 didn't have the bug. (I'll say more about choosing
864 the first <quote>good</quote> changeset later.)</para>
866 &interaction.bisect.search.good-init;
868 <para>Notice that this command printed some output.</para>
869 <itemizedlist>
870 <listitem><para>It told us how many changesets it must
871 consider before it can identify the one that introduced
872 the bug, and how many tests that will require.</para>
873 </listitem>
874 <listitem><para>It updated the working directory to the next
875 changeset to test, and told us which changeset it's
876 testing.</para>
877 </listitem></itemizedlist>
879 <para>We now run our test in the working directory. We use the
880 <command>grep</command> command to see if our
881 <quote>bad</quote> file is present in the working directory.
882 If it is, this revision is bad; if not, this revision is good.
883 &interaction.bisect.search.step1;</para>
885 <para>This test looks like a perfect candidate for automation,
886 so let's turn it into a shell function.</para>
887 &interaction.bisect.search.mytest;
889 <para>We can now run an entire test step with a single command,
890 <literal>mytest</literal>.</para>
892 &interaction.bisect.search.step2;
894 <para>A few more invocations of our canned test step command,
895 and we're done.</para>
897 &interaction.bisect.search.rest;
899 <para>Even though our repository contained 35 changesets, the
900 <command role="hg-cmd">hg bisect</command> command let us find
901 the changeset that introduced our <quote>bug</quote> with only
902 five tests. Because the number of tests that the <command
903 role="hg-cmd">hg bisect</command> command performs grows
904 logarithmically with the number of changesets to search, the
905 advantage that it has over the <quote>brute force</quote>
906 search approach increases with every changeset you add.</para>
908 </sect2>
909 <sect2>
910 <title>Cleaning up after your search</title>
912 <para>When you're finished using the <command role="hg-cmd">hg
913 bisect</command> command in a repository, you can use the
914 <command role="hg-cmd">hg bisect --reset</command> command to
915 drop the information it was using to drive your search. That
916 information doesn't take up much space, so it does no harm if you
917 forget to run this command. However, <command
918 role="hg-cmd">hg bisect</command> won't let you start a new
919 search in that repository until you run <command
920 role="hg-cmd">hg bisect --reset</command>.</para>
922 &interaction.bisect.search.reset;
924 </sect2>
925 </sect1>
926 <sect1>
927 <title>Tips for finding bugs effectively</title>
929 <sect2>
930 <title>Give consistent input</title>
932 <para>The <command role="hg-cmd">hg bisect</command> command
933 requires that you correctly report the result of every test
934 you perform. If you tell it that a test failed when it really
935 succeeded, it <emphasis>might</emphasis> be able to detect the
936 inconsistency. If it can identify an inconsistency in your
937 reports, it will tell you that a particular changeset is both
938 good and bad. However, it can't do this perfectly; it's about
939 as likely to report the wrong changeset as the source of the
940 bug.</para>
942 </sect2>
943 <sect2>
944 <title>Automate as much as possible</title>
946 <para>When I started using the <command role="hg-cmd">hg
947 bisect</command> command, I tried a few times to run my
948 tests by hand, on the command line. This is an approach that
949 I, at least, am not suited to. After a few tries, I found
950 that I was making enough mistakes that I was having to restart
951 my searches several times before finally getting correct
952 results.</para>
954 <para>My initial problems with driving the <command
955 role="hg-cmd">hg bisect</command> command by hand occurred
956 even with simple searches on small repositories; if the
957 problem you're looking for is more subtle, or the number of
958 tests that <command role="hg-cmd">hg bisect</command> must
959 perform increases, the likelihood of operator error ruining
960 the search is much higher. Once I started automating my
961 tests, I had much better results.</para>
963 <para>The key to automated testing is twofold:</para>
964 <itemizedlist>
965 <listitem><para>always test for the same symptom, and</para>
966 </listitem>
967 <listitem><para>always feed consistent input to the <command
968 role="hg-cmd">hg bisect</command> command.</para>
969 </listitem></itemizedlist>
970 <para>In my tutorial example above, the <command>grep</command>
971 command tests for the symptom, and the <literal>if</literal>
972 statement takes the result of this check and ensures that we
973 always feed the same input to the <command role="hg-cmd">hg
974 bisect</command> command. The <literal>mytest</literal>
975 function marries these together in a reproducible way, so that
976 every test is uniform and consistent.</para>
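<para>If you would rather keep your test in a script than in a
shell function, the same check might look like the sketch below.
It is an illustration, not part of the tutorial session: the
function names are invented, and it assumes that your version of
Mercurial supports the <option>--command</option> option to
<command role="hg-cmd">hg bisect</command>, which treats an exit
status of 0 as <quote>good</quote> and most other statuses as
<quote>bad</quote>.</para>

<programlisting>
```python
import os
import sys

MARKER = "i have a gub"  # the text that stands for our bug in the tutorial


def tree_has_bug(root):
    """Return True if any file under root (outside .hg) contains MARKER."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Mercurial's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as handle:
                if MARKER in handle.read():
                    return True
    return False


def exit_status(root):
    """Always map the same symptom to the same answer: 0 is good, 1 is bad."""
    return 1 if tree_has_bug(root) else 0


# Run the check only when given the directory to inspect as an argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(exit_status(sys.argv[1]))
```
</programlisting>

<para>Keep a script like this outside the working directory, since
it contains the marker text itself and would otherwise report every
revision as bad.</para>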
978 </sect2>
979 <sect2>
980 <title>Check your results</title>
982 <para>Because the output of a <command role="hg-cmd">hg
983 bisect</command> search is only as good as the input you
984 give it, don't take the changeset it reports as the absolute
985 truth. A simple way to cross-check its report is to manually
986 run your test at each of the following changesets:</para>
987 <itemizedlist>
988 <listitem><para>The changeset that it reports as the first bad
989 revision. Your test should still report this as
990 bad.</para>
991 </listitem>
992 <listitem><para>The parent of that changeset (either parent,
993 if it's a merge). Your test should report this changeset
994 as good.</para>
995 </listitem>
996 <listitem><para>A child of that changeset. Your test should
997 report this changeset as bad.</para>
998 </listitem></itemizedlist>
1000 </sect2>
1001 <sect2>
1002 <title>Beware interference between bugs</title>
1004 <para>It's possible that your search for one bug could be
1005 disrupted by the presence of another. For example, let's say
1006 your software crashes at revision 100, and worked correctly at
1007 revision 50. Unknown to you, someone else introduced a
1008 different crashing bug at revision 60, and fixed it at
1009 revision 80. This could distort your results in one of
1010 several ways.</para>
1012 <para>It is possible that this other bug completely
1013 <quote>masks</quote> yours, which is to say that it occurs
1014 before your bug has a chance to manifest itself. If you can't
1015 avoid that other bug (for example, it prevents your project
1016 from building), and so can't tell whether your bug is present
1017 in a particular changeset, the <command role="hg-cmd">hg
1018 bisect</command> command cannot help you directly. Instead,
1019 you can mark a changeset as untested by running <command
1020 role="hg-cmd">hg bisect --skip</command>.</para>
1022 <para>A different problem could arise if your test for a bug's
1023 presence is not specific enough. If you check for <quote>my
1024 program crashes</quote>, then both your crashing bug and an
1025 unrelated crashing bug that masks it will look like the same
1026 thing, and mislead <command role="hg-cmd">hg
1027 bisect</command>.</para>
1029 <para>It also makes sense to use <command
1030 role="hg-cmd">hg bisect --skip</command> when you can't
1031 test a revision because your project was in a broken and hence
1032 untestable state at that revision, perhaps because someone
1033 checked in a change that prevented the project from
1034 building.</para>
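<para>An automated test can make the same distinction between
<quote>bad</quote> and <quote>untestable</quote>. The sketch below
is one possible arrangement, not a recipe from Mercurial itself:
the <literal>classify</literal> function and its arguments are
hypothetical, and the exit status of 125 assumes a version of
<command role="hg-cmd">hg bisect</command> whose
<option>--command</option> option treats that status as a request
to skip the revision.</para>

<programlisting>
```python
import subprocess

# Exit statuses as interpreted by hg bisect --command (an assumption
# about your Mercurial version): 0 is good, 125 means skip, and other
# non-zero values are bad.
GOOD, BAD, SKIP = 0, 1, 125


def classify(build_cmd, test_cmd, cwd="."):
    """Return the status to report for the revision checked out in cwd."""
    if subprocess.call(build_cmd, cwd=cwd) != 0:
        return SKIP   # it doesn't build, so we can't tell good from bad
    if subprocess.call(test_cmd, cwd=cwd) != 0:
        return BAD    # it builds, and the symptom is present
    return GOOD       # it builds, and the symptom is absent
```
</programlisting>

<para>When you drive the search by hand, the equivalent of
returning <literal>SKIP</literal> is to run <command
role="hg-cmd">hg bisect --skip</command> yourself.</para>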
1036 </sect2>
1037 <sect2>
1038 <title>Bracket your search lazily</title>
1040 <para>Choosing the first <quote>good</quote> and
1041 <quote>bad</quote> changesets that will mark the end points of
1042 your search is often easy, but it bears a little discussion
1043 nevertheless. From the perspective of <command
1044 role="hg-cmd">hg bisect</command>, the <quote>newest</quote>
1045 changeset is conventionally <quote>bad</quote>, and the older
1046 changeset is <quote>good</quote>.</para>
1048 <para>If you're having trouble remembering when a suitable
1049 <quote>good</quote> change was, so that you can tell <command
1050 role="hg-cmd">hg bisect</command>, you could do worse than
1051 testing changesets at random. Just remember to eliminate
1052 contenders that can't possibly exhibit the bug (perhaps
1053 because the feature with the bug isn't present yet) and those
1054 where another problem masks the bug (as I discussed
1055 above).</para>
1057 <para>Even if you end up <quote>early</quote> by thousands of
1058 changesets or months of history, you will only add a handful
1059 of tests to the total number that <command role="hg-cmd">hg
1060 bisect</command> must perform, thanks to its logarithmic
1061 behaviour.</para>
1063 </sect2>
1064 </sect1>
1065 </chapter>
1067 <!--
1068 local variables:
1069 sgml-parent-document: ("00book.xml" "book" "chapter")
1070 end:
1071 -->