hgbook

view en/ch08-undo.xml @ 595:e0a4ba81f888

Add throbber for comment submission
author Bryan O'Sullivan <bos@serpentine.com>
date Thu Mar 26 22:07:30 2009 -0700 (2009-03-26)
parents 4ce9d0754af3
children 1c13ed2130a7
line source
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <chapter id="chap:undo">
4 <?dbhtml filename="finding-and-fixing-mistakes.html"?>
5 <title>Finding and fixing mistakes</title>
7 <para id="x_d2">To err might be human, but to really handle the consequences
8 well takes a top-notch revision control system. In this chapter,
9 we'll discuss some of the techniques you can use when you find
10 that a problem has crept into your project. Mercurial has some
11 highly capable features that will help you to isolate the sources
12 of problems, and to handle them appropriately.</para>
14 <sect1>
15 <title>Erasing local history</title>
17 <sect2>
18 <title>The accidental commit</title>
20 <para id="x_d3">I have the occasional but persistent problem of typing
21 rather more quickly than I can think, which sometimes results
22 in me committing a changeset that is either incomplete or
23 plain wrong. In my case, the usual kind of incomplete
24 changeset is one in which I've created a new source file, but
25 forgotten to <command role="hg-cmd">hg add</command> it. A
26 <quote>plain wrong</quote> changeset is not as common, but no
27 less annoying.</para>
29 </sect2>
30 <sect2 id="sec:undo:rollback">
31 <title>Rolling back a transaction</title>
33 <para id="x_d4">In <xref linkend="sec:concepts:txn"/>, I
34 mentioned that Mercurial treats each modification of a
35 repository as a <emphasis>transaction</emphasis>. Every time
36 you commit a changeset or pull changes from another
37 repository, Mercurial remembers what you did. You can undo,
38 or <emphasis>roll back</emphasis>, exactly one of these
39 actions using the <command role="hg-cmd">hg rollback</command>
40 command. (See <xref linkend="sec:undo:rollback-after-push"/>
41 for an important caveat about the use of this command.)</para>
43 <para id="x_d5">Here's a mistake that I often find myself making:
44 committing a change in which I've created a new file, but
45 forgotten to <command role="hg-cmd">hg add</command>
46 it.</para>
48 &interaction.rollback.commit;
50 <para id="x_d6">Looking at the output of <command role="hg-cmd">hg
51 status</command> after the commit immediately confirms the
52 error.</para>
54 &interaction.rollback.status;
56 <para id="x_d7">The commit captured the changes to the file
57 <filename>a</filename>, but not the new file
58 <filename>b</filename>. If I were to push this changeset to a
59 repository that I shared with a colleague, the chances are
60 high that something in <filename>a</filename> would refer to
61 <filename>b</filename>, which would not be present in their
62 repository when they pulled my changes. I would thus become
63 the object of some indignation.</para>
65 <para id="x_d8">However, luck is with me&emdash;I've caught my error
66 before I pushed the changeset. I use the <command
67 role="hg-cmd">hg rollback</command> command, and Mercurial
68 makes that last changeset vanish.</para>
70 &interaction.rollback.rollback;
72 <para id="x_d9">Notice that the changeset is no longer present in the
73 repository's history, and the working directory once again
74 thinks that the file <filename>a</filename> is modified. The
75 commit and rollback have left the working directory exactly as
76 it was prior to the commit; the changeset has been completely
77 erased. I can now safely <command role="hg-cmd">hg
78 add</command> the file <filename>b</filename>, and rerun my
79 commit.</para>
81 &interaction.rollback.add;
83 </sect2>
84 <sect2>
85 <title>The erroneous pull</title>
87 <para id="x_da">It's common practice with Mercurial to maintain separate
88 development branches of a project in different repositories.
89 Your development team might have one shared repository for
90 your project's <quote>0.9</quote> release, and another,
91 containing different changes, for the <quote>1.0</quote>
92 release.</para>
94 <para id="x_db">Given this, you can imagine that the consequences could be
95 messy if you had a local <quote>0.9</quote> repository, and
96 accidentally pulled changes from the shared <quote>1.0</quote>
97 repository into it. At worst, you could be paying
98 insufficient attention, and push those changes into the shared
99 <quote>0.9</quote> tree, confusing your entire team (but don't
100 worry, we'll return to this horror scenario later). However,
101 it's more likely that you'll notice immediately, because
102 Mercurial will display the URL it's pulling from, or you will
103 see it pull a suspiciously large number of changes into the
104 repository.</para>
106 <para id="x_dc">The <command role="hg-cmd">hg rollback</command> command
107 will work nicely to expunge all of the changesets that you
108 just pulled. Mercurial groups all changes from one <command
109 role="hg-cmd">hg pull</command> into a single transaction,
110 so one <command role="hg-cmd">hg rollback</command> is all you
111 need to undo this mistake.</para>
113 </sect2>
114 <sect2 id="sec:undo:rollback-after-push">
115 <title>Rolling back is useless once you've pushed</title>
117 <para id="x_dd">The value of the <command role="hg-cmd">hg
118 rollback</command> command drops to zero once you've pushed
119 your changes to another repository. Rolling back a change
120 makes it disappear entirely, but <emphasis>only</emphasis> in
121 the repository in which you perform the <command
122 role="hg-cmd">hg rollback</command>. Because a rollback
123 eliminates history, there's no way for the disappearance of a
124 change to propagate between repositories.</para>
126 <para id="x_de">If you've pushed a change to another
127 repository&emdash;particularly if it's a shared
128 repository&emdash;it has essentially <quote>escaped into the
129 wild,</quote> and you'll have to recover from your mistake
130 in a different way. If you push a changeset
131 somewhere, then roll it back, then pull from the repository
132 you pushed to, the changeset will simply reappear in your
133 repository.</para>
135 <para id="x_df">(If you absolutely know for sure that the change you want
136 to roll back is the most recent change in the repository that
137 you pushed to, <emphasis>and</emphasis> you know that nobody
138 else could have pulled it from that repository, you can roll
139 back the changeset there, too, but you should not rely on
140 this working reliably. If you do this, sooner or
141 later a change really will make it into a repository that you
142 don't directly control (or have forgotten about), and come
143 back to bite you.)</para>
145 </sect2>
146 <sect2>
147 <title>You can only roll back once</title>
149 <para id="x_e0">Mercurial stores exactly one transaction in its
150 transaction log; that transaction is the most recent one that
151 occurred in the repository. This means that you can only roll
152 back one transaction. If you expect to be able to roll back
153 one transaction, then its predecessor, this is not the
154 behaviour you will get.</para>
156 &interaction.rollback.twice;
158 <para id="x_e1">Once you've rolled back one transaction in a repository,
159 you can't roll back again in that repository until you perform
160 another commit or pull.</para>
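<para>To make the one-entry transaction log concrete, here is a minimal
sketch in Python. It is a toy model, not Mercurial's actual storage
format: the class <literal>ToyRepo</literal> and its methods are names
invented for this illustration. It shows a pull being undone as a single
unit, and a second rollback finding nothing left to undo.</para>
<programlisting language="python"><![CDATA[
```python
class ToyRepo:
    """Toy model of a one-entry transaction log (not Mercurial's real storage)."""

    def __init__(self):
        self.changesets = []   # history, oldest first
        self._undo = None      # history length before the last transaction

    def _transaction(self, new_changesets):
        # Remember only the most recent transaction, overwriting any older one.
        self._undo = len(self.changesets)
        self.changesets.extend(new_changesets)

    def commit(self, changeset):
        self._transaction([changeset])

    def pull(self, incoming):
        # Everything from one pull lands in a single transaction.
        self._transaction(list(incoming))

    def rollback(self):
        """Undo the last transaction; return False if there is nothing to undo."""
        if self._undo is None:
            return False
        del self.changesets[self._undo:]
        self._undo = None
        return True


repo = ToyRepo()
repo.commit("fix typo")
repo.pull(["1.0 change A", "1.0 change B", "1.0 change C"])

first = repo.rollback()    # removes all three pulled changesets at once
second = repo.rollback()   # nothing left in the log to undo
print(first, second, repo.changesets)
```
]]></programlisting>
<para>In this model the first rollback succeeds and the second does not,
and the earlier commit survives both, which mirrors the behaviour
described above.</para>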
162 </sect2>
163 </sect1>
164 <sect1>
165 <title>Reverting the mistaken change</title>
167 <para id="x_e2">If you make a modification to a file, and decide that you
168 really didn't want to change the file at all, and you haven't
169 yet committed your changes, the <command role="hg-cmd">hg
170 revert</command> command is the one you'll need. It looks at
171 the changeset that's the parent of the working directory, and
172 restores the contents of the file to their state as of that
173 changeset. (That's a long-winded way of saying that, in the
174 normal case, it undoes your modifications.)</para>
176 <para id="x_e3">Let's illustrate how the <command role="hg-cmd">hg
177 revert</command> command works with yet another small example.
178 We'll begin by modifying a file that Mercurial is already
179 tracking.</para>
181 &interaction.daily.revert.modify;
183 <para id="x_e4">If we don't
184 want that change, we can simply <command role="hg-cmd">hg
185 revert</command> the file.</para>
187 &interaction.daily.revert.unmodify;
189 <para id="x_e5">The <command role="hg-cmd">hg revert</command> command
190 provides us with an extra degree of safety by saving our
191 modified file with a <filename>.orig</filename>
192 extension.</para>
194 &interaction.daily.revert.status;
196 <para id="x_e6">Here is a summary of the cases that the <command
197 role="hg-cmd">hg revert</command> command can deal with. We
198 will describe each of these in more detail in the section that
199 follows.</para>
200 <itemizedlist>
201 <listitem><para id="x_e7">If you modify a file, it will restore the file
202 to its unmodified state.</para>
203 </listitem>
204 <listitem><para id="x_e8">If you <command role="hg-cmd">hg add</command> a
205 file, it will undo the <quote>added</quote> state of the
206 file, but leave the file itself untouched.</para>
207 </listitem>
208 <listitem><para id="x_e9">If you delete a file without telling Mercurial,
209 it will restore the file to its unmodified contents.</para>
210 </listitem>
211 <listitem><para id="x_ea">If you use the <command role="hg-cmd">hg
212 remove</command> command to remove a file, it will undo
213 the <quote>removed</quote> state of the file, and restore
214 the file to its unmodified contents.</para>
215 </listitem></itemizedlist>
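<para>The four cases above can be sketched as a small Python function.
This is only a toy model of the behaviour described in the list: the
function <literal>revert</literal>, the three dictionaries, and the state
names are assumptions made for the illustration, not Mercurial's
internals.</para>
<programlisting language="python"><![CDATA[
```python
def revert(name, status, working, parent):
    """Toy model of reverting one file; the dicts are changed in place.

    status  -- file name -> "modified", "added", "missing" or "removed"
    working -- file name -> contents currently on disk
    parent  -- file name -> contents in the working directory's parent
    """
    state = status.pop(name, None)
    if state is None:
        return                                    # clean or unknown: nothing to do
    if state == "modified":
        working[name + ".orig"] = working[name]   # keep a backup of the edits
        working[name] = parent[name]
    elif state == "added":
        pass                                      # contents untouched, only unmarked
    elif state in ("missing", "removed"):
        working[name] = parent[name]              # file comes back unmodified


parent = {"a": "old a\n", "c": "old c\n", "d": "old d\n"}
working = {"a": "edited a\n", "b": "brand new\n"}
status = {"a": "modified", "b": "added", "c": "missing", "d": "removed"}

for name in ["a", "b", "c", "d"]:
    revert(name, status, working, parent)

print(sorted(working))
```
]]></programlisting>
<para>After the loop, every file has left the status table, the added
file <literal>b</literal> is still on disk with its contents intact, and
the edits to <literal>a</literal> are preserved in
<literal>a.orig</literal>.</para>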
217 <sect2 id="sec:undo:mgmt">
218 <title>File management errors</title>
220 <para id="x_eb">The <command role="hg-cmd">hg revert</command> command is
221 useful for more than just modified files. It lets you reverse
222 the results of all of Mercurial's file management
223 commands&emdash;<command role="hg-cmd">hg add</command>,
224 <command role="hg-cmd">hg remove</command>, and so on.</para>
226 <para id="x_ec">If you <command role="hg-cmd">hg add</command> a file,
227 then decide that in fact you don't want Mercurial to track it,
228 use <command role="hg-cmd">hg revert</command> to undo the
229 add. Don't worry; Mercurial will not modify the file in any
230 way. It will just <quote>unmark</quote> the file.</para>
232 &interaction.daily.revert.add;
234 <para id="x_ed">Similarly, if you ask Mercurial to <command
235 role="hg-cmd">hg remove</command> a file, you can use
236 <command role="hg-cmd">hg revert</command> to restore it to
237 the contents it had as of the parent of the working directory.
238 &interaction.daily.revert.remove; This works just as
239 well for a file that you deleted by hand, without telling
240 Mercurial (recall that in Mercurial terminology, this kind of
241 file is called <quote>missing</quote>).</para>
243 &interaction.daily.revert.missing;
245 <para id="x_ee">If you revert a <command role="hg-cmd">hg copy</command>,
246 the copied-to file remains in your working directory
247 afterwards, untracked. Since a copy doesn't affect the
248 copied-from file in any way, Mercurial doesn't do anything
249 with the copied-from file.</para>
251 &interaction.daily.revert.copy;
253 <sect3>
254 <title>A slightly special case: reverting a rename</title>
256 <para id="x_ef">If you <command role="hg-cmd">hg rename</command> a
257 file, there is one small detail that you should remember.
258 When you <command role="hg-cmd">hg revert</command> a
259 rename, it's not enough to provide the name of the
260 renamed-to file, as you can see here.</para>
262 &interaction.daily.revert.rename;
264 <para id="x_f0">As you can see from the output of <command
265 role="hg-cmd">hg status</command>, the renamed-to file is
266 no longer identified as added, but the
267 renamed-<emphasis>from</emphasis> file is still removed!
268 This is counter-intuitive (at least to me), but
269 it's easy to deal with.</para>
271 &interaction.daily.revert.rename-orig;
273 <para id="x_f1">So remember, to revert a <command role="hg-cmd">hg
274 rename</command>, you must provide
275 <emphasis>both</emphasis> the source and destination
276 names.</para>
278 <para id="x_f2">% TODO: the output doesn't look like it will be
279 removed!</para>
281 <para id="x_f3">(By the way, if you rename a file, then modify the
282 renamed-to file, then revert both components of the rename,
283 when Mercurial restores the file that was removed as part of
284 the rename, it will be unmodified. If you need the
285 modifications in the renamed-to file to show up in the
286 renamed-from file, don't forget to copy them over.)</para>
288 <para id="x_f4">These fiddly aspects of reverting a rename arguably
289 constitute a small bug in Mercurial.</para>
291 </sect3>
292 </sect2>
293 </sect1>
294 <sect1>
295 <title>Dealing with committed changes</title>
297 <para id="x_f5">Consider a case where you have committed a change $a$, and
298 another change $b$ on top of it; you then realise that change
299 $a$ was incorrect. Mercurial lets you <quote>back out</quote>
300 an entire changeset automatically, and provides building blocks that let
301 you reverse part of a changeset by hand.</para>
303 <para id="x_f6">Before you read this section, here's something to
304 keep in mind: the <command role="hg-cmd">hg backout</command>
305 command undoes changes by <emphasis>adding</emphasis> history,
306 not by modifying or erasing it. It's the right tool to use if
307 you're fixing bugs, but not if you're trying to undo some change
308 that has catastrophic consequences. To deal with those, see
309 <xref linkend="sec:undo:aaaiiieee"/>.</para>
311 <sect2>
312 <title>Backing out a changeset</title>
314 <para id="x_f7">The <command role="hg-cmd">hg backout</command> command
315 lets you <quote>undo</quote> the effects of an entire
316 changeset in an automated fashion. Because Mercurial's
317 history is immutable, this command <emphasis>does
318 not</emphasis> get rid of the changeset you want to undo.
319 Instead, it creates a new changeset that
320 <emphasis>reverses</emphasis> the effect of the to-be-undone
321 changeset.</para>
323 <para id="x_f8">The operation of the <command role="hg-cmd">hg
324 backout</command> command is a little intricate, so let's
325 illustrate it with some examples. First, we'll create a
326 repository with some simple changes.</para>
328 &interaction.backout.init;
330 <para id="x_f9">The <command role="hg-cmd">hg backout</command> command
331 takes a single changeset ID as its argument; this is the
332 changeset to back out. Normally, <command role="hg-cmd">hg
333 backout</command> will drop you into a text editor to write
334 a commit message, so you can record why you're backing the
335 change out. In this example, we provide a commit message on
336 the command line using the <option
337 role="hg-opt-backout">-m</option> option.</para>
339 </sect2>
340 <sect2>
341 <title>Backing out the tip changeset</title>
343 <para id="x_fa">We're going to start by backing out the last changeset we
344 committed.</para>
346 &interaction.backout.simple;
348 <para id="x_fb">You can see that the second line from
349 <filename>myfile</filename> is no longer present. Taking a
350 look at the output of <command role="hg-cmd">hg log</command>
351 gives us an idea of what the <command role="hg-cmd">hg
352 backout</command> command has done.
353 &interaction.backout.simple.log; Notice that the new changeset
354 that <command role="hg-cmd">hg backout</command> has created
355 is a child of the changeset we backed out. It's easier to see
356 this in <xref linkend="fig:undo:backout"/>, which presents a
357 graphical view of the change history. As you can see, the
358 history is nice and linear.</para>
360 <figure id="fig:undo:backout">
361 <title>Backing out a change using the <command
362 role="hg-cmd">hg backout</command> command</title>
363 <mediaobject>
364 <imageobject><imagedata fileref="figs/undo-simple.png"/></imageobject>
365 <textobject><phrase>XXX add text</phrase></textobject>
366 </mediaobject>
367 </figure>
369 </sect2>
370 <sect2>
371 <title>Backing out a non-tip change</title>
373 <para id="x_fd">If you want to back out a change other than the last one
374 you committed, pass the <option
375 role="hg-opt-backout">--merge</option> option to the
376 <command role="hg-cmd">hg backout</command> command.</para>
378 &interaction.backout.non-tip.clone;
380 <para id="x_fe">This makes backing out any changeset a
381 <quote>one-shot</quote> operation that's usually simple and
382 fast.</para>
384 &interaction.backout.non-tip.backout;
386 <para id="x_ff">If you take a look at the contents of
387 <filename>myfile</filename> after the backout finishes, you'll
388 see that the first and third changes are present, but not the
389 second.</para>
391 &interaction.backout.non-tip.cat;
393 <para id="x_100">As the graphical history in <xref
394 linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial
395 actually commits <emphasis>two</emphasis> changes in this kind
396 of situation (the box-shaped nodes are the ones that Mercurial
397 commits automatically). Before Mercurial begins the backout
398 process, it first remembers what the current parent of the
399 working directory is. It then backs out the target changeset,
400 and commits that as a changeset. Finally, it merges back to
401 the previous parent of the working directory, and commits the
402 result of the merge.</para>
404 <para id="x_101">% TODO: to me it looks like mercurial doesn't commit the
405 second merge automatically!</para>
407 <figure id="fig:undo:backout-non-tip">
408 <title>Automated backout of a non-tip change using the
409 <command role="hg-cmd">hg backout</command> command</title>
410 <mediaobject>
411 <imageobject><imagedata fileref="figs/undo-non-tip.png"/></imageobject>
412 <textobject><phrase>XXX add text</phrase></textobject>
413 </mediaobject>
414 </figure>
416 <para id="x_103">The result is that you end up <quote>back where you
417 were</quote>, only with some extra history that undoes the
418 effect of the changeset you wanted to back out.</para>
420 <sect3>
421 <title>Always use the <option
422 role="hg-opt-backout">--merge</option> option</title>
424 <para id="x_104">In fact, since the <option
425 role="hg-opt-backout">--merge</option> option will do the
426 <quote>right thing</quote> whether or not the changeset
427 you're backing out is the tip (i.e. it won't try to merge if
428 it's backing out the tip, since there's no need), you should
429 <emphasis>always</emphasis> use this option when you run the
430 <command role="hg-cmd">hg backout</command> command.</para>
432 </sect3>
433 </sect2>
434 <sect2>
435 <title>Gaining more control of the backout process</title>
437 <para id="x_105">While I've recommended that you always use the <option
438 role="hg-opt-backout">--merge</option> option when backing
439 out a change, the <command role="hg-cmd">hg backout</command>
440 command lets you decide how to merge a backout changeset.
441 Taking control of the backout process by hand is something you
442 will rarely need to do, but it can be useful to understand
443 what the <command role="hg-cmd">hg backout</command> command
444 is doing for you automatically. To illustrate this, let's
445 clone our first repository, but omit the backout change that
446 it contains.</para>
448 &interaction.backout.manual.clone;
450 <para id="x_106">As with our
451 earlier example, we'll commit a third changeset, then back out
452 its parent, and see what happens.</para>
454 &interaction.backout.manual.backout;
456 <para id="x_107">Our new changeset is again a descendant of the changeset
457 we backed out; it's thus a new head, <emphasis>not</emphasis>
458 a descendant of the changeset that was the tip. The <command
459 role="hg-cmd">hg backout</command> command was quite
460 explicit in telling us this.</para>
462 &interaction.backout.manual.log;
464 <para id="x_108">Again, it's easier to see what has happened by looking at
465 a graph of the revision history, in <xref
466 linkend="fig:undo:backout-manual"/>. This makes it clear
467 that when we use <command role="hg-cmd">hg backout</command>
468 to back out a change other than the tip, Mercurial adds a new
469 head to the repository (the change it committed is
470 box-shaped).</para>
472 <figure id="fig:undo:backout-manual">
473 <title>Backing out a change using the <command
474 role="hg-cmd">hg backout</command> command</title>
475 <mediaobject>
476 <imageobject><imagedata fileref="figs/undo-manual.png"/></imageobject>
477 <textobject><phrase>XXX add text</phrase></textobject>
478 </mediaobject>
479 </figure>
481 <para id="x_10a">After the <command role="hg-cmd">hg backout</command>
482 command has completed, it leaves the new
483 <quote>backout</quote> changeset as the parent of the working
484 directory.</para>
486 &interaction.backout.manual.parents;
488 <para id="x_10b">Now we have two isolated sets of changes.</para>
490 &interaction.backout.manual.heads;
492 <para id="x_10c">Let's think about what we expect to see as the contents of
493 <filename>myfile</filename> now. The first change should be
494 present, because we've never backed it out. The second change
495 should be missing, as that's the change we backed out. Since
496 the history graph shows the third change as a separate head,
497 we <emphasis>don't</emphasis> expect to see the third change
498 present in <filename>myfile</filename>.</para>
500 &interaction.backout.manual.cat;
502 <para id="x_10d">To get the third change back into the file, we just do a
503 normal merge of our two heads.</para>
505 &interaction.backout.manual.merge;
507 <para id="x_10e">Afterwards, the graphical history of our
508 repository looks like
509 <xref linkend="fig:undo:backout-manual-merge"/>.</para>
511 <figure id="fig:undo:backout-manual-merge">
512 <title>Manually merging a backout change</title>
513 <mediaobject>
514 <imageobject><imagedata fileref="figs/undo-manual-merge.png"/></imageobject>
515 <textobject><phrase>XXX add text</phrase></textobject>
516 </mediaobject>
517 </figure>
519 </sect2>
520 <sect2>
521 <title>Why <command role="hg-cmd">hg backout</command> works as
522 it does</title>
524 <para id="x_110">Here's a brief description of how the <command
525 role="hg-cmd">hg backout</command> command works.</para>
526 <orderedlist>
527 <listitem><para id="x_111">It ensures that the working directory is
528 <quote>clean</quote>, i.e. that the output of <command
529 role="hg-cmd">hg status</command> would be empty.</para>
530 </listitem>
531 <listitem><para id="x_112">It remembers the current parent of the working
532 directory. Let's call this changeset
533 <literal>orig</literal>.</para>
534 </listitem>
535 <listitem><para id="x_113">It does the equivalent of a <command
536 role="hg-cmd">hg update</command> to sync the working
537 directory to the changeset you want to back out. Let's
538 call this changeset <literal>backout</literal>.</para>
539 </listitem>
540 <listitem><para id="x_114">It finds the parent of that changeset. Let's
541 call that changeset <literal>parent</literal>.</para>
542 </listitem>
543 <listitem><para id="x_115">For each file that the
544 <literal>backout</literal> changeset affected, it does the
545 equivalent of a <command role="hg-cmd">hg revert -r
546 parent</command> on that file, to restore it to the
547 contents it had before that changeset was
548 committed.</para>
549 </listitem>
550 <listitem><para id="x_116">It commits the result as a new changeset.
551 This changeset has <literal>backout</literal> as its
552 parent.</para>
553 </listitem>
554 <listitem><para id="x_117">If you specify <option
555 role="hg-opt-backout">--merge</option> on the command
556 line, it merges with <literal>orig</literal>, and commits
557 the result of the merge.</para>
558 </listitem></orderedlist>
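<para>The steps above can be sketched in Python. This is a deliberately
simplified toy, not Mercurial's implementation: each changeset is a whole
snapshot of file contents, the merge works on entire files rather than on
lines, only the first parent of a changeset is recorded, and the code
assumes that <literal>orig</literal> is a descendant of the changeset
being backed out. All of the function names and changeset IDs are
invented for the illustration.</para>
<programlisting language="python"><![CDATA[
```python
snapshots = {}   # changeset id -> {file name: contents}
parents = {}     # changeset id -> first parent id (None for the root)


def commit(cid, parent, files):
    """Record changeset cid as a full snapshot with the given parent."""
    snapshots[cid] = dict(files)
    parents[cid] = parent
    return cid


def changed_files(cid):
    """Names of files that differ between cid and its parent."""
    old, new = snapshots[parents[cid]], snapshots[cid]
    return {n for n in set(old) | set(new) if old.get(n) != new.get(n)}


def merge(base, ours, theirs):
    """Whole-file three-way merge; raises ValueError on a conflict."""
    result = {}
    for name in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t or t == b:
            pick = o          # both agree, or only our side changed
        elif o == b:
            pick = t          # only their side changed
        else:
            raise ValueError("conflict in " + name)
        if pick is not None:
            result[name] = pick
    return result


def backout(target, orig, merge_back=True):
    """Back out target while orig is the working directory's parent."""
    # Step 1 (clean working directory) is not modelled; step 2 is orig.
    files = dict(snapshots[target])            # step 3: start from "backout"
    parent = parents[target]                   # step 4: find "parent"
    for name in changed_files(target):         # step 5: restore affected files
        if name in snapshots[parent]:
            files[name] = snapshots[parent][name]
        else:
            del files[name]
    undo = commit("undo-" + target, target, files)   # step 6: child of "backout"
    if not merge_back or orig == target:
        return undo                            # backing out the tip needs no merge
    # Step 7: merge with orig; their common ancestor is target itself.
    merged = merge(snapshots[target], snapshots[undo], snapshots[orig])
    return commit("merge-" + target, undo, merged)


commit("0", None, {"a.txt": "v1", "b.txt": "v1"})
commit("1", "0", {"a.txt": "v1", "b.txt": "BAD"})                  # the mistake
commit("2", "1", {"a.txt": "v1", "b.txt": "BAD", "c.txt": "new"})  # later work

tip = backout("1", orig="2")
print(tip, snapshots[tip])
```
]]></programlisting>
<para>In this toy run, the merged result keeps the later work in
<literal>c.txt</literal> while <literal>b.txt</literal> returns to the
contents it had before the mistaken changeset.</para>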
560 <para id="x_118">An alternative way to implement the <command
561 role="hg-cmd">hg backout</command> command would be to
562 <command role="hg-cmd">hg export</command> the
563 to-be-backed-out changeset as a diff, then use the <option
564 role="cmd-opt-patch">--reverse</option> option to the
565 <command>patch</command> command to reverse the effect of the
566 change without fiddling with the working directory. This
567 sounds much simpler, but it would not work nearly as
568 well.</para>
570 <para id="x_119">The reason that <command role="hg-cmd">hg
571 backout</command> does an update, a commit, a merge, and
572 another commit is to give the merge machinery the best chance
573 to do a good job when dealing with all the changes
574 <emphasis>between</emphasis> the change you're backing out and
575 the current tip.</para>
577 <para id="x_11a">If you're backing out a changeset that's 100 revisions
578 back in your project's history, the chances that the
579 <command>patch</command> command will be able to apply a
580 reverse diff cleanly are not good, because intervening changes
581 are likely to have <quote>broken the context</quote> that
582 <command>patch</command> uses to determine whether it can
583 apply a patch (if this sounds like gibberish, see <xref
584 linkend="sec:mq:patch"/> for a
585 discussion of the <command>patch</command> command). Also,
586 Mercurial's merge machinery will handle files and directories
587 being renamed, permission changes, and modifications to binary
588 files, none of which <command>patch</command> can deal
589 with.</para>
591 </sect2>
592 </sect1>
593 <sect1 id="sec:undo:aaaiiieee">
594 <title>Changes that should never have been</title>
596 <para id="x_11b">Most of the time, the <command role="hg-cmd">hg
597 backout</command> command is exactly what you need if you want
598 to undo the effects of a change. It leaves a permanent record
599 of exactly what you did, both when committing the original
600 changeset and when you cleaned up after it.</para>
602 <para id="x_11c">On rare occasions, though, you may find that you've
603 committed a change that really should not be present in the
604 repository at all. For example, it would be very unusual, and
605 usually considered a mistake, to commit a software project's
606 object files as well as its source files. Object files have
607 almost no intrinsic value, and they're <emphasis>big</emphasis>,
608 so they increase the size of the repository and the amount of
609 time it takes to clone or pull changes.</para>
611 <para id="x_11d">Before I discuss the options that you have if you commit a
612 <quote>brown paper bag</quote> change (the kind that's so bad
613 that you want to pull a brown paper bag over your head), let me
614 first discuss some approaches that probably won't work.</para>
616 <para id="x_11e">Since Mercurial treats history as
617 accumulative&emdash;every change builds on top of all changes
618 that preceded it&emdash;you generally can't just make disastrous
619 changes disappear. The one exception is when you've just
620 committed a change, and it hasn't been pushed or pulled into
621 another repository. That's when you can safely use the <command
622 role="hg-cmd">hg rollback</command> command, as I detailed in
623 <xref linkend="sec:undo:rollback"/>.</para>
625 <para id="x_11f">After you've pushed a bad change to another repository, you
626 <emphasis>could</emphasis> still use <command role="hg-cmd">hg
627 rollback</command> to make your local copy of the change
628 disappear, but it won't have the consequences you want. The
629 change will still be present in the remote repository, so it
630 will reappear in your local repository the next time you
631 pull.</para>
633 <para id="x_120">If a situation like this arises, and you know which
634 repositories your bad change has propagated into, you can
635 <emphasis>try</emphasis> to get rid of the change from
636 <emphasis>every</emphasis> one of those repositories. This is,
637 of course, not a satisfactory solution: if you miss even a
638 single repository while you're expunging, the change is still
639 <quote>in the wild</quote>, and could propagate further.</para>
641 <para id="x_121">If you've committed one or more changes
642 <emphasis>after</emphasis> the change that you'd like to see
643 disappear, your options are further reduced. Mercurial doesn't
644 provide a way to <quote>punch a hole</quote> in history, leaving
645 the surrounding changesets intact.</para>
647 <para id="x_122">XXX This needs filling out. The
648 <literal>hg-replay</literal> script in the
649 <literal>examples</literal> directory works, but doesn't handle
650 merge changesets. Kind of an important omission.</para>
652 <sect2>
653 <title>Protect yourself from <quote>escaped</quote>
654 changes</title>
656 <para id="x_123">If you've committed some changes to your local repository
657 and they've been pushed or pulled somewhere else, this isn't
658 necessarily a disaster. You can protect yourself ahead of
659 time against some classes of bad changeset. This is
660 particularly easy if your team usually pulls changes from a
661 central repository.</para>
663 <para id="x_124">By configuring some hooks on that repository to validate
664 incoming changesets (see chapter <xref linkend="chap:hook"/>),
665 you can
666 automatically prevent some kinds of bad changeset from being
667 pushed to the central repository at all. With such a
668 configuration in place, some kinds of bad changeset will
669 naturally tend to <quote>die out</quote> because they can't
670 propagate into the central repository. Better yet, this
671 happens without any need for explicit intervention.</para>
673 <para id="x_125">For instance, an incoming change hook that verifies that a
674 changeset will actually compile can prevent people from
675 inadvertently <quote>breaking the build</quote>.</para>
677 </sect2>
678 </sect1>
679 <sect1 id="sec:undo:bisect">
680 <title>Finding the source of a bug</title>
682 <para id="x_126">While it's all very well to be able to back out a changeset
683 that introduced a bug, this requires that you know which
684 changeset to back out. Mercurial provides an invaluable
685 command, called <command role="hg-cmd">hg bisect</command>, that
686 helps you to automate this process and accomplish it very
687 efficiently.</para>
689 <para id="x_127">The idea behind the <command role="hg-cmd">hg
690 bisect</command> command is that a changeset has introduced
691 some change of behaviour that you can identify with a simple
692 binary test. You don't know which piece of code introduced the
693 change, but you know how to test for the presence of the bug.
694 The <command role="hg-cmd">hg bisect</command> command uses your
695 test to direct its search for the changeset that introduced the
696 code that caused the bug.</para>
698 <para id="x_128">Here are a few scenarios to help you understand how you
699 might apply this command.</para>
700 <itemizedlist>
701 <listitem><para id="x_129">The most recent version of your software has a
702 bug that you remember wasn't present a few weeks ago, but
703 you don't know when it was introduced. Here, your binary
704 test checks for the presence of that bug.</para>
705 </listitem>
706 <listitem><para id="x_12a">You fixed a bug in a rush, and now it's time to
707 close the entry in your team's bug database. The bug
708 database requires a changeset ID when you close an entry,
709 but you don't remember which changeset you fixed the bug in.
710 Once again, your binary test checks for the presence of the
711 bug.</para>
712 </listitem>
713 <listitem><para id="x_12b">Your software works correctly, but runs 15%
714 slower than the last time you measured it. You want to know
715 which changeset introduced the performance regression. In
716 this case, your binary test measures the performance of your
717 software, to see whether it's <quote>fast</quote> or
718 <quote>slow</quote>.</para>
719 </listitem>
720 <listitem><para id="x_12c">The sizes of the components of your project that
721 you ship exploded recently, and you suspect that something
722 changed in the way you build your project.</para>
723 </listitem></itemizedlist>
725 <para id="x_12d">From these examples, it should be clear that the <command
726 role="hg-cmd">hg bisect</command> command is useful for more than
727 finding the sources of bugs. You can use it to find any
728 <quote>emergent property</quote> of a repository (anything that
729 you can't find from a simple text search of the files in the
730 tree) for which you can write a binary test.</para>
732 <para id="x_12e">We'll introduce a little bit of terminology here, just to
733 make it clear which parts of the search process are your
734 responsibility, and which are Mercurial's. A
735 <emphasis>test</emphasis> is something that
736 <emphasis>you</emphasis> run when <command role="hg-cmd">hg
737 bisect</command> chooses a changeset. A
738 <emphasis>probe</emphasis> is what <command role="hg-cmd">hg
739 bisect</command> runs to tell whether a revision is good.
740 Finally, we'll use the word <quote>bisect</quote>, as both a
741 noun and a verb, to stand in for the phrase <quote>search using
742 the <command role="hg-cmd">hg bisect</command>
743 command</quote>.</para>
745 <para id="x_12f">One simple way to automate the searching process would be
746 to probe every changeset. However, this scales poorly.
747 If it took ten minutes to test a single changeset, and you had
748 10,000 changesets in your repository, the exhaustive approach
749 would take on average 35 <emphasis>days</emphasis> to find the
750 changeset that introduced a bug. Even if you knew that the bug
751 was introduced by one of the last 500 changesets, and limited
752 your search to those, you'd still be looking at over 40 hours to
753 find the changeset that introduced your bug.</para>
755 <para id="x_130">What the <command role="hg-cmd">hg bisect</command> command
756 does is use its knowledge of the <quote>shape</quote> of your
757 project's revision history to perform a search in time
758 proportional to the <emphasis>logarithm</emphasis> of the number
759 of changesets to check (the kind of search it performs is called
760 a dichotomic search). With this approach, searching through
761 10,000 changesets will take less than three hours, even at ten
762 minutes per test (the search will require about 14 tests).
763 Limit your search to the last hundred changesets, and it will
764 take only about an hour (roughly seven tests).</para>
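<para>As a rough check of these figures, the arithmetic can be sketched in
Python. This is only an illustration. It assumes a linear history, that an
exhaustive search examines half of the changesets on average, and that a
binary search needs about log2(n) tests. The function names are made up for
this example.</para>

<programlisting>
```python
import math

MINUTES_PER_TEST = 10  # the cost of one test assumed in the text


def exhaustive_minutes(changesets):
    """Average cost of probing changesets one by one: about half of
    them are examined before the culprit turns up."""
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets):
    """Approximate number of tests for a binary search: ceil(log2(n))."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets):
    """Total time for a binary search at MINUTES_PER_TEST per test."""
    return bisect_tests(changesets) * MINUTES_PER_TEST


for n in (10000, 500):
    hours = exhaustive_minutes(n) / 60
    print(n, "changesets, exhaustive:", round(hours, 1), "hours on average")
for n in (10000, 100):
    print(n, "changesets, bisect:", bisect_tests(n), "tests,",
          bisect_minutes(n), "minutes")
```
</programlisting>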
766 <para id="x_131">The <command role="hg-cmd">hg bisect</command> command is
767 aware of the <quote>branchy</quote> nature of a Mercurial
768 project's revision history, so it has no problems dealing with
769 branches, merges, or multiple heads in a repository. It can
770 prune entire branches of history with a single probe, which is
771 how it operates so efficiently.</para>
773 <sect2>
774 <title>Using the <command role="hg-cmd">hg bisect</command>
775 command</title>
777 <para id="x_132">Here's an example of <command role="hg-cmd">hg
778 bisect</command> in action.</para>
780 <note>
781 <para id="x_133"> In versions 0.9.5 and earlier of Mercurial, <command
782 role="hg-cmd">hg bisect</command> was not a core command:
783 it was distributed with Mercurial as an extension. This
784 section describes the built-in command, not the old
785 extension.</para>
786 </note>
788 <para id="x_134">Now let's create a repository, so that we can try out the
789 <command role="hg-cmd">hg bisect</command> command in
790 isolation.</para>
792 &interaction.bisect.init;
794 <para id="x_135">We'll simulate a project that has a bug in it in a
795 simple-minded way: create trivial changes in a loop, and
796 nominate one specific change that will have the
797 <quote>bug</quote>. This loop creates 35 changesets, each
798 adding a single file to the repository. We'll represent our
799 <quote>bug</quote> with a file that contains the text <quote>i
800 have a gub</quote>.</para>
802 &interaction.bisect.commits;
804 <para id="x_136">The next thing that we'd like to do is figure out how to
805 use the <command role="hg-cmd">hg bisect</command> command.
806 We can use Mercurial's normal built-in help mechanism for
807 this.</para>
809 &interaction.bisect.help;
811 <para id="x_137">The <command role="hg-cmd">hg bisect</command> command
812 works in steps. Each step proceeds as follows.</para>
813 <orderedlist>
814 <listitem><para id="x_138">You run your binary test.</para>
815 <itemizedlist>
816 <listitem><para id="x_139">If the test succeeded, you tell <command
817 role="hg-cmd">hg bisect</command> by running the
818 <command role="hg-cmd">hg bisect --good</command>
819 command.</para>
820 </listitem>
821 <listitem><para id="x_13a">If it failed, run the <command
822 role="hg-cmd">hg bisect --bad</command>
823 command.</para></listitem></itemizedlist>
824 </listitem>
825 <listitem><para id="x_13b">The command uses your information to decide
826 which changeset to test next.</para>
827 </listitem>
828 <listitem><para id="x_13c">It updates the working directory to that
829 changeset, and the process begins again.</para>
830 </listitem></orderedlist>
831 <para id="x_13d">The process ends when <command role="hg-cmd">hg
832 bisect</command> identifies a unique changeset that marks
833 the point where your test transitioned from
834 <quote>succeeding</quote> to <quote>failing</quote>.</para>
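<para>For the simplest case, a linear history numbered from oldest to
newest, the loop described above can be sketched in Python. This
illustrates the idea only. It is not Mercurial's implementation, which
also has to cope with branches and merges. The function name
<literal>bisect_linear</literal> and the choice of revision 22 as the one
that introduces the bug are assumptions made for the example.</para>

<programlisting>
```python
def bisect_linear(good, bad, is_bad):
    """Search a linear history for the first bad revision.

    `good` is a revision known to pass, `bad` a later revision known to
    fail, and `is_bad(rev)` plays the role of your binary test.
    Returns (first_bad_revision, number_of_tests_run).
    """
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2   # the changeset to test next
        tests += 1
        if is_bad(mid):
            bad = mid             # the bug is at mid or earlier
        else:
            good = mid            # the bug came in after mid
    return bad, tests


# Hypothetical scenario: revision 10 is known good, revision 34 is
# known bad, and the bug first appears in revision 22.
first_bad, tests = bisect_linear(10, 34, lambda rev: rev >= 22)
print("first bad revision", first_bad, "found after", tests, "tests")
```
</programlisting>

<para>Each pass through the loop corresponds to one step of the process:
run the test, report the result, and let the search halve the remaining
range.</para>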
836 <para id="x_13e">To start the search, we must run the <command
837 role="hg-cmd">hg bisect --reset</command> command.</para>
839 &interaction.bisect.search.init;
841 <para id="x_13f">In our case, the binary test we use is simple: we check to
842 see if any file in the repository contains the string <quote>i
843 have a gub</quote>. If it does, this changeset contains the
844 change that <quote>caused the bug</quote>. By convention, a
845 changeset that has the property we're searching for is
846 <quote>bad</quote>, while one that doesn't is
847 <quote>good</quote>.</para>
849 <para id="x_140">Most of the time, the revision to which the working
850 directory is synced (usually the tip) already exhibits the
851 problem introduced by the buggy change, so we'll mark it as
852 <quote>bad</quote>.</para>
854 &interaction.bisect.search.bad-init;
856 <para id="x_141">Our next task is to nominate a changeset that we know
857 <emphasis>doesn't</emphasis> have the bug; the <command
858 role="hg-cmd">hg bisect</command> command will
859 <quote>bracket</quote> its search between the first pair of
860 good and bad changesets. In our case, we know that revision
861 10 didn't have the bug. (I'll say more about choosing
862 the first <quote>good</quote> changeset later.)</para>
864 &interaction.bisect.search.good-init;
866 <para id="x_142">Notice that this command printed some output.</para>
867 <itemizedlist>
868 <listitem><para id="x_143">It told us how many changesets it must
869 consider before it can identify the one that introduced
870 the bug, and how many tests that will require.</para>
871 </listitem>
872 <listitem><para id="x_144">It updated the working directory to the next
873 changeset to test, and told us which changeset it's
874 testing.</para>
875 </listitem></itemizedlist>
877 <para id="x_145">We now run our test in the working directory. We use the
878 <command>grep</command> command to see if our
879 <quote>bad</quote> file is present in the working directory.
880 If it is, this revision is bad; if not, this revision is good.
881 &interaction.bisect.search.step1;</para>
883 <para id="x_146">This test looks like a perfect candidate for automation,
884 so let's turn it into a shell function.</para>
885 &interaction.bisect.search.mytest;
887 <para id="x_147">We can now run an entire test step with a single command,
888 <literal>mytest</literal>.</para>
890 &interaction.bisect.search.step2;
892 <para id="x_148">A few more invocations of our canned test step command,
893 and we're done.</para>
895 &interaction.bisect.search.rest;
897 <para id="x_149">Even though the repository contains 35 changesets, the
898 <command role="hg-cmd">hg bisect</command> command let us find
899 the changeset that introduced our <quote>bug</quote> with only
900 five tests. Because the number of tests that the <command
901 role="hg-cmd">hg bisect</command> command performs grows
902 logarithmically with the number of changesets to search, the
903 advantage that it has over the <quote>brute force</quote>
904 search approach increases with every changeset you add.</para>
906 </sect2>
907 <sect2>
908 <title>Cleaning up after your search</title>
910 <para id="x_14a">When you're finished using the <command role="hg-cmd">hg
911 bisect</command> command in a repository, you can use the
912 <command role="hg-cmd">hg bisect --reset</command> command to
913 drop the information it was using to drive your search. That
914 information doesn't take much space, so it doesn't matter if you
915 forget to run this command. However, <command
916 role="hg-cmd">hg bisect</command> won't let you start a new
917 search in that repository until you do an <command
918 role="hg-cmd">hg bisect --reset</command>.</para>
920 &interaction.bisect.search.reset;
922 </sect2>
923 </sect1>
924 <sect1>
925 <title>Tips for finding bugs effectively</title>
927 <sect2>
928 <title>Give consistent input</title>
930 <para id="x_14b">The <command role="hg-cmd">hg bisect</command> command
931 requires that you correctly report the result of every test
932 you perform. If you tell it that a test failed when it really
933 succeeded, it <emphasis>might</emphasis> be able to detect the
934 inconsistency. If it can identify an inconsistency in your
935 reports, it will tell you that a particular changeset is both
936 good and bad. However, it can't do this perfectly; it's about
937 as likely to report the wrong changeset as the source of the
938 bug.</para>
940 </sect2>
941 <sect2>
942 <title>Automate as much as possible</title>
944 <para id="x_14c">When I started using the <command role="hg-cmd">hg
945 bisect</command> command, I tried a few times to run my
946 tests by hand, on the command line. This is an approach that
947 I, at least, am not suited to. After a few tries, I found
948 that I was making enough mistakes that I was having to restart
949 my searches several times before finally getting correct
950 results.</para>
952 <para id="x_14d">My initial problems with driving the <command
953 role="hg-cmd">hg bisect</command> command by hand occurred
954 even with simple searches on small repositories; if the
955 problem you're looking for is more subtle, or the number of
956 tests that <command role="hg-cmd">hg bisect</command> must
957 perform increases, the likelihood of operator error ruining
958 the search is much higher. Once I started automating my
959 tests, I had much better results.</para>
961 <para id="x_14e">The key to automated testing is twofold:</para>
962 <itemizedlist>
963 <listitem><para id="x_14f">always test for the same symptom, and</para>
964 </listitem>
965 <listitem><para id="x_150">always feed consistent input to the <command
966 role="hg-cmd">hg bisect</command> command.</para>
967 </listitem></itemizedlist>
968 <para id="x_151">In my tutorial example above, the <command>grep</command>
969 command tests for the symptom, and the <literal>if</literal>
970 statement takes the result of this check and ensures that we
971 always feed the same input to the <command role="hg-cmd">hg
972 bisect</command> command. The <literal>mytest</literal>
973 function marries these together in a reproducible way, so that
974 every test is uniform and consistent.</para>
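<para>The same check can also be written as a small program whose exit
status carries the verdict. The sketch below is one possible Python
version of the tutorial's test. The names
<literal>tree_has_marker</literal> and <literal>exit_status</literal> are
hypothetical, and it assumes that an exit status of 0 should mean
<quote>good</quote> and 1 should mean <quote>bad</quote>.</para>

<programlisting>
```python
import os

MARKER = "i have a gub"  # the text that stands in for the bug


def tree_has_marker(root, marker=MARKER):
    """Return True if any file under root (ignoring .hg) contains marker."""
    for dirpath, dirnames, filenames in os.walk(root):
        if ".hg" in dirnames:
            dirnames.remove(".hg")  # don't search Mercurial's own metadata
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as f:
                    if marker in f.read():
                        return True
            except OSError:
                continue  # unreadable file: treat it as not matching
    return False


def exit_status(root="."):
    """Map the test result to an exit status: 0 is good, 1 is bad."""
    return 1 if tree_has_marker(root) else 0
```
</programlisting>

<para>If your version of Mercurial provides the
<option>--command</option> option (check <command role="hg-cmd">hg help
bisect</command>), a script that ends by calling
<literal>sys.exit(exit_status())</literal> could be handed to <command
role="hg-cmd">hg bisect --command</command> after you have marked the
first good and bad changesets, so that Mercurial runs the test at every
step for you.</para>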
976 </sect2>
977 <sect2>
978 <title>Check your results</title>
980 <para id="x_152">Because the output of a <command role="hg-cmd">hg
981 bisect</command> search is only as good as the input you
982 give it, don't take the changeset it reports as the absolute
983 truth. A simple way to cross-check its report is to manually
984 run your test at each of the following changesets:</para>
985 <itemizedlist>
986 <listitem><para id="x_153">The changeset that it reports as the first bad
987 revision. Your test should still report this as
988 bad.</para>
989 </listitem>
990 <listitem><para id="x_154">The parent of that changeset (either parent,
991 if it's a merge). Your test should report this changeset
992 as good.</para>
993 </listitem>
994 <listitem><para id="x_155">A child of that changeset. Your test should
995 report this changeset as bad.</para>
996 </listitem></itemizedlist>
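<para>The cross-check above is mechanical enough to write down. The
following Python sketch assumes you already have a function that runs
your test at a given revision and returns true when the revision is bad.
The name <literal>cross_check</literal> and the revision numbers in the
usage lines are hypothetical.</para>

<programlisting>
```python
def cross_check(is_bad, first_bad, parents, children):
    """Re-run the test around the reported changeset.

    Returns a list of complaints; an empty list means the report held up.
    """
    problems = []
    if not is_bad(first_bad):
        problems.append(f"{first_bad} should test bad")
    for rev in parents:
        if is_bad(rev):
            problems.append(f"parent {rev} should test good")
    for rev in children:
        if not is_bad(rev):
            problems.append(f"child {rev} should test bad")
    return problems


# Hypothetical history in which the bug first appears in revision 22.
def is_bad(rev):
    return rev >= 22


print(cross_check(is_bad, 22, parents=[21], children=[23]))
print(cross_check(is_bad, 23, parents=[22], children=[24]))
```
</programlisting>

<para>The first call returns an empty list. The second, which pretends
that the search reported revision 23, complains that the parent tests
bad.</para>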
998 </sect2>
999 <sect2>
1000 <title>Beware interference between bugs</title>
1002 <para id="x_156">It's possible that your search for one bug could be
1003 disrupted by the presence of another. For example, let's say
1004 your software crashes at revision 100, and worked correctly at
1005 revision 50. Unknown to you, someone else introduced a
1006 different crashing bug at revision 60, and fixed it at
1007 revision 80. This could distort your results in one of
1008 several ways.</para>
1010 <para id="x_157">It is possible that this other bug completely
1011 <quote>masks</quote> yours, which is to say that it occurs
1012 before your bug has a chance to manifest itself. If you can't
1013 avoid that other bug (for example, it prevents your project
1014 from building), and so can't tell whether your bug is present
1015 in a particular changeset, the <command role="hg-cmd">hg
1016 bisect</command> command cannot help you directly. Instead,
1017 you can mark a changeset as untested by running <command
1018 role="hg-cmd">hg bisect --skip</command>.</para>
1020 <para id="x_158">A different problem could arise if your test for a bug's
1021 presence is not specific enough. If you check for <quote>my
1022 program crashes</quote>, then both your crashing bug and an
1023 unrelated crashing bug that masks it will look like the same
1024 thing, and mislead <command role="hg-cmd">hg
1025 bisect</command>.</para>
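<para>To see how an unspecific test can mislead the search, here is a
small Python simulation of the scenario above on a linear history. It is
a sketch under stated assumptions: the search is a plain binary search
rather than Mercurial's own algorithm, and revision 90 as the real
culprit is invented for the example.</para>

<programlisting>
```python
def first_bad(good, bad, is_bad):
    """Binary search over a linear history; returns the revision blamed."""
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad


REAL_BUG = 90               # hypothetical: where your bug was introduced
OTHER_BUG = range(60, 80)   # unrelated crash: introduced at 60, fixed at 80


def crashes(rev):
    """Unspecific test: any crash at all counts as bad."""
    return rev >= REAL_BUG or rev in OTHER_BUG


def has_my_bug(rev):
    """Specific test: only your own bug counts as bad."""
    return rev >= REAL_BUG


print(first_bad(50, 100, crashes))      # blames 60, the unrelated bug
print(first_bad(50, 100, has_my_bug))   # blames 90, the real culprit
```
</programlisting>

<para>The specific test only works if your bug can be observed
independently of the other one. Where the other bug hides yours
entirely, skipping those revisions is the safer course.</para>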
1027 <para id="x_159">Another useful situation in which to use <command
1028 role="hg-cmd">hg bisect --skip</command> is if you can't
1029 test a revision because your project was in a broken and hence
1030 untestable state at that revision, perhaps because someone
1031 checked in a change that prevented the project from
1032 building.</para>
1034 </sect2>
1035 <sect2>
1036 <title>Bracket your search lazily</title>
1038 <para id="x_15a">Choosing the first <quote>good</quote> and
1039 <quote>bad</quote> changesets that will mark the end points of
1040 your search is often easy, but it bears a little discussion
1041 nevertheless. From the perspective of <command
1042 role="hg-cmd">hg bisect</command>, the <quote>newer</quote>
1043 changeset is conventionally <quote>bad</quote>, and the older
1044 changeset is <quote>good</quote>.</para>
1046 <para id="x_15b">If you're having trouble remembering when a suitable
1047 <quote>good</quote> change was, so that you can tell <command
1048 role="hg-cmd">hg bisect</command>, you could do worse than
1049 testing changesets at random. Just remember to eliminate
1050 contenders that can't possibly exhibit the bug (perhaps
1051 because the feature with the bug isn't present yet) and those
1052 where another problem masks the bug (as I discussed
1053 above).</para>
1055 <para id="x_15c">Even if you end up <quote>early</quote> by thousands of
1056 changesets or months of history, you will only add a handful
1057 of tests to the total number that <command role="hg-cmd">hg
1058 bisect</command> must perform, thanks to its logarithmic
1059 behaviour.</para>
1061 </sect2>
1062 </sect1>
1063 </chapter>
1065 <!--
1066 local variables:
1067 sgml-parent-document: ("00book.xml" "book" "chapter")
1068 end:
1069 -->