hgbook

view en/undo.tex @ 197:76697ae503db

Local branches.
author Bryan O'Sullivan <bos@serpentine.com>
date Mon Apr 16 16:33:51 2007 -0700 (2007-04-16)
1 \chapter{Finding and fixing your mistakes}
2 \label{chap:undo}
4 To err might be human, but to really handle the consequences well
5 takes a top-notch revision control system. In this chapter, we'll
6 discuss some of the techniques you can use when you find that a
7 problem has crept into your project. Mercurial has some highly
8 capable features that will help you to isolate the sources of
9 problems, and to handle them appropriately.
11 \section{Erasing local history}
13 \subsection{The accidental commit}
15 I have the occasional but persistent problem of typing rather more
16 quickly than I can think, which sometimes results in me committing a
17 changeset that is either incomplete or plain wrong. In my case, the
18 usual kind of incomplete changeset is one in which I've created a new
19 source file, but forgotten to \hgcmd{add} it. A ``plain wrong''
20 changeset is not as common, but no less annoying.
22 \subsection{Rolling back a transaction}
23 \label{sec:undo:rollback}
25 In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats
26 each modification of a repository as a \emph{transaction}. Every time
27 you commit a changeset or pull changes from another repository,
28 Mercurial remembers what you did. You can undo, or \emph{roll back},
29 exactly one of these actions using the \hgcmd{rollback} command.
31 Here's a mistake that I often find myself making: committing a change
32 in which I've created a new file, but forgotten to \hgcmd{add} it.
33 \interaction{rollback.commit}
34 Looking at the output of \hgcmd{status} after the commit immediately
35 confirms the error.
36 \interaction{rollback.status}
37 The commit captured the changes to the file \filename{a}, but not the
38 new file \filename{b}. If I were to push this changeset to a
39 repository that I shared with a colleague, the chances are high that
40 something in \filename{a} would refer to \filename{b}, which would not
41 be present in their repository when they pulled my changes. I would
42 thus become the object of some indignation.
44 However, luck is with me---I've caught my error before I pushed the
45 changeset. I use the \hgcmd{rollback} command, and Mercurial makes
46 that last changeset vanish.
47 \interaction{rollback.rollback}
48 Notice that the changeset is no longer present in the repository's
49 history, and the working directory once again thinks that the file
50 \filename{a} is modified. The commit and rollback have left the
51 working directory exactly as it was prior to the commit; the changeset
52 has been completely erased. I can now safely \hgcmd{add} the file
53 \filename{b}, and rerun my commit.
54 \interaction{rollback.add}
56 \subsection{The erroneous pull}
58 It's common practice with Mercurial to maintain separate development
59 branches of a project in different repositories. Your development
60 team might have one shared repository for your project's ``0.9''
61 release, and another, containing different changes, for the ``1.0''
62 release.
64 Given this, you can imagine that the consequences could be messy if
65 you had a local ``0.9'' repository, and accidentally pulled changes
66 from the shared ``1.0'' repository into it. At worst, you could be
67 paying insufficient attention, and push those changes into the shared
68 ``0.9'' tree, confusing your entire team (but don't worry, we'll
69 return to this horror scenario later). However, it's more likely that
70 you'll notice immediately, because Mercurial will display the URL it's
71 pulling from, or you will see it pull a suspiciously large number of
72 changes into the repository.
74 The \hgcmd{rollback} command will work nicely to expunge all of the
75 changesets that you just pulled. Mercurial groups all changes from
76 one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is
77 all you need to undo this mistake.
79 \subsection{Rolling back is useless once you've pushed}
81 The value of the \hgcmd{rollback} command drops to zero once you've
82 pushed your changes to another repository. Rolling back a change
83 makes it disappear entirely, but \emph{only} in the repository in
84 which you perform the \hgcmd{rollback}. Because a rollback eliminates
85 history, there's no way for the disappearance of a change to propagate
86 between repositories.
88 If you've pushed a change to another repository---particularly if it's
89 a shared repository---it has essentially ``escaped into the wild,''
90 and you'll have to recover from your mistake in a different way. What
91 will happen if you push a changeset somewhere, then roll it back, then
92 pull from the repository you pushed to, is that the changeset will
93 reappear in your repository.
95 (If you absolutely know for sure that the change you want to roll back
96 is the most recent change in the repository that you pushed to,
97 \emph{and} you know that nobody else could have pulled it from that
98 repository, you can roll back the changeset there, too, but you
99 really should not rely on this working reliably. If you do this,
100 sooner or later a change really will make it into a repository that
101 you don't directly control (or have forgotten about), and come back to
102 bite you.)
104 \subsection{You can only roll back once}
106 Mercurial stores exactly one transaction in its transaction log; that
107 transaction is the most recent one that occurred in the repository.
108 This means that you can only roll back one transaction. If you expect
109 to be able to roll back one transaction, then its predecessor, this is
110 not the behaviour you will get.
111 \interaction{rollback.twice}
112 Once you've rolled back one transaction in a repository, you can't
113 roll back again in that repository until you perform another commit or
114 pull.
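The one-slot behaviour described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the \texttt{Repo} class and its method names are hypothetical, and it is not how Mercurial actually stores its transaction log. It merely remembers where history ended before the most recent transaction, and forgets that mark once it has been used.

```python
class Repo:
    """Toy repository that remembers only its most recent transaction.

    Hypothetical model for illustration; not Mercurial's implementation.
    """

    def __init__(self):
        self.changesets = []       # committed changesets, oldest first
        self._undo_length = None   # history length before the last transaction

    def transaction(self, new_changesets):
        """Append changesets as one transaction (a commit adds one, a pull may add many)."""
        self._undo_length = len(self.changesets)
        self.changesets.extend(new_changesets)

    def rollback(self):
        """Undo the last transaction; return False if there is nothing to undo."""
        if self._undo_length is None:
            return False
        del self.changesets[self._undo_length:]
        self._undo_length = None   # the single slot is now empty
        return True


repo = Repo()
repo.transaction(["first"])               # a commit
repo.transaction(["second", "third"])     # e.g. a pull of two changesets
first_rollback = repo.rollback()          # removes "second" and "third" together
second_rollback = repo.rollback()         # nothing left to undo
```

In this model, the second call to \texttt{rollback} has no effect, and a further rollback becomes possible only after another call to \texttt{transaction}.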
116 \section{Reverting the mistaken change}
118 If you make a modification to a file, and decide that you really
119 didn't want to change the file at all, and you haven't yet committed
120 your changes, the \hgcmd{revert} command is the one you'll need. It
121 looks at the changeset that's the parent of the working directory, and
122 restores the contents of the file to their state as of that changeset.
123 (That's a long-winded way of saying that, in the normal case, it
124 undoes your modifications.)
126 Let's illustrate how the \hgcmd{revert} command works with yet another
127 small example. We'll begin by modifying a file that Mercurial is
128 already tracking.
129 \interaction{daily.revert.modify}
130 If we don't want that change, we can simply \hgcmd{revert} the file.
131 \interaction{daily.revert.unmodify}
132 The \hgcmd{revert} command provides us with an extra degree of safety
133 by saving our modified file with a \filename{.orig} extension.
134 \interaction{daily.revert.status}
136 Here is a summary of the cases that the \hgcmd{revert} command can
137 deal with. We will describe each of these in more detail in the
138 section that follows.
139 \begin{itemize}
140 \item If you modify a file, it will restore the file to its unmodified
141 state.
142 \item If you \hgcmd{add} a file, it will undo the ``added'' state of
143 the file, but leave the file itself untouched.
144 \item If you delete a file without telling Mercurial, it will restore
145 the file to its unmodified contents.
146 \item If you use the \hgcmd{remove} command to remove a file, it will
147 undo the ``removed'' state of the file, and restore the file to its
148 unmodified contents.
149 \end{itemize}
151 \subsection{File management errors}
152 \label{sec:undo:mgmt}
154 The \hgcmd{revert} command is useful for more than just modified
155 files. It lets you reverse the results of all of Mercurial's file
156 management commands---\hgcmd{add}, \hgcmd{remove}, and so on.
158 If you \hgcmd{add} a file, then decide that in fact you don't want
159 Mercurial to track it, use \hgcmd{revert} to undo the add. Don't
160 worry; Mercurial will not modify the file in any way. It will just
161 ``unmark'' the file.
162 \interaction{daily.revert.add}
164 Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use
165 \hgcmd{revert} to restore it to the contents it had as of the parent
166 of the working directory.
167 \interaction{daily.revert.remove}
168 This works just as well for a file that you deleted by hand, without
169 telling Mercurial (recall that in Mercurial terminology, this kind of
170 file is called ``missing'').
171 \interaction{daily.revert.missing}
173 If you revert a \hgcmd{copy}, the copied-to file remains in your
174 working directory afterwards, untracked. Since a copy doesn't affect
175 the copied-from file in any way, Mercurial doesn't do anything with
176 the copied-from file.
177 \interaction{daily.revert.copy}
179 \subsubsection{A slightly special case: reverting a rename}
181 If you \hgcmd{rename} a file, there is one small detail that
182 you should remember. When you \hgcmd{revert} a rename, it's not
183 enough to provide the name of the renamed-to file, as you can see
184 here.
185 \interaction{daily.revert.rename}
186 As you can see from the output of \hgcmd{status}, the renamed-to file
187 is no longer identified as added, but the renamed-\emph{from} file is
188 still removed! This is counter-intuitive (at least to me), but
189 it's easy to deal with.
190 \interaction{daily.revert.rename-orig}
191 So remember, to revert a \hgcmd{rename}, you must provide \emph{both}
192 the source and destination names.
194 (By the way, if you rename a file, then modify the renamed-to file,
195 then revert both components of the rename, when Mercurial restores the
196 file that was removed as part of the rename, it will be unmodified.
197 If you need the modifications in the renamed-to file to show up in the
198 renamed-from file, don't forget to copy them over.)
200 These fiddly aspects of reverting a rename arguably constitute a small
201 bug in Mercurial.
203 \section{Dealing with committed changes}
205 Consider a case where you have committed a change $a$, and another
206 change $b$ on top of it; you then realise that change $a$ was
207 incorrect. Mercurial lets you ``back out'' an entire changeset
208 automatically, and provides building blocks that let you reverse part
209 of a changeset by hand.
211 Before you read this section, here's something to keep in mind: the
212 \hgcmd{backout} command undoes changes by \emph{adding} history, not
213 by modifying or erasing it. It's the right tool to use if you're
214 fixing bugs, but not if you're trying to undo some change that has
215 catastrophic consequences. To deal with those, see
216 section~\ref{sec:undo:aaaiiieee}.
218 \subsection{Backing out a changeset}
220 The \hgcmd{backout} command lets you ``undo'' the effects of an entire
221 changeset in an automated fashion. Because Mercurial's history is
222 immutable, this command \emph{does not} get rid of the changeset you
223 want to undo. Instead, it creates a new changeset that
224 \emph{reverses} the effect of the to-be-undone changeset.
226 The operation of the \hgcmd{backout} command is a little intricate, so
227 let's illustrate it with some examples. First, we'll create a
228 repository with some simple changes.
229 \interaction{backout.init}
231 The \hgcmd{backout} command takes a single changeset ID as its
232 argument; this is the changeset to back out. Normally,
233 \hgcmd{backout} will drop you into a text editor to write a commit
234 message, so you can record why you're backing the change out. In this
235 example, we provide a commit message on the command line using the
236 \hgopt{backout}{-m} option.
238 \subsection{Backing out the tip changeset}
240 We're going to start by backing out the last changeset we committed.
241 \interaction{backout.simple}
242 You can see that the second line from \filename{myfile} is no longer
243 present. Taking a look at the output of \hgcmd{log} gives us an idea
244 of what the \hgcmd{backout} command has done.
245 \interaction{backout.simple.log}
246 Notice that the new changeset that \hgcmd{backout} has created is a
247 child of the changeset we backed out. It's easier to see this in
248 figure~\ref{fig:undo:backout}, which presents a graphical view of the
249 change history. As you can see, the history is nice and linear.
251 \begin{figure}[htb]
252 \centering
253 \grafix{undo-simple}
254 \caption{Backing out a change using the \hgcmd{backout} command}
255 \label{fig:undo:backout}
256 \end{figure}
258 \subsection{Backing out a non-tip change}
260 If you want to back out a change other than the last one you
261 committed, pass the \hgopt{backout}{--merge} option to the
262 \hgcmd{backout} command.
263 \interaction{backout.non-tip.clone}
264 This makes backing out any changeset a ``one-shot'' operation that's
265 usually simple and fast.
266 \interaction{backout.non-tip.backout}
268 If you take a look at the contents of \filename{myfile} after the
269 backout finishes, you'll see that the first and third changes are
270 present, but not the second.
271 \interaction{backout.non-tip.cat}
273 As the graphical history in figure~\ref{fig:undo:backout-non-tip}
274 illustrates, Mercurial actually commits \emph{two} changes in this
275 kind of situation (the box-shaped nodes are the ones that Mercurial
276 commits automatically). Before Mercurial begins the backout process,
277 it first remembers what the current parent of the working directory
278 is. It then backs out the target changeset, and commits that as a
279 changeset. Finally, it merges back to the previous parent of the
280 working directory, and commits the result of the merge.
282 \begin{figure}[htb]
283 \centering
284 \grafix{undo-non-tip}
285 \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}
286 \label{fig:undo:backout-non-tip}
287 \end{figure}
289 The result is that you end up ``back where you were'', only with some
290 extra history that undoes the effect of the changeset you wanted to
291 back out.
293 \subsubsection{Always use the \hgopt{backout}{--merge} option}
295 In fact, since the \hgopt{backout}{--merge} option will do the ``right
296 thing'' whether or not the changeset you're backing out is the tip
297 (i.e.~it won't try to merge if it's backing out the tip, since there's
298 no need), you should \emph{always} use this option when you run the
299 \hgcmd{backout} command.
301 \subsection{Gaining more control of the backout process}
303 While I've recommended that you always use the
304 \hgopt{backout}{--merge} option when backing out a change, the
305 \hgcmd{backout} command lets you decide how to merge a backout
306 changeset. Taking control of the backout process by hand is something
307 you will rarely need to do, but it can be useful to understand what
308 the \hgcmd{backout} command is doing for you automatically. To
309 illustrate this, let's clone our first repository, but omit the
310 backout change that it contains.
312 \interaction{backout.manual.clone}
313 As with our earlier example, we'll commit a third changeset, then back
314 out its parent, and see what happens.
315 \interaction{backout.manual.backout}
316 Our new changeset is again a descendant of the changeset we backed
317 out; it's thus a new head, \emph{not} a descendant of the changeset
318 that was the tip. The \hgcmd{backout} command was quite explicit in
319 telling us this.
320 \interaction{backout.manual.log}
322 Again, it's easier to see what has happened by looking at a graph of
323 the revision history, in figure~\ref{fig:undo:backout-manual}. This
324 makes it clear that when we use \hgcmd{backout} to back out a change
325 other than the tip, Mercurial adds a new head to the repository (the
326 change it committed is box-shaped).
328 \begin{figure}[htb]
329 \centering
330 \grafix{undo-manual}
331 \caption{Backing out a change using the \hgcmd{backout} command}
332 \label{fig:undo:backout-manual}
333 \end{figure}
335 After the \hgcmd{backout} command has completed, it leaves the new
336 ``backout'' changeset as the parent of the working directory.
337 \interaction{backout.manual.parents}
338 Now we have two isolated sets of changes.
339 \interaction{backout.manual.heads}
341 Let's think about what we expect to see as the contents of
342 \filename{myfile} now. The first change should be present, because
343 we've never backed it out. The second change should be missing, as
344 that's the change we backed out. Since the history graph shows the
345 third change as a separate head, we \emph{don't} expect to see the
346 third change present in \filename{myfile}.
347 \interaction{backout.manual.cat}
348 To get the third change back into the file, we just do a normal merge
349 of our two heads.
350 \interaction{backout.manual.merge}
351 Afterwards, the graphical history of our repository looks like
352 figure~\ref{fig:undo:backout-manual-merge}.
354 \begin{figure}[htb]
355 \centering
356 \grafix{undo-manual-merge}
357 \caption{Manually merging a backout change}
358 \label{fig:undo:backout-manual-merge}
359 \end{figure}
361 \subsection{Why \hgcmd{backout} works as it does}
363 Here's a brief description of how the \hgcmd{backout} command works.
364 \begin{enumerate}
365 \item It ensures that the working directory is ``clean'', i.e.~that
366 the output of \hgcmd{status} would be empty.
367 \item It remembers the current parent of the working directory. Let's
368 call this changeset \texttt{orig}.
369 \item It does the equivalent of a \hgcmd{update} to sync the working
370 directory to the changeset you want to back out. Let's call this
371 changeset \texttt{backout}.
372 \item It finds the parent of that changeset. Let's call that
373 changeset \texttt{parent}.
374 \item For each file that the \texttt{backout} changeset affected, it
375 does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,
376 to restore it to the contents it had before that changeset was
377 committed.
378 \item It commits the result as a new changeset. This changeset has
379 \texttt{backout} as its parent.
380 \item If you specify \hgopt{backout}{--merge} on the command line, it
381 merges with \texttt{orig}, and commits the result of the merge.
382 \end{enumerate}
384 An alternative way to implement the \hgcmd{backout} command would be
385 to \hgcmd{export} the to-be-backed-out changeset as a diff, then use
386 the \cmdopt{patch}{--reverse} option to the \command{patch} command to
387 reverse the effect of the change without fiddling with the working
388 directory. This sounds much simpler, but it would not work nearly as
389 well.
391 The reason that \hgcmd{backout} does an update, a commit, a merge, and
392 another commit is to give the merge machinery the best chance to do a
393 good job when dealing with all the changes \emph{between} the change
394 you're backing out and the current tip.
396 If you're backing out a changeset that's~100 revisions back in your
397 project's history, the chances that the \command{patch} command will
398 be able to apply a reverse diff cleanly are not good, because
399 intervening changes are likely to have ``broken the context'' that
400 \command{patch} uses to determine whether it can apply a patch (if
401 this sounds like gibberish, see section~\ref{sec:mq:patch} for a
402 discussion of the \command{patch} command). Also, Mercurial's merge
403 machinery will handle files and directories being renamed, permission
404 changes, and modifications to binary files, none of which
405 \command{patch} can deal with.
407 \section{Changes that should never have been}
408 \label{sec:undo:aaaiiieee}
410 Most of the time, the \hgcmd{backout} command is exactly what you need
411 if you want to undo the effects of a change. It leaves a permanent
412 record of exactly what you did, both when committing the original
413 changeset and when you cleaned up after it.
415 On rare occasions, though, you may find that you've committed a change
416 that really should not be present in the repository at all. For
417 example, it would be very unusual, and usually considered a mistake,
418 to commit a software project's object files as well as its source
419 files. Object files have almost no intrinsic value, and they're
420 \emph{big}, so they increase the size of the repository and the amount
421 of time it takes to clone or pull changes.
423 Before I discuss the options that you have if you commit a ``brown
424 paper bag'' change (the kind that's so bad that you want to pull a
425 brown paper bag over your head), let me first discuss some approaches
426 that probably won't work.
428 Since Mercurial treats history as accumulative---every change builds
429 on top of all changes that preceded it---you generally can't just make
430 disastrous changes disappear. The one exception is when you've just
431 committed a change, and it hasn't been pushed or pulled into another
432 repository. That's when you can safely use the \hgcmd{rollback}
433 command, as I detailed in section~\ref{sec:undo:rollback}.
435 After you've pushed a bad change to another repository, you
436 \emph{could} still use \hgcmd{rollback} to make your local copy of the
437 change disappear, but it won't have the consequences you want. The
438 change will still be present in the remote repository, so it will
439 reappear in your local repository the next time you pull.
441 If a situation like this arises, and you know which repositories your
442 bad change has propagated into, you can \emph{try} to get rid of the
443 change from \emph{every} one of those repositories. This is, of
444 course, not a satisfactory solution: if you miss even a single
445 repository while you're expunging, the change is still ``in the
446 wild'', and could propagate further.
448 If you've committed one or more changes \emph{after} the change that
449 you'd like to see disappear, your options are further reduced.
450 Mercurial doesn't provide a way to ``punch a hole'' in history,
451 leaving the surrounding changesets intact.
453 XXX This needs filling out. The \texttt{hg-replay} script in the
454 \texttt{examples} directory works, but doesn't handle merge
455 changesets. Kind of an important omission.
457 \section{Finding the source of a bug}
459 While it's all very well to be able to back out a changeset that
460 introduced a bug, this requires that you know which changeset to back
461 out. Mercurial provides an invaluable extension, called
462 \hgext{bisect}, that helps you to automate this process and accomplish
463 it very efficiently.
465 The idea behind the \hgext{bisect} extension is that a changeset has
466 introduced some change of behaviour that you can identify with a
467 simple binary test. You don't know which piece of code introduced the
468 change, but you know how to test for the presence of the bug. The
469 \hgext{bisect} extension uses your test to direct its search for the
470 changeset that introduced the code that caused the bug.
472 Here are a few scenarios to help you understand how you might apply this
473 extension.
474 \begin{itemize}
475 \item The most recent version of your software has a bug that you
476 remember wasn't present a few weeks ago, but you don't know when it
477 was introduced. Here, your binary test checks for the presence of
478 that bug.
479 \item You fixed a bug in a rush, and now it's time to close the entry
480 in your team's bug database. The bug database requires a changeset
481 ID when you close an entry, but you don't remember which changeset
482 you fixed the bug in. Once again, your binary test checks for the
483 presence of the bug.
484 \item Your software works correctly, but runs~15\% slower than the
485 last time you measured it. You want to know which changeset
486 introduced the performance regression. In this case, your binary
487 test measures the performance of your software, to see whether it's
488 ``fast'' or ``slow''.
489 \item The sizes of the components of your project that you ship
490 exploded recently, and you suspect that something changed in the way
491 you build your project.
492 \end{itemize}
494 From these examples, it should be clear that the \hgext{bisect}
495 extension is useful not only for finding the sources of bugs. You can
496 use it to find any ``emergent property'' of a repository (anything
497 that you can't find from a simple text search of the files in the
498 tree) for which you can write a binary test.
500 We'll introduce a little bit of terminology here, just to make it
501 clear which parts of the search process are your responsibility, and
502 which are Mercurial's. A \emph{test} is something that \emph{you} run
503 when \hgext{bisect} chooses a changeset. A \emph{probe} is what
504 \hgext{bisect} runs to tell whether a revision is good. Finally,
505 we'll use the word ``bisect'', as both a noun and a verb, to stand in
506 for the phrase ``search using the \hgext{bisect} extension''.
508 One simple way to automate the searching process would be simply to
509 probe every changeset. However, this scales poorly. If it took ten
510 minutes to test a single changeset, and you had 10,000 changesets in
511 your repository, the exhaustive approach would take on average~35
512 \emph{days} to find the changeset that introduced a bug. Even if you
513 knew that the bug was introduced by one of the last 500 changesets,
514 and limited your search to those, you'd still be looking at over 40
515 hours to find the changeset that introduced your bug.
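The arithmetic behind these estimates is short. It assumes ten minutes per test, as above, and that on average half of the candidate changesets must be tested before the culprit turns up.
\[
\frac{10{,}000}{2} \times 10~\mathrm{min} = 50{,}000~\mathrm{min} \approx 833~\mathrm{hours} \approx 35~\mathrm{days}
\]
\[
\frac{500}{2} \times 10~\mathrm{min} = 2{,}500~\mathrm{min} \approx 42~\mathrm{hours}
\]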
517 What the \hgext{bisect} extension does is use its knowledge of the
518 ``shape'' of your project's revision history to perform a search in
519 time proportional to the \emph{logarithm} of the number of changesets
520 to check (the kind of search it performs is called a dichotomic
521 search). With this approach, searching through 10,000 changesets will
522 take about 14 tests, or less than two and a half hours at ten minutes
523 per test. Limit your search to the last 500 changesets, and it will
523 take about nine tests, or roughly an hour and a half.
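To check estimates like these for a repository of your own, the test counts can be computed directly. The sketch below assumes a strictly linear history, where halving the candidate range on every test gives a worst case of $\lceil \log_2 n \rceil$ tests. Real histories with branches and merges can differ, and the function names are hypothetical.

```python
import math

MINUTES_PER_TEST = 10  # the per-test cost assumed in the text


def bisection_tests(changesets):
    """Worst-case number of tests to isolate one changeset in a linear history."""
    return math.ceil(math.log2(changesets))


def exhaustive_tests(changesets):
    """Average number of tests when probing changesets one at a time."""
    return changesets // 2


tests_10000 = bisection_tests(10_000)
tests_500 = bisection_tests(500)
minutes_10000 = tests_10000 * MINUTES_PER_TEST
minutes_500 = tests_500 * MINUTES_PER_TEST
```

Comparing \texttt{bisection\_tests(n)} with \texttt{exhaustive\_tests(n)} for growing values of \texttt{n} shows how quickly the gap between the two approaches widens.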
525 The \hgext{bisect} extension is aware of the ``branchy'' nature of a
526 Mercurial project's revision history, so it has no problems dealing
527 with branches, merges, or multiple heads in a repository. It can
528 prune entire branches of history with a single probe, which is how it
529 operates so efficiently.
531 \subsection{Using the \hgext{bisect} extension}
533 Here's an example of \hgext{bisect} in action. To keep the core of
534 Mercurial simple, \hgext{bisect} is packaged as an extension; this
535 means that it won't be present unless you explicitly enable it. To do
536 this, edit your \hgrc\ and add the following section header (if it's
537 not already present):
538 \begin{codesample2}
539 [extensions]
540 \end{codesample2}
541 Then add a line to this section to enable the extension:
542 \begin{codesample2}
543 hbisect =
544 \end{codesample2}
545 \begin{note}
546 That's right, there's a ``\texttt{h}'' at the front of the name of
547 the \hgext{bisect} extension. The reason is that Mercurial is
548 written in Python, and uses a standard Python package called
549 \texttt{bisect}. If you omit the ``\texttt{h}'' from the name
550 ``\texttt{hbisect}'', Mercurial will erroneously find the standard
551 Python \texttt{bisect} package, and try to use it as a Mercurial
552 extension. This won't work, and Mercurial will crash repeatedly
553 until you fix the spelling in your \hgrc. Ugh.
554 \end{note}
556 Now let's create a repository, so that we can try out the
557 \hgext{bisect} extension in isolation.
558 \interaction{bisect.init}
559 We'll simulate a project that has a bug in it in a simple-minded way:
560 create trivial changes in a loop, and nominate one specific change
561 that will have the ``bug''. This loop creates 50 changesets, each
562 adding a single file to the repository. We'll represent our ``bug''
563 with a file that contains the text ``i have a gub''.
564 \interaction{bisect.commits}
566 The next thing that we'd like to do is figure out how to use the
567 \hgext{bisect} extension. We can use Mercurial's normal built-in help
568 mechanism for this.
569 \interaction{bisect.help}
571 The \hgext{bisect} extension works in steps. Each step proceeds as follows.
572 \begin{enumerate}
573 \item You run your binary test.
574 \begin{itemize}
575 \item If the test succeeded, you tell \hgext{bisect} by running the
576 \hgcmdargs{bisect}{good} command.
577 \item If it failed, use the \hgcmdargs{bisect}{bad} command to let
578 the \hgext{bisect} extension know.
579 \end{itemize}
580 \item The extension uses your information to decide which changeset to
581 test next.
582 \item It updates the working directory to that changeset, and the
583 process begins again.
584 \end{enumerate}
585 The process ends when \hgext{bisect} identifies a unique changeset
586 that marks the point where your test transitioned from ``succeeding''
587 to ``failing''.
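The loop described above can be sketched in a few lines of Python. This is a simplified illustration, not the extension's actual code: it assumes a strictly linear history and ignores branches and merges. The name \texttt{find\_first\_bad} and the revision numbers are hypothetical.

```python
def find_first_bad(revisions, is_bad):
    """Find the first bad revision in a linear history.

    `revisions` is ordered oldest to newest; revisions[0] is known to be
    good and revisions[-1] is known to be bad. `is_bad` is the binary test.
    Returns (first_bad_revision, number_of_tests_run).
    """
    good, bad = 0, len(revisions) - 1    # indices bracketing the search
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2       # next changeset to test
        tests += 1
        if is_bad(revisions[middle]):
            bad = middle                 # culprit is at or before middle
        else:
            good = middle                # culprit is after middle
    return revisions[bad], tests


# Hypothetical history: revisions 10..49, with the "bug" introduced in 37.
history = list(range(10, 50))
culprit, tests_run = find_first_bad(history, lambda rev: rev >= 37)
```

Each pass through the loop corresponds to one step in the list above: run the test, report the result, and let the search pick the next changeset.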
589 To start the search, we must run the \hgcmdargs{bisect}{init} command.
590 \interaction{bisect.search.init}
592 In our case, the binary test we use is simple: we check to see if any
593 file in the repository contains the string ``i have a gub''. If it
594 does, this changeset contains the change that ``caused the bug''. By
595 convention, a changeset that has the property we're searching for is
596 ``bad'', while one that doesn't is ``good''.
598 Most of the time, the revision to which the working directory is
599 synced (usually the tip) already exhibits the problem introduced by
600 the buggy change, so we'll mark it as ``bad''.
601 \interaction{bisect.search.bad-init}
603 Our next task is to nominate a changeset that we know \emph{doesn't}
604 have the bug; the \hgext{bisect} extension will ``bracket'' its search
605 between the first pair of good and bad changesets. In our case, we
606 know that revision~10 didn't have the bug. (I'll have more words
607 about choosing the first ``good'' changeset later.)
608 \interaction{bisect.search.good-init}
610 Notice that this command printed some output.
611 \begin{itemize}
612 \item It told us how many changesets it must consider before it can
613 identify the one that introduced the bug, and how many tests that
614 will require.
615 \item It updated the working directory to the next changeset to test,
616 and told us which changeset it's testing.
617 \end{itemize}
619 We now run our test in the working directory. We use the
620 \command{grep} command to see if our ``bad'' file is present in the
621 working directory. If it is, this revision is bad; if not, this
622 revision is good.
623 \interaction{bisect.search.step1}
625 This test looks like a perfect candidate for automation, so let's turn
626 it into a shell function.
627 \interaction{bisect.search.mytest}
628 We can now run an entire test step with a single command,
629 \texttt{mytest}.
630 \interaction{bisect.search.step2}
631 A few more invocations of our canned test step command, and we're
632 done.
633 \interaction{bisect.search.rest}
635 Even though we had~40 changesets to search through, the \hgext{bisect}
636 extension let us find the changeset that introduced our ``bug'' with
637 only five tests. Because the number of tests that the \hgext{bisect}
638 extension performs grows logarithmically with the number of changesets to
639 search, the advantage that it has over the ``brute force'' search
640 approach increases with every changeset you add.
642 \subsection{Cleaning up after your search}
644 When you're finished using the \hgext{bisect} extension in a
645 repository, you can use the \hgcmdargs{bisect}{reset} command to drop
646 the information it was using to drive your search. The extension
647 doesn't use much space, so it doesn't matter if you forget to run this
648 command. However, \hgext{bisect} won't let you start a new search in
649 that repository until you do a \hgcmdargs{bisect}{reset}.
650 \interaction{bisect.search.reset}
652 \section{Tips for finding bugs effectively}
654 \subsection{Give consistent input}
656 The \hgext{bisect} extension requires that you correctly report the
657 result of every test you perform. If you tell it that a test failed
658 when it really succeeded, it \emph{might} be able to detect the
659 inconsistency. If it can identify an inconsistency in your reports,
660 it will tell you that a particular changeset is both good and bad.
661 However, it can't do this perfectly; it's about as likely to report
662 the wrong changeset as the source of the bug, without any warning.
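
To see how a single wrong report can go unnoticed, here is a minimal
Python sketch.  It is not the \hgext{bisect} extension's code: it
assumes a linear history numbered 0 to~40, a hypothetical first bad
revision of~23, and a plain binary search.  The names
\texttt{search}, \texttt{honest} and \texttt{one\_slip} are made up
for this illustration.

```python
TRUE_FIRST_BAD = 23  # hypothetical: the revision that really introduced the bug


def search(good, bad, report):
    """Binary search over a linear history.

    `good` is a revision known to be good, `bad` one known to be bad,
    and `report(rev)` returns True when the tester calls `rev` bad.
    Returns the revision the search blames.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if report(mid):
            bad = mid
        else:
            good = mid
    return bad


def honest(rev):
    # Always reports the truth.
    return rev >= TRUE_FIRST_BAD


def one_slip(rev):
    # Misreports revision 30 as good, although it is really bad.
    return False if rev == 30 else honest(rev)


print(search(0, 40, honest))
print(search(0, 40, one_slip))
```

With honest reports the search ends at revision~23.  With the one slip
at revision~30 it ends at revision~31, and nothing in the sequence of
answers looks inconsistent, so no tool could have flagged the mistake.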
664 \subsection{Automate as much as possible}
666 When I started using the \hgext{bisect} extension, I tried a few times
667 to run my tests by hand, on the command line. This is an approach
668 that I, at least, am not suited to. After a few tries, I found that I
669 was making enough mistakes that I was having to restart my searches
670 several times before finally getting correct results.
672 My initial problems with driving the \hgext{bisect} extension by hand
673 occurred even with simple searches on small repositories; if the
674 problem you're looking for is more subtle, or the number of tests that
675 \hgext{bisect} must perform increases, the likelihood of operator
676 error ruining the search is much higher. Once I started automating my
677 tests, I had much better results.
679 The key to automated testing is twofold:
680 \begin{itemize}
681 \item always test for the same symptom, and
682 \item always feed consistent input to the \hgcmd{bisect} command.
683 \end{itemize}
684 In my tutorial example above, the \command{grep} command tests for the
685 symptom, and the \texttt{if} statement takes the result of this check
686 and ensures that we always feed the same input to the \hgcmd{bisect}
687 command. The \texttt{mytest} function marries these together in a
688 reproducible way, so that every test is uniform and consistent.
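
The same two rules can be sketched in Python.  This is only an
illustration, not the \texttt{mytest} function from the tutorial: it
assumes the symptom is a marker string appearing in some file of the
working directory, and the name \texttt{verdict} and the marker
\texttt{BUG-MARKER} are hypothetical.  The point is that one function
performs the symptom check and maps it to a verdict, so every step
answers the same question in the same way.

```python
import tempfile
from pathlib import Path


def verdict(workdir, marker):
    """Return "bad" if any file directly inside `workdir` contains
    `marker`, and "good" otherwise."""
    for path in sorted(Path(workdir).iterdir()):
        if path.is_file() and marker in path.read_text(errors="ignore"):
            return "bad"
    return "good"


# A throwaway directory stands in for the working directory.
with tempfile.TemporaryDirectory() as workdir:
    print(verdict(workdir, "BUG-MARKER"))   # no files yet
    Path(workdir, "a.txt").write_text("BUG-MARKER\n")
    print(verdict(workdir, "BUG-MARKER"))   # marker now present
```

The string that \texttt{verdict} returns is what you would then pass
on to the \hgcmd{bisect} command, instead of typing the answer by hand.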
690 \subsection{Check your results}
692 Because the output of a \hgext{bisect} search is only as good as the
693 input you give it, don't take the changeset it reports as the
694 absolute truth. A simple way to cross-check its report is to manually
695 run your test at each of the following changesets:
696 \begin{itemize}
697 \item The changeset that it reports as the first bad revision. Your
698 test should still report this as bad.
699 \item The parent of that changeset (either parent, if it's a merge).
700 Your test should report this changeset as good.
701 \item A child of that changeset. Your test should report this
702 changeset as bad.
703 \end{itemize}
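
The three checks above can be written down as a small Python sketch.
It assumes you already have a function that runs your test at a given
revision and returns true when the bug is present; the names
\texttt{cross\_check} and \texttt{is\_bad}, and the revision numbers
used, are hypothetical.

```python
def cross_check(reported, parent, child, is_bad):
    """Return True if the reported first bad revision passes all three
    checks: it is bad, its parent is good, and its child is bad."""
    return is_bad(reported) and not is_bad(parent) and is_bad(child)


def is_bad(rev):
    # Stand-in for a real test: here the bug truly starts at revision 23.
    return rev >= 23


print(cross_check(23, 22, 24, is_bad))  # a correct report
print(cross_check(31, 30, 32, is_bad))  # a wrong report: the parent is bad too
```

A report that fails any of the three checks is a sign that one of the
answers given during the search was wrong, and that the search should
be repeated.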
705 \subsection{Beware interference between bugs}
707 It's possible that your search for one bug could be disrupted by the
708 presence of another. For example, let's say your software crashes at
709 revision 100, and worked correctly at revision 50. Unknown to you,
710 someone else introduced a different crashing bug at revision 60, and
711 fixed it at revision 80. This could distort your results in one of
712 several ways.
714 It is possible that this other bug completely ``masks'' yours, which
715 is to say that it occurs before your bug has a chance to manifest
716 itself. If you can't avoid that other bug (for example, it prevents
717 your project from building), and so can't tell whether your bug is
718 present in a particular changeset, the \hgext{bisect} extension cannot
719 help you directly. Instead, you'll need to manually avoid the
720 changesets where that bug is present, and do separate searches
721 ``around'' it.
723 A different problem could arise if your test for a bug's presence is
724 not specific enough. If you check for ``my program crashes'', then
725 both your crashing bug and an unrelated crashing bug that masks it
726 will look like the same thing, and mislead \hgext{bisect}.
728 \subsection{Bracket your search lazily}
730 Choosing the first ``good'' and ``bad'' changesets that will mark the
731 end points of your search is often easy, but it bears a little
732 discussion nevertheless. From the perspective of \hgext{bisect}, the
733 ``newest'' changeset is conventionally ``bad'', and the older
734 changeset is ``good''.
736 If you're having trouble remembering which changeset was a suitable
737 ``good'' one to give to \hgext{bisect}, you could do worse than
738 testing changesets at random. Just remember to eliminate contenders
739 that can't possibly exhibit the bug (perhaps because the feature with
740 the bug isn't present yet) and those where another problem masks the
741 bug (as I discussed above).
743 Even if you end up ``early'' by thousands of changesets or months of
744 history, you will only add a handful of tests to the total number that
745 \hgext{bisect} must perform, thanks to its logarithmic behaviour.
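
As a rough worked example, suppose the range you really needed to
search holds $n$ changesets, but your cautious choice of ``good''
changeset makes the range $k$ times larger.  Assuming a linear history
and about $\log_2$ of the range size in tests (the symbols $n$ and $k$
are mine, for illustration), the extra cost is
\[
  \log_2 (k n) - \log_2 n = \log_2 k .
\]
Bracketing a range a thousand times larger than necessary therefore
costs about $\log_2 1000 \approx 10$ additional tests, however large
$n$ was to begin with.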
747 %%% Local Variables:
748 %%% mode: latex
749 %%% TeX-master: "00book"
750 %%% End: