hgbook

view en/undo.tex @ 432:5da084395a69

translated some other paragrpahs
author jerojasro@localhost
date Sun Nov 30 18:55:34 2008 -0500 (2008-11-30)
1 \chapter{Finding and fixing your mistakes}
2 \label{chap:undo}
4 To err might be human, but to really handle the consequences well
5 takes a top-notch revision control system. In this chapter, we'll
6 discuss some of the techniques you can use when you find that a
7 problem has crept into your project. Mercurial has some highly
8 capable features that will help you to isolate the sources of
9 problems, and to handle them appropriately.
11 \section{Erasing local history}
13 \subsection{The accidental commit}
15 I have the occasional but persistent problem of typing rather more
16 quickly than I can think, which sometimes results in me committing a
17 changeset that is either incomplete or plain wrong. In my case, the
18 usual kind of incomplete changeset is one in which I've created a new
19 source file, but forgotten to \hgcmd{add} it. A ``plain wrong''
20 changeset is not as common, but no less annoying.
22 \subsection{Rolling back a transaction}
23 \label{sec:undo:rollback}
25 In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats
26 each modification of a repository as a \emph{transaction}. Every time
27 you commit a changeset or pull changes from another repository,
28 Mercurial remembers what you did. You can undo, or \emph{roll back},
29 exactly one of these actions using the \hgcmd{rollback} command. (See
30 section~\ref{sec:undo:rollback-after-push} for an important caveat
31 about the use of this command.)
33 Here's a mistake that I often find myself making: committing a change
34 in which I've created a new file, but forgotten to \hgcmd{add} it.
35 \interaction{rollback.commit}
36 Looking at the output of \hgcmd{status} after the commit immediately
37 confirms the error.
38 \interaction{rollback.status}
39 The commit captured the changes to the file \filename{a}, but not the
40 new file \filename{b}. If I were to push this changeset to a
41 repository that I shared with a colleague, the chances are high that
42 something in \filename{a} would refer to \filename{b}, which would not
43 be present in their repository when they pulled my changes. I would
44 thus become the object of some indignation.
46 However, luck is with me---I've caught my error before I pushed the
47 changeset. I use the \hgcmd{rollback} command, and Mercurial makes
48 that last changeset vanish.
49 \interaction{rollback.rollback}
50 Notice that the changeset is no longer present in the repository's
51 history, and the working directory once again thinks that the file
52 \filename{a} is modified. The commit and rollback have left the
53 working directory exactly as it was prior to the commit; the changeset
54 has been completely erased. I can now safely \hgcmd{add} the file
55 \filename{b}, and rerun my commit.
56 \interaction{rollback.add}
58 \subsection{The erroneous pull}
60 It's common practice with Mercurial to maintain separate development
61 branches of a project in different repositories. Your development
62 team might have one shared repository for your project's ``0.9''
63 release, and another, containing different changes, for the ``1.0''
64 release.
66 Given this, you can imagine that the consequences could be messy if
67 you had a local ``0.9'' repository, and accidentally pulled changes
68 from the shared ``1.0'' repository into it. At worst, you could be
69 paying insufficient attention, and push those changes into the shared
70 ``0.9'' tree, confusing your entire team (but don't worry, we'll
71 return to this horror scenario later). However, it's more likely that
72 you'll notice immediately, because Mercurial will display the URL it's
73 pulling from, or you will see it pull a suspiciously large number of
74 changes into the repository.
76 The \hgcmd{rollback} command will work nicely to expunge all of the
77 changesets that you just pulled. Mercurial groups all changes from
78 one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is
79 all you need to undo this mistake.
81 \subsection{Rolling back is useless once you've pushed}
82 \label{sec:undo:rollback-after-push}
84 The value of the \hgcmd{rollback} command drops to zero once you've
85 pushed your changes to another repository. Rolling back a change
86 makes it disappear entirely, but \emph{only} in the repository in
87 which you perform the \hgcmd{rollback}. Because a rollback eliminates
88 history, there's no way for the disappearance of a change to propagate
89 between repositories.
91 If you've pushed a change to another repository---particularly if it's
92 a shared repository---it has essentially ``escaped into the wild,''
and you'll have to recover from your mistake in a different way. If
you push a changeset somewhere, then roll it back, then pull from the
repository you pushed to, the changeset will simply reappear in your
repository.
98 (If you absolutely know for sure that the change you want to roll back
99 is the most recent change in the repository that you pushed to,
100 \emph{and} you know that nobody else could have pulled it from that
repository, you can roll back the changeset there, too, but you
really should not rely on this working. If you do this,
103 sooner or later a change really will make it into a repository that
104 you don't directly control (or have forgotten about), and come back to
105 bite you.)
107 \subsection{You can only roll back once}
109 Mercurial stores exactly one transaction in its transaction log; that
110 transaction is the most recent one that occurred in the repository.
111 This means that you can only roll back one transaction. If you expect
112 to be able to roll back one transaction, then its predecessor, this is
113 not the behaviour you will get.
114 \interaction{rollback.twice}
115 Once you've rolled back one transaction in a repository, you can't
116 roll back again in that repository until you perform another commit or
117 pull.
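
To make the ``only once'' rule concrete, here is a toy model in Python.
It is only a sketch of the behaviour described above, not Mercurial's
actual implementation: the class \texttt{ToyRepo} and its methods are
names I've made up for illustration. The model keeps a single undo
slot, which every commit or pull overwrites and which a rollback
empties.

```python
class ToyRepo:
    """Toy model of a repository with a one-entry transaction log."""

    def __init__(self):
        self.changesets = []   # history, oldest first
        self.undo = None       # length of history before the last transaction

    def _transaction(self, new_changesets):
        # Each transaction overwrites the single undo slot.
        self.undo = len(self.changesets)
        self.changesets.extend(new_changesets)

    def commit(self, name):
        self._transaction([name])

    def pull(self, names):
        # Everything brought in by one pull is a single transaction.
        self._transaction(list(names))

    def rollback(self):
        if self.undo is None:
            return False       # nothing left to roll back
        del self.changesets[self.undo:]
        self.undo = None       # the slot is now empty
        return True


repo = ToyRepo()
repo.commit("a")
repo.pull(["b", "c", "d"])
print(repo.rollback(), repo.changesets)   # prints: True ['a']
print(repo.rollback(), repo.changesets)   # prints: False ['a']
```

In this model, a pull of several changesets counts as one transaction,
so a single rollback removes all of them, and a second rollback in a
row has nothing left to undo.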
119 \section{Reverting the mistaken change}
121 If you make a modification to a file, and decide that you really
122 didn't want to change the file at all, and you haven't yet committed
123 your changes, the \hgcmd{revert} command is the one you'll need. It
124 looks at the changeset that's the parent of the working directory, and
125 restores the contents of the file to their state as of that changeset.
126 (That's a long-winded way of saying that, in the normal case, it
127 undoes your modifications.)
129 Let's illustrate how the \hgcmd{revert} command works with yet another
130 small example. We'll begin by modifying a file that Mercurial is
131 already tracking.
132 \interaction{daily.revert.modify}
133 If we don't want that change, we can simply \hgcmd{revert} the file.
134 \interaction{daily.revert.unmodify}
135 The \hgcmd{revert} command provides us with an extra degree of safety
136 by saving our modified file with a \filename{.orig} extension.
137 \interaction{daily.revert.status}
139 Here is a summary of the cases that the \hgcmd{revert} command can
140 deal with. We will describe each of these in more detail in the
141 section that follows.
142 \begin{itemize}
143 \item If you modify a file, it will restore the file to its unmodified
144 state.
145 \item If you \hgcmd{add} a file, it will undo the ``added'' state of
146 the file, but leave the file itself untouched.
147 \item If you delete a file without telling Mercurial, it will restore
148 the file to its unmodified contents.
149 \item If you use the \hgcmd{remove} command to remove a file, it will
150 undo the ``removed'' state of the file, and restore the file to its
151 unmodified contents.
152 \end{itemize}
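
The four cases above can be restated as a small Python function. This
is merely a sketch of the rules in the list, under my own simplifying
assumptions: a file is described by a state name and its contents, and
the function name \texttt{revert} and the state names are chosen for
illustration, not taken from Mercurial's code.

```python
def revert(state, working, parent):
    """Return (new_state, new_contents) for one file after a revert.

    state   -- "modified", "added", "missing" or "removed"
    working -- contents in the working directory (None if the file is absent)
    parent  -- contents in the parent changeset (None if not tracked there)
    """
    if state == "added":
        # Only the "added" mark goes away; the file itself is left alone.
        return ("untracked", working)
    if state in ("modified", "missing", "removed"):
        # The file comes back with the contents it had in the parent.
        return ("clean", parent)
    raise ValueError("state not covered by this sketch: %r" % (state,))


print(revert("modified", "new text", "old text"))  # prints: ('clean', 'old text')
print(revert("added", "brand new", None))          # prints: ('untracked', 'brand new')
print(revert("missing", None, "old text"))         # prints: ('clean', 'old text')
print(revert("removed", None, "old text"))         # prints: ('clean', 'old text')
```

The sketch leaves out the \filename{.orig} backup that
\hgcmd{revert} saves for a modified file.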
154 \subsection{File management errors}
155 \label{sec:undo:mgmt}
157 The \hgcmd{revert} command is useful for more than just modified
158 files. It lets you reverse the results of all of Mercurial's file
159 management commands---\hgcmd{add}, \hgcmd{remove}, and so on.
161 If you \hgcmd{add} a file, then decide that in fact you don't want
162 Mercurial to track it, use \hgcmd{revert} to undo the add. Don't
163 worry; Mercurial will not modify the file in any way. It will just
164 ``unmark'' the file.
165 \interaction{daily.revert.add}
167 Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use
168 \hgcmd{revert} to restore it to the contents it had as of the parent
169 of the working directory.
170 \interaction{daily.revert.remove}
171 This works just as well for a file that you deleted by hand, without
172 telling Mercurial (recall that in Mercurial terminology, this kind of
173 file is called ``missing'').
174 \interaction{daily.revert.missing}
176 If you revert a \hgcmd{copy}, the copied-to file remains in your
177 working directory afterwards, untracked. Since a copy doesn't affect
178 the copied-from file in any way, Mercurial doesn't do anything with
179 the copied-from file.
180 \interaction{daily.revert.copy}
182 \subsubsection{A slightly special case: reverting a rename}
184 If you \hgcmd{rename} a file, there is one small detail that
185 you should remember. When you \hgcmd{revert} a rename, it's not
186 enough to provide the name of the renamed-to file, as you can see
187 here.
188 \interaction{daily.revert.rename}
189 As you can see from the output of \hgcmd{status}, the renamed-to file
190 is no longer identified as added, but the renamed-\emph{from} file is
still removed! This is counter-intuitive (at least to me), but it's
easy to deal with.
193 \interaction{daily.revert.rename-orig}
194 So remember, to revert a \hgcmd{rename}, you must provide \emph{both}
195 the source and destination names.
197 % TODO: the output doesn't look like it will be removed!
199 (By the way, if you rename a file, then modify the renamed-to file,
200 then revert both components of the rename, when Mercurial restores the
201 file that was removed as part of the rename, it will be unmodified.
202 If you need the modifications in the renamed-to file to show up in the
203 renamed-from file, don't forget to copy them over.)
205 These fiddly aspects of reverting a rename arguably constitute a small
206 bug in Mercurial.
208 \section{Dealing with committed changes}
210 Consider a case where you have committed a change $a$, and another
211 change $b$ on top of it; you then realise that change $a$ was
incorrect. Mercurial lets you ``back out'' an entire changeset
automatically, and provides building blocks that let you reverse part
of a changeset by hand.
216 Before you read this section, here's something to keep in mind: the
217 \hgcmd{backout} command undoes changes by \emph{adding} history, not
218 by modifying or erasing it. It's the right tool to use if you're
219 fixing bugs, but not if you're trying to undo some change that has
220 catastrophic consequences. To deal with those, see
221 section~\ref{sec:undo:aaaiiieee}.
223 \subsection{Backing out a changeset}
225 The \hgcmd{backout} command lets you ``undo'' the effects of an entire
226 changeset in an automated fashion. Because Mercurial's history is
227 immutable, this command \emph{does not} get rid of the changeset you
228 want to undo. Instead, it creates a new changeset that
229 \emph{reverses} the effect of the to-be-undone changeset.
231 The operation of the \hgcmd{backout} command is a little intricate, so
232 let's illustrate it with some examples. First, we'll create a
233 repository with some simple changes.
234 \interaction{backout.init}
236 The \hgcmd{backout} command takes a single changeset ID as its
237 argument; this is the changeset to back out. Normally,
238 \hgcmd{backout} will drop you into a text editor to write a commit
239 message, so you can record why you're backing the change out. In this
240 example, we provide a commit message on the command line using the
241 \hgopt{backout}{-m} option.
243 \subsection{Backing out the tip changeset}
245 We're going to start by backing out the last changeset we committed.
246 \interaction{backout.simple}
247 You can see that the second line from \filename{myfile} is no longer
248 present. Taking a look at the output of \hgcmd{log} gives us an idea
249 of what the \hgcmd{backout} command has done.
250 \interaction{backout.simple.log}
251 Notice that the new changeset that \hgcmd{backout} has created is a
252 child of the changeset we backed out. It's easier to see this in
253 figure~\ref{fig:undo:backout}, which presents a graphical view of the
254 change history. As you can see, the history is nice and linear.
256 \begin{figure}[htb]
257 \centering
258 \grafix{undo-simple}
259 \caption{Backing out a change using the \hgcmd{backout} command}
260 \label{fig:undo:backout}
261 \end{figure}
263 \subsection{Backing out a non-tip change}
265 If you want to back out a change other than the last one you
266 committed, pass the \hgopt{backout}{--merge} option to the
267 \hgcmd{backout} command.
268 \interaction{backout.non-tip.clone}
269 This makes backing out any changeset a ``one-shot'' operation that's
270 usually simple and fast.
271 \interaction{backout.non-tip.backout}
273 If you take a look at the contents of \filename{myfile} after the
274 backout finishes, you'll see that the first and third changes are
275 present, but not the second.
276 \interaction{backout.non-tip.cat}
278 As the graphical history in figure~\ref{fig:undo:backout-non-tip}
279 illustrates, Mercurial actually commits \emph{two} changes in this
280 kind of situation (the box-shaped nodes are the ones that Mercurial
281 commits automatically). Before Mercurial begins the backout process,
282 it first remembers what the current parent of the working directory
283 is. It then backs out the target changeset, and commits that as a
284 changeset. Finally, it merges back to the previous parent of the
285 working directory, and commits the result of the merge.
287 % TODO: to me it looks like mercurial doesn't commit the second merge automatically!
289 \begin{figure}[htb]
290 \centering
291 \grafix{undo-non-tip}
292 \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}
293 \label{fig:undo:backout-non-tip}
294 \end{figure}
296 The result is that you end up ``back where you were'', only with some
297 extra history that undoes the effect of the changeset you wanted to
298 back out.
300 \subsubsection{Always use the \hgopt{backout}{--merge} option}
302 In fact, since the \hgopt{backout}{--merge} option will do the ``right
303 thing'' whether or not the changeset you're backing out is the tip
304 (i.e.~it won't try to merge if it's backing out the tip, since there's
305 no need), you should \emph{always} use this option when you run the
306 \hgcmd{backout} command.
308 \subsection{Gaining more control of the backout process}
310 While I've recommended that you always use the
311 \hgopt{backout}{--merge} option when backing out a change, the
312 \hgcmd{backout} command lets you decide how to merge a backout
313 changeset. Taking control of the backout process by hand is something
314 you will rarely need to do, but it can be useful to understand what
315 the \hgcmd{backout} command is doing for you automatically. To
316 illustrate this, let's clone our first repository, but omit the
317 backout change that it contains.
319 \interaction{backout.manual.clone}
As with our earlier example, we'll commit a third changeset, then back
321 out its parent, and see what happens.
322 \interaction{backout.manual.backout}
Our new changeset is again a descendant of the changeset we backed
out; it's thus a new head, \emph{not} a descendant of the changeset
325 that was the tip. The \hgcmd{backout} command was quite explicit in
326 telling us this.
327 \interaction{backout.manual.log}
329 Again, it's easier to see what has happened by looking at a graph of
330 the revision history, in figure~\ref{fig:undo:backout-manual}. This
331 makes it clear that when we use \hgcmd{backout} to back out a change
332 other than the tip, Mercurial adds a new head to the repository (the
333 change it committed is box-shaped).
335 \begin{figure}[htb]
336 \centering
337 \grafix{undo-manual}
338 \caption{Backing out a change using the \hgcmd{backout} command}
339 \label{fig:undo:backout-manual}
340 \end{figure}
342 After the \hgcmd{backout} command has completed, it leaves the new
343 ``backout'' changeset as the parent of the working directory.
344 \interaction{backout.manual.parents}
345 Now we have two isolated sets of changes.
346 \interaction{backout.manual.heads}
348 Let's think about what we expect to see as the contents of
349 \filename{myfile} now. The first change should be present, because
350 we've never backed it out. The second change should be missing, as
351 that's the change we backed out. Since the history graph shows the
352 third change as a separate head, we \emph{don't} expect to see the
353 third change present in \filename{myfile}.
354 \interaction{backout.manual.cat}
355 To get the third change back into the file, we just do a normal merge
356 of our two heads.
357 \interaction{backout.manual.merge}
358 Afterwards, the graphical history of our repository looks like
359 figure~\ref{fig:undo:backout-manual-merge}.
361 \begin{figure}[htb]
362 \centering
363 \grafix{undo-manual-merge}
364 \caption{Manually merging a backout change}
365 \label{fig:undo:backout-manual-merge}
366 \end{figure}
368 \subsection{Why \hgcmd{backout} works as it does}
370 Here's a brief description of how the \hgcmd{backout} command works.
371 \begin{enumerate}
372 \item It ensures that the working directory is ``clean'', i.e.~that
373 the output of \hgcmd{status} would be empty.
\item It remembers the current parent of the working directory. Let's
call this changeset \texttt{orig}.
\item It does the equivalent of a \hgcmd{update} to sync the working
directory to the changeset you want to back out. Let's call this
changeset \texttt{backout}.
379 \item It finds the parent of that changeset. Let's call that
380 changeset \texttt{parent}.
381 \item For each file that the \texttt{backout} changeset affected, it
382 does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,
383 to restore it to the contents it had before that changeset was
384 committed.
385 \item It commits the result as a new changeset. This changeset has
386 \texttt{backout} as its parent.
387 \item If you specify \hgopt{backout}{--merge} on the command line, it
388 merges with \texttt{orig}, and commits the result of the merge.
389 \end{enumerate}
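
The steps above can be sketched as a toy model in Python. Everything
here is an assumption made for illustration, not Mercurial's code: a
changeset is a whole snapshot mapping file names to contents, the
merge works file by file rather than line by line, and the names
\texttt{Repo}, \texttt{backout} and \texttt{merge3} are mine. Step~1 is
skipped because the model has no working directory that could be
dirty, and the sketch does not handle backing out a changeset that has
no parent.

```python
class Repo:
    """Toy history: rev number -> snapshot {filename: contents} and parents."""

    def __init__(self):
        self.snapshots = []
        self.parents = []

    def commit(self, snapshot, parents):
        self.snapshots.append(dict(snapshot))
        self.parents.append(tuple(parents))
        return len(self.snapshots) - 1


def changed_files(snapshot, parent_snapshot):
    """Names of files that differ between a changeset and its parent."""
    names = set(snapshot) | set(parent_snapshot)
    return {n for n in names if snapshot.get(n) != parent_snapshot.get(n)}


def merge3(base, local, other):
    """File-level three-way merge; raises if both sides changed a file."""
    merged = {}
    for name in set(base) | set(local) | set(other):
        b, l, o = base.get(name), local.get(name), other.get(name)
        if l == o or o == b:
            result = l          # only our side changed it (or nobody did)
        elif l == b:
            result = o          # only the other side changed it
        else:
            raise ValueError("conflict in " + name)
        if result is not None:  # None means the file does not exist
            merged[name] = result
    return merged


def backout(repo, orig, target, merge=False):
    # Step 2 is the orig argument; step 3: start from the target changeset.
    work = dict(repo.snapshots[target])
    parent = repo.parents[target][0]                      # step 4
    for name in changed_files(repo.snapshots[target],
                              repo.snapshots[parent]):    # step 5
        if name in repo.snapshots[parent]:
            work[name] = repo.snapshots[parent][name]
        else:
            del work[name]
    backout_rev = repo.commit(work, [target])             # step 6
    if not merge:
        return backout_rev
    # Step 7: merge with orig; the common ancestor is the target changeset.
    merged = merge3(repo.snapshots[target], work, repo.snapshots[orig])
    return repo.commit(merged, [backout_rev, orig])


repo = Repo()
r0 = repo.commit({"a": "first"}, [])
r1 = repo.commit({"a": "first", "b": "second"}, [r0])
r2 = repo.commit({"a": "first", "b": "second", "c": "third"}, [r1])
tip = backout(repo, orig=r2, target=r1, merge=True)
print(sorted(repo.snapshots[tip]))   # prints: ['a', 'c']
```

As in the non-tip example earlier, the first and third changes survive
and the second is gone, at the cost of two new changesets.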
391 An alternative way to implement the \hgcmd{backout} command would be
392 to \hgcmd{export} the to-be-backed-out changeset as a diff, then use
393 the \cmdopt{patch}{--reverse} option to the \command{patch} command to
394 reverse the effect of the change without fiddling with the working
395 directory. This sounds much simpler, but it would not work nearly as
396 well.
398 The reason that \hgcmd{backout} does an update, a commit, a merge, and
399 another commit is to give the merge machinery the best chance to do a
400 good job when dealing with all the changes \emph{between} the change
401 you're backing out and the current tip.
403 If you're backing out a changeset that's~100 revisions back in your
404 project's history, the chances that the \command{patch} command will
405 be able to apply a reverse diff cleanly are not good, because
406 intervening changes are likely to have ``broken the context'' that
407 \command{patch} uses to determine whether it can apply a patch (if
this sounds like gibberish, see section~\ref{sec:mq:patch} for a
409 discussion of the \command{patch} command). Also, Mercurial's merge
410 machinery will handle files and directories being renamed, permission
411 changes, and modifications to binary files, none of which
412 \command{patch} can deal with.
414 \section{Changes that should never have been}
415 \label{sec:undo:aaaiiieee}
417 Most of the time, the \hgcmd{backout} command is exactly what you need
418 if you want to undo the effects of a change. It leaves a permanent
419 record of exactly what you did, both when committing the original
420 changeset and when you cleaned up after it.
422 On rare occasions, though, you may find that you've committed a change
423 that really should not be present in the repository at all. For
424 example, it would be very unusual, and usually considered a mistake,
425 to commit a software project's object files as well as its source
426 files. Object files have almost no intrinsic value, and they're
427 \emph{big}, so they increase the size of the repository and the amount
428 of time it takes to clone or pull changes.
430 Before I discuss the options that you have if you commit a ``brown
431 paper bag'' change (the kind that's so bad that you want to pull a
432 brown paper bag over your head), let me first discuss some approaches
433 that probably won't work.
435 Since Mercurial treats history as accumulative---every change builds
436 on top of all changes that preceded it---you generally can't just make
437 disastrous changes disappear. The one exception is when you've just
438 committed a change, and it hasn't been pushed or pulled into another
439 repository. That's when you can safely use the \hgcmd{rollback}
440 command, as I detailed in section~\ref{sec:undo:rollback}.
442 After you've pushed a bad change to another repository, you
443 \emph{could} still use \hgcmd{rollback} to make your local copy of the
444 change disappear, but it won't have the consequences you want. The
445 change will still be present in the remote repository, so it will
446 reappear in your local repository the next time you pull.
448 If a situation like this arises, and you know which repositories your
449 bad change has propagated into, you can \emph{try} to get rid of the
change from \emph{every} one of those repositories. This is, of
451 course, not a satisfactory solution: if you miss even a single
452 repository while you're expunging, the change is still ``in the
453 wild'', and could propagate further.
455 If you've committed one or more changes \emph{after} the change that
456 you'd like to see disappear, your options are further reduced.
Mercurial doesn't provide a way to ``punch a hole'' in history while
leaving the surrounding changesets intact.
460 XXX This needs filling out. The \texttt{hg-replay} script in the
461 \texttt{examples} directory works, but doesn't handle merge
462 changesets. Kind of an important omission.
464 \subsection{Protect yourself from ``escaped'' changes}
466 If you've committed some changes to your local repository and they've
467 been pushed or pulled somewhere else, this isn't necessarily a
468 disaster. You can protect yourself ahead of time against some classes
469 of bad changeset. This is particularly easy if your team usually
470 pulls changes from a central repository.
472 By configuring some hooks on that repository to validate incoming
473 changesets (see chapter~\ref{chap:hook}), you can automatically
474 prevent some kinds of bad changeset from being pushed to the central
475 repository at all. With such a configuration in place, some kinds of
476 bad changeset will naturally tend to ``die out'' because they can't
477 propagate into the central repository. Better yet, this happens
478 without any need for explicit intervention.
480 For instance, an incoming change hook that verifies that a changeset
will actually compile can prevent people from inadvertently ``breaking
482 the build''.
484 \section{Finding the source of a bug}
485 \label{sec:undo:bisect}
487 While it's all very well to be able to back out a changeset that
488 introduced a bug, this requires that you know which changeset to back
489 out. Mercurial provides an invaluable command, called
490 \hgcmd{bisect}, that helps you to automate this process and accomplish
491 it very efficiently.
493 The idea behind the \hgcmd{bisect} command is that a changeset has
494 introduced some change of behaviour that you can identify with a
495 simple binary test. You don't know which piece of code introduced the
496 change, but you know how to test for the presence of the bug. The
497 \hgcmd{bisect} command uses your test to direct its search for the
498 changeset that introduced the code that caused the bug.
500 Here are a few scenarios to help you understand how you might apply
501 this command.
502 \begin{itemize}
503 \item The most recent version of your software has a bug that you
504 remember wasn't present a few weeks ago, but you don't know when it
505 was introduced. Here, your binary test checks for the presence of
506 that bug.
507 \item You fixed a bug in a rush, and now it's time to close the entry
508 in your team's bug database. The bug database requires a changeset
509 ID when you close an entry, but you don't remember which changeset
510 you fixed the bug in. Once again, your binary test checks for the
511 presence of the bug.
512 \item Your software works correctly, but runs~15\% slower than the
513 last time you measured it. You want to know which changeset
514 introduced the performance regression. In this case, your binary
515 test measures the performance of your software, to see whether it's
516 ``fast'' or ``slow''.
517 \item The sizes of the components of your project that you ship
518 exploded recently, and you suspect that something changed in the way
519 you build your project.
520 \end{itemize}
From these examples, it should be clear that the \hgcmd{bisect}
command is useful for more than just finding the sources of bugs. You can
524 use it to find any ``emergent property'' of a repository (anything
525 that you can't find from a simple text search of the files in the
526 tree) for which you can write a binary test.
528 We'll introduce a little bit of terminology here, just to make it
529 clear which parts of the search process are your responsibility, and
530 which are Mercurial's. A \emph{test} is something that \emph{you} run
531 when \hgcmd{bisect} chooses a changeset. A \emph{probe} is what
532 \hgcmd{bisect} runs to tell whether a revision is good. Finally,
533 we'll use the word ``bisect'', as both a noun and a verb, to stand in
for the phrase ``search using the \hgcmd{bisect} command''.
536 One simple way to automate the searching process would be simply to
537 probe every changeset. However, this scales poorly. If it took ten
538 minutes to test a single changeset, and you had 10,000 changesets in
539 your repository, the exhaustive approach would take on average~35
540 \emph{days} to find the changeset that introduced a bug. Even if you
541 knew that the bug was introduced by one of the last 500 changesets,
542 and limited your search to those, you'd still be looking at over 40
543 hours to find the changeset that introduced your bug.
545 What the \hgcmd{bisect} command does is use its knowledge of the
546 ``shape'' of your project's revision history to perform a search in
547 time proportional to the \emph{logarithm} of the number of changesets
548 to check (the kind of search it performs is called a dichotomic
549 search). With this approach, searching through 10,000 changesets will
550 take less than three hours, even at ten minutes per test (the search
551 will require about 14 tests). Limit your search to the last hundred
552 changesets, and it will take only about an hour (roughly seven tests).
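
The arithmetic behind these figures is easy to check. The following
Python sketch assumes, as the text does, ten minutes per test; it
further assumes that an exhaustive search has to test half of the
changesets on average, and that a bisection needs about
$\lceil \log_2 n \rceil$ tests for $n$ changesets. The function names
are mine.

```python
import math

MINUTES_PER_TEST = 10


def exhaustive_minutes(changesets):
    """Average cost of testing changesets one by one (half of them on average)."""
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets):
    """Approximate number of tests a bisection needs."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets):
    return bisect_tests(changesets) * MINUTES_PER_TEST


for n in (10000, 500, 100):
    print("%6d changesets: exhaustive %8.1f hours, bisect %2d tests (%.1f hours)"
          % (n, exhaustive_minutes(n) / 60, bisect_tests(n),
             bisect_minutes(n) / 60))
```

For 10,000 changesets this gives roughly 833 hours (about 35 days) for
the exhaustive search against 14 tests, or well under three hours, for
the bisection.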
554 The \hgcmd{bisect} command is aware of the ``branchy'' nature of a
555 Mercurial project's revision history, so it has no problems dealing
with branches, merges, or multiple heads in a repository. It can
557 prune entire branches of history with a single probe, which is how it
558 operates so efficiently.
560 \subsection{Using the \hgcmd{bisect} command}
562 Here's an example of \hgcmd{bisect} in action.
564 \begin{note}
565 In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a
566 core command: it was distributed with Mercurial as an extension.
567 This section describes the built-in command, not the old extension.
568 \end{note}
570 Now let's create a repository, so that we can try out the
571 \hgcmd{bisect} command in isolation.
572 \interaction{bisect.init}
573 We'll simulate a project that has a bug in it in a simple-minded way:
574 create trivial changes in a loop, and nominate one specific change
575 that will have the ``bug''. This loop creates 35 changesets, each
576 adding a single file to the repository. We'll represent our ``bug''
577 with a file that contains the text ``i have a gub''.
578 \interaction{bisect.commits}
580 The next thing that we'd like to do is figure out how to use the
581 \hgcmd{bisect} command. We can use Mercurial's normal built-in help
582 mechanism for this.
583 \interaction{bisect.help}
585 The \hgcmd{bisect} command works in steps. Each step proceeds as follows.
586 \begin{enumerate}
587 \item You run your binary test.
588 \begin{itemize}
\item If the test succeeded, you tell \hgcmd{bisect} by running the
\hgcmdargs{bisect}{--good} command.
\item If it failed, you run the \hgcmdargs{bisect}{--bad} command.
592 \end{itemize}
593 \item The command uses your information to decide which changeset to
594 test next.
595 \item It updates the working directory to that changeset, and the
596 process begins again.
597 \end{enumerate}
598 The process ends when \hgcmd{bisect} identifies a unique changeset
599 that marks the point where your test transitioned from ``succeeding''
600 to ``failing''.
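
On a purely linear history, the loop just described boils down to a
binary search. Here is a minimal sketch in Python, under assumptions
that are mine rather than Mercurial's: revisions are plain integers on
a single line of development, the function \texttt{find\_first\_bad}
is a made-up name, and the ``bug'' is arbitrarily placed at
revision~22. The real \hgcmd{bisect} copes with branches and merges,
and picks its own revisions to test.

```python
def find_first_bad(good, bad, is_bad):
    """Narrow (good, bad] down to the first bad revision.

    good   -- a revision known to pass the test
    bad    -- a later revision known to fail it
    is_bad -- the binary test: returns True if a revision fails
    Returns (first_bad_revision, number_of_tests_run).
    """
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2     # next revision to test
        tests += 1
        if is_bad(middle):
            bad = middle               # the change is at or before middle
        else:
            good = middle              # the change is after middle
    return bad, tests


FIRST_BAD = 22                         # hypothetical position of the bug


def has_bug(rev):
    return rev >= FIRST_BAD


print(find_first_bad(10, 34, has_bug))   # prints: (22, 5)
```

Each answer halves the range that is still under suspicion, which is
why so few tests are needed.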
602 To start the search, we must run the \hgcmdargs{bisect}{--reset} command.
603 \interaction{bisect.search.init}
605 In our case, the binary test we use is simple: we check to see if any
606 file in the repository contains the string ``i have a gub''. If it
607 does, this changeset contains the change that ``caused the bug''. By
608 convention, a changeset that has the property we're searching for is
609 ``bad'', while one that doesn't is ``good''.
611 Most of the time, the revision to which the working directory is
612 synced (usually the tip) already exhibits the problem introduced by
613 the buggy change, so we'll mark it as ``bad''.
614 \interaction{bisect.search.bad-init}
616 Our next task is to nominate a changeset that we know \emph{doesn't}
617 have the bug; the \hgcmd{bisect} command will ``bracket'' its search
618 between the first pair of good and bad changesets. In our case, we
619 know that revision~10 didn't have the bug. (I'll have more words
620 about choosing the first ``good'' changeset later.)
621 \interaction{bisect.search.good-init}
623 Notice that this command printed some output.
624 \begin{itemize}
625 \item It told us how many changesets it must consider before it can
626 identify the one that introduced the bug, and how many tests that
627 will require.
628 \item It updated the working directory to the next changeset to test,
629 and told us which changeset it's testing.
630 \end{itemize}
632 We now run our test in the working directory. We use the
633 \command{grep} command to see if our ``bad'' file is present in the
634 working directory. If it is, this revision is bad; if not, this
635 revision is good.
636 \interaction{bisect.search.step1}
638 This test looks like a perfect candidate for automation, so let's turn
639 it into a shell function.
640 \interaction{bisect.search.mytest}
641 We can now run an entire test step with a single command,
642 \texttt{mytest}.
643 \interaction{bisect.search.step2}
644 A few more invocations of our canned test step command, and we're
645 done.
646 \interaction{bisect.search.rest}
Even though we had~35 changesets in the repository, the \hgcmd{bisect}
649 command let us find the changeset that introduced our ``bug'' with
650 only five tests. Because the number of tests that the \hgcmd{bisect}
651 command performs grows logarithmically with the number of changesets to
652 search, the advantage that it has over the ``brute force'' search
653 approach increases with every changeset you add.
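
As a rough illustration of that growth (a back-of-the-envelope
estimate that assumes a linear history and a test that always gives
the right answer, not a statement about Mercurial's exact algorithm),
a search that halves the range of $n$ candidate changesets at every
step needs about
\[
  \log_2 n
\]
tests. For $n = 40$, $\log_2 40 \approx 5.3$, so five or six tests are
enough, whereas testing every changeset in turn could take up to~40.
Doubling the history to~80 changesets gives $\log_2 80 \approx 6.3$,
which is roughly one more test.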

\subsection{Cleaning up after your search}

When you're finished using the \hgcmd{bisect} command in a
repository, you can use the \hgcmdargs{bisect}{--reset} command to drop
the information it was using to drive your search. This information
doesn't take up much space, so it doesn't matter if you forget to run
the command. However, \hgcmd{bisect} won't let you start a new search in
that repository until you do a \hgcmdargs{bisect}{--reset}.
\interaction{bisect.search.reset}

\section{Tips for finding bugs effectively}

\subsection{Give consistent input}

The \hgcmd{bisect} command requires that you correctly report the
result of every test you perform. If you tell it that a test failed
when it really succeeded, it \emph{might} be able to detect the
inconsistency. If it can identify an inconsistency in your reports,
it will tell you that a particular changeset is both good and bad.
However, it can't do this perfectly; it's about as likely to report
the wrong changeset as the source of the bug.

\subsection{Automate as much as possible}

When I started using the \hgcmd{bisect} command, I tried a few times
to run my tests by hand, on the command line. This is an approach
that I, at least, am not suited to. After a few tries, I found that I
was making enough mistakes that I was having to restart my searches
several times before finally getting correct results.

My initial problems with driving the \hgcmd{bisect} command by hand
occurred even with simple searches on small repositories; if the
problem you're looking for is more subtle, or the number of tests that
\hgcmd{bisect} must perform increases, the likelihood of operator
error ruining the search is much higher. Once I started automating my
tests, I had much better results.

The key to automated testing is twofold:
\begin{itemize}
\item always test for the same symptom, and
\item always feed consistent input to the \hgcmd{bisect} command.
\end{itemize}
In my tutorial example above, the \command{grep} command tests for the
symptom, and the \texttt{if} statement takes the result of this check
and ensures that we always feed the same input to the \hgcmd{bisect}
command. The \texttt{mytest} function marries these together in a
reproducible way, so that every test is uniform and consistent.
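
As a minimal sketch of how these two pieces can fit together, a pair
of shell functions along the following lines would do. This is not
the exact function from the tutorial transcript: the marker text
\texttt{i have a gub} and the names \texttt{classify} and
\texttt{bisect\_test} are assumptions made purely for illustration.
\begin{verbatim}
# Hypothetical marker text; substitute whatever identifies your bug.
MARKER='i have a gub'

# Test for the symptom: print "bad" if any file in the top level of
# the current directory contains MARKER, and "good" otherwise.
# (The glob skips dot-directories such as .hg on purpose.)
classify() {
  if grep -qs -- "$MARKER" *
  then
    echo bad
  else
    echo good
  fi
}

# Feed the result to Mercurial in exactly the same way every time.
bisect_test() {
  result=$(classify)
  echo "this revision is $result"
  hg bisect "--$result"
}
\end{verbatim}
Keeping the symptom check separate from the reporting step makes it
easy to run \texttt{classify} on its own and confirm that it gives the
verdict you expect before you let it drive a search.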

\subsection{Check your results}

Because the output of a \hgcmd{bisect} search is only as good as the
input you give it, don't take the changeset it reports as the
absolute truth. A simple way to cross-check its report is to manually
run your test at each of the following changesets:
\begin{itemize}
\item The changeset that it reports as the first bad revision. Your
  test should still report this as bad.
\item The parent of that changeset (either parent, if it's a merge).
  Your test should report this changeset as good.
\item A child of that changeset. Your test should report this
  changeset as bad.
\end{itemize}
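
The cross-check above can be sketched as a small shell helper. This
is only an illustration: \texttt{classify} stands for a hypothetical
function of your own that prints \texttt{good} or \texttt{bad} for the
working directory, \texttt{check\_rev} is a name invented here, and
the revision numbers in the usage line are made up.
\begin{verbatim}
# Hypothetical placeholder: print "good" or "bad" for the working
# directory.  Replace the body with your real test.
classify() {
  if grep -qs -- 'i have a gub' *
  then
    echo bad
  else
    echo good
  fi
}

# check_rev REV EXPECTED: update to REV, run the test, and complain
# if the verdict differs from EXPECTED ("good" or "bad").
check_rev() {
  hg update -q -r "$1" || return 1
  verdict=$(classify)
  if [ "$verdict" = "$2" ]
  then
    echo "revision $1: $verdict, as expected"
  else
    echo "revision $1: $verdict, but expected $2"
    return 1
  fi
}

# Usage, if bisect blamed revision 22 (parent 21, child 23):
#   check_rev 21 good && check_rev 22 bad && check_rev 23 bad
\end{verbatim}
If any of the three checks fails, treat the reported changeset with
suspicion and consider repeating the search.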

\subsection{Beware interference between bugs}

It's possible that your search for one bug could be disrupted by the
presence of another. For example, let's say your software crashes at
revision 100, and worked correctly at revision 50. Unknown to you,
someone else introduced a different crashing bug at revision 60, and
fixed it at revision 80. This could distort your results in one of
several ways.

It is possible that this other bug completely ``masks'' yours, which
is to say that it occurs before your bug has a chance to manifest
itself. If you can't avoid that other bug (for example, it prevents
your project from building), and so can't tell whether your bug is
present in a particular changeset, the \hgcmd{bisect} command cannot
help you directly. Instead, you can mark a changeset as untested by
running \hgcmdargs{bisect}{--skip}.

A different problem could arise if your test for a bug's presence is
not specific enough. If you check for ``my program crashes'', then
both your crashing bug and an unrelated crashing bug that masks it
will look like the same thing, and mislead \hgcmd{bisect}.

Another situation in which \hgcmdargs{bisect}{--skip} is useful is
when you can't test a revision because your project was in a broken and
hence untestable state at that revision, perhaps because someone
checked in a change that prevented the project from building.
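
One way to combine these ideas is a test step that skips revisions
it cannot build and looks for one specific symptom rather than ``any
failure''. The following is a sketch under stated assumptions, not a
recipe: \texttt{build\_project}, \texttt{has\_bug} and
\texttt{bisect\_step} are hypothetical names, and the \texttt{make}
invocation, the \texttt{./run-tests} script and the failure message
are placeholders for whatever your project really uses.
\begin{verbatim}
# Hypothetical placeholders: replace with your real build and test.
build_project() {
  make -s
}
has_bug() {
  # Look for the specific symptom, not just "something failed".
  ./run-tests 2>&1 | grep -q 'specific failure message'
}

# One bisect step: skip revisions that cannot be built, otherwise
# report good or bad according to the specific symptom.
bisect_step() {
  if ! build_project
  then
    echo "this revision cannot be tested"
    hg bisect --skip
  elif has_bug
  then
    echo "this revision is bad"
    hg bisect --bad
  else
    echo "this revision is good"
    hg bisect --good
  fi
}
\end{verbatim}
Checking the build first means that a revision broken by some
unrelated change is never mistaken for one that contains your bug.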

\subsection{Bracket your search lazily}

Choosing the first ``good'' and ``bad'' changesets that will mark the
end points of your search is often easy, but it bears a little
discussion nevertheless. From the perspective of \hgcmd{bisect}, the
``newest'' changeset is conventionally ``bad'', and the older
changeset is ``good''.

If you're having trouble remembering a suitable ``good'' changeset
to give to \hgcmd{bisect}, you could do worse than
testing changesets at random. Just remember to eliminate contenders
that can't possibly exhibit the bug (perhaps because the feature with
the bug isn't present yet) and those where another problem masks the
bug (as I discussed above).

Even if you end up ``early'' by thousands of changesets or months of
history, you will only add a handful of tests to the total number that
\hgcmd{bisect} must perform, thanks to its logarithmic behaviour.
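
To put rough numbers on this (an illustrative estimate that assumes
each test halves the remaining range of changesets): if the bug
really lies within the last~100 changesets, but you bracket the search
with a ``good'' changeset 10,000 changesets back, the extra cost is
about
\[
  \log_2 10000 - \log_2 100 = \log_2 100 \approx 6.6,
\]
or around seven additional tests.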

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "00book"
%%% End: