hgbook: 5da084395a69 en/undo.tex

view en/undo.tex @ 432:5da084395a69

translated some other paragraphs

author	jerojasro@localhost
date	Sun Nov 30 18:55:34 2008 -0500 (2008-11-30)
parents	7a6bd93174bd
children	f79542a53cb2 91adcea08b33

1 \chapter{Finding and fixing your mistakes}

2 \label{chap:undo}

4 To err might be human, but to really handle the consequences well

5 takes a top-notch revision control system. In this chapter, we'll

6 discuss some of the techniques you can use when you find that a

7 problem has crept into your project. Mercurial has some highly

8 capable features that will help you to isolate the sources of

9 problems, and to handle them appropriately.

11 \section{Erasing local history}

13 \subsection{The accidental commit}

15 I have the occasional but persistent problem of typing rather more

16 quickly than I can think, which sometimes results in me committing a

17 changeset that is either incomplete or plain wrong. In my case, the

18 usual kind of incomplete changeset is one in which I've created a new

19 source file, but forgotten to \hgcmd{add} it. A ``plain wrong''

20 changeset is not as common, but no less annoying.

22 \subsection{Rolling back a transaction}

23 \label{sec:undo:rollback}

25 In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats

26 each modification of a repository as a \emph{transaction}. Every time

27 you commit a changeset or pull changes from another repository,

28 Mercurial remembers what you did. You can undo, or \emph{roll back},

29 exactly one of these actions using the \hgcmd{rollback} command. (See

30 section~\ref{sec:undo:rollback-after-push} for an important caveat

31 about the use of this command.)

33 Here's a mistake that I often find myself making: committing a change

34 in which I've created a new file, but forgotten to \hgcmd{add} it.

35 \interaction{rollback.commit}

36 Looking at the output of \hgcmd{status} after the commit immediately

37 confirms the error.

38 \interaction{rollback.status}

39 The commit captured the changes to the file \filename{a}, but not the

40 new file \filename{b}. If I were to push this changeset to a

41 repository that I shared with a colleague, the chances are high that

42 something in \filename{a} would refer to \filename{b}, which would not

43 be present in their repository when they pulled my changes. I would

44 thus become the object of some indignation.

46 However, luck is with me---I've caught my error before I pushed the

47 changeset. I use the \hgcmd{rollback} command, and Mercurial makes

48 that last changeset vanish.

49 \interaction{rollback.rollback}

50 Notice that the changeset is no longer present in the repository's

51 history, and the working directory once again thinks that the file

52 \filename{a} is modified. The commit and rollback have left the

53 working directory exactly as it was prior to the commit; the changeset

54 has been completely erased. I can now safely \hgcmd{add} the file

55 \filename{b}, and rerun my commit.

56 \interaction{rollback.add}

58 \subsection{The erroneous pull}

60 It's common practice with Mercurial to maintain separate development

61 branches of a project in different repositories. Your development

62 team might have one shared repository for your project's ``0.9''

63 release, and another, containing different changes, for the ``1.0''

64 release.

66 Given this, you can imagine that the consequences could be messy if

67 you had a local ``0.9'' repository, and accidentally pulled changes

68 from the shared ``1.0'' repository into it. At worst, you could be

69 paying insufficient attention, and push those changes into the shared

70 ``0.9'' tree, confusing your entire team (but don't worry, we'll

71 return to this horror scenario later). However, it's more likely that

72 you'll notice immediately, because Mercurial will display the URL it's

73 pulling from, or you will see it pull a suspiciously large number of

74 changes into the repository.

76 The \hgcmd{rollback} command will work nicely to expunge all of the

77 changesets that you just pulled. Mercurial groups all changes from

78 one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is

79 all you need to undo this mistake.

81 \subsection{Rolling back is useless once you've pushed}

82 \label{sec:undo:rollback-after-push}

84 The value of the \hgcmd{rollback} command drops to zero once you've

85 pushed your changes to another repository. Rolling back a change

86 makes it disappear entirely, but \emph{only} in the repository in

87 which you perform the \hgcmd{rollback}. Because a rollback eliminates

88 history, there's no way for the disappearance of a change to propagate

89 between repositories.

91 If you've pushed a change to another repository---particularly if it's

92 a shared repository---it has essentially ``escaped into the wild,''

93 and you'll have to recover from your mistake in a different way. If

94 you push a changeset somewhere, then roll it back, then pull from

95 the repository you pushed to, the changeset will simply

96 reappear in your repository.
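
To make this concrete, here is a toy model in Python. It is only a sketch, not a description of how Mercurial stores anything: the \texttt{ToyRepo} class and its methods are hypothetical names invented for this illustration, and a repository is reduced to an ordered list of changeset names.

```python
class ToyRepo:
    """Toy stand-in for a repository: an ordered list of changeset names."""

    def __init__(self):
        self.changesets = []

    def commit(self, name):
        self.changesets.append(name)

    def rollback(self):
        # Forget the most recent changeset, in this repository only.
        return self.changesets.pop()

    def push(self, other):
        # Copy across every changeset that the other repository lacks.
        for changeset in self.changesets:
            if changeset not in other.changesets:
                other.changesets.append(changeset)

    def pull(self, other):
        # Pulling from "other" is the same transfer in the opposite direction.
        other.push(self)


local, shared = ToyRepo(), ToyRepo()
local.commit("good change")
local.commit("bad change")
local.push(shared)

local.rollback()        # "bad change" is gone locally...
local.pull(shared)      # ...but the shared repository never forgot it.
print(local.changesets)
```

The rollback only ever touches \texttt{local}; nothing in the model gives it a way to reach into \texttt{shared}, which is the point of this section.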

98 (If you absolutely know for sure that the change you want to roll back

99 is the most recent change in the repository that you pushed to,

100 \emph{and} you know that nobody else could have pulled it from that

101 repository, you can roll back the changeset there, too, but you really

102 should not rely on this working reliably. If you do this,

103 sooner or later a change really will make it into a repository that

104 you don't directly control (or have forgotten about), and come back to

105 bite you.)

106

107 \subsection{You can only roll back once}

108

109 Mercurial stores exactly one transaction in its transaction log; that

110 transaction is the most recent one that occurred in the repository.

111 This means that you can only roll back one transaction. If you expect

112 to be able to roll back one transaction, then its predecessor, this is

113 not the behaviour you will get.

114 \interaction{rollback.twice}

115 Once you've rolled back one transaction in a repository, you can't

116 roll back again in that repository until you perform another commit or

117 pull.
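
The one-slot behaviour can be sketched with a small Python model. This is an illustration under simplifying assumptions, not Mercurial's actual transaction code: the class name \texttt{OneSlotRepo} and its attributes are hypothetical, and the ``transaction log'' is reduced to a single remembered history length.

```python
class OneSlotRepo:
    """Toy model: a changeset list plus a one-slot transaction log."""

    def __init__(self):
        self.changesets = []
        self.undo = None  # history length before the last transaction

    def commit(self, name):
        self.undo = len(self.changesets)
        self.changesets.append(name)

    def pull(self, incoming):
        # However many changesets arrive, one pull is one transaction.
        self.undo = len(self.changesets)
        self.changesets.extend(incoming)

    def rollback(self):
        if self.undo is None:
            return False  # nothing recorded, so nothing to roll back
        del self.changesets[self.undo:]
        self.undo = None  # the single slot is now empty
        return True


repo = OneSlotRepo()
repo.commit("first")
repo.pull(["a", "b", "c"])
print(repo.rollback(), repo.changesets)  # the whole pull is undone at once
print(repo.rollback(), repo.changesets)  # a second rollback has nothing to undo
```

In this model, as in the text above, a new commit or pull is what refills the slot and makes another rollback possible.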

118

119 \section{Reverting the mistaken change}

120

121 If you make a modification to a file, and decide that you really

122 didn't want to change the file at all, and you haven't yet committed

123 your changes, the \hgcmd{revert} command is the one you'll need. It

124 looks at the changeset that's the parent of the working directory, and

125 restores the contents of the file to their state as of that changeset.

126 (That's a long-winded way of saying that, in the normal case, it

127 undoes your modifications.)

128

129 Let's illustrate how the \hgcmd{revert} command works with yet another

130 small example. We'll begin by modifying a file that Mercurial is

131 already tracking.

132 \interaction{daily.revert.modify}

133 If we don't want that change, we can simply \hgcmd{revert} the file.

134 \interaction{daily.revert.unmodify}

135 The \hgcmd{revert} command provides us with an extra degree of safety

136 by saving our modified file with a \filename{.orig} extension.

137 \interaction{daily.revert.status}

138

139 Here is a summary of the cases that the \hgcmd{revert} command can

140 deal with. We will describe each of these in more detail in the

141 section that follows.

142 \begin{itemize}

143 \item If you modify a file, it will restore the file to its unmodified

144 state.

145 \item If you \hgcmd{add} a file, it will undo the ``added'' state of

146 the file, but leave the file itself untouched.

147 \item If you delete a file without telling Mercurial, it will restore

148 the file to its unmodified contents.

149 \item If you use the \hgcmd{remove} command to remove a file, it will

150 undo the ``removed'' state of the file, and restore the file to its

151 unmodified contents.

152 \end{itemize}
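
The four cases in this list can be written down as a small decision function. The Python below is a sketch for illustration only: the function name, the status strings, and the idea of passing file contents as plain values are all assumptions of this example, not part of Mercurial.

```python
def revert(status, working, parent):
    """Return (new_status, new_contents) for a single file.

    status  -- "modified", "added", "missing" or "removed"
    working -- contents in the working directory (None if the file is absent)
    parent  -- contents in the working directory's parent changeset
    """
    if status == "added":
        # Undo the "added" mark, but leave the file itself untouched.
        return "untracked", working
    if status in ("modified", "missing", "removed"):
        # Restore the contents recorded in the parent changeset.
        return "clean", parent
    raise ValueError("unhandled status: %r" % (status,))


for case in [("modified", "new text", "old text"),
             ("added", "new file", None),
             ("missing", None, "old text"),
             ("removed", None, "old text")]:
    print(case[0], "->", revert(*case))
```

Only the ``added'' case keeps the working copy's contents; the other three all end with the file back at its unmodified contents.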

153

154 \subsection{File management errors}

155 \label{sec:undo:mgmt}

156

157 The \hgcmd{revert} command is useful for more than just modified

158 files. It lets you reverse the results of all of Mercurial's file

159 management commands---\hgcmd{add}, \hgcmd{remove}, and so on.

160

161 If you \hgcmd{add} a file, then decide that in fact you don't want

162 Mercurial to track it, use \hgcmd{revert} to undo the add. Don't

163 worry; Mercurial will not modify the file in any way. It will just

164 ``unmark'' the file.

165 \interaction{daily.revert.add}

166

167 Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use

168 \hgcmd{revert} to restore it to the contents it had as of the parent

169 of the working directory.

170 \interaction{daily.revert.remove}

171 This works just as well for a file that you deleted by hand, without

172 telling Mercurial (recall that in Mercurial terminology, this kind of

173 file is called ``missing'').

174 \interaction{daily.revert.missing}

175

176 If you revert a \hgcmd{copy}, the copied-to file remains in your

177 working directory afterwards, untracked. Since a copy doesn't affect

178 the copied-from file in any way, Mercurial doesn't do anything with

179 the copied-from file.

180 \interaction{daily.revert.copy}

181

182 \subsubsection{A slightly special case: reverting a rename}

183

184 If you \hgcmd{rename} a file, there is one small detail that

185 you should remember. When you \hgcmd{revert} a rename, it's not

186 enough to provide the name of the renamed-to file, as you can see

187 here.

188 \interaction{daily.revert.rename}

189 As you can see from the output of \hgcmd{status}, the renamed-to file

190 is no longer identified as added, but the renamed-\emph{from} file is

191 still removed! This is counter-intuitive (at least to me), but at

192 least it's easy to deal with.

193 \interaction{daily.revert.rename-orig}

194 So remember, to revert a \hgcmd{rename}, you must provide \emph{both}

195 the source and destination names.

196

197 % TODO: the output doesn't look like it will be removed!

198

199 (By the way, if you rename a file, then modify the renamed-to file,

200 then revert both components of the rename, when Mercurial restores the

201 file that was removed as part of the rename, it will be unmodified.

202 If you need the modifications in the renamed-to file to show up in the

203 renamed-from file, don't forget to copy them over.)

204

205 These fiddly aspects of reverting a rename arguably constitute a small

206 bug in Mercurial.

207

208 \section{Dealing with committed changes}

209

210 Consider a case where you have committed a change $a$, and another

211 change $b$ on top of it; you then realise that change $a$ was

212 incorrect. Mercurial lets you ``back out'' an entire changeset

213 automatically, and provides building blocks that let you reverse part of a

214 changeset by hand.

215

216 Before you read this section, here's something to keep in mind: the

217 \hgcmd{backout} command undoes changes by \emph{adding} history, not

218 by modifying or erasing it. It's the right tool to use if you're

219 fixing bugs, but not if you're trying to undo some change that has

220 catastrophic consequences. To deal with those, see

221 section~\ref{sec:undo:aaaiiieee}.

222

223 \subsection{Backing out a changeset}

224

225 The \hgcmd{backout} command lets you ``undo'' the effects of an entire

226 changeset in an automated fashion. Because Mercurial's history is

227 immutable, this command \emph{does not} get rid of the changeset you

228 want to undo. Instead, it creates a new changeset that

229 \emph{reverses} the effect of the to-be-undone changeset.

230

231 The operation of the \hgcmd{backout} command is a little intricate, so

232 let's illustrate it with some examples. First, we'll create a

233 repository with some simple changes.

234 \interaction{backout.init}

235

236 The \hgcmd{backout} command takes a single changeset ID as its

237 argument; this is the changeset to back out. Normally,

238 \hgcmd{backout} will drop you into a text editor to write a commit

239 message, so you can record why you're backing the change out. In this

240 example, we provide a commit message on the command line using the

241 \hgopt{backout}{-m} option.

242

243 \subsection{Backing out the tip changeset}

244

245 We're going to start by backing out the last changeset we committed.

246 \interaction{backout.simple}

247 You can see that the second line from \filename{myfile} is no longer

248 present. Taking a look at the output of \hgcmd{log} gives us an idea

249 of what the \hgcmd{backout} command has done.

250 \interaction{backout.simple.log}

251 Notice that the new changeset that \hgcmd{backout} has created is a

252 child of the changeset we backed out. It's easier to see this in

253 figure~\ref{fig:undo:backout}, which presents a graphical view of the

254 change history. As you can see, the history is nice and linear.

255

256 \begin{figure}[htb]

257 \centering

258 \grafix{undo-simple}

259 \caption{Backing out a change using the \hgcmd{backout} command}

260 \label{fig:undo:backout}

261 \end{figure}

262

263 \subsection{Backing out a non-tip change}

264

265 If you want to back out a change other than the last one you

266 committed, pass the \hgopt{backout}{--merge} option to the

267 \hgcmd{backout} command.

268 \interaction{backout.non-tip.clone}

269 This makes backing out any changeset a ``one-shot'' operation that's

270 usually simple and fast.

271 \interaction{backout.non-tip.backout}

272

273 If you take a look at the contents of \filename{myfile} after the

274 backout finishes, you'll see that the first and third changes are

275 present, but not the second.

276 \interaction{backout.non-tip.cat}

277

278 As the graphical history in figure~\ref{fig:undo:backout-non-tip}

279 illustrates, Mercurial actually commits \emph{two} changes in this

280 kind of situation (the box-shaped nodes are the ones that Mercurial

281 commits automatically). Before Mercurial begins the backout process,

282 it first remembers what the current parent of the working directory

283 is. It then backs out the target changeset, and commits that as a

284 changeset. Finally, it merges back to the previous parent of the

285 working directory, and commits the result of the merge.

286

287 % TODO: to me it looks like mercurial doesn't commit the second merge automatically!

288

289 \begin{figure}[htb]

290 \centering

291 \grafix{undo-non-tip}

292 \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}

293 \label{fig:undo:backout-non-tip}

294 \end{figure}

295

296 The result is that you end up ``back where you were'', only with some

297 extra history that undoes the effect of the changeset you wanted to

298 back out.

299

300 \subsubsection{Always use the \hgopt{backout}{--merge} option}

301

302 In fact, since the \hgopt{backout}{--merge} option will do the ``right

303 thing'' whether or not the changeset you're backing out is the tip

304 (i.e.~it won't try to merge if it's backing out the tip, since there's

305 no need), you should \emph{always} use this option when you run the

306 \hgcmd{backout} command.

307

308 \subsection{Gaining more control of the backout process}

309

310 While I've recommended that you always use the

311 \hgopt{backout}{--merge} option when backing out a change, the

312 \hgcmd{backout} command lets you decide how to merge a backout

313 changeset. Taking control of the backout process by hand is something

314 you will rarely need to do, but it can be useful to understand what

315 the \hgcmd{backout} command is doing for you automatically. To

316 illustrate this, let's clone our first repository, but omit the

317 backout change that it contains.

318

319 \interaction{backout.manual.clone}

320 As with our earlier example, we'll commit a third changeset, then back

321 out its parent, and see what happens.

322 \interaction{backout.manual.backout}

323 Our new changeset is again a descendant of the changeset we backed

324 out; it's thus a new head, \emph{not} a descendant of the changeset

325 that was the tip. The \hgcmd{backout} command was quite explicit in

326 telling us this.

327 \interaction{backout.manual.log}

328

329 Again, it's easier to see what has happened by looking at a graph of

330 the revision history, in figure~\ref{fig:undo:backout-manual}. This

331 makes it clear that when we use \hgcmd{backout} to back out a change

332 other than the tip, Mercurial adds a new head to the repository (the

333 change it committed is box-shaped).

334

335 \begin{figure}[htb]

336 \centering

337 \grafix{undo-manual}

338 \caption{Backing out a change using the \hgcmd{backout} command}

339 \label{fig:undo:backout-manual}

340 \end{figure}

341

342 After the \hgcmd{backout} command has completed, it leaves the new

343 ``backout'' changeset as the parent of the working directory.

344 \interaction{backout.manual.parents}

345 Now we have two isolated sets of changes.

346 \interaction{backout.manual.heads}

347

348 Let's think about what we expect to see as the contents of

349 \filename{myfile} now. The first change should be present, because

350 we've never backed it out. The second change should be missing, as

351 that's the change we backed out. Since the history graph shows the

352 third change as a separate head, we \emph{don't} expect to see the

353 third change present in \filename{myfile}.

354 \interaction{backout.manual.cat}

355 To get the third change back into the file, we just do a normal merge

356 of our two heads.

357 \interaction{backout.manual.merge}

358 Afterwards, the graphical history of our repository looks like

359 figure~\ref{fig:undo:backout-manual-merge}.

360

361 \begin{figure}[htb]

362 \centering

363 \grafix{undo-manual-merge}

364 \caption{Manually merging a backout change}

365 \label{fig:undo:backout-manual-merge}

366 \end{figure}

367

368 \subsection{Why \hgcmd{backout} works as it does}

369

370 Here's a brief description of how the \hgcmd{backout} command works.

371 \begin{enumerate}

372 \item It ensures that the working directory is ``clean'', i.e.~that

373 the output of \hgcmd{status} would be empty.

374 \item It remembers the current parent of the working directory. Let's

375 call this changeset \texttt{orig}.

376 \item It does the equivalent of a \hgcmd{update} to sync the working

377 directory to the changeset you want to back out. Let's call this

378 changeset \texttt{backout}.

379 \item It finds the parent of that changeset. Let's call that

380 changeset \texttt{parent}.

381 \item For each file that the \texttt{backout} changeset affected, it

382 does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,

383 to restore it to the contents it had before that changeset was

384 committed.

385 \item It commits the result as a new changeset. This changeset has

386 \texttt{backout} as its parent.

387 \item If you specify \hgopt{backout}{--merge} on the command line, it

388 merges with \texttt{orig}, and commits the result of the merge.

389 \end{enumerate}
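
The effect of these steps on the \emph{shape} of history can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Mercurial's implementation: history is a dictionary mapping each changeset name to a list of its parents, file contents are ignored entirely, and the function names are hypothetical.

```python
def backout(parents, target, working_parent, merge=False):
    """Record the changesets that backing out `target` would add.

    parents        -- dict mapping each changeset name to a list of its parents
    target         -- the changeset to back out
    working_parent -- parent of the working directory beforehand ("orig")
    Returns the new parent of the working directory.
    """
    undo = "backout of " + target
    parents[undo] = [target]  # the new changeset is a child of the target
    if merge and target != working_parent:
        merged = "merge of " + undo
        parents[merged] = [undo, working_parent]  # merge back with "orig"
        return merged
    return undo


def heads(parents):
    """Changesets that are not the parent of any other changeset."""
    used = {p for ps in parents.values() for p in ps}
    return sorted(cs for cs in parents if cs not in used)


history = {"first": [], "second": ["first"], "third": ["second"]}
backout(history, "second", "third")
print(heads(history))  # two heads: the backout changeset and "third"

history = {"first": [], "second": ["first"], "third": ["second"]}
backout(history, "second", "third", merge=True)
print(heads(history))  # one head: the merge changeset
```

Without the merge, the backout changeset hangs off the changeset it reverses and becomes a second head; with it, the model adds a merge changeset that joins the two lines of history again.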

390

391 An alternative way to implement the \hgcmd{backout} command would be

392 to \hgcmd{export} the to-be-backed-out changeset as a diff, then use

393 the \cmdopt{patch}{--reverse} option to the \command{patch} command to

394 reverse the effect of the change without fiddling with the working

395 directory. This sounds much simpler, but it would not work nearly as

396 well.

397

398 The reason that \hgcmd{backout} does an update, a commit, a merge, and

399 another commit is to give the merge machinery the best chance to do a

400 good job when dealing with all the changes \emph{between} the change

401 you're backing out and the current tip.

402

403 If you're backing out a changeset that's~100 revisions back in your

404 project's history, the chances that the \command{patch} command will

405 be able to apply a reverse diff cleanly are not good, because

406 intervening changes are likely to have ``broken the context'' that

407 \command{patch} uses to determine whether it can apply a patch (if

408 this sounds like gibberish, see section~\ref{sec:mq:patch} for a

409 discussion of the \command{patch} command). Also, Mercurial's merge

410 machinery will handle files and directories being renamed, permission

411 changes, and modifications to binary files, none of which

412 \command{patch} can deal with.

413

414 \section{Changes that should never have been}

415 \label{sec:undo:aaaiiieee}

416

417 Most of the time, the \hgcmd{backout} command is exactly what you need

418 if you want to undo the effects of a change. It leaves a permanent

419 record of exactly what you did, both when committing the original

420 changeset and when you cleaned up after it.

421

422 On rare occasions, though, you may find that you've committed a change

423 that really should not be present in the repository at all. For

424 example, it would be very unusual, and usually considered a mistake,

425 to commit a software project's object files as well as its source

426 files. Object files have almost no intrinsic value, and they're

427 \emph{big}, so they increase the size of the repository and the amount

428 of time it takes to clone or pull changes.

429

430 Before I discuss the options that you have if you commit a ``brown

431 paper bag'' change (the kind that's so bad that you want to pull a

432 brown paper bag over your head), let me first discuss some approaches

433 that probably won't work.

434

435 Since Mercurial treats history as accumulative---every change builds

436 on top of all changes that preceded it---you generally can't just make

437 disastrous changes disappear. The one exception is when you've just

438 committed a change, and it hasn't been pushed or pulled into another

439 repository. That's when you can safely use the \hgcmd{rollback}

440 command, as I detailed in section~\ref{sec:undo:rollback}.

441

442 After you've pushed a bad change to another repository, you

443 \emph{could} still use \hgcmd{rollback} to make your local copy of the

444 change disappear, but it won't have the consequences you want. The

445 change will still be present in the remote repository, so it will

446 reappear in your local repository the next time you pull.

447

448 If a situation like this arises, and you know which repositories your

449 bad change has propagated into, you can \emph{try} to get rid of the

450 change from \emph{every} one of those repositories. This is, of

451 course, not a satisfactory solution: if you miss even a single

452 repository while you're expunging, the change is still ``in the

453 wild'', and could propagate further.

454

455 If you've committed one or more changes \emph{after} the change that

456 you'd like to see disappear, your options are further reduced.

457 Mercurial doesn't provide a way to ``punch a hole'' in history,

458 leaving changesets intact.

459

460 XXX This needs filling out. The \texttt{hg-replay} script in the

461 \texttt{examples} directory works, but doesn't handle merge

462 changesets. Kind of an important omission.

463

464 \subsection{Protect yourself from ``escaped'' changes}

465

466 If you've committed some changes to your local repository and they've

467 been pushed or pulled somewhere else, this isn't necessarily a

468 disaster. You can protect yourself ahead of time against some classes

469 of bad changeset. This is particularly easy if your team usually

470 pulls changes from a central repository.

471

472 By configuring some hooks on that repository to validate incoming

473 changesets (see chapter~\ref{chap:hook}), you can automatically

474 prevent some kinds of bad changeset from being pushed to the central

475 repository at all. With such a configuration in place, some kinds of

476 bad changeset will naturally tend to ``die out'' because they can't

477 propagate into the central repository. Better yet, this happens

478 without any need for explicit intervention.

479

480 For instance, an incoming change hook that verifies that a changeset

481 will actually compile can prevent people from inadvertently ``breaking

482 the build''.

483

484 \section{Finding the source of a bug}

485 \label{sec:undo:bisect}

486

487 While it's all very well to be able to back out a changeset that

488 introduced a bug, this requires that you know which changeset to back

489 out. Mercurial provides an invaluable command, called

490 \hgcmd{bisect}, that helps you to automate this process and accomplish

491 it very efficiently.

492

493 The idea behind the \hgcmd{bisect} command is that a changeset has

494 introduced some change of behaviour that you can identify with a

495 simple binary test. You don't know which piece of code introduced the

496 change, but you know how to test for the presence of the bug. The

497 \hgcmd{bisect} command uses your test to direct its search for the

498 changeset that introduced the code that caused the bug.

499

500 Here are a few scenarios to help you understand how you might apply

501 this command.

502 \begin{itemize}

503 \item The most recent version of your software has a bug that you

504 remember wasn't present a few weeks ago, but you don't know when it

505 was introduced. Here, your binary test checks for the presence of

506 that bug.

507 \item You fixed a bug in a rush, and now it's time to close the entry

508 in your team's bug database. The bug database requires a changeset

509 ID when you close an entry, but you don't remember which changeset

510 you fixed the bug in. Once again, your binary test checks for the

511 presence of the bug.

512 \item Your software works correctly, but runs~15\% slower than the

513 last time you measured it. You want to know which changeset

514 introduced the performance regression. In this case, your binary

515 test measures the performance of your software, to see whether it's

516 ``fast'' or ``slow''.

517 \item The sizes of the components of your project that you ship

518 exploded recently, and you suspect that something changed in the way

519 you build your project.

520 \end{itemize}

521

522 From these examples, it should be clear that the \hgcmd{bisect}

523 command is useful for more than finding the sources of bugs. You can

524 use it to find any ``emergent property'' of a repository (anything

525 that you can't find from a simple text search of the files in the

526 tree) for which you can write a binary test.
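
The performance scenario shows why the test has to be binary: a measurement such as ``15\% slower'' must be reduced to a yes-or-no answer before a search can use it. Here is one way that might look in Python. It is a minimal sketch; the function name \texttt{timing\_verdict} and the idea of a fixed time limit are assumptions of this example.

```python
import time


def timing_verdict(run, limit_seconds):
    """Time one call of run() and reduce the measurement to "good" or "bad"."""
    start = time.perf_counter()
    run()
    elapsed = time.perf_counter() - start
    # The search only needs a yes/no answer, so the threshold is what
    # turns a continuous measurement into a binary outcome.
    return "bad" if elapsed > limit_seconds else "good"


print(timing_verdict(lambda: sum(range(1000)), limit_seconds=60.0))
```

In practice a single timing is noisy, so a real test would probably repeat the measurement and pick a limit comfortably between the ``fast'' and ``slow'' timings.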

527

528 We'll introduce a little bit of terminology here, just to make it

529 clear which parts of the search process are your responsibility, and

530 which are Mercurial's. A \emph{test} is something that \emph{you} run

531 when \hgcmd{bisect} chooses a changeset. A \emph{probe} is what

532 \hgcmd{bisect} runs to tell whether a revision is good. Finally,

533 we'll use the word ``bisect'', as both a noun and a verb, to stand in

534 for the phrase ``search using the \hgcmd{bisect} command''.

535

536 One simple way to automate the searching process would be simply to

537 probe every changeset. However, this scales poorly. If it took ten

538 minutes to test a single changeset, and you had 10,000 changesets in

539 your repository, the exhaustive approach would take on average~35

540 \emph{days} to find the changeset that introduced a bug. Even if you

541 knew that the bug was introduced by one of the last 500 changesets,

542 and limited your search to those, you'd still be looking at over 40

543 hours to find the changeset that introduced your bug.

544

545 What the \hgcmd{bisect} command does is use its knowledge of the

546 ``shape'' of your project's revision history to perform a search in

547 time proportional to the \emph{logarithm} of the number of changesets

548 to check (the kind of search it performs is called a dichotomic

549 search). With this approach, searching through 10,000 changesets will

550 take less than three hours, even at ten minutes per test (the search

551 will require about 14 tests). Limit your search to the last hundred

552 changesets, and it will take only about an hour (roughly seven tests).
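
The figures quoted in the last two paragraphs are easy to reproduce. The short Python calculation below is a sketch that assumes, as the text does, ten minutes per test, an exhaustive scan that finds the culprit halfway through on average, and a search that halves the candidates with every test; the function names are made up for this example.

```python
import math

MINUTES_PER_TEST = 10


def exhaustive_minutes(changesets):
    # On average the culprit turns up halfway through a linear scan.
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets):
    # Each test halves the candidates, so about log2(n) tests are needed.
    return math.ceil(math.log2(changesets))


for n in (10000, 500, 100):
    tests = bisect_tests(n)
    print(n, "changesets:",
          round(exhaustive_minutes(n) / 60, 1), "hours exhaustive,",
          tests, "tests or", tests * MINUTES_PER_TEST, "minutes bisecting")
```

For 10,000 changesets this gives roughly 833 hours (about 35 days) against 14 tests, and for 100 changesets it gives seven tests, in line with the estimates above.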

553

554 The \hgcmd{bisect} command is aware of the ``branchy'' nature of a

555 Mercurial project's revision history, so it has no problems dealing

556 with branches, merges, or multiple heads in a repository. It can

557 prune entire branches of history with a single probe, which is how it

558 operates so efficiently.

559

560 \subsection{Using the \hgcmd{bisect} command}

561

562 Here's an example of \hgcmd{bisect} in action.

563

564 \begin{note}

565 In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a

566 core command: it was distributed with Mercurial as an extension.

567 This section describes the built-in command, not the old extension.

568 \end{note}

569

570 Now let's create a repository, so that we can try out the

571 \hgcmd{bisect} command in isolation.

572 \interaction{bisect.init}

573 We'll simulate a project that has a bug in it in a simple-minded way:

574 create trivial changes in a loop, and nominate one specific change

575 that will have the ``bug''. This loop creates 35 changesets, each

576 adding a single file to the repository. We'll represent our ``bug''

577 with a file that contains the text ``i have a gub''.

578 \interaction{bisect.commits}

579

580 The next thing that we'd like to do is figure out how to use the

581 \hgcmd{bisect} command. We can use Mercurial's normal built-in help

582 mechanism for this.

583 \interaction{bisect.help}

584

585 The \hgcmd{bisect} command works in steps. Each step proceeds as follows.

586 \begin{enumerate}

587 \item You run your binary test.

588 \begin{itemize}

589 \item If the test succeeded, you tell \hgcmd{bisect} by running the

590 \hgcmdargs{bisect}{--good} command.

591 \item If it failed, run the \hgcmdargs{bisect}{--bad} command.

592 \end{itemize}

593 \item The command uses your information to decide which changeset to

594 test next.

595 \item It updates the working directory to that changeset, and the

596 process begins again.

597 \end{enumerate}

598 The process ends when \hgcmd{bisect} identifies a unique changeset

599 that marks the point where your test transitioned from ``succeeding''

600 to ``failing''.
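
The loop described above can be sketched in Python for the simplest possible case. This is an illustration only: it assumes a strictly linear history numbered by revision, whereas the real command also copes with branches and merges, and the function and parameter names are hypothetical.

```python
def bisect(good, bad, is_bad):
    """Find the first bad revision in a linear history.

    good, bad -- revisions known to pass and to fail, with good < bad
    is_bad    -- the binary test: returns True when a revision fails
    Returns (first_bad_revision, number_of_tests_run).
    """
    tests = 0
    while bad - good > 1:
        middle = (good + bad) // 2  # the revision to test next
        tests += 1
        if is_bad(middle):
            bad = middle   # the change is at or before middle
        else:
            good = middle  # the change is after middle
    return bad, tests


# Revisions 0..34 with the "bug" introduced at revision 22 (an arbitrary
# choice for this sketch); revision 10 is known good and the tip is bad.
print(bisect(10, 34, lambda rev: rev >= 22))
```

Each pass through the loop corresponds to one step of the process: run the test, report the result, and let the search choose the next revision to examine.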

601

602 To start the search, we must run the \hgcmdargs{bisect}{--reset} command.

603 \interaction{bisect.search.init}

604

605 In our case, the binary test we use is simple: we check to see if any

606 file in the repository contains the string ``i have a gub''. If it

607 does, this changeset contains the change that ``caused the bug''. By

608 convention, a changeset that has the property we're searching for is

609 ``bad'', while one that doesn't is ``good''.

610

611 Most of the time, the revision to which the working directory is

612 synced (usually the tip) already exhibits the problem introduced by

613 the buggy change, so we'll mark it as ``bad''.

614 \interaction{bisect.search.bad-init}

615

616 Our next task is to nominate a changeset that we know \emph{doesn't}

617 have the bug; the \hgcmd{bisect} command will ``bracket'' its search

618 between the first pair of good and bad changesets. In our case, we

619 know that revision~10 didn't have the bug. (I'll have more words

620 about choosing the first ``good'' changeset later.)

621 \interaction{bisect.search.good-init}

622

623 Notice that this command printed some output.

624 \begin{itemize}

625 \item It told us how many changesets it must consider before it can

626 identify the one that introduced the bug, and how many tests that

627 will require.

628 \item It updated the working directory to the next changeset to test,

629 and told us which changeset it's testing.

630 \end{itemize}

631

632 We now run our test in the working directory. We use the

633 \command{grep} command to see if any file in the working directory

634 contains our ``bad'' string. If one does, this revision is bad; if not,

635 this revision is good.

636 \interaction{bisect.search.step1}

637

638 This test looks like a perfect candidate for automation, so let's turn

639 it into a shell function.

640 \interaction{bisect.search.mytest}

641 We can now run an entire test step with a single command,

642 \texttt{mytest}.

643 \interaction{bisect.search.step2}

644 A few more invocations of our canned test step command, and we're

645 done.

646 \interaction{bisect.search.rest}

647

648 Even though we had~40 changesets to search through, the \hgcmd{bisect}

649 command let us find the changeset that introduced our ``bug'' with

650 only five tests. Because the number of tests that the \hgcmd{bisect}

651 command performs grows logarithmically with the number of changesets to

652 search, the advantage that it has over the ``brute force'' search

653 approach increases with every changeset you add.
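
As a rough sketch of why, assuming a simple linear history: each test
halves the number of candidate changesets, so a search over $N$
candidates needs about
\[
  \log_2 N
\]
tests, while testing changesets one at a time can need roughly $N$.
For $N = 40$ we have $2^5 = 32 < 40 \le 64 = 2^6$, so five or six tests
are enough, depending on where the bug falls.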

654

655 \subsection{Cleaning up after your search}

656

657 When you're finished using the \hgcmd{bisect} command in a

658 repository, you can use the \hgcmdargs{bisect}{--reset} command to drop

659 the information it was using to drive your search. This information

660 doesn't use much space, so it doesn't matter if you forget to run this

661 command. However, \hgcmd{bisect} won't let you start a new search in

662 that repository until you do a \hgcmdargs{bisect}{--reset}.

663 \interaction{bisect.search.reset}

664

665 \section{Tips for finding bugs effectively}

666

667 \subsection{Give consistent input}

668

669 The \hgcmd{bisect} command requires that you correctly report the

670 result of every test you perform. If you tell it that a test failed

671 when it really succeeded, it \emph{might} be able to detect the

672 inconsistency. If it can identify an inconsistency in your reports,

673 it will tell you that a particular changeset is both good and bad.

674 However, it can't do this perfectly; it's about as likely to report

675 the wrong changeset as the source of the bug.

676

677 \subsection{Automate as much as possible}

678

679 When I started using the \hgcmd{bisect} command, I tried a few times

680 to run my tests by hand, on the command line. This is an approach

681 that I, at least, am not suited to. After a few tries, I found that I

682 was making enough mistakes that I was having to restart my searches

683 several times before finally getting correct results.

684

685 My initial problems with driving the \hgcmd{bisect} command by hand

686 occurred even with simple searches on small repositories; if the

687 problem you're looking for is more subtle, or the number of tests that

688 \hgcmd{bisect} must perform increases, the likelihood of operator

689 error ruining the search is much higher. Once I started automating my

690 tests, I had much better results.

691

692 The key to automated testing is twofold:

693 \begin{itemize}

694 \item always test for the same symptom, and

695 \item always feed consistent input to the \hgcmd{bisect} command.

696 \end{itemize}

697 In my tutorial example above, the \command{grep} command tests for the

698 symptom, and the \texttt{if} statement takes the result of this check

699 and ensures that we always feed the same input to the \hgcmd{bisect}

700 command. The \texttt{mytest} function marries these together in a

701 reproducible way, so that every test is uniform and consistent.
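
The same pairing of ``test for the symptom'' and ``report consistently''
can be sketched in Python. This is an illustration rather than the
book's shell function: the helper names \texttt{tree\_has\_string},
\texttt{verdict} and \texttt{run\_step} are hypothetical, and
\texttt{run\_step} assumes that \texttt{hg} is on your path and that a
bisect search is already under way.

```python
import pathlib
import subprocess


def tree_has_string(root, needle):
    """Return True if any regular file under `root` contains `needle`."""
    root = pathlib.Path(root)
    for path in root.rglob("*"):
        # Skip Mercurial's own metadata and anything that isn't a file.
        if ".hg" in path.relative_to(root).parts or not path.is_file():
            continue
        if needle in path.read_text(errors="ignore"):
            return True
    return False


def verdict(root, needle="i have a gub"):
    """Map the symptom check to the flag we pass to `hg bisect`."""
    return "--bad" if tree_has_string(root, needle) else "--good"


def run_step(root):
    """Run one test step: check the symptom, then report it to Mercurial."""
    flag = verdict(root)
    subprocess.run(["hg", "bisect", flag], cwd=root, check=True)
    return flag
```

Because the decision is made in one place, every step reports its
result to \hgcmd{bisect} in exactly the same way.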

702

703 \subsection{Check your results}

704

705 Because the output of a \hgcmd{bisect} search is only as good as the

706 input you give it, don't take the changeset it reports as the

707 absolute truth. A simple way to cross-check its report is to manually

708 run your test at each of the following changesets:

709 \begin{itemize}

710 \item The changeset that it reports as the first bad revision. Your

711 test should still report this as bad.

712 \item The parent of that changeset (either parent, if it's a merge).

713 Your test should report this changeset as good.

714 \item A child of that changeset. Your test should report this

715 changeset as bad.

716 \end{itemize}
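
The three checks in this list can be written down as a small Python
sketch. The function \texttt{cross\_check} and its parameters are
hypothetical; \texttt{is\_bad} stands for whatever test you ran during
the search, applied to a changeset you have checked out yourself.

```python
def cross_check(is_bad, first_bad, parent, child):
    """Return a list of problems; an empty list means the report holds up."""
    problems = []
    if not is_bad(first_bad):
        problems.append(f"reported changeset {first_bad} tests good")
    if is_bad(parent):
        problems.append(f"parent {parent} tests bad")
    if not is_bad(child):
        problems.append(f"child {child} tests good")
    return problems


# Hypothetical example: the bug really does start at revision 22.
print(cross_check(lambda rev: rev >= 22, first_bad=22, parent=21, child=23))
```

Any non-empty result suggests that at least one answer given during the
search was wrong, and that the search is worth repeating.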

717

718 \subsection{Beware interference between bugs}

719

720 It's possible that your search for one bug could be disrupted by the

721 presence of another. For example, let's say your software crashes at

722 revision 100, and worked correctly at revision 50. Unknown to you,

723 someone else introduced a different crashing bug at revision 60, and

724 fixed it at revision 80. This could distort your results in one of

725 several ways.

726

727 It is possible that this other bug completely ``masks'' yours, which

728 is to say that it occurs before your bug has a chance to manifest

729 itself. If you can't avoid that other bug (for example, it prevents

730 your project from building), and so can't tell whether your bug is

731 present in a particular changeset, the \hgcmd{bisect} command cannot

732 help you directly. Instead, you can mark a changeset as untested by

733 running \hgcmdargs{bisect}{--skip}.

734

735 A different problem could arise if your test for a bug's presence is

736 not specific enough. If you check for ``my program crashes'', then

737 both your crashing bug and an unrelated crashing bug that masks it

738 will look like the same thing, and mislead \hgcmd{bisect}.

739

740 Another useful situation in which to use \hgcmdargs{bisect}{--skip} is

741 when you can't test a revision because your project was in a broken and

742 hence untestable state at that revision, perhaps because someone

743 checked in a change that prevented the project from building.
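
One way to guard against both problems is to make your test return
three answers rather than two. The Python sketch below is hypothetical
throughout (the function \texttt{classify}, its parameters, and the idea
of matching on a crash message are assumptions made for illustration):
a changeset that can't be built, or that crashes for some other reason,
is reported as ``skip'' instead of being guessed at.

```python
def classify(builds, crashes, crash_message, my_bug_signature):
    """Return "skip", "bad", or "good" for one changeset."""
    if not builds:
        return "skip"   # untestable: the project doesn't build
    if not crashes:
        return "good"   # no crash at all, so our bug is absent
    if my_bug_signature in crash_message:
        return "bad"    # this is the crash we are hunting
    # Some other crash got in first; we can't tell whether ours is present.
    return "skip"
```

Each answer corresponds to one of the \hgcmdargs{bisect}{--good},
\hgcmdargs{bisect}{--bad} and \hgcmdargs{bisect}{--skip} commands.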

744

745 \subsection{Bracket your search lazily}

746

747 Choosing the first ``good'' and ``bad'' changesets that will mark the

748 end points of your search is often easy, but it bears a little

749 discussion nevertheless. From the perspective of \hgcmd{bisect}, the

750 ``newest'' changeset is conventionally ``bad'', and the older

751 changeset is ``good''.

752

753 If you're having trouble remembering when a suitable ``good'' change

754 was, so that you can tell \hgcmd{bisect}, you could do worse than

755 testing changesets at random. Just remember to eliminate contenders

756 that can't possibly exhibit the bug (perhaps because the feature with

757 the bug isn't present yet) and those where another problem masks the

758 bug (as I discussed above).

759

760 Even if you end up ``early'' by thousands of changesets or months of

761 history, you will only add a handful of tests to the total number that

762 \hgcmd{bisect} must perform, thanks to its logarithmic behaviour.
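
To put a rough number on this, take the estimate that a search over $N$
changesets needs about $\log_2 N$ tests (an approximation that assumes
a simple linear history). Widening the range from $N$ to $kN$
changesets then adds about
\[
  \log_2 (kN) - \log_2 N = \log_2 k
\]
tests. Since $\log_2 1000 \approx 10$, a bracket a thousand times wider
than it needs to be costs only about ten extra tests.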

763

764 %%% Local Variables:

765 %%% mode: latex

766 %%% TeX-master: "00book"

767 %%% End: