hgbook: e0a4ba81f888 en/ch08-undo.xml

view en/ch08-undo.xml @ 595:e0a4ba81f888

Add throbber for comment submission

author	Bryan O'Sullivan <bos@serpentine.com>
date	Thu Mar 26 22:07:30 2009 -0700 (2009-03-26)
parents	4ce9d0754af3
children	1c13ed2130a7

3 <chapter id="chap:undo">

4 <?dbhtml filename="finding-and-fixing-mistakes.html"?>

5 <title>Finding and fixing mistakes</title>

7 <para id="x_d2">To err might be human, but to really handle the consequences

8 well takes a top-notch revision control system. In this chapter,

9 we'll discuss some of the techniques you can use when you find

10 that a problem has crept into your project. Mercurial has some

11 highly capable features that will help you to isolate the sources

12 of problems, and to handle them appropriately.</para>

14 <sect1>

15 <title>Erasing local history</title>

17 <sect2>

18 <title>The accidental commit</title>

20 <para id="x_d3">I have the occasional but persistent problem of typing

21 rather more quickly than I can think, which sometimes results

22 in me committing a changeset that is either incomplete or

23 plain wrong. In my case, the usual kind of incomplete

24 changeset is one in which I've created a new source file, but

25 forgotten to <command role="hg-cmd">hg add</command> it. A

26 <quote>plain wrong</quote> changeset is not as common, but no

27 less annoying.</para>

29 </sect2>

30 <sect2 id="sec:undo:rollback">

31 <title>Rolling back a transaction</title>

33 <para id="x_d4">In <xref linkend="sec:concepts:txn"/>, I

34 mentioned that Mercurial treats each modification of a

35 repository as a <emphasis>transaction</emphasis>. Every time

36 you commit a changeset or pull changes from another

37 repository, Mercurial remembers what you did. You can undo,

38 or <emphasis>roll back</emphasis>, exactly one of these

39 actions using the <command role="hg-cmd">hg rollback</command>

40 command. (See <xref linkend="sec:undo:rollback-after-push"/>

41 for an important caveat about the use of this command.)</para>

43 <para id="x_d5">Here's a mistake that I often find myself making:

44 committing a change in which I've created a new file, but

45 forgotten to <command role="hg-cmd">hg add</command>

46 it.</para>

48 &interaction.rollback.commit;

50 <para id="x_d6">Looking at the output of <command role="hg-cmd">hg

51 status</command> after the commit immediately confirms the

52 error.</para>

54 &interaction.rollback.status;

56 <para id="x_d7">The commit captured the changes to the file

57 <filename>a</filename>, but not the new file

58 <filename>b</filename>. If I were to push this changeset to a

59 repository that I shared with a colleague, the chances are

60 high that something in <filename>a</filename> would refer to

61 <filename>b</filename>, which would not be present in their

62 repository when they pulled my changes. I would thus become

63 the object of some indignation.</para>

65 <para id="x_d8">However, luck is with me&emdash;I've caught my error

66 before I pushed the changeset. I use the <command

67 role="hg-cmd">hg rollback</command> command, and Mercurial

68 makes that last changeset vanish.</para>

70 &interaction.rollback.rollback;

72 <para id="x_d9">Notice that the changeset is no longer present in the

73 repository's history, and the working directory once again

74 thinks that the file <filename>a</filename> is modified. The

75 commit and rollback have left the working directory exactly as

76 it was prior to the commit; the changeset has been completely

77 erased. I can now safely <command role="hg-cmd">hg

78 add</command> the file <filename>b</filename>, and rerun my

79 commit.</para>

81 &interaction.rollback.add;
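<para>As a rough sketch, the whole recovery can be wrapped in one shell function. The name <literal>recommit_with_file</literal> and its two arguments are hypothetical, chosen only for illustration. The sketch assumes that the bad commit is the most recent transaction in the repository and that it has not been pushed or pulled anywhere else.</para>

```shell
# Hypothetical helper: undo the last commit, add the forgotten file,
# and commit again. Assumes the bad commit is the most recent
# transaction and has not been pushed or pulled elsewhere.
recommit_with_file() {
    forgotten=$1
    message=$2
    hg rollback || return 1          # erase the incomplete changeset
    hg add "$forgotten" || return 1  # start tracking the missing file
    hg commit -m "$message"          # redo the commit, now complete
}
```

<para>A call such as <literal>recommit_with_file b "Add a and b"</literal> would then replay the three steps shown above in order, stopping early if any of them fails.</para>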

83 </sect2>

84 <sect2>

85 <title>The erroneous pull</title>

87 <para id="x_da">It's common practice with Mercurial to maintain separate

88 development branches of a project in different repositories.

89 Your development team might have one shared repository for

90 your project's <quote>0.9</quote> release, and another,

91 containing different changes, for the <quote>1.0</quote>

92 release.</para>

94 <para id="x_db">Given this, you can imagine that the consequences could be

95 messy if you had a local <quote>0.9</quote> repository, and

96 accidentally pulled changes from the shared <quote>1.0</quote>

97 repository into it. At worst, you could be paying

98 insufficient attention, and push those changes into the shared

99 <quote>0.9</quote> tree, confusing your entire team (but don't

100 worry, we'll return to this horror scenario later). However,

101 it's more likely that you'll notice immediately, because

102 Mercurial will display the URL it's pulling from, or you will

103 see it pull a suspiciously large number of changes into the

104 repository.</para>

105

106 <para id="x_dc">The <command role="hg-cmd">hg rollback</command> command

107 will work nicely to expunge all of the changesets that you

108 just pulled. Mercurial groups all changes from one <command

109 role="hg-cmd">hg pull</command> into a single transaction,

110 so one <command role="hg-cmd">hg rollback</command> is all you

111 need to undo this mistake.</para>
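<para>A minimal sketch of the mistake and its remedy follows. The function name <literal>undo_wrong_pull</literal> is hypothetical, and the sketch assumes that no other transaction (another commit or pull) happens in the repository between the pull and the rollback.</para>

```shell
# Hypothetical demonstration: pull from the wrong repository, then
# undo it. Assumes nothing else modifies the repository in between.
undo_wrong_pull() {
    wrong_source=$1
    hg pull "$wrong_source" || return 1  # all incoming changesets form one transaction
    hg rollback                          # so a single rollback removes them all
}
```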

112

113 </sect2>

114 <sect2 id="sec:undo:rollback-after-push">

115 <title>Rolling back is useless once you've pushed</title>

116

117 <para id="x_dd">The value of the <command role="hg-cmd">hg

118 rollback</command> command drops to zero once you've pushed

119 your changes to another repository. Rolling back a change

120 makes it disappear entirely, but <emphasis>only</emphasis> in

121 the repository in which you perform the <command

122 role="hg-cmd">hg rollback</command>. Because a rollback

123 eliminates history, there's no way for the disappearance of a

124 change to propagate between repositories.</para>

125

126 <para id="x_de">If you've pushed a change to another

127 repository&emdash;particularly if it's a shared

128 repository&emdash;it has essentially <quote>escaped into the

129 wild,</quote> and you'll have to recover from your mistake

130 in a different way. What will happen if you push a changeset

131 somewhere, then roll it back, then pull from the repository

132 you pushed to, is that the changeset will reappear in your

133 repository.</para>

134

135 <para id="x_df">(If you absolutely know for sure that the change you want

136 to roll back is the most recent change in the repository that

137 you pushed to, <emphasis>and</emphasis> you know that nobody

138 else could have pulled it from that repository, you can roll

139 back the changeset there, too, but you really should

140 not rely on this working reliably. If you do this, sooner or

141 later a change really will make it into a repository that you

142 don't directly control (or have forgotten about), and come

143 back to bite you.)</para>

144

145 </sect2>

146 <sect2>

147 <title>You can only roll back once</title>

148

149 <para id="x_e0">Mercurial stores exactly one transaction in its

150 transaction log; that transaction is the most recent one that

151 occurred in the repository. This means that you can only roll

152 back one transaction. If you expect to be able to roll back

153 one transaction, then its predecessor, this is not the

154 behaviour you will get.</para>

155

156 &interaction.rollback.twice;

157

158 <para id="x_e1">Once you've rolled back one transaction in a repository,

159 you can't roll back again in that repository until you perform

160 another commit or pull.</para>

161

162 </sect2>

163 </sect1>

164 <sect1>

165 <title>Reverting the mistaken change</title>

166

167 <para id="x_e2">If you make a modification to a file, and decide that you

168 really didn't want to change the file at all, and you haven't

169 yet committed your changes, the <command role="hg-cmd">hg

170 revert</command> command is the one you'll need. It looks at

171 the changeset that's the parent of the working directory, and

172 restores the contents of the file to their state as of that

173 changeset. (That's a long-winded way of saying that, in the

174 normal case, it undoes your modifications.)</para>

175

176 <para id="x_e3">Let's illustrate how the <command role="hg-cmd">hg

177 revert</command> command works with yet another small example.

178 We'll begin by modifying a file that Mercurial is already

179 tracking.</para>

180

181 &interaction.daily.revert.modify;

182

183 <para id="x_e4">If we don't

184 want that change, we can simply <command role="hg-cmd">hg

185 revert</command> the file.</para>

186

187 &interaction.daily.revert.unmodify;

188

189 <para id="x_e5">The <command role="hg-cmd">hg revert</command> command

190 provides us with an extra degree of safety by saving our

191 modified file with a <filename>.orig</filename>

192 extension.</para>

193

194 &interaction.daily.revert.status;

195

196 <para id="x_e6">Here is a summary of the cases that the <command

197 role="hg-cmd">hg revert</command> command can deal with. We

198 will describe each of these in more detail in the section that

199 follows.</para>

200 <itemizedlist>

201 <listitem><para id="x_e7">If you modify a file, it will restore the file

202 to its unmodified state.</para>

203 </listitem>

204 <listitem><para id="x_e8">If you <command role="hg-cmd">hg add</command> a

205 file, it will undo the <quote>added</quote> state of the

206 file, but leave the file itself untouched.</para>

207 </listitem>

208 <listitem><para id="x_e9">If you delete a file without telling Mercurial,

209 it will restore the file to its unmodified contents.</para>

210 </listitem>

211 <listitem><para id="x_ea">If you use the <command role="hg-cmd">hg

212 remove</command> command to remove a file, it will undo

213 the <quote>removed</quote> state of the file, and restore

214 the file to its unmodified contents.</para>

215 </listitem></itemizedlist>
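<para>The four cases in the list above can be sketched as one short walk-through. The function name <literal>revert_cases</literal> and the file names <filename>tracked</filename> and <filename>newfile</filename> are placeholders. The sketch assumes it runs at the root of a repository that already tracks <filename>tracked</filename> and contains an untracked <filename>newfile</filename>.</para>

```shell
# Hypothetical walk-through of the four cases. Run at the root of a
# repository that already tracks "tracked" and has an untracked
# "newfile"; both file names are placeholders.
revert_cases() {
    # 1. Modified file: contents return to the parent changeset's version.
    echo "unwanted edit" >> tracked
    hg revert tracked

    # 2. Added file: the "added" state is dropped; the file itself stays.
    hg add newfile
    hg revert newfile

    # 3. File deleted without telling Mercurial: it is restored.
    rm tracked
    hg revert tracked

    # 4. File removed with hg remove: the "removed" state is undone
    #    and the file is restored.
    hg remove tracked
    hg revert tracked
}
```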

216

217 <sect2 id="sec:undo:mgmt">

218 <title>File management errors</title>

219

220 <para id="x_eb">The <command role="hg-cmd">hg revert</command> command is

221 useful for more than just modified files. It lets you reverse

222 the results of all of Mercurial's file management

223 commands&emdash;<command role="hg-cmd">hg add</command>,

224 <command role="hg-cmd">hg remove</command>, and so on.</para>

225

226 <para id="x_ec">If you <command role="hg-cmd">hg add</command> a file,

227 then decide that in fact you don't want Mercurial to track it,

228 use <command role="hg-cmd">hg revert</command> to undo the

229 add. Don't worry; Mercurial will not modify the file in any

230 way. It will just <quote>unmark</quote> the file.</para>

231

232 &interaction.daily.revert.add;

233

234 <para id="x_ed">Similarly, if you ask Mercurial to <command

235 role="hg-cmd">hg remove</command> a file, you can use

236 <command role="hg-cmd">hg revert</command> to restore it to

237 the contents it had as of the parent of the working directory.

238 &interaction.daily.revert.remove; This works just as

239 well for a file that you deleted by hand, without telling

240 Mercurial (recall that in Mercurial terminology, this kind of

241 file is called <quote>missing</quote>).</para>

242

243 &interaction.daily.revert.missing;

244

245 <para id="x_ee">If you revert a <command role="hg-cmd">hg copy</command>,

246 the copied-to file remains in your working directory

247 afterwards, untracked. Since a copy doesn't affect the

248 copied-from file in any way, Mercurial doesn't do anything

249 with the copied-from file.</para>

250

251 &interaction.daily.revert.copy;

252

253 <sect3>

254 <title>A slightly special case: reverting a rename</title>

255

256 <para id="x_ef">If you <command role="hg-cmd">hg rename</command> a

257 file, there is one small detail that you should remember.

258 When you <command role="hg-cmd">hg revert</command> a

259 rename, it's not enough to provide the name of the

260 renamed-to file, as you can see here.</para>

261

262 &interaction.daily.revert.rename;

263

264 <para id="x_f0">As you can see from the output of <command

265 role="hg-cmd">hg status</command>, the renamed-to file is

266 no longer identified as added, but the

267 renamed-<emphasis>from</emphasis> file is still removed!

268 This is counter-intuitive (at least to me), but at least

269 it's easy to deal with.</para>

270

271 &interaction.daily.revert.rename-orig;

272

273 <para id="x_f1">So remember, to revert a <command role="hg-cmd">hg

274 rename</command>, you must provide

275 <emphasis>both</emphasis> the source and destination

276 names.</para>
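<para>In command form, the rule amounts to the following sketch. The function name <literal>revert_rename</literal> is hypothetical, and it assumes the rename has not yet been committed.</para>

```shell
# Hypothetical helper: undo an uncommitted "hg rename OLD NEW".
# Naming both files clears the "added" state of NEW and the
# "removed" state of OLD in a single revert.
revert_rename() {
    old_name=$1
    new_name=$2
    hg revert "$old_name" "$new_name"
}
```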

277

278 <para id="x_f2">% TODO: the output doesn't look like it will be

279 removed!</para>

280

281 <para id="x_f3">(By the way, if you rename a file, then modify the

282 renamed-to file, then revert both components of the rename,

283 when Mercurial restores the file that was removed as part of

284 the rename, it will be unmodified. If you need the

285 modifications in the renamed-to file to show up in the

286 renamed-from file, don't forget to copy them over.)</para>

287

288 <para id="x_f4">These fiddly aspects of reverting a rename arguably

289 constitute a small bug in Mercurial.</para>

290

291 </sect3>

292 </sect2>

293 </sect1>

294 <sect1>

295 <title>Dealing with committed changes</title>

296

297 <para id="x_f5">Consider a case where you have committed a change <emphasis>a</emphasis>, and

298 another change <emphasis>b</emphasis> on top of it; you then realise that change

299 <emphasis>a</emphasis> was incorrect. Mercurial lets you <quote>back out</quote>

300 an entire changeset automatically, and gives you building blocks that let

301 you reverse part of a changeset by hand.</para>

302

303 <para id="x_f6">Before you read this section, here's something to

304 keep in mind: the <command role="hg-cmd">hg backout</command>

305 command undoes changes by <emphasis>adding</emphasis> history,

306 not by modifying or erasing it. It's the right tool to use if

307 you're fixing bugs, but not if you're trying to undo some change

308 that has catastrophic consequences. To deal with those, see

309 <xref linkend="sec:undo:aaaiiieee"/>.</para>

310

311 <sect2>

312 <title>Backing out a changeset</title>

313

314 <para id="x_f7">The <command role="hg-cmd">hg backout</command> command

315 lets you <quote>undo</quote> the effects of an entire

316 changeset in an automated fashion. Because Mercurial's

317 history is immutable, this command <emphasis>does

318 not</emphasis> get rid of the changeset you want to undo.

319 Instead, it creates a new changeset that

320 <emphasis>reverses</emphasis> the effect of the to-be-undone

321 changeset.</para>

322

323 <para id="x_f8">The operation of the <command role="hg-cmd">hg

324 backout</command> command is a little intricate, so let's

325 illustrate it with some examples. First, we'll create a

326 repository with some simple changes.</para>

327

328 &interaction.backout.init;

329

330 <para id="x_f9">The <command role="hg-cmd">hg backout</command> command

331 takes a single changeset ID as its argument; this is the

332 changeset to back out. Normally, <command role="hg-cmd">hg

333 backout</command> will drop you into a text editor to write

334 a commit message, so you can record why you're backing the

335 change out. In this example, we provide a commit message on

336 the command line using the <option

337 role="hg-opt-backout">-m</option> option.</para>

338

339 </sect2>

340 <sect2>

341 <title>Backing out the tip changeset</title>

342

343 <para id="x_fa">We're going to start by backing out the last changeset we

344 committed.</para>

345

346 &interaction.backout.simple;

347

348 <para id="x_fb">You can see that the second line from

349 <filename>myfile</filename> is no longer present. Taking a

350 look at the output of <command role="hg-cmd">hg log</command>

351 gives us an idea of what the <command role="hg-cmd">hg

352 backout</command> command has done.

353 &interaction.backout.simple.log; Notice that the new changeset

354 that <command role="hg-cmd">hg backout</command> has created

355 is a child of the changeset we backed out. It's easier to see

356 this in <xref linkend="fig:undo:backout"/>, which presents a

357 graphical view of the change history. As you can see, the

358 history is nice and linear.</para>

359

360 <figure id="fig:undo:backout">

361 <title>Backing out a change using the <command

362 role="hg-cmd">hg backout</command> command</title>

363 <mediaobject>

364 <imageobject><imagedata fileref="figs/undo-simple.png"/></imageobject>

365 <textobject><phrase>XXX add text</phrase></textobject>

366 </mediaobject>

367 </figure>

368

369 </sect2>

370 <sect2>

371 <title>Backing out a non-tip change</title>

372

373 <para id="x_fd">If you want to back out a change other than the last one

374 you committed, pass the <option

375 role="hg-opt-backout">--merge</option> option to the

376 <command role="hg-cmd">hg backout</command> command.</para>

377

378 &interaction.backout.non-tip.clone;

379

380 <para id="x_fe">This makes backing out any changeset a

381 <quote>one-shot</quote> operation that's usually simple and

382 fast.</para>

383

384 &interaction.backout.non-tip.backout;

385

386 <para id="x_ff">If you take a look at the contents of

387 <filename>myfile</filename> after the backout finishes, you'll

388 see that the first and third changes are present, but not the

389 second.</para>

390

391 &interaction.backout.non-tip.cat;

392

393 <para id="x_100">As the graphical history in <xref

394 linkend="fig:undo:backout-non-tip"/> illustrates, Mercurial

395 actually commits <emphasis>two</emphasis> changes in this kind

396 of situation (the box-shaped nodes are the ones that Mercurial

397 commits automatically). Before Mercurial begins the backout

398 process, it first remembers what the current parent of the

399 working directory is. It then backs out the target changeset,

400 and commits that as a changeset. Finally, it merges back to

401 the previous parent of the working directory, and commits the

402 result of the merge.</para>

403

404 <para id="x_101">% TODO: to me it looks like mercurial doesn't commit the

405 second merge automatically!</para>

406

407 <figure id="fig:undo:backout-non-tip">

408 <title>Automated backout of a non-tip change using the

409 <command role="hg-cmd">hg backout</command> command</title>

410 <mediaobject>

411 <imageobject><imagedata fileref="figs/undo-non-tip.png"/></imageobject>

412 <textobject><phrase>XXX add text</phrase></textobject>

413 </mediaobject>

414 </figure>

415

416 <para id="x_103">The result is that you end up <quote>back where you

417 were</quote>, only with some extra history that undoes the

418 effect of the changeset you wanted to back out.</para>

419

420 <sect3>

421 <title>Always use the <option

422 role="hg-opt-backout">--merge</option> option</title>

423

424 <para id="x_104">In fact, since the <option

425 role="hg-opt-backout">--merge</option> option will do the

426 <quote>right thing</quote> whether or not the changeset

427 you're backing out is the tip (i.e. it won't try to merge if

428 it's backing out the tip, since there's no need), you should

429 <emphasis>always</emphasis> use this option when you run the

430 <command role="hg-cmd">hg backout</command> command.</para>
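<para>One way to follow this advice consistently is a small wrapper, sketched here. The name <literal>backout_changeset</literal> is hypothetical. The behaviour of <option role="hg-opt-backout">--merge</option> is as described in this chapter; it may differ in other Mercurial releases, so check <command role="hg-cmd">hg help backout</command> for the version you run.</para>

```shell
# Hypothetical wrapper that always passes --merge, as advised above.
# The commit message is given on the command line with -m, so no
# editor is started.
backout_changeset() {
    rev=$1
    reason=$2
    hg backout --merge -m "$reason" "$rev"
}
```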

431

432 </sect3>

433 </sect2>

434 <sect2>

435 <title>Gaining more control of the backout process</title>

436

437 <para id="x_105">While I've recommended that you always use the <option

438 role="hg-opt-backout">--merge</option> option when backing

439 out a change, the <command role="hg-cmd">hg backout</command>

440 command lets you decide how to merge a backout changeset.

441 Taking control of the backout process by hand is something you

442 will rarely need to do, but it can be useful to understand

443 what the <command role="hg-cmd">hg backout</command> command

444 is doing for you automatically. To illustrate this, let's

445 clone our first repository, but omit the backout change that

446 it contains.</para>

447

448 &interaction.backout.manual.clone;

449

450 <para id="x_106">As with our

451 earlier example, we'll commit a third changeset, then back out

452 its parent, and see what happens.</para>

453

454 &interaction.backout.manual.backout;

455

456 <para id="x_107">Our new changeset is again a descendant of the changeset

457 we backed out; it's thus a new head, <emphasis>not</emphasis>

458 a descendant of the changeset that was the tip. The <command

459 role="hg-cmd">hg backout</command> command was quite

460 explicit in telling us this.</para>

461

462 &interaction.backout.manual.log;

463

464 <para id="x_108">Again, it's easier to see what has happened by looking at

465 a graph of the revision history, in <xref

466 linkend="fig:undo:backout-manual"/>. This makes it clear

467 that when we use <command role="hg-cmd">hg backout</command>

468 to back out a change other than the tip, Mercurial adds a new

469 head to the repository (the change it committed is

470 box-shaped).</para>

471

472 <figure id="fig:undo:backout-manual">

473 <title>Backing out a change using the <command

474 role="hg-cmd">hg backout</command> command</title>

475 <mediaobject>

476 <imageobject><imagedata fileref="figs/undo-manual.png"/></imageobject>

477 <textobject><phrase>XXX add text</phrase></textobject>

478 </mediaobject>

479 </figure>

480

481 <para id="x_10a">After the <command role="hg-cmd">hg backout</command>

482 command has completed, it leaves the new

483 <quote>backout</quote> changeset as the parent of the working

484 directory.</para>

485

486 &interaction.backout.manual.parents;

487

488 <para id="x_10b">Now we have two isolated sets of changes.</para>

489

490 &interaction.backout.manual.heads;

491

492 <para id="x_10c">Let's think about what we expect to see as the contents of

493 <filename>myfile</filename> now. The first change should be

494 present, because we've never backed it out. The second change

495 should be missing, as that's the change we backed out. Since

496 the history graph shows the third change as a separate head,

497 we <emphasis>don't</emphasis> expect to see the third change

498 present in <filename>myfile</filename>.</para>

499

500 &interaction.backout.manual.cat;

501

502 <para id="x_10d">To get the third change back into the file, we just do a

503 normal merge of our two heads.</para>

504

505 &interaction.backout.manual.merge;

506

507 <para id="x_10e">Afterwards, the graphical history of our

508 repository looks like

509 <xref linkend="fig:undo:backout-manual-merge"/>.</para>

510

511 <figure id="fig:undo:backout-manual-merge">

512 <title>Manually merging a backout change</title>

513 <mediaobject>

514 <imageobject><imagedata fileref="figs/undo-manual-merge.png"/></imageobject>

515 <textobject><phrase>XXX add text</phrase></textobject>

516 </mediaobject>

517 </figure>

518

519 </sect2>

520 <sect2>

521 <title>Why <command role="hg-cmd">hg backout</command> works as

522 it does</title>

523

524 <para id="x_110">Here's a brief description of how the <command

525 role="hg-cmd">hg backout</command> command works.</para>

526 <orderedlist>

527 <listitem><para id="x_111">It ensures that the working directory is

528 <quote>clean</quote>, i.e. that the output of <command

529 role="hg-cmd">hg status</command> would be empty.</para>

530 </listitem>

531 <listitem><para id="x_112">It remembers the current parent of the working

532 directory. Let's call this changeset

533 <literal>orig</literal>.</para>

534 </listitem>

535 <listitem><para id="x_113">It does the equivalent of a <command

536 role="hg-cmd">hg update</command> to sync the working

537 directory to the changeset you want to back out. Let's

538 call this changeset <literal>backout</literal>.</para>

539 </listitem>

540 <listitem><para id="x_114">It finds the parent of that changeset. Let's

541 call that changeset <literal>parent</literal>.</para>

542 </listitem>

543 <listitem><para id="x_115">For each file that the

544 <literal>backout</literal> changeset affected, it does the

545 equivalent of a <command role="hg-cmd">hg revert -r

546 parent</command> on that file, to restore it to the

547 contents it had before that changeset was

548 committed.</para>

549 </listitem>

550 <listitem><para id="x_116">It commits the result as a new changeset.

551 This changeset has <literal>backout</literal> as its

552 parent.</para>

553 </listitem>

554 <listitem><para id="x_117">If you specify <option

555 role="hg-opt-backout">--merge</option> on the command

556 line, it merges with <literal>orig</literal>, and commits

557 the result of the merge.</para>

558 </listitem></orderedlist>
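<para>The steps above can be mirrored with ordinary commands, as in the sketch below. This is not how <command role="hg-cmd">hg backout</command> is implemented; it only follows the description. The function name <literal>manual_backout</literal> is hypothetical, and the sketch assumes that the target is neither the tip nor a merge changeset and that the working directory has a single parent.</para>

```shell
# Sketch of the steps above using ordinary commands. This is NOT the
# implementation of hg backout; it only mirrors the description.
manual_backout() {
    target=$1    # the changeset to back out ("backout" in the list)
    message=$2

    # 1. Refuse to run unless the working directory is clean.
    if [ -n "$(hg status)" ]; then
        echo "working directory is not clean" >&2
        return 1
    fi

    # 2. Remember the current parent of the working directory ("orig").
    orig=$(hg parents --template '{node}')

    # 3. Sync the working directory to the changeset being backed out.
    hg update -C "$target" || return 1

    # 4. Find that changeset's parent ("parent").
    parent=$(hg parents -r "$target" --template '{node}')

    # 5. Restore every file to the contents it had as of "parent".
    hg revert --all -r "$parent" || return 1

    # 6. Commit; the new changeset's parent is the backed-out changeset.
    hg commit -m "$message" || return 1

    # 7. The --merge step: merge with "orig" and commit the result.
    hg merge "$orig" || return 1
    hg commit -m "Merge backout of $target"
}
```

<para>Step 5 reverts the whole working directory rather than looping over individual files. Because the working directory was just synced to the target changeset, the only files that differ from <literal>parent</literal> are the ones that changeset affected.</para>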

559

560 <para id="x_118">An alternative way to implement the <command

561 role="hg-cmd">hg backout</command> command would be to

562 <command role="hg-cmd">hg export</command> the

563 to-be-backed-out changeset as a diff, then use the <option

564 role="cmd-opt-patch">--reverse</option> option to the

565 <command>patch</command> command to reverse the effect of the

566 change without fiddling with the working directory. This

567 sounds much simpler, but it would not work nearly as

568 well.</para>

569

570 <para id="x_119">The reason that <command role="hg-cmd">hg

571 backout</command> does an update, a commit, a merge, and

572 another commit is to give the merge machinery the best chance

573 to do a good job when dealing with all the changes

574 <emphasis>between</emphasis> the change you're backing out and

575 the current tip.</para>

576

577 <para id="x_11a">If you're backing out a changeset that's 100 revisions

578 back in your project's history, the chances that the

579 <command>patch</command> command will be able to apply a

580 reverse diff cleanly are not good, because intervening changes

581 are likely to have <quote>broken the context</quote> that

582 <command>patch</command> uses to determine whether it can

583 apply a patch (if this sounds like gibberish, see <xref

584 linkend="sec:mq:patch"/> for a

585 discussion of the <command>patch</command> command). Also,

586 Mercurial's merge machinery will handle files and directories

587 being renamed, permission changes, and modifications to binary

588 files, none of which <command>patch</command> can deal

589 with.</para>

590

591 </sect2>

592 </sect1>

593 <sect1 id="sec:undo:aaaiiieee">

594 <title>Changes that should never have been</title>

595

596 <para id="x_11b">Most of the time, the <command role="hg-cmd">hg

597 backout</command> command is exactly what you need if you want

598 to undo the effects of a change. It leaves a permanent record

599 of exactly what you did, both when committing the original

600 changeset and when you cleaned up after it.</para>

601

602 <para id="x_11c">On rare occasions, though, you may find that you've

603 committed a change that really should not be present in the

604 repository at all. For example, it would be very unusual, and

605 usually considered a mistake, to commit a software project's

606 object files as well as its source files. Object files have

607 almost no intrinsic value, and they're <emphasis>big</emphasis>,

608 so they increase the size of the repository and the amount of

609 time it takes to clone or pull changes.</para>

610

611 <para id="x_11d">Before I discuss the options that you have if you commit a

612 <quote>brown paper bag</quote> change (the kind that's so bad

613 that you want to pull a brown paper bag over your head), let me

614 first discuss some approaches that probably won't work.</para>

615

616 <para id="x_11e">Since Mercurial treats history as

617 accumulative&emdash;every change builds on top of all changes

618 that preceded it&emdash;you generally can't just make disastrous

619 changes disappear. The one exception is when you've just

620 committed a change, and it hasn't been pushed or pulled into

621 another repository. That's when you can safely use the <command

622 role="hg-cmd">hg rollback</command> command, as I detailed in

623 <xref linkend="sec:undo:rollback"/>.</para>

624

625 <para id="x_11f">After you've pushed a bad change to another repository, you

626 <emphasis>could</emphasis> still use <command role="hg-cmd">hg

627 rollback</command> to make your local copy of the change

628 disappear, but it won't have the consequences you want. The

629 change will still be present in the remote repository, so it

630 will reappear in your local repository the next time you

631 pull.</para>

632

633 <para id="x_120">If a situation like this arises, and you know which

634 repositories your bad change has propagated into, you can

635 <emphasis>try</emphasis> to get rid of the change from

636 <emphasis>every</emphasis> one of those repositories. This is,

637 of course, not a satisfactory solution: if you miss even a

638 single repository while you're expunging, the change is still

639 <quote>in the wild</quote>, and could propagate further.</para>

640

641 <para id="x_121">If you've committed one or more changes

642 <emphasis>after</emphasis> the change that you'd like to see

643 disappear, your options are further reduced. Mercurial doesn't

644 provide a way to <quote>punch a hole</quote> in history, leaving the

645 surrounding changesets intact.</para>

646

647 <para id="x_122">XXX This needs filling out. The

648 <literal>hg-replay</literal> script in the

649 <literal>examples</literal> directory works, but doesn't handle

650 merge changesets. Kind of an important omission.</para>

651

652 <sect2>

653 <title>Protect yourself from <quote>escaped</quote>

654 changes</title>

655

656 <para id="x_123">If you've committed some changes to your local repository

657 and they've been pushed or pulled somewhere else, this isn't

658 necessarily a disaster. You can protect yourself ahead of

659 time against some classes of bad changeset. This is

660 particularly easy if your team usually pulls changes from a

661 central repository.</para>

662

663 <para id="x_124">By configuring some hooks on that repository to validate

664 incoming changesets (see chapter <xref linkend="chap:hook"/>),

665 you can

666 automatically prevent some kinds of bad changeset from being

667 pushed to the central repository at all. With such a

668 configuration in place, some kinds of bad changeset will

669 naturally tend to <quote>die out</quote> because they can't

670 propagate into the central repository. Better yet, this

671 happens without any need for explicit intervention.</para>

672

673 <para id="x_125">For instance, an incoming change hook that verifies that a

674 changeset will actually compile can prevent people from

675 inadvertently <quote>breaking the build</quote>.</para>
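<para>As a sketch of what such a configuration might look like, the helper below appends a <literal>pretxnchangegroup</literal> hook to a repository's <filename>hgrc</filename>. The function name <literal>install_build_hook</literal>, the hook suffix <literal>build</literal>, and the script path are all placeholders. The script that actually checks whether the incoming changesets compile is assumed to exist and is not shown.</para>

```shell
# Hypothetical helper: register a build-checking hook in a
# repository's hgrc. The suffix "build" and the script path are
# placeholders; the checking script itself is not shown.
install_build_hook() {
    repo=$1
    script=$2
    cat >> "$repo/.hg/hgrc" <<EOF
[hooks]
# Runs after incoming changesets have been added but before the
# transaction is committed. A non-zero exit status from the script
# rejects the whole group.
pretxnchangegroup.build = $script
EOF
}
```

<para>Running, for example, <literal>install_build_hook /srv/hg/central /usr/local/bin/check-build</literal> (both paths hypothetical) on the central repository would make every subsequent push pass through the check.</para>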

676

677 </sect2>

678 </sect1>

679 <sect1 id="sec:undo:bisect">

680 <title>Finding the source of a bug</title>

681

682 <para id="x_126">While it's all very well to be able to back out a changeset

683 that introduced a bug, this requires that you know which

684 changeset to back out. Mercurial provides an invaluable

685 command, called <command role="hg-cmd">hg bisect</command>, that

686 helps you to automate this process and accomplish it very

687 efficiently.</para>

688

689 <para id="x_127">The idea behind the <command role="hg-cmd">hg

690 bisect</command> command is that a changeset has introduced

691 some change of behaviour that you can identify with a simple

692 binary test. You don't know which piece of code introduced the

693 change, but you know how to test for the presence of the bug.

694 The <command role="hg-cmd">hg bisect</command> command uses your

695 test to direct its search for the changeset that introduced the

696 code that caused the bug.</para>
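<para>To make the idea concrete, here is a minimal sketch of an automated search. The function name <literal>find_first_bad</literal> is hypothetical. It assumes a Mercurial version whose <command role="hg-cmd">hg bisect</command> supports the <literal>--command</literal> option, that the current parent of the working directory exhibits the bug, and that the test command exits with status 0 when the bug is absent and a non-zero status when it is present.</para>

```shell
# Hypothetical helper: search between a known-good revision and the
# current working directory parent, which is assumed to be bad.
# test_cmd must exit 0 when the bug is absent, non-zero when present.
find_first_bad() {
    good_rev=$1
    test_cmd=$2
    hg bisect --reset || return 1             # discard any earlier search state
    hg bisect --bad || return 1               # the current revision shows the bug
    hg bisect --good "$good_rev" || return 1  # this older revision does not
    hg bisect --command "$test_cmd"           # run the test at each step of the search
}
```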

697

698 <para id="x_128">Here are a few scenarios to help you understand how you

699 might apply this command.</para>

700 <itemizedlist>

701 <listitem><para id="x_129">The most recent version of your software has a

702 bug that you remember wasn't present a few weeks ago, but

703 you don't know when it was introduced. Here, your binary

704 test checks for the presence of that bug.</para>

705 </listitem>

706 <listitem><para id="x_12a">You fixed a bug in a rush, and now it's time to

707 close the entry in your team's bug database. The bug

708 database requires a changeset ID when you close an entry,

709 but you don't remember which changeset you fixed the bug in.

710 Once again, your binary test checks for the presence of the

711 bug.</para>

712 </listitem>

713 <listitem><para id="x_12b">Your software works correctly, but runs 15%

714 slower than the last time you measured it. You want to know

715 which changeset introduced the performance regression. In

716 this case, your binary test measures the performance of your

717 software, to see whether it's <quote>fast</quote> or

718 <quote>slow</quote>.</para>

719 </listitem>

720 <listitem><para id="x_12c">The sizes of the components of your project that

721 you ship exploded recently, and you suspect that something

722 changed in the way you build your project.</para>

723 </listitem></itemizedlist>

724

725 <para id="x_12d">From these examples, it should be clear that the <command

726 role="hg-cmd">hg bisect</command> command is useful for more

727 than finding the sources of bugs. You can use it to find any

728 <quote>emergent property</quote> of a repository (anything that

729 you can't find from a simple text search of the files in the

730 tree) for which you can write a binary test.</para>

731

732 <para id="x_12e">We'll introduce a little bit of terminology here, just to

733 make it clear which parts of the search process are your

734 responsibility, and which are Mercurial's. A

735 <emphasis>test</emphasis> is something that

736 <emphasis>you</emphasis> run when <command role="hg-cmd">hg

737 bisect</command> chooses a changeset. A

738 <emphasis>probe</emphasis> is what <command role="hg-cmd">hg

739 bisect</command> runs to tell whether a revision is good.

740 Finally, we'll use the word <quote>bisect</quote>, as both a

741 noun and a verb, to stand in for the phrase <quote>search using

742 the <command role="hg-cmd">hg bisect</command>

743 command</quote>.</para>

744

745 <para id="x_12f">One simple way to automate the searching process would be

746 simply to probe every changeset. However, this scales poorly.

747 If it took ten minutes to test a single changeset, and you had

748 10,000 changesets in your repository, the exhaustive approach

749 would take on average 35 <emphasis>days</emphasis> to find the

750 changeset that introduced a bug. Even if you knew that the bug

751 was introduced by one of the last 500 changesets, and limited

752 your search to those, you'd still be looking at over 40 hours to

753 find the changeset that introduced your bug.</para>

754

755 <para id="x_130">What the <command role="hg-cmd">hg bisect</command> command

756 does is use its knowledge of the <quote>shape</quote> of your

757 project's revision history to perform a search in time

758 proportional to the <emphasis>logarithm</emphasis> of the number

759 of changesets to check (the kind of search it performs is called

760 a dichotomic search). With this approach, searching through

761 10,000 changesets will take less than three hours, even at ten

762 minutes per test (the search will require about 14 tests).

763 Limit your search to the last hundred changesets, and it will

764 take only about an hour (roughly seven tests).</para>
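<para>The arithmetic behind these estimates can be sketched in a few
lines of Python. This is only a back-of-the-envelope model: it assumes
a linear history and a fixed ten minutes per test, and the function
names are mine, not part of Mercurial.</para>

<programlisting>
```python
import math

MINUTES_PER_TEST = 10  # the figure used in the text


def exhaustive_minutes(changesets):
    """Average cost of probing one by one: half the changesets."""
    return changesets / 2 * MINUTES_PER_TEST


def bisect_tests(changesets):
    """Halving steps needed to isolate one changeset."""
    return math.ceil(math.log2(changesets))


def bisect_minutes(changesets):
    """Total time for a bisection at a fixed cost per test."""
    return bisect_tests(changesets) * MINUTES_PER_TEST


for count in (10000, 500, 100):
    print(count, "changesets:",
          round(exhaustive_minutes(count) / 60, 1), "hours exhaustively,",
          bisect_tests(count), "tests and",
          bisect_minutes(count), "minutes by bisection")
```
</programlisting>

<para>For 10,000 changesets this works out to roughly 833 hours (about
35 days) for the exhaustive search, against 14 tests, or 140 minutes,
for bisection.</para>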

765

766 <para id="x_131">The <command role="hg-cmd">hg bisect</command> command is

767 aware of the <quote>branchy</quote> nature of a Mercurial

768 project's revision history, so it has no problems dealing with

769 branches, merges, or multiple heads in a repository. It can

770 prune entire branches of history with a single probe, which is

771 how it operates so efficiently.</para>

772

773 <sect2>

774 <title>Using the <command role="hg-cmd">hg bisect</command>

775 command</title>

776

777 <para id="x_132">Here's an example of <command role="hg-cmd">hg

778 bisect</command> in action.</para>

779

780 <note>

781 <para id="x_133"> In versions 0.9.5 and earlier of Mercurial, <command

782 role="hg-cmd">hg bisect</command> was not a core command:

783 it was distributed with Mercurial as an extension. This

784 section describes the built-in command, not the old

785 extension.</para>

786 </note>

787

788 <para id="x_134">Now let's create a repository, so that we can try out the

789 <command role="hg-cmd">hg bisect</command> command in

790 isolation.</para>

791

792 &interaction.bisect.init;

793

794 <para id="x_135">We'll simulate a project that has a bug in it in a

795 simple-minded way: create trivial changes in a loop, and

796 nominate one specific change that will have the

797 <quote>bug</quote>. This loop creates 35 changesets, each

798 adding a single file to the repository. We'll represent our

799 <quote>bug</quote> with a file that contains the text <quote>i

800 have a gub</quote>.</para>

801

802 &interaction.bisect.commits;

803

804 <para id="x_136">The next thing that we'd like to do is figure out how to

805 use the <command role="hg-cmd">hg bisect</command> command.

806 We can use Mercurial's normal built-in help mechanism for

807 this.</para>

808

809 &interaction.bisect.help;

810

811 <para id="x_137">The <command role="hg-cmd">hg bisect</command> command

812 works in steps. Each step proceeds as follows.</para>

813 <orderedlist>

814 <listitem><para id="x_138">You run your binary test.</para>

815 <itemizedlist>

816 <listitem><para id="x_139">If the test succeeded, you tell <command

817 role="hg-cmd">hg bisect</command> by running the

818 <command role="hg-cmd">hg bisect --good</command>

819 command.</para>

820 </listitem>

821 <listitem><para id="x_13a">If it failed, run the <command

822 role="hg-cmd">hg bisect --bad</command>

823 command.</para></listitem></itemizedlist>

824 </listitem>

825 <listitem><para id="x_13b">The command uses your information to decide

826 which changeset to test next.</para>

827 </listitem>

828 <listitem><para id="x_13c">It updates the working directory to that

829 changeset, and the process begins again.</para>

830 </listitem></orderedlist>

831 <para id="x_13d">The process ends when <command role="hg-cmd">hg

832 bisect</command> identifies a unique changeset that marks

833 the point where your test transitioned from

834 <quote>succeeding</quote> to <quote>failing</quote>.</para>
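<para>To make the steps above concrete, here is a toy model of the
loop in Python. It is a sketch under loud assumptions: history is
treated as a straight line of revision numbers, whereas Mercurial
works on the real revision graph and may pick different changesets to
test. The function <literal>bisect_linear</literal> and the value of
<literal>FIRST_BAD</literal> are hypothetical, chosen only for
illustration.</para>

<programlisting>
```python
def bisect_linear(good, bad, is_bad):
    """Narrow (good, bad) on a linear history until they are adjacent.

    Returns the first bad revision and the list of revisions tested.
    """
    tested = []
    while bad - good > 1:
        candidate = (good + bad) // 2  # decide which changeset is next
        tested.append(candidate)       # "update" to it
        if is_bad(candidate):          # run the binary test
            bad = candidate            # plays the role of hg bisect --bad
        else:
            good = candidate           # plays the role of hg bisect --good
    return bad, tested


FIRST_BAD = 22  # hypothetical: the revision that introduced the bug

culprit, tested = bisect_linear(10, 34, lambda rev: rev >= FIRST_BAD)
print("first bad revision:", culprit)
print("revisions tested:", tested)
```
</programlisting>

<para>With a known good revision 10 and a known bad revision 34, this
toy search settles on its answer after five tests.</para>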

835

836 <para id="x_13e">To start the search, we must run the <command

837 role="hg-cmd">hg bisect --reset</command> command.</para>

838

839 &interaction.bisect.search.init;

840

841 <para id="x_13f">In our case, the binary test we use is simple: we check to

842 see if any file in the repository contains the string <quote>i

843 have a gub</quote>. If it does, this changeset contains the

844 change that <quote>caused the bug</quote>. By convention, a

845 changeset that has the property we're searching for is

846 <quote>bad</quote>, while one that doesn't is

847 <quote>good</quote>.</para>

848

849 <para id="x_140">Most of the time, the revision to which the working

850 directory is synced (usually the tip) already exhibits the

851 problem introduced by the buggy change, so we'll mark it as

852 <quote>bad</quote>.</para>

853

854 &interaction.bisect.search.bad-init;

855

856 <para id="x_141">Our next task is to nominate a changeset that we know

857 <emphasis>doesn't</emphasis> have the bug; the <command

858 role="hg-cmd">hg bisect</command> command will

859 <quote>bracket</quote> its search between the first pair of

860 good and bad changesets. In our case, we know that revision

861 10 didn't have the bug. (I'll say more about choosing

862 the first <quote>good</quote> changeset later.)</para>

863

864 &interaction.bisect.search.good-init;

865

866 <para id="x_142">Notice that this command printed some output.</para>

867 <itemizedlist>

868 <listitem><para id="x_143">It told us how many changesets it must

869 consider before it can identify the one that introduced

870 the bug, and how many tests that will require.</para>

871 </listitem>

872 <listitem><para id="x_144">It updated the working directory to the next

873 changeset to test, and told us which changeset it's

874 testing.</para>

875 </listitem></itemizedlist>

876

877 <para id="x_145">We now run our test in the working directory. We use the

878 <command>grep</command> command to see if our

879 <quote>bad</quote> file is present in the working directory.

880 If it is, this revision is bad; if not, this revision is good.

881 &interaction.bisect.search.step1;</para>

882

883 <para id="x_146">This test looks like a perfect candidate for automation,

884 so let's turn it into a shell function.</para>

885 &interaction.bisect.search.mytest;

886

887 <para id="x_147">We can now run an entire test step with a single command,

888 <literal>mytest</literal>.</para>

889

890 &interaction.bisect.search.step2;

891

892 <para id="x_148">A few more invocations of our canned test step command,

893 and we're done.</para>

894

895 &interaction.bisect.search.rest;

896

897 <para id="x_149">Even though we had 35 changesets to search through, the

898 <command role="hg-cmd">hg bisect</command> command let us find

899 the changeset that introduced our <quote>bug</quote> with only

900 five tests. Because the number of tests that the <command

901 role="hg-cmd">hg bisect</command> command performs grows

902 logarithmically with the number of changesets to search, the

903 advantage that it has over the <quote>brute force</quote>

904 search approach increases with every changeset you add.</para>

905

906 </sect2>

907 <sect2>

908 <title>Cleaning up after your search</title>

909

910 <para id="x_14a">When you're finished using the <command role="hg-cmd">hg

911 bisect</command> command in a repository, you can use the

912 <command role="hg-cmd">hg bisect --reset</command> command to

913 drop the information it was using to drive your search. The

914 saved state doesn't use much space, so it does little harm if you

915 forget to run this command. However, <command

916 role="hg-cmd">hg bisect</command> won't let you start a new

917 search in that repository until you do a <command

918 role="hg-cmd">hg bisect --reset</command>.</para>

919

920 &interaction.bisect.search.reset;

921

922 </sect2>

923 </sect1>

924 <sect1>

925 <title>Tips for finding bugs effectively</title>

926

927 <sect2>

928 <title>Give consistent input</title>

929

930 <para id="x_14b">The <command role="hg-cmd">hg bisect</command> command

931 requires that you correctly report the result of every test

932 you perform. If you tell it that a test failed when it really

933 succeeded, it <emphasis>might</emphasis> be able to detect the

934 inconsistency. If it can identify an inconsistency in your

935 reports, it will tell you that a particular changeset is both

936 good and bad. However, it can't do this perfectly; it's about

937 as likely to report the wrong changeset as the source of the

938 bug.</para>

939

940 </sect2>

941 <sect2>

942 <title>Automate as much as possible</title>

943

944 <para id="x_14c">When I started using the <command role="hg-cmd">hg

945 bisect</command> command, I tried a few times to run my

946 tests by hand, on the command line. This is an approach that

947 I, at least, am not suited to. After a few tries, I found

948 that I was making enough mistakes that I was having to restart

949 my searches several times before finally getting correct

950 results.</para>

951

952 <para id="x_14d">My initial problems with driving the <command

953 role="hg-cmd">hg bisect</command> command by hand occurred

954 even with simple searches on small repositories; if the

955 problem you're looking for is more subtle, or the number of

956 tests that <command role="hg-cmd">hg bisect</command> must

957 perform increases, the likelihood of operator error ruining

958 the search is much higher. Once I started automating my

959 tests, I had much better results.</para>

960

961 <para id="x_14e">The key to automated testing is twofold:</para>

962 <itemizedlist>

963 <listitem><para id="x_14f">always test for the same symptom, and</para>

964 </listitem>

965 <listitem><para id="x_150">always feed consistent input to the <command

966 role="hg-cmd">hg bisect</command> command.</para>

967 </listitem></itemizedlist>

968 <para id="x_151">In my tutorial example above, the <command>grep</command>

969 command tests for the symptom, and the <literal>if</literal>

970 statement takes the result of this check and ensures that we

971 always feed the same input to the <command role="hg-cmd">hg

972 bisect</command> command. The <literal>mytest</literal>

973 function marries these together in a reproducible way, so that

974 every test is uniform and consistent.</para>
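<para>The same idea can be written as a small Python helper instead of
a shell function. What follows is a minimal sketch, not the tutorial's
actual <literal>mytest</literal>: the names
<literal>tree_has_marker</literal> and <literal>exit_status</literal>
are my own. It reports a status of 0 for a good tree and 1 for a bad
one. As far as I know, that is the convention that
<command role="hg-cmd">hg bisect --command</command> expects from a
test program, but check <command role="hg-cmd">hg help bisect</command>
for your version before relying on it.</para>

<programlisting>
```python
import os

MARKER = "i have a gub"  # the symptom used in the tutorial example


def tree_has_marker(root, marker=MARKER):
    """Return True if any file under root contains the marker text."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Mercurial's own metadata directory.
        dirnames[:] = [name for name in dirnames if name != ".hg"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as handle:
                if marker in handle.read():
                    return True
    return False


def exit_status(root="."):
    """Map the check onto a process exit status: 0 is good, 1 is bad."""
    return 1 if tree_has_marker(root) else 0
```
</programlisting>

<para>A wrapper script would pass the result of
<literal>exit_status</literal> to <literal>sys.exit</literal>. Keep
such a script outside the repository, or it will find its own copy of
the marker string.</para>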

975

976 </sect2>

977 <sect2>

978 <title>Check your results</title>

979

980 <para id="x_152">Because the output of a <command role="hg-cmd">hg

981 bisect</command> search is only as good as the input you

982 give it, don't take the changeset it reports as the absolute

983 truth. A simple way to cross-check its report is to manually

984 run your test at each of the following changesets:</para>

985 <itemizedlist>

986 <listitem><para id="x_153">The changeset that it reports as the first bad

987 revision. Your test should still report this as

988 bad.</para>

989 </listitem>

990 <listitem><para id="x_154">The parent of that changeset (either parent,

991 if it's a merge). Your test should report this changeset

992 as good.</para>

993 </listitem>

994 <listitem><para id="x_155">A child of that changeset. Your test should

995 report this changeset as bad.</para>

996 </listitem></itemizedlist>
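<para>If your test is already automated, the cross-check is easy to
script too. Here is a rough Python sketch. The function
<literal>cross_check</literal> and its arguments are hypothetical: you
supply the revisions yourself, along with a callable that runs your
test at a given revision. This version checks every parent you hand
it; pass just one to follow the list above to the letter.</para>

<programlisting>
```python
def cross_check(first_bad, parents, child, is_bad):
    """Return a list of complaints; an empty list means the report holds."""
    problems = []
    if not is_bad(first_bad):
        problems.append("reported changeset %d tests good" % first_bad)
    for parent in parents:
        if is_bad(parent):
            problems.append("parent %d tests bad" % parent)
    if child is not None and not is_bad(child):
        problems.append("child %d tests good" % child)
    return problems


# Toy history in which revision 22 really did introduce the bug.
print(cross_check(22, [21], 23, lambda rev: rev >= 22))
```
</programlisting>

<para>Any complaint in the returned list is a sign that one of your
earlier reports to <command role="hg-cmd">hg bisect</command> was
wrong, and that the search is worth repeating.</para>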

997

998 </sect2>

999 <sect2>

1000 <title>Beware interference between bugs</title>

1001

1002 <para id="x_156">It's possible that your search for one bug could be

1003 disrupted by the presence of another. For example, let's say

1004 your software crashes at revision 100, and worked correctly at

1005 revision 50. Unknown to you, someone else introduced a

1006 different crashing bug at revision 60, and fixed it at

1007 revision 80. This could distort your results in one of

1008 several ways.</para>

1009

1010 <para id="x_157">It is possible that this other bug completely

1011 <quote>masks</quote> yours, which is to say that it occurs

1012 before your bug has a chance to manifest itself. If you can't

1013 avoid that other bug (for example, it prevents your project

1014 from building), and so can't tell whether your bug is present

1015 in a particular changeset, the <command role="hg-cmd">hg

1016 bisect</command> command cannot help you directly. Instead,

1017 you can mark a changeset as untested by running <command

1018 role="hg-cmd">hg bisect --skip</command>.</para>

1019

1020 <para id="x_158">A different problem could arise if your test for a bug's

1021 presence is not specific enough. If you check for <quote>my

1022 program crashes</quote>, then both your crashing bug and an

1023 unrelated crashing bug that masks it will look like the same

1024 thing, and mislead <command role="hg-cmd">hg

1025 bisect</command>.</para>

1026

1027 <para id="x_159">Another situation that calls for <command

1028 role="hg-cmd">hg bisect --skip</command> is when you can't

1029 test a revision because your project was in a broken and hence

1030 untestable state at that revision, perhaps because someone

1031 checked in a change that prevented the project from

1032 building.</para>
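<para>One way to guard against both problems is to make the test
return three answers instead of two. The Python sketch below is only
an illustration: the <literal>SYMPTOM</literal> text and the
<literal>verdict</literal> function are invented for the example. The
status 125 is the value that, to my knowledge,
<command role="hg-cmd">hg bisect --command</command> treats as
<quote>skip this revision</quote>; confirm that with
<command role="hg-cmd">hg help bisect</command> for your version.</para>

<programlisting>
```python
GOOD, BAD, SKIP = 0, 1, 125  # exit statuses, as assumed above

SYMPTOM = "index out of range"  # hypothetical text unique to our bug


def verdict(build_ok, crashed, output):
    """Classify one revision as good, bad, or untestable."""
    if not build_ok:
        return SKIP  # broken build: we cannot test this revision
    if crashed and SYMPTOM in output:
        return BAD   # the specific bug we are hunting
    if crashed:
        return SKIP  # some other crash masks ours, so we cannot tell
    return GOOD


print(verdict(True, True, "fatal: index out of range"))
print(verdict(True, True, "segmentation fault"))
```
</programlisting>

<para>Testing for the specific symptom, rather than for any crash at
all, is what keeps an unrelated bug from being reported as
yours.</para>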

1033

1034 </sect2>

1035 <sect2>

1036 <title>Bracket your search lazily</title>

1037

1038 <para id="x_15a">Choosing the first <quote>good</quote> and

1039 <quote>bad</quote> changesets that will mark the end points of

1040 your search is often easy, but it bears a little discussion

1041 nevertheless. From the perspective of <command

1042 role="hg-cmd">hg bisect</command>, the <quote>newest</quote>

1043 changeset is conventionally <quote>bad</quote>, and the older

1044 changeset is <quote>good</quote>.</para>

1045

1046 <para id="x_15b">If you're having trouble remembering when a suitable

1047 <quote>good</quote> change was, so that you can tell <command

1048 role="hg-cmd">hg bisect</command>, you could do worse than

1049 testing changesets at random. Just remember to eliminate

1050 contenders that can't possibly exhibit the bug (perhaps

1051 because the feature with the bug isn't present yet) and those

1052 where another problem masks the bug (as I discussed

1053 above).</para>

1054

1055 <para id="x_15c">Even if you end up <quote>early</quote> by thousands of

1056 changesets or months of history, you will only add a handful

1057 of tests to the total number that <command role="hg-cmd">hg

1058 bisect</command> must perform, thanks to its logarithmic

1059 behaviour.</para>

1060

1061 </sect2>

1062 </sect1>

1063 </chapter>

1064

1065 <!--

1066 local variables:

1067 sgml-parent-document: ("00book.xml" "book" "chapter")

1068 end:

1069 -->