hgbook: 84944c0ecde6 es/undo.tex

Merged changes from jerojasro concerning tour-basic

author	Igor TAmara <igor@tamarapatino.org>
date	Mon Oct 27 13:44:21 2008 -0500 (2008-10-27)

1 \chapter{Finding and fixing your mistakes}

2 \label{chap:undo}

4 To err is human, but handling the consequences well takes a

5 first-rate revision control system. In this chapter, we'll discuss

6 some techniques you can use when you find that a problem has taken

7 root in your project. Mercurial has some powerful features that

8 will help you to isolate the sources of problems, and to deal with

9 them appropriately.

11 \section{Erasing local history}

13 \subsection{The accidental commit}

15 I have the occasional but persistent problem of typing faster than

16 I can think, which sometimes results in my committing a changeset that

17 is incomplete or simply wrong. In my case, the usual incomplete

18 changeset is one in which I created a new source file, but forgot to

19 \hgcmd{add} it. A ``simply wrong'' changeset is not as common, but it

20 is certainly very annoying.

22 \subsection{Rolling back a transaction}

23 \label{sec:undo:rollback}

25 In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats

26 each modification of a repository as a \emph{transaction}. Every time

27 you commit a changeset or pull changes from another repository,

28 Mercurial remembers what you did. You can undo, or \emph{roll back},

29 exactly one of these actions using the \hgcmd{rollback} command.

30 (See section~\ref{sec:undo:rollback-after-push} for an important

31 caveat about the use of this command.)

33 Here is a mistake that I make frequently: committing a change

34 in which I have created a new file, but have forgotten to

35 \hgcmd{add} it.

36 \interaction{rollback.commit}

37 The output of \hgcmd{status} after the commit immediately confirms

38 the error.

39 \interaction{rollback.status}

40 The commit captured the changes to the file \filename{a}, but not

41 the new file \filename{b}. If I pushed this changeset to a repository

42 shared with a colleague, chances are high that something in

43 \filename{a} would refer to \filename{b}, which would not be present

44 when they pulled my changes from the repository. I would become the

45 object of some indignation.

47 However, luck is with me: I caught my error before I pushed the

48 changeset. I use the \hgcmd{rollback} command, and Mercurial makes

49 the last changeset vanish.

50 \interaction{rollback.rollback}

51 The changeset is no longer present in the repository's history, and

52 the working directory once again thinks that the file \filename{a}

53 is modified. The commit and rollback have left the working directory

54 exactly as it was before the commit; the changeset has been

55 completely erased. I can now \hgcmd{add} the file \filename{b}, and

56 redo the commit.

57 \interaction{rollback.add}

59 \subsection{The erroneous pull}

61 Maintaining separate development branches of a project in different

62 repositories is common practice with Mercurial. Your development

63 team might have one shared repository for the ``0.9'' release, and

64 another, with different changes, for the ``1.0'' release.

66 Given this, you can imagine the consequences if you had a local

67 ``0.9'' repository, and accidentally pulled changes from the

68 shared ``1.0'' repository into it. At worst, through lack of

69 attention, you could push those changes into the shared ``0.9''

70 tree, confusing your entire team (but don't worry, we'll return

71 to this horror scenario later). In any case, it's very likely

72 that you will notice right away, because Mercurial displays

73 the URL it is pulling from, or because you will see it

74 pull a suspiciously large number of changes into the

75 repository.

77 The \hgcmd{rollback} command will work nicely to expunge all of the

78 changesets that you just pulled. Mercurial groups all changes from

79 one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is

80 all you need to undo this mistake.

82 \subsection{Rolling back is useless once you've pushed}

83 \label{sec:undo:rollback-after-push}

85 The value of the \hgcmd{rollback} command drops to zero once you've

86 pushed your changes to another repository. Rolling back a change

87 makes it disappear entirely, but \emph{only} in the repository in

88 which you perform the \hgcmd{rollback}. Because a rollback eliminates

89 history, there's no way for the disappearance of a change to propagate

90 between repositories.

92 If you've pushed a change to another repository---particularly if it's

93 a shared repository---it has essentially ``escaped into the wild,''

94 and you'll have to recover from your mistake in a different way. What

95 will happen if you push a changeset somewhere, then roll it back, then

96 pull from the repository you pushed to, is that the changeset will

97 reappear in your repository.

99 (If you absolutely know for sure that the change you want to roll back

100 is the most recent change in the repository that you pushed to,

101 \emph{and} you know that nobody else could have pulled it from that

102 repository, you can roll back the changeset there, too, but you

103 really should not rely on this working reliably. If you do this,

104 sooner or later a change really will make it into a repository that

105 you don't directly control (or have forgotten about), and come back to

106 bite you.)

107

108 \subsection{You can only roll back once}

109

110 Mercurial stores exactly one transaction in its transaction log; that

111 transaction is the most recent one that occurred in the repository.

112 This means that you can only roll back one transaction. If you expect

113 to be able to roll back one transaction, then its predecessor, this is

114 not the behaviour you will get.

115 \interaction{rollback.twice}

116 Once you've rolled back one transaction in a repository, you can't

117 roll back again in that repository until you perform another commit or

118 pull.

119

120 \section{Reverting the mistaken change}

121

122 If you make a modification to a file, and decide that you really

123 didn't want to change the file at all, and you haven't yet committed

124 your changes, the \hgcmd{revert} command is the one you'll need. It

125 looks at the changeset that's the parent of the working directory, and

126 restores the contents of the file to their state as of that changeset.

127 (That's a long-winded way of saying that, in the normal case, it

128 undoes your modifications.)

129

130 Let's illustrate how the \hgcmd{revert} command works with yet another

131 small example. We'll begin by modifying a file that Mercurial is

132 already tracking.

133 \interaction{daily.revert.modify}

134 If we don't want that change, we can simply \hgcmd{revert} the file.

135 \interaction{daily.revert.unmodify}

136 The \hgcmd{revert} command provides us with an extra degree of safety

137 by saving our modified file with a \filename{.orig} extension.

138 \interaction{daily.revert.status}

139

140 Here is a summary of the cases that the \hgcmd{revert} command can

141 deal with. We will describe each of these in more detail in the

142 section that follows.

143 \begin{itemize}

144 \item If you modify a file, it will restore the file to its unmodified

145 state.

146 \item If you \hgcmd{add} a file, it will undo the ``added'' state of

147 the file, but leave the file itself untouched.

148 \item If you delete a file without telling Mercurial, it will restore

149 the file to its unmodified contents.

150 \item If you use the \hgcmd{remove} command to remove a file, it will

151 undo the ``removed'' state of the file, and restore the file to its

152 unmodified contents.

153 \end{itemize}

154

155 \subsection{File management errors}

156 \label{sec:undo:mgmt}

157

158 The \hgcmd{revert} command is useful for more than just modified

159 files. It lets you reverse the results of all of Mercurial's file

160 management commands---\hgcmd{add}, \hgcmd{remove}, and so on.

161

162 If you \hgcmd{add} a file, then decide that in fact you don't want

163 Mercurial to track it, use \hgcmd{revert} to undo the add. Don't

164 worry; Mercurial will not modify the file in any way. It will just

165 ``unmark'' the file.

166 \interaction{daily.revert.add}

167

168 Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use

169 \hgcmd{revert} to restore it to the contents it had as of the parent

170 of the working directory.

171 \interaction{daily.revert.remove}

172 This works just as well for a file that you deleted by hand, without

173 telling Mercurial (recall that in Mercurial terminology, this kind of

174 file is called ``missing'').

175 \interaction{daily.revert.missing}

176

177 If you revert a \hgcmd{copy}, the copied-to file remains in your

178 working directory afterwards, untracked. Since a copy doesn't affect

179 the copied-from file in any way, Mercurial doesn't do anything with

180 the copied-from file.

181 \interaction{daily.revert.copy}

182

183 \subsubsection{A slightly special case: reverting a rename}

184

185 If you \hgcmd{rename} a file, there is one small detail that

186 you should remember. When you \hgcmd{revert} a rename, it's not

187 enough to provide the name of the renamed-to file, as you can see

188 here.

189 \interaction{daily.revert.rename}

190 As you can see from the output of \hgcmd{status}, the renamed-to file

191 is no longer identified as added, but the renamed-\emph{from} file is

192 still removed! This is counter-intuitive (at least to me), but at

193 least it's easy to deal with.

194 \interaction{daily.revert.rename-orig}

195 So remember, to revert a \hgcmd{rename}, you must provide \emph{both}

196 the source and destination names.

197

198 % TODO: the output doesn't look like it will be removed!

199

200 (By the way, if you rename a file, then modify the renamed-to file,

201 then revert both components of the rename, when Mercurial restores the

202 file that was removed as part of the rename, it will be unmodified.

203 If you need the modifications in the renamed-to file to show up in the

204 renamed-from file, don't forget to copy them over.)

205

206 These fiddly aspects of reverting a rename arguably constitute a small

207 bug in Mercurial.

208

209 \section{Dealing with committed changes}

210

211 Consider a case where you have committed a change $a$, and another

212 change $b$ on top of it; you then realise that change $a$ was

213 incorrect. Mercurial lets you ``back out'' an entire changeset

214 automatically, and provides building blocks that let you reverse part of a

215 changeset by hand.

216

217 Before you read this section, here's something to keep in mind: the

218 \hgcmd{backout} command undoes changes by \emph{adding} history, not

219 by modifying or erasing it. It's the right tool to use if you're

220 fixing bugs, but not if you're trying to undo some change that has

221 catastrophic consequences. To deal with those, see

222 section~\ref{sec:undo:aaaiiieee}.

223

224 \subsection{Backing out a changeset}

225

226 The \hgcmd{backout} command lets you ``undo'' the effects of an entire

227 changeset in an automated fashion. Because Mercurial's history is

228 immutable, this command \emph{does not} get rid of the changeset you

229 want to undo. Instead, it creates a new changeset that

230 \emph{reverses} the effect of the to-be-undone changeset.

231

232 The operation of the \hgcmd{backout} command is a little intricate, so

233 let's illustrate it with some examples. First, we'll create a

234 repository with some simple changes.

235 \interaction{backout.init}

236

237 The \hgcmd{backout} command takes a single changeset ID as its

238 argument; this is the changeset to back out. Normally,

239 \hgcmd{backout} will drop you into a text editor to write a commit

240 message, so you can record why you're backing the change out. In this

241 example, we provide a commit message on the command line using the

242 \hgopt{backout}{-m} option.

243

244 \subsection{Backing out the tip changeset}

245

246 We're going to start by backing out the last changeset we committed.

247 \interaction{backout.simple}

248 You can see that the second line from \filename{myfile} is no longer

249 present. Taking a look at the output of \hgcmd{log} gives us an idea

250 of what the \hgcmd{backout} command has done.

251 \interaction{backout.simple.log}

252 Notice that the new changeset that \hgcmd{backout} has created is a

253 child of the changeset we backed out. It's easier to see this in

254 figure~\ref{fig:undo:backout}, which presents a graphical view of the

255 change history. As you can see, the history is nice and linear.

256

257 \begin{figure}[htb]

258 \centering

259 \grafix{undo-simple}

260 \caption{Backing out a change using the \hgcmd{backout} command}

261 \label{fig:undo:backout}

262 \end{figure}

263

264 \subsection{Backing out a non-tip change}

265

266 If you want to back out a change other than the last one you

267 committed, pass the \hgopt{backout}{--merge} option to the

268 \hgcmd{backout} command.

269 \interaction{backout.non-tip.clone}

270 This makes backing out any changeset a ``one-shot'' operation that's

271 usually simple and fast.

272 \interaction{backout.non-tip.backout}

273

274 If you take a look at the contents of \filename{myfile} after the

275 backout finishes, you'll see that the first and third changes are

276 present, but not the second.

277 \interaction{backout.non-tip.cat}

278

279 As the graphical history in figure~\ref{fig:undo:backout-non-tip}

280 illustrates, Mercurial actually commits \emph{two} changes in this

281 kind of situation (the box-shaped nodes are the ones that Mercurial

282 commits automatically). Before Mercurial begins the backout process,

283 it first remembers what the current parent of the working directory

284 is. It then backs out the target changeset, and commits that as a

285 changeset. Finally, it merges back to the previous parent of the

286 working directory, and commits the result of the merge.

287

288 % TODO: to me it looks like mercurial doesn't commit the second merge automatically!

289

290 \begin{figure}[htb]

291 \centering

292 \grafix{undo-non-tip}

293 \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}

294 \label{fig:undo:backout-non-tip}

295 \end{figure}

296

297 The result is that you end up ``back where you were'', only with some

298 extra history that undoes the effect of the changeset you wanted to

299 back out.

300

301 \subsubsection{Always use the \hgopt{backout}{--merge} option}

302

303 In fact, since the \hgopt{backout}{--merge} option will do the ``right

304 thing'' whether or not the changeset you're backing out is the tip

305 (i.e.~it won't try to merge if it's backing out the tip, since there's

306 no need), you should \emph{always} use this option when you run the

307 \hgcmd{backout} command.

308

309 \subsection{Gaining more control of the backout process}

310

311 While I've recommended that you always use the

312 \hgopt{backout}{--merge} option when backing out a change, the

313 \hgcmd{backout} command lets you decide how to merge a backout

314 changeset. Taking control of the backout process by hand is something

315 you will rarely need to do, but it can be useful to understand what

316 the \hgcmd{backout} command is doing for you automatically. To

317 illustrate this, let's clone our first repository, but omit the

318 backout change that it contains.

319

320 \interaction{backout.manual.clone}

321 As with our earlier example, we'll commit a third changeset, then back

322 out its parent, and see what happens.

323 \interaction{backout.manual.backout}

324 Our new changeset is again a descendant of the changeset we backed

325 out; it's thus a new head, \emph{not} a descendant of the changeset

326 that was the tip. The \hgcmd{backout} command was quite explicit in

327 telling us this.

328 \interaction{backout.manual.log}

329

330 Again, it's easier to see what has happened by looking at a graph of

331 the revision history, in figure~\ref{fig:undo:backout-manual}. This

332 makes it clear that when we use \hgcmd{backout} to back out a change

333 other than the tip, Mercurial adds a new head to the repository (the

334 change it committed is box-shaped).

335

336 \begin{figure}[htb]

337 \centering

338 \grafix{undo-manual}

339 \caption{Backing out a change using the \hgcmd{backout} command}

340 \label{fig:undo:backout-manual}

341 \end{figure}

342

343 After the \hgcmd{backout} command has completed, it leaves the new

344 ``backout'' changeset as the parent of the working directory.

345 \interaction{backout.manual.parents}

346 Now we have two isolated sets of changes.

347 \interaction{backout.manual.heads}

348

349 Let's think about what we expect to see as the contents of

350 \filename{myfile} now. The first change should be present, because

351 we've never backed it out. The second change should be missing, as

352 that's the change we backed out. Since the history graph shows the

353 third change as a separate head, we \emph{don't} expect to see the

354 third change present in \filename{myfile}.

355 \interaction{backout.manual.cat}

356 To get the third change back into the file, we just do a normal merge

357 of our two heads.

358 \interaction{backout.manual.merge}

359 Afterwards, the graphical history of our repository looks like

360 figure~\ref{fig:undo:backout-manual-merge}.

361

362 \begin{figure}[htb]

363 \centering

364 \grafix{undo-manual-merge}

365 \caption{Manually merging a backout change}

366 \label{fig:undo:backout-manual-merge}

367 \end{figure}

368

369 \subsection{Why \hgcmd{backout} works as it does}

370

371 Here's a brief description of how the \hgcmd{backout} command works.

372 \begin{enumerate}

373 \item It ensures that the working directory is ``clean'', i.e.~that

374 the output of \hgcmd{status} would be empty.

375 \item It remembers the current parent of the working directory. Let's

376 call this changeset \texttt{orig}.

377 \item It does the equivalent of a \hgcmd{update} to sync the working

378 directory to the changeset you want to back out. Let's call this

379 changeset \texttt{backout}.

380 \item It finds the parent of that changeset. Let's call that

381 changeset \texttt{parent}.

382 \item For each file that the \texttt{backout} changeset affected, it

383 does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,

384 to restore it to the contents it had before that changeset was

385 committed.

386 \item It commits the result as a new changeset. This changeset has

387 \texttt{backout} as its parent.

388 \item If you specify \hgopt{backout}{--merge} on the command line, it

389 merges with \texttt{orig}, and commits the result of the merge.

390 \end{enumerate}

391

392 An alternative way to implement the \hgcmd{backout} command would be

393 to \hgcmd{export} the to-be-backed-out changeset as a diff, then use

394 the \cmdopt{patch}{--reverse} option to the \command{patch} command to

395 reverse the effect of the change without fiddling with the working

396 directory. This sounds much simpler, but it would not work nearly as

397 well.

398

399 The reason that \hgcmd{backout} does an update, a commit, a merge, and

400 another commit is to give the merge machinery the best chance to do a

401 good job when dealing with all the changes \emph{between} the change

402 you're backing out and the current tip.

403

404 If you're backing out a changeset that's~100 revisions back in your

405 project's history, the chances that the \command{patch} command will

406 be able to apply a reverse diff cleanly are not good, because

407 intervening changes are likely to have ``broken the context'' that

408 \command{patch} uses to determine whether it can apply a patch (if

409 this sounds like gibberish, see section~\ref{sec:mq:patch} for a

410 discussion of the \command{patch} command). Also, Mercurial's merge

411 machinery will handle files and directories being renamed, permission

412 changes, and modifications to binary files, none of which

413 \command{patch} can deal with.

414

415 \section{Changes that should never have been}

416 \label{sec:undo:aaaiiieee}

417

418 Most of the time, the \hgcmd{backout} command is exactly what you need

419 if you want to undo the effects of a change. It leaves a permanent

420 record of exactly what you did, both when committing the original

421 changeset and when you cleaned up after it.

422

423 On rare occasions, though, you may find that you've committed a change

424 that really should not be present in the repository at all. For

425 example, it would be very unusual, and usually considered a mistake,

426 to commit a software project's object files as well as its source

427 files. Object files have almost no intrinsic value, and they're

428 \emph{big}, so they increase the size of the repository and the amount

429 of time it takes to clone or pull changes.

430

431 Before I discuss the options that you have if you commit a ``brown

432 paper bag'' change (the kind that's so bad that you want to pull a

433 brown paper bag over your head), let me first discuss some approaches

434 that probably won't work.

435

436 Since Mercurial treats history as accumulative---every change builds

437 on top of all changes that preceded it---you generally can't just make

438 disastrous changes disappear. The one exception is when you've just

439 committed a change, and it hasn't been pushed or pulled into another

440 repository. That's when you can safely use the \hgcmd{rollback}

441 command, as I detailed in section~\ref{sec:undo:rollback}.

442

443 After you've pushed a bad change to another repository, you

444 \emph{could} still use \hgcmd{rollback} to make your local copy of the

445 change disappear, but it won't have the consequences you want. The

446 change will still be present in the remote repository, so it will

447 reappear in your local repository the next time you pull.

448

449 If a situation like this arises, and you know which repositories your

450 bad change has propagated into, you can \emph{try} to get rid of the

451 change from \emph{every} one of those repositories. This is, of

452 course, not a satisfactory solution: if you miss even a single

453 repository while you're expunging, the change is still ``in the

454 wild'', and could propagate further.

455

456 If you've committed one or more changes \emph{after} the change that

457 you'd like to see disappear, your options are further reduced.

458 Mercurial doesn't provide a way to ``punch a hole'' in history,

459 while leaving the other changesets intact.

460

461 XXX This needs filling out. The \texttt{hg-replay} script in the

462 \texttt{examples} directory works, but doesn't handle merge

463 changesets. Kind of an important omission.

464

465 \subsection{Protect yourself from ``escaped'' changes}

466

467 If you've committed some changes to your local repository and they've

468 been pushed or pulled somewhere else, this isn't necessarily a

469 disaster. You can protect yourself ahead of time against some classes

470 of bad changeset. This is particularly easy if your team usually

471 pulls changes from a central repository.

472

473 By configuring some hooks on that repository to validate incoming

474 changesets (see chapter~\ref{chap:hook}), you can automatically

475 prevent some kinds of bad changeset from being pushed to the central

476 repository at all. With such a configuration in place, some kinds of

477 bad changeset will naturally tend to ``die out'' because they can't

478 propagate into the central repository. Better yet, this happens

479 without any need for explicit intervention.

480

481 For instance, an incoming change hook that verifies that a changeset

482 will actually compile can prevent people from inadvertently ``breaking

483 the build''.

484

485 \section{Finding the source of a bug}

486 \label{sec:undo:bisect}

487

488 While it's all very well to be able to back out a changeset that

489 introduced a bug, this requires that you know which changeset to back

490 out. Mercurial provides an invaluable command, called

491 \hgcmd{bisect}, that helps you to automate this process and accomplish

492 it very efficiently.

493

494 The idea behind the \hgcmd{bisect} command is that a changeset has

495 introduced some change of behaviour that you can identify with a

496 simple binary test. You don't know which piece of code introduced the

497 change, but you know how to test for the presence of the bug. The

498 \hgcmd{bisect} command uses your test to direct its search for the

499 changeset that introduced the code that caused the bug.

500

501 Here are a few scenarios to help you understand how you might apply

502 this command.

503 \begin{itemize}

504 \item The most recent version of your software has a bug that you

505 remember wasn't present a few weeks ago, but you don't know when it

506 was introduced. Here, your binary test checks for the presence of

507 that bug.

508 \item You fixed a bug in a rush, and now it's time to close the entry

509 in your team's bug database. The bug database requires a changeset

510 ID when you close an entry, but you don't remember which changeset

511 you fixed the bug in. Once again, your binary test checks for the

512 presence of the bug.

513 \item Your software works correctly, but runs~15\% slower than the

514 last time you measured it. You want to know which changeset

515 introduced the performance regression. In this case, your binary

516 test measures the performance of your software, to see whether it's

517 ``fast'' or ``slow''.

518 \item The sizes of the components of your project that you ship

519 exploded recently, and you suspect that something changed in the way

520 you build your project.

521 \end{itemize}

522

523 From these examples, it should be clear that the \hgcmd{bisect}

524 command is useful for more than finding the sources of bugs. You can

525 use it to find any ``emergent property'' of a repository (anything

526 that you can't find from a simple text search of the files in the

527 tree) for which you can write a binary test.

528

529 We'll introduce a little bit of terminology here, just to make it

530 clear which parts of the search process are your responsibility, and

531 which are Mercurial's. A \emph{test} is something that \emph{you} run

532 when \hgcmd{bisect} chooses a changeset. A \emph{probe} is what

533 \hgcmd{bisect} runs to tell whether a revision is good. Finally,

534 we'll use the word ``bisect'', as both a noun and a verb, to stand in

535 for the phrase ``search using the \hgcmd{bisect} command''.

536

537 One simple way to automate the searching process would be simply to

538 probe every changeset. However, this scales poorly. If it took ten

539 minutes to test a single changeset, and you had 10,000 changesets in

540 your repository, the exhaustive approach would take on average~35

541 \emph{days} to find the changeset that introduced a bug. Even if you

542 knew that the bug was introduced by one of the last 500 changesets,

543 and limited your search to those, you'd still be looking at over 40

544 hours to find the changeset that introduced your bug.

545

546 What the \hgcmd{bisect} command does is use its knowledge of the

547 ``shape'' of your project's revision history to perform a search in

548 time proportional to the \emph{logarithm} of the number of changesets

549 to check (the kind of search it performs is called a dichotomic

550 search). With this approach, searching through 10,000 changesets will

551 take less than three hours, even at ten minutes per test (the search

552 will require about 14 tests). Limit your search to the last hundred

553 changesets, and it will take only about an hour (roughly seven tests).
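
The figures in the last two paragraphs can be checked with a little integer arithmetic. The sketch below assumes ten minutes per test, as in the text; the function names are made up for this example, and the exhaustive search is costed at half the changesets on average.

```shell
# Sketch: compare exhaustive search with bisection, at 10 minutes per test.
# Integer arithmetic only; the function names are made up for this example.

# Smallest k such that 2^k >= n, i.e. the number of halvings needed.
bisect_tests() {
    n=$1 k=0 span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        k=$((k + 1))
    done
    echo "$k"
}

# Average cost in minutes of testing changesets one by one (n/2 tests).
linear_minutes() { echo $(( $1 / 2 * 10 )); }

# Cost in minutes of a bisection over n changesets.
bisect_minutes() { echo $(( $(bisect_tests "$1") * 10 )); }

for n in 10000 500 100; do
    echo "$n changesets: linear $(linear_minutes "$n") min," \
         "bisect $(bisect_minutes "$n") min ($(bisect_tests "$n") tests)"
done
```

For 10,000 changesets this gives 50,000 minutes (about 35 days) against 140 minutes (14 tests); for 500 changesets the exhaustive search averages 2,500 minutes, a little over 40 hours.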

554

555 The \hgcmd{bisect} command is aware of the ``branchy'' nature of a

556 Mercurial project's revision history, so it has no problems dealing

557 with branches, merges, or multiple heads in a repository. It can

558 prune entire branches of history with a single probe, which is how it

559 operates so efficiently.

560

561 \subsection{Using the \hgcmd{bisect} command}

562

563 Here's an example of \hgcmd{bisect} in action.

564

565 \begin{note}

566 In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a

567 core command: it was distributed with Mercurial as an extension.

568 This section describes the built-in command, not the old extension.

569 \end{note}

570

571 Now let's create a repository, so that we can try out the

572 \hgcmd{bisect} command in isolation.

573 \interaction{bisect.init}

574 We'll simulate a project that has a bug in it in a simple-minded way:

575 create trivial changes in a loop, and nominate one specific change

576 that will have the ``bug''. This loop creates 35 changesets, each

577 adding a single file to the repository. We'll represent our ``bug''

578 with a file that contains the text ``i have a gub''.

579 \interaction{bisect.commits}

580

581 The next thing that we'd like to do is figure out how to use the

582 \hgcmd{bisect} command. We can use Mercurial's normal built-in help

583 mechanism for this.

584 \interaction{bisect.help}

585

586 The \hgcmd{bisect} command works in steps. Each step proceeds as follows.

587 \begin{enumerate}

588 \item You run your binary test.

589 \begin{itemize}

590 \item If the test succeeded, you tell \hgcmd{bisect} by running the

591 \hgcmdargs{bisect}{--good} command.

592 \item If it failed, run the \hgcmdargs{bisect}{--bad} command.

593 \end{itemize}

594 \item The command uses your information to decide which changeset to

595 test next.

596 \item It updates the working directory to that changeset, and the

597 process begins again.

598 \end{enumerate}

599 The process ends when \hgcmd{bisect} identifies a unique changeset

600 that marks the point where your test transitioned from ``succeeding''

601 to ``failing''.

602

603 To start the search, we must run the \hgcmdargs{bisect}{--reset} command.

604 \interaction{bisect.search.init}

605

606 In our case, the binary test we use is simple: we check to see if any

607 file in the repository contains the string ``i have a gub''. If it

608 does, this changeset contains the change that ``caused the bug''. By

609 convention, a changeset that has the property we're searching for is

610 ``bad'', while one that doesn't is ``good''.

611

612 Most of the time, the revision to which the working directory is

613 synced (usually the tip) already exhibits the problem introduced by

614 the buggy change, so we'll mark it as ``bad''.

615 \interaction{bisect.search.bad-init}

616

617 Our next task is to nominate a changeset that we know \emph{doesn't}

618 have the bug; the \hgcmd{bisect} command will ``bracket'' its search

619 between the first pair of good and bad changesets. In our case, we

620 know that revision~10 didn't have the bug. (I'll have more words

621 about choosing the first ``good'' changeset later.)

622 \interaction{bisect.search.good-init}

623

624 Notice that this command printed some output.

625 \begin{itemize}

626 \item It told us how many changesets it must consider before it can

627 identify the one that introduced the bug, and how many tests that

628 will require.

629 \item It updated the working directory to the next changeset to test,

630 and told us which changeset it's testing.

631 \end{itemize}

632

633 We now run our test in the working directory. We use the

634 \command{grep} command to see if any file in the working directory

635 contains our ``bad'' string. If one does, this revision is bad; if not, this

636 revision is good.

637 \interaction{bisect.search.step1}

638

639 This test looks like a perfect candidate for automation, so let's turn

640 it into a shell function.

641 \interaction{bisect.search.mytest}

642 We can now run an entire test step with a single command,

643 \texttt{mytest}.

644 \interaction{bisect.search.step2}

645 A few more invocations of our canned test step command, and we're

646 done.

647 \interaction{bisect.search.rest}

648

649 Even though we had~40 changesets to search through, the \hgcmd{bisect}

650 command let us find the changeset that introduced our ``bug'' with

651 only five tests. Because the number of tests that the \hgcmd{bisect}

652 command performs grows logarithmically with the number of changesets to

653 search, the advantage that it has over the ``brute force'' search

654 approach increases with every changeset you add.
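
As a back-of-the-envelope illustration of that logarithmic growth, the following sketch counts how many halvings it takes to narrow a range of candidate changesets down to one. The \texttt{tests\_needed} function is a hypothetical helper, not part of Mercurial, and the figures are estimates for a linear history rather than what \hgcmd{bisect} will print for any particular repository.

```shell
# tests_needed N: print the smallest k such that 2^k >= N, that is, the
# ceiling of log2(N). This estimates how many tests a bisection needs
# to narrow N candidate changesets down to a single one.
tests_needed() {
  n=$1 k=0 span=1
  while [ "$span" -lt "$n" ]; do
    span=$((span * 2))
    k=$((k + 1))
  done
  echo "$k"
}

# Compare against a brute-force search, which may test every candidate.
for count in 10 100 1000 10000; do
  echo "$count changesets: about $(tests_needed "$count") tests (brute force: up to $count)"
done
```

Going from 10 to 10000 candidates multiplies the brute-force cost by a thousand, but adds only ten tests to the bisection estimate (from 4 to 14).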

655

656 \subsection{Cleaning up after your search}

657

658 When you're finished using the \hgcmd{bisect} command in a

659 repository, you can use the \hgcmdargs{bisect}{--reset} command to drop

660 the information it was using to drive your search. That information

661 doesn't take up much space, so it doesn't matter if you forget to run this

662 command. However, \hgcmd{bisect} won't let you start a new search in

663 that repository until you do a \hgcmdargs{bisect}{--reset}.

664 \interaction{bisect.search.reset}

665

666 \section{Tips for finding bugs effectively}

667

668 \subsection{Give consistent input}

669

670 The \hgcmd{bisect} command requires that you correctly report the

671 result of every test you perform. If you tell it that a test failed

672 when it really succeeded, it \emph{might} be able to detect the

673 inconsistency. If it can identify an inconsistency in your reports,

674 it will tell you that a particular changeset is both good and bad.

675 However, it can't do this reliably; it's about as likely to miss the

676 inconsistency and report the wrong changeset as the source of the bug.

677

678 \subsection{Automate as much as possible}

679

680 When I started using the \hgcmd{bisect} command, I tried a few times

681 to run my tests by hand, on the command line. This is an approach

682 that I, at least, am not suited to. After a few tries, I found that I

683 was making enough mistakes that I was having to restart my searches

684 several times before finally getting correct results.

685

686 My initial problems with driving the \hgcmd{bisect} command by hand

687 occurred even with simple searches on small repositories; if the

688 problem you're looking for is more subtle, or the number of tests that

689 \hgcmd{bisect} must perform increases, the likelihood of operator

690 error ruining the search is much higher. Once I started automating my

691 tests, I had much better results.

692

693 The key to automated testing is twofold:

694 \begin{itemize}

695 \item always test for the same symptom, and

696 \item always feed consistent input to the \hgcmd{bisect} command.

697 \end{itemize}

698 In my tutorial example above, the \command{grep} command tests for the

699 symptom, and the \texttt{if} statement takes the result of this check

700 and ensures that we always feed the same input to the \hgcmd{bisect}

701 command. The \texttt{mytest} function marries these together in a

702 reproducible way, so that every test is uniform and consistent.
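
A canned test step along those lines might look like the following sketch. It is an illustration, not a copy of the function used in the tutorial transcript: the \texttt{check\_revision} helper is a name introduced here so that the symptom check can be tried out on its own, without touching Mercurial.

```shell
# check_revision DIR: print "bad" if any file directly inside DIR
# contains the marker string, "good" otherwise. Globbing with * skips
# dot-directories such as .hg, so repository metadata is not searched.
check_revision() {
  if grep -q -s 'i have a gub' "$1"/*; then
    echo bad
  else
    echo good
  fi
}

# mytest: test the working directory, then feed the verdict to
# Mercurial. Because the verdict is computed, not typed, the input to
# "hg bisect" is the same kind of answer every time.
mytest() {
  result=$(check_revision .)
  echo "this revision is $result"
  hg bisect --"$result"
}
```

Keeping the check separate from the reporting step means you can run \texttt{check\_revision} against a few revisions by hand to convince yourself it detects the right symptom before you let it drive a search.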

703

704 \subsection{Check your results}

705

706 Because the output of a \hgcmd{bisect} search is only as good as the

707 input you give it, don't take the changeset it reports as the

708 absolute truth. A simple way to cross-check its report is to manually

709 run your test at each of the following changesets:

710 \begin{itemize}

711 \item The changeset that it reports as the first bad revision. Your

712 test should still report this as bad.

713 \item The parent of that changeset (either parent, if it's a merge).

714 Your test should report this changeset as good.

715 \item A child of that changeset. Your test should report this

716 changeset as bad.

717 \end{itemize}

718

719 \subsection{Beware interference between bugs}

720

721 It's possible that your search for one bug could be disrupted by the

722 presence of another. For example, let's say your software crashes at

723 revision 100, and worked correctly at revision 50. Unknown to you,

724 someone else introduced a different crashing bug at revision 60, and

725 fixed it at revision 80. This could distort your results in one of

726 several ways.

727

728 It is possible that this other bug completely ``masks'' yours, which

729 is to say that it occurs before your bug has a chance to manifest

730 itself. If you can't avoid that other bug (for example, it prevents

731 your project from building), and so can't tell whether your bug is

732 present in a particular changeset, the \hgcmd{bisect} command cannot

733 help you directly. Instead, you can mark a changeset as untested by

734 running \hgcmdargs{bisect}{--skip}.

735

736 A different problem could arise if your test for a bug's presence is

737 not specific enough. If you check for ``my program crashes'', then

738 both your crashing bug and an unrelated crashing bug that masks it

739 will look like the same thing, and mislead \hgcmd{bisect}.

740

741 Another useful situation in which to use \hgcmdargs{bisect}{--skip} is

742 if you can't test a revision because your project was in a broken and

743 hence untestable state at that revision, perhaps because someone

744 checked in a change that prevented the project from building.
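
One way to guard against both problems is a test step with three possible answers instead of two. The sketch below is an assumption-laden example, not a recipe: \texttt{classify} is a hypothetical helper, and the build command, test command and symptom text are placeholders you would replace with your project's own.

```shell
# classify BUILD_CMD TEST_CMD SYMPTOM
#   Prints "skip" if the build fails (the revision is untestable),
#   "good" if the test passes, "bad" if the test fails and its output
#   contains SYMPTOM, and "skip" if it fails in some other way, which
#   suggests a different bug is getting in the way.
classify() {
  if ! sh -c "$1" >/dev/null 2>&1; then
    echo skip
    return
  fi
  if output=$(sh -c "$2" 2>&1); then
    echo good
  elif printf '%s\n' "$output" | grep -q -F -- "$3"; then
    echo bad
  else
    echo skip
  fi
}

# Hypothetical usage inside a repository being searched:
#   hg bisect --"$(classify 'make' './run-tests' 'assertion failed: gub')"
```

Matching on a specific symptom, not merely on failure, keeps an unrelated crash from being reported to \hgcmd{bisect} as the bug you are hunting.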

745

746 \subsection{Bracket your search lazily}

747

748 Choosing the first ``good'' and ``bad'' changesets that will mark the

749 end points of your search is often easy, but it bears a little

750 discussion nevertheless. From the perspective of \hgcmd{bisect}, the

751 ``newest'' changeset is conventionally ``bad'', and the older

752 changeset is ``good''.

753

754 If you're having trouble remembering when a suitable ``good'' change

755 was, so that you can tell \hgcmd{bisect}, you could do worse than

756 testing changesets at random. Just remember to eliminate contenders

757 that can't possibly exhibit the bug (perhaps because the feature with

758 the bug isn't present yet) and those where another problem masks the

759 bug (as I discussed above).

760

761 Even if you end up ``early'' by thousands of changesets or months of

762 history, you will only add a handful of tests to the total number that

763 \hgcmd{bisect} must perform, thanks to its logarithmic behaviour.

764

765 %%% Local Variables:

766 %%% mode: latex

767 %%% TeX-master: "00book"

768 %%% End: