% Source: hgbook, es/undo.tex @ 380:1a4b507935de
% Changeset summary: began translation of tour-merge
% Author: Javier Rojas <jerojasro@devnull.li>
% Date: Wed Oct 29 00:37:55 2008 -0500 (2008-10-29)
% Parents: d5f1049a79dd; children: d2467817c934

\chapter{Finding and fixing your mistakes}
\label{chap:undo}

To err is human, but handling the consequences well takes a
first-rate revision control system. In this chapter, we'll discuss
some of the techniques you can use when you find that a problem has
taken root in your project. Mercurial has some powerful features that
will help you to isolate the sources of problems, and to deal with
them appropriately.

\section{Erasing local history}

\subsection{The accidental commit}

I have the occasional but persistent problem of typing more quickly
than I can think, which sometimes results in me committing a changeset
that is either incomplete or plain wrong. In my case, the usual kind
of incomplete changeset is one in which I've created a new source
file, but forgotten to \hgcmd{add} it. A ``plain wrong'' changeset is
not as common, but it is very annoying.

\subsection{Rolling back a transaction}
\label{sec:undo:rollback}

In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats
each modification of a repository as a \emph{transaction}. Every time
you commit a changeset or pull changes from another repository,
Mercurial remembers what you did. You can undo, or \emph{roll
  back}\ndt{As in database management systems, the term refers to
  atomicity and integrity: a set of actions is undone so that the
  repository is left in a previous, consistent state},
exactly one of these actions using the \hgcmd{rollback} command.
(See section~\ref{sec:undo:rollback-after-push} for an important
caveat about the use of this command.)

Here's a mistake that I often find myself making: committing a change
in which I've created a new file, but forgotten to \hgcmd{add} it.
\interaction{rollback.commit}
Looking at the output of \hgcmd{status} after the commit immediately
confirms the error.
\interaction{rollback.status}
The commit captured the changes to the file \filename{a}, but not the
new file \filename{b}. If I were to push this changeset to a
repository that I shared with a colleague, the chances are high that
something in \filename{a} would refer to \filename{b}, which would not
be present in their repository when they pulled my changes. I would
become the object of some indignation.

However, luck is with me: I've caught my error before I pushed the
changeset. I use the \hgcmd{rollback} command, and Mercurial makes
that last changeset vanish.
\interaction{rollback.rollback}
The changeset is no longer present in the repository's history, and
the working directory once again thinks that the file \filename{a} is
modified. The commit and rollback have left the working directory
exactly as it was prior to the commit; the changeset has been
completely erased. I can now \hgcmd{add} the file \filename{b}, and
rerun my commit.
\interaction{rollback.add}

\subsection{The erroneous pull}

It's common practice with Mercurial to maintain separate development
branches of a project in different repositories. Your development
team might have one shared repository for your project's ``0.9''
release, and another, containing different changes, for the ``1.0''
release.

Given this, you can imagine the consequences if you had a local
``0.9'' repository, and accidentally pulled changes from the shared
``1.0'' repository into it. At worst, through lack of attention, you
could push those changes into the shared ``0.9'' tree, confusing your
entire team (but don't worry, we'll return to this horror scenario
later). However, it's more likely that you'll notice immediately,
because Mercurial will display the URL it's pulling from, or you will
see it pull a suspiciously large number of changes into the
repository.

The \hgcmd{rollback} command will efficiently expunge all of the
changesets that you just pulled. Mercurial groups all changes from
one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is
all you need to undo this mistake.

\subsection{Rolling back is useless once you've pushed}
\label{sec:undo:rollback-after-push}

The value of the \hgcmd{rollback} command drops to zero once you've
pushed your changes to another repository. Rolling back a change
makes it disappear entirely, but \emph{only} in the repository in
which you perform the \hgcmd{rollback}. Because a rollback eliminates
history, there's no way for the disappearance of a change to propagate
between repositories.

If you've pushed a change to another repository, particularly if it's
a public repository, it has essentially ``escaped into the wild,''
and you'll have to recover from your mistake in a different way. What
will happen if you push a changeset somewhere, then roll it back, then
pull from the repository you pushed to, is that the changeset will
reappear in your repository.
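
The push, roll back, pull sequence can be made concrete with a toy
model. The Python sketch below is not Mercurial code: the
\texttt{Repo} class and its methods are hypothetical stand-ins that
treat a repository as a plain list of changeset names, and they assume
only the behaviour described in this section.

```python
class Repo:
    """Toy repository: an ordered list of changeset names (not Mercurial's data model)."""

    def __init__(self):
        self.changesets = []
        self.last_txn = None  # changesets added by the most recent transaction

    def commit(self, name):
        self.changesets.append(name)
        self.last_txn = [name]

    def push(self, other):
        # Copy over whatever the other repository lacks.
        missing = [c for c in self.changesets if c not in other.changesets]
        other.changesets.extend(missing)

    def pull(self, other):
        missing = [c for c in other.changesets if c not in self.changesets]
        if missing:
            self.changesets.extend(missing)
            self.last_txn = missing

    def rollback(self):
        # Only this repository is affected; nothing is sent anywhere else.
        for name in self.last_txn or []:
            self.changesets.remove(name)
        self.last_txn = None


local, shared = Repo(), Repo()
local.commit("good")
local.commit("bad")
local.push(shared)

local.rollback()                      # "bad" vanishes locally...
after_rollback = list(local.changesets)

local.pull(shared)                    # ...and comes straight back
after_pull = list(local.changesets)
print(after_rollback, after_pull)
```

In this model the shared repository never learns about the rollback,
which is why the next pull restores the unwanted changeset.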

(If you are absolutely sure that the change you want to roll back is
the most recent change in the repository that you pushed to,
\emph{and} you know that nobody else could have pulled it from that
repository, you can roll back the changeset there, too, but it is
better not to rely on a solution of this kind. If you do, sooner or
later a change will make it into a repository that you don't directly
control (or have forgotten about), and come back to bite you.)

\subsection{You can only roll back once}

Mercurial stores exactly one transaction in its transaction log; that
transaction is the most recent one that occurred in the repository.
This means that you can only roll back one transaction. If you expect
to be able to roll back one transaction, then its predecessor, this is
not the behaviour you will get.
\interaction{rollback.twice}
Once you've rolled back one transaction in a repository, you can't
roll back again in that repository until you perform another commit or
pull.
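
As a rough illustration of this single-slot behaviour, here is a small
Python model. The \texttt{Journal} class is hypothetical and assumes
only what the text states: the log holds the most recent transaction
and nothing older.

```python
class Journal:
    """Toy transaction log that remembers only the most recent transaction."""

    def __init__(self):
        self.history = []   # committed or pulled changesets, oldest first
        self._undo = None   # how many changesets the last transaction added

    def transact(self, *changesets):
        """Record one commit (one changeset) or one pull (possibly many)."""
        self.history.extend(changesets)
        self._undo = len(changesets)   # overwrites any older undo record

    def rollback(self):
        """Undo the last transaction; return False if there is nothing to undo."""
        if self._undo is None:
            return False
        if self._undo:
            del self.history[-self._undo:]
        self._undo = None
        return True


log = Journal()
log.transact("first")
log.transact("second")
first_try = log.rollback()       # removes "second"
second_try = log.rollback()      # nothing left to roll back
log.transact("p1", "p2", "p3")   # a pull arrives as a single transaction
third_try = log.rollback()       # removes all three pulled changesets
print(first_try, second_try, third_try, log.history)
```

The second call fails because the undo record was consumed by the
first; a new commit or pull is what refills it.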

\section{Reverting the mistaken change}

If you make a modification to a file, and decide that you really
didn't want to change the file at all, and you haven't yet committed
your changes, the \hgcmd{revert} command is the one you'll need. It
looks at the changeset that's the parent of the working directory, and
restores the contents of the file to their state as of that changeset.
(That's a long-winded way of saying that, in the normal case, it
undoes your modifications.)

Let's illustrate how the \hgcmd{revert} command works with yet another
small example. We'll begin by modifying a file that Mercurial is
already tracking.
\interaction{daily.revert.modify}
If we don't want that change, we can simply \hgcmd{revert} the file.
\interaction{daily.revert.unmodify}
The \hgcmd{revert} command provides us with an extra degree of safety
by saving our modified file with a \filename{.orig} extension.
\interaction{daily.revert.status}

Here is a summary of the cases that the \hgcmd{revert} command can
deal with. We will describe each of these in more detail in the
section that follows.
\begin{itemize}
\item If you modify a file, it will restore the file to its unmodified
  state.
\item If you \hgcmd{add} a file, it will undo the ``added'' state of
  the file, but leave the file itself untouched.
\item If you delete a file without telling Mercurial, it will restore
  the file to its unmodified contents.
\item If you use the \hgcmd{remove} command to remove a file, it will
  undo the ``removed'' state of the file, and restore the file to its
  unmodified contents.
\end{itemize}
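
The four cases in this summary can also be written down as a lookup
table. The Python sketch below is only a restatement of the list, not
a description of how Mercurial is implemented; the
\texttt{REVERT\_EFFECT} table and the \texttt{revert} function are
names invented for this example, keyed by the status letters that
\hgcmd{status} prints.

```python
# Status letters as printed by `hg status`:
# M = modified, A = added, R = removed, ! = missing (deleted by hand).
REVERT_EFFECT = {
    "M": ("clean", "contents restored from the parent changeset"),
    "A": ("untracked", "file left untouched on disk"),
    "!": ("clean", "file recreated with its unmodified contents"),
    "R": ("clean", "file recreated with its unmodified contents"),
}

def revert(status):
    """Return (new tracking state, what happens to the file) for one file."""
    return REVERT_EFFECT[status]

for code in "MA!R":
    state, effect = revert(code)
    print(f"{code} -> {state}: {effect}")
```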

\subsection{File management errors}
\label{sec:undo:mgmt}

The \hgcmd{revert} command is useful for more than just modified
files. It lets you reverse the results of all of Mercurial's file
management commands---\hgcmd{add}, \hgcmd{remove}, and so on.

If you \hgcmd{add} a file, then decide that in fact you don't want
Mercurial to track it, use \hgcmd{revert} to undo the add. Don't
worry; Mercurial will not modify the file in any way. It will just
``unmark'' the file.
\interaction{daily.revert.add}

Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use
\hgcmd{revert} to restore it to the contents it had as of the parent
of the working directory.
\interaction{daily.revert.remove}
This works just as well for a file that you deleted by hand, without
telling Mercurial (recall that in Mercurial terminology, this kind of
file is called ``missing'').
\interaction{daily.revert.missing}

If you revert a \hgcmd{copy}, the copied-to file remains in your
working directory afterwards, untracked. Since a copy doesn't affect
the copied-from file in any way, Mercurial doesn't do anything with
the copied-from file.
\interaction{daily.revert.copy}

\subsubsection{A slightly special case: reverting a rename}

If you \hgcmd{rename} a file, there is one small detail that
you should remember. When you \hgcmd{revert} a rename, it's not
enough to provide the name of the renamed-to file, as you can see
here.
\interaction{daily.revert.rename}
As you can see from the output of \hgcmd{status}, the renamed-to file
is no longer identified as added, but the renamed-\emph{from} file is
still removed! This is counter-intuitive (at least to me), but at
least it's easy to deal with.
\interaction{daily.revert.rename-orig}
So remember, to revert a \hgcmd{rename}, you must provide \emph{both}
the source and destination names.
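
The reason both names are needed becomes clearer if a rename is viewed
as two separate status changes. The Python sketch below models that
view with a dictionary of status letters; the \texttt{rename} and
\texttt{revert} functions are simplified stand-ins written for this
example, not Mercurial's own code.

```python
def rename(status, src, dst):
    """Model `hg rename src dst`: dst becomes added, src becomes removed."""
    status = dict(status)
    status[src] = "R"
    status[dst] = "A"
    return status

def revert(status, *names):
    """Model `hg revert names...` for added and removed files only."""
    status = dict(status)
    for name in names:
        if status.get(name) == "A":
            status[name] = "?"   # no longer tracked; the file stays on disk
        elif status.get(name) == "R":
            status[name] = "C"   # restored and clean again
    return status

start = {"a": "C"}
renamed = rename(start, "a", "b")
only_dst = revert(renamed, "b")      # the source is still marked removed
both = revert(renamed, "a", "b")     # the rename is fully undone
print(only_dst, both)
```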

% TODO: the output doesn't look like it will be removed!

(By the way, if you rename a file, then modify the renamed-to file,
then revert both components of the rename, when Mercurial restores the
file that was removed as part of the rename, it will be unmodified.
If you need the modifications in the renamed-to file to show up in the
renamed-from file, don't forget to copy them over.)

These fiddly aspects of reverting a rename arguably constitute a small
bug in Mercurial.

\section{Dealing with committed changes}

Consider a case where you have committed a change $a$, and another
change $b$ on top of it; you then realise that change $a$ was
incorrect. Mercurial lets you ``back out'' an entire changeset
automatically, and provides building blocks that let you reverse part
of a changeset by hand.

Before you read this section, here's something to keep in mind: the
\hgcmd{backout} command undoes changes by \emph{adding} history, not
by modifying or erasing it. It's the right tool to use if you're
fixing bugs, but not if you're trying to undo some change that has
catastrophic consequences. To deal with those, see
section~\ref{sec:undo:aaaiiieee}.

\subsection{Backing out a changeset}

The \hgcmd{backout} command lets you ``undo'' the effects of an entire
changeset in an automated fashion. Because Mercurial's history is
immutable, this command \emph{does not} get rid of the changeset you
want to undo. Instead, it creates a new changeset that
\emph{reverses} the effect of the to-be-undone changeset.

The operation of the \hgcmd{backout} command is a little intricate, so
let's illustrate it with some examples. First, we'll create a
repository with some simple changes.
\interaction{backout.init}

The \hgcmd{backout} command takes a single changeset ID as its
argument; this is the changeset to back out. Normally,
\hgcmd{backout} will drop you into a text editor to write a commit
message, so you can record why you're backing the change out. In this
example, we provide a commit message on the command line using the
\hgopt{backout}{-m} option.

\subsection{Backing out the tip changeset}

We're going to start by backing out the last changeset we committed.
\interaction{backout.simple}
You can see that the second line from \filename{myfile} is no longer
present. Taking a look at the output of \hgcmd{log} gives us an idea
of what the \hgcmd{backout} command has done.
\interaction{backout.simple.log}
Notice that the new changeset that \hgcmd{backout} has created is a
child of the changeset we backed out. It's easier to see this in
figure~\ref{fig:undo:backout}, which presents a graphical view of the
change history. As you can see, the history is nice and linear.

\begin{figure}[htb]
  \centering
  \grafix{undo-simple}
  \caption{Backing out a change using the \hgcmd{backout} command}
  \label{fig:undo:backout}
\end{figure}

\subsection{Backing out a non-tip change}

If you want to back out a change other than the last one you
committed, pass the \hgopt{backout}{--merge} option to the
\hgcmd{backout} command.
\interaction{backout.non-tip.clone}
This makes backing out any changeset a ``one-shot'' operation that's
usually simple and fast.
\interaction{backout.non-tip.backout}

If you take a look at the contents of \filename{myfile} after the
backout finishes, you'll see that the first and third changes are
present, but not the second.
\interaction{backout.non-tip.cat}

As the graphical history in figure~\ref{fig:undo:backout-non-tip}
illustrates, Mercurial actually commits \emph{two} changes in this
kind of situation (the box-shaped nodes are the ones that Mercurial
commits automatically). Before Mercurial begins the backout process,
it first remembers what the current parent of the working directory
is. It then backs out the target changeset, and commits that as a
changeset. Finally, it merges back to the previous parent of the
working directory, and commits the result of the merge.

% TODO: to me it looks like mercurial doesn't commit the second merge automatically!

\begin{figure}[htb]
  \centering
  \grafix{undo-non-tip}
  \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}
  \label{fig:undo:backout-non-tip}
\end{figure}

The result is that you end up ``back where you were'', only with some
extra history that undoes the effect of the changeset you wanted to
back out.
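
To see why the first and third changes end up in the merged file while
the second does not, it can help to model the final merge on a tiny
scale. The Python sketch below is only an illustration under
simplifying assumptions: a file is treated as a set of lines, and
\texttt{three\_way\_merge} is a hypothetical helper that is far cruder
than Mercurial's real merge machinery.

```python
def three_way_merge(base, ours, theirs):
    """Merge two descendants of `base`, treating each file as a set of lines."""
    removed = (base - ours) | (base - theirs)   # lines either side deleted
    added = (ours - base) | (theirs - base)     # lines either side introduced
    return (base - removed) | added

first = {"first change"}
second = first | {"second change"}    # the changeset being backed out
third = second | {"third change"}     # the tip before the backout

backout = first                       # the backout changeset undoes the second change
merged = three_way_merge(base=second, ours=third, theirs=backout)
print(sorted(merged))
```

The common ancestor of the two sides is the changeset being backed
out, so the merge sees one side deleting the second change and the
other side adding the third, and keeps both effects.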

\subsubsection{Always use the \hgopt{backout}{--merge} option}

In fact, since the \hgopt{backout}{--merge} option will do the ``right
thing'' whether or not the changeset you're backing out is the tip
(i.e.~it won't try to merge if it's backing out the tip, since there's
no need), you should \emph{always} use this option when you run the
\hgcmd{backout} command.

\subsection{Gaining more control of the backout process}

While I've recommended that you always use the
\hgopt{backout}{--merge} option when backing out a change, the
\hgcmd{backout} command lets you decide how to merge a backout
changeset. Taking control of the backout process by hand is something
you will rarely need to do, but it can be useful to understand what
the \hgcmd{backout} command is doing for you automatically. To
illustrate this, let's clone our first repository, but omit the
backout change that it contains.

\interaction{backout.manual.clone}
As with our earlier example, we'll commit a third changeset, then back
out its parent, and see what happens.
\interaction{backout.manual.backout}
Our new changeset is again a descendant of the changeset we backed
out; it's thus a new head, \emph{not} a descendant of the changeset
that was the tip. The \hgcmd{backout} command was quite explicit in
telling us this.
\interaction{backout.manual.log}

Again, it's easier to see what has happened by looking at a graph of
the revision history, in figure~\ref{fig:undo:backout-manual}. This
makes it clear that when we use \hgcmd{backout} to back out a change
other than the tip, Mercurial adds a new head to the repository (the
change it committed is box-shaped).

\begin{figure}[htb]
  \centering
  \grafix{undo-manual}
  \caption{Backing out a change using the \hgcmd{backout} command}
  \label{fig:undo:backout-manual}
\end{figure}

After the \hgcmd{backout} command has completed, it leaves the new
``backout'' changeset as the parent of the working directory.
\interaction{backout.manual.parents}
Now we have two isolated sets of changes.
\interaction{backout.manual.heads}

Let's think about what we expect to see as the contents of
\filename{myfile} now. The first change should be present, because
we've never backed it out. The second change should be missing, as
that's the change we backed out. Since the history graph shows the
third change as a separate head, we \emph{don't} expect to see the
third change present in \filename{myfile}.
\interaction{backout.manual.cat}
To get the third change back into the file, we just do a normal merge
of our two heads.
\interaction{backout.manual.merge}
Afterwards, the graphical history of our repository looks like
figure~\ref{fig:undo:backout-manual-merge}.

\begin{figure}[htb]
  \centering
  \grafix{undo-manual-merge}
  \caption{Manually merging a backout change}
  \label{fig:undo:backout-manual-merge}
\end{figure}

\subsection{Why \hgcmd{backout} works as it does}

Here's a brief description of how the \hgcmd{backout} command works.
\begin{enumerate}
\item It ensures that the working directory is ``clean'', i.e.~that
  the output of \hgcmd{status} would be empty.
\item It remembers the current parent of the working directory. Let's
  call this changeset \texttt{orig}.
\item It does the equivalent of a \hgcmd{update} to sync the working
  directory to the changeset you want to back out. Let's call this
  changeset \texttt{backout}.
\item It finds the parent of that changeset. Let's call that
  changeset \texttt{parent}.
\item For each file that the \texttt{backout} changeset affected, it
  does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,
  to restore it to the contents it had before that changeset was
  committed.
\item It commits the result as a new changeset. This changeset has
  \texttt{backout} as its parent.
\item If you specify \hgopt{backout}{--merge} on the command line, it
  merges with \texttt{orig}, and commits the result of the merge.
\end{enumerate}
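
The effect of these steps on the shape of the history can be sketched
in a few lines of Python. This is a toy model, not Mercurial's
implementation: history is a dictionary mapping revision numbers to
parent tuples, file contents are ignored, and the \texttt{backout} and
\texttt{heads} functions are hypothetical names. It follows the
description above in committing the merge automatically, which may not
match every version of Mercurial.

```python
def heads(parents):
    """Changesets that are not the parent of any other changeset."""
    referenced = {p for ps in parents.values() for p in ps}
    return sorted(set(parents) - referenced)

def backout(parents, working_parent, target, merge=False):
    """Add a backout changeset for `target`; optionally merge it with the old parent."""
    parents = dict(parents)
    orig = working_parent                 # remember the current parent
    new = max(parents) + 1
    parents[new] = (target,)              # the backout changeset is a child of the target
    working_parent = new
    if merge and orig != target:          # nothing to merge when backing out the tip
        merged = new + 1
        parents[merged] = (new, orig)
        working_parent = merged
    return parents, working_parent

# Linear history 0 <- 1 <- 2, with the working directory at the tip (2).
history = {0: (), 1: (0,), 2: (1,)}
manual, wd_manual = backout(history, working_parent=2, target=1)
auto, wd_auto = backout(history, working_parent=2, target=1, merge=True)
tip, wd_tip = backout(history, working_parent=2, target=2, merge=True)
print(heads(manual), heads(auto), heads(tip))
```

Without the merge, the repository gains a second head; with it, the
two lines of history are joined again, and backing out the tip stays
linear in both cases.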

An alternative way to implement the \hgcmd{backout} command would be
to \hgcmd{export} the to-be-backed-out changeset as a diff, then use
the \cmdopt{patch}{--reverse} option to the \command{patch} command to
reverse the effect of the change without fiddling with the working
directory. This sounds much simpler, but it would not work nearly as
well.

The reason that \hgcmd{backout} does an update, a commit, a merge, and
another commit is to give the merge machinery the best chance to do a
good job when dealing with all the changes \emph{between} the change
you're backing out and the current tip.

If you're backing out a changeset that's~100 revisions back in your
project's history, the chances that the \command{patch} command will
be able to apply a reverse diff cleanly are not good, because
intervening changes are likely to have ``broken the context'' that
\command{patch} uses to determine whether it can apply a patch (if
this sounds like gibberish, see section~\ref{sec:mq:patch} for a
discussion of the \command{patch} command). Also, Mercurial's merge
machinery will handle files and directories being renamed, permission
changes, and modifications to binary files, none of which
\command{patch} can deal with.

\section{Changes that should never have been}
\label{sec:undo:aaaiiieee}

Most of the time, the \hgcmd{backout} command is exactly what you need
if you want to undo the effects of a change. It leaves a permanent
record of exactly what you did, both when committing the original
changeset and when you cleaned up after it.

On rare occasions, though, you may find that you've committed a change
that really should not be present in the repository at all. For
example, it would be very unusual, and usually considered a mistake,
to commit a software project's object files as well as its source
files. Object files have almost no intrinsic value, and they're
\emph{big}, so they increase the size of the repository and the amount
of time it takes to clone or pull changes.

Before I discuss the options that you have if you commit a ``brown
paper bag'' change (the kind that's so bad that you want to pull a
brown paper bag over your head), let me first discuss some approaches
that probably won't work.

Since Mercurial treats history as accumulative---every change builds
on top of all changes that preceded it---you generally can't just make
disastrous changes disappear. The one exception is when you've just
committed a change, and it hasn't been pushed or pulled into another
repository. That's when you can safely use the \hgcmd{rollback}
command, as I detailed in section~\ref{sec:undo:rollback}.

After you've pushed a bad change to another repository, you
\emph{could} still use \hgcmd{rollback} to make your local copy of the
change disappear, but it won't have the consequences you want. The
change will still be present in the remote repository, so it will
reappear in your local repository the next time you pull.

If a situation like this arises, and you know which repositories your
bad change has propagated into, you can \emph{try} to get rid of the
change from \emph{every} one of those repositories. This is, of
course, not a satisfactory solution: if you miss even a single
repository while you're expunging, the change is still ``in the
wild'', and could propagate further.

If you've committed one or more changes \emph{after} the change that
you'd like to see disappear, your options are further reduced.
Mercurial doesn't provide a way to ``punch a hole'' in history while
leaving the surrounding changesets intact.

XXX This needs filling out. The \texttt{hg-replay} script in the
\texttt{examples} directory works, but doesn't handle merge
changesets. Kind of an important omission.

\subsection{Protect yourself from ``escaped'' changes}

If you've committed some changes to your local repository and they've
been pushed or pulled somewhere else, this isn't necessarily a
disaster. You can protect yourself ahead of time against some classes
of bad changeset. This is particularly easy if your team usually
pulls changes from a central repository.

By configuring some hooks on that repository to validate incoming
changesets (see chapter~\ref{chap:hook}), you can automatically
prevent some kinds of bad changeset from being pushed to the central
repository at all. With such a configuration in place, some kinds of
bad changeset will naturally tend to ``die out'' because they can't
propagate into the central repository. Better yet, this happens
without any need for explicit intervention.

For instance, an incoming change hook that verifies that a changeset
will actually compile can prevent people from inadvertently ``breaking
the build''.
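
As a sketch of the kind of check such a hook might perform, the Python
functions below flag files that look like build products, the example
of an unwanted change mentioned earlier in this chapter. The function
names, the suffix list, and the way the file names reach the check are
all assumptions made for illustration; wiring a check like this into
Mercurial is the subject of chapter~\ref{chap:hook}.

```python
import os

# Hypothetical list of suffixes that usually indicate build products.
FORBIDDEN_SUFFIXES = (".o", ".obj", ".pyc", ".class")

def rejected_files(filenames):
    """Return the incoming file names that should not be committed."""
    return [name for name in filenames
            if os.path.splitext(name)[1] in FORBIDDEN_SUFFIXES]

def check_incoming(filenames):
    """Return True when the incoming changesets are acceptable."""
    bad = rejected_files(filenames)
    for name in bad:
        print(f"rejecting build product: {name}")
    return not bad

ok = check_incoming(["src/main.c", "README"])
blocked = check_incoming(["src/main.c", "src/main.o"])
```

A check that builds the project would follow the same pattern: compute
a yes-or-no answer from the incoming changes and refuse them when the
answer is no.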

\section{Finding the source of a bug}
\label{sec:undo:bisect}

While it's all very well to be able to back out a changeset that
introduced a bug, this requires that you know which changeset to back
out. Mercurial provides an invaluable command, called
\hgcmd{bisect}, that helps you to automate this process and accomplish
it very efficiently.

The idea behind the \hgcmd{bisect} command is that a changeset has
introduced some change of behaviour that you can identify with a
simple binary test. You don't know which piece of code introduced the
change, but you know how to test for the presence of the bug. The
\hgcmd{bisect} command uses your test to direct its search for the
changeset that introduced the code that caused the bug.

Here are a few scenarios to help you understand how you might apply
this command.
\begin{itemize}
\item The most recent version of your software has a bug that you
  remember wasn't present a few weeks ago, but you don't know when it
  was introduced. Here, your binary test checks for the presence of
  that bug.
\item You fixed a bug in a rush, and now it's time to close the entry
  in your team's bug database. The bug database requires a changeset
  ID when you close an entry, but you don't remember which changeset
  you fixed the bug in. Once again, your binary test checks for the
  presence of the bug.
\item Your software works correctly, but runs~15\% slower than the
  last time you measured it. You want to know which changeset
  introduced the performance regression. In this case, your binary
  test measures the performance of your software, to see whether it's
  ``fast'' or ``slow''.
\item The sizes of the components of your project that you ship
  exploded recently, and you suspect that something changed in the way
  you build your project.
\end{itemize}

From these examples, it should be clear that the \hgcmd{bisect}
command is useful for more than just finding the sources of bugs. You
can use it to find any ``emergent property'' of a repository (anything
that you can't find from a simple text search of the files in the
tree) for which you can write a binary test.

We'll introduce a little bit of terminology here, just to make it
clear which parts of the search process are your responsibility, and
which are Mercurial's. A \emph{test} is something that \emph{you} run
when \hgcmd{bisect} chooses a changeset. A \emph{probe} is what
\hgcmd{bisect} runs to tell whether a revision is good. Finally,
we'll use the word ``bisect'', as both a noun and a verb, to stand in
for the phrase ``search using the \hgcmd{bisect} command''.

One simple way to automate the searching process would be simply to
probe every changeset. However, this scales poorly. If it took ten
minutes to test a single changeset, and you had 10,000 changesets in
your repository, the exhaustive approach would take on average~35
\emph{days} to find the changeset that introduced a bug. Even if you
knew that the bug was introduced by one of the last 500 changesets,
and limited your search to those, you'd still be looking at over 40
hours to find the changeset that introduced your bug.
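
These figures follow from a short calculation, assuming that an
exhaustive scan finds the culprit, on average, after testing half of
the candidates. The Python sketch below redoes the arithmetic; the
function name is made up for this example.

```python
MINUTES_PER_TEST = 10

def average_exhaustive_minutes(changesets):
    """On average, a linear scan finds the culprit after testing half the candidates."""
    return changesets / 2 * MINUTES_PER_TEST

days = average_exhaustive_minutes(10_000) / (60 * 24)
hours = average_exhaustive_minutes(500) / 60
print(f"{days:.1f} days, {hours:.1f} hours")
```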

What the \hgcmd{bisect} command does is use its knowledge of the
``shape'' of your project's revision history to perform a search in
time proportional to the \emph{logarithm} of the number of changesets
to check (the kind of search it performs is called a dichotomic
search). With this approach, searching through 10,000 changesets will
take less than three hours, even at ten minutes per test (the search
will require about 14 tests). Limit your search to the last hundred
changesets, and it will take only about an hour (roughly seven tests).
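
The test counts quoted here are base-2 logarithms rounded up. The
Python sketch below checks them, assuming a simple linear range of
changesets and ten minutes per test; \texttt{bisect\_tests} is a name
invented for this example.

```python
import math

MINUTES_PER_TEST = 10

def bisect_tests(changesets):
    """Upper bound on the tests a binary search needs over a linear range."""
    return math.ceil(math.log2(changesets))

for n in (10_000, 100):
    tests = bisect_tests(n)
    print(n, tests, tests * MINUTES_PER_TEST / 60)
```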

The \hgcmd{bisect} command is aware of the ``branchy'' nature of a
Mercurial project's revision history, so it has no problems dealing
with branches, merges, or multiple heads in a repository. It can
prune entire branches of history with a single probe, which is how it
operates so efficiently.

\subsection{Using the \hgcmd{bisect} command}

Here's an example of \hgcmd{bisect} in action.

\begin{note}
  In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a
  core command: it was distributed with Mercurial as an extension.
  This section describes the built-in command, not the old extension.
\end{note}

Now let's create a repository, so that we can try out the
\hgcmd{bisect} command in isolation.
\interaction{bisect.init}
We'll simulate a project that has a bug in it in a simple-minded way:
create trivial changes in a loop, and nominate one specific change
that will have the ``bug''. This loop creates 35 changesets, each
adding a single file to the repository. We'll represent our ``bug''
with a file that contains the text ``i have a gub''.
\interaction{bisect.commits}

The next thing that we'd like to do is figure out how to use the
\hgcmd{bisect} command. We can use Mercurial's normal built-in help
mechanism for this.
\interaction{bisect.help}

The \hgcmd{bisect} command works in steps. Each step proceeds as follows.
\begin{enumerate}
\item You run your binary test.
  \begin{itemize}
  \item If the test succeeded, you tell \hgcmd{bisect} by running the
    \hgcmdargs{bisect}{--good} command.
  \item If it failed, run the \hgcmdargs{bisect}{--bad} command.
  \end{itemize}
\item The command uses your information to decide which changeset to
  test next.
\item It updates the working directory to that changeset, and the
  process begins again.
\end{enumerate}
The process ends when \hgcmd{bisect} identifies a unique changeset
that marks the point where your test transitioned from ``succeeding''
to ``failing''.
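
The loop formed by these steps can be sketched in Python. The sketch
makes two simplifying assumptions that the real command does not:
history is a straight line of integer revisions, and the test is a
function passed in as \texttt{is\_bad}. The revision chosen as buggy
is also an assumption made for this example.

```python
def bisect(good, bad, is_bad):
    """Find the first bad revision in a linear history.

    `good` is a revision known to pass the test, `bad` a later one known to fail.
    Returns (first bad revision, number of tests run).
    """
    tests = 0
    while bad - good > 1:
        candidate = (good + bad) // 2   # revision to check out and test next
        tests += 1
        if is_bad(candidate):
            bad = candidate             # plays the role of `hg bisect --bad`
        else:
            good = candidate            # plays the role of `hg bisect --good`
    return bad, tests

BUGGY_REVISION = 22                     # assumed for this example only
culprit, tests = bisect(good=10, bad=34, is_bad=lambda rev: rev >= BUGGY_REVISION)
print(culprit, tests)
```

Each test halves the range between the last known good revision and
the first known bad one, which is where the logarithmic cost comes
from.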

607 To start the search, we must run the \hgcmdargs{bisect}{--reset} command.

608 \interaction{bisect.search.init}

609

610 In our case, the binary test we use is simple: we check to see if any

611 file in the repository contains the string ``i have a gub''. If it

612 does, this changeset contains the change that ``caused the bug''. By

613 convention, a changeset that has the property we're searching for is

614 ``bad'', while one that doesn't is ``good''.

615

616 Most of the time, the revision to which the working directory is

617 synced (usually the tip) already exhibits the problem introduced by

618 the buggy change, so we'll mark it as ``bad''.

619 \interaction{bisect.search.bad-init}

620

621 Our next task is to nominate a changeset that we know \emph{doesn't}

622 have the bug; the \hgcmd{bisect} command will ``bracket'' its search

623 between the first pair of good and bad changesets. In our case, we

624 know that revision~10 didn't have the bug. (I'll say more

625 about choosing the first ``good'' changeset later.)

626 \interaction{bisect.search.good-init}

627

628 Notice that this command printed some output.

629 \begin{itemize}

630 \item It told us how many changesets it must consider before it can

631 identify the one that introduced the bug, and how many tests that

632 will require.

633 \item It updated the working directory to the next changeset to test,

634 and told us which changeset it's testing.

635 \end{itemize}

636

637 We now run our test in the working directory. We use the

638 \command{grep} command to see if our ``bad'' file is present in the

639 working directory. If it is, this revision is bad; if not, this

640 revision is good.

641 \interaction{bisect.search.step1}

642

643 This test looks like a perfect candidate for automation, so let's turn

644 it into a shell function.

645 \interaction{bisect.search.mytest}

646 We can now run an entire test step with a single command,

647 \texttt{mytest}.

648 \interaction{bisect.search.step2}

649 A few more invocations of our canned test step command, and we're

650 done.

651 \interaction{bisect.search.rest}

652

653 Even though we had~40 changesets to search through, the \hgcmd{bisect}

654 command let us find the changeset that introduced our ``bug'' with

655 only five tests. Because the number of tests that the \hgcmd{bisect}

656 command performs grows logarithmically with the number of changesets to

657 search, the advantage that it has over the ``brute force'' search

658 approach increases with every changeset you add.

659
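As a rough worked estimate, assume a linear history in which $n$ changesets separate the known ``good'' changeset from the known ``bad'' one, and assume that each test halves the remaining range. This is an idealisation; a history with many branches may behave a little differently. The number of tests $T(n)$ then satisfies
\[
  \lfloor \log_2 n \rfloor \le T(n) \le \lceil \log_2 n \rceil ,
\]
so that
\[
  T(40) \in \{5, 6\}, \qquad T(1000) \in \{9, 10\}, \qquad T(10^6) \in \{19, 20\}.
\]
Testing every changeset in turn could instead take up to $n - 1$ tests.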

660 \subsection{Cleaning up after your search}

661

662 When you're finished using the \hgcmd{bisect} command in a

663 repository, you can use the \hgcmdargs{bisect}{--reset} command to drop

664 the information it was using to drive your search. That information

665 doesn't take up much space, so it doesn't matter if you forget to run this

666 command. However, \hgcmd{bisect} won't let you start a new search in

667 that repository until you do a \hgcmdargs{bisect}{--reset}.

668 \interaction{bisect.search.reset}

669

670 \section{Tips for finding bugs effectively}

671

672 \subsection{Give consistent input}

673

674 The \hgcmd{bisect} command requires that you correctly report the

675 result of every test you perform. If you tell it that a test failed

676 when it really succeeded, it \emph{might} be able to detect the

677 inconsistency. If it can identify an inconsistency in your reports,

678 it will tell you that a particular changeset is both good and bad.

679 However, it can't do this reliably; it's about as likely to accept

680 your reports and name the wrong changeset as the source of the bug.

681

682 \subsection{Automate as much as possible}

683

684 When I started using the \hgcmd{bisect} command, I tried a few times

685 to run my tests by hand, on the command line. This is an approach

686 that I, at least, am not suited to. After a few tries, I found that I

687 was making enough mistakes that I was having to restart my searches

688 several times before finally getting correct results.

689

690 My initial problems with driving the \hgcmd{bisect} command by hand

691 occurred even with simple searches on small repositories; if the

692 problem you're looking for is more subtle, or the number of tests that

693 \hgcmd{bisect} must perform increases, the likelihood of operator

694 error ruining the search is much higher. Once I started automating my

695 tests, I had much better results.

696

697 The key to automated testing is twofold:

698 \begin{itemize}

699 \item always test for the same symptom, and

700 \item always feed consistent input to the \hgcmd{bisect} command.

701 \end{itemize}

702 In my tutorial example above, the \command{grep} command tests for the

703 symptom, and the \texttt{if} statement takes the result of this check

704 and ensures that we always feed the same input to the \hgcmd{bisect}

705 command. The \texttt{mytest} function marries these together in a

706 reproducible way, so that every test is uniform and consistent.

707
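The same pairing of a symptom check with a fixed way of reporting can be sketched in Python. This is a minimal sketch rather than the shell function from the tutorial: \texttt{verdict} and \texttt{report} are hypothetical names, only files at the top of the working directory are examined, and \texttt{report} assumes that the \texttt{hg} executable is on the search path and that a \hgcmd{bisect} search is already in progress.

```python
import subprocess
from pathlib import Path

MARKER = "i have a gub"  # the symptom used in the tutorial


def verdict(workdir: str) -> str:
    """Return "bad" if a file at the top of workdir contains MARKER."""
    for path in Path(workdir).iterdir():
        if path.is_file() and MARKER in path.read_text(errors="ignore"):
            return "bad"
    return "good"


def report(workdir: str) -> str:
    """Test the working directory and pass the verdict to hg bisect."""
    result = verdict(workdir)
    print("this revision is", result)
    # Assumes the hg executable is on PATH and a search is in progress.
    subprocess.run(["hg", "bisect", "--" + result], cwd=workdir, check=True)
    return result
```

Keeping the check (\texttt{verdict}) separate from the reporting (\texttt{report}) means the check can be tried on its own before it is trusted to drive a search.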

708 \subsection{Check your results}

709

710 Because the output of a \hgcmd{bisect} search is only as good as the

711 input you give it, don't take the changeset it reports as the

712 absolute truth. A simple way to cross-check its report is to manually

713 run your test at each of the following changesets:

714 \begin{itemize}

715 \item The changeset that it reports as the first bad revision. Your

716 test should still report this as bad.

717 \item The parent of that changeset (either parent, if it's a merge).

718 Your test should report this changeset as good.

719 \item A child of that changeset. Your test should report this

720 changeset as bad.

721 \end{itemize}

722

723 \subsection{Beware interference between bugs}

724

725 It's possible that your search for one bug could be disrupted by the

726 presence of another. For example, let's say your software crashes at

727 revision 100, and worked correctly at revision 50. Unknown to you,

728 someone else introduced a different crashing bug at revision 60, and

729 fixed it at revision 80. This could distort your results in one of

730 several ways.

731

732 It is possible that this other bug completely ``masks'' yours, which

733 is to say that it occurs before your bug has a chance to manifest

734 itself. If you can't avoid that other bug (for example, it prevents

735 your project from building), and so can't tell whether your bug is

736 present in a particular changeset, the \hgcmd{bisect} command cannot

737 help you directly. Instead, you can mark a changeset as untested by

738 running \hgcmdargs{bisect}{--skip}.

739

740 A different problem could arise if your test for a bug's presence is

741 not specific enough. If you check for ``my program crashes'', then

742 both your crashing bug and an unrelated crashing bug that masks it

743 will look like the same thing, and mislead \hgcmd{bisect}.

744

745 Another situation in which \hgcmdargs{bisect}{--skip} is useful is

746 when you can't test a revision because your project was in a broken and

747 hence untestable state at that revision, perhaps because someone

748 checked in a change that prevented the project from building.

749
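One way to keep a test specific is to have it return three answers instead of two. The sketch below is an assumption about how such a test might be written, not something \hgcmd{bisect} provides: \texttt{classify}, its parameters, and the sample messages are all hypothetical. A run counts as ``bad'' only when the output shows the symptom being hunted, and any other failure becomes a candidate for \hgcmdargs{bisect}{--skip}.

```python
def classify(exit_code: int, output: str, symptom: str) -> str:
    """Map one test run to "good", "bad", or "skip"."""
    if symptom in output:
        return "bad"       # the bug we are searching for showed up
    if exit_code == 0:
        return "good"      # ran cleanly, with no sign of the symptom
    return "skip"          # failed for another reason: cannot judge


# Hypothetical outputs from three different changesets.
print(classify(1, "Segmentation fault in parse_header", "parse_header"))  # prints "bad"
print(classify(1, "Segmentation fault in render", "parse_header"))        # prints "skip"
print(classify(0, "all tests passed", "parse_header"))                    # prints "good"
```

With a test like this, an unrelated crash or a failed build is skipped rather than reported as the bug being searched for.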

750 \subsection{Bracket your search lazily}

751

752 Choosing the first ``good'' and ``bad'' changesets that will mark the

753 end points of your search is often easy, but it bears a little

754 discussion nevertheless. From the perspective of \hgcmd{bisect}, the

755 newer changeset is conventionally ``bad'', and the older

756 changeset is ``good''.

757

758 If you're having trouble remembering when a suitable ``good'' change

759 was, so that you can tell \hgcmd{bisect}, you could do worse than

760 testing changesets at random. Just remember to eliminate contenders

761 that can't possibly exhibit the bug (perhaps because the feature with

762 the bug isn't present yet) and those where another problem masks the

763 bug (as I discussed above).

764

765 Even if you end up ``early'' by thousands of changesets or months of

766 history, you will only add a handful of tests to the total number that

767 \hgcmd{bisect} must perform, thanks to its logarithmic behaviour.

768

769 %%% Local Variables:

770 %%% mode: latex

771 %%% TeX-master: "00book"

772 %%% End: