\chapter{Finding and fixing your mistakes}
\label{chap:undo}

To err is human, but dealing properly with the consequences takes a
first-rate revision control system.  In this chapter, we'll discuss
some techniques you can use when you find that a problem has taken
root in your project.  Mercurial has some powerful features that will
help you to isolate the sources of problems, and to deal with them
appropriately.

\section{Erasing local history}

\subsection{The accidental commit}

I have the occasional but persistent problem of typing faster than I
think, which sometimes results in committing a changeset that is
incomplete or simply wrong.  In my case, the usual incomplete
changeset is one in which I created a new source file, but forgot to
\hgcmd{add} it.  A ``simply wrong'' changeset is not as common, but it
is very annoying.

\subsection{Rolling back a transaction}
\label{sec:undo:rollback}

In section~\ref{sec:concepts:txn}, I mentioned that Mercurial treats
each modification of a repository as a \emph{transaction}.  Every time
you commit a changeset or pull changes from another repository,
Mercurial remembers what it did.  You can undo, or \emph{roll back},
exactly one of these actions using the \hgcmd{rollback} command.
(See section~\ref{sec:undo:rollback-after-push} for an important
caveat about the use of this command.)

Here is a mistake that I make frequently: committing a change in which
I've created a new file, but forgotten to \hgcmd{add} it.
\interaction{rollback.commit}
The output of \hgcmd{status} after the commit immediately confirms
the error.
\interaction{rollback.status}
The commit captured the changes to the file \filename{a}, but not the
new file \filename{b}.  If I pushed this changeset to a repository
shared with a colleague, it's quite likely that something in
\filename{a} would refer to \filename{b}, which would not be present
when they pulled my changes.  I would become the object of some
indignation.

However, luck is with me: I caught my error before pushing the
changeset.  I use the \hgcmd{rollback} command, and Mercurial makes
the last changeset vanish.
\interaction{rollback.rollback}
The changeset is no longer in the repository's history, and the
working directory once again thinks that the file \filename{a} is
modified.  The commit and rollback have left the working directory
exactly as it was before the commit; the changeset has been completely
erased.  I can now \hgcmd{add} the file \filename{b}, and redo the
commit.
\interaction{rollback.add}

\subsection{The erroneous pull}

Maintaining separate development branches of a project in different
repositories is common practice with Mercurial.  Your development
team might have one shared repository for the ``0.9'' release, and
another, with different changes, for the ``1.0'' release.

Given this scenario, you can imagine the consequences if you had a
local ``0.9'' repository, and accidentally pulled changes from the
shared ``1.0'' repository into it.  At worst, through inattention, you
might push those changes into the shared ``0.9'' tree, confusing your
entire team (but don't worry, we'll return to this horror scenario
later).  In any case, it's very likely that you'll notice immediately,
because Mercurial will display the URL it's pulling from, or you will
see it pull a suspiciously large number of changes into the
repository.

The \hgcmd{rollback} command will work nicely to expunge all of the
changesets that you just pulled.  Mercurial groups all changes from
one \hgcmd{pull} into a single transaction, so one \hgcmd{rollback} is
all you need to undo this mistake.

\subsection{Rolling back is useless once you've pushed}
\label{sec:undo:rollback-after-push}

The value of the \hgcmd{rollback} command drops to zero once you've
pushed your changes to another repository.  Rolling back a change
makes it disappear entirely, but \emph{only} in the repository in
which you perform the \hgcmd{rollback}.  Because a rollback eliminates
history, there's no way for the disappearance of a change to propagate
between repositories.

If you've pushed a change to another repository (particularly if it's
a shared repository), it has essentially ``escaped into the wild,''
and you'll have to recover from your mistake in a different way.
If you push a changeset somewhere, then roll it back, then pull from
the repository you pushed to, the changeset will reappear in your
repository.

(If you absolutely know for sure that the change you want to roll back
is the most recent change in the repository that you pushed to,
\emph{and} you know that nobody else could have pulled it from that
repository, you can roll back the changeset there, too, but you
really should not rely on this working reliably.  If you do this,
sooner or later a change really will make it into a repository that
you don't directly control (or have forgotten about), and come back to
bite you.)

\subsection{You can only roll back once}

Mercurial stores exactly one transaction in its transaction log; that
transaction is the most recent one that occurred in the repository.
This means that you can only roll back one transaction.  If you expect
to be able to roll back one transaction, then its predecessor, this is
not the behaviour you will get.
\interaction{rollback.twice}
Once you've rolled back one transaction in a repository, you can't
roll back again in that repository until you perform another commit or
pull.

\section{Reverting the mistaken change}

If you make a modification to a file, and decide that you really
didn't want to change the file at all, and you haven't yet committed
your changes, the \hgcmd{revert} command is the one you'll need.
It looks at the changeset that's the parent of the working directory,
and restores the contents of the file to their state as of that
changeset.  (That's a long-winded way of saying that, in the normal
case, it undoes your modifications.)

Let's illustrate how the \hgcmd{revert} command works with yet another
small example.  We'll begin by modifying a file that Mercurial is
already tracking.
\interaction{daily.revert.modify}
If we don't want that change, we can simply \hgcmd{revert} the file.
\interaction{daily.revert.unmodify}
The \hgcmd{revert} command provides us with an extra degree of safety
by saving our modified file with a \filename{.orig} extension.
\interaction{daily.revert.status}

Here is a summary of the cases that the \hgcmd{revert} command can
deal with.  We will describe each of these in more detail in the
section that follows.
\begin{itemize}
\item If you modify a file, it will restore the file to its unmodified
  state.
\item If you \hgcmd{add} a file, it will undo the ``added'' state of
  the file, but leave the file itself untouched.
\item If you delete a file without telling Mercurial, it will restore
  the file to its unmodified contents.
\item If you use the \hgcmd{remove} command to remove a file, it will
  undo the ``removed'' state of the file, and restore the file to its
  unmodified contents.
\end{itemize}

\subsection{File management errors}
\label{sec:undo:mgmt}

The \hgcmd{revert} command is useful for more than just modified
files.
It lets you reverse the results of all of Mercurial's file management
commands: \hgcmd{add}, \hgcmd{remove}, and so on.

If you \hgcmd{add} a file, then decide that in fact you don't want
Mercurial to track it, use \hgcmd{revert} to undo the add.  Don't
worry; Mercurial will not modify the file in any way.  It will just
``unmark'' the file.
\interaction{daily.revert.add}

Similarly, if you ask Mercurial to \hgcmd{remove} a file, you can use
\hgcmd{revert} to restore it to the contents it had as of the parent
of the working directory.
\interaction{daily.revert.remove}
This works just as well for a file that you deleted by hand, without
telling Mercurial (recall that in Mercurial terminology, this kind of
file is called ``missing'').
\interaction{daily.revert.missing}

If you revert a \hgcmd{copy}, the copied-to file remains in your
working directory afterwards, untracked.  Since a copy doesn't affect
the copied-from file in any way, Mercurial doesn't do anything with
the copied-from file.
\interaction{daily.revert.copy}

\subsubsection{A slightly special case: reverting a rename}

If you \hgcmd{rename} a file, there is one small detail that you
should remember.  When you \hgcmd{revert} a rename, it's not enough to
provide the name of the renamed-to file, as the following example
shows.
\interaction{daily.revert.rename}
As you can see from the output of \hgcmd{status}, the renamed-to file
is no longer identified as added, but the renamed-\emph{from} file is
still removed!
This is counter-intuitive (at least to me), but it's easy to deal
with.
\interaction{daily.revert.rename-orig}
So remember, to revert a \hgcmd{rename}, you must provide \emph{both}
the source and destination names.

% TODO: the output doesn't look like it will be removed!

(By the way, if you rename a file, then modify the renamed-to file,
then revert both components of the rename, when Mercurial restores the
file that was removed as part of the rename, it will be unmodified.
If you need the modifications in the renamed-to file to show up in the
renamed-from file, don't forget to copy them over.)

These fiddly aspects of reverting a rename arguably constitute a small
bug in Mercurial.

\section{Dealing with committed changes}

Consider a case where you have committed a change $a$, and another
change $b$ on top of it; you then realise that change $a$ was
incorrect.  Mercurial lets you ``back out'' an entire changeset
automatically, and provides building blocks that let you reverse part
of a changeset by hand.

Before you read this section, here's something to keep in mind: the
\hgcmd{backout} command undoes changes by \emph{adding} history, not
by modifying or erasing it.  It's the right tool to use if you're
fixing bugs, but not if you're trying to undo some change that has
catastrophic consequences.  To deal with those, see
section~\ref{sec:undo:aaaiiieee}.

\subsection{Backing out a changeset}

The \hgcmd{backout} command lets you ``undo'' the effects of an entire
changeset in an automated fashion.  Because Mercurial's history is
immutable, this command \emph{does not} get rid of the changeset you
want to undo.  Instead, it creates a new changeset that
\emph{reverses} the effect of the to-be-undone changeset.

The operation of the \hgcmd{backout} command is a little intricate, so
let's illustrate it with some examples.  First, we'll create a
repository with some simple changes.
\interaction{backout.init}

The \hgcmd{backout} command takes a single changeset ID as its
argument; this is the changeset to back out.  Normally,
\hgcmd{backout} will drop you into a text editor to write a commit
message, so you can record why you're backing the change out.  In this
example, we provide a commit message on the command line using the
\hgopt{backout}{-m} option.

\subsection{Backing out the tip changeset}

We're going to start by backing out the last changeset we committed.
\interaction{backout.simple}
You can see that the second line from \filename{myfile} is no longer
present.  Taking a look at the output of \hgcmd{log} gives us an idea
of what the \hgcmd{backout} command has done.
\interaction{backout.simple.log}
Notice that the new changeset that \hgcmd{backout} has created is a
child of the changeset we backed out.  It's easier to see this in
figure~\ref{fig:undo:backout}, which presents a graphical view of the
change history.
As you can see, the history is nice and linear.

\begin{figure}[htb]
  \centering
  \grafix{undo-simple}
  \caption{Backing out a change using the \hgcmd{backout} command}
  \label{fig:undo:backout}
\end{figure}

\subsection{Backing out a non-tip change}

If you want to back out a change other than the last one you
committed, pass the \hgopt{backout}{--merge} option to the
\hgcmd{backout} command.
\interaction{backout.non-tip.clone}
This makes backing out any changeset a ``one-shot'' operation that's
usually simple and fast.
\interaction{backout.non-tip.backout}

If you take a look at the contents of \filename{myfile} after the
backout finishes, you'll see that the first and third changes are
present, but not the second.
\interaction{backout.non-tip.cat}

As the graphical history in figure~\ref{fig:undo:backout-non-tip}
illustrates, Mercurial actually commits \emph{two} changes in this
kind of situation (the box-shaped nodes are the ones that Mercurial
commits automatically).  Before Mercurial begins the backout process,
it first remembers what the current parent of the working directory
is.  It then backs out the target changeset, and commits that as a
changeset.  Finally, it merges back to the previous parent of the
working directory, and commits the result of the merge.

% TODO: to me it looks like mercurial doesn't commit the second merge automatically!

\begin{figure}[htb]
  \centering
  \grafix{undo-non-tip}
  \caption{Automated backout of a non-tip change using the \hgcmd{backout} command}
  \label{fig:undo:backout-non-tip}
\end{figure}

The result is that you end up ``back where you were'', only with some
extra history that undoes the effect of the changeset you wanted to
back out.

\subsubsection{Always use the \hgopt{backout}{--merge} option}

In fact, since the \hgopt{backout}{--merge} option will do the ``right
thing'' whether or not the changeset you're backing out is the tip
(i.e.~it won't try to merge if it's backing out the tip, since there's
no need), you should \emph{always} use this option when you run the
\hgcmd{backout} command.

\subsection{Gaining more control of the backout process}

While I've recommended that you always use the
\hgopt{backout}{--merge} option when backing out a change, the
\hgcmd{backout} command lets you decide how to merge a backout
changeset.  Taking control of the backout process by hand is something
you will rarely need to do, but it can be useful to understand what
the \hgcmd{backout} command is doing for you automatically.  To
illustrate this, let's clone our first repository, but omit the
backout change that it contains.

\interaction{backout.manual.clone}
As with our earlier example, we'll commit a third changeset, then back
out its parent, and see what happens.
\interaction{backout.manual.backout}
Our new changeset is again a descendant of the changeset we backed
out; it's thus a new head, \emph{not} a descendant of the changeset
that was the tip.  The \hgcmd{backout} command was quite explicit in
telling us this.
\interaction{backout.manual.log}

Again, it's easier to see what has happened by looking at a graph of
the revision history, in figure~\ref{fig:undo:backout-manual}.  This
makes it clear that when we use \hgcmd{backout} to back out a change
other than the tip, Mercurial adds a new head to the repository (the
change it committed is box-shaped).

\begin{figure}[htb]
  \centering
  \grafix{undo-manual}
  \caption{Backing out a change using the \hgcmd{backout} command}
  \label{fig:undo:backout-manual}
\end{figure}

After the \hgcmd{backout} command has completed, it leaves the new
``backout'' changeset as the parent of the working directory.
\interaction{backout.manual.parents}
Now we have two isolated sets of changes.
\interaction{backout.manual.heads}

Let's think about what we expect to see as the contents of
\filename{myfile} now.  The first change should be present, because
we've never backed it out.  The second change should be missing, as
that's the change we backed out.  Since the history graph shows the
third change as a separate head, we \emph{don't} expect to see the
third change present in \filename{myfile}.
\interaction{backout.manual.cat}
To get the third change back into the file, we just do a normal merge
of our two heads.
\interaction{backout.manual.merge}
Afterwards, the graphical history of our repository looks like
figure~\ref{fig:undo:backout-manual-merge}.

\begin{figure}[htb]
  \centering
  \grafix{undo-manual-merge}
  \caption{Manually merging a backout change}
  \label{fig:undo:backout-manual-merge}
\end{figure}

\subsection{Why \hgcmd{backout} works as it does}

Here's a brief description of how the \hgcmd{backout} command works.
\begin{enumerate}
\item It ensures that the working directory is ``clean'', i.e.~that
  the output of \hgcmd{status} would be empty.
\item It remembers the current parent of the working directory.  Let's
  call this changeset \texttt{orig}.
\item It does the equivalent of an \hgcmd{update} to sync the working
  directory to the changeset you want to back out.  Let's call this
  changeset \texttt{backout}.
\item It finds the parent of that changeset.  Let's call that
  changeset \texttt{parent}.
\item For each file that the \texttt{backout} changeset affected, it
  does the equivalent of a \hgcmdargs{revert}{-r parent} on that file,
  to restore it to the contents it had before that changeset was
  committed.
\item It commits the result as a new changeset.  This changeset has
  \texttt{backout} as its parent.
\item If you specify \hgopt{backout}{--merge} on the command line, it
  merges with \texttt{orig}, and commits the result of the merge.
\end{enumerate}

An alternative way to implement the \hgcmd{backout} command would be
to \hgcmd{export} the to-be-backed-out changeset as a diff, then use
the \cmdopt{patch}{--reverse} option to the \command{patch} command to
reverse the effect of the change without fiddling with the working
directory.  This sounds much simpler, but it would not work nearly as
well.

The reason that \hgcmd{backout} does an update, a commit, a merge, and
another commit is to give the merge machinery the best chance to do a
good job when dealing with all the changes \emph{between} the change
you're backing out and the current tip.

If you're backing out a changeset that's~100 revisions back in your
project's history, the chances that the \command{patch} command will
be able to apply a reverse diff cleanly are not good, because
intervening changes are likely to have ``broken the context'' that
\command{patch} uses to determine whether it can apply a patch (if
this sounds like gibberish, see section~\ref{sec:mq:patch} for a
discussion of the \command{patch} command).  Also, Mercurial's merge
machinery will handle files and directories being renamed, permission
changes, and modifications to binary files, none of which
\command{patch} can deal with.

\section{Changes that should never have been}
\label{sec:undo:aaaiiieee}

Most of the time, the \hgcmd{backout} command is exactly what you need
if you want to undo the effects of a change.
It leaves a permanent record of exactly what you did, both when
committing the original changeset and when you cleaned up after it.

On rare occasions, though, you may find that you've committed a change
that really should not be present in the repository at all.  For
example, it would be very unusual, and usually considered a mistake,
to commit a software project's object files as well as its source
files.  Object files have almost no intrinsic value, and they're
\emph{big}, so they increase the size of the repository and the amount
of time it takes to clone or pull changes.

Before I discuss the options that you have if you commit a ``brown
paper bag'' change (the kind that's so bad that you want to pull a
brown paper bag over your head), let me first discuss some approaches
that probably won't work.

Since Mercurial treats history as accumulative (every change builds
on top of all changes that preceded it), you generally can't just make
disastrous changes disappear.  The one exception is when you've just
committed a change, and it hasn't been pushed or pulled into another
repository.  That's when you can safely use the \hgcmd{rollback}
command, as I detailed in section~\ref{sec:undo:rollback}.

After you've pushed a bad change to another repository, you
\emph{could} still use \hgcmd{rollback} to make your local copy of the
change disappear, but it won't have the consequences you want.  The
change will still be present in the remote repository, so it will
reappear in your local repository the next time you pull.
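
As a rough sketch of that sequence, assuming a hypothetical shared
repository at \texttt{../shared} (not one used elsewhere in this
chapter) and assuming the bad commit is still the most recent
transaction in the local repository:
\begin{verbatim}
$ hg push ../shared   # the bad changeset is now in ../shared as well
$ hg rollback         # undoes the commit, but only in this repository
$ hg pull ../shared   # the same changeset comes straight back
\end{verbatim}
The rollback is purely local, so the copy in \texttt{../shared} is
untouched and the next pull simply fetches it again.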

If a situation like this arises, and you know which repositories your
bad change has propagated into, you can \emph{try} to get rid of the
change from \emph{every} one of those repositories.  This is, of
course, not a satisfactory solution: if you miss even a single
repository while you're expunging, the change is still ``in the
wild'', and could propagate further.

If you've committed one or more changes \emph{after} the change that
you'd like to see disappear, your options are further reduced.
Mercurial doesn't provide a way to ``punch a hole'' in history while
leaving the other changesets intact.

XXX This needs filling out.  The \texttt{hg-replay} script in the
\texttt{examples} directory works, but doesn't handle merge
changesets.  Kind of an important omission.

\subsection{Protect yourself from ``escaped'' changes}

If you've committed some changes to your local repository and they've
been pushed or pulled somewhere else, this isn't necessarily a
disaster.  You can protect yourself ahead of time against some classes
of bad changeset.  This is particularly easy if your team usually
pulls changes from a central repository.

By configuring some hooks on that repository to validate incoming
changesets (see chapter~\ref{chap:hook}), you can automatically
prevent some kinds of bad changeset from being pushed to the central
repository at all.  With such a configuration in place, some kinds of
bad changeset will naturally tend to ``die out'' because they can't
propagate into the central repository.
Better yet, this happens without any need for explicit intervention.

For instance, an incoming change hook that verifies that a changeset
will actually compile can prevent people from inadvertently ``breaking
the build''.

\section{Finding the source of a bug}
\label{sec:undo:bisect}

While it's all very well to be able to back out a changeset that
introduced a bug, this requires that you know which changeset to back
out.  Mercurial provides an invaluable command, called
\hgcmd{bisect}, that helps you to automate this process and accomplish
it very efficiently.

The idea behind the \hgcmd{bisect} command is that a changeset has
introduced some change of behaviour that you can identify with a
simple binary test.  You don't know which piece of code introduced the
change, but you know how to test for the presence of the bug.  The
\hgcmd{bisect} command uses your test to direct its search for the
changeset that introduced the code that caused the bug.

Here are a few scenarios to help you understand how you might apply
this command.
\begin{itemize}
\item The most recent version of your software has a bug that you
  remember wasn't present a few weeks ago, but you don't know when it
  was introduced.  Here, your binary test checks for the presence of
  that bug.
\item You fixed a bug in a rush, and now it's time to close the entry
  in your team's bug database.  The bug database requires a changeset
  ID when you close an entry, but you don't remember which changeset
  you fixed the bug in.
  Once again, your binary test checks for the presence of the bug.
\item Your software works correctly, but runs~15\% slower than the
  last time you measured it.  You want to know which changeset
  introduced the performance regression.  In this case, your binary
  test measures the performance of your software, to see whether it's
  ``fast'' or ``slow''.
\item The sizes of the components of your project that you ship
  exploded recently, and you suspect that something changed in the way
  you build your project.
\end{itemize}

From these examples, it should be clear that the \hgcmd{bisect}
command is useful for more than just finding the sources of bugs.  You
can use it to find any ``emergent property'' of a repository (anything
that you can't find from a simple text search of the files in the
tree) for which you can write a binary test.

We'll introduce a little bit of terminology here, just to make it
clear which parts of the search process are your responsibility, and
which are Mercurial's.  A \emph{test} is something that \emph{you} run
when \hgcmd{bisect} chooses a changeset.  A \emph{probe} is what
\hgcmd{bisect} runs to tell whether a revision is good.  Finally,
we'll use the word ``bisect'', as both a noun and a verb, to stand in
for the phrase ``search using the \hgcmd{bisect} command''.

One simple way to automate the searching process would be simply to
probe every changeset.  However, this scales poorly.
If it took ten jerojasro@343: minutes to test a single changeset, and you had 10,000 changesets in jerojasro@343: your repository, the exhaustive approach would take on average~35 jerojasro@343: \emph{days} to find the changeset that introduced a bug. Even if you jerojasro@343: knew that the bug was introduced by one of the last 500 changesets, jerojasro@343: and limited your search to those, you'd still be looking at over 40 jerojasro@343: hours to find the changeset that introduced your bug. jerojasro@343: jerojasro@343: What the \hgcmd{bisect} command does is use its knowledge of the jerojasro@343: ``shape'' of your project's revision history to perform a search in jerojasro@343: time proportional to the \emph{logarithm} of the number of changesets jerojasro@343: to check (the kind of search it performs is called a dichotomic jerojasro@343: search). With this approach, searching through 10,000 changesets will jerojasro@343: take less than three hours, even at ten minutes per test (the search jerojasro@343: will require about 14 tests). Limit your search to the last hundred jerojasro@343: changesets, and it will take only about an hour (roughly seven tests). jerojasro@343: jerojasro@343: The \hgcmd{bisect} command is aware of the ``branchy'' nature of a jerojasro@343: Mercurial project's revision history, so it has no problems dealing jerojasro@343: with branches, merges, or multiple heads in a repository. It can jerojasro@343: prune entire branches of history with a single probe, which is how it jerojasro@343: operates so efficiently. jerojasro@343: jerojasro@343: \subsection{Using the \hgcmd{bisect} command} jerojasro@343: jerojasro@343: Here's an example of \hgcmd{bisect} in action. jerojasro@343: jerojasro@343: \begin{note} jerojasro@343: In versions 0.9.5 and earlier of Mercurial, \hgcmd{bisect} was not a jerojasro@343: core command: it was distributed with Mercurial as an extension.
jerojasro@343: This section describes the built-in command, not the old extension. jerojasro@343: \end{note} jerojasro@343: jerojasro@343: Now let's create a repository, so that we can try out the jerojasro@343: \hgcmd{bisect} command in isolation. jerojasro@343: \interaction{bisect.init} jerojasro@343: We'll simulate a project that has a bug in it in a simple-minded way: jerojasro@343: create trivial changes in a loop, and nominate one specific change jerojasro@343: that will have the ``bug''. This loop creates 35 changesets, each jerojasro@343: adding a single file to the repository. We'll represent our ``bug'' jerojasro@343: with a file that contains the text ``i have a gub''. jerojasro@343: \interaction{bisect.commits} jerojasro@343: jerojasro@343: The next thing that we'd like to do is figure out how to use the jerojasro@343: \hgcmd{bisect} command. We can use Mercurial's normal built-in help jerojasro@343: mechanism for this. jerojasro@343: \interaction{bisect.help} jerojasro@343: jerojasro@343: The \hgcmd{bisect} command works in steps. Each step proceeds as follows. jerojasro@343: \begin{enumerate} jerojasro@343: \item You run your binary test. jerojasro@343: \begin{itemize} jerojasro@343: \item If the test succeeded, you tell \hgcmd{bisect} by running the jerojasro@343: \hgcmdargs{bisect}{--good} command. jerojasro@343: \item If it failed, run the \hgcmdargs{bisect}{--bad} command. jerojasro@343: \end{itemize} jerojasro@343: \item The command uses your information to decide which changeset to jerojasro@343: test next. jerojasro@343: \item It updates the working directory to that changeset, and the jerojasro@343: process begins again. jerojasro@343: \end{enumerate} jerojasro@343: The process ends when \hgcmd{bisect} identifies a unique changeset jerojasro@343: that marks the point where your test transitioned from ``succeeding'' jerojasro@343: to ``failing''.
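
As a rough check on how many times these steps repeat, here is a
back-of-the-envelope estimate. It is only a sketch under a simplifying
assumption: the history is linear, so each test rules out about half
of the remaining candidates. The symbols $n$ (number of changesets to
search) and $t$ (number of tests) are introduced here purely for
illustration.
\[
  t \approx \lceil \log_2 n \rceil, \qquad
  \lceil \log_2 10{,}000 \rceil = \lceil 13.29 \rceil = 14, \qquad
  \lceil \log_2 100 \rceil = \lceil 6.64 \rceil = 7.
\]
At ten minutes per test, that is $14 \times 10 = 140$~minutes for
10,000 changesets and $7 \times 10 = 70$~minutes for 100. By contrast,
testing changesets one at a time needs about $n/2$ tests on average,
or $5{,}000 \times 10 = 50{,}000$~minutes (roughly 35~days) for 10,000
changesets. On a history with branches and merges, the actual number
of tests may differ somewhat from this estimate.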
jerojasro@343: jerojasro@343: To start the search, we must run the \hgcmdargs{bisect}{--reset} command. jerojasro@343: \interaction{bisect.search.init} jerojasro@343: jerojasro@343: In our case, the binary test we use is simple: we check to see if any jerojasro@343: file in the repository contains the string ``i have a gub''. If it jerojasro@343: does, this changeset contains the change that ``caused the bug''. By jerojasro@343: convention, a changeset that has the property we're searching for is jerojasro@343: ``bad'', while one that doesn't is ``good''. jerojasro@343: jerojasro@343: Most of the time, the revision to which the working directory is jerojasro@343: synced (usually the tip) already exhibits the problem introduced by jerojasro@343: the buggy change, so we'll mark it as ``bad''. jerojasro@343: \interaction{bisect.search.bad-init} jerojasro@343: jerojasro@343: Our next task is to nominate a changeset that we know \emph{doesn't} jerojasro@343: have the bug; the \hgcmd{bisect} command will ``bracket'' its search jerojasro@343: between the first pair of good and bad changesets. In our case, we jerojasro@343: know that revision~10 didn't have the bug. (I'll have more words jerojasro@343: about choosing the first ``good'' changeset later.) jerojasro@343: \interaction{bisect.search.good-init} jerojasro@343: jerojasro@343: Notice that this command printed some output. jerojasro@343: \begin{itemize} jerojasro@343: \item It told us how many changesets it must consider before it can jerojasro@343: identify the one that introduced the bug, and how many tests that jerojasro@343: will require. jerojasro@343: \item It updated the working directory to the next changeset to test, jerojasro@343: and told us which changeset it's testing. jerojasro@343: \end{itemize} jerojasro@343: jerojasro@343: We now run our test in the working directory. We use the jerojasro@343: \command{grep} command to see if our ``bad'' file is present in the jerojasro@343: working directory. 
If it is, this revision is bad; if not, this jerojasro@343: revision is good. jerojasro@343: \interaction{bisect.search.step1} jerojasro@343: jerojasro@343: This test looks like a perfect candidate for automation, so let's turn jerojasro@343: it into a shell function. jerojasro@343: \interaction{bisect.search.mytest} jerojasro@343: We can now run an entire test step with a single command, jerojasro@343: \texttt{mytest}. jerojasro@343: \interaction{bisect.search.step2} jerojasro@343: A few more invocations of our canned test step command, and we're jerojasro@343: done. jerojasro@343: \interaction{bisect.search.rest} jerojasro@343: jerojasro@343: Even though we had~35 changesets to search through, the \hgcmd{bisect} jerojasro@343: command let us find the changeset that introduced our ``bug'' with jerojasro@343: only five tests. Because the number of tests that the \hgcmd{bisect} jerojasro@343: command performs grows logarithmically with the number of changesets to jerojasro@343: search, the advantage that it has over the ``brute force'' search jerojasro@343: approach increases with every changeset you add. jerojasro@343: jerojasro@343: \subsection{Cleaning up after your search} jerojasro@343: jerojasro@343: When you're finished using the \hgcmd{bisect} command in a jerojasro@343: repository, you can use the \hgcmdargs{bisect}{--reset} command to drop jerojasro@343: the information it was using to drive your search. The command jerojasro@343: doesn't use much space, so it doesn't matter if you forget to run this jerojasro@343: command. However, \hgcmd{bisect} won't let you start a new search in jerojasro@343: that repository until you do a \hgcmdargs{bisect}{--reset}.
jerojasro@343: \interaction{bisect.search.reset} jerojasro@343: jerojasro@343: \section{Tips for finding bugs effectively} jerojasro@343: jerojasro@343: \subsection{Give consistent input} jerojasro@343: jerojasro@343: The \hgcmd{bisect} command requires that you correctly report the jerojasro@343: result of every test you perform. If you tell it that a test failed jerojasro@343: when it really succeeded, it \emph{might} be able to detect the jerojasro@343: inconsistency. If it can identify an inconsistency in your reports, jerojasro@343: it will tell you that a particular changeset is both good and bad. jerojasro@343: However, it can't do this perfectly; it's about as likely to report jerojasro@343: the wrong changeset as the source of the bug. jerojasro@343: jerojasro@343: \subsection{Automate as much as possible} jerojasro@343: jerojasro@343: When I started using the \hgcmd{bisect} command, I tried a few times jerojasro@343: to run my tests by hand, on the command line. This is an approach jerojasro@343: that I, at least, am not suited to. After a few tries, I found that I jerojasro@343: was making enough mistakes that I was having to restart my searches jerojasro@343: several times before finally getting correct results. jerojasro@343: jerojasro@343: My initial problems with driving the \hgcmd{bisect} command by hand jerojasro@343: occurred even with simple searches on small repositories; if the jerojasro@343: problem you're looking for is more subtle, or the number of tests that jerojasro@343: \hgcmd{bisect} must perform increases, the likelihood of operator jerojasro@343: error ruining the search is much higher. Once I started automating my jerojasro@343: tests, I had much better results. jerojasro@343: jerojasro@343: The key to automated testing is twofold: jerojasro@343: \begin{itemize} jerojasro@343: \item always test for the same symptom, and jerojasro@343: \item always feed consistent input to the \hgcmd{bisect} command. 
jerojasro@343: \end{itemize} jerojasro@343: In my tutorial example above, the \command{grep} command tests for the jerojasro@343: symptom, and the \texttt{if} statement takes the result of this check jerojasro@343: and ensures that we always feed the same input to the \hgcmd{bisect} jerojasro@343: command. The \texttt{mytest} function marries these together in a jerojasro@343: reproducible way, so that every test is uniform and consistent. jerojasro@343: jerojasro@343: \subsection{Check your results} jerojasro@343: jerojasro@343: Because the output of a \hgcmd{bisect} search is only as good as the jerojasro@343: input you give it, don't take the changeset it reports as the jerojasro@343: absolute truth. A simple way to cross-check its report is to manually jerojasro@343: run your test at each of the following changesets: jerojasro@343: \begin{itemize} jerojasro@343: \item The changeset that it reports as the first bad revision. Your jerojasro@343: test should still report this as bad. jerojasro@343: \item The parent of that changeset (either parent, if it's a merge). jerojasro@343: Your test should report this changeset as good. jerojasro@343: \item A child of that changeset. Your test should report this jerojasro@343: changeset as bad. jerojasro@343: \end{itemize} jerojasro@343: jerojasro@343: \subsection{Beware interference between bugs} jerojasro@343: jerojasro@343: It's possible that your search for one bug could be disrupted by the jerojasro@343: presence of another. For example, let's say your software crashes at jerojasro@343: revision 100, and worked correctly at revision 50. Unknown to you, jerojasro@343: someone else introduced a different crashing bug at revision 60, and jerojasro@343: fixed it at revision 80. This could distort your results in one of jerojasro@343: several ways. 
jerojasro@343: jerojasro@343: It is possible that this other bug completely ``masks'' yours, which jerojasro@343: is to say that it occurs before your bug has a chance to manifest jerojasro@343: itself. If you can't avoid that other bug (for example, it prevents jerojasro@343: your project from building), and so can't tell whether your bug is jerojasro@343: present in a particular changeset, the \hgcmd{bisect} command cannot jerojasro@343: help you directly. Instead, you can mark a changeset as untested by jerojasro@343: running \hgcmdargs{bisect}{--skip}. jerojasro@343: jerojasro@343: A different problem could arise if your test for a bug's presence is jerojasro@343: not specific enough. If you check for ``my program crashes'', then jerojasro@343: both your crashing bug and an unrelated crashing bug that masks it jerojasro@343: will look like the same thing, and mislead \hgcmd{bisect}. jerojasro@343: jerojasro@343: Another useful situation in which to use \hgcmdargs{bisect}{--skip} is jerojasro@343: if you can't test a revision because your project was in a broken and jerojasro@343: hence untestable state at that revision, perhaps because someone jerojasro@343: checked in a change that prevented the project from building. jerojasro@343: jerojasro@343: \subsection{Bracket your search lazily} jerojasro@343: jerojasro@343: Choosing the first ``good'' and ``bad'' changesets that will mark the jerojasro@343: end points of your search is often easy, but it bears a little jerojasro@343: discussion nevertheless. From the perspective of \hgcmd{bisect}, the jerojasro@343: ``newest'' changeset is conventionally ``bad'', and the older jerojasro@343: changeset is ``good''. jerojasro@343: jerojasro@343: If you're having trouble remembering when a suitable ``good'' change jerojasro@343: was, so that you can tell \hgcmd{bisect}, you could do worse than jerojasro@343: testing changesets at random. 
Just remember to eliminate contenders jerojasro@343: that can't possibly exhibit the bug (perhaps because the feature with jerojasro@343: the bug isn't present yet) and those where another problem masks the jerojasro@343: bug (as I discussed above). jerojasro@343: jerojasro@343: Even if you end up ``early'' by thousands of changesets or months of jerojasro@343: history, you will only add a handful of tests to the total number that jerojasro@343: \hgcmd{bisect} must perform, thanks to its logarithmic behaviour. jerojasro@343: jerojasro@343: %%% Local Variables: jerojasro@343: %%% mode: latex jerojasro@343: %%% TeX-master: "00book" jerojasro@343: %%% End: