bos@34: \chapter{Handling repository events with hooks}
bos@34: \label{chap:hook}
bos@34: 
bos@34: Mercurial offers a powerful mechanism to let you perform automated
bos@34: actions in response to events that occur in a repository.  In some
bos@34: cases, you can even control Mercurial's response to those events.
bos@34: 
bos@34: The name Mercurial uses for one of these actions is a \emph{hook}.
bos@34: Hooks are called ``triggers'' in some revision control systems, but
bos@34: the two names refer to the same idea.
bos@34: 
bos@38: \section{An overview of hooks in Mercurial}
bos@38: 
bos@38: Here is a brief list of the hooks that Mercurial supports. For each
bos@38: hook, we indicate when it is run, and a few examples of common tasks
bos@38: you can use it for.  We will revisit each of these hooks in more
bos@38: detail later.
bos@38: \begin{itemize}
bos@38: \item[\small\hook{changegroup}] This is run after a group of
bos@38:   changesets has been brought into the repository from elsewhere.  In
bos@38:   other words, it is run after a \hgcmd{pull} or \hgcmd{push} into a
bos@38:   repository, but not after a \hgcmd{commit}.  You can use this for
bos@38:   performing an action once for the entire group of newly arrived
bos@38:   changesets.  For example, you could use this hook to send out email
bos@38:   notifications, or kick off an automated build or test.
bos@38: \item[\small\hook{commit}] This is run after a new changeset has been
bos@38:   created in the local repository, typically using the \hgcmd{commit}
bos@38:   command.
bos@38: \item[\small\hook{incoming}] This is run once for each new changeset
bos@38:   that is brought into the repository from elsewhere.  Notice the
bos@38:   difference from \hook{changegroup}, which is run once per
bos@38:   \emph{group} of changesets brought in.  You can use this for the
bos@38:   same purposes as the \hook{changegroup} hook; it's simply more
bos@38:   convenient sometimes to run a hook once per group of changesets,
bos@38:   while other times it's handier once per changeset.
bos@38: \item[\small\hook{outgoing}] This is run after a group of changesets
bos@38:   has been transmitted from this repository to another.  You can use
bos@38:   this, for example, to notify subscribers every time changes are
bos@38:   cloned or pulled from the repository.
bos@38: \item[\small\hook{prechangegroup}] This is run before starting to
bos@38:   bring a group of changesets into the repository.  It cannot see the
bos@38:   actual changesets, because they have not yet been transmitted.  If
bos@38:   it fails, the changesets will not be transmitted.  You can use this
bos@38:   hook to ``lock down'' a repository against incoming changes.
bos@38: \item[\small\hook{precommit}] This is run before starting a commit.
bos@38:   It cannot tell what files are included in the commit, or any other
bos@38:   information about the commit.  If it fails, the commit will not be
bos@38:   allowed to start.  You can use this to perform a build and require
bos@38:   it to complete successfully before a commit can proceed, or
bos@38:   automatically enforce a requirement that modified files pass your
bos@38:   coding style guidelines.
bos@38: \item[\small\hook{preoutgoing}] This is run before starting to
bos@38:   transmit a group of changesets from this repository.  You can use
bos@38:   this to lock a repository against clones or pulls from remote
bos@38:   clients.
bos@38: \item[\small\hook{pretag}] This is run before creating a tag.  If it
bos@38:   fails, the tag will not be created.  You can use this to enforce a
bos@38:   uniform tag naming convention.
bos@38: \item[\small\hook{pretxnchangegroup}] This is run after a group of
bos@38:   changesets has been brought into the local repository from another,
bos@38:   but before the transaction completes that will make the changes
bos@38:   permanent in the repository.  If it fails, the transaction will be
bos@38:   rolled back and the changes will disappear from the local
bos@38:   repository.  You can use this to automatically check newly arrived
bos@38:   changes and, for example, roll them back if the group as a whole
bos@38:   does not build or pass your test suite.
bos@38: \item[\small\hook{pretxncommit}] This is run after a new changeset has
bos@38:   been created in the local repository, but before the transaction
bos@38:   completes that will make it permanent.  Unlike the \hook{precommit}
bos@38:   hook, this hook can see which changes are present in the changeset,
bos@38:   and it can also see all other changeset metadata, such as the commit
bos@38:   message.  You can use this to require that a commit message follows
bos@38:   your local conventions, or that a changeset builds cleanly.
bos@38: \item[\small\hook{preupdate}] This is run before starting an update or
bos@38:   merge of the working directory.
bos@38: \item[\small\hook{tag}] This is run after a tag is created.
bos@38: \item[\small\hook{update}] This is run after an update or merge of the
bos@38:   working directory has finished.
bos@38: \end{itemize}
bos@38: Each of the hooks with a ``\texttt{pre}'' prefix has the ability to
bos@38: \emph{control} an activity.  If the hook succeeds, the activity may
bos@38: proceed; if it fails, the activity is either not permitted or undone,
bos@38: depending on the hook.
bos@38: 
bos@38: \section{Hooks and security}
bos@38: 
bos@38: \subsection{Hooks are run with your privileges}
bos@38: 
bos@38: When you run a Mercurial command in a repository, and the command
bos@38: causes a hook to run, that hook runs on your system, under your user
bos@38: account, with your privilege level.  Since hooks are arbitrary pieces
bos@38: of executable code, you should treat them with an appropriate level of
bos@38: suspicion.  Do not install a hook unless you are confident that you
bos@38: know who created it and what it does.
bos@38: 
bos@38: In some cases, you may be exposed to hooks that you did not install
bos@38: yourself.  If you work with Mercurial on an unfamiliar system,
bos@38: Mercurial will run hooks defined in that system's global \hgrc\ file.
bos@38: 
bos@38: If you are working with a repository owned by another user, Mercurial
bos@38: will run hooks defined in that repository.  For example, if you
bos@38: \hgcmd{pull} from that repository, and its \sfilename{.hg/hgrc}
bos@38: defines a local \hook{outgoing} hook, that hook will run under your
bos@38: user account, even though you don't own that repository.
bos@38: 
bos@38: \begin{note}
bos@38:   This only applies if you are pulling from a repository on a local or
bos@38:   network filesystem.  If you're pulling over http or ssh, any
bos@38:   \hook{outgoing} hook will run under the account of the server
bos@38:   process, on the server.
bos@38: \end{note}
bos@38: 
bos@38: To see what hooks are defined in a repository, use the
bos@38: \hgcmdargs{config}{hooks} command.  If you are working in one
bos@38: repository, but talking to another that you do not own (e.g.~using
bos@38: \hgcmd{pull} or \hgcmd{incoming}), remember that it is the other
bos@38: repository's hooks you should be checking, not your own.
bos@38: 
bos@38: \subsection{Hooks do not propagate}
bos@38: 
bos@38: In Mercurial, hooks are not revision controlled, and do not propagate
bos@38: when you clone, or pull from, a repository.  The reason for this is
bos@38: simple: a hook is a completely arbitrary piece of executable code.  It
bos@38: runs under your user identity, with your privilege level, on your
bos@38: machine.
bos@38: 
bos@38: It would be extremely reckless for any distributed revision control
bos@38: system to implement revision-controlled hooks, as this would offer an
bos@38: easily exploitable way to subvert the accounts of users of the
bos@38: revision control system.
bos@38: 
bos@38: Since Mercurial does not propagate hooks, if you are collaborating
bos@38: with other people on a common project, you should not assume that they
bos@38: are using the same Mercurial hooks as you are, or that theirs are
bos@38: correctly configured.  You should document the hooks you expect people
bos@38: to use.
bos@38: 
bos@38: In a corporate intranet, this is somewhat easier to control, as you
bos@38: can for example provide a ``standard'' installation of Mercurial on an
bos@38: NFS filesystem, and use a site-wide \hgrc\ file to define hooks that
bos@38: all users will see.  However, this too has its limits; see below.
bos@38: 
bos@38: \subsection{Hooks can be overridden}
bos@38: 
bos@38: Mercurial allows you to override a hook definition by redefining the
bos@38: hook.  You can disable it by setting its value to the empty string, or
bos@38: change its behaviour as you wish.
bos@38: 
bos@38: If you deploy a system-~or site-wide \hgrc\ file that defines some
bos@38: hooks, you should thus understand that your users can disable or
bos@38: override those hooks.
bos@38: 
bos@38: \subsection{Ensuring that critical hooks are run}
bos@38: 
bos@38: Sometimes you may want to enforce a policy that you do not want others
bos@38: to be able to work around.  For example, you may have a requirement
bos@38: that every changeset must pass a rigorous set of tests.  Defining this
bos@38: requirement via a hook in a site-wide \hgrc\ won't work for remote
bos@38: users on laptops, and of course local users can subvert it at will by
bos@38: overriding the hook.
bos@38: 
bos@38: Instead, you can set up your policies for use of Mercurial so that
bos@38: people are expected to propagate changes through a well-known
bos@38: ``canonical'' server that you have locked down and configured
bos@38: appropriately.
bos@38: 
bos@38: One way to do this is via a combination of social engineering and
bos@38: technology.  Set up a restricted-access account; users can push
bos@38: changes over the network to repositories managed by this account, but
bos@38: they cannot log into the account and run normal shell commands.  In
bos@38: this scenario, a user can commit a changeset that contains any old
bos@38: garbage they want.
bos@38: 
bos@38: When someone pushes a changeset to the server that everyone pulls
bos@38: from, the server will test the changeset before it accepts it as
bos@38: permanent, and reject it if it fails to pass the test suite.  If
bos@38: people only pull changes from this filtering server, it will serve to
bos@38: ensure that all changes that people pull have been automatically
bos@38: vetted.
bos@38: 
bos@34: \section{A short tutorial on using hooks}
bos@34: \label{sec:hook:simple}
bos@34: 
bos@34: It is easy to write a Mercurial hook.  Let's start with a hook that
bos@34: runs when you finish a \hgcmd{commit}, and simply prints the hash of
bos@34: the changeset you just created.  The hook is called \hook{commit}.
bos@34: 
bos@34: \begin{figure}[ht]
bos@34:   \interaction{hook.simple.init}
bos@34:   \caption{A simple hook that runs when a changeset is committed}
bos@34:   \label{ex:hook:init}
bos@34: \end{figure}
bos@34: 
bos@34: All hooks follow the pattern in example~\ref{ex:hook:init}.  You add
bos@34: an entry to the \rcsection{hooks} section of your \hgrc.  On the left
bos@34: is the name of the event to trigger on; on the right is the action to
bos@34: take.  As you can see, you can run an arbitrary shell command in a
bos@34: hook.  Mercurial passes extra information to the hook using
bos@34: environment variables (look for \envar{HG\_NODE} in the example).
bos@34: 
bos@34: \subsection{Performing multiple actions per event}
bos@34: 
bos@34: Quite often, you will want to define more than one hook for a
bos@34: particular kind of event, as shown in example~\ref{ex:hook:ext}.
bos@34: Mercurial lets you do this by adding an \emph{extension} to the end of
bos@34: a hook's name.  You extend a hook's name by giving the name of the
bos@34: hook, followed by a full stop (the ``\texttt{.}'' character), followed
bos@34: by some more text of your choosing.  For example, Mercurial will run
bos@34: both \texttt{commit.foo} and \texttt{commit.bar} when the
bos@34: \texttt{commit} event occurs.
bos@34: 
bos@34: \begin{figure}[ht]
bos@34:   \interaction{hook.simple.ext}
bos@34:   \caption{Defining a second \hook{commit} hook}
bos@34:   \label{ex:hook:ext}
bos@34: \end{figure}
bos@34: 
bos@34: To give a well-defined order of execution when there are multiple
bos@34: hooks defined for an event, Mercurial sorts hooks by extension, and
bos@34: executes the hook commands in this sorted order.  In the above
bos@34: example, it will execute \texttt{commit.bar} before
bos@34: \texttt{commit.foo}, and \texttt{commit} before both.
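
The ordering rule described above behaves like a plain string sort.
The following Python fragment is only an illustration of that rule,
not Mercurial's own code; the helper name \texttt{execution\_order} is
invented for this sketch.

```python
def execution_order(hook_names, event):
    """Return the hooks defined for `event`, in the sorted order described above."""
    matching = [name for name in hook_names
                if name == event or name.startswith(event + ".")]
    # A plain string sort puts the bare event name first, then the
    # extended names in alphabetical order of their extensions.
    return sorted(matching)


print(execution_order(["commit.foo", "commit.bar", "commit", "update"], "commit"))
```

The bare \texttt{commit} entry sorts first because it is a prefix of
the other two names.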
bos@34: 
bos@34: It is a good idea to use a somewhat descriptive extension when you
bos@34: define a new hook.  This will help you to remember what the hook was
bos@34: for.  If the hook fails, you'll get an error message that contains the
bos@34: hook name and extension, so using a descriptive extension could give
bos@34: you an immediate hint as to why the hook failed (see
bos@34: section~\ref{sec:hook:perm} for an example).
bos@34: 
bos@34: \subsection{Controlling whether an activity can proceed}
bos@34: \label{sec:hook:perm}
bos@34: 
bos@34: In our earlier examples, we used the \hook{commit} hook, which is
bos@34: run after a commit has completed.  This is one of several Mercurial
bos@34: hooks that run after an activity finishes.  Such hooks have no way of
bos@34: influencing the activity itself.
bos@34: 
bos@34: Mercurial defines a number of events that occur before an activity
bos@34: starts; or after it starts, but before it finishes.  Hooks that
bos@34: trigger on these events have the added ability to choose whether the
bos@34: activity can continue, or will abort.  
bos@34: 
bos@34: The \hook{pretxncommit} hook runs after a commit has all but
bos@34: completed.  In other words, the metadata representing the changeset
bos@34: has been written out to disk, but the transaction has not yet been
bos@34: allowed to complete.  The \hook{pretxncommit} hook has the ability to
bos@34: decide whether the transaction can complete, or must be rolled back.
bos@34: 
bos@34: If the \hook{pretxncommit} hook exits with a status code of zero, the
bos@34: transaction is allowed to complete; the commit finishes; and the
bos@34: \hook{commit} hook is run.  If the \hook{pretxncommit} hook exits with
bos@34: a non-zero status code, the transaction is rolled back; the metadata
bos@34: representing the changeset is erased; and the \hook{commit} hook is
bos@34: not run.
bos@34: 
bos@34: \begin{figure}[ht]
bos@34:   \interaction{hook.simple.pretxncommit}
bos@34:   \caption{Using the \hook{pretxncommit} hook to control commits}
bos@34:   \label{ex:hook:pretxncommit}
bos@34: \end{figure}
bos@34: 
bos@34: The hook in example~\ref{ex:hook:pretxncommit} checks that a commit
bos@34: comment contains a bug ID.  If it does, the commit can complete.  If
bos@34: not, the commit is rolled back.
bos@34: 
bos@37: \section{Writing your own hooks}
bos@37: 
bos@37: When you are writing a hook, you might find it useful to run Mercurial
bos@37: either with the \hggopt{-v} option, or the \rcitem{ui}{verbose} config
bos@37: item set to ``true''.  When you do so, Mercurial will print a message
bos@37: before it calls each hook.
bos@37: 
bos@37: \subsection{Choosing how your hook should run}
bos@37: \label{sec:hook:lang}
bos@34: 
bos@34: You can write a hook either as a normal program---typically a shell
bos@37: script---or as a Python function that is executed within the Mercurial
bos@34: process.
bos@34: 
bos@34: Writing a hook as an external program has the advantage that it
bos@34: requires no knowledge of Mercurial's internals.  You can call normal
bos@34: Mercurial commands to get any added information you need.  The
bos@34: trade-off is that external hooks are slower than in-process hooks.
bos@34: 
bos@34: An in-process Python hook has complete access to the Mercurial API,
bos@34: and does not ``shell out'' to another process, so it is inherently
bos@34: faster than an external hook.  It is also easier to obtain much of the
bos@34: information that a hook requires by using the Mercurial API than by
bos@34: running Mercurial commands.
bos@34: 
bos@34: If you are comfortable with Python, or require high performance,
bos@34: writing your hooks in Python may be a good choice.  However, when you
bos@34: have a straightforward hook to write and you don't need to care about
bos@34: performance (probably the majority of hooks), a shell script is
bos@34: perfectly fine.
bos@34: 
bos@37: \subsection{Hook parameters}
bos@34: \label{sec:hook:param}
bos@34: 
bos@34: Mercurial calls each hook with a set of well-defined parameters.  In
bos@34: Python, a parameter is passed as a keyword argument to your hook
bos@34: function.  For an external program, a parameter is passed as an
bos@34: environment variable.
bos@34: 
bos@34: Whether your hook is written in Python or as a shell script, the
bos@37: hook-specific parameter names and values will be the same.  A boolean
bos@37: parameter will be represented as a boolean value in Python, but as the
bos@37: number 1 (for ``true'') or 0 (for ``false'') as an environment
bos@37: variable for an external hook.  If a hook parameter is named
bos@37: \texttt{foo}, the keyword argument for a Python hook will also be
bos@37: named \texttt{foo}, while the environment variable for an
bos@37: external hook will be named \texttt{HG\_FOO}.
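
To make the naming convention concrete, here is a small sketch of the
translation from hook parameters to the environment variables an
external hook would see.  The function \texttt{to\_environment} is a
hypothetical helper written for this illustration; it is not part of
Mercurial.

```python
def to_environment(params):
    """Map hook parameters to the HG_* strings an external hook would see."""
    env = {}
    for name, value in params.items():
        if isinstance(value, bool):
            # Booleans become the strings "1" and "0".
            value = "1" if value else "0"
        # Upper-case the parameter name and add the HG_ prefix.
        env["HG_" + name.upper()] = str(value)
    return env
```

Given a parameter named \texttt{foo}, this produces an entry named
\texttt{HG\_FOO}, matching the rule above.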
bos@37: 
bos@37: \subsection{Hook return values and activity control}
bos@37: 
bos@37: A hook that executes successfully must exit with a status of zero if
bos@37: external, or return boolean ``false'' if in-process.  Failure is
bos@37: indicated with a non-zero exit status from an external hook, or an
bos@37: in-process hook returning boolean ``true''.  If an in-process hook
bos@37: raises an exception, the hook is considered to have failed.
bos@37: 
bos@37: For a hook that controls whether an activity can proceed, zero/false
bos@37: means ``allow'', while non-zero/true/exception means ``deny''.
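
As a sketch of how these return values work in practice, here is a
possible in-process hook for the \hook{pretxncommit} event that denies
a commit whose message lacks a bug ID.  It assumes a version of
Mercurial in which \texttt{repo[node]} yields a changeset object with
a \texttt{description()} method, and in which strings are native
Python strings; the hook name and the pattern are our own choices.

```python
import re

# Hypothetical convention: the word "bug" followed by a number.
BUG_ID = re.compile(r"\bbug\s*\d+", re.IGNORECASE)


def require_bug_id(ui, repo, node=None, **kwargs):
    """Deny the commit unless its message mentions a bug ID."""
    message = repo[node].description()
    if BUG_ID.search(message):
        return False   # false: the hook succeeded, so the commit may proceed
    ui.warn("commit message must mention a bug ID\n")
    return True        # true: the hook failed, so the commit is denied
```

If the call to \texttt{description()} were to raise an exception, the
hook would also count as failed, and the commit would be denied.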
bos@37: 
bos@37: \subsection{Writing an external hook}
bos@37: 
bos@37: When you define an external hook in your \hgrc\ and the hook is run,
bos@37: its value is passed to your shell, which interprets it.  This means
bos@37: that you can use normal shell constructs in the body of the hook.
bos@37: 
bos@37: An executable hook is always run with its current directory set to a
bos@37: repository's root directory.
bos@37: 
bos@37: Each hook parameter is passed in as an environment variable; the name
bos@37: is upper-cased, and prefixed with the string ``\texttt{HG\_}''.
bos@37: 
bos@37: With the exception of hook parameters, Mercurial does not set or
bos@37: modify any environment variables when running a hook.  This is useful
bos@37: to remember if you are writing a site-wide hook that may be run by a
bos@37: number of different users with differing environment variables set.
bos@37: In multi-user situations, you should not rely on environment variables
bos@37: being set to the values you have in your environment when testing the
bos@37: hook.
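
An external hook need not be a shell script.  The following sketch
shows the core of an external hook written as a standalone Python
program.  It touches nothing but the environment, so it needs no
knowledge of Mercurial's internals.  The function name \texttt{main}
and the messages are our own; only the \envar{HG\_NODE} variable comes
from Mercurial.

```python
import os
import sys


def main(environ=os.environ, out=sys.stdout):
    """Report the changeset named by HG_NODE and return an exit status."""
    node = environ.get("HG_NODE", "")
    if not node:
        out.write("no changeset ID was passed to the hook\n")
        return 1   # non-zero status: the hook failed
    out.write("hook saw changeset %s\n" % node)
    return 0       # zero status: the hook succeeded

# A real script would finish by calling sys.exit(main()).
```

Because the function reads only the variables that Mercurial sets, it
does not depend on whatever else happens to be in a given user's
environment.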
bos@37: 
bos@37: \subsection{Telling Mercurial to use an in-process hook}
bos@37: 
bos@37: The \hgrc\ syntax for defining an in-process hook is slightly
bos@37: different than for an executable hook.  The value of the hook must
bos@37: start with the text ``\texttt{python:}'', and continue with the
bos@37: fully-qualified name of a callable object to use as the hook's value.
bos@37: 
bos@37: The module in which a hook lives is automatically imported when a hook
bos@37: is run.  So long as you have the module name and \envar{PYTHONPATH}
bos@37: right, it should ``just work''.
bos@37: 
bos@37: The following \hgrc\ snippet shows this syntax in use.
bos@37: \begin{codesample2}
bos@37:   [hooks]
bos@37:   commit.example = python:mymodule.submodule.myhook
bos@37: \end{codesample2}
bos@37: When Mercurial runs the \texttt{commit.example} hook, it imports
bos@37: \texttt{mymodule.submodule}, looks for the callable object named
bos@37: \texttt{myhook}, and calls it.
bos@37: 
bos@37: \subsection{Writing an in-process hook}
bos@37: 
bos@37: The simplest in-process hook does nothing, but illustrates the basic
bos@37: shape of the hook API:
bos@37: \begin{codesample2}
bos@37:   def myhook(ui, repo, **kwargs):
bos@37:       pass
bos@37: \end{codesample2}
bos@37: The first argument to a Python hook is always a
bos@37: \pymodclass{mercurial.ui}{ui} object.  The second is a repository object;
bos@37: at the moment, it is always an instance of
bos@37: \pymodclass{mercurial.localrepo}{localrepository}.  Following these two
bos@37: arguments are other keyword arguments.  Which ones are passed in
bos@37: depends on the hook being called, but a hook can ignore arguments it
bos@37: doesn't care about by dropping them into a keyword argument dict, as
bos@37: with \texttt{**kwargs} above.
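
A slightly more useful hook picks out the one keyword argument it
cares about and ignores the rest.  This sketch assumes the hook is
attached to an event that passes a \texttt{node} parameter, such as
\hook{commit}, and that \texttt{ui.write} accepts a native Python
string, as it did in the Mercurial of this book's era.

```python
def myhook(ui, repo, node=None, **kwargs):
    """Print the ID of the changeset that triggered the hook."""
    # Any other parameters Mercurial passes end up in kwargs and are ignored.
    ui.write("hook called for changeset %s\n" % node)
    return False   # a false value tells Mercurial the hook succeeded
```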
bos@34: 
bos@39: \section{Hook reference}
bos@39: 
bos@39: 
bos@39: \subsection{In-process hook execution}
bos@39: 
bos@39: An in-process hook is called with arguments of the following form:
bos@39: \begin{codesample2}
bos@39:   def myhook(ui, repo, **kwargs):
bos@39:       pass
bos@39: \end{codesample2}
bos@39: The \texttt{ui} parameter is a \pymodclass{mercurial.ui}{ui} object.
bos@39: The \texttt{repo} parameter is a
bos@39: \pymodclass{mercurial.localrepo}{localrepository} object.  The
bos@39: names and values of the \texttt{**kwargs} parameters depend on the
bos@39: hook being invoked, with the following common features:
bos@39: \begin{itemize}
bos@39: \item If a parameter is named \texttt{node} or
bos@39:   \texttt{parent\emph{N}}, it will contain a hexadecimal changeset ID.
bos@39:   The empty string is used to represent ``null changeset ID'' instead
bos@39:   of a string of zeroes.
bos@39: \item Boolean-valued parameters are represented as Python
bos@39:   \texttt{bool} objects.
bos@39: \end{itemize}
bos@39: 
bos@39: An in-process hook is called without a change to the process's working
bos@39: directory (unlike external hooks, which are run in the root of the
bos@39: repository).  It must not change the process's working directory.  If
bos@39: it were to do so, it would probably cause calls to the Mercurial API,
bos@39: or operations after the hook finishes, to fail.
bos@39: 
bos@39: If a hook returns a boolean ``false'' value, it is considered to
bos@39: have succeeded.  If it returns a boolean ``true'' value or raises an
bos@39: exception, it is considered to have failed.
bos@39: 
bos@39: \subsection{External hook execution}
bos@39: 
bos@39: An external hook is passed to the user's shell for execution, so
bos@39: features of that shell, such as variable substitution and command
bos@39: redirection, are available.  The hook is run in the root directory of
bos@39: the repository.
bos@39: 
bos@39: Hook parameters are passed to the hook as environment variables.  Each
bos@39: parameter's name is converted to upper case and prefixed
bos@39: with the string ``\texttt{HG\_}''.  For example, if the name of a
bos@39: parameter is ``\texttt{node}'', the name of the environment variable
bos@39: representing that parameter will be ``\texttt{HG\_NODE}''.
bos@39: 
bos@39: A boolean parameter is represented as the string ``\texttt{1}'' for
bos@39: ``true'', ``\texttt{0}'' for ``false''.  If an environment variable is
bos@39: named \envar{HG\_NODE}, \envar{HG\_PARENT1} or \envar{HG\_PARENT2}, it
bos@39: contains a changeset ID represented as a hexadecimal string.  The
bos@39: empty string is used to represent ``null changeset ID'' instead of a
bos@39: string of zeroes.
bos@39: 
bos@39: If a hook exits with a status of zero, it is considered to have
bos@39: succeeded.  If it exits with a non-zero status, it is considered to
bos@39: have failed.
bos@39: 
bos@39: \subsection{The \hook{changegroup} hook}
bos@39: \label{sec:hook:changegroup}
bos@39: 
bos@39: Parameters to this hook:
bos@39: \begin{itemize}
bos@39: \item[\texttt{node}] The changeset ID of the first changeset in the
bos@39:   group that was added.  All changesets between this and
bos@39:   \index{tags!\texttt{tip}}\texttt{tip}, inclusive, were added by a
bos@39:   single \hgcmd{pull}, \hgcmd{push} or \hgcmd{unbundle}.
bos@39: \end{itemize}
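
Since \texttt{node} identifies only the first changeset of the group,
a hook that wants to see every new changeset has to walk from there to
\texttt{tip}.  The following in-process sketch shows one way to do
that.  It assumes a version of Mercurial in which \texttt{repo[x]}
returns a changeset object with \texttt{rev()} and \texttt{hex()}
methods and \texttt{len(repo)} gives the number of revisions; the hook
name is our own.

```python
def list_new_changesets(ui, repo, node=None, **kwargs):
    """Report every changeset from `node` up to tip, inclusive."""
    first = repo[node].rev()   # revision number of the first new changeset
    # len(repo) is one past tip's revision number, so the range includes tip.
    for rev in range(first, len(repo)):
        ui.write("added changeset %s\n" % repo[rev].hex())
    return False               # report success
```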
bos@39: 
bos@39: 
bos@39: \subsection{The \hook{commit} hook}
bos@39: \label{sec:hook:commit}
bos@39: 
bos@39: Parameters to this hook:
bos@39: \begin{itemize}
bos@39: \item[\texttt{node}] The changeset ID of the newly committed
bos@39:   changeset.
bos@39: \item[\texttt{parent1}] The changeset ID of the first parent of the
bos@39:   newly committed changeset.
bos@39: \item[\texttt{parent2}] The changeset ID of the second parent of the
bos@39:   newly committed changeset.
bos@39: \end{itemize}
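
Because a null changeset ID is passed as the empty string, a hook can
tell a merge from an ordinary commit simply by testing whether
\texttt{parent2} is empty.  The hook below is a sketch with an
invented name and invented messages; it relies only on the parameters
listed above and on \texttt{ui.write} accepting a native Python
string.

```python
def note_merge(ui, repo, node=None, parent1=None, parent2=None, **kwargs):
    """Say whether the newly committed changeset is a merge."""
    if parent2:
        # A non-empty second parent means the changeset has two parents.
        ui.write("%s merges %s and %s\n" % (node, parent1, parent2))
    else:
        ui.write("%s is an ordinary commit on top of %s\n" % (node, parent1))
    return False   # this hook never blocks anything
```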
bos@34: 
bos@34: %%% Local Variables: 
bos@34: %%% mode: latex
bos@34: %%% TeX-master: "00book"
bos@34: %%% End: