hgbook: en/hook.tex annotate

annotate en/hook.tex @ 38:b49a7dd4e564

More content for hook chapter.
Overview of hooks.
Description of hook security implications.

author	Bryan O'Sullivan <bos@serpentine.com>
date	Wed Jul 19 00:06:21 2006 -0700 (2006-07-19)
parents	9fd0c59b009a
children	576fef93bb49

rev	line source
bos@34	1 \chapter{Handling repository events with hooks}
bos@34	2 \label{chap:hook}
bos@34	3
bos@34	4 Mercurial offers a powerful mechanism to let you perform automated
bos@34	5 actions in response to events that occur in a repository. In some
bos@34	6 cases, you can even control Mercurial's response to those events.
bos@34	7
bos@34	8 The name Mercurial uses for one of these actions is a \emph{hook}.
bos@34	9 Hooks are called ``triggers'' in some revision control systems, but
bos@34	10 the two names refer to the same idea.
bos@34	11
bos@38	12 \section{An overview of hooks in Mercurial}
bos@38	13
bos@38	14 Here is a brief list of the hooks that Mercurial supports. For each
bos@38	15 hook, we indicate when it is run, and a few examples of common tasks
bos@38	16 you can use it for. We will revisit each of these hooks in more
bos@38	17 detail later.
bos@38	18 \begin{itemize}
bos@38	19 \item[\small\hook{changegroup}] This is run after a group of
bos@38	20 changesets has been brought into the repository from elsewhere. In
bos@38	21 other words, it is run after a \hgcmd{pull} or \hgcmd{push} into a
bos@38	22 repository, but not after a \hgcmd{commit}. You can use this for
bos@38	23 performing an action once for the entire group of newly arrived
bos@38	24 changesets. For example, you could use this hook to send out email
bos@38	25 notifications, or kick off an automated build or test.
bos@38	26 \item[\small\hook{commit}] This is run after a new changeset has been
bos@38	27 created in the local repository, typically using the \hgcmd{commit}
bos@38	28 command.
bos@38	29 \item[\small\hook{incoming}] This is run once for each new changeset
bos@38	30 that is brought into the repository from elsewhere. Notice the
bos@38	31 difference from \hook{changegroup}, which is run once per
bos@38	32 \emph{group} of changesets brought in. You can use this for the
bos@38	33 same purposes as the \hook{changegroup} hook; it's simply more
bos@38	34 convenient sometimes to run a hook once per group of changesets,
bos@38	35 while other times it's handier once per changeset.
bos@38	36 \item[\small\hook{outgoing}] This is run after a group of changesets
bos@38	37 has been transmitted from this repository to another. You can use
bos@38	38 this, for example, to notify subscribers every time changes are
bos@38	39 cloned or pulled from the repository.
bos@38	40 \item[\small\hook{prechangegroup}] This is run before starting to
bos@38	41 bring a group of changesets into the repository. It cannot see the
bos@38	42 actual changesets, because they have not yet been transmitted. If
bos@38	43 it fails, the changesets will not be transmitted. You can use this
bos@38	44 hook to ``lock down'' a repository against incoming changes.
bos@38	45 \item[\small\hook{precommit}] This is run before starting a commit.
bos@38	46 It cannot tell what files are included in the commit, or any other
bos@38	47 information about the commit. If it fails, the commit will not be
bos@38	48 allowed to start. You can use this to perform a build and require
bos@38	49 it to complete successfully before a commit can proceed, or
bos@38	50 automatically enforce a requirement that modified files pass your
bos@38	51 coding style guidelines.
bos@38	52 \item[\small\hook{preoutgoing}] This is run before starting to
bos@38	53 transmit a group of changesets from this repository. You can use
bos@38	54 this to lock a repository against clones or pulls from remote
bos@38	55 clients.
bos@38	56 \item[\small\hook{pretag}] This is run before creating a tag. If it
bos@38	57 fails, the tag will not be created. You can use this to enforce a
bos@38	58 uniform tag naming convention.
bos@38	59 \item[\small\hook{pretxnchangegroup}] This is run after a group of
bos@38	60 changesets has been brought into the local repository from another,
bos@38	61 but before the transaction completes that will make the changes
bos@38	62 permanent in the repository. If it fails, the transaction will be
bos@38	63 rolled back and the changes will disappear from the local
bos@38	64 repository. You can use this to automatically check newly arrived
bos@38	65 changes and, for example, roll them back if the group as a whole
bos@38	66 does not build or pass your test suite.
bos@38	67 \item[\small\hook{pretxncommit}] This is run after a new changeset has
bos@38	68 been created in the local repository, but before the transaction
bos@38	69 completes that will make it permanent. Unlike the \hook{precommit}
bos@38	70 hook, this hook can see which changes are present in the changeset,
bos@38	71 and it can also see all other changeset metadata, such as the commit
bos@38	72 message. You can use this to require that a commit message follows
bos@38	73 your local conventions, or that a changeset builds cleanly.
bos@38	74 \item[\small\hook{preupdate}] This is run before starting an update or
bos@38	75 merge of the working directory.
bos@38	76 \item[\small\hook{tag}] This is run after a tag is created.
bos@38	77 \item[\small\hook{update}] This is run after an update or merge of the
bos@38	78 working directory has finished.
bos@38	79 \end{itemize}
bos@38	80 Each of the hooks with a ``\texttt{pre}'' prefix has the ability to
bos@38	81 \emph{control} an activity. If the hook succeeds, the activity may
bos@38	82 proceed; if it fails, the activity is either not permitted or undone,
bos@38	83 depending on the hook.
bos@38	84
bos@38	85 \section{Hooks and security}
bos@38	86
bos@38	87 \subsection{Hooks are run with your privileges}
bos@38	88
bos@38	89 When you run a Mercurial command in a repository, and the command
bos@38	90 causes a hook to run, that hook runs on your system, under your user
bos@38	91 account, with your privilege level. Since hooks are arbitrary pieces
bos@38	92 of executable code, you should treat them with an appropriate level of
bos@38	93 suspicion. Do not install a hook unless you are confident that you
bos@38	94 know who created it and what it does.
bos@38	95
bos@38	96 In some cases, you may be exposed to hooks that you did not install
bos@38	97 yourself. If you work with Mercurial on an unfamiliar system,
bos@38	98 Mercurial will run hooks defined in that system's global \hgrc\ file.
bos@38	99
bos@38	100 If you are working with a repository owned by another user, Mercurial
bos@38	101 will run hooks defined in that repository. For example, if you
bos@38	102 \hgcmd{pull} from that repository, and its \sfilename{.hg/hgrc}
bos@38	103 defines a local \hook{outgoing} hook, that hook will run under your
bos@38	104 user account, even though you don't own that repository.
bos@38	105
bos@38	106 \begin{note}
bos@38	107 This only applies if you are pulling from a repository on a local or
bos@38	108 network filesystem. If you're pulling over http or ssh, any
bos@38	109 \hook{outgoing} hook will run under the account of the server
bos@38	110 process, on the server.
bos@38	111 \end{note}
bos@38	112
bos@38	113 To see what hooks are defined in a repository, use the
bos@38	114 \hgcmdargs{config}{hooks} command. If you are working in one
bos@38	115 repository, but talking to another that you do not own (e.g.~using
bos@38	116 \hgcmd{pull} or \hgcmd{incoming}), remember that it is the other
bos@38	117 repository's hooks you should be checking, not your own.
bos@38	118
bos@38	119 \subsection{Hooks do not propagate}
bos@38	120
bos@38	121 In Mercurial, hooks are not revision controlled, and do not propagate
bos@38	122 when you clone, or pull from, a repository. The reason for this is
bos@38	123 simple: a hook is a completely arbitrary piece of executable code. It
bos@38	124 runs under your user identity, with your privilege level, on your
bos@38	125 machine.
bos@38	126
bos@38	127 It would be extremely reckless for any distributed revision control
bos@38	128 system to implement revision-controlled hooks, as this would offer an
bos@38	129 easily exploitable way to subvert the accounts of users of the
bos@38	130 revision control system.
bos@38	131
bos@38	132 Since Mercurial does not propagate hooks, if you are collaborating
bos@38	133 with other people on a common project, you should not assume that they
bos@38	134 are using the same Mercurial hooks as you are, or that theirs are
bos@38	135 correctly configured. You should document the hooks you expect people
bos@38	136 to use.
bos@38	137
bos@38	138 In a corporate intranet, this is somewhat easier to control, as you
bos@38	139 can for example provide a ``standard'' installation of Mercurial on an
bos@38	140 NFS filesystem, and use a site-wide \hgrc\ file to define hooks that
bos@38	141 all users will see. However, this too has its limits; see below.
bos@38	142
bos@38	143 \subsection{Hooks can be overridden}
bos@38	144
bos@38	145 Mercurial allows you to override a hook definition by redefining the
bos@38	146 hook. You can disable it by setting its value to the empty string, or
bos@38	147 change its behaviour as you wish.
bos@38	148
bos@38	149 If you deploy a system-~or site-wide \hgrc\ file that defines some
bos@38	150 hooks, you should thus understand that your users can disable or
bos@38	151 override those hooks.
bos@38	152
bos@38	153 \subsection{Ensuring that critical hooks are run}
bos@38	154
bos@38	155 Sometimes you may want to enforce a policy that you do not want others
bos@38	156 to be able to work around. For example, you may have a requirement
bos@38	157 that every changeset must pass a rigorous set of tests. Defining this
bos@38	158 requirement via a hook in a site-wide \hgrc\ won't work for remote
bos@38	159 users on laptops, and of course local users can subvert it at will by
bos@38	160 overriding the hook.
bos@38	161
bos@38	162 Instead, you can set up your policies for use of Mercurial so that
bos@38	163 people are expected to propagate changes through a well-known
bos@38	164 ``canonical'' server that you have locked down and configured
bos@38	165 appropriately.
bos@38	166
bos@38	167 One way to do this is via a combination of social engineering and
bos@38	168 technology. Set up a restricted-access account; users can push
bos@38	169 changes over the network to repositories managed by this account, but
bos@38	170 they cannot log into the account and run normal shell commands. In
bos@38	171 this scenario, a user can commit a changeset that contains any old
bos@38	172 garbage they want.
bos@38	173
bos@38	174 When someone pushes a changeset to the server that everyone pulls
bos@38	175 from, the server will test the changeset before it accepts it as
bos@38	176 permanent, and reject it if it fails to pass the test suite. If
bos@38	177 people only pull changes from this filtering server, it will serve to
bos@38	178 ensure that all changes that people pull have been automatically
bos@38	179 vetted.
bos@38	180
bos@34	181 \section{A short tutorial on using hooks}
bos@34	182 \label{sec:hook:simple}
bos@34	183
bos@34	184 It is easy to write a Mercurial hook. Let's start with a hook that
bos@34	185 runs when you finish a \hgcmd{commit}, and simply prints the hash of
bos@34	186 the changeset you just created. The hook is called \hook{commit}.
bos@34	187
bos@34	188 \begin{figure}[ht]
bos@34	189 \interaction{hook.simple.init}
bos@34	190 \caption{A simple hook that runs when a changeset is committed}
bos@34	191 \label{ex:hook:init}
bos@34	192 \end{figure}
bos@34	193
bos@34	194 All hooks follow the pattern in example~\ref{ex:hook:init}. You add
bos@34	195 an entry to the \rcsection{hooks} section of your \hgrc. On the left
bos@34	196 is the name of the event to trigger on; on the right is the action to
bos@34	197 take. As you can see, you can run an arbitrary shell command in a
bos@34	198 hook. Mercurial passes extra information to the hook using
bos@34	199 environment variables (look for \envar{HG\_NODE} in the example).
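
As a rough sketch of what a hook command sees, the following stand-alone
Python script could serve as the action of a \hook{commit} hook. It reads
the changeset ID from \envar{HG\_NODE}, the variable mentioned above. The
function name \texttt{describe\_commit} and the message wording are
illustrative assumptions, not part of Mercurial.

```python
import os


def describe_commit(environ):
    """Build the message an external commit hook might print.

    `environ` is a mapping such as os.environ; HG_NODE is the variable
    Mercurial sets for the commit hook.
    """
    node = environ.get("HG_NODE")
    if node is None:
        return "no HG_NODE in environment; not running as a hook?"
    return "committed " + node


if __name__ == "__main__":
    # Falling off the end of the script gives a zero exit status,
    # which Mercurial treats as success.
    print(describe_commit(os.environ))
```

Saved under a name of your choosing (say, a hypothetical
\texttt{print\_node.py}), the script would go on the right-hand side of a
\rcsection{hooks} entry as an ordinary shell command that runs it with your
Python interpreter.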
bos@34	200
bos@34	201 \subsection{Performing multiple actions per event}
bos@34	202
bos@34	203 Quite often, you will want to define more than one hook for a
bos@34	204 particular kind of event, as shown in example~\ref{ex:hook:ext}.
bos@34	205 Mercurial lets you do this by adding an \emph{extension} to the end of
bos@34	206 a hook's name. You extend a hook's name by giving the name of the
bos@34	207 hook, followed by a full stop (the ``\texttt{.}'' character), followed
bos@34	208 by some more text of your choosing. For example, Mercurial will run
bos@34	209 both \texttt{commit.foo} and \texttt{commit.bar} when the
bos@34	210 \texttt{commit} event occurs.
bos@34	211
bos@34	212 \begin{figure}[ht]
bos@34	213 \interaction{hook.simple.ext}
bos@34	214 \caption{Defining a second \hook{commit} hook}
bos@34	215 \label{ex:hook:ext}
bos@34	216 \end{figure}
bos@34	217
bos@34	218 To give a well-defined order of execution when there are multiple
bos@34	219 hooks defined for an event, Mercurial sorts hooks by extension, and
bos@34	220 executes the hook commands in this sorted order. In the above
bos@34	221 example, it will execute \texttt{commit.bar} before
bos@34	222 \texttt{commit.foo}, and \texttt{commit} before both.
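
The ordering rule can be modelled in a few lines of Python. This is only a
sketch of the behaviour described above, not Mercurial's own code; the
function name \texttt{execution\_order} and the sample hook names are made
up for illustration.

```python
def execution_order(hook_names, event):
    """Return the hooks defined for `event`, in sorted order.

    A name belongs to the event if it is exactly the event name, or the
    event name followed by "." and an extension.
    """
    matching = [
        name for name in hook_names
        if name == event or name.startswith(event + ".")
    ]
    # Sorting the full names sorts by extension, and the bare event
    # name (no extension) comes first.
    return sorted(matching)


names = ["commit.foo", "pretxncommit.bug_id", "commit", "commit.bar"]
print(execution_order(names, "commit"))
# prints ['commit', 'commit.bar', 'commit.foo']
```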
bos@34	223
bos@34	224 It is a good idea to use a somewhat descriptive extension when you
bos@34	225 define a new hook. This will help you to remember what the hook was
bos@34	226 for. If the hook fails, you'll get an error message that contains the
bos@34	227 hook name and extension, so using a descriptive extension could give
bos@34	228 you an immediate hint as to why the hook failed (see
bos@34	229 section~\ref{sec:hook:perm} for an example).
bos@34	230
bos@34	231 \subsection{Controlling whether an activity can proceed}
bos@34	232 \label{sec:hook:perm}
bos@34	233
bos@34	234 In our earlier examples, we used the \hook{commit} hook, which is
bos@34	235 run after a commit has completed. This is one of several Mercurial
bos@34	236 hooks that run after an activity finishes. Such hooks have no way of
bos@34	237 influencing the activity itself.
bos@34	238
bos@34	239 Mercurial defines a number of events that occur before an activity
bos@34	240 starts; or after it starts, but before it finishes. Hooks that
bos@34	241 trigger on these events have the added ability to choose whether the
bos@34	242 activity can continue, or will abort.
bos@34	243
bos@34	244 The \hook{pretxncommit} hook runs after a commit has all but
bos@34	245 completed. In other words, the metadata representing the changeset
bos@34	246 has been written out to disk, but the transaction has not yet been
bos@34	247 allowed to complete. The \hook{pretxncommit} hook has the ability to
bos@34	248 decide whether the transaction can complete, or must be rolled back.
bos@34	249
bos@34	250 If the \hook{pretxncommit} hook exits with a status code of zero, the
bos@34	251 transaction is allowed to complete; the commit finishes; and the
bos@34	252 \hook{commit} hook is run. If the \hook{pretxncommit} hook exits with
bos@34	253 a non-zero status code, the transaction is rolled back; the metadata
bos@34	254 representing the changeset is erased; and the \hook{commit} hook is
bos@34	255 not run.
bos@34	256
bos@34	257 \begin{figure}[ht]
bos@34	258 \interaction{hook.simple.pretxncommit}
bos@34	259 \caption{Using the \hook{pretxncommit} hook to control commits}
bos@34	260 \label{ex:hook:pretxncommit}
bos@34	261 \end{figure}
bos@34	262
bos@34	263 The hook in example~\ref{ex:hook:pretxncommit} checks that a commit
bos@34	264 comment contains a bug ID. If it does, the commit can complete. If
bos@34	265 not, the commit is rolled back.
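
A check of this kind could also be written as a small external script. The
sketch below is not the hook shown in the figure. It assumes a convention
in which a message mentions a bug as ``bug 123'' or ``Bug \#123'', and it
asks Mercurial for the commit message with \texttt{hg log -r} and a
\texttt{\{desc\}} template. The names \texttt{has\_bug\_id},
\texttt{commit\_message} and \texttt{main} are illustrative.

```python
import os
import re
import subprocess
import sys

# Assumed convention: a message mentions a bug as "bug 123" or "Bug #123".
BUG_ID = re.compile(r"\bbug\s*#?\d+", re.IGNORECASE)


def has_bug_id(message):
    """Return True if the commit message mentions a bug ID."""
    return BUG_ID.search(message) is not None


def commit_message(node):
    """Ask Mercurial for the description of changeset `node`."""
    return subprocess.run(
        ["hg", "log", "-r", node, "--template", "{desc}"],
        check=True, capture_output=True, text=True,
    ).stdout


def main(environ):
    """Return an exit status for pretxncommit: 0 allows, 1 rolls back."""
    if has_bug_id(commit_message(environ["HG_NODE"])):
        return 0
    print("commit message must mention a bug ID", file=sys.stderr)
    return 1


# Only act when Mercurial has supplied a changeset to inspect.
if __name__ == "__main__" and "HG_NODE" in os.environ:
    sys.exit(main(os.environ))
```

Keeping the pattern match in its own function makes it possible to try the
convention on sample messages without a repository at hand.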
bos@34	266
bos@37	267 \section{Writing your own hooks}
bos@37	268
bos@37	269 When you are writing a hook, you might find it useful to run Mercurial
bos@37	270 either with the \hggopt{-v} option, or the \rcitem{ui}{verbose} config
bos@37	271 item set to ``true''. When you do so, Mercurial will print a message
bos@37	272 before it calls each hook.
bos@37	273
bos@37	274 \subsection{Choosing how your hook should run}
bos@37	275 \label{sec:hook:lang}
bos@34	276
bos@34	277 You can write a hook either as a normal program---typically a shell
bos@37	278 script---or as a Python function that is executed within the Mercurial
bos@34	279 process.
bos@34	280
bos@34	281 Writing a hook as an external program has the advantage that it
bos@34	282 requires no knowledge of Mercurial's internals. You can call normal
bos@34	283 Mercurial commands to get any added information you need. The
bos@34	284 trade-off is that external hooks are slower than in-process hooks.
bos@34	285
bos@34	286 An in-process Python hook has complete access to the Mercurial API,
bos@34	287 and does not ``shell out'' to another process, so it is inherently
bos@34	288 faster than an external hook. It is also easier to obtain much of the
bos@34	289 information that a hook requires by using the Mercurial API than by
bos@34	290 running Mercurial commands.
bos@34	291
bos@34	292 If you are comfortable with Python, or require high performance,
bos@34	293 writing your hooks in Python may be a good choice. However, when you
bos@34	294 have a straightforward hook to write and you don't need to care about
bos@34	295 performance (probably the majority of hooks), a shell script is
bos@34	296 perfectly fine.
bos@34	297
bos@37	298 \subsection{Hook parameters}
bos@34	299 \label{sec:hook:param}
bos@34	300
bos@34	301 Mercurial calls each hook with a set of well-defined parameters. In
bos@34	302 Python, a parameter is passed as a keyword argument to your hook
bos@34	303 function. For an external program, a parameter is passed as an
bos@34	304 environment variable.
bos@34	305
bos@34	306 Whether your hook is written in Python or as a shell script, the
bos@37	307 hook-specific parameter names and values will be the same. A boolean
bos@37	308 parameter will be represented as a boolean value in Python, but as the
bos@37	309 number 1 (for ``true'') or 0 (for ``false'') as an environment
bos@37	310 variable for an external hook. If a hook parameter is named
bos@37	311 \texttt{foo}, the keyword argument for a Python hook will also be
bos@37	312 named \texttt{foo}, while the environment variable for an
bos@37	313 external hook will be named \texttt{HG\_FOO}.
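
The naming and boolean conventions can be summarised in code. The function
below is a model of the rules just stated, written for illustration; it is
not taken from Mercurial, and the parameter values are made up.

```python
def hook_environment(params):
    """Map hook parameters to the variables an external hook would see."""
    env = {}
    for name, value in params.items():
        if isinstance(value, bool):
            # Booleans become 1 (true) or 0 (false).
            value = "1" if value else "0"
        # Upper-case the name and add the HG_ prefix.
        env["HG_" + name.upper()] = str(value)
    return env


print(hook_environment({"node": "b49a7dd4e564", "foo": True}))
# prints {'HG_NODE': 'b49a7dd4e564', 'HG_FOO': '1'}
```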
bos@37	314
bos@37	315 \subsection{Hook return values and activity control}
bos@37	316
bos@37	317 A hook that executes successfully must exit with a status of zero if
bos@37	318 external, or return boolean ``false'' if in-process. Failure is
bos@37	319 indicated with a non-zero exit status from an external hook, or an
bos@37	320 in-process hook returning boolean ``true''. If an in-process hook
bos@37	321 raises an exception, the hook is considered to have failed.
bos@37	322
bos@37	323 For a hook that controls whether an activity can proceed, zero/false
bos@37	324 means ``allow'', while non-zero/true/exception means ``deny''.
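
The in-process convention is easy to get backwards, so here is a small
model of it. The helper \texttt{activity\_allowed} and the three sample
hooks are illustrative only and are not how Mercurial itself is written.
Note that a Python function that falls off its end returns \texttt{None},
which counts as false, and therefore as success.

```python
def activity_allowed(hook, *args, **kwargs):
    """Apply the convention: false allows, true or an exception denies."""
    try:
        result = hook(*args, **kwargs)
    except Exception:
        return False
    return not result


def allow(ui, repo, **kwargs):
    return False  # false: the hook succeeded


def deny(ui, repo, **kwargs):
    return True  # true: the hook failed


def broken(ui, repo, **kwargs):
    raise ValueError("bug in the hook itself")


for hook in (allow, deny, broken):
    print(hook.__name__, activity_allowed(hook, None, None))
# prints:
# allow True
# deny False
# broken False
```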
bos@37	325
bos@37	326 \subsection{Writing an external hook}
bos@37	327
bos@37	328 When you define an external hook in your \hgrc\ and the hook is run,
bos@37	329 its value is passed to your shell, which interprets it. This means
bos@37	330 that you can use normal shell constructs in the body of the hook.
bos@37	331
bos@37	332 An executable hook is always run with its current directory set to a
bos@37	333 repository's root directory.
bos@37	334
bos@37	335 Each hook parameter is passed in as an environment variable; the name
bos@37	336 is upper-cased, and prefixed with the string ``\texttt{HG\_}''.
bos@37	337
bos@37	338 With the exception of hook parameters, Mercurial does not set or
bos@37	339 modify any environment variables when running a hook. This is useful
bos@37	340 to remember if you are writing a site-wide hook that may be run by a
bos@37	341 number of different users with differing environment variables set.
bos@37	342 In multi-user situations, you should not rely on environment variables
bos@37	343 being set to the values you have in your environment when testing the
bos@37	344 hook.
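
When debugging an external hook, it can help to print exactly what it was
given. The following script is a sketch along those lines; the function
name \texttt{hook\_parameters} is an assumption, and the use of
\envar{LANG} is merely an example of a variable that belongs to the
invoking user rather than to Mercurial.

```python
import os


def hook_parameters(environ):
    """Collect the HG_* variables, keyed by lower-case parameter name."""
    prefix = "HG_"
    return {
        name[len(prefix):].lower(): value
        for name, value in environ.items()
        if name.startswith(prefix)
    }


if __name__ == "__main__":
    print("hook parameters:", hook_parameters(os.environ))
    # The current directory should be the repository root.
    print("running in:", os.getcwd())
    # Anything else comes from the invoking user's environment, so
    # fall back to a default rather than assume it is set.
    print("locale:", os.environ.get("LANG", "C"))
```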
bos@37	345
bos@37	346 \subsection{Telling Mercurial to use an in-process hook}
bos@37	347
bos@37	348 The \hgrc\ syntax for defining an in-process hook is slightly
bos@37	349 different from that for an executable hook. The value of the hook must
bos@37	350 start with the text ``\texttt{python:}'', and continue with the
bos@37	351 fully-qualified name of a callable object to use as the hook's value.
bos@37	352
bos@37	353 The module in which a hook lives is automatically imported when a hook
bos@37	354 is run. So long as you have the module name and \envar{PYTHONPATH}
bos@37	355 right, it should ``just work''.
bos@37	356
bos@37	357 The following \hgrc\ example snippet illustrates the syntax and
bos@37	358 meaning of the notions we just described.
bos@37	359 \begin{codesample2}
bos@37	360 [hooks]
bos@37	361 commit.example = python:mymodule.submodule.myhook
bos@37	362 \end{codesample2}
bos@37	363 When Mercurial runs the \texttt{commit.example} hook, it imports
bos@37	364 \texttt{mymodule.submodule}, looks for the callable object named
bos@37	365 \texttt{myhook}, and calls it.
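
The lookup can be pictured with the standard \texttt{importlib} module.
This is a sketch of the idea, not Mercurial's implementation; the function
name \texttt{resolve\_hook} is an assumption, and because
\texttt{mymodule} is hypothetical, the demonstration resolves a callable
from the Python standard library instead.

```python
import importlib


def resolve_hook(value):
    """Turn a value such as "python:pkg.mod.func" into the callable it names."""
    prefix = "python:"
    if not value.startswith(prefix):
        raise ValueError("not an in-process hook: %r" % value)
    # Everything before the last dot is the module; the rest is the callable.
    module_name, _, attr = value[len(prefix):].rpartition(".")
    if not module_name:
        raise ValueError("expected module.callable, got %r" % value)
    module = importlib.import_module(module_name)
    hook = getattr(module, attr)
    if not callable(hook):
        raise TypeError("%s is not callable" % attr)
    return hook


# Demonstrated with a standard-library callable.
print(resolve_hook("python:os.path.basename")("/repo/.hg/hgrc"))
# prints hgrc
```

If the import fails in practice, the usual suspects are a misspelt module
name or a directory missing from \envar{PYTHONPATH}.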
bos@37	366
bos@37	367 \subsection{Writing an in-process hook}
bos@37	368
bos@37	369 The simplest in-process hook does nothing, but illustrates the basic
bos@37	370 shape of the hook API:
bos@37	371 \begin{codesample2}
bos@37	372 def myhook(ui, repo, **kwargs):
bos@37	373     pass
bos@37	374 \end{codesample2}
bos@37	375 The first argument to a Python hook is always a
bos@37	376 \pymodclass{mercurial.ui}{ui} object. The second is a repository object;
bos@37	377 at the moment, it is always an instance of
bos@37	378 \pymodclass{mercurial.localrepo}{localrepository}. Following these two
bos@37	379 arguments are other keyword arguments. Which ones are passed in
bos@37	380 depends on the hook being called, but a hook can ignore arguments it
bos@37	381 doesn't care about by dropping them into a keyword argument dict, as
bos@37	382 with \texttt{**kwargs} above.
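
A slightly more useful hook might name the one parameter it cares about
and let the rest fall into \texttt{**kwargs}. The sketch below assumes
that the hook receives a \texttt{node} parameter and that the
\texttt{ui} object's \texttt{write} method expects byte strings, as is the
case when Mercurial runs under Python~3. The function name
\texttt{report\_node} is illustrative.

```python
def report_node(ui, repo, node=None, **kwargs):
    """Print the changeset ID handed to the hook, then report success."""
    if node is None:
        ui.write(b"hook called without a node parameter\n")
    else:
        # Assumption: ui.write wants bytes; accept a str node as well.
        if isinstance(node, str):
            node = node.encode("ascii")
        ui.write(b"hook saw changeset %s\n" % node)
    return False  # false tells Mercurial the hook succeeded
```

Because the hook only touches \texttt{ui.write}, it can be exercised
outside Mercurial with a stand-in object that records what is written.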
bos@34	383
bos@34	384
bos@34	385 %%% Local Variables:
bos@34	386 %%% mode: latex
bos@34	387 %%% TeX-master: "00book"
bos@34	388 %%% End: