hgbook
annotate en/hook.tex @ 38:b49a7dd4e564
More content for hook chapter.
Overview of hooks.
Description of hook security implications.
author | Bryan O'Sullivan <bos@serpentine.com> |
---|---|
date | Wed Jul 19 00:06:21 2006 -0700 (2006-07-19) |
parents | 9fd0c59b009a |
children | 576fef93bb49 |
rev | line source |
---|---|
bos@34 | 1 \chapter{Handling repository events with hooks} |
bos@34 | 2 \label{chap:hook} |
bos@34 | 3 |
bos@34 | 4 Mercurial offers a powerful mechanism to let you perform automated |
bos@34 | 5 actions in response to events that occur in a repository. In some |
bos@34 | 6 cases, you can even control Mercurial's response to those events. |
bos@34 | 7 |
bos@34 | 8 The name Mercurial uses for one of these actions is a \emph{hook}. |
bos@34 | 9 Hooks are called ``triggers'' in some revision control systems, but |
bos@34 | 10 the two names refer to the same idea. |
bos@34 | 11 |
bos@38 | 12 \section{An overview of hooks in Mercurial} |
bos@38 | 13 |
bos@38 | 14 Here is a brief list of the hooks that Mercurial supports. For each |
bos@38 | 15 hook, we indicate when it is run, and a few examples of common tasks |
bos@38 | 16 you can use it for. We will revisit each of these hooks in more |
bos@38 | 17 detail later. |
bos@38 | 18 \begin{itemize} |
bos@38 | 19 \item[\small\hook{changegroup}] This is run after a group of |
bos@38 | 20 changesets has been brought into the repository from elsewhere. In |
bos@38 | 21 other words, it is run after a \hgcmd{pull} or \hgcmd{push} into a |
bos@38 | 22 repository, but not after a \hgcmd{commit}. You can use this for |
bos@38 | 23 performing an action once for the entire group of newly arrived |
bos@38 | 24 changesets. For example, you could use this hook to send out email |
bos@38 | 25 notifications, or kick off an automated build or test. |
bos@38 | 26 \item[\small\hook{commit}] This is run after a new changeset has been |
bos@38 | 27 created in the local repository, typically using the \hgcmd{commit} |
bos@38 | 28 command. |
bos@38 | 29 \item[\small\hook{incoming}] This is run once for each new changeset |
bos@38 | 30 that is brought into the repository from elsewhere. Notice the |
bos@38 | 31 difference from \hook{changegroup}, which is run once per |
bos@38 | 32 \emph{group} of changesets brought in. You can use this for the |
bos@38 | 33 same purposes as the \hook{changegroup} hook; it's simply more |
bos@38 | 34 convenient sometimes to run a hook once per group of changesets, |
bos@38 | 35 while other times it's handier once per changeset. |
bos@38 | 36 \item[\small\hook{outgoing}] This is run after a group of changesets |
bos@38 | 37 has been transmitted from this repository to another. You can use |
bos@38 | 38 this, for example, to notify subscribers every time changes are |
bos@38 | 39 cloned or pulled from the repository. |
bos@38 | 40 \item[\small\hook{prechangegroup}] This is run before starting to |
bos@38 | 41 bring a group of changesets into the repository. It cannot see the |
bos@38 | 42 actual changesets, because they have not yet been transmitted. If |
bos@38 | 43 it fails, the changesets will not be transmitted. You can use this |
bos@38 | 44 hook to ``lock down'' a repository against incoming changes. |
bos@38 | 45 \item[\small\hook{precommit}] This is run before starting a commit. |
bos@38 | 46 It cannot tell what files are included in the commit, or any other |
bos@38 | 47 information about the commit. If it fails, the commit will not be |
bos@38 | 48 allowed to start. You can use this to perform a build and require |
bos@38 | 49 it to complete successfully before a commit can proceed, or |
bos@38 | 50 automatically enforce a requirement that modified files pass your |
bos@38 | 51 coding style guidelines. |
bos@38 | 52 \item[\small\hook{preoutgoing}] This is run before starting to |
bos@38 | 53 transmit a group of changesets from this repository. You can use |
bos@38 | 54 this to lock a repository against clones or pulls from remote |
bos@38 | 55 clients. |
bos@38 | 56 \item[\small\hook{pretag}] This is run before creating a tag. If it |
bos@38 | 57 fails, the tag will not be created. You can use this to enforce a |
bos@38 | 58 uniform tag naming convention. |
bos@38 | 59 \item[\small\hook{pretxnchangegroup}] This is run after a group of |
bos@38 | 60 changesets has been brought into the local repository from another, |
bos@38 | 61 but before the transaction completes that will make the changes |
bos@38 | 62 permanent in the repository. If it fails, the transaction will be |
bos@38 | 63 rolled back and the changes will disappear from the local |
bos@38 | 64 repository. You can use this to automatically check newly arrived |
bos@38 | 65 changes and, for example, roll them back if the group as a whole |
bos@38 | 66 does not build or pass your test suite. |
bos@38 | 67 \item[\small\hook{pretxncommit}] This is run after a new changeset has |
bos@38 | 68 been created in the local repository, but before the transaction |
bos@38 | 69 completes that will make it permanent. Unlike the \hook{precommit} |
bos@38 | 70 hook, this hook can see which changes are present in the changeset, |
bos@38 | 71 and it can also see all other changeset metadata, such as the commit |
bos@38 | 72 message. You can use this to require that a commit message follows |
bos@38 | 73 your local conventions, or that a changeset builds cleanly. |
bos@38 | 74 \item[\small\hook{preupdate}] This is run before starting an update or |
bos@38 | 75 merge of the working directory. |
bos@38 | 76 \item[\small\hook{tag}] This is run after a tag is created. |
bos@38 | 77 \item[\small\hook{update}] This is run after an update or merge of the |
bos@38 | 78 working directory has finished. |
bos@38 | 79 \end{itemize} |
bos@38 | 80 Each of the hooks with a ``\texttt{pre}'' prefix has the ability to |
bos@38 | 81 \emph{control} an activity. If the hook succeeds, the activity may |
bos@38 | 82 proceed; if it fails, the activity is either not permitted or undone, |
bos@38 | 83 depending on the hook. |
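As a rough illustration of the naming rule just described, the following Python sketch splits the hook names from the list above according to whether they carry the ``\texttt{pre}'' prefix. The names \texttt{HOOKS} and \texttt{can\_control} are invented for this example; it is not how Mercurial itself classifies hooks.

```python
# Hook names taken from the overview list above.
HOOKS = [
    "changegroup", "commit", "incoming", "outgoing",
    "prechangegroup", "precommit", "preoutgoing", "pretag",
    "pretxnchangegroup", "pretxncommit", "preupdate",
    "tag", "update",
]


def can_control(hook_name):
    """Return True if the hook name has the "pre" prefix (hypothetical helper)."""
    return hook_name.startswith("pre")


# Hooks that may allow or deny an activity, and hooks that only react to it.
controlling = [h for h in HOOKS if can_control(h)]
notifying = [h for h in HOOKS if not can_control(h)]
```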
bos@38 | 84 |
bos@38 | 85 \section{Hooks and security} |
bos@38 | 86 |
bos@38 | 87 \subsection{Hooks are run with your privileges} |
bos@38 | 88 |
bos@38 | 89 When you run a Mercurial command in a repository, and the command |
bos@38 | 90 causes a hook to run, that hook runs on your system, under your user |
bos@38 | 91 account, with your privilege level. Since hooks are arbitrary pieces |
bos@38 | 92 of executable code, you should treat them with an appropriate level of |
bos@38 | 93 suspicion. Do not install a hook unless you are confident that you |
bos@38 | 94 know who created it and what it does. |
bos@38 | 95 |
bos@38 | 96 In some cases, you may be exposed to hooks that you did not install |
bos@38 | 97 yourself. If you work with Mercurial on an unfamiliar system, |
bos@38 | 98 Mercurial will run hooks defined in that system's global \hgrc\ file. |
bos@38 | 99 |
bos@38 | 100 If you are working with a repository owned by another user, Mercurial |
bos@38 | 101 will run hooks defined in that repository. For example, if you |
bos@38 | 102 \hgcmd{pull} from that repository, and its \sfilename{.hg/hgrc} |
bos@38 | 103 defines a local \hook{outgoing} hook, that hook will run under your |
bos@38 | 104 user account, even though you don't own that repository. |
bos@38 | 105 |
bos@38 | 106 \begin{note} |
bos@38 | 107 This only applies if you are pulling from a repository on a local or |
bos@38 | 108 network filesystem. If you're pulling over http or ssh, any |
bos@38 | 109 \hook{outgoing} hook will run under the account of the server |
bos@38 | 110 process, on the server. |
bos@38 | 111 \end{note} |
bos@38 | 112 |
bos@38 | 113 XXX To see what hooks are defined in a repository, use the |
bos@38 | 114 \hgcmdargs{config}{hooks} command. If you are working in one |
bos@38 | 115 repository, but talking to another that you do not own (e.g.~using |
bos@38 | 116 \hgcmd{pull} or \hgcmd{incoming}), remember that it is the other |
bos@38 | 117 repository's hooks you should be checking, not your own. |
bos@38 | 118 |
bos@38 | 119 \subsection{Hooks do not propagate} |
bos@38 | 120 |
bos@38 | 121 In Mercurial, hooks are not revision controlled, and do not propagate |
bos@38 | 122 when you clone, or pull from, a repository. The reason for this is |
bos@38 | 123 simple: a hook is a completely arbitrary piece of executable code. It |
bos@38 | 124 runs under your user identity, with your privilege level, on your |
bos@38 | 125 machine. |
bos@38 | 126 |
bos@38 | 127 It would be extremely reckless for any distributed revision control |
bos@38 | 128 system to implement revision-controlled hooks, as this would offer an |
bos@38 | 129 easily exploitable way to subvert the accounts of users of the |
bos@38 | 130 revision control system. |
bos@38 | 131 |
bos@38 | 132 Since Mercurial does not propagate hooks, if you are collaborating |
bos@38 | 133 with other people on a common project, you should not assume that they |
bos@38 | 134 are using the same Mercurial hooks as you are, or that theirs are |
bos@38 | 135 correctly configured. You should document the hooks you expect people |
bos@38 | 136 to use. |
bos@38 | 137 |
bos@38 | 138 In a corporate intranet, this is somewhat easier to control, as you |
bos@38 | 139 can for example provide a ``standard'' installation of Mercurial on an |
bos@38 | 140 NFS filesystem, and use a site-wide \hgrc\ file to define hooks that |
bos@38 | 141 all users will see. However, this too has its limits; see below. |
bos@38 | 142 |
bos@38 | 143 \subsection{Hooks can be overridden} |
bos@38 | 144 |
bos@38 | 145 Mercurial allows you to override a hook definition by redefining the |
bos@38 | 146 hook. You can disable it by setting its value to the empty string, or |
bos@38 | 147 change its behaviour as you wish. |
bos@38 | 148 |
bos@38 | 149 If you deploy a system-~or site-wide \hgrc\ file that defines some |
bos@38 | 150 hooks, you should thus understand that your users can disable or |
bos@38 | 151 override those hooks. |
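The effect of overriding can be pictured with a small Python model. It assumes that configuration sources are simply read in order and that a later definition of a hook replaces an earlier one. The function \texttt{effective\_hooks} and the sample hook commands are hypothetical; Mercurial's real configuration handling is more involved.

```python
def effective_hooks(*layers):
    """Merge hook definitions; later layers override earlier ones.

    Each layer is a dict mapping hook names to commands. A hook whose
    value is the empty string is treated as disabled.
    """
    merged = {}
    for layer in layers:
        merged.update(layer)
    # Drop hooks whose value was set to the empty string.
    return {name: cmd for name, cmd in merged.items() if cmd != ""}


# Hypothetical site-wide and per-user settings.
site = {"pretxncommit.style": "check-style", "commit.notify": "send-mail"}
user = {"pretxncommit.style": "", "commit.notify": "true"}
result = effective_hooks(site, user)
```

In this model the user has disabled the style check and replaced the notification command, which is exactly the situation a site administrator cannot prevent.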
bos@38 | 152 |
bos@38 | 153 \subsection{Ensuring that critical hooks are run} |
bos@38 | 154 |
bos@38 | 155 Sometimes you may want to enforce a policy that you do not want others |
bos@38 | 156 to be able to work around. For example, you may have a requirement |
bos@38 | 157 that every changeset must pass a rigorous set of tests. Defining this |
bos@38 | 158 requirement via a hook in a site-wide \hgrc\ won't work for remote |
bos@38 | 159 users on laptops, and of course local users can subvert it at will by |
bos@38 | 160 overriding the hook. |
bos@38 | 161 |
bos@38 | 162 Instead, you can set up your policies for use of Mercurial so that |
bos@38 | 163 people are expected to propagate changes through a well-known |
bos@38 | 164 ``canonical'' server that you have locked down and configured |
bos@38 | 165 appropriately. |
bos@38 | 166 |
bos@38 | 167 One way to do this is via a combination of social engineering and |
bos@38 | 168 technology. Set up a restricted-access account; users can push |
bos@38 | 169 changes over the network to repositories managed by this account, but |
bos@38 | 170 they cannot log into the account and run normal shell commands. In |
bos@38 | 171 this scenario, a user can commit a changeset that contains any old |
bos@38 | 172 garbage they want. |
bos@38 | 173 |
bos@38 | 174 When someone pushes a changeset to the server that everyone pulls |
bos@38 | 175 from, the server will test the changeset before it accepts it as |
bos@38 | 176 permanent, and reject it if it fails to pass the test suite. If |
bos@38 | 177 people only pull changes from this filtering server, it will serve to |
bos@38 | 178 ensure that all changes that people pull have been automatically |
bos@38 | 179 vetted. |
bos@38 | 180 |
bos@34 | 181 \section{A short tutorial on using hooks} |
bos@34 | 182 \label{sec:hook:simple} |
bos@34 | 183 |
bos@34 | 184 It is easy to write a Mercurial hook. Let's start with a hook that |
bos@34 | 185 runs when you finish a \hgcmd{commit}, and simply prints the hash of |
bos@34 | 186 the changeset you just created. The hook is called \hook{commit}. |
bos@34 | 187 |
bos@34 | 188 \begin{figure}[ht] |
bos@34 | 189 \interaction{hook.simple.init} |
bos@34 | 190 \caption{A simple hook that runs when a changeset is committed} |
bos@34 | 191 \label{ex:hook:init} |
bos@34 | 192 \end{figure} |
bos@34 | 193 |
bos@34 | 194 All hooks follow the pattern in example~\ref{ex:hook:init}. You add |
bos@34 | 195 an entry to the \rcsection{hooks} section of your \hgrc. On the left |
bos@34 | 196 is the name of the event to trigger on; on the right is the action to |
bos@34 | 197 take. As you can see, you can run an arbitrary shell command in a |
bos@34 | 198 hook. Mercurial passes extra information to the hook using |
bos@34 | 199 environment variables (look for \envar{HG\_NODE} in the example). |
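The same mechanism applies when the action is a script rather than an inline shell command. The sketch below shows what a hypothetical Python helper might do with the \envar{HG\_NODE} environment variable; the function name \texttt{describe\_commit} and the message text are assumptions made for illustration.

```python
import os


def describe_commit(environ=os.environ):
    """Build a short message from the HG_NODE hook parameter."""
    node = environ.get("HG_NODE")
    if node is None:
        # The variable is only present when Mercurial runs the hook.
        return "no changeset hash was passed to this hook"
    return "committed changeset %s" % node
```

A script used as a hook would print the result of \texttt{describe\_commit()}; passing a plain dictionary instead of the real environment makes the function easy to try outside Mercurial.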
bos@34 | 200 |
bos@34 | 201 \subsection{Performing multiple actions per event} |
bos@34 | 202 |
bos@34 | 203 Quite often, you will want to define more than one hook for a |
bos@34 | 204 particular kind of event, as shown in example~\ref{ex:hook:ext}. |
bos@34 | 205 Mercurial lets you do this by adding an \emph{extension} to the end of |
bos@34 | 206 a hook's name. You extend a hook's name by giving the name of the |
bos@34 | 207 hook, followed by a full stop (the ``\texttt{.}'' character), followed |
bos@34 | 208 by some more text of your choosing. For example, Mercurial will run |
bos@34 | 209 both \texttt{commit.foo} and \texttt{commit.bar} when the |
bos@34 | 210 \texttt{commit} event occurs. |
bos@34 | 211 |
bos@34 | 212 \begin{figure}[ht] |
bos@34 | 213 \interaction{hook.simple.ext} |
bos@34 | 214 \caption{Defining a second \hook{commit} hook} |
bos@34 | 215 \label{ex:hook:ext} |
bos@34 | 216 \end{figure} |
bos@34 | 217 |
bos@34 | 218 To give a well-defined order of execution when there are multiple |
bos@34 | 219 hooks defined for an event, Mercurial sorts hooks by extension, and |
bos@34 | 220 executes the hook commands in this sorted order. In the above |
bos@34 | 221 example, it will execute \texttt{commit.bar} before |
bos@34 | 222 \texttt{commit.foo}, and \texttt{commit} before both. |
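The ordering rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, with an invented function name, and it treats a hook with no extension as having an empty extension; it is not Mercurial's own code.

```python
def execution_order(hook_names):
    """Order hook names by extension; a bare name has an empty extension."""
    def extension(name):
        # Everything after the first full stop is the extension.
        _base, _dot, ext = name.partition(".")
        return ext
    return sorted(hook_names, key=extension)


order = execution_order(["commit.foo", "commit", "commit.bar"])
```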
bos@34 | 223 |
bos@34 | 224 It is a good idea to use a somewhat descriptive extension when you |
bos@34 | 225 define a new hook. This will help you to remember what the hook was |
bos@34 | 226 for. If the hook fails, you'll get an error message that contains the |
bos@34 | 227 hook name and extension, so using a descriptive extension could give |
bos@34 | 228 you an immediate hint as to why the hook failed (see |
bos@34 | 229 section~\ref{sec:hook:perm} for an example). |
bos@34 | 230 |
bos@34 | 231 \subsection{Controlling whether an activity can proceed} |
bos@34 | 232 \label{sec:hook:perm} |
bos@34 | 233 |
bos@34 | 234 In our earlier examples, we used the \hook{commit} hook, which is |
bos@34 | 235 run after a commit has completed. This is one of several Mercurial |
bos@34 | 236 hooks that run after an activity finishes. Such hooks have no way of |
bos@34 | 237 influencing the activity itself. |
bos@34 | 238 |
bos@34 | 239 Mercurial defines a number of events that occur before an activity |
bos@34 | 240 starts; or after it starts, but before it finishes. Hooks that |
bos@34 | 241 trigger on these events have the added ability to choose whether the |
bos@34 | 242 activity can continue, or will abort. |
bos@34 | 243 |
bos@34 | 244 The \hook{pretxncommit} hook runs after a commit has all but |
bos@34 | 245 completed. In other words, the metadata representing the changeset |
bos@34 | 246 has been written out to disk, but the transaction has not yet been |
bos@34 | 247 allowed to complete. The \hook{pretxncommit} hook has the ability to |
bos@34 | 248 decide whether the transaction can complete, or must be rolled back. |
bos@34 | 249 |
bos@34 | 250 If the \hook{pretxncommit} hook exits with a status code of zero, the |
bos@34 | 251 transaction is allowed to complete; the commit finishes; and the |
bos@34 | 252 \hook{commit} hook is run. If the \hook{pretxncommit} hook exits with |
bos@34 | 253 a non-zero status code, the transaction is rolled back; the metadata |
bos@34 | 254 representing the changeset is erased; and the \hook{commit} hook is |
bos@34 | 255 not run. |
bos@34 | 256 |
bos@34 | 257 \begin{figure}[ht] |
bos@34 | 258 \interaction{hook.simple.pretxncommit} |
bos@34 | 259 \caption{Using the \hook{pretxncommit} hook to control commits} |
bos@34 | 260 \label{ex:hook:pretxncommit} |
bos@34 | 261 \end{figure} |
bos@34 | 262 |
bos@34 | 263 The hook in example~\ref{ex:hook:pretxncommit} checks that a commit |
bos@34 | 264 comment contains a bug ID. If it does, the commit can complete. If |
bos@34 | 265 not, the commit is rolled back. |
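A check of this kind could also be written as a Python script. The sketch below is an assumption-laden illustration: the pattern used to recognise a bug ID, and the names \texttt{has\_bug\_id}, \texttt{exit\_status} and \texttt{fetch\_description}, are hypothetical and should be adapted to your own conventions.

```python
import re
import subprocess

# Hypothetical convention: the message mentions "bug" followed by a number.
BUG_ID = re.compile(r"\bbug *[0-9]", re.IGNORECASE)


def has_bug_id(message):
    """Return True if the commit message appears to contain a bug ID."""
    return BUG_ID.search(message) is not None


def exit_status(message):
    """Zero lets the transaction complete; non-zero rolls it back."""
    return 0 if has_bug_id(message) else 1


def fetch_description(node):
    """Ask Mercurial for the commit message of the given changeset."""
    return subprocess.check_output(
        ["hg", "log", "-r", node, "--template", "{desc}"], text=True)
```

A script wired to \hook{pretxncommit} would call \texttt{fetch\_description} with the value of \envar{HG\_NODE} and exit with the status computed by \texttt{exit\_status}.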
bos@34 | 266 |
bos@37 | 267 \section{Writing your own hooks} |
bos@37 | 268 |
bos@37 | 269 When you are writing a hook, you might find it useful to run Mercurial |
bos@37 | 270 either with the \hggopt{-v} option, or the \rcitem{ui}{verbose} config |
bos@37 | 271 item set to ``true''. When you do so, Mercurial will print a message |
bos@37 | 272 before it calls each hook. |
bos@37 | 273 |
bos@37 | 274 \subsection{Choosing how your hook should run} |
bos@37 | 275 \label{sec:hook:lang} |
bos@34 | 276 |
bos@34 | 277 You can write a hook either as a normal program---typically a shell |
bos@37 | 278 script---or as a Python function that is executed within the Mercurial |
bos@34 | 279 process. |
bos@34 | 280 |
bos@34 | 281 Writing a hook as an external program has the advantage that it |
bos@34 | 282 requires no knowledge of Mercurial's internals. You can call normal |
bos@34 | 283 Mercurial commands to get any added information you need. The |
bos@34 | 284 trade-off is that external hooks are slower than in-process hooks. |
bos@34 | 285 |
bos@34 | 286 An in-process Python hook has complete access to the Mercurial API, |
bos@34 | 287 and does not ``shell out'' to another process, so it is inherently |
bos@34 | 288 faster than an external hook. It is also easier to obtain much of the |
bos@34 | 289 information that a hook requires by using the Mercurial API than by |
bos@34 | 290 running Mercurial commands. |
bos@34 | 291 |
bos@34 | 292 If you are comfortable with Python, or require high performance, |
bos@34 | 293 writing your hooks in Python may be a good choice. However, when you |
bos@34 | 294 have a straightforward hook to write and you don't need to care about |
bos@34 | 295 performance (probably the majority of hooks), a shell script is |
bos@34 | 296 perfectly fine. |
bos@34 | 297 |
bos@37 | 298 \subsection{Hook parameters} |
bos@34 | 299 \label{sec:hook:param} |
bos@34 | 300 |
bos@34 | 301 Mercurial calls each hook with a set of well-defined parameters. In |
bos@34 | 302 Python, a parameter is passed as a keyword argument to your hook |
bos@34 | 303 function. For an external program, a parameter is passed as an |
bos@34 | 304 environment variable. |
bos@34 | 305 |
bos@34 | 306 Whether your hook is written in Python or as a shell script, the |
bos@37 | 307 hook-specific parameter names and values will be the same. A boolean |
bos@37 | 308 parameter will be represented as a boolean value in Python, but as the |
bos@37 | 309 number 1 (for ``true'') or 0 (for ``false'') as an environment |
bos@37 | 310 variable for an external hook. If a hook parameter is named |
bos@37 | 311 \texttt{foo}, the keyword argument for a Python hook will also be |
bos@37 | 312 named \texttt{foo}, while the environment variable for an |
bos@37 | 313 external hook will be named \texttt{HG\_FOO}. |
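The naming and boolean conventions can be summarised in a short Python model. The function \texttt{hook\_environment} is hypothetical and only mirrors the rules stated above; it is not taken from Mercurial.

```python
def hook_environment(params):
    """Map hook parameters to environment variables for an external hook."""
    env = {}
    for name, value in params.items():
        if isinstance(value, bool):
            # Booleans become "1" (true) or "0" (false).
            value = "1" if value else "0"
        # Names are upper-cased and prefixed with "HG_".
        env["HG_" + name.upper()] = str(value)
    return env


env = hook_environment({"foo": "bar", "flag": True, "quiet": False})
```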
bos@37 | 314 |
bos@37 | 315 \subsection{Hook return values and activity control} |
bos@37 | 316 |
bos@37 | 317 A hook that executes successfully must exit with a status of zero if |
bos@37 | 318 external, or return boolean ``false'' if in-process. Failure is |
bos@37 | 319 indicated with a non-zero exit status from an external hook, or an |
bos@37 | 320 in-process hook returning boolean ``true''. If an in-process hook |
bos@37 | 321 raises an exception, the hook is considered to have failed. |
bos@37 | 322 |
bos@37 | 323 For a hook that controls whether an activity can proceed, zero/false |
bos@37 | 324 means ``allow'', while non-zero/true/exception means ``deny''. |
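These rules can be written down as a pair of small Python functions. This is a sketch of the convention only, with invented names, and makes no claim about how Mercurial implements it internally.

```python
def external_allows(exit_status):
    """An external hook allows the activity only with exit status zero."""
    return exit_status == 0


def inprocess_allows(hook, *args, **kwargs):
    """An in-process hook allows the activity if it returns a false value."""
    try:
        result = hook(*args, **kwargs)
    except Exception:
        # An exception counts as failure, so the activity is denied.
        return False
    return not result


def always_ok():
    return False


def always_deny():
    return True


def broken():
    raise ValueError("hook crashed")
```

Note that a Python function which falls off the end returns \texttt{None}, which is a false value, so a hook with no explicit \texttt{return} statement counts as successful under this convention.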
bos@37 | 325 |
bos@37 | 326 \subsection{Writing an external hook} |
bos@37 | 327 |
bos@37 | 328 When you define an external hook in your \hgrc\ and the hook is run, |
bos@37 | 329 its value is passed to your shell, which interprets it. This means |
bos@37 | 330 that you can use normal shell constructs in the body of the hook. |
bos@37 | 331 |
bos@37 | 332 An executable hook is always run with its current directory set to a |
bos@37 | 333 repository's root directory. |
bos@37 | 334 |
bos@37 | 335 Each hook parameter is passed in as an environment variable; the name |
bos@37 | 336 is upper-cased, and prefixed with the string ``\texttt{HG\_}''. |
bos@37 | 337 |
bos@37 | 338 With the exception of hook parameters, Mercurial does not set or |
bos@37 | 339 modify any environment variables when running a hook. This is useful |
bos@37 | 340 to remember if you are writing a site-wide hook that may be run by a |
bos@37 | 341 number of different users with differing environment variables set. |
bos@37 | 342 In multi-user situations, you should not rely on environment variables |
bos@37 | 343 being set to the values you have in your environment when testing the |
bos@37 | 344 hook. |
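One defensive habit is to give every environment lookup an explicit fallback. The Python sketch below assumes a hypothetical variable named \texttt{BUILD\_TOOL} that some users may have set; both the variable and the default value are invented for illustration.

```python
import os


def build_tool(environ=os.environ):
    """Pick the build command, falling back to a fixed default."""
    # BUILD_TOOL is a hypothetical variable; never assume it is set.
    return environ.get("BUILD_TOOL", "make")


def repository_root():
    """Per the text, an external hook starts in the repository root."""
    return os.getcwd()
```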
bos@37 | 345 |
bos@37 | 346 \subsection{Telling Mercurial to use an in-process hook} |
bos@37 | 347 |
bos@37 | 348 The \hgrc\ syntax for defining an in-process hook is slightly |
bos@37 | 349 different from that for an executable hook. The value of the hook must |
bos@37 | 350 start with the text ``\texttt{python:}'', and continue with the |
bos@37 | 351 fully-qualified name of a callable object to use as the hook's value. |
bos@37 | 352 |
bos@37 | 353 The module in which a hook lives is automatically imported when a hook |
bos@37 | 354 is run. So long as you have the module name and \envar{PYTHONPATH} |
bos@37 | 355 right, it should ``just work''. |
bos@37 | 356 |
bos@37 | 357 The following \hgrc\ example snippet illustrates the syntax and |
bos@37 | 358 meaning of the notions we just described. |
bos@37 | 359 \begin{codesample2} |
bos@37 | 360 [hooks] |
bos@37 | 361 commit.example = python:mymodule.submodule.myhook |
bos@37 | 362 \end{codesample2} |
bos@37 | 363 When Mercurial runs the \texttt{commit.example} hook, it imports |
bos@37 | 364 \texttt{mymodule.submodule}, looks for the callable object named |
bos@37 | 365 \texttt{myhook}, and calls it. |
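The lookup can be approximated with Python's standard \texttt{importlib} module. The function \texttt{resolve\_hook} below is a hypothetical sketch of the steps just described, not Mercurial's actual loader; it uses \texttt{os.path.join} as a stand-in callable because \texttt{mymodule} does not exist outside the example.

```python
import importlib


def resolve_hook(value):
    """Turn a "python:module.name" hook value into a callable object."""
    prefix = "python:"
    if not value.startswith(prefix):
        raise ValueError("not an in-process hook: %r" % value)
    # Split the fully-qualified name at its last full stop.
    module_name, _dot, attr = value[len(prefix):].rpartition(".")
    module = importlib.import_module(module_name)
    hook = getattr(module, attr)
    if not callable(hook):
        raise TypeError("%s is not callable" % attr)
    return hook


# Stand-in target from the standard library.
join = resolve_hook("python:os.path.join")
```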
bos@37 | 366 |
bos@37 | 367 \subsection{Writing an in-process hook} |
bos@37 | 368 |
bos@37 | 369 The simplest in-process hook does nothing, but illustrates the basic |
bos@37 | 370 shape of the hook API: |
bos@37 | 371 \begin{codesample2} |
bos@37 | 372 def myhook(ui, repo, **kwargs): |
bos@37 | 373 pass |
bos@37 | 374 \end{codesample2} |
bos@37 | 375 The first argument to a Python hook is always a |
bos@37 | 376 \pymodclass{mercurial.ui}{ui} object. The second is a repository object; |
bos@37 | 377 at the moment, it is always an instance of |
bos@37 | 378 \pymodclass{mercurial.localrepo}{localrepository}. Following these two |
bos@37 | 379 arguments are other keyword arguments. Which ones are passed in |
bos@37 | 380 depends on the hook being called, but a hook can ignore arguments it |
bos@37 | 381 doesn't care about by dropping them into a keyword argument dict, as |
bos@37 | 382 with \texttt{**kwargs} above. |
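A slightly more useful hook might pick out one keyword argument and ignore the rest. The sketch below assumes the hook receives a \texttt{node} parameter and that \texttt{ui.write} accepts the string it is given; depending on the Mercurial version, byte strings may be required instead. The \texttt{FakeUi} class is a hypothetical stand-in so that the hook can be tried outside Mercurial.

```python
def report_commit(ui, repo, node=None, **kwargs):
    """Print the hash of the new changeset, then report success."""
    ui.write("new changeset %s\n" % node)
    return False  # a false value means the hook succeeded


class FakeUi(object):
    """Stand-in for mercurial.ui.ui that records what is written."""

    def __init__(self):
        self.lines = []

    def write(self, text):
        self.lines.append(text)


ui = FakeUi()
# Extra keyword arguments such as "source" land in **kwargs and are ignored.
status = report_commit(ui, None, node="abc123", source="commit")
```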
bos@34 | 383 |
bos@34 | 384 |
bos@34 | 385 %%% Local Variables: |
bos@34 | 386 %%% mode: latex |
bos@34 | 387 %%% TeX-master: "00book" |
bos@34 | 388 %%% End: |