hgbook

diff en/collab.tex @ 184:7b812c428074

Document the ssh protocol, URL syntax, and configuration.
author Bryan O'Sullivan <bos@serpentine.com>
date Thu Apr 05 23:28:06 2007 -0700 (2007-04-05)
parents 5fc4a45c069f
children b60e2de6dbc3
line diff
     1.1 --- a/en/collab.tex	Fri Mar 30 23:05:28 2007 -0700
     1.2 +++ b/en/collab.tex	Thu Apr 05 23:28:06 2007 -0700
     1.3 @@ -189,9 +189,9 @@
     1.4  
     1.5  This model resembles working with feature branches.  The difference is
     1.6  that when a feature branch misses a train, someone on the feature team
     1.7 -pulls and merges the changes that went out on that train release, and
     1.8 -the team continues its work on top of that release so that their
     1.9 -feature can make the next release.
    1.10 +pulls and merges the changes that went out on that train release into
    1.11 +the feature branch, and the team continues its work on top of that
    1.12 +release so that their feature can make the next release.
    1.13  
    1.14  \subsection{The Linux kernel model}
    1.15  
    1.16 @@ -223,12 +223,16 @@
    1.17  to Linus.  In addition, there are several well known branches that
    1.18  people use for different purposes.  For example, a few people maintain
    1.19  ``stable'' repositories of older versions of the kernel, to which they
    1.20 -apply critical fixes as needed.
    1.21 +apply critical fixes as needed.  Some maintainers publish multiple
    1.22 +trees: one for experimental changes; one for changes that they are
    1.23 +about to feed upstream; and so on.  Others just publish a single
    1.24 +tree.
    1.25  
    1.26  This model has two notable features.  The first is that it's ``pull
    1.27  only''.  You have to ask, convince, or beg another developer to take a
    1.28 -change from you, because there are no shared trees, and there's no way
    1.29 -to push changes into a tree that someone else controls.
    1.30 +change from you, because there are almost no trees to which more than
    1.31 +one person can push, and there's no way to push changes into a tree
    1.32 +that someone else controls.
    1.33  
    1.34  The second is that it's based on reputation and acclaim.  If you're an
    1.35  unknown, Linus will probably ignore changes from you without even
    1.36 @@ -313,10 +317,287 @@
    1.37  correctly, and find out what URL you should send to your
    1.38  collaborators, start it with the \hggopt{-v} option.
    1.39  
    1.40 -\subsection{Using \command{ssh} as a tunnel}
    1.41 +\subsection{Using the Secure Shell (ssh) protocol}
    1.42  \label{sec:collab:ssh}
    1.43  
    1.44 -\subsection{Serving HTTP with a CGI script}
    1.45 +You can pull and push changes securely over a network connection using
    1.46 +the Secure Shell (\texttt{ssh}) protocol.  To use this successfully,
    1.47 +you may have to do a little bit of configuration on the client or
    1.48 +server sides.
    1.49 +
    1.50 +If you're not familiar with ssh, it's a network protocol that lets you
    1.51 +securely communicate with another computer.  To use it with Mercurial,
    1.52 +you'll be setting up one or more user accounts on a server so that
    1.53 +remote users can log in and execute commands.
    1.54 +
    1.55 +(If you \emph{are} familiar with ssh, you'll probably find some of the
    1.56 +material that follows to be elementary in nature.)
    1.57 +
    1.58 +\subsubsection{How to read and write ssh URLs}
    1.59 +
    1.60 +An ssh URL tends to look like this:
    1.61 +\begin{codesample2}
    1.62 +  ssh://bos@hg.serpentine.com:22/hg/hgbook
    1.63 +\end{codesample2}
    1.64 +\begin{enumerate}
    1.65 +\item The ``\texttt{ssh://}'' part tells Mercurial to use the ssh
    1.66 +  protocol.
    1.67 +\item The ``\texttt{bos@}'' component indicates what username to log
    1.68 +  into the server as.  You can leave this out if the remote username
    1.69 +  is the same as your local username.
    1.70 +\item The ``\texttt{hg.serpentine.com}'' gives the hostname of the
    1.71 +  server to log into.
    1.72 +\item The ``:22'' identifies the port number to connect to the server
    1.73 +  on.  The default port is~22, so you only need to specify this part
    1.74 +  if you're \emph{not} using port~22.
    1.75 +\item The remainder of the URL is the local path to the repository on
    1.76 +  the server.
    1.77 +\end{enumerate}
    1.78 +
    1.79 +There's plenty of scope for confusion with the path component of ssh
    1.80 +URLs, as there is no standard way for tools to interpret it.  Some
    1.81 +programs behave differently than others when dealing with these paths.
    1.82 +This isn't an ideal situation, but it's unlikely to change.  Please
    1.83 +read the following paragraphs carefully.
    1.84 +
    1.85 +Mercurial treats the path to a repository on the server as relative to
    1.86 +the remote user's home directory.  For example, if user \texttt{foo}
    1.87 +on the server has a home directory of \dirname{/home/foo}, then an ssh
    1.88 +URL that contains a path component of \dirname{bar}
    1.89 +\emph{really} refers to the directory \dirname{/home/foo/bar}.
    1.90 +
    1.91 +If you want to specify a path relative to another user's home
    1.92 +directory, you can use a path that starts with a tilde character
    1.93 +followed by the user's name (let's call them \texttt{otheruser}), like
    1.94 +this.
    1.95 +\begin{codesample2}
    1.96 +  ssh://server/~otheruser/hg/repo
    1.97 +\end{codesample2}
    1.98 +
    1.99 +And if you really want to specify an \emph{absolute} path on the
   1.100 +server, begin the path component with two slashes, as in this example.
   1.101 +\begin{codesample2}
   1.102 +  ssh://server//absolute/path
   1.103 +\end{codesample2}
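          +
          +As a rough illustration of how these forms are used, the commands
          +below clone a repository through each kind of URL.  The port
          +number~2222 and the path \dirname{/srv/hg/hgbook} are made up for
          +the example; substitute whatever your server actually uses.
          +\begin{codesample2}
          +  # path relative to the home directory of bos, default port
          +  hg clone ssh://bos@hg.serpentine.com/hg/hgbook
          +  # same path, but the server listens on a non-default port
          +  hg clone ssh://bos@hg.serpentine.com:2222/hg/hgbook
          +  # absolute path on the server (note the two slashes)
          +  hg clone ssh://bos@hg.serpentine.com//srv/hg/hgbook
          +\end{codesample2}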
   1.104 +
   1.105 +\subsubsection{Finding an ssh client for your system}
   1.106 +
   1.107 +Almost every Unix-like system comes with OpenSSH preinstalled.  If
   1.108 +you're using such a system, run \Verb|which ssh| to find out if
   1.109 +the \command{ssh} command is installed (it's usually in
   1.110 +\dirname{/usr/bin}).  In the unlikely event that it isn't present,
   1.111 +take a look at your system documentation to figure out how to install
   1.112 +it.
   1.113 +
    1.114 +On Windows, you'll first need to download a suitable ssh
   1.115 +client.  There are two alternatives.
   1.116 +\begin{itemize}
   1.117 +\item Simon Tatham's excellent PuTTY package~\cite{web:putty} provides
   1.118 +  a complete suite of ssh client commands.
   1.119 +\item If you have a high tolerance for pain, you can use the Cygwin
   1.120 +  port of OpenSSH.
   1.121 +\end{itemize}
   1.122 +In either case, you'll need to edit your \hgini\ file to tell
   1.123 +Mercurial where to find the actual client command.  For example, if
   1.124 +you're using PuTTY, you'll need to use the \command{plink} command as
   1.125 +a command-line ssh client.
   1.126 +\begin{codesample2}
   1.127 +  [ui]
   1.128 +  ssh = C:/path/to/plink.exe -ssh -i "C:/path/to/my/private/key"
   1.129 +\end{codesample2}
   1.130 +
   1.131 +\begin{note}
   1.132 +  The path to \command{plink} shouldn't contain any whitespace
   1.133 +  characters, or Mercurial may not be able to run it correctly (so
    1.134 +  putting it in \dirname{C:\\Program Files} is probably not a good
   1.135 +  idea).
   1.136 +\end{note}
   1.137 +
   1.138 +\subsubsection{Generating a key pair}
   1.139 +
   1.140 +To avoid the need to repetitively type a password every time you need
   1.141 +to use your ssh client, I recommend generating a key pair.  On a
   1.142 +Unix-like system, the \command{ssh-keygen} command will do the trick.
   1.143 +On Windows, if you're using PuTTY, the \command{puttygen} command is
   1.144 +what you'll need.
   1.145 +
   1.146 +When you generate a key pair, it's usually \emph{highly} advisable to
   1.147 +protect it with a passphrase.  (The only time that you might not want
    1.148 +to do this is when you're using the ssh protocol for automated tasks
   1.149 +on a secure network.)
   1.150 +
   1.151 +Simply generating a key pair isn't enough, however.  You'll need to
   1.152 +add the public key to the set of authorised keys for whatever user
   1.153 +you're logging in remotely as.  For servers using OpenSSH (the vast
   1.154 +majority), this will mean adding the public key to a list in a file
   1.155 +called \sfilename{authorized\_keys} in their \sdirname{.ssh}
   1.156 +directory.
   1.157 +
   1.158 +On a Unix-like system, your public key will have a \filename{.pub}
   1.159 +extension.  If you're using \command{puttygen} on Windows, you can
   1.160 +save the public key to a file of your choosing, or paste it from the
   1.161 +window it's displayed in straight into the
   1.162 +\sfilename{authorized\_keys} file.
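          +
          +A minimal sketch of the whole sequence on a Unix-like client with
          +OpenSSH might look like the following.  The key type, the resulting
          +default file name \sfilename{id\_ed25519.pub}, and the account
          +\texttt{bos@myserver} are assumptions made for the example; an older
          +server may only accept RSA keys.
          +\begin{codesample2}
          +  # generate a key pair; you will be prompted for a passphrase
          +  ssh-keygen -t ed25519
          +  # append the public key to the remote user's authorized_keys file
          +  cat ~/.ssh/id_ed25519.pub | \
          +    ssh bos@myserver 'mkdir -p .ssh && cat >> .ssh/authorized_keys'
          +\end{codesample2}
          +You'll be asked for the remote user's password one last time when
          +the second command runs, since the key isn't authorised until it
          +completes.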
   1.163 +
   1.164 +\subsubsection{Using an authentication agent}
   1.165 +
   1.166 +An authentication agent is a daemon that stores passphrases in memory
   1.167 +(so it will forget passphrases if you log out and log back in again).
   1.168 +An ssh client will notice if it's running, and query it for a
   1.169 +passphrase.  If there's no authentication agent running, or the agent
   1.170 +doesn't store the necessary passphrase, you'll have to type your
   1.171 +passphrase every time Mercurial tries to communicate with a server on
   1.172 +your behalf (e.g.~whenever you pull or push changes).
   1.173 +
   1.174 +The downside of storing passphrases in an agent is that it's possible
   1.175 +for a well-prepared attacker to recover the plain text of your
   1.176 +passphrases, in some cases even if your system has been power-cycled.
   1.177 +You should make your own judgment as to whether this is an acceptable
   1.178 +risk.  It certainly saves a lot of repeated typing.
   1.179 +
   1.180 +On Unix-like systems, the agent is called \command{ssh-agent}, and
   1.181 +it's often run automatically for you when you log in.  You'll need to
   1.182 +use the \command{ssh-add} command to add passphrases to the agent's
   1.183 +store.  On Windows, if you're using PuTTY, the \command{pageant}
   1.184 +command acts as the agent.  It adds an icon to your system tray that
   1.185 +will let you manage stored passphrases.
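          +
          +On a Unix-like system where no agent is running yet, a session
          +might begin as sketched here.  This assumes a Bourne-style shell and
          +a key stored in the default location.
          +\begin{codesample2}
          +  # start an agent and export its settings into this shell
          +  eval "$(ssh-agent -s)"
          +  # add the default key; you are asked for its passphrase once
          +  ssh-add
          +  # list the keys the agent currently holds
          +  ssh-add -l
          +\end{codesample2}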
   1.186 +
   1.187 +\subsubsection{Configuring the server side properly}
   1.188 +
   1.189 +Because ssh can be fiddly to set up if you're new to it, there's a
   1.190 +variety of things that can go wrong.  Add Mercurial on top, and
   1.191 +there's plenty more scope for head-scratching.  Most of these
   1.192 +potential problems occur on the server side, not the client side.  The
   1.193 +good news is that once you've gotten a configuration working, it will
   1.194 +usually continue to work indefinitely.
   1.195 +
   1.196 +Before you try using Mercurial to talk to an ssh server, it's best to
   1.197 +make sure that you can use the normal \command{ssh} or \command{putty}
   1.198 +command to talk to the server first.  If you run into problems with
   1.199 +using these commands directly, Mercurial surely won't work.  Worse, it
   1.200 +will obscure the underlying problem.  Any time you want to debug
   1.201 +ssh-related Mercurial problems, you should drop back to making sure
   1.202 +that plain ssh client commands work first, \emph{before} you worry
   1.203 +about whether there's a problem with Mercurial.
   1.204 +
   1.205 +The first thing to be sure of on the server side is that you can
   1.206 +actually log in from another machine at all.  If you can't use
   1.207 +\command{ssh} or \command{putty} to log in, the error message you get
   1.208 +may give you a few hints as to what's wrong.  The most common problems
   1.209 +are as follows.
   1.210 +\begin{itemize}
   1.211 +\item If you get a ``connection refused'' error, either there isn't an
   1.212 +  SSH daemon running on the server at all, or it's inaccessible due to
   1.213 +  firewall configuration.
   1.214 +\item If you get a ``no route to host'' error, you either have an
   1.215 +  incorrect address for the server or a seriously locked down firewall
   1.216 +  that won't admit its existence at all.
   1.217 +\item If you get a ``permission denied'' error, you may have mistyped
   1.218 +  the username on the server, or you could have mistyped your key's
   1.219 +  passphrase or the remote user's password.
   1.220 +\end{itemize}
   1.221 +In summary, if you're having trouble talking to the server's ssh
   1.222 +daemon, first make sure that one is running at all.  On many systems
   1.223 +it will be installed, but disabled, by default.  Once you're done with
   1.224 +this step, you should then check that the server's firewall is
   1.225 +configured to allow incoming connections on the port the ssh daemon is
   1.226 +listening on (usually~22).  Don't worry about more exotic
   1.227 +possibilities for misconfiguration until you've checked these two
   1.228 +first.
   1.229 +
   1.230 +If you're using an authentication agent on the client side to store
   1.231 +passphrases for your keys, you ought to be able to log into the server
   1.232 +without being prompted for a passphrase or a password.  If you're
   1.233 +prompted for a passphrase, there are a few possible culprits.
   1.234 +\begin{itemize}
   1.235 +\item You might have forgotten to use \command{ssh-add} or
   1.236 +  \command{pageant} to store the passphrase.
   1.237 +\item You might have stored the passphrase for the wrong key.
   1.238 +\end{itemize}
   1.239 +If you're being prompted for the remote user's password, there are
   1.240 +another few possible problems to check.
   1.241 +\begin{itemize}
   1.242 +\item Either the user's home directory or their \sdirname{.ssh}
   1.243 +  directory might have excessively liberal permissions.  As a result,
   1.244 +  the ssh daemon will not trust or read their
   1.245 +  \sfilename{authorized\_keys} file.  For example, a group-writable
   1.246 +  home or \sdirname{.ssh} directory will often cause this symptom.
   1.247 +\item The user's \sfilename{authorized\_keys} file may have a problem.
   1.248 +  If anyone other than the user owns or can write to that file, the
   1.249 +  ssh daemon will not trust or read it.
   1.250 +\end{itemize}
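          +
          +If you suspect a permissions problem of this kind, commands along
          +these lines, run on the server as the affected user, usually sort it
          +out.  The exact modes shown are a common conservative choice, not
          +the only ones the ssh daemon will accept.
          +\begin{codesample2}
          +  # remove group and world write access from the home directory
          +  chmod go-w ~
          +  # restrict the .ssh directory and key list to the owner
          +  chmod 700 ~/.ssh
          +  chmod 600 ~/.ssh/authorized_keys
          +\end{codesample2}
          +Running \Verb|ssh -v myserver| on the client prints a trace of the
          +authentication methods that are tried, which can help to confirm
          +whether your key is being offered at all.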
   1.251 +
    1.252 +In an ideal world, you should be able to run the following command
   1.253 +successfully, and it should print exactly one line of output, the
   1.254 +current date and time.
   1.255 +\begin{codesample2}
   1.256 +  ssh myserver date
   1.257 +\end{codesample2}
   1.258 +
   1.259 +If on your server you have login scripts that print banners or other
   1.260 +junk even when running non-interactive commands like this, you should
   1.261 +fix them before you continue, so that they only print output if
   1.262 +they're run interactively.  Otherwise these banners will at least
   1.263 +clutter up Mercurial's output.  Worse, they could potentially cause
   1.264 +problems with running Mercurial commands remotely.  (The usual way to
   1.265 +see if a login script is running in an interactive shell is to check
   1.266 +the return code from the command \Verb|tty -s|.)
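          +
          +For instance, a banner in a Bourne-style login script could be
          +guarded roughly as follows.  The banner text is only a placeholder.
          +\begin{codesample2}
          +  # tty -s exits with status 0 only if standard input is a terminal
          +  if tty -s; then
          +    echo "Welcome to myserver"
          +  fi
          +\end{codesample2}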
   1.267 +
   1.268 +Once you've verified that plain old ssh is working with your server,
   1.269 +the next step is to ensure that Mercurial runs on the server.  The
   1.270 +following command should run successfully:
   1.271 +\begin{codesample2}
   1.272 +  ssh myserver hg version
   1.273 +\end{codesample2}
   1.274 +If you see an error message instead of normal \hgcmd{version} output,
   1.275 +this is usually because you haven't installed Mercurial to
   1.276 +\dirname{/usr/bin}.  Don't worry if this is the case; you don't need
   1.277 +to do that.  But you should check for a few possible problems.
   1.278 +\begin{itemize}
   1.279 +\item Is Mercurial really installed on the server at all?  I know this
   1.280 +  sounds trivial, but it's worth checking!
   1.281 +\item Maybe your shell's search path (usually set via the \envar{PATH}
   1.282 +  environment variable) is simply misconfigured.
   1.283 +\item Perhaps your \envar{PATH} environment variable is only being set
   1.284 +  to point to the location of the \command{hg} executable if the login
   1.285 +  session is interactive.  This can happen if you're setting the path
   1.286 +  in the wrong shell login script.  See your shell's documentation for
   1.287 +  details.
   1.288 +\item The \envar{PYTHONPATH} environment variable may need to contain
   1.289 +  the path to the Mercurial Python modules.  It might not be set at
   1.290 +  all; it could be incorrect; or it may be set only if the login is
   1.291 +  interactive.
   1.292 +\end{itemize}
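          +
          +To see what search path a non-interactive session actually gets,
          +you can ask the server directly.
          +\begin{codesample2}
          +  # single quotes make the remote shell expand PATH, not the local one
          +  ssh myserver 'echo $PATH'
          +\end{codesample2}
          +If \command{hg} lives somewhere that isn't on that path, one
          +possible workaround is to tell Mercurial on the client side where to
          +find it, using the \texttt{remotecmd} setting.  The location
          +\dirname{/home/bos/bin/hg} is hypothetical, and you should check
          +that your version of Mercurial supports this setting.
          +\begin{codesample2}
          +  [ui]
          +  # hypothetical location of the hg executable on the server
          +  remotecmd = /home/bos/bin/hg
          +\end{codesample2}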
   1.293 +
   1.294 +If you can run \hgcmd{version} over an ssh connection, well done!
   1.295 +You've got the server and client sorted out.  You should now be able
   1.296 +to use Mercurial to access repositories hosted by that username on
   1.297 +that server.  If you run into problems with Mercurial and ssh at this
   1.298 +point, try using the \hggopt{--debug} option to get a clearer picture
   1.299 +of what's going on.
   1.300 +
   1.301 +\subsubsection{Using compression with ssh}
   1.302 +
   1.303 +Mercurial does not compress data when it uses the ssh protocol,
   1.304 +because the ssh protocol can transparently compress data.  However,
   1.305 +the default behaviour of ssh clients is \emph{not} to request
   1.306 +compression.
   1.307 +
   1.308 +Over any network other than a fast LAN (even a wireless network),
   1.309 +using compression is likely to significantly speed up Mercurial's
   1.310 +network operations.  For example, over a WAN, someone measured
   1.311 +compression as reducing the amount of time required to clone a
   1.312 +particularly large repository from~51 minutes to~17 minutes.
   1.313 +
   1.314 +Both \command{ssh} and \command{plink} accept a \cmdopt{ssh}{-C}
   1.315 +option which turns on compression.  You can easily edit your \hgrc\ to
   1.316 +enable compression for all of Mercurial's uses of the ssh protocol.
   1.317 +\begin{codesample2}
   1.318 +  [ui]
   1.319 +  ssh = ssh -C
   1.320 +\end{codesample2}
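          +
          +On Windows with PuTTY, a sketch of the equivalent setting is to add
          +the same option to the \command{plink} command line in your \hgini\
          +file.  The paths are the same placeholders used in the earlier
          +\command{plink} example.
          +\begin{codesample2}
          +  [ui]
          +  ssh = C:/path/to/plink.exe -ssh -C -i "C:/path/to/my/private/key"
          +\end{codesample2}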
   1.321 +
   1.322 +\subsection{Serving over HTTP with a CGI script}
   1.323  \label{sec:collab:cgi}
   1.324  
   1.325