hgbook

diff es/filenames.tex @ 429:e5c5918e8230

Started translation of cgi configuration section
author Igor TAmara <igor@tamarapatino.org>
date Sat Nov 29 19:46:33 2008 -0500 (2008-11-29)
parents 04c08ad7e92e
children 3afc654d70e5
     1.1 --- a/es/filenames.tex	Sat Oct 18 07:48:21 2008 -0500
     1.2 +++ b/es/filenames.tex	Sat Nov 29 19:46:33 2008 -0500
     1.3 @@ -0,0 +1,306 @@
     1.4 +\chapter{File names and pattern matching}
     1.5 +\label{chap:names}
     1.6 +
     1.7 +Mercurial provides mechanisms that let you work with file names in a
     1.8 +consistent and expressive way.
     1.9 +
    1.10 +\section{Simple file naming}
    1.11 +
    1.12 +Mercurial uses a unified piece of machinery ``under the hood'' to
    1.13 +handle file names.  Every command behaves uniformly with respect to
    1.14 +file names.  The way in which commands work with file names is as
    1.15 +follows.
    1.16 +
    1.17 +If you explicitly name real files on the command line, Mercurial works
    1.18 +with exactly those files, as you would expect.
    1.19 +\interaction{filenames.files}
    1.20 +
    1.21 +When you provide a directory name, Mercurial will interpret this as
    1.22 +``operate on every file in this directory and its subdirectories''.
    1.23 +Mercurial traverses the files and subdirectories in a directory in
    1.24 +alphabetical order.  When it encounters a subdirectory, it will
    1.25 +traverse that subdirectory before continuing with the current
    1.26 +directory.
    1.27 +\interaction{filenames.dirs}
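The traversal order described above can be pictured with a short Python sketch. This is only an illustration of the stated rule, not Mercurial's own code; the function name \texttt{walk\_sorted} is hypothetical, and plain string sorting stands in for ``alphabetical order''.

```python
import os

def walk_sorted(top):
    """Yield file paths under top in sorted order, descending into each
    subdirectory as soon as it is encountered (hypothetical helper)."""
    for entry in sorted(os.scandir(top), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            # Finish the subdirectory before continuing with its siblings.
            yield from walk_sorted(entry.path)
        else:
            yield entry.path
```

Given a directory holding \texttt{a}, \texttt{b/x} and \texttt{c}, this sketch visits \texttt{a}, then \texttt{b/x}, then \texttt{c}.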
    1.28 +
    1.29 +\section{Running commands without any file names}
    1.30 +
    1.31 +Mercurial's commands that work with file names have useful default
    1.32 +behaviours when you invoke them without providing any file names or
    1.33 +patterns.  What kind of behaviour you should expect depends on what
    1.34 +the command does.  Here are a few rules of thumb you can use to
    1.35 +predict what a command is likely to do if you don't give it any names
    1.36 +to work with.
    1.37 +\begin{itemize}
    1.38 +\item Most commands will operate on the entire working directory.
    1.39 +  This is what the \hgcmd{add} command does, for example.
    1.40 +\item If the command has effects that are difficult or impossible to
    1.41 +  reverse, it will force you to explicitly provide at least one name
    1.42 +  or pattern (see below).  This protects you from accidentally
    1.43 +  deleting files by running \hgcmd{remove} with no arguments, for
    1.44 +  example.
    1.45 +\end{itemize}
    1.46 +
    1.47 +It's easy to work around these default behaviours if they don't suit
    1.48 +you.  If a command normally operates on the whole working directory,
    1.49 +you can invoke it on just the current directory and its subdirectories
    1.50 +by giving it the name ``\dirname{.}''.
    1.51 +\interaction{filenames.wdir-subdir}
    1.52 +
    1.53 +Along the same lines, some commands normally print file names relative
    1.54 +to the root of the repository, even if you're invoking them from a
    1.55 +subdirectory.  Such a command will print file names relative to your
    1.56 +subdirectory if you give it explicit names.  Here, we're going to run
    1.57 +\hgcmd{status} from a subdirectory, and get it to operate on the
    1.58 +entire working directory while printing file names relative to our
    1.59 +subdirectory, by passing it the output of the \hgcmd{root} command.
    1.60 +\interaction{filenames.wdir-relname}
    1.61 +
    1.62 +\section{Telling you what's going on}
    1.63 +
    1.64 +The \hgcmd{add} example in the preceding section illustrates something
    1.65 +else that's helpful about Mercurial commands.  If a command operates
    1.66 +on a file that you didn't name explicitly on the command line, it will
    1.67 +usually print the name of the file, so that you will not be surprised
    1.68 +by what's going on.
    1.69 +
    1.70 +The principle here is of \emph{least surprise}.  If you've exactly
    1.71 +named a file on the command line, there's no point in repeating it
    1.72 +back at you.  If Mercurial is acting on a file \emph{implicitly},
    1.73 +because you provided no names, or a directory, or a pattern (see
    1.74 +below), it's safest to tell you what it's doing.
    1.75 +
    1.76 +For commands that behave this way, you can silence them using the
    1.77 +\hggopt{-q} option.  You can also get them to print the name of every
    1.78 +file, even those you've named explicitly, using the \hggopt{-v}
    1.79 +option.
    1.80 +
    1.81 +\section{Using patterns to identify files}
    1.82 +
    1.83 +In addition to working with file and directory names, Mercurial lets
    1.84 +you use \emph{patterns} to identify files.  Mercurial's pattern
    1.85 +handling is expressive.
    1.86 +
    1.87 +On Unix-like systems (Linux, MacOS, etc.), the job of matching file
    1.88 +names to patterns normally falls to the shell.  On these systems, you
    1.89 +must explicitly tell Mercurial that a name is a pattern.  On Windows,
    1.90 +the shell does not expand patterns, so Mercurial will automatically
    1.91 +identify names that are patterns, and expand them for you.
    1.92 +
    1.93 +To provide a pattern in place of a regular name on the command line,
    1.94 +the mechanism is simple:
    1.95 +\begin{codesample2}
    1.96 +  syntax:patternbody
    1.97 +\end{codesample2}
    1.98 +That is, a pattern is identified by a short text string that says what
    1.99 +kind of pattern this is, followed by a colon, followed by the actual
   1.100 +pattern.
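To make the \texttt{syntax:patternbody} form concrete, here is a minimal Python sketch of how such an argument could be split. The names \texttt{KNOWN\_KINDS} and \texttt{split\_pattern} are hypothetical, and returning \texttt{None} for a plain name is my own convention for this example, not something Mercurial defines.

```python
# The two pattern syntaxes named in this chapter.
KNOWN_KINDS = ("glob", "re")

def split_pattern(arg):
    """Split 'kind:body' into (kind, body).

    An argument without a recognised prefix is treated as a plain
    name and returned as (None, arg); this is an assumption made
    for the sketch.
    """
    kind, sep, body = arg.partition(":")
    if sep and kind in KNOWN_KINDS:
        return kind, body
    return None, arg
```

Only the first colon separates the syntax from the pattern, so \texttt{re:a:b} yields the pattern body \texttt{a:b}.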
   1.101 +
   1.102 +Mercurial supports two kinds of pattern syntax.  The most frequently
   1.103 +used is called \texttt{glob}; this is the same kind of pattern
   1.104 +matching used by the Unix shell, and should be familiar to Windows
   1.105 +command prompt users, too.  
   1.106 +
   1.107 +When Mercurial does automatic pattern matching on Windows, it uses
   1.108 +\texttt{glob} syntax.  You can thus omit the ``\texttt{glob:}'' prefix
   1.109 +on Windows, but it's safe to use it, too.
   1.110 +
   1.111 +The \texttt{re} syntax is more powerful; it lets you specify patterns
   1.112 +using regular expressions, also known as regexps.
   1.113 +
   1.114 +By the way, in the examples that follow, notice that I'm careful to
   1.115 +wrap all of my patterns in quote characters, so that they won't get
   1.116 +expanded by the shell before Mercurial sees them.
   1.117 +
   1.118 +\subsection{Shell-style \texttt{glob} patterns}
   1.119 +
   1.120 +This is an overview of the kinds of patterns you can use when you're
   1.121 +matching on glob patterns.
   1.122 +
   1.123 +The ``\texttt{*}'' character matches any string, within a single
   1.124 +directory.
   1.125 +\interaction{filenames.glob.star}
   1.126 +
   1.127 +The ``\texttt{**}'' pattern matches any string, and crosses directory
   1.128 +boundaries.  It's not a standard Unix glob token, but it's accepted by
   1.129 +several popular Unix shells, and is very useful.
   1.130 +\interaction{filenames.glob.starstar}
   1.131 +
   1.132 +The ``\texttt{?}'' pattern matches any single character.
   1.133 +\interaction{filenames.glob.question}
   1.134 +
   1.135 +The ``\texttt{[}'' character begins a \emph{character class}.  This
   1.136 +matches any single character within the class.  The class ends with a
   1.137 +``\texttt{]}'' character.  A class may contain multiple \emph{range}s
   1.138 +of the form ``\texttt{a-f}'', which is shorthand for
   1.139 +``\texttt{abcdef}''.
   1.140 +\interaction{filenames.glob.range}
   1.141 +If the first character after the ``\texttt{[}'' in a character class
   1.142 +is a ``\texttt{!}'', it \emph{negates} the class, making it match any
   1.143 +single character not in the class.
   1.144 +
   1.145 +A ``\texttt{\{}'' begins a group of subpatterns, where the whole group
   1.146 +matches if any subpattern in the group matches.  The ``\texttt{,}''
   1.147 +character separates subpatterns, and ``\texttt{\}}'' ends the group.
   1.148 +\interaction{filenames.glob.group}
   1.149 +
   1.150 +\subsubsection{Watch out!}
   1.151 +
   1.152 +Don't forget that if you want to match a pattern in any directory, you
   1.153 +should not be using the ``\texttt{*}'' match-any token, as this will
   1.154 +only match within one directory.  Instead, use the ``\texttt{**}''
   1.155 +token.  This small example illustrates the difference between the two.
   1.156 +\interaction{filenames.glob.star-starstar}
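One way to keep the glob tokens straight is to see how each could be rewritten as a regular expression. The Python sketch below is an approximation written for this purpose, under the assumption that patterns are well formed; it is not Mercurial's translator, and the names \texttt{glob\_to\_regex} and \texttt{glob\_match} are hypothetical.

```python
import re

def glob_to_regex(pat):
    """Translate a glob pattern into an anchored regexp string."""
    i, n = 0, len(pat)
    out = []
    depth = 0  # nesting depth of {...} groups
    while i < n:
        c = pat[i]
        i += 1
        if c == "*":
            if i < n and pat[i] == "*":
                i += 1
                out.append(".*")      # "**" crosses directory boundaries
            else:
                out.append("[^/]*")   # "*" stays within one directory
        elif c == "?":
            out.append(".")           # any single character
        elif c == "[":
            j = pat.find("]", i + 1)
            if j == -1:
                out.append(re.escape(c))  # unterminated class: literal "["
            else:
                body = pat[i:j].replace("\\", "\\\\")
                i = j + 1
                if body.startswith("!"):
                    body = "^" + body[1:]   # "!" negates the class
                elif body.startswith("^"):
                    body = "\\" + body      # keep a leading "^" literal
                out.append("[" + body + "]")
        elif c == "{":
            depth += 1
            out.append("(?:")
        elif c == "}" and depth:
            depth -= 1
            out.append(")")
        elif c == "," and depth:
            out.append("|")           # "," separates subpatterns
        else:
            out.append(re.escape(c))
    out.append(")" * depth)           # close any group left open
    return "".join(out) + r"\Z"

def glob_match(pat, path):
    """Return True if the whole path matches the glob pattern."""
    return re.match(glob_to_regex(pat), path) is not None
```

With this sketch, \texttt{glob\_match("*.py", "src/a.py")} is false because ``\texttt{*}'' stops at the directory separator, while \texttt{glob\_match("**.py", "src/a.py")} is true.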
   1.157 +
   1.158 +\subsection{Regular expression matching with \texttt{re} patterns}
   1.159 +
   1.160 +Mercurial accepts the same regular expression syntax as the Python
   1.161 +programming language (it uses Python's regexp engine internally).
   1.162 +This is based on the Perl language's regexp syntax, which is the most
   1.163 +popular dialect in use (it's also used in Java, for example).
   1.164 +
   1.165 +I won't discuss Mercurial's regexp dialect in any detail here, as
   1.166 +regexps are not often used.  Perl-style regexps are in any case
   1.167 +already exhaustively documented on a multitude of web sites, and in
   1.168 +many books.  Instead, I will focus here on a few things you should
   1.169 +know if you find yourself needing to use regexps with Mercurial.
   1.170 +
   1.171 +A regexp is matched against an entire file name, relative to the root
   1.172 +of the repository.  In other words, even if you're already in
   1.173 +subdirectory \dirname{foo}, if you want to match files under this
   1.174 +directory, your pattern must start with ``\texttt{foo/}''.
   1.175 +
   1.176 +One thing to note, if you're familiar with Perl-style regexps, is that
   1.177 +Mercurial's are \emph{rooted}.  That is, a regexp starts matching
   1.178 +against the beginning of a string; it doesn't look for a match
   1.179 +anywhere within the string.  To match anywhere in a string, start
   1.180 +your pattern with ``\texttt{.*}''.
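Python's \texttt{re.match} function has the same rooted behaviour, so it makes a convenient way to try the idea out. The paths below are invented for the example, and I am using \texttt{re.match} here as a stand-in, without claiming this is exactly how Mercurial calls the regexp engine.

```python
import re

# Hypothetical file names, relative to the repository root.
paths = ["foo/bar.c", "src/foo/bar.c"]

# re.match only succeeds if the pattern matches at the start of the string.
rooted = [p for p in paths if re.match(r"foo/", p)]

# Prefixing the pattern with ".*" lets the rest match anywhere in the name.
anywhere = [p for p in paths if re.match(r".*foo/", p)]
```

Here \texttt{rooted} contains only \texttt{foo/bar.c}, while \texttt{anywhere} contains both names.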
   1.181 +
   1.182 +\section{Filtering files}
   1.183 +
   1.184 +Not only does Mercurial give you a variety of ways to specify files;
   1.185 +it lets you further winnow those files using \emph{filters}.  Commands
   1.186 +that work with file names accept two filtering options.
   1.187 +\begin{itemize}
   1.188 +\item \hggopt{-I}, or \hggopt{--include}, lets you specify a pattern
   1.189 +  that file names must match in order to be processed.
   1.190 +\item \hggopt{-X}, or \hggopt{--exclude}, gives you a way to
   1.191 +  \emph{avoid} processing files, if they match this pattern.
   1.192 +\end{itemize}
   1.193 +You can provide multiple \hggopt{-I} and \hggopt{-X} options on the
   1.194 +command line, and intermix them as you please.  Mercurial interprets
   1.195 +the patterns you provide using glob syntax by default (but you can use
   1.196 +regexps if you need to).
   1.197 +
   1.198 +You can read a \hggopt{-I} filter as ``process only the files that
   1.199 +match this filter''.
   1.200 +\interaction{filenames.filter.include}
   1.201 +The \hggopt{-X} filter is best read as ``process only the files that
   1.202 +don't match this pattern''.
   1.203 +\interaction{filenames.filter.exclude}
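The way the two filters combine can be sketched in a few lines of Python. This is a simplified model, assuming that a file passes if it matches at least one include pattern (when any are given) and no exclude pattern. The function name \texttt{select} is hypothetical, and Python's \texttt{fnmatchcase} is only a rough substitute for Mercurial's glob matching, since its ``\texttt{*}'' also matches directory separators.

```python
from fnmatch import fnmatchcase

def select(names, include=(), exclude=()):
    """Return the names that pass the include and exclude filters."""
    selected = []
    for name in names:
        # With include patterns present, a name must match one of them.
        if include and not any(fnmatchcase(name, p) for p in include):
            continue
        # A name matching any exclude pattern is always skipped.
        if any(fnmatchcase(name, p) for p in exclude):
            continue
        selected.append(name)
    return selected
```

For the names \texttt{a.c}, \texttt{a.h}, \texttt{b.c} and \texttt{README}, including \texttt{*.c} keeps \texttt{a.c} and \texttt{b.c}, while excluding \texttt{*.c} keeps \texttt{a.h} and \texttt{README}.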
   1.204 +
   1.205 +\section{Ignoring unwanted files and directories}
   1.206 +
   1.207 +XXX.
   1.208 +
   1.209 +\section{Case sensitivity}
   1.210 +\label{sec:names:case}
   1.211 +
   1.212 +If you're working in a mixed development environment that contains
   1.213 +both Linux (or other Unix) systems and Macs or Windows systems, you
   1.214 +should keep in the back of your mind the knowledge that they treat the
   1.215 +case (``N'' versus ``n'') of file names in incompatible ways.  This is
   1.216 +not very likely to affect you, and it's easy to deal with if it does,
   1.217 +but it could surprise you if you don't know about it.
   1.218 +
   1.219 +Operating systems and filesystems differ in the way they handle the
   1.220 +\emph{case} of characters in file and directory names.  There are
   1.221 +three common ways to handle case in names.
   1.222 +\begin{itemize}
   1.223 +\item Completely case insensitive.  Uppercase and lowercase versions
   1.224 +  of a letter are treated as identical, both when creating a file and
   1.225 +  during subsequent accesses.  This is common on older DOS-based
   1.226 +  systems.
   1.227 +\item Case preserving, but insensitive.  When a file or directory is
   1.228 +  created, the case of its name is stored, and can be retrieved and
   1.229 +  displayed by the operating system.  When an existing file is being
   1.230 +  looked up, its case is ignored.  This is the standard arrangement on
   1.231 +  Windows and MacOS.  The names \filename{foo} and \filename{FoO}
   1.232 +  identify the same file.  This treatment of uppercase and lowercase
   1.233 +  letters as interchangeable is also referred to as \emph{case
   1.234 +    folding}.
   1.235 +\item Case sensitive.  The case of a name is significant at all times.
   1.236 +  The names \filename{foo} and \filename{FoO} identify different files.  This
   1.237 +  is the way Linux and Unix systems normally work.
   1.238 +\end{itemize}
   1.239 +
   1.240 +On Unix-like systems, it is possible to have any or all of the above
   1.241 +ways of handling case in action at once.  For example, if you use a
   1.242 +USB thumb drive formatted with a FAT32 filesystem on a Linux system,
   1.243 +Linux will handle names on that filesystem in a case preserving, but
   1.244 +insensitive, way.
   1.245 +
   1.246 +\subsection{Safe, portable repository storage}
   1.247 +
   1.248 +Mercurial's repository storage mechanism is \emph{case safe}.  It
   1.249 +translates file names so that they can be safely stored on both case
   1.250 +sensitive and case insensitive filesystems.  This means that you can
   1.251 +use normal file copying tools to transfer a Mercurial repository onto,
   1.252 +for example, a USB thumb drive, and safely move that drive and
   1.253 +repository back and forth between a Mac, a PC running Windows, and a
   1.254 +Linux box.
   1.255 +
   1.256 +\subsection{Detecting case conflicts}
   1.257 +
   1.258 +When operating in the working directory, Mercurial honours the naming
   1.259 +policy of the filesystem where the working directory is located.  If
   1.260 +the filesystem is case preserving, but insensitive, Mercurial will
   1.261 +treat names that differ only in case as the same.
   1.262 +
   1.263 +An important aspect of this approach is that it is possible to commit
   1.264 +a changeset on a case sensitive (typically Linux or Unix) filesystem
   1.265 +that will cause trouble for users of case insensitive (usually Windows
   1.266 +and MacOS) filesystems.  If a Linux user commits changes to two files, one
   1.267 +named \filename{myfile.c} and the other named \filename{MyFile.C},
   1.268 +they will be stored correctly in the repository.  And in the working
   1.269 +directories of other Linux users, they will be correctly represented
   1.270 +as separate files.
   1.271 +
   1.272 +If a Windows or Mac user pulls this change, they will not initially
   1.273 +have a problem, because Mercurial's repository storage mechanism is
   1.274 +case safe.  However, once they try to \hgcmd{update} the working
   1.275 +directory to that changeset, or \hgcmd{merge} with that changeset,
   1.276 +Mercurial will spot the conflict between the two file names that the
   1.277 +filesystem would treat as the same, and forbid the update or merge
   1.278 +from occurring.
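The kind of check involved can be illustrated with a small Python sketch that groups names differing only in case. It is not Mercurial's detection code: the function name \texttt{case\_collisions} is hypothetical, and \texttt{str.casefold} is an assumption standing in for whatever folding rules a real filesystem applies.

```python
from collections import defaultdict

def case_collisions(names):
    """Return groups of names that differ only in case."""
    groups = defaultdict(list)
    for name in names:
        # Names with the same folded form would clash on a
        # case insensitive filesystem.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

Applied to \texttt{myfile.c}, \texttt{MyFile.C} and \texttt{README}, the sketch reports the first two names as one colliding group.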
   1.279 +
   1.280 +\subsection{Fixing a case conflict}
   1.281 +
   1.282 +If you are using Windows or a Mac in a mixed environment where some of
   1.283 +your collaborators are using Linux or Unix, and Mercurial reports a
   1.284 +case folding conflict when you try to \hgcmd{update} or \hgcmd{merge},
   1.285 +the procedure to fix the problem is simple.
   1.286 +
   1.287 +Just find a nearby Linux or Unix box, clone the problem repository
   1.288 +onto it, and use Mercurial's \hgcmd{rename} command to change the
   1.289 +names of any offending files or directories so that they will no
   1.290 +longer cause case folding conflicts.  Commit this change, \hgcmd{pull}
   1.291 +or \hgcmd{push} it across to your Windows or MacOS system, and
   1.292 +\hgcmd{update} to the revision with the non-conflicting names.
   1.293 +
   1.294 +The changeset with case-conflicting names will remain in your
   1.295 +project's history, and you still won't be able to \hgcmd{update} your
   1.296 +working directory to that changeset on a Windows or MacOS system, but
   1.297 +you can continue development unimpeded.
   1.298 +
   1.299 +\begin{note}
   1.300 +  Prior to version~0.9.3, Mercurial did not use a case safe repository
   1.301 +  storage mechanism, and did not detect case folding conflicts.  If
   1.302 +  you are using an older version of Mercurial on Windows or MacOS, I
   1.303 +  strongly recommend that you upgrade.
   1.304 +\end{note}
   1.305 +
   1.306 +%%% Local Variables: 
   1.307 +%%% mode: latex
   1.308 +%%% TeX-master: "00book"
   1.309 +%%% End: