hgbook

view fr/ch07-filenames.xml @ 983:5e1e70fcdfdb

Corrected some errors.
- Indentation problems
- Syntax errors (missing </para>,...)
- French mistakes
author Frédéric Bouquet <youshe.jaalon@gmail.com>
date Tue Sep 08 23:42:42 2009 +0200 (2009-09-08)
parents
children 6f8c48362758
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
3 <chapter>
4 <title>File names and pattern matching</title>
7 <para>Mercurial provides mechanisms that let you work with file names in a
8 consistent and expressive way.</para>
10 <sect1>
11 <title>Simple file naming</title>
13 <para>Mercurial uses a unified piece of machinery <quote>under the hood</quote> to
14 handle file names. Every command behaves uniformly with respect to
15 file names. The way in which commands work with file names is as
16 follows.</para>
18 <para>If you explicitly name real files on the command line, Mercurial works
19 with exactly those files, as you would expect.
20 <!-- &interaction.filenames.files; --></para>
22 <para>When you provide a directory name, Mercurial will interpret this as
23 <quote>operate on every file in this directory and its subdirectories</quote>.
24 Mercurial traverses the files and subdirectories in a directory in
25 alphabetical order. When it encounters a subdirectory, it will
26 traverse that subdirectory before continuing with the current
27 directory.
28 <!-- &interaction.filenames.dirs; --></para>
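<para>The traversal order described above can be sketched in a few lines of
Python. This is only an illustration, not Mercurial's own code: the
function name <literal>walk_sorted</literal> is hypothetical, and plain string
sorting is assumed to stand in for <quote>alphabetical order</quote>.</para>

```python
import os


def walk_sorted(root):
    """Yield file paths under root, visiting entries in sorted order.

    A subdirectory is traversed completely before the walk continues
    with the remaining entries of its parent directory.
    """
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            # Descend now, before moving on to the next sibling.
            yield from walk_sorted(path)
        else:
            yield path
```

<para>With files <filename>a.txt</filename>, <filename>b/c.txt</filename> and
<filename>d.txt</filename>, this sketch yields them in exactly that order.</para>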
30 </sect1>
31 <sect1>
32 <title>Running commands without any file names</title>
34 <para>Mercurial's commands that work with file names have useful default
35 behaviours when you invoke them without providing any file names or
36 patterns. What kind of behaviour you should expect depends on what
37 the command does. Here are a few rules of thumb you can use to
38 predict what a command is likely to do if you don't give it any names
39 to work with.</para>
40 <itemizedlist>
41 <listitem><para>Most commands will operate on the entire working directory.
42 This is what the <command role="hg-cmd">hg add</command> command does, for example.</para>
43 </listitem>
44 <listitem><para>If the command has effects that are difficult or impossible to
45 reverse, it will force you to explicitly provide at least one name
46 or pattern (see below). This protects you from accidentally
47 deleting files by running <command role="hg-cmd">hg remove</command> with no arguments, for
48 example.</para>
49 </listitem></itemizedlist>
51 <para>It's easy to work around these default behaviours if they don't suit
52 you. If a command normally operates on the whole working directory,
53 you can invoke it on just the current directory and its subdirectories
54 by giving it the name <quote><filename class="directory">.</filename></quote>.
55 <!-- &interaction.filenames.wdir-subdir; -->
56 </para>
58 <para>Along the same lines, some commands normally print file names relative
59 to the root of the repository, even if you're invoking them from a
60 subdirectory. Such a command will print file names relative to your
61 subdirectory if you give it explicit names. Here, we're going to run
62 <command role="hg-cmd">hg status</command> from a subdirectory, and get it to operate on the
63 entire working directory while printing file names relative to our
64 subdirectory, by passing it the output of the <command role="hg-cmd">hg root</command> command.
65 <!-- &interaction.filenames.wdir-relname; -->
66 </para>
68 </sect1>
69 <sect1>
70 <title>Telling you what's going on</title>
72 <para>The <command role="hg-cmd">hg add</command> example in the preceding section illustrates something
73 else that's helpful about Mercurial commands. If a command operates
74 on a file that you didn't name explicitly on the command line, it will
75 usually print the name of the file, so that you will not be surprised by
76 what's going on.
77 </para>
79 <para>The principle here is of <emphasis>least surprise</emphasis>. If you've exactly
80 named a file on the command line, there's no point in repeating it
81 back at you. If Mercurial is acting on a file <emphasis>implicitly</emphasis>,
82 because you provided no names, or a directory, or a pattern (see
83 below), it's safest to tell you what it's doing.
84 </para>
86 <para>For commands that behave this way, you can silence them using the
87 <option role="hg-opt-global">-q</option> option. You can also get them to print the name of every
88 file, even those you've named explicitly, using the <option role="hg-opt-global">-v</option>
89 option.
90 </para>
92 </sect1>
93 <sect1>
94 <title>Using patterns to identify files</title>
96 <para>In addition to working with file and directory names, Mercurial lets
97 you use <emphasis>patterns</emphasis> to identify files. Mercurial's pattern
98 handling is expressive.
99 </para>
101 <para>On Unix-like systems (Linux, MacOS, etc.), the job of matching file
102 names to patterns normally falls to the shell. On these systems, you
103 must explicitly tell Mercurial that a name is a pattern. On Windows,
104 the shell does not expand patterns, so Mercurial will automatically
105 identify names that are patterns, and expand them for you.
106 </para>
108 <para>To provide a pattern in place of a regular name on the command line,
109 the mechanism is simple:
110 </para>
111 <programlisting>syntax:patternbody</programlisting>
115 <para>That is, a pattern is identified by a short text string that says what
116 kind of pattern this is, followed by a colon, followed by the actual
117 pattern.
118 </para>
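<para>As a rough sketch of how such a prefix could be split from the pattern
body, consider the following Python function. The name
<literal>split_pattern</literal> and the fallback kind <literal>path</literal> are
assumptions made for illustration; they are not taken from Mercurial.</para>

```python
KNOWN_KINDS = ("glob", "re")


def split_pattern(spec, default_kind="path"):
    """Split 'kind:body' into (kind, body).

    A name without a recognised prefix is returned unchanged with the
    default kind, so a plain file name containing a colon is not
    mistaken for a pattern.
    """
    kind, sep, body = spec.partition(":")
    if sep and kind in KNOWN_KINDS:
        return kind, body
    return default_kind, spec
```

<para>For example, <literal>split_pattern("glob:*.py")</literal> gives the pair
<literal>("glob", "*.py")</literal>, while a plain name such as
<literal>src/main.c</literal> comes back untouched.</para>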
120 <para>Mercurial supports two kinds of pattern syntax. The most frequently
121 used is called <literal>glob</literal>; this is the same kind of pattern
122 matching used by the Unix shell, and should be familiar to Windows
123 command prompt users, too.
124 </para>
126 <para>When Mercurial does automatic pattern matching on Windows, it uses
127 <literal>glob</literal> syntax. You can thus omit the <quote><literal>glob:</literal></quote> prefix
128 on Windows, but it's safe to use it, too.
129 </para>
131 <para>The <literal>re</literal> syntax is more powerful; it lets you specify patterns
132 using regular expressions, also known as regexps.
133 </para>
135 <para>By the way, in the examples that follow, notice that I'm careful to
136 wrap all of my patterns in quote characters, so that they won't get
137 expanded by the shell before Mercurial sees them.
138 </para>
140 <sect2>
141 <title>Shell-style <literal>glob</literal> patterns</title>
143 <para>This is an overview of the kinds of patterns you can use when you're
144 matching on glob patterns.
145 </para>
147 <para>The <quote><literal>*</literal></quote> character matches any string, within a single
148 directory.
149 <!-- &interaction.filenames.glob.star; -->
150 </para>
152 <para>The <quote><literal>**</literal></quote> pattern matches any string, and crosses directory
153 boundaries. It's not a standard Unix glob token, but it's accepted by
154 several popular Unix shells, and is very useful.
155 <!-- &interaction.filenames.glob.starstar; -->
156 </para>
158 <para>The <quote><literal>?</literal></quote> pattern matches any single character.
159 <!-- &interaction.filenames.glob.question; -->
160 </para>
162 <para>The <quote><literal>[</literal></quote> character begins a <emphasis>character class</emphasis>. This
163 matches any single character within the class. The class ends with a
164 <quote><literal>]</literal></quote> character. A class may contain multiple <emphasis>range</emphasis>s
165 of the form <quote><literal>a-f</literal></quote>, which is shorthand for
166 <quote><literal>abcdef</literal></quote>.
167 <!-- &interaction.filenames.glob.range; -->
168 If the first character after the <quote><literal>[</literal></quote> in a character class
169 is a <quote><literal>!</literal></quote>, it <emphasis>negates</emphasis> the class, making it match any
170 single character not in the class.
171 </para>
173 <para>A <quote><literal>{</literal></quote> begins a group of subpatterns, where the whole group
174 matches if any subpattern in the group matches. The <quote><literal>,</literal></quote>
175 character separates subpatterns, and <quote><literal>}</literal></quote> ends the group.
176 <!-- &interaction.filenames.glob.group; -->
177 </para>
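<para>Character classes and subpattern groups can be tried out with a short
Python sketch. It leans on the standard <literal>fnmatch</literal> module for
classes and adds a small, hypothetical helper, <literal>expand_braces</literal>,
for groups. It handles only groups that are not nested, and it is not how
Mercurial itself implements matching.</para>

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand un-nested {a,b} groups into a list of plain patterns."""
    found = re.search(r"\{([^{}]*)\}", pattern)
    if found is None:
        return [pattern]
    head, tail = pattern[:found.start()], pattern[found.end():]
    expanded = []
    for alternative in found.group(1).split(","):
        # Recurse so that any later groups in the tail are expanded too.
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def matches_any(name, pattern):
    """True if name matches any alternative of the group pattern."""
    return any(fnmatch.fnmatchcase(name, alternative)
               for alternative in expand_braces(pattern))
```

<para>Here <literal>matches_any("main.c", "*.{c,h}")</literal> is true, and
<literal>matches_any("d.txt", "[!a-f].txt")</literal> is false because the class is
negated.</para>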
179 <sect3>
180 <title>Watch out!</title>
182 <para>Don't forget that if you want to match a pattern in any directory, you
183 should not be using the <quote><literal>*</literal></quote> match-any token, as this will
184 only match within one directory. Instead, use the <quote><literal>**</literal></quote>
185 token. This small example illustrates the difference between the two.
186 <!-- &interaction.filenames.glob.star-starstar; -->
187 </para>
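<para>The difference between the two tokens is easy to see in a small Python
sketch that translates a subset of glob syntax into a regular expression.
The function <literal>glob_to_regex</literal> is hypothetical and covers only the
star, double star and question mark tokens; it is meant to illustrate the
rule, not to reproduce Mercurial's matcher.</para>

```python
import re


def glob_to_regex(pattern):
    """Translate a small glob subset into a compiled, anchored regex."""
    parts = []
    # The capturing group keeps the glob tokens in the split result.
    for token in re.split(r"(\*\*|\*|\?)", pattern):
        if token == "**":
            parts.append(".*")      # any string, crossing directories
        elif token == "*":
            parts.append("[^/]*")   # any string within one directory
        elif token == "?":
            parts.append(".")       # any single character
        else:
            parts.append(re.escape(token))
    return re.compile("".join(parts) + r"\Z")
```

<para>With this sketch, the pattern <literal>*.py</literal> matches
<filename>a.py</filename> but not <filename>src/a.py</filename>, whereas
<literal>**.py</literal> matches both.</para>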
189 </sect3>
190 </sect2>
191 <sect2>
192 <title>Regular expression matching with <literal>re</literal> patterns</title>
194 <para>Mercurial accepts the same regular expression syntax as the Python
195 programming language (it uses Python's regexp engine internally).
196 This is based on the Perl language's regexp syntax, which is the most
197 popular dialect in use (it's also used in Java, for example).
198 </para>
200 <para>I won't discuss Mercurial's regexp dialect in any detail here, as
201 regexps are not often used. Perl-style regexps are in any case
202 already exhaustively documented on a multitude of web sites, and in
203 many books. Instead, I will focus here on a few things you should
204 know if you find yourself needing to use regexps with Mercurial.
205 </para>
207 <para>A regexp is matched against an entire file name, relative to the root
208 of the repository. In other words, even if you're already in
209 subdirectory <filename class="directory">foo</filename>, if you want to match files under this
210 directory, your pattern must start with <quote><literal>foo/</literal></quote>.
211 </para>
213 <para>One thing to note, if you're familiar with Perl-style regexps, is that
214 Mercurial's are <emphasis>rooted</emphasis>. That is, a regexp starts matching
215 against the beginning of a string; it doesn't look for a match
216 anywhere within the string. To match anywhere in a string, start
217 your pattern with <quote><literal>.*</literal></quote>.
218 </para>
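<para>Rooted matching corresponds to the behaviour of Python's
<literal>re.match</literal>, which only tries to match at the beginning of a
string. The helper name <literal>rooted_match</literal> below is hypothetical,
and the paths are assumed to be relative to the repository root.</para>

```python
import re


def rooted_match(pattern, path):
    """True if pattern matches starting at the beginning of path."""
    return re.match(pattern, path) is not None


# A file under directory foo, named relative to the repository root.
path = "foo/bar.c"

starts_at_root = rooted_match(r"foo/.*\.c", path)   # begins with foo/
missing_prefix = rooted_match(r"bar\.c", path)      # does not start at root
with_wildcard = rooted_match(r".*bar\.c", path)     # leading .* allows any prefix
```

<para>Only the second pattern fails to match, because it does not account for
the leading <literal>foo/</literal>.</para>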
220 </sect2>
221 </sect1>
222 <sect1>
223 <title>Filtering files</title>
225 <para>Not only does Mercurial give you a variety of ways to specify files;
226 it lets you further winnow those files using <emphasis>filters</emphasis>. Commands
227 that work with file names accept two filtering options.
228 </para>
229 <itemizedlist>
230 <listitem><para><option role="hg-opt-global">-I</option>, or <option role="hg-opt-global">--include</option>, lets you specify a pattern
231 that file names must match in order to be processed.
232 </para>
233 </listitem>
234 <listitem><para><option role="hg-opt-global">-X</option>, or <option role="hg-opt-global">--exclude</option>, gives you a way to
235 <emphasis>avoid</emphasis> processing files, if they match this pattern.
236 </para>
237 </listitem></itemizedlist>
238 <para>You can provide multiple <option role="hg-opt-global">-I</option> and <option role="hg-opt-global">-X</option> options on the
239 command line, and intermix them as you please. Mercurial interprets
240 the patterns you provide using glob syntax by default (but you can use
241 regexps if you need to).
242 </para>
244 <para>You can read a <option role="hg-opt-global">-I</option> filter as <quote>process only the files that
245 match this filter</quote>.
246 <!-- &interaction.filenames.filter.include; -->
247 The <option role="hg-opt-global">-X</option> filter is best read as <quote>process only the files that
248 don't match this pattern</quote>.
249 <!-- &interaction.filenames.filter.exclude; -->
250 </para>
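<para>The way the two filters combine can be sketched as follows. This is an
illustration under stated assumptions: <literal>filter_files</literal> is a
hypothetical function, Python's <literal>fnmatch</literal> stands in for
Mercurial's glob matching, and a file is assumed to pass when it matches at
least one include pattern (if any are given) and no exclude pattern.</para>

```python
import fnmatch


def filter_files(files, include=(), exclude=()):
    """Keep files that pass the include and exclude glob filters."""
    selected = []
    for name in files:
        # With include patterns present, at least one must match.
        if include and not any(fnmatch.fnmatchcase(name, pattern)
                               for pattern in include):
            continue
        # Any matching exclude pattern drops the file.
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in exclude):
            continue
        selected.append(name)
    return selected
```

<para>Given the files <filename>a.c</filename>, <filename>a.h</filename> and
<filename>b.txt</filename>, including <literal>a.*</literal> while excluding
<literal>*.h</literal> leaves only <filename>a.c</filename>.</para>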
252 </sect1>
253 <sect1>
254 <title>Ignoring unwanted files and directories</title>
256 <para>XXX.
257 </para>
259 </sect1>
260 <sect1>
261 <title>Case sensitivity</title>
265 <para>If you're working in a mixed development environment that contains
266 both Linux (or other Unix) systems and Macs or Windows systems, you
267 should keep in the back of your mind the knowledge that they treat the
268 case (<quote>N</quote> versus <quote>n</quote>) of file names in incompatible ways. This is
269 not very likely to affect you, and it's easy to deal with if it does,
270 but it could surprise you if you don't know about it.
271 </para>
273 <para>Operating systems and filesystems differ in the way they handle the
274 <emphasis>case</emphasis> of characters in file and directory names. There are
275 three common ways to handle case in names.
276 </para>
277 <itemizedlist>
278 <listitem><para>Completely case insensitive. Uppercase and lowercase versions
279 of a letter are treated as identical, both when creating a file and
280 during subsequent accesses. This is common on older DOS-based
281 systems.
282 </para>
283 </listitem>
284 <listitem><para>Case preserving, but insensitive. When a file or directory is
285 created, the case of its name is stored, and can be retrieved and
286 displayed by the operating system. When an existing file is being
287 looked up, its case is ignored. This is the standard arrangement on
288 Windows and MacOS. The names <filename>foo</filename> and <filename>FoO</filename>
289 identify the same file. This treatment of uppercase and lowercase
290 letters as interchangeable is also referred to as <emphasis>case
291 folding</emphasis>.
292 </para>
293 </listitem>
294 <listitem><para>Case sensitive. The case of a name is significant at all times.
295 The names <filename>foo</filename> and <filename>FoO</filename> identify different files. This
296 is the way Linux and Unix systems normally work.
297 </para>
298 </listitem></itemizedlist>
300 <para>On Unix-like systems, it is possible to have any or all of the above
301 ways of handling case in action at once. For example, if you use a
302 USB thumb drive formatted with a FAT32 filesystem on a Linux system,
303 Linux will handle names on that filesystem in a case preserving, but
304 insensitive, way.
305 </para>
307 <sect2>
308 <title>Safe, portable repository storage</title>
310 <para>Mercurial's repository storage mechanism is <emphasis>case safe</emphasis>. It
311 translates file names so that they can be safely stored on both case
312 sensitive and case insensitive filesystems. This means that you can
313 use normal file copying tools to transfer a Mercurial repository onto,
314 for example, a USB thumb drive, and safely move that drive and
315 repository back and forth between a Mac, a PC running Windows, and a
316 Linux box.
317 </para>
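<para>To give a feel for what such a translation might look like, here is one
possible reversible encoding, sketched in Python. The scheme and the names
<literal>encode_name</literal> and <literal>decode_name</literal> are assumptions
for illustration, limited to ASCII names; they should not be read as a
description of Mercurial's actual storage format.</para>

```python
def encode_name(name):
    """Encode a name so that the result contains no uppercase letters."""
    out = []
    for ch in name:
        if ch == "_":
            out.append("__")              # escape the escape character
        elif ch.isupper():
            out.append("_" + ch.lower())  # mark an uppercase letter
        else:
            out.append(ch)
    return "".join(out)


def decode_name(encoded):
    """Reverse encode_name; assumes well-formed encoded input."""
    out = []
    chars = iter(encoded)
    for ch in chars:
        if ch == "_":
            following = next(chars)
            out.append("_" if following == "_" else following.upper())
        else:
            out.append(ch)
    return "".join(out)
```

<para>Under this scheme, names that differ only in case, such as
<filename>myfile.c</filename> and <filename>MyFile.C</filename>, are stored under
distinct all-lowercase names, so they cannot collide on a case insensitive
filesystem.</para>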
319 </sect2>
320 <sect2>
321 <title>Detecting case conflicts</title>
323 <para>When operating in the working directory, Mercurial honours the naming
324 policy of the filesystem where the working directory is located. If
325 the filesystem is case preserving, but insensitive, Mercurial will
326 treat names that differ only in case as the same.
327 </para>
329 <para>An important aspect of this approach is that it is possible to commit
330 a changeset on a case sensitive (typically Linux or Unix) filesystem
331 that will cause trouble for users on case insensitive (usually Windows
332 and MacOS) filesystems. If a Linux user commits changes to two files, one
333 named <filename>myfile.c</filename> and the other named <filename>MyFile.C</filename>,
334 they will be stored correctly in the repository. And in the working
335 directories of other Linux users, they will be correctly represented
336 as separate files.
337 </para>
339 <para>If a Windows or Mac user pulls this change, they will not initially
340 have a problem, because Mercurial's repository storage mechanism is
341 case safe. However, once they try to <command role="hg-cmd">hg update</command> the working
342 directory to that changeset, or <command role="hg-cmd">hg merge</command> with that changeset,
343 Mercurial will spot the conflict between the two file names that the
344 filesystem would treat as the same, and forbid the update or merge
345 from occurring.
346 </para>
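<para>The kind of check involved can be sketched in Python as a search for
names that become identical once case is ignored. The function
<literal>find_case_conflicts</literal> is hypothetical, and
<literal>str.casefold</literal> is assumed to approximate the filesystem's own
folding rules; Mercurial's real check may differ.</para>

```python
from collections import defaultdict


def find_case_conflicts(names):
    """Return groups of names that differ only in case."""
    groups = defaultdict(list)
    for name in names:
        # Names with the same folded form would collide on a
        # case insensitive filesystem.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

<para>For the files <filename>myfile.c</filename>, <filename>MyFile.C</filename> and
<filename>README</filename>, the sketch reports a single conflicting group
containing the first two names.</para>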
348 </sect2>
349 <sect2>
350 <title>Fixing a case conflict</title>
352 <para>If you are using Windows or a Mac in a mixed environment where some of
353 your collaborators are using Linux or Unix, and Mercurial reports a
354 case folding conflict when you try to <command role="hg-cmd">hg update</command> or <command role="hg-cmd">hg merge</command>,
355 the procedure to fix the problem is simple.
356 </para>
358 <para>Just find a nearby Linux or Unix box, clone the problem repository
359 onto it, and use Mercurial's <command role="hg-cmd">hg rename</command> command to change the
360 names of any offending files or directories so that they will no
361 longer cause case folding conflicts. Commit this change, <command role="hg-cmd">hg pull</command>
362 or <command role="hg-cmd">hg push</command> it across to your Windows or MacOS system, and
363 <command role="hg-cmd">hg update</command> to the revision with the non-conflicting names.
364 </para>
366 <para>The changeset with case-conflicting names will remain in your
367 project's history, and you still won't be able to <command role="hg-cmd">hg update</command> your
368 working directory to that changeset on a Windows or MacOS system, but
369 you can continue development unimpeded.
370 </para>
372 <note>
373 <para> Prior to version 0.9.3, Mercurial did not use a case safe repository
374 storage mechanism, and did not detect case folding conflicts. If
375 you are using an older version of Mercurial on Windows or MacOS, I
376 strongly recommend that you upgrade.
377 </para>
378 </note>
380 </sect2>
381 </sect1>
382 </chapter>
384 <!--
385 local variables:
386 sgml-parent-document: ("00book.xml" "book" "chapter")
387 end:
388 -->