rev |
line source |
bos@559
|
1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
|
bos@559
|
2
|
bos@559
|
3 <chapter id="chap:names">
|
bos@559
|
4 <title>File names and pattern matching</title>
|
bos@559
|
5
|
bos@559
|
6 <para>Mercurial provides mechanisms that let you work with file
|
bos@559
|
7 names in a consistent and expressive way.</para>
|
bos@559
|
8
|
bos@559
|
9 <sect1>
|
bos@559
|
10 <title>Simple file naming</title>
|
bos@559
|
11
|
bos@559
|
12 <para>Mercurial uses a unified piece of machinery <quote>under the
|
bos@559
|
13 hood</quote> to handle file names. Every command behaves
|
bos@559
|
14 uniformly with respect to file names. The way in which commands
|
bos@559
|
15 work with file names is as follows.</para>
|
bos@559
|
16
|
bos@559
|
17 <para>If you explicitly name real files on the command line,
|
bos@559
|
18 Mercurial works with exactly those files, as you would expect.
|
bos@567
|
19 &interaction.filenames.files;</para>
|
bos@559
|
20
|
bos@559
|
21 <para>When you provide a directory name, Mercurial will interpret
|
bos@559
|
22 this as <quote>operate on every file in this directory and its
|
bos@559
|
23 subdirectories</quote>. Mercurial traverses the files and
|
bos@559
|
24 subdirectories in a directory in alphabetical order. When it
|
bos@559
|
25 encounters a subdirectory, it will traverse that subdirectory
|
bos@567
|
26 before continuing with the current directory.</para>
|
bos@567
|
27
|
bos@567
|
28 &interaction.filenames.dirs;
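<para>The traversal order described above can be sketched in a few lines of Python. This is only an illustration of the idea (sorted entries, with each subdirectory finished before its parent continues), not Mercurial's own code; the function name <literal>walk_sorted</literal> is hypothetical, and plain string sorting stands in for <quote>alphabetical order</quote>.</para>

```python
import os


def walk_sorted(top):
    """Yield file paths under top, visiting entries in sorted order and
    descending into each subdirectory as soon as it is encountered."""
    for name in sorted(os.listdir(top)):
        path = os.path.join(top, name)
        if os.path.isdir(path):
            # Finish the whole subdirectory before moving on to the
            # next entry of the current directory.
            yield from walk_sorted(path)
        else:
            yield path
```

<para>Calling <literal>list(walk_sorted("."))</literal> in a small directory tree shows every file of a subdirectory listed before the later entries of its parent.</para>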
|
bos@559
|
29
|
bos@559
|
30 </sect1>
|
bos@559
|
31 <sect1>
|
bos@559
|
32 <title>Running commands without any file names</title>
|
bos@559
|
33
|
bos@559
|
34 <para>Mercurial's commands that work with file names have useful
|
bos@559
|
35 default behaviours when you invoke them without providing any
|
bos@559
|
36 file names or patterns. What kind of behaviour you should
|
bos@559
|
37 expect depends on what the command does. Here are a few rules
|
bos@559
|
38 of thumb you can use to predict what a command is likely to do
|
bos@559
|
39 if you don't give it any names to work with.</para>
|
bos@559
|
40 <itemizedlist>
|
bos@559
|
41 <listitem><para>Most commands will operate on the entire working
|
bos@559
|
42 directory. This is what the <command role="hg-cmd">hg
|
bos@559
|
43 add</command> command does, for example.</para>
|
bos@559
|
44 </listitem>
|
bos@559
|
45 <listitem><para>If the command has effects that are difficult or
|
bos@559
|
46 impossible to reverse, it will force you to explicitly
|
bos@559
|
47 provide at least one name or pattern (see below). This
|
bos@559
|
48 protects you from accidentally deleting files by running
|
bos@559
|
49 <command role="hg-cmd">hg remove</command> with no
|
bos@559
|
50 arguments, for example.</para>
|
bos@559
|
51 </listitem></itemizedlist>
|
bos@559
|
52
|
bos@559
|
53 <para>It's easy to work around these default behaviours if they
|
bos@559
|
54 don't suit you. If a command normally operates on the whole
|
bos@559
|
55 working directory, you can invoke it on just the current
|
bos@559
|
56 directory and its subdirectories by giving it the name
|
bos@567
|
57 <quote><filename class="directory">.</filename></quote>.</para>
|
bos@567
|
58
|
bos@567
|
59 &interaction.filenames.wdir-subdir;
|
bos@559
|
60
|
bos@559
|
61 <para>Along the same lines, some commands normally print file
|
bos@559
|
62 names relative to the root of the repository, even if you're
|
bos@559
|
63 invoking them from a subdirectory. Such a command will print
|
bos@559
|
64 file names relative to your subdirectory if you give it explicit
|
bos@559
|
65 names. Here, we're going to run <command role="hg-cmd">hg
|
bos@559
|
66 status</command> from a subdirectory, and get it to operate on
|
bos@559
|
67 the entire working directory while printing file names relative
|
bos@559
|
68 to our subdirectory, by passing it the output of the <command
|
bos@567
|
69 role="hg-cmd">hg root</command> command.</para>
|
bos@567
|
70
|
bos@567
|
71 &interaction.filenames.wdir-relname;
|
bos@559
|
72
|
bos@559
|
73 </sect1>
|
bos@559
|
74 <sect1>
|
bos@559
|
75 <title>Telling you what's going on</title>
|
bos@559
|
76
|
bos@559
|
77 <para>The <command role="hg-cmd">hg add</command> example in the
|
bos@559
|
78 preceding section illustrates something else that's helpful
|
bos@559
|
79 about Mercurial commands. If a command operates on a file that
|
bos@559
|
80 you didn't name explicitly on the command line, it will usually
|
bos@559
|
81 print the name of the file, so that you will not be surprised
|
bos@559
|
82 by what's going on.</para>
|
bos@559
|
83
|
bos@559
|
84 <para>The principle here is of <emphasis>least
|
bos@559
|
85 surprise</emphasis>. If you've exactly named a file on the
|
bos@559
|
86 command line, there's no point in repeating it back at you. If
|
bos@559
|
87 Mercurial is acting on a file <emphasis>implicitly</emphasis>,
|
bos@559
|
88 because you provided no names, or a directory, or a pattern (see
|
bos@559
|
89 below), it's safest to tell you what it's doing.</para>
|
bos@559
|
90
|
bos@559
|
91 <para>For commands that behave this way, you can silence them
|
bos@559
|
92 using the <option role="hg-opt-global">-q</option> option. You
|
bos@559
|
93 can also get them to print the name of every file, even those
|
bos@559
|
94 you've named explicitly, using the <option
|
bos@559
|
95 role="hg-opt-global">-v</option> option.</para>
|
bos@559
|
96
|
bos@559
|
97 </sect1>
|
bos@559
|
98 <sect1>
|
bos@559
|
99 <title>Using patterns to identify files</title>
|
bos@559
|
100
|
bos@559
|
101 <para>In addition to working with file and directory names,
|
bos@559
|
102 Mercurial lets you use <emphasis>patterns</emphasis> to identify
|
bos@559
|
103 files. Mercurial's pattern handling is expressive.</para>
|
bos@559
|
104
|
bos@559
|
105 <para>On Unix-like systems (Linux, MacOS, etc.), the job of
|
bos@559
|
106 matching file names to patterns normally falls to the shell. On
|
bos@559
|
107 these systems, you must explicitly tell Mercurial that a name is
|
bos@559
|
108 a pattern. On Windows, the shell does not expand patterns, so
|
bos@559
|
109 Mercurial will automatically identify names that are patterns,
|
bos@559
|
110 and expand them for you.</para>
|
bos@559
|
111
|
bos@559
|
112 <para>To provide a pattern in place of a regular name on the
|
bos@559
|
113 command line, the mechanism is simple:</para>
|
bos@559
|
114 <programlisting>syntax:patternbody</programlisting>
|
bos@559
|
115 <para>That is, a pattern is identified by a short text string that
|
bos@559
|
116 says what kind of pattern this is, followed by a colon, followed
|
bos@559
|
117 by the actual pattern.</para>
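<para>As a rough sketch of how a <literal>syntax:patternbody</literal> argument could be taken apart, the following Python function splits off the syntax prefix. It is an assumption-laden illustration rather than Mercurial's implementation: the name <literal>split_pattern</literal> and the fallback kind <literal>"path"</literal> are hypothetical, and only the two syntaxes covered in this chapter are recognised.</para>

```python
def split_pattern(name, default_kind="path"):
    """Split 'kind:body' into (kind, body).

    Only the two kinds described in this chapter are recognised; anything
    else is treated as a plain name of kind default_kind (a hypothetical
    label used here for illustration).
    """
    known_kinds = ("glob", "re")
    kind, sep, body = name.partition(":")
    if sep and kind in known_kinds:
        return kind, body
    # No recognised prefix: the whole argument is an ordinary name.
    return default_kind, name


print(split_pattern("glob:*.py"))   # ('glob', '*.py')
print(split_pattern("src/main.c"))  # ('path', 'src/main.c')
```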
|
bos@559
|
118
|
bos@559
|
119 <para>Mercurial supports two kinds of pattern syntax. The most
|
bos@559
|
120 frequently used is called <literal>glob</literal>; this is the
|
bos@559
|
121 same kind of pattern matching used by the Unix shell, and should
|
bos@559
|
122 be familiar to Windows command prompt users, too.</para>
|
bos@559
|
123
|
bos@559
|
124 <para>When Mercurial does automatic pattern matching on Windows,
|
bos@559
|
125 it uses <literal>glob</literal> syntax. You can thus omit the
|
bos@559
|
126 <quote><literal>glob:</literal></quote> prefix on Windows, but
|
bos@559
|
127 it's safe to use it, too.</para>
|
bos@559
|
128
|
bos@559
|
129 <para>The <literal>re</literal> syntax is more powerful; it lets
|
bos@559
|
130 you specify patterns using regular expressions, also known as
|
bos@559
|
131 regexps.</para>
|
bos@559
|
132
|
bos@559
|
133 <para>By the way, in the examples that follow, notice that I'm
|
bos@559
|
134 careful to wrap all of my patterns in quote characters, so that
|
bos@559
|
135 they won't get expanded by the shell before Mercurial sees
|
bos@559
|
136 them.</para>
|
bos@559
|
137
|
bos@559
|
138 <sect2>
|
bos@559
|
139 <title>Shell-style <literal>glob</literal> patterns</title>
|
bos@559
|
140
|
bos@559
|
141 <para>This is an overview of the kinds of patterns you can use
|
bos@559
|
142 when you're matching on glob patterns.</para>
|
bos@559
|
143
|
bos@559
|
144 <para>The <quote><literal>*</literal></quote> character matches
|
bos@567
|
145 any string, within a single directory.</para>
|
bos@567
|
146
|
bos@567
|
147 &interaction.filenames.glob.star;
|
bos@559
|
148
|
bos@559
|
149 <para>The <quote><literal>**</literal></quote> pattern matches
|
bos@559
|
150 any string, and crosses directory boundaries. It's not a
|
bos@559
|
151 standard Unix glob token, but it's accepted by several popular
|
bos@567
|
152 Unix shells, and is very useful.</para>
|
bos@567
|
153
|
bos@567
|
154 &interaction.filenames.glob.starstar;
|
bos@559
|
155
|
bos@559
|
156 <para>The <quote><literal>?</literal></quote> pattern matches
|
bos@567
|
157 any single character.</para>
|
bos@567
|
158
|
bos@567
|
159 &interaction.filenames.glob.question;
|
bos@559
|
160
|
bos@559
|
161 <para>The <quote><literal>[</literal></quote> character begins a
|
bos@559
|
162 <emphasis>character class</emphasis>. This matches any single
|
bos@559
|
163 character within the class. The class ends with a
|
bos@559
|
164 <quote><literal>]</literal></quote> character. A class may
|
bos@559
|
165 contain multiple <emphasis>range</emphasis>s of the form
|
bos@559
|
166 <quote><literal>a-f</literal></quote>, which is shorthand for
|
bos@567
|
167 <quote><literal>abcdef</literal></quote>.</para>
|
bos@567
|
168
|
bos@567
|
169 &interaction.filenames.glob.range;
|
bos@567
|
170
|
bos@567
|
171 <para>If the first character after the
|
bos@567
|
172 <quote><literal>[</literal></quote> in a character class is a
|
bos@567
|
173 <quote><literal>!</literal></quote>, it
|
bos@559
|
174 <emphasis>negates</emphasis> the class, making it match any
|
bos@559
|
175 single character not in the class.</para>
|
bos@559
|
176
|
bos@559
|
177 <para>A <quote><literal>{</literal></quote> begins a group of
|
bos@559
|
178 subpatterns, where the whole group matches if any subpattern
|
bos@559
|
179 in the group matches. The <quote><literal>,</literal></quote>
|
bos@567
|
180 character separates subpatterns, and
|
bos@567
|
181 <quote><literal>}</literal></quote> ends the group.</para>
|
bos@567
|
182
|
bos@567
|
183 &interaction.filenames.glob.group;
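<para>To make the glob rules above concrete, here is a minimal Python sketch that translates a glob pattern into a regular expression. It follows the rules as stated in this section and is not Mercurial's own translator; the names <literal>glob_to_regex</literal> and <literal>glob_match</literal> are hypothetical, braces are assumed to be balanced, and the pattern is assumed to have to match the entire name.</para>

```python
import re


def glob_to_regex(pattern):
    """Translate a glob pattern into a regular expression string."""
    out = []
    i, n = 0, len(pattern)
    depth = 0  # nesting depth of {...} groups
    while i < n:
        c = pattern[i]
        i += 1
        if c == "*":
            if pattern[i:i + 1] == "*":
                i += 1
                out.append(".*")      # "**" crosses directory boundaries
            else:
                out.append("[^/]*")   # "*" stays within one directory
        elif c == "?":
            out.append(".")           # any single character
        elif c == "[":
            j = i
            if pattern[j:j + 1] == "!":
                j += 1
            if pattern[j:j + 1] == "]":
                j += 1                # a leading "]" belongs to the class
            j = pattern.find("]", j)
            if j == -1:
                out.append(re.escape("["))  # no closing "]": literal "["
            else:
                body = pattern[i:j].replace("\\", "\\\\")
                if body.startswith("!"):
                    body = "^" + body[1:]   # "!" negates the class
                elif body.startswith("^"):
                    body = "\\" + body      # keep a literal "^" literal
                out.append("[" + body + "]")
                i = j + 1
        elif c == "{":
            depth += 1
            out.append("(?:")         # start a group of alternatives
        elif c == "}" and depth:
            depth -= 1
            out.append(")")
        elif c == "," and depth:
            out.append("|")           # separates subpatterns in a group
        else:
            out.append(re.escape(c))
    return "".join(out)


def glob_match(pattern, path):
    """Return True if the whole of path matches the glob pattern."""
    return re.fullmatch(glob_to_regex(pattern), path) is not None
```

<para>With this sketch, <literal>glob_match("*.py", "src/main.py")</literal> is false because <quote><literal>*</literal></quote> does not cross the <quote><literal>/</literal></quote>, while <literal>glob_match("**.py", "src/main.py")</literal> is true.</para>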
|
bos@559
|
184
|
bos@559
|
185 <sect3>
|
bos@559
|
186 <title>Watch out!</title>
|
bos@559
|
187
|
bos@559
|
188 <para>Don't forget that if you want to match a pattern in any
|
bos@559
|
189 directory, you should not be using the
|
bos@559
|
190 <quote><literal>*</literal></quote> match-any token, as this
|
bos@559
|
191 will only match within one directory. Instead, use the
|
bos@559
|
192 <quote><literal>**</literal></quote> token. This small
|
bos@567
|
193 example illustrates the difference between the two.</para>
|
bos@567
|
194
|
bos@567
|
195 &interaction.filenames.glob.star-starstar;
|
bos@559
|
196
|
bos@559
|
197 </sect3>
|
bos@559
|
198 </sect2>
|
bos@559
|
199 <sect2>
|
bos@559
|
200 <title>Regular expression matching with <literal>re</literal>
|
bos@559
|
201 patterns</title>
|
bos@559
|
202
|
bos@559
|
203 <para>Mercurial accepts the same regular expression syntax as
|
bos@559
|
204 the Python programming language (it uses Python's regexp
|
bos@559
|
205 engine internally). This is based on the Perl language's
|
bos@559
|
206 regexp syntax, which is the most popular dialect in use (it's
|
bos@559
|
207 also used in Java, for example).</para>
|
bos@559
|
208
|
bos@559
|
209 <para>I won't discuss Mercurial's regexp dialect in any detail
|
bos@559
|
210 here, as regexps are not often used. Perl-style regexps are
|
bos@559
|
211 in any case already exhaustively documented on a multitude of
|
bos@559
|
212 web sites, and in many books. Instead, I will focus here on a
|
bos@559
|
213 few things you should know if you find yourself needing to use
|
bos@559
|
214 regexps with Mercurial.</para>
|
bos@559
|
215
|
bos@559
|
216 <para>A regexp is matched against an entire file name, relative
|
bos@559
|
217 to the root of the repository. In other words, even if you're
|
bos@559
|
218 already in the subdirectory <filename
|
bos@559
|
219 class="directory">foo</filename>, if you want to match files
|
bos@559
|
220 under this directory, your pattern must start with
|
bos@559
|
221 <quote><literal>foo/</literal></quote>.</para>
|
bos@559
|
222
|
bos@559
|
223 <para>One thing to note, if you're familiar with Perl-style
|
bos@559
|
224 regexps, is that Mercurial's are <emphasis>rooted</emphasis>.
|
bos@559
|
225 That is, a regexp starts matching against the beginning of a
|
bos@559
|
226 string; it doesn't look for a match anywhere within the
|
bos@559
|
227 string. To match anywhere in a string, start your pattern
|
bos@559
|
228 with <quote><literal>.*</literal></quote>.</para>
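<para>Since the text above says Mercurial uses Python's regexp engine, the difference between rooted and unrooted matching can be shown with Python's <literal>re</literal> module. In this sketch, <literal>re.match</literal> is used as a model of rooted matching; the helper name <literal>rooted_match</literal> and the file names are made up for the example.</para>

```python
import re


def rooted_match(pattern, name):
    """Return True if pattern matches starting at the beginning of name."""
    # re.match only tries the start of the string; re.search would scan
    # the whole string for a match.
    return re.match(pattern, name) is not None


names = ["foo/bar.c", "src/foo/bar.c"]

# A rooted pattern only finds names that begin with "foo/".
print([n for n in names if rooted_match(r"foo/", n)])
# prints ['foo/bar.c']

# Prefixing the pattern with ".*" lets it match anywhere in the name.
print([n for n in names if rooted_match(r".*foo/", n)])
# prints ['foo/bar.c', 'src/foo/bar.c']
```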
|
bos@559
|
229
|
bos@559
|
230 </sect2>
|
bos@559
|
231 </sect1>
|
bos@559
|
232 <sect1>
|
bos@559
|
233 <title>Filtering files</title>
|
bos@559
|
234
|
bos@559
|
235 <para>Not only does Mercurial give you a variety of ways to
|
bos@559
|
236 specify files; it lets you further winnow those files using
|
bos@559
|
237 <emphasis>filters</emphasis>. Commands that work with file
|
bos@559
|
238 names accept two filtering options.</para>
|
bos@559
|
239 <itemizedlist>
|
bos@559
|
240 <listitem><para><option role="hg-opt-global">-I</option>, or
|
bos@559
|
241 <option role="hg-opt-global">--include</option>, lets you
|
bos@559
|
242 specify a pattern that file names must match in order to be
|
bos@559
|
243 processed.</para>
|
bos@559
|
244 </listitem>
|
bos@559
|
245 <listitem><para><option role="hg-opt-global">-X</option>, or
|
bos@559
|
246 <option role="hg-opt-global">--exclude</option>, gives you a
|
bos@559
|
247 way to <emphasis>avoid</emphasis> processing files, if they
|
bos@559
|
248 match this pattern.</para>
|
bos@559
|
249 </listitem></itemizedlist>
|
bos@559
|
250 <para>You can provide multiple <option
|
bos@559
|
251 role="hg-opt-global">-I</option> and <option
|
bos@559
|
252 role="hg-opt-global">-X</option> options on the command line,
|
bos@559
|
253 and intermix them as you please. Mercurial interprets the
|
bos@559
|
254 patterns you provide using glob syntax by default (but you can
|
bos@559
|
255 use regexps if you need to).</para>
|
bos@559
|
256
|
bos@559
|
257 <para>You can read a <option role="hg-opt-global">-I</option>
|
bos@559
|
258 filter as <quote>process only the files that match this
|
bos@567
|
259 filter</quote>.</para>
|
bos@567
|
260
|
bos@567
|
261 &interaction.filenames.filter.include;
|
bos@567
|
262
|
bos@567
|
263 <para>The <option role="hg-opt-global">-X</option> filter is best
|
bos@559
|
264 read as <quote>process only the files that don't match this
|
bos@567
|
265 pattern</quote>.</para>
|
bos@567
|
266
|
bos@567
|
267 &interaction.filenames.filter.exclude;
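<para>The way the two filters combine can be sketched as follows. This is a simplified model, not Mercurial's matcher: the function <literal>filter_names</literal> is hypothetical, Python's <literal>fnmatch</literal> module stands in for Mercurial's glob syntax (unlike Mercurial's <quote><literal>*</literal></quote>, its <quote><literal>*</literal></quote> also matches <quote><literal>/</literal></quote>), and a name is assumed to pass when it matches any of several include patterns.</para>

```python
import fnmatch


def filter_names(names, include=(), exclude=()):
    """Keep names that match an include pattern (when any are given)
    and match none of the exclude patterns."""

    def matches_any(name, patterns):
        return any(fnmatch.fnmatchcase(name, p) for p in patterns)

    kept = []
    for name in names:
        # -I: process only the files that match this filter.
        if include and not matches_any(name, include):
            continue
        # -X: process only the files that don't match this pattern.
        if matches_any(name, exclude):
            continue
        kept.append(name)
    return kept


names = ["src/main.c", "src/main.h", "doc/readme.txt"]
print(filter_names(names, include=["src/*"], exclude=["*.c"]))
# prints ['src/main.h']
```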
|
bos@559
|
268
|
bos@559
|
269 </sect1>
|
bos@559
|
270 <sect1>
|
bos@559
|
271 <title>Ignoring unwanted files and directories</title>
|
bos@559
|
272
|
bos@559
|
273 <para>XXX.</para>
|
bos@559
|
274
|
bos@559
|
275 </sect1>
|
bos@559
|
276 <sect1 id="sec:names:case">
|
bos@559
|
277 <title>Case sensitivity</title>
|
bos@559
|
278
|
bos@559
|
279 <para>If you're working in a mixed development environment that
|
bos@559
|
280 contains both Linux (or other Unix) systems and Macs or Windows
|
bos@559
|
281 systems, you should keep in the back of your mind the knowledge
|
bos@559
|
282 that they treat the case (<quote>N</quote> versus
|
bos@559
|
283 <quote>n</quote>) of file names in incompatible ways. This is
|
bos@559
|
284 not very likely to affect you, and it's easy to deal with if it
|
bos@559
|
285 does, but it could surprise you if you don't know about
|
bos@559
|
286 it.</para>
|
bos@559
|
287
|
bos@559
|
288 <para>Operating systems and filesystems differ in the way they
|
bos@559
|
289 handle the <emphasis>case</emphasis> of characters in file and
|
bos@559
|
290 directory names. There are three common ways to handle case in
|
bos@559
|
291 names.</para>
|
bos@559
|
292 <itemizedlist>
|
bos@559
|
293 <listitem><para>Completely case insensitive. Uppercase and
|
bos@559
|
294 lowercase versions of a letter are treated as identical,
|
bos@559
|
295 both when creating a file and during subsequent accesses.
|
bos@559
|
296 This is common on older DOS-based systems.</para>
|
bos@559
|
297 </listitem>
|
bos@559
|
298 <listitem><para>Case preserving, but insensitive. When a file
|
bos@559
|
299 or directory is created, the case of its name is stored, and
|
bos@559
|
300 can be retrieved and displayed by the operating system.
|
bos@559
|
301 When an existing file is being looked up, its case is
|
bos@559
|
302 ignored. This is the standard arrangement on Windows and
|
bos@559
|
303 MacOS. The names <filename>foo</filename> and
|
bos@559
|
304 <filename>FoO</filename> identify the same file. This
|
bos@559
|
305 treatment of uppercase and lowercase letters as
|
bos@559
|
306 interchangeable is also referred to as <emphasis>case
|
bos@559
|
307 folding</emphasis>.</para>
|
bos@559
|
308 </listitem>
|
bos@559
|
309 <listitem><para>Case sensitive. The case of a name is
|
bos@559
|
310 significant at all times. The names <filename>foo</filename>
|
bos@559
|
311 and <filename>FoO</filename> identify different files. This is the way Linux
|
bos@559
|
312 and Unix systems normally work.</para>
|
bos@559
|
313 </listitem></itemizedlist>
|
bos@559
|
314
|
bos@559
|
315 <para>On Unix-like systems, it is possible to have any or all of
|
bos@559
|
316 the above ways of handling case in action at once. For example,
|
bos@559
|
317 if you use a USB thumb drive formatted with a FAT32 filesystem
|
bos@559
|
318 on a Linux system, Linux will handle names on that filesystem in
|
bos@559
|
319 a case preserving, but insensitive, way.</para>
|
bos@559
|
320
|
bos@559
|
321 <sect2>
|
bos@559
|
322 <title>Safe, portable repository storage</title>
|
bos@559
|
323
|
bos@559
|
324 <para>Mercurial's repository storage mechanism is <emphasis>case
|
bos@559
|
325 safe</emphasis>. It translates file names so that they can
|
bos@559
|
326 be safely stored on both case sensitive and case insensitive
|
bos@559
|
327 filesystems. This means that you can use normal file copying
|
bos@559
|
328 tools to transfer a Mercurial repository onto, for example, a
|
bos@559
|
329 USB thumb drive, and safely move that drive and repository
|
bos@559
|
330 back and forth between a Mac, a PC running Windows, and a
|
bos@559
|
331 Linux box.</para>
|
bos@559
|
332
|
bos@559
|
333 </sect2>
|
bos@559
|
334 <sect2>
|
bos@559
|
335 <title>Detecting case conflicts</title>
|
bos@559
|
336
|
bos@559
|
337 <para>When operating in the working directory, Mercurial honours
|
bos@559
|
338 the naming policy of the filesystem where the working
|
bos@559
|
339 directory is located. If the filesystem is case preserving,
|
bos@559
|
340 but insensitive, Mercurial will treat names that differ only
|
bos@559
|
341 in case as the same.</para>
|
bos@559
|
342
|
bos@559
|
343 <para>An important aspect of this approach is that it is
|
bos@559
|
344 possible to commit a changeset on a case sensitive (typically
|
bos@559
|
345 Linux or Unix) filesystem that will cause trouble for users on
|
bos@559
|
346 case insensitive (usually Windows and MacOS) filesystems. If a
|
bos@559
|
347 Linux user commits changes to two files, one named
|
bos@559
|
348 <filename>myfile.c</filename> and the other named
|
bos@559
|
349 <filename>MyFile.C</filename>, they will be stored correctly
|
bos@559
|
350 in the repository. And in the working directories of other
|
bos@559
|
351 Linux users, they will be correctly represented as separate
|
bos@559
|
352 files.</para>
|
bos@559
|
353
|
bos@559
|
354 <para>If a Windows or Mac user pulls this change, they will not
|
bos@559
|
355 initially have a problem, because Mercurial's repository
|
bos@559
|
356 storage mechanism is case safe. However, once they try to
|
bos@559
|
357 <command role="hg-cmd">hg update</command> the working
|
bos@559
|
358 directory to that changeset, or <command role="hg-cmd">hg
|
bos@559
|
359 merge</command> with that changeset, Mercurial will spot the
|
bos@559
|
360 conflict between the two file names that the filesystem would
|
bos@559
|
361 treat as the same, and forbid the update or merge from
|
bos@559
|
362 occurring.</para>
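<para>The kind of check described here can be approximated in a few lines of Python. This is a sketch of the idea only, not Mercurial's detection code: the function <literal>case_collisions</literal> is hypothetical, and <literal>str.lower()</literal> is a simplifying stand-in for whatever case folding rules a real filesystem applies.</para>

```python
from collections import defaultdict


def case_collisions(names):
    """Group names that differ only in case and return every group
    that has more than one member."""
    groups = defaultdict(list)
    for name in names:
        # Names with the same lowercased form would be treated as the
        # same file by a case insensitive filesystem.
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["myfile.c", "MyFile.C", "README"]))
# prints [['MyFile.C', 'myfile.c']]
```

<para>Running such a check over the file names in a changeset on a Linux system would flag the offending names before users on Windows or MacOS try to update to it.</para>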
|
bos@559
|
363
|
bos@559
|
364 </sect2>
|
bos@559
|
365 <sect2>
|
bos@559
|
366 <title>Fixing a case conflict</title>
|
bos@559
|
367
|
bos@559
|
368 <para>If you are using Windows or a Mac in a mixed environment
|
bos@559
|
369 where some of your collaborators are using Linux or Unix, and
|
bos@559
|
370 Mercurial reports a case folding conflict when you try to
|
bos@559
|
371 <command role="hg-cmd">hg update</command> or <command
|
bos@559
|
372 role="hg-cmd">hg merge</command>, the procedure to fix the
|
bos@559
|
373 problem is simple.</para>
|
bos@559
|
374
|
bos@559
|
375 <para>Just find a nearby Linux or Unix box, clone the problem
|
bos@559
|
376 repository onto it, and use Mercurial's <command
|
bos@559
|
377 role="hg-cmd">hg rename</command> command to change the
|
bos@559
|
378 names of any offending files or directories so that they will
|
bos@559
|
379 no longer cause case folding conflicts. Commit this change,
|
bos@559
|
380 <command role="hg-cmd">hg pull</command> or <command
|
bos@559
|
381 role="hg-cmd">hg push</command> it across to your Windows or
|
bos@559
|
382 MacOS system, and <command role="hg-cmd">hg update</command>
|
bos@559
|
383 to the revision with the non-conflicting names.</para>
|
bos@559
|
384
|
bos@559
|
385 <para>The changeset with case-conflicting names will remain in
|
bos@559
|
386 your project's history, and you still won't be able to
|
bos@559
|
387 <command role="hg-cmd">hg update</command> your working
|
bos@559
|
388 directory to that changeset on a Windows or MacOS system, but
|
bos@559
|
389 you can continue development unimpeded.</para>
|
bos@559
|
390
|
bos@559
|
391 <note>
|
bos@559
|
392 <para>Prior to version 0.9.3, Mercurial did not use a case
|
bos@559
|
393 safe repository storage mechanism, and did not detect case
|
bos@559
|
394 folding conflicts. If you are using an older version of
|
bos@559
|
395 Mercurial on Windows or MacOS, I strongly recommend that you
|
bos@559
|
396 upgrade.</para>
|
bos@559
|
397 </note>
|
bos@559
|
398
|
bos@559
|
399 </sect2>
|
bos@559
|
400 </sect1>
|
bos@559
|
401 </chapter>
|
bos@559
|
402
|
bos@559
|
403 <!--
|
bos@559
|
404 local variables:
|
bos@559
|
405 sgml-parent-document: ("00book.xml" "book" "chapter")
|
bos@559
|
406 end:
|
bos@559
|
407 -->
|