hgbook

annotate fr/ch07-filenames.xml @ 964:6b680d569bb4

deleting a bunch of files not longer necessary to build the documentation.
Adding missing newly files needed to build the documentation
author Romain PELISSE <belaran@gmail.com>
date Sun Aug 16 04:58:01 2009 +0200 (2009-08-16)
parents
children 6f8c48362758
rev   line source
belaran@964 1 <!-- vim: set filetype=docbkxml shiftwidth=2 autoindent expandtab tw=77 : -->
belaran@964 2
belaran@964 3 <chapter id="chap:names">
belaran@964 4 <title>File names and pattern matching</title>
belaran@964 6
belaran@964 7 <para>Mercurial provides mechanisms that let you work with file names in a
belaran@964 8 consistent and expressive way.</para>
belaran@964 9
belaran@964 10 <sect1>
belaran@964 11 <title>Simple file naming</title>
belaran@964 12
belaran@964 13 <para>Mercurial uses a unified piece of machinery <quote>under the hood</quote> to
belaran@964 14 handle file names, so every command treats file names in the same
belaran@964 15 way. The rules that all commands follow are described
belaran@964 16 below.</para>
belaran@964 17
belaran@964 18 <para>If you explicitly name real files on the command line, Mercurial works
belaran@964 19 with exactly those files, as you would expect.
belaran@964 20 <!-- &interaction.filenames.files; --></para>
belaran@964 21
belaran@964 22 <para>When you provide a directory name, Mercurial will interpret this as
belaran@964 23 <quote>operate on every file in this directory and its subdirectories</quote>.
belaran@964 24 Mercurial traverses the files and subdirectories in a directory in
belaran@964 25 alphabetical order. When it encounters a subdirectory, it will
belaran@964 26 traverse that subdirectory before continuing with the current
belaran@964 27 directory.
belaran@964 28 <!-- &interaction.filenames.dirs; --></para>
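<para>The traversal order described above can be modelled with a short
Python sketch. This is only an illustration, not Mercurial's own code:
the nested dictionary <literal>tree</literal> and the function
<literal>walk</literal> are hypothetical names invented for this
example, and a value of <literal>None</literal> stands for a plain
file.</para>
<programlisting>
```python
def walk(tree, prefix=""):
    """Yield file paths in alphabetical order, descending into each
    subdirectory as soon as it is encountered."""
    for name in sorted(tree):
        entry = tree[name]
        path = prefix + name
        if isinstance(entry, dict):
            # A subdirectory: traverse it fully before moving on.
            yield from walk(entry, path + "/")
        else:
            yield path


# Hypothetical directory layout (None marks a plain file).
tree = {"b.txt": None, "a": {"z.c": None, "y.c": None}, "c.txt": None}
print(list(walk(tree)))  # ['a/y.c', 'a/z.c', 'b.txt', 'c.txt']
```
</programlisting>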
belaran@964 29
belaran@964 30 </sect1>
belaran@964 31 <sect1>
belaran@964 32 <title>Running commands without any file names</title>
belaran@964 33
belaran@964 34 <para>Mercurial's commands that work with file names have useful default
belaran@964 35 behaviours when you invoke them without providing any file names or
belaran@964 36 patterns. What kind of behaviour you should expect depends on what
belaran@964 37 the command does. Here are a few rules of thumb you can use to
belaran@964 38 predict what a command is likely to do if you don't give it any names
belaran@964 39 to work with.</para>
belaran@964 40 <itemizedlist>
belaran@964 41 <listitem><para>Most commands will operate on the entire working directory.
belaran@964 42 This is what the <command role="hg-cmd">hg add</command> command does, for example.</para>
belaran@964 43 </listitem>
belaran@964 44 <listitem><para>If the command has effects that are difficult or impossible to
belaran@964 45 reverse, it will force you to explicitly provide at least one name
belaran@964 46 or pattern (see below). This protects you from accidentally
belaran@964 47 deleting files by running <command role="hg-cmd">hg remove</command> with no arguments, for
belaran@964 48 example.</para>
belaran@964 49 </listitem></itemizedlist>
belaran@964 50
belaran@964 51 <para>It's easy to work around these default behaviours if they don't suit
belaran@964 52 you. If a command normally operates on the whole working directory,
belaran@964 53 you can invoke it on just the current directory and its subdirectories
belaran@964 54 by giving it the name <quote><filename class="directory">.</filename></quote>.
belaran@964 55 <!-- &interaction.filenames.wdir-subdir; -->
belaran@964 56 </para>
belaran@964 57
belaran@964 58 <para>Along the same lines, some commands normally print file names relative
belaran@964 59 to the root of the repository, even if you're invoking them from a
belaran@964 60 subdirectory. Such a command will print file names relative to your
belaran@964 61 subdirectory if you give it explicit names. Here, we're going to run
belaran@964 62 <command role="hg-cmd">hg status</command> from a subdirectory, and get it to operate on the
belaran@964 63 entire working directory while printing file names relative to our
belaran@964 64 subdirectory, by passing it the output of the <command role="hg-cmd">hg root</command> command.
belaran@964 65 <!-- &interaction.filenames.wdir-relname; -->
belaran@964 66 </para>
belaran@964 67
belaran@964 68 </sect1>
belaran@964 69 <sect1>
belaran@964 70 <title>Telling you what's going on</title>
belaran@964 71
belaran@964 72 <para>The <command role="hg-cmd">hg add</command> example in the preceding section illustrates something
belaran@964 73 else that's helpful about Mercurial commands. If a command operates
belaran@964 74 on a file that you didn't name explicitly on the command line, it will
belaran@964 75 usually print the name of the file, so that you will not be surprised
belaran@964 76 by what's going on.
belaran@964 77 </para>
belaran@964 78
belaran@964 79 <para>The principle here is one of <emphasis>least surprise</emphasis>. If you've
belaran@964 80 named a file explicitly on the command line, there's no point in repeating it
belaran@964 81 back at you. If Mercurial is acting on a file <emphasis>implicitly</emphasis>,
belaran@964 82 because you provided no names, or a directory, or a pattern (see
belaran@964 83 below), it's safest to tell you what it's doing.
belaran@964 84 </para>
belaran@964 85
belaran@964 86 <para>For commands that behave this way, you can silence them using the
belaran@964 87 <option role="hg-opt-global">-q</option> option. You can also get them to print the name of every
belaran@964 88 file, even those you've named explicitly, using the <option role="hg-opt-global">-v</option>
belaran@964 89 option.
belaran@964 90 </para>
belaran@964 91
belaran@964 92 </sect1>
belaran@964 93 <sect1>
belaran@964 94 <title>Using patterns to identify files</title>
belaran@964 95
belaran@964 96 <para>In addition to working with file and directory names, Mercurial lets
belaran@964 97 you use <emphasis>patterns</emphasis> to identify files. Mercurial's pattern
belaran@964 98 handling is expressive.
belaran@964 99 </para>
belaran@964 100
belaran@964 101 <para>On Unix-like systems (Linux, MacOS, etc.), the job of matching file
belaran@964 102 names to patterns normally falls to the shell. On these systems, you
belaran@964 103 must explicitly tell Mercurial that a name is a pattern. On Windows,
belaran@964 104 the shell does not expand patterns, so Mercurial will automatically
belaran@964 105 identify names that are patterns, and expand them for you.
belaran@964 106 </para>
belaran@964 107
belaran@964 108 <para>To provide a pattern in place of a regular name on the command line,
belaran@964 109 the mechanism is simple:
belaran@964 110 </para>
belaran@964 111 <programlisting>syntax:patternbody</programlisting>
belaran@964 115 <para>That is, a pattern is identified by a short text string that says what
belaran@964 116 kind of pattern this is, followed by a colon, followed by the actual
belaran@964 117 pattern.
belaran@964 118 </para>
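<para>To make the <literal>syntax:patternbody</literal> form concrete,
here is a minimal Python sketch of how such a string could be split
into its two parts. It is an assumption-laden illustration rather than
Mercurial's parser: <literal>split_pattern</literal> and
<literal>KNOWN_KINDS</literal> are hypothetical names, and only the two
syntaxes covered in this chapter are recognised.</para>
<programlisting>
```python
KNOWN_KINDS = ("glob", "re")  # the two syntaxes this chapter describes


def split_pattern(text):
    """Split 'kind:body' into (kind, body).

    Returns (None, text) when there is no recognised kind prefix,
    meaning the argument is an ordinary file or directory name.
    """
    kind, sep, body = text.partition(":")
    if sep and kind in KNOWN_KINDS:
        return kind, body
    return None, text


print(split_pattern("glob:*.py"))   # ('glob', '*.py')
print(split_pattern("re:.*\\.c$"))  # ('re', '.*\\.c$')
print(split_pattern("README"))      # (None, 'README')
```
</programlisting>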
belaran@964 119
belaran@964 120 <para>Mercurial supports two kinds of pattern syntax. The most frequently
belaran@964 121 used is called <literal>glob</literal>; this is the same kind of pattern
belaran@964 122 matching used by the Unix shell, and should be familiar to Windows
belaran@964 123 command prompt users, too.
belaran@964 124 </para>
belaran@964 125
belaran@964 126 <para>When Mercurial does automatic pattern matching on Windows, it uses
belaran@964 127 <literal>glob</literal> syntax. You can thus omit the <quote><literal>glob:</literal></quote> prefix
belaran@964 128 on Windows, but it's safe to use it, too.
belaran@964 129 </para>
belaran@964 130
belaran@964 131 <para>The <literal>re</literal> syntax is more powerful; it lets you specify patterns
belaran@964 132 using regular expressions, also known as regexps.
belaran@964 133 </para>
belaran@964 134
belaran@964 135 <para>By the way, in the examples that follow, notice that I'm careful to
belaran@964 136 wrap all of my patterns in quote characters, so that they won't get
belaran@964 137 expanded by the shell before Mercurial sees them.
belaran@964 138 </para>
belaran@964 139
belaran@964 140 <sect2>
belaran@964 141 <title>Shell-style <literal>glob</literal> patterns</title>
belaran@964 142
belaran@964 143 <para>This is an overview of the kinds of tokens you can use when you're
belaran@964 144 writing glob patterns.
belaran@964 145 </para>
belaran@964 146
belaran@964 147 <para>The <quote><literal>*</literal></quote> character matches any string, within a single
belaran@964 148 directory.
belaran@964 149 <!-- &interaction.filenames.glob.star; -->
belaran@964 150 </para>
belaran@964 151
belaran@964 152 <para>The <quote><literal>**</literal></quote> pattern matches any string, and crosses directory
belaran@964 153 boundaries. It's not a standard Unix glob token, but it's accepted by
belaran@964 154 several popular Unix shells, and is very useful.
belaran@964 155 <!-- &interaction.filenames.glob.starstar; -->
belaran@964 156 </para>
belaran@964 157
belaran@964 158 <para>The <quote><literal>?</literal></quote> pattern matches any single character.
belaran@964 159 <!-- &interaction.filenames.glob.question; -->
belaran@964 160 </para>
belaran@964 161
belaran@964 162 <para>The <quote><literal>[</literal></quote> character begins a <emphasis>character class</emphasis>. This
belaran@964 163 matches any single character within the class. The class ends with a
belaran@964 164 <quote><literal>]</literal></quote> character. A class may contain multiple <emphasis>range</emphasis>s
belaran@964 165 of the form <quote><literal>a-f</literal></quote>, which is shorthand for
belaran@964 166 <quote><literal>abcdef</literal></quote>.
belaran@964 167 <!-- &interaction.filenames.glob.range; -->
belaran@964 168 If the first character after the <quote><literal>[</literal></quote> in a character class
belaran@964 169 is a <quote><literal>!</literal></quote>, it <emphasis>negates</emphasis> the class, making it match any
belaran@964 170 single character not in the class.
belaran@964 171 </para>
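<para>Character classes, ranges and negation can be tried out with
Python's standard <literal>fnmatch</literal> module, which accepts the
same bracket notation. This is a stand-in for illustration only; it is
not the matcher Mercurial uses, and the file names below are made
up.</para>
<programlisting>
```python
from fnmatch import fnmatchcase

names = ["a1.txt", "g1.txt", "f9.txt"]  # hypothetical file names

# [a-f] is shorthand for [abcdef].
in_range = [n for n in names if fnmatchcase(n, "[a-f]?.txt")]

# A leading "!" negates the class.
negated = [n for n in names if fnmatchcase(n, "[!a-f]?.txt")]

print(in_range)  # ['a1.txt', 'f9.txt']
print(negated)   # ['g1.txt']
```
</programlisting>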
belaran@964 172
belaran@964 173 <para>A <quote><literal>{</literal></quote> begins a group of subpatterns, where the whole group
belaran@964 174 matches if any subpattern in the group matches. The <quote><literal>,</literal></quote>
belaran@964 175 character separates subpatterns, and <quote><literal>}</literal></quote> ends the group.
belaran@964 176 <!-- &interaction.filenames.glob.group; -->
belaran@964 177 </para>
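<para>One way to picture a group is as shorthand for several separate
patterns. The Python sketch below expands a group into the patterns it
stands for. The function <literal>expand_group</literal> is a
hypothetical helper written for this example; it does not handle
nested groups properly, and it says nothing about how Mercurial
implements groups internally.</para>
<programlisting>
```python
import re


def expand_group(pattern):
    """Expand {a,b,...} groups into one pattern per subpattern.

    Nested groups are not handled properly in this sketch.
    """
    m = re.search(r"\{([^{}]*)\}", pattern)
    if m is None:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    expanded = []
    for alt in m.group(1).split(","):
        # Recurse so that any later group in `tail` is expanded too.
        expanded.extend(expand_group(head + alt + tail))
    return expanded


print(expand_group("src/*.{c,h}"))  # ['src/*.c', 'src/*.h']
```
</programlisting>
<para>A file matches the original pattern if it matches any one of the
expanded patterns.</para>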
belaran@964 178
belaran@964 179 <sect3>
belaran@964 180 <title>Watch out!</title>
belaran@964 181
belaran@964 182 <para>Don't forget that if you want to match a pattern in any directory, you
belaran@964 183 should not be using the <quote><literal>*</literal></quote> match-any token, as this will
belaran@964 184 only match within one directory. Instead, use the <quote><literal>**</literal></quote>
belaran@964 185 token. This small example illustrates the difference between the two.
belaran@964 186 <!-- &interaction.filenames.glob.star-starstar; -->
belaran@964 187 </para>
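<para>The difference between the two tokens can also be seen by
translating a glob into a regular expression. The following Python
sketch does this for <literal>*</literal>, <literal>**</literal> and
<literal>?</literal> only. The helpers
<literal>glob_to_regex</literal> and <literal>matches</literal> are
hypothetical, the paths are invented, and the translation is my own
approximation of the rules stated above, not Mercurial's
implementation.</para>
<programlisting>
```python
import re


def glob_to_regex(glob):
    """Translate a glob that uses only *, ** and ? into a regex string."""
    out = []
    i = 0
    while i != len(glob):
        if glob.startswith("**", i):
            out.append(".*")      # any string, crossing directories
            i += 2
        elif glob[i] == "*":
            out.append("[^/]*")   # any string within one directory
            i += 1
        elif glob[i] == "?":
            out.append(".")       # any single character
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return "".join(out)


def matches(glob, path):
    """Return True if the whole path matches the glob."""
    return re.fullmatch(glob_to_regex(glob), path) is not None


paths = ["main.py", "src/util.py", "src/deep/x.py"]
print([p for p in paths if matches("*.py", p)])   # ['main.py']
print([p for p in paths if matches("**.py", p)])  # all three paths
```
</programlisting>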
belaran@964 188
belaran@964 189 </sect3>
belaran@964 190 </sect2>
belaran@964 191 <sect2>
belaran@964 192 <title>Regular expression matching with <literal>re</literal> patterns</title>
belaran@964 193
belaran@964 194 <para>Mercurial accepts the same regular expression syntax as the Python
belaran@964 195 programming language (it uses Python's regexp engine internally).
belaran@964 196 This is based on the Perl language's regexp syntax, which is the most
belaran@964 197 popular dialect in use (it's also used in Java, for example).
belaran@964 198 </para>
belaran@964 199
belaran@964 200 <para>I won't discuss Mercurial's regexp dialect in any detail here, as
belaran@964 201 regexps are not often used. Perl-style regexps are in any case
belaran@964 202 already exhaustively documented on a multitude of web sites, and in
belaran@964 203 many books. Instead, I will focus here on a few things you should
belaran@964 204 know if you find yourself needing to use regexps with Mercurial.
belaran@964 205 </para>
belaran@964 206
belaran@964 207 <para>A regexp is matched against an entire file name, relative to the root
belaran@964 208 of the repository. In other words, even if you're already in the
belaran@964 209 subdirectory <filename class="directory">foo</filename>, if you want to match files under this
belaran@964 210 directory, your pattern must start with <quote><literal>foo/</literal></quote>.
belaran@964 211 </para>
belaran@964 212
belaran@964 213 <para>One thing to note, if you're familiar with Perl-style regexps, is that
belaran@964 214 Mercurial's are <emphasis>rooted</emphasis>. That is, a regexp starts matching
belaran@964 215 against the beginning of a string; it doesn't look for a match
belaran@964 216 anywhere within the string. To match anywhere in a string, start
belaran@964 217 your pattern with <quote><literal>.*</literal></quote>.
belaran@964 218 </para>
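<para>Since the text above says Mercurial uses Python's regexp engine,
rooted matching can be tried directly in Python, where
<literal>re.match</literal> only succeeds at the start of a string.
The path below is a made-up example, and the sketch shows only the
rooting behaviour, not how Mercurial invokes the engine.</para>
<programlisting>
```python
import re

# Hypothetical path, relative to the root of the repository.
path = "foo/src/main.c"

# re.match is rooted: it only succeeds at the start of the string.
print(bool(re.match(r"main\.c", path)))    # False
print(bool(re.match(r".*main\.c", path)))  # True

# To match files under directory foo, the pattern starts with "foo/".
print(bool(re.match(r"foo/", path)))       # True
```
</programlisting>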
belaran@964 219
belaran@964 220 </sect2>
belaran@964 221 </sect1>
belaran@964 222 <sect1>
belaran@964 223 <title>Filtering files</title>
belaran@964 224
belaran@964 225 <para>Not only does Mercurial give you a variety of ways to specify files;
belaran@964 226 it lets you further winnow those files using <emphasis>filters</emphasis>. Commands
belaran@964 227 that work with file names accept two filtering options.
belaran@964 228 </para>
belaran@964 229 <itemizedlist>
belaran@964 230 <listitem><para><option role="hg-opt-global">-I</option>, or <option role="hg-opt-global">--include</option>, lets you specify a pattern
belaran@964 231 that file names must match in order to be processed.
belaran@964 232 </para>
belaran@964 233 </listitem>
belaran@964 234 <listitem><para><option role="hg-opt-global">-X</option>, or <option role="hg-opt-global">--exclude</option>, gives you a way to
belaran@964 235 <emphasis>avoid</emphasis> processing files, if they match this pattern.
belaran@964 236 </para>
belaran@964 237 </listitem></itemizedlist>
belaran@964 238 <para>You can provide multiple <option role="hg-opt-global">-I</option> and <option role="hg-opt-global">-X</option> options on the
belaran@964 239 command line, and intermix them as you please. Mercurial interprets
belaran@964 240 the patterns you provide using glob syntax by default (but you can use
belaran@964 241 regexps if you need to).
belaran@964 242 </para>
belaran@964 243
belaran@964 244 <para>You can read a <option role="hg-opt-global">-I</option> filter as <quote>process only the files that
belaran@964 245 match this filter</quote>.
belaran@964 246 <!-- &interaction.filenames.filter.include; -->
belaran@964 247 The <option role="hg-opt-global">-X</option> filter is best read as <quote>process only the files that
belaran@964 248 don't match this pattern</quote>.
belaran@964 249 <!-- &interaction.filenames.filter.exclude; -->
belaran@964 250 </para>
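<para>The combined effect of the two filters can be sketched in Python
as follows. This is an illustration under stated assumptions, not
Mercurial's code: <literal>select</literal> is a hypothetical function,
a file is assumed to pass if it matches at least one include pattern
(when any are given) and no exclude pattern, and Python's
<literal>fnmatch</literal> module stands in for Mercurial's glob
matcher. The example uses flat file names because
<literal>fnmatch</literal> lets <literal>*</literal> cross directory
boundaries, unlike the globs described in this chapter.</para>
<programlisting>
```python
from fnmatch import fnmatchcase


def select(names, include=(), exclude=()):
    """Keep names that match some include pattern (if any were given)
    and that match no exclude pattern."""
    kept = []
    for name in names:
        if include and not any(fnmatchcase(name, p) for p in include):
            continue
        if any(fnmatchcase(name, p) for p in exclude):
            continue
        kept.append(name)
    return kept


names = ["a.c", "a.h", "b.c", "notes.txt"]  # hypothetical file names
print(select(names, include=["*.c", "*.h"]))            # ['a.c', 'a.h', 'b.c']
print(select(names, exclude=["*.txt"]))                 # ['a.c', 'a.h', 'b.c']
print(select(names, include=["*.c"], exclude=["b.*"]))  # ['a.c']
```
</programlisting>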
belaran@964 251
belaran@964 252 </sect1>
belaran@964 253 <sect1>
belaran@964 254 <title>Ignoring unwanted files and directories</title>
belaran@964 255
belaran@964 256 <para>XXX.
belaran@964 257 </para>
belaran@964 258
belaran@964 259 </sect1>
belaran@964 260 <sect1 id="sec:names:case">
belaran@964 261 <title>Case sensitivity</title>
belaran@964 264
belaran@964 265 <para>If you're working in a mixed development environment that contains
belaran@964 266 both Linux (or other Unix) systems and Macs or Windows systems, you
belaran@964 267 should keep in mind that these systems treat the
belaran@964 268 case (<quote>N</quote> versus <quote>n</quote>) of file names in incompatible ways. This is
belaran@964 269 not very likely to affect you, and it's easy to deal with if it does,
belaran@964 270 but it could surprise you if you don't know about it.
belaran@964 271 </para>
belaran@964 272
belaran@964 273 <para>Operating systems and filesystems differ in the way they handle the
belaran@964 274 <emphasis>case</emphasis> of characters in file and directory names. There are
belaran@964 275 three common ways to handle case in names.
belaran@964 276 </para>
belaran@964 277 <itemizedlist>
belaran@964 278 <listitem><para>Completely case insensitive. Uppercase and lowercase versions
belaran@964 279 of a letter are treated as identical, both when creating a file and
belaran@964 280 during subsequent accesses. This is common on older DOS-based
belaran@964 281 systems.
belaran@964 282 </para>
belaran@964 283 </listitem>
belaran@964 284 <listitem><para>Case preserving, but insensitive. When a file or directory is
belaran@964 285 created, the case of its name is stored, and can be retrieved and
belaran@964 286 displayed by the operating system. When an existing file is being
belaran@964 287 looked up, its case is ignored. This is the standard arrangement on
belaran@964 288 Windows and MacOS. The names <filename>foo</filename> and <filename>FoO</filename>
belaran@964 289 identify the same file. This treatment of uppercase and lowercase
belaran@964 290 letters as interchangeable is also referred to as <emphasis>case
belaran@964 291 folding</emphasis>.
belaran@964 292 </para>
belaran@964 293 </listitem>
belaran@964 294 <listitem><para>Case sensitive. The case of a name is significant at all times.
belaran@964 295 The names <filename>foo</filename> and <filename>FoO</filename> identify different files. This
belaran@964 296 is the way Linux and Unix systems normally work.
belaran@964 297 </para>
belaran@964 298 </listitem></itemizedlist>
belaran@964 299
belaran@964 300 <para>On Unix-like systems, it is possible to have any or all of the above
belaran@964 301 ways of handling case in action at once. For example, if you use a
belaran@964 302 USB thumb drive formatted with a FAT32 filesystem on a Linux system,
belaran@964 303 Linux will handle names on that filesystem in a case preserving, but
belaran@964 304 insensitive, way.
belaran@964 305 </para>
belaran@964 306
belaran@964 307 <sect2>
belaran@964 308 <title>Safe, portable repository storage</title>
belaran@964 309
belaran@964 310 <para>Mercurial's repository storage mechanism is <emphasis>case safe</emphasis>. It
belaran@964 311 translates file names so that they can be safely stored on both case
belaran@964 312 sensitive and case insensitive filesystems. This means that you can
belaran@964 313 use normal file copying tools to transfer a Mercurial repository onto,
belaran@964 314 for example, a USB thumb drive, and safely move that drive and
belaran@964 315 repository back and forth between a Mac, a PC running Windows, and a
belaran@964 316 Linux box.
belaran@964 317 </para>
belaran@964 318
belaran@964 319 </sect2>
belaran@964 320 <sect2>
belaran@964 321 <title>Detecting case conflicts</title>
belaran@964 322
belaran@964 323 <para>When operating in the working directory, Mercurial honours the naming
belaran@964 324 policy of the filesystem where the working directory is located. If
belaran@964 325 the filesystem is case preserving, but insensitive, Mercurial will
belaran@964 326 treat names that differ only in case as the same.
belaran@964 327 </para>
belaran@964 328
belaran@964 329 <para>An important aspect of this approach is that it is possible to commit
belaran@964 330 a changeset on a case sensitive (typically Linux or Unix) filesystem
belaran@964 331 that will cause trouble for users on case insensitive (usually Windows
belaran@964 332 and MacOS) filesystems. If a Linux user commits changes to two files, one
belaran@964 333 named <filename>myfile.c</filename> and the other named <filename>MyFile.C</filename>,
belaran@964 334 they will be stored correctly in the repository. And in the working
belaran@964 335 directories of other Linux users, they will be correctly represented
belaran@964 336 as separate files.
belaran@964 337 </para>
belaran@964 338
belaran@964 339 <para>If a Windows or Mac user pulls this change, they will not initially
belaran@964 340 have a problem, because Mercurial's repository storage mechanism is
belaran@964 341 case safe. However, once they try to <command role="hg-cmd">hg update</command> the working
belaran@964 342 directory to that changeset, or <command role="hg-cmd">hg merge</command> with that changeset,
belaran@964 343 Mercurial will spot the conflict between the two file names that the
belaran@964 344 filesystem would treat as the same, and forbid the update or merge
belaran@964 345 from occurring.
belaran@964 346 </para>
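<para>The kind of check involved can be sketched in a few lines of
Python: group the file names in a changeset by their case-folded form,
and report any group with more than one member. The function
<literal>case_conflicts</literal> is hypothetical, and lowercasing is
used here as a simple approximation of case folding; this is not the
check Mercurial itself performs.</para>
<programlisting>
```python
from collections import defaultdict


def case_conflicts(names):
    """Return groups of names that collide once case is ignored."""
    folded = defaultdict(list)
    for name in names:
        # Lowercasing approximates the filesystem's case folding.
        folded[name.lower()].append(name)
    return [group for group in folded.values() if len(group) > 1]


print(case_conflicts(["myfile.c", "MyFile.C", "README"]))
# [['myfile.c', 'MyFile.C']]
```
</programlisting>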
belaran@964 347
belaran@964 348 </sect2>
belaran@964 349 <sect2>
belaran@964 350 <title>Fixing a case conflict</title>
belaran@964 351
belaran@964 352 <para>If you are using Windows or a Mac in a mixed environment where some of
belaran@964 353 your collaborators are using Linux or Unix, and Mercurial reports a
belaran@964 354 case folding conflict when you try to <command role="hg-cmd">hg update</command> or <command role="hg-cmd">hg merge</command>,
belaran@964 355 the procedure to fix the problem is simple.
belaran@964 356 </para>
belaran@964 357
belaran@964 358 <para>Just find a nearby Linux or Unix box, clone the problem repository
belaran@964 359 onto it, and use Mercurial's <command role="hg-cmd">hg rename</command> command to change the
belaran@964 360 names of any offending files or directories so that they will no
belaran@964 361 longer cause case folding conflicts. Commit this change, <command role="hg-cmd">hg pull</command>
belaran@964 362 or <command role="hg-cmd">hg push</command> it across to your Windows or MacOS system, and
belaran@964 363 <command role="hg-cmd">hg update</command> to the revision with the non-conflicting names.
belaran@964 364 </para>
belaran@964 365
belaran@964 366 <para>The changeset with case-conflicting names will remain in your
belaran@964 367 project's history, and you still won't be able to <command role="hg-cmd">hg update</command> your
belaran@964 368 working directory to that changeset on a Windows or MacOS system, but
belaran@964 369 you can continue development unimpeded.
belaran@964 370 </para>
belaran@964 371
belaran@964 372 <note>
belaran@964 373 <para> Prior to version 0.9.3, Mercurial did not use a case safe repository
belaran@964 374 storage mechanism, and did not detect case folding conflicts. If
belaran@964 375 you are using an older version of Mercurial on Windows or MacOS, I
belaran@964 376 strongly recommend that you upgrade.
belaran@964 377 </para>
belaran@964 378 </note>
belaran@964 379
belaran@964 380 </sect2>
belaran@964 381 </sect1>
belaran@964 382 </chapter>
belaran@964 383
belaran@964 384 <!--
belaran@964 385 local variables:
belaran@964 386 sgml-parent-document: ("00book.xml" "book" "chapter")
belaran@964 387 end:
belaran@964 388 -->