File names and pattern matching

Mercurial provides mechanisms that let you work with file names in a consistent and expressive way.

Simple file naming

Mercurial uses a unified piece of machinery under the hood to handle file names. Every command behaves uniformly with respect to file names. The way in which commands work with file names is as follows.

If you explicitly name real files on the command line, Mercurial works with exactly those files, as you would expect.

&interaction.filenames.files;

When you provide a directory name, Mercurial will interpret this as "operate on every file in this directory and its subdirectories". Mercurial traverses the files and subdirectories in a directory in alphabetical order. When it encounters a subdirectory, it will traverse that subdirectory before continuing with the current directory.

&interaction.filenames.dirs;

Running commands without any file names

Mercurial's commands that work with file names have useful default behaviours when you invoke them without providing any file names or patterns. What kind of behaviour you should expect depends on what the command does. Here are a few rules of thumb you can use to predict what a command is likely to do if you don't give it any names to work with.

Most commands will operate on the entire working directory. This is what the hg add command does, for example.

If the command has effects that are difficult or impossible to reverse, it will force you to explicitly provide at least one name or pattern (see below).
This protects you from accidentally deleting files by running hg remove with no arguments, for example.

It's easy to work around these default behaviours if they don't suit you. If a command normally operates on the whole working directory, you can invoke it on just the current directory and its subdirectories by giving it the name ".".

&interaction.filenames.wdir-subdir;

Along the same lines, some commands normally print file names relative to the root of the repository, even if you're invoking them from a subdirectory. Such a command will print file names relative to your subdirectory if you give it explicit names. Here, we're going to run hg status from a subdirectory, and get it to operate on the entire working directory while printing file names relative to our subdirectory, by passing it the output of the hg root command.

&interaction.filenames.wdir-relname;

Telling you what's going on

The hg add example in the preceding section illustrates something else that's helpful about Mercurial commands. If a command operates on a file that you didn't name explicitly on the command line, it will usually print the name of the file, so that you will not be surprised by what's going on.

The principle here is that of least surprise. If you've exactly named a file on the command line, there's no point in repeating it back at you. If Mercurial is acting on a file implicitly, because you provided no names, or a directory, or a pattern (see below), it's safest to tell you what it's doing.

For commands that behave this way, you can silence them using the -q option.
You can also get them to print the name of every file, even those you've named explicitly, using the -v option.

Using patterns to identify files

In addition to working with file and directory names, Mercurial lets you use patterns to identify files. Mercurial's pattern handling is expressive.

On Unix-like systems (Linux, MacOS, etc.), the job of matching file names to patterns normally falls to the shell. On these systems, you must explicitly tell Mercurial that a name is a pattern. On Windows, the shell does not expand patterns, so Mercurial will automatically identify names that are patterns, and expand them for you.

To provide a pattern in place of a regular name on the command line, the mechanism is simple:

syntax:patternbody

That is, a pattern is identified by a short text string that says what kind of pattern this is, followed by a colon, followed by the actual pattern.

Mercurial supports two kinds of pattern syntax. The most frequently used is called glob; this is the same kind of pattern matching used by the Unix shell, and should be familiar to Windows command prompt users, too.

When Mercurial does automatic pattern matching on Windows, it uses glob syntax. You can thus omit the glob: prefix on Windows, but it's safe to use it, too.

The re syntax is more powerful; it lets you specify patterns using regular expressions, also known as regexps.

By the way, in the examples that follow, notice that I'm careful to wrap all of my patterns in quote characters, so that they won't get expanded by the shell before Mercurial sees them.
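
To make the syntax:patternbody convention concrete, here is a minimal Python sketch of how a name could be split into its syntax prefix and pattern body. It is an illustration only, not Mercurial's own code: the function name split_pattern and the KNOWN_SYNTAXES tuple are invented for this example, and only the two syntaxes described above are recognised.

```python
# Hypothetical helper, not part of Mercurial: split "syntax:patternbody".
KNOWN_SYNTAXES = ("glob", "re")  # assumption: only the two kinds discussed here


def split_pattern(name):
    """Return (syntax, body); syntax is None when name is a plain file name."""
    syntax, colon, body = name.partition(":")
    if colon and syntax in KNOWN_SYNTAXES:
        return syntax, body
    # No recognised prefix: treat the whole string as an ordinary name.
    return None, name


print(split_pattern("glob:*.py"))
print(split_pattern("src/main.c"))
```

Checking the prefix against a list of known syntaxes, rather than splitting on any colon, keeps a name that merely contains a colon from being mistaken for a pattern.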

Shell-style <literal>glob</literal> patterns

This is an overview of the kinds of patterns you can use when you're matching on glob patterns.

The * character matches any string, within a single directory.

&interaction.filenames.glob.star;

The ** pattern matches any string, and crosses directory boundaries. It's not a standard Unix glob token, but it's accepted by several popular Unix shells, and is very useful.

&interaction.filenames.glob.starstar;

The ? pattern matches any single character.

&interaction.filenames.glob.question;

The [ character begins a character class. This matches any single character within the class. The class ends with a ] character. A class may contain multiple ranges of the form a-f, which is shorthand for abcdef.

&interaction.filenames.glob.range;

If the first character after the [ in a character class is a !, it negates the class, making it match any single character not in the class.

A { begins a group of subpatterns, where the whole group matches if any subpattern in the group matches. The , character separates subpatterns, and } ends the group.

&interaction.filenames.glob.group;

Watch out!

Don't forget that if you want to match a pattern in any directory, you should not be using the * match-any token, as this will only match within one directory. Instead, use the ** token. This small example illustrates the difference between the two.
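
For readers who think in code, the same distinction can be modelled with a toy translator from globs to regular expressions. This is only a sketch under stated assumptions: the names glob_to_regex and glob_matches are invented for illustration, character classes and {a,b} groups are left out, and this is not Mercurial's own matcher.

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified glob (only *, ** and ?) into a regexp string."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")       # ** may cross directory boundaries
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")    # * stops at the next "/"
            i += 1
        elif pattern[i] == "?":
            out.append(".")        # ? is any single character
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def glob_matches(pattern, name):
    """True if the whole of name matches the simplified glob."""
    return re.fullmatch(glob_to_regex(pattern), name) is not None


print(glob_matches("*.c", "src/main.c"))   # False: * stays in one directory
print(glob_matches("**.c", "src/main.c"))  # True: ** crosses the "/"
```

The only difference between the two tokens in this model is whether the generated expression is allowed to consume a "/" character.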

&interaction.filenames.glob.star-starstar;

Regular expression matching with <literal>re</literal> patterns

Mercurial accepts the same regular expression syntax as the Python programming language (it uses Python's regexp engine internally). This is based on the Perl language's regexp syntax, which is the most popular dialect in use (it's also used in Java, for example).

I won't discuss Mercurial's regexp dialect in any detail here, as regexps are not often used. Perl-style regexps are in any case already exhaustively documented on a multitude of web sites, and in many books. Instead, I will focus here on a few things you should know if you find yourself needing to use regexps with Mercurial.

A regexp is matched against an entire file name, relative to the root of the repository. In other words, even if you're already in subdirectory foo, if you want to match files under this directory, your pattern must start with foo/.

One thing to note, if you're familiar with Perl-style regexps, is that Mercurial's are rooted. That is, a regexp starts matching against the beginning of a string; it doesn't look for a match anywhere within the string. To match anywhere in a string, start your pattern with .*.

Filtering files

Not only does Mercurial give you a variety of ways to specify files; it lets you further winnow those files using filters. Commands that work with file names accept two filtering options.

-I, or --include, lets you specify a pattern that file names must match in order to be processed.

-X, or --exclude, gives you a way to avoid processing files, if they match this pattern.

You can provide multiple -I and -X options on the command line, and intermix them as you please. Mercurial interprets the patterns you provide using glob syntax by default (but you can use regexps if you need to).

You can read a -I filter as "process only the files that match this filter".

&interaction.filenames.filter.include;

The -X filter is best read as "process only the files that don't match this pattern".

&interaction.filenames.filter.exclude;

Ignoring unwanted files and directories

XXX.

Case sensitivity

If you're working in a mixed development environment that contains both Linux (or other Unix) systems and Macs or Windows systems, you should keep in the back of your mind the knowledge that they treat the case ("N" versus "n") of file names in incompatible ways. This is not very likely to affect you, and it's easy to deal with if it does, but it could surprise you if you don't know about it.

Operating systems and filesystems differ in the way they handle the case of characters in file and directory names. There are three common ways to handle case in names.

Completely case insensitive. Uppercase and lowercase versions of a letter are treated as identical, both when creating a file and during subsequent accesses. This is common on older DOS-based systems.

Case preserving, but insensitive.
When a file or directory is created, the case of its name is stored, and can be retrieved and displayed by the operating system. When an existing file is being looked up, its case is ignored. This is the standard arrangement on Windows and MacOS. The names foo and FoO identify the same file. This treatment of uppercase and lowercase letters as interchangeable is also referred to as case folding.

Case sensitive. The case of a name is significant at all times. The names foo and FoO identify different files. This is the way Linux and Unix systems normally work.

On Unix-like systems, it is possible to have any or all of the above ways of handling case in action at once. For example, if you use a USB thumb drive formatted with a FAT32 filesystem on a Linux system, Linux will handle names on that filesystem in a case preserving, but insensitive, way.

Safe, portable repository storage

Mercurial's repository storage mechanism is case safe. It translates file names so that they can be safely stored on both case sensitive and case insensitive filesystems. This means that you can use normal file copying tools to transfer a Mercurial repository onto, for example, a USB thumb drive, and safely move that drive and repository back and forth between a Mac, a PC running Windows, and a Linux box.

Detecting case conflicts

When operating in the working directory, Mercurial honours the naming policy of the filesystem where the working directory is located. If the filesystem is case preserving, but insensitive, Mercurial will treat names that differ only in case as the same.
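
The "case preserving, but insensitive" policy is perhaps the easiest of the three to misread, so here is a toy Python model of a single directory that follows it. The class CasePreservingDir is invented for this illustration, and str.lower() is used as a stand-in for case folding; real filesystems apply their own folding rules.

```python
class CasePreservingDir:
    """Toy model of one case preserving, but insensitive, directory."""

    def __init__(self):
        # Maps the folded name to the spelling used when the file was created.
        self._entries = {}

    def create(self, name):
        """Create name if nothing with the same folded name exists yet.

        Returns the stored spelling, which is the original one on a clash.
        """
        return self._entries.setdefault(name.lower(), name)

    def lookup(self, name):
        """Return the stored spelling for name, ignoring case, or None."""
        return self._entries.get(name.lower())


d = CasePreservingDir()
d.create("foo")
print(d.lookup("FoO"))  # prints "foo": same file, original case preserved
```

A fully case sensitive directory would simply key the dictionary on the name itself, so that foo and FoO become two separate entries.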

An important consequence of honouring the filesystem's naming policy is that it is possible to commit a changeset on a case sensitive (typically Linux or Unix) filesystem that will cause trouble for users on case insensitive (usually Windows and MacOS) filesystems. If a Linux user commits changes to two files, one named myfile.c and the other named MyFile.C, they will be stored correctly in the repository. And in the working directories of other Linux users, they will be correctly represented as separate files.

If a Windows or Mac user pulls this change, they will not initially have a problem, because Mercurial's repository storage mechanism is case safe. However, once they try to hg update the working directory to that changeset, or hg merge with that changeset, Mercurial will spot the conflict between the two file names that the filesystem would treat as the same, and forbid the update or merge from occurring.

Fixing a case conflict

If you are using Windows or a Mac in a mixed environment where some of your collaborators are using Linux or Unix, and Mercurial reports a case folding conflict when you try to hg update or hg merge, the procedure to fix the problem is simple.

Just find a nearby Linux or Unix box, clone the problem repository onto it, and use Mercurial's hg rename command to change the names of any offending files or directories so that they will no longer cause case folding conflicts. Commit this change, hg pull or hg push it across to your Windows or MacOS system, and hg update to the revision with the non-conflicting names.
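
If it isn't obvious which names are the offending ones, a short script can list them before you start renaming. The following Python sketch is an assumption-laden illustration, not a Mercurial feature: the function case_collisions is invented here, it expects a list of repository-relative paths separated by "/", and it approximates case folding with str.lower().

```python
from collections import defaultdict


def case_collisions(paths):
    """Group file and directory names that differ only in case."""
    seen = defaultdict(set)
    for path in paths:
        parts = path.split("/")
        # Check every leading directory as well as the file itself, since
        # "Doc/a.txt" and "doc/b.txt" clash at the directory level.
        for end in range(1, len(parts) + 1):
            prefix = "/".join(parts[:end])
            seen[prefix.lower()].add(prefix)
    return sorted(sorted(names) for names in seen.values() if len(names) > 1)


print(case_collisions(["myfile.c", "MyFile.C", "README"]))
# [['MyFile.C', 'myfile.c']]
```

Each group in the result is a set of names that a case insensitive filesystem would treat as one, so renaming all but one member of each group removes the conflict.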

The changeset with case-conflicting names will remain in your project's history, and you still won't be able to hg update your working directory to that changeset on a Windows or MacOS system, but you can continue development unimpeded.

Prior to version 0.9.3, Mercurial did not use a case safe repository storage mechanism, and did not detect case folding conflicts. If you are using an older version of Mercurial on Windows or MacOS, I strongly recommend that you upgrade.