hgbook
changeset 111:34b8b7a15ea1
More material.
author | Bryan O'Sullivan <bos@serpentine.com> |
---|---|
date | Fri Nov 10 15:32:33 2006 -0800 (2006-11-10) |
parents | 75c076c7a374 |
children | 2fcead053b7a |
files | en/concepts.tex |
--- a/en/concepts.tex Fri Nov 10 15:09:49 2006 -0800
+++ b/en/concepts.tex Fri Nov 10 15:32:33 2006 -0800
@@ -8,9 +8,9 @@
 
 This understanding gives me confidence that Mercurial has been
 carefully designed to be both \emph{safe} and \emph{efficient}.  And
-just as importantly, if I have a good idea what the software is doing
-when I perform a revision control task, I'm less likely to be
-surprised by its behaviour.
+just as importantly, if it's easy for me to retain a good idea of what
+the software is doing when I perform a revision control task, I'm less
+likely to be surprised by its behaviour.
 
 \section{Mercurial's historical record}
 
@@ -179,7 +179,10 @@
 Along with delta or snapshot information, a revlog entry contains a
 cryptographic hash of the data that it represents.  This makes it
 difficult to forge the contents of a revision, and easy to detect
-accidental corruption.
+accidental corruption.  The hash that Mercurial uses is SHA-1, which
+is 160 bits long.  Although all revision data is hashed, the changeset
+hashes that you see as an end user are from revisions of the
+changelog.  Manifest and file hashes are only used behind the scenes.
 
 Mercurial checks these hashes when retrieving file revisions and when
 pulling changes from a repository.  If it encounters an integrity
@@ -329,7 +332,34 @@
 \filename{dirstate}.  The file named \filename{dirstate} is thus
 guaranteed to be complete, not partially written.
 
-
+\subsection{Avoiding seeks}
+
+Critical to Mercurial's performance is the avoidance of seeks of the
+disk head, since any seek is far more expensive than even a
+comparatively large read operation.
+
+This is why, for example, the dirstate is stored in a single file.  If
+there were a dirstate file per directory that Mercurial tracked, the
+disk would seek once per directory.  Instead, Mercurial reads the
+entire single dirstate file in one step.
+
+Mercurial also uses a ``copy on write'' scheme when cloning a
+repository on local storage.  Instead of copying every revlog file
+from the old repository into the new repository, it makes a ``hard
+link'', which is a shorthand way to say ``these two names point to the
+same file''.  When Mercurial is about to write to one of a revlog's
+files, it checks to see if the number of names pointing at the file is
+greater than one.  If it is, more than one repository is using the
+file, so Mercurial makes a new copy of the file that is private to
+this repository.
+
+A few revision control developers have pointed out that this idea of
+making a complete private copy of a file is not very efficient in its
+use of storage.  While this is true, storage is cheap, and this method
+gives the highest performance while deferring most book-keeping to the
+operating system.  An alternative scheme would most likely reduce
+performance and increase the complexity of the software, each of which
+is much more important to the ``feel'' of day-to-day use.
 
 %%% Local Variables: 
 %%% mode: latex
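
The ``copy on write'' paragraph added in this changeset can be illustrated with a minimal Python sketch. This is not Mercurial's own code: the function names `break_hardlink_before_write` and `append_to_revlog` are hypothetical, and the sketch assumes a filesystem that supports hard links and reports link counts through `os.stat`.

```python
import os
import shutil
import tempfile


def break_hardlink_before_write(path):
    """Give `path` a private copy if other names still share its data.

    Hypothetical helper, not a Mercurial API. Returns True if a copy
    was made, False if the file was already private.
    """
    # st_nlink is the number of names (hard links) pointing at the file.
    if os.stat(path).st_nlink > 1:
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        os.close(fd)
        shutil.copy2(path, tmp)  # copy the data and metadata to a new file
        os.replace(tmp, path)    # rebind the name to the private copy
        return True
    return False


def append_to_revlog(path, data):
    """Append bytes to a file, unsharing it first (hypothetical helper)."""
    break_hardlink_before_write(path)
    with open(path, "ab") as f:
        f.write(data)
```

A "clone" in this sketch is simply `os.link(old_path, new_path)`. The first call to `append_to_revlog` on either name leaves the other name's contents untouched, and later writes skip the copy because the link count has dropped back to one.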