Hg Update Commit Descriptive Essay

Prebuilt binary packages of Mercurial are available for every popular operating system. These make it easy to start using Mercurial on your computer immediately.

The best version of Mercurial for Windows is TortoiseHg, which can be found at http://bitbucket.org/tortoisehg/stable/wiki/Home. This package has no external dependencies; it “just works”. It provides both command line and graphical user interfaces.

Because each Linux distribution has its own packaging tools, policies, and rate of development, it's difficult to give a comprehensive set of instructions on how to install Mercurial binaries. The version of Mercurial that you end up with can vary depending on how active your distribution's package maintainer is.

To keep things simple, I will focus on installing Mercurial from the command line under the most popular Linux distributions. Most of these distributions provide graphical package managers that will let you install Mercurial with a single click; the package name to look for is mercurial.

  • Ubuntu and Debian:

    apt-get install mercurial
  • Fedora:

    yum install mercurial
  • OpenSUSE:

    zypper install mercurial
  • Gentoo:

    emerge mercurial

Installing Mercurial on your system

To begin, we'll use the hg version command to find out whether Mercurial is installed properly. The actual version information that it prints isn't so important; we simply care whether the command runs and prints anything at all.

    $ hg version
    Mercurial Distributed SCM (version 1.2)

    Copyright (C) 2005-2008 Matt Mackall <mpm@selenic.com> and others
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Mercurial provides a built-in help system. This is invaluable for those times when you find yourself stuck trying to remember how to run a command. If you are completely stuck, simply run hg help; it will print a brief list of commands, along with a description of what each does. If you ask for help on a specific command (as below), it prints more detailed information.

    $ hg help init
    hg init [-e CMD] [--remotecmd CMD] [DEST]

    create a new repository in the given directory

        Initialize a new repository in the given directory. If the given
        directory does not exist, it is created.

        If no directory is given, the current directory is used.

        It is possible to specify an ssh:// URL as the destination.
        See 'hg help urls' for more information.

    options:

     -e --ssh        specify ssh command to use
        --remotecmd  specify hg command to run on the remote side

    use "hg -v help init" to show global options

For a more impressive level of detail (which you won't usually need), run hg help -v. The -v option is short for --verbose, and tells Mercurial to print more information than it usually would.

In Mercurial, everything happens inside a repository. The repository for a project contains all of the files that “belong to” that project, along with a historical record of the project's files.

There's nothing particularly magical about a repository; it is simply a directory tree in your filesystem that Mercurial treats as special. You can rename or delete a repository any time you like, using either the command line or your file browser.

Copying a repository is just a little bit special. While you could use a normal file copying command to make a copy of a repository, it's best to use a built-in command that Mercurial provides. This command is called hg clone, because it makes an identical copy of an existing repository.

    destination directory: hello
    requesting all changes
    adding changesets
    adding manifests
    adding file changes
    added 5 changesets with 5 changes to 2 files
    updating working directory
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved

One advantage of using hg clone is that, as we can see above, it lets us clone repositories over the network. Another is that it remembers where we cloned from, which we'll find useful soon when we want to fetch new changes from another repository.

If our clone succeeded, we should now have a local directory called hello. This directory will contain some files.

    $ ls -l
    total 4
    drwxrwxr-x 3 bos bos 4096 May  5 06:55 hello
    $ ls hello
    Makefile  hello.c

These files have the same contents and history in our repository as they do in the repository we cloned.

Every Mercurial repository is complete, self-contained, and independent. It contains its own private copy of a project's files and history. As we just mentioned, a cloned repository remembers the location of the repository it was cloned from, but Mercurial will not communicate with that repository, or any other, unless you tell it to.

What this means for now is that we're free to experiment with our repository, safe in the knowledge that it's a private “sandbox” that won't affect anyone else.

Making a local copy of a repository

When we take a more detailed look inside a repository, we can see that it contains a directory named .hg. This is where Mercurial keeps all of its metadata for the repository.

    .  ..  .hg  Makefile  hello.c

The contents of the .hg directory and its subdirectories are private to Mercurial. Every other file and directory in the repository is yours to do with as you please.

To introduce a little terminology, the .hg directory is the “real” repository, and all of the files and directories that coexist with it are said to live in the working directory. An easy way to remember the distinction is that the repository contains the history of your project, while the working directory contains a snapshot of your project at a particular point in history.

Working with a repository

One of the first things we might want to do with a new, unfamiliar repository is understand its history. The hg log command gives us a view of the history of changes in the repository.

    $ hg log
    changeset:   4:2278160e78d4
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:16:53 2008 +0200
    summary:     Trim comments.

    changeset:   3:0272e0d5a517
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:08:02 2008 +0200
    summary:     Get make to generate the final binary from a .o file.

    changeset:   2:fef857204a0c
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:05:04 2008 +0200
    summary:     Introduce a typo into hello.c.

    changeset:   1:82e55d328c8c
    user:        mpm@selenic.com
    date:        Fri Aug 26 01:21:28 2005 -0700
    summary:     Create a makefile

    changeset:   0:0a04b987be5a
    user:        mpm@selenic.com
    date:        Fri Aug 26 01:20:50 2005 -0700
    summary:     Create a standard "hello, world" program

By default, this command prints a brief paragraph of output for each change to the project that was recorded. In Mercurial terminology, we call each of these recorded events a changeset, because it can contain a record of changes to several files.

The fields in a record of output from hg log are as follows.

  • changeset: This field has the format of a number, followed by a colon, followed by a hexadecimal (or hex) string. These are identifiers for the changeset. The hex string is a unique identifier: the same hex string will always refer to the same changeset in every copy of this repository. The number is shorter and easier to type than the hex string, but it isn't unique: the same number in two different clones of a repository may identify different changesets.

  • user: The identity of the person who created the changeset. This is a free-form field, but it most often contains a person's name and email address.

  • date: The date and time on which the changeset was created, and the timezone in which it was created. (The date and time are local to that timezone; they display what time and date it was for the person who created the changeset.)

  • summary: The first line of the text message that the creator of the changeset entered to describe the changeset.

  • Some changesets, such as the first in the list above, have a tag field. A tag is another way to identify a changeset, by giving it an easy-to-remember name. (The tag named tip is special: it always refers to the newest change in a repository.)

The default output printed by hg log is purely a summary; it is missing a lot of detail.

Figure 2.1, “Graphical history of the hello repository” provides a graphical representation of the history of the repository, to make it a little easier to see which direction history is “flowing” in. We'll be returning to this figure several times in this chapter and the chapter that follows.

Figure 2.1. Graphical history of the hello repository


As English is a notoriously sloppy language, and computer science has a hallowed history of terminological confusion (why use one term when four will do?), revision control has a variety of words and phrases that mean the same thing. If you are talking about Mercurial history with other people, you will find that the word “changeset” is often compressed to “change” or (when written) “cset”, and sometimes a changeset is referred to as a “revision” or a “rev”.

While it doesn't matter what word you use to refer to the concept of “a changeset”, the identifier that you use to refer to “a specific changeset” is of great importance. Recall that the changeset field in the output from hg log identifies a changeset using both a number and a hexadecimal string.

  • The revision number is a handy notation that is only valid in that repository.

  • The hexadecimal string is the permanent, unchanging identifier that will always identify that exact changeset in every copy of the repository.

This distinction is important. If you send someone an email talking about “revision 33”, there's a high likelihood that their revision 33 will not be the same as yours. The reason for this is that a revision number depends on the order in which changes arrived in a repository, and there is no guarantee that the same changes will happen in the same order in different repositories. Three changes can easily appear in one repository as 0, 1, 2, while in another as 0, 2, 1.

Mercurial uses revision numbers purely as a convenient shorthand. If you need to discuss a changeset with someone, or make a record of a changeset for some other reason (for example, in a bug report), use the hexadecimal identifier.

Changesets, revisions, and talking to other people

To narrow the output of hg log down to a single revision, use the -r (or --rev) option. You can use either a revision number or a hexadecimal identifier, and you can provide as many revisions as you want.

    $ hg log -r 3
    changeset:   3:0272e0d5a517
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:08:02 2008 +0200
    summary:     Get make to generate the final binary from a .o file.

    $ hg log -r 0272e0d5a517
    changeset:   3:0272e0d5a517
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:08:02 2008 +0200
    summary:     Get make to generate the final binary from a .o file.

    $ hg log -r 1 -r 4
    changeset:   1:82e55d328c8c
    user:        mpm@selenic.com
    date:        Fri Aug 26 01:21:28 2005 -0700
    summary:     Create a makefile

    changeset:   4:2278160e78d4
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:16:53 2008 +0200
    summary:     Trim comments.

If you want to see the history of several revisions without having to list each one, you can use range notation; this lets you express the idea “I want all revisions between 2 and 4, inclusive”.

    $ hg log -r 2:4
    changeset:   2:fef857204a0c
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:05:04 2008 +0200
    summary:     Introduce a typo into hello.c.

    changeset:   3:0272e0d5a517
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:08:02 2008 +0200
    summary:     Get make to generate the final binary from a .o file.

    changeset:   4:2278160e78d4
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:16:53 2008 +0200
    summary:     Trim comments.

Mercurial also honours the order in which you specify revisions, so hg log -r 2:4 prints 2, 3, and 4, while hg log -r 4:2 prints 4, 3, and 2.

Viewing specific revisions

While the summary information printed by hg log is useful if you already know what you're looking for, you may need to see a complete description of the change, or a list of the files changed, if you're trying to decide whether a changeset is the one you're looking for. The hg log command's -v (or --verbose) option gives you this extra detail.

    $ hg log -v -r 3
    changeset:   3:0272e0d5a517
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:08:02 2008 +0200
    files:       Makefile
    description:
    Get make to generate the final binary from a .o file.

If you want to see both the description and content of a change, add the -p (or --patch) option. This displays the content of a change as a unified diff (if you've never seen a unified diff before, see the section called “Understanding patches” for an overview).

    $ hg log -v -p -r 2
    changeset:   2:fef857204a0c
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:05:04 2008 +0200
    files:       hello.c
    description:
    Introduce a typo into hello.c.

    diff -r 82e55d328c8c -r fef857204a0c hello.c
    --- a/hello.c   Fri Aug 26 01:21:28 2005 -0700
    +++ b/hello.c   Sat Aug 16 22:05:04 2008 +0200
    @@ -11,6 +11,6 @@

     int main(int argc, char **argv)
     {
    -    printf("hello, world!\n");
    +    printf("hello, world!\");
         return 0;
     }

The -p option is tremendously useful, so it's well worth remembering.

More detailed information

Let's take a brief break from exploring Mercurial commands to discuss a pattern in the way that they work; you may find this useful to keep in mind as we continue our tour.

Mercurial has a consistent and straightforward approach to dealing with the options that you can pass to commands. It follows the conventions for options that are common to modern Linux and Unix systems.

  • Every option has a long name. For example, as we've already seen, the hg log command accepts a --rev option.

  • Most options have short names, too. Instead of --rev, we can use -r. (The reason that some options don't have short names is that the options in question are rarely used.)

  • Long options start with two dashes (e.g. --rev), while short options start with one (e.g. -r).

  • Option naming and usage is consistent across commands. For example, every command that lets you specify a changeset ID or revision number accepts both -r and --rev arguments.

  • If you are using short options, you can save typing by running them together. For example, the command hg log -v -p -r 2 can be written as hg log -vpr2.

In the examples throughout this book, I usually use short options instead of long. This simply reflects my own preference, so don't read anything significant into it.

Most commands that print output of some kind will print more output when passed a -v (or --verbose) option, and less when passed -q (or --quiet).

Option naming consistency

Almost always, Mercurial commands use consistent option names to refer to the same concepts. For instance, if a command deals with changesets, you'll always identify them with --rev or -r. This consistent use of option names makes it easier to remember what options a particular command takes.

All about command options

Now that we have a grasp of viewing history in Mercurial, let's take a look at making some changes and examining them.

The first thing we'll do is isolate our experiment in a repository of its own. We use the hg clone command, but we don't need to clone a copy of the remote repository. Since we already have a copy of it locally, we can just clone that instead. This is much faster than cloning over the network, and cloning a local repository uses less disk space in most cases, too.

    updating working directory
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved

As an aside, it's often good practice to keep a “pristine” copy of a remote repository around, which you can then make temporary clones of to create sandboxes for each task you want to work on. This lets you work on multiple tasks in parallel, each isolated from the others until it's complete and you're ready to integrate it back. Because local clones are so cheap, there's almost no overhead to cloning and destroying repositories whenever you want.

In our repository, we have a file hello.c that contains the classic “hello, world” program.

    /*
     * Placed in the public domain by Bryan O'Sullivan. This program is
     * not covered by patents in the United States or other countries.
     */

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        printf("hello, world!\");
        return 0;
    }

Let's edit this file so that it prints a second line of output.

    # ... edit edit edit ...
    /*
     * Placed in the public domain by Bryan O'Sullivan. This program is
     * not covered by patents in the United States or other countries.
     */

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        printf("hello, world!\");
        printf("hello again!\n");
        return 0;
    }

Mercurial's hg status command will tell us what Mercurial knows about the files in the repository.

    $ ls
    Makefile  hello.c
    $ hg status
    M hello.c

The hg status command prints no output for some files, but a line starting with “M” for hello.c. Unless you tell it to, hg status will not print any output for files that have not been modified.

The “M” indicates that Mercurial has noticed that we modified hello.c. We didn't need to inform Mercurial that we were going to modify the file before we started, or that we had modified the file after we were done; it was able to figure this out itself.

It's somewhat helpful to know that we've modified hello.c, but we might prefer to know exactly what changes we've made to it. To do this, we use the hg diff command.

    $ hg diff
    diff -r 2278160e78d4 hello.c
    --- a/hello.c   Sat Aug 16 22:16:53 2008 +0200
    +++ b/hello.c   Tue May 05 06:55:53 2009 +0000
    @@ -8,5 +8,6 @@
     int main(int argc, char **argv)
     {
         printf("hello, world!\");
    +    printf("hello again!\n");
         return 0;
     }

Making and reviewing changes

We can modify files, build and test our changes, and use hg status and hg diff to review our changes, until we're satisfied with what we've done and arrive at a natural stopping point where we want to record our work in a new changeset.

The hg commit command lets us create a new changeset; we'll usually refer to this as “making a commit” or “committing”.

When you try to run hg commit for the first time, it is not guaranteed to succeed. Mercurial records your name and address with each change that you commit, so that you and others will later be able to tell who made each change. Mercurial tries to automatically figure out a sensible username to commit the change with. It will attempt each of the following methods, in order:

  1. If you specify a -u option to the hg commit command on the command line, followed by a username, this is always given the highest precedence.

  2. If you have set the HGUSER environment variable, this is checked next.

  3. If you create a file in your home directory called .hgrc, with a username entry, that will be used next. To see what the contents of this file should look like, refer to the section called “Creating a Mercurial configuration file” below.

  4. If you have set the EMAIL environment variable, this will be used next.

  5. Mercurial will query your system to find out your local user name and host name, and construct a username from these components. Since this often results in a username that is not very useful, it will print a warning if it has to do this.

If all of these mechanisms fail, Mercurial will fail, printing an error message. In this case, it will not let you commit until you set up a username.

You should think of the HGUSER environment variable and the -u option to the hg commit command as ways to override Mercurial's default selection of username. For normal use, the simplest and most robust way to set a username for yourself is by creating a .hgrc file; see below for details.

To set a user name, use your favorite editor to create a file called .hgrc in your home directory. Mercurial will use this file to look up your personalised configuration settings. The initial contents of your .hgrc should look like this.

    # This is a Mercurial configuration file.
    [ui]
    username = Firstname Lastname <email.address@example.net>

Home directory on Windows

When we refer to your home directory, on an English language installation of Windows this will usually be a folder named after your user name. You can find out the exact name of your home directory by opening a command prompt window and running echo %UserProfile%.

The “[ui]” line begins a section of the config file, so you can read the “username = ...” line as meaning “set the value of the username item in the ui section”. A section continues until a new section begins, or the end of the file. Mercurial ignores empty lines and treats any text from “#” to the end of a line as a comment.

Creating a Mercurial configuration file

You can use any text you like as the value of the username config item, since this information is for reading by other people, but will not be interpreted by Mercurial. The convention that most people follow is to use their name and email address, as in the example above.

Note

Mercurial's built-in web server obfuscates email addresses, to make it more difficult for the email harvesting tools that spammers use. This reduces the likelihood that you'll start receiving more junk email if you publish a Mercurial repository on the web.

When we commit a change, Mercurial drops us into a text editor, to enter a message that will describe the modifications we've made in this changeset. This is called the commit message. It will be a record for readers of what we did and why, and it will be printed by hg log after we've finished committing.

The editor that the hg commit command drops us into will contain an empty line or two, followed by a number of lines starting with “HG:”.

    This is where I type my commit comment.

    HG: Enter commit message.  Lines beginning with 'HG:' are removed.
    HG: --
    HG: user: Bryan O'Sullivan <bos@serpentine.com>
    HG: branch 'default'
    HG: changed hello.c

Mercurial ignores the lines that start with “HG:”; it uses them only to tell us which files it's recording changes to. Modifying or deleting these lines has no effect.

Since hg log only prints the first line of a commit message by default, it's best to write a commit message whose first line stands alone. Here's a real example of a commit message that doesn't follow this guideline, and hence has a summary that is not readable.

    changeset:   73:584af0e231be
    user:        Censored Person <censored.person@example.org>
    date:        Tue Sep 26 21:37:07 2006 -0700
    summary:     include buildmeister/commondefs. Add exports.

As far as the remainder of the contents of the commit message are concerned, there are no hard-and-fast rules. Mercurial itself doesn't interpret or care about the contents of the commit message, though your project may have policies that dictate a certain kind of formatting.

My personal preference is for short, but informative, commit messages that tell me something that I can't figure out with a quick glance at the output of hg log --patch.

If we run the hg commit command without any arguments, it records all of the changes we've made, as reported by hg status and hg diff.

A surprise for Subversion users

Like other Mercurial commands, if we don't supply explicit file names to hg commit, it will operate across a repository's entire working directory. Be wary of this if you're coming from the Subversion or CVS world, since you might expect it to operate only on the current directory that you happen to be visiting and its subdirectories.

Writing a good commit message

If you decide that you don't want to commit while in the middle of editing a commit message, simply exit from your editor without saving the file that it's editing. This will cause nothing to happen to either the repository or the working directory.

Once we've finished the commit, we can use the hg tip command to display the changeset we just created. This command produces output that is identical to hg log, but it only displays the newest revision in the repository.

    $ hg tip -vp
    changeset:   5:b6fed4f21233
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Tue May 05 06:55:53 2009 +0000
    files:       hello.c
    description:
    Added an extra line of output

    diff -r 2278160e78d4 -r b6fed4f21233 hello.c
    --- a/hello.c   Sat Aug 16 22:16:53 2008 +0200
    +++ b/hello.c   Tue May 05 06:55:53 2009 +0000
    @@ -8,5 +8,6 @@
     int main(int argc, char **argv)
     {
         printf("hello, world!\");
    +    printf("hello again!\n");
         return 0;
     }

We refer to the newest revision in the repository as the tip revision, or simply the tip.

By the way, the hg tip command accepts many of the same options as hg log, so -v above indicates “be verbose”, and -p specifies “print a patch”. The use of -p to print patches is another example of the consistent naming we mentioned earlier.

Admiring our new handiwork

Recording changes in a new changeset

We mentioned earlier that repositories in Mercurial are self-contained. This means that the changeset we just created exists only in our repository. Let's look at a few ways that we can propagate this change into other repositories.

To get started, let's clone our original repository, which does not contain the change we just committed, into a new temporary repository.

    updating working directory
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved

We'll use the hg pull command to bring changes from my-hello into this temporary repository. However, blindly pulling unknown changes into a repository is a somewhat scary prospect. Mercurial provides the hg incoming command to tell us what changes the hg pull command would pull into the repository, without actually pulling the changes in.

    $ hg incoming ../my-hello
    comparing with ../my-hello
    searching for changes
    changeset:   5:b6fed4f21233
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Tue May 05 06:55:53 2009 +0000
    summary:     Added an extra line of output

Bringing changes into a repository is a simple matter of running the hg pull command, and optionally telling it which repository to pull from.

    $ hg tip
    changeset:   4:2278160e78d4
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:16:53 2008 +0200
    summary:     Trim comments.

    $ hg pull ../my-hello
    pulling from ../my-hello
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 1 changesets with 1 changes to 1 files
    (run 'hg update' to get a working copy)
    $ hg tip
    changeset:   5:b6fed4f21233
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Tue May 05 06:55:53 2009 +0000
    summary:     Added an extra line of output

As you can see from the before-and-after output of hg tip, we have successfully pulled changes into our repository. However, Mercurial separates pulling changes in from updating the working directory. There remains one step before we will see the changes that we just pulled appear in the working directory.

Pulling specific changes

It is possible that due to the delay between running hg incoming and hg pull, you may not see all changesets that will be brought from the other repository. Suppose you're pulling changes from a repository on the network somewhere. While you are looking at the hg incoming output, and before you pull those changes, someone might have committed something in the remote repository. This means that it's possible to pull more changes than you saw when using hg incoming.

If you only want to pull precisely the changes that were listed by hg incoming, or you have some other reason to pull a subset of changes, simply identify the change that you want to pull by its changeset ID, e.g. hg pull -r7e95bb.

Pulling changes from another repository

We have so far glossed over the relationship between a repository and its working directory. The hg pull command that we ran in the section called “Pulling changes from another repository” brought changes into the repository, but if we check, there's no sign of those changes in the working directory. This is because hg pull does not (by default) touch the working directory. Instead, we use the hg update command to do this.

        printf("hello, world!\");

    1 files updated, 0 files merged, 0 files removed, 0 files unresolved

        printf("hello, world!\");
        printf("hello again!\n");

It might seem a bit strange that hg pull doesn't update the working directory automatically. There's actually a good reason for this: you can use hg update to update the working directory to the state it was in at any revision in the history of the repository. If you had the working directory updated to an old revision—to hunt down the origin of a bug, say—and ran a hg pull which automatically updated the working directory to a new revision, you might not be terribly happy.

Since pull-then-update is such a common sequence of operations, Mercurial lets you combine the two by passing the -u option to hg pull.

If you look back at the output of hg pull in the section called “Pulling changes from another repository” when we ran it without -u, you can see that it printed a helpful reminder that we'd have to take an explicit step to update the working directory.

To find out what revision the working directory is at, use the hg parents command.

    $ hg parents
    changeset:   5:b6fed4f21233
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Tue May 05 06:55:53 2009 +0000
    summary:     Added an extra line of output

If you look back at Figure 2.1, “Graphical history of the hello repository”, you'll see arrows connecting each changeset. The node that the arrow leads from in each case is a parent, and the node that the arrow leads to is its child. The working directory has a parent in just the same way; this is the changeset that the working directory currently contains.

To update the working directory to a particular revision, give a revision number or changeset ID to the hg update command.

    $ hg update 2
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved
    $ hg parents
    changeset:   2:fef857204a0c
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Sat Aug 16 22:05:04 2008 +0200
    summary:     Introduce a typo into hello.c.

    $ hg update
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved
    $ hg parents
    changeset:   5:b6fed4f21233
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Tue May 05 06:55:53 2009 +0000
    summary:     Added an extra line of output

If you omit an explicit revision, hg update will update to the tip revision, as shown by the second call to hg update in the example above.

Updating the working directory

Mercurial lets us push changes to another repository, from the repository we're currently visiting. As with the example of hg pull above, we'll create a temporary repository to push our changes into.

    updating working directory
    2 files updated, 0 files merged, 0 files removed, 0 files unresolved

The hg outgoing command tells us what changes would be pushed into another repository.

    $ hg outgoing ../hello-push
    comparing with ../hello-push
    searching for changes
    changeset:   5:b6fed4f21233
    tag:         tip
    user:        Bryan O'Sullivan <bos@serpentine.com>
    date:        Tue May 05 06:55:53 2009 +0000
    summary:     Added an extra line of output

And the hg push command does the actual push.

    $ hg push ../hello-push
    pushing to ../hello-push
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 1 changesets with 1 changes to 1 files

As with hg pull, the hg push command does not update the working directory in the repository that it's pushing changes into. Unlike hg pull, hg push does not provide a -u option that updates the other repository's working directory. This asymmetry is deliberate: the repository we're pushing to might be on a remote server and shared between several people. If we were to update its working directory while someone was working in it, their work would be disrupted.

What happens if we try to pull or push changes and the receiving repository already has those changes? Nothing too exciting.

    $ hg push ../hello-push
    pushing to ../hello-push
    searching for changes
    no changes found

Pushing changes to another repository

When we clone a repository, Mercurial records the location of the repository we cloned in the .hg/hgrc file of the new repository. If we don't supply a location to hg pull from or hg push to, those commands will use this location as a default. The hg incoming and hg outgoing commands do so too.

If you open a repository's .hg/hgrc file in a text editor, you will see contents like the following.

[paths]
default = http://www.selenic.com/repo/hg

It is possible, and often useful, to have the default location for hg push and hg outgoing be different from that for hg pull and hg incoming. We can do this by adding a default-push entry to the [paths] section of the .hg/hgrc file, as follows.

[paths]
default = http://www.selenic.com/repo/hg
default-push = http://hg.example.com/hg
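The lookup rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not Mercurial's own code: the function name default_location is hypothetical, and the sample configuration simply repeats the [paths] section shown above.

```python
import configparser

# Sample configuration, copied from the [paths] example above.
HGRC_TEXT = """\
[paths]
default = http://www.selenic.com/repo/hg
default-push = http://hg.example.com/hg
"""


def default_location(hgrc_text, command):
    """Return the location a command would fall back to (hypothetical helper).

    push and outgoing prefer default-push when it is set; every other
    command, and push/outgoing when default-push is absent, use default.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(hgrc_text)
    paths = parser["paths"] if parser.has_section("paths") else {}
    if command in ("push", "outgoing") and "default-push" in paths:
        return paths["default-push"]
    return paths.get("default")


for command in ("pull", "incoming", "push", "outgoing"):
    print(command, "->", default_location(HGRC_TEXT, command))
```

With the sample text, pull and incoming resolve to the default entry while push and outgoing resolve to default-push; removing the default-push line makes all four fall back to default.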

The commands we have covered in the previous few sections are not limited to working with local repositories. Each works in exactly the same fashion over a network connection; simply pass in a URL instead of a local path.

comparing with https://bitbucket.org/bos/hg-tutorial-hello
searching for changes
changeset:   5:b6fed4f21233
tag:         tip
user:        Bryan O'Sullivan <bos@serpentine.com>
date:        Tue May 05 06:55:53 2009 +0000
summary:     Added an extra line of output

In this example, we can see what changes we could push to the remote repository, but the repository is understandably not set up to let anonymous users push to it.

pushing to http://bitbucket.org/bos/hg-tutorial-hello
searching for changes
ssl required

Sharing changes over a network

It is just as easy to begin a new project as to work on one that already exists. The hg init command creates a new, empty Mercurial repository.

This simply creates a repository named myproject in the current directory.

total 12
-rw-rw-r-- 1 bos bos   47 May 5 06:55 goodbye.c
-rw-rw-r-- 1 bos bos   45 May 5 06:55 hello.c
drwxrwxr-x 3 bos bos 4096 May 5 06:55 myproject

We can tell that myproject is a Mercurial repository, because it contains a .hg directory.

total 12
drwxrwxr-x 3 bos bos 4096 May 5 06:55 .
drwx------ 3 bos bos 4096 May 5 06:55 ..
drwxrwxr-x 3 bos bos 4096 May 5 06:55 .hg

If we want to add some pre-existing files to the repository, we copy them into place, and tell Mercurial to start tracking them using the hg add command.

adding goodbye.c
adding hello.c
A goodbye.c
A hello.c

Once we are satisfied that our project looks right, we commit our changes.

It takes just a few moments to start using Mercurial on a new project, which is part of its appeal. Revision control is now so easy to work with that we can use it on even the smallest of projects, ones we might not have considered putting under a more complicated tool.

Chapter 2. A tour of Mercurial: the basics

Editing History

How to modify repository history.

1. Alternatives to editing history

If you would like to undo a changeset, but don't mind having it preserved in history, you can use the backout command to reverse it. This is generally preferred to changing history as it's non-destructive and lets future developers use that history to avoid making the same mistakes.

2. Why changing history is hard

First, consider changesets. Each changeset's id is a cryptographically strong hash of the changeset data, which recursively includes all of the changeset content (data and metadata) as well as the ids of its parents. Change any bit in a changeset itself or the history it's based on and you will change its id. This makes Mercurial changesets tamperproof: it is computationally infeasible to make a tampered changeset that has the same changeset id as a given changeset.
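The chaining effect can be seen in a toy model. The sketch below is an assumption-laden illustration, not Mercurial's real hashing format: it derives each id from the parent's id plus a content string, using SHA-1 from Python's standard library, and the helper names changeset_id and build_history are hypothetical.

```python
import hashlib


def changeset_id(parent_id, content):
    """Toy id: a hash over the parent's id and this changeset's content."""
    data = parent_id.encode() + b"\n" + content.encode()
    return hashlib.sha1(data).hexdigest()


def build_history(contents):
    """Return the ids of a linear history, oldest first."""
    ids = []
    parent = "0" * 40  # stand-in for "no parent"
    for content in contents:
        parent = changeset_id(parent, content)
        ids.append(parent)
    return ids


original = build_history(["r0", "r1", "r2", "bad", "r4", "r5"])
edited = build_history(["r0", "r1", "r2", "fixed", "r4", "r5"])

# Positions whose ids survived the edit: only those before the edited one.
unchanged = [i for i, (a, b) in enumerate(zip(original, edited)) if a == b]
print(unchanged)  # → [0, 1, 2]
```

Only the fourth content string differs between the two histories, yet every id from that position onward changes, because each id depends on the id before it.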

Second, Mercurial's network protocol assumes history is append-only. Pushing and pulling only ever add history to repositories, never remove it. If the history you want to modify has already been published to public repositories, there is no way to recall it except with the cooperation of everyone who has pulled a copy, which is generally not practical.

3. Consequences of editing history

If you edit your repository history, the changeset IDs (i.e., the identity of the changesets) will be changed from the point of the edit forward. Suppose for example that the bad revision is number 3, called BAD here. Then, before your change, the history is a single line of changesets in which BAD is followed by R4 and R5.

After your edit, changes BAD, R4 and R5 will have new changeset IDs; the rewritten R4 and R5 are called R4a and R5a below.

As long as nobody else has seen the repository before the change, this is okay. If people have already pulled from your repository, then things become more complex. Suppose someone cloned your repository before you edited BAD, and then pulled afterwards. They would see completely new changesets (the edited replacement for BAD, followed by R4a and R5a), and their tree of changes would contain both the old and the new line of history, branching at the parent of BAD.

This is exactly what one would expect - Mercurial always works this way when you pull in changes from others: It takes two directed acyclic graphs (one representing your repository, the other representing the repository you pull from) and merges common nodes in the graphs to produce a new acyclic graph.
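A minimal sketch of that graph merge follows, assuming each repository is modelled as a mapping from changeset name to its parents. The helpers pull and heads are hypothetical, and the names FIXED and R4a are illustrative labels for the rewritten changesets rather than anything Mercurial produces.

```python
def pull(local, remote):
    """Merge two changeset graphs: shared nodes are kept once, new ones added."""
    merged = dict(local)
    for node, parents in remote.items():
        merged.setdefault(node, parents)
    return merged


def heads(graph):
    """Changesets that have no children."""
    has_child = {p for parents in graph.values() for p in parents}
    return sorted(n for n in graph if n not in has_child)


# The friend's clone, taken before the edit.
friend = {
    "R0": (), "R1": ("R0",), "R2": ("R1",),
    "BAD": ("R2",), "R4": ("BAD",), "R5": ("R4",),
}
# Your repository after the edit (FIXED, R4a, R5a are illustrative names).
yours = {
    "R0": (), "R1": ("R0",), "R2": ("R1",),
    "FIXED": ("R2",), "R4a": ("FIXED",), "R5a": ("R4a",),
}

after_pull = pull(friend, yours)
print(len(after_pull))    # → 9
print(heads(after_pull))  # → ['R5', 'R5a']
```

The three shared changesets appear once, while the old and the rewritten lines of history both survive, which is why the friend ends up with two heads.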

Notice that R4 and R4a, and likewise R5 and R5a, are identical changes (neither of you edited them), but they have different hash values since they have different histories (because the changeset they descend from was edited).

So you can simply ask your friend to strip BAD, which will strip R4 and R5 as well; their changes are preserved in R4a and R5a. This makes your friend's repository identical to yours.

On the other hand, if your friend had already committed new work (R6 and R7) on top of R5, then after the pull his repository holds that work on the old line of history, alongside the new line that ends in R5a.

Now it is not enough simply to strip BAD, since that destroys the work in R6 and R7. But you can ask him to import R6 and R7 into MQ, pop the queue, strip BAD and apply the queue again (this time to R5a). This is rebasing his changes on R5a:

hg qimport -r R6:R7
hg qpop -a
hg strip BAD
hg update -C R5a # might not be necessary
hg qpush -a
hg qdelete -r qbase:qtip

4. Motivation

Having said all of this, there are good reasons why people might want to change history. Here are some examples:

  • It's a purely personal repository, and you're happy to recreate any branches you made.
  • You have a well-controlled development environment, where telling everyone to delete repository copies and re-clone is practical.
  • You have strong controls on what gets pulled into the central repository (so you can stop the history being "unrevised") and don't care too much about breaking people's clones.
  • Your lawyers insist you do your best to remove something, but they are happy with "reasonable endeavours".
  • You have to keep the version history for 5 years, but not longer.

As long as you understand the implications, it is possible to do this.

The key implication is that the changeset IDs will change from the point at which the revision occurs. This means that developers with clones will need to rebase their changes, and care must be taken to manage the effects of the revision. This document does not attempt to cover this process. It assumes that if you need to do this, you will ensure that you know what to do, and you make sure it happens.

5. Mercurial safety features

Since version 2.2, Mercurial includes a feature called phases that will prevent you from editing history that has been pushed or pulled to another repository. If the above motivations apply, you can still forcibly change the phase of published changesets. It's also possible to set up a repository as "non-publishing", such that when you push changes to it, they do not get marked as public. See phases for more information.

6. Scenarios

6.1. Basic changeset removal with clone

One of the simplest tasks is removing the most recent commits in a repository. This can be done non-destructively with clone:

hg clone -r LASTGOODREVISION oldrepo newrepo

You can then move oldrepo away as a backup and rename newrepo to take its place. If the old repository has multiple heads, you might want to pull them too.

6.2. Amending the latest changeset with commit --amend

Another common history-editing task is changing something in the most recent commit, either to modify the code or to edit the commit message. Since version 2.2, the commit command has an --amend option that will fold any changes in your working directory into the latest commit, and allow you to edit the commit message.

--amend can in fact be used on any changeset that is a (topological) branch head, that is, one that has no child changesets. It need not be the tip revision.

6.3. Editing recent history with MQ

Recent history can be modified fairly easily with the MQ extension:

  • Remove a change with 'hg qdelete'

  • Collapse a series of changes into one with 'hg qfold'

  • Edit a commit message, obliterate a file, or make any other modification to the changes themselves with 'hg qrefresh'

Some caveats exist. First, MQ can't operate on merge changesets. Second, by default MQ works with textual changes. If the history you edit contains binary files, permission changes or other non-textual changes, enable extended diffs for your repo. Add the following section to your hgrc:

[diff]
git = True

(Alternatively, remember to add --git to every qimport and qrefresh invocation below.)

Let's pretend you committed a file that should not have been committed in revision BAD. Then doing

hg qimport -r BAD:tip

will import the changesets into MQ. You can find the newly created patches in .hg/patches.

Those patches are nevertheless still applied. To strip them from the history, you need the hg qpop command. Issue

hg qpop -a

Now all changes since revision BAD are no longer available in your repository history. They are saved as patches in .hg/patches, and only there.

If you want to undo the entire changeset BAD (obliterate it!), then do this:

hg qdelete BAD.diff

If you only want to edit the changeset (remove some edits, avoid committing one file while leaving the remaining changes, etc.), then do this:

hg qpush BAD.diff
# edit files, remove passwords, revert newly added files etc.
hg qrefresh

Note that qdelete removes the whole patch, while qrefresh modifies it.

Now if you want to combine the next two patches (say GOOD1.diff and GOOD2.diff) into one patch, do this:

hg qpush GOOD1.diff
hg qfold GOOD2.diff

The changes from GOOD2.diff have been integrated ("folded") into GOOD1.diff and the GOOD2.diff patch itself has been deleted from the queue (the patch file can be kept there but unmanaged, by using qfold's --keep option).

To go back to standard Mercurial changesets you do

hg qpush -a
hg qfinish -a

The qfinish command turns an applied patch into a real Mercurial changeset. Here we use it to turn all applied patches into normal changesets.

One thing this process over-simplifies is that the qpush step may fail if later changes depend on the obliterated data. In that case, you have to fix the problem manually. There's no easy answer here: after you edit history, you need to manage the consequences, in your own repository as well as elsewhere. So you need to push conflicting patches one by one, edit them as appropriate, and qrefresh the changes. Read the chapter about Mercurial Queues in the Mercurial Book for details.

Here's a real example (captured on Windows). I add a file I shouldn't in revision 1, edit it in revision 2, then try to obliterate revision 1. After doing so, I need to resolve the issue with revision 2 (editing what is now a nonexistent file). In this case it's easy, we just drop revision 2 as well. In other cases, this could be much harder to deal with, possibly even so much harder that you decide it's not worth it. It depends on your data (and possibly your lawyers!).

>hg init
>echo Line 1 >a
>hg commit --addremove -m "Added a"
adding a
>rem Add the launch codes for the nuclear arsenal here...
>echo Super secret >b
>hg commit --addremove -m "Added b"
adding b
>hg log
changeset:   1:65bcb0d3f953
tag:         tip
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:43:00 2008 +0000
summary:     Added b

changeset:   0:5dd6949828e1
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:42:40 2008 +0000
summary:     Added a

>rem Here we compound the error...
>echo More secret stuff >>b
>hg commit -m "Edited b"
>echo More safe stuff >>a
>hg commit -m "Edited a"
>hg log
changeset:   3:ea4f8ad48048
tag:         tip
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:43:46 2008 +0000
summary:     Edited a

changeset:   2:6bb0d654a0a6
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:43:32 2008 +0000
summary:     Edited b

changeset:   1:65bcb0d3f953
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:43:00 2008 +0000
summary:     Added b

changeset:   0:5dd6949828e1
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:42:40 2008 +0000
summary:     Added a

>rem We realise our mistake. We need to get rid of changeset 1,
>rem so that file b is no longer in out repository!
>hg qinit
>hg qimport -r 1:tip
>hg qpop -a
Patch queue now empty
>rem Delete changeset 1
>hg qdelete 1.diff
>rem Now start to put everything back
>hg qpush -a
applying 2.diff
unable to find 'b' for patching
1 out of 1 hunk FAILED -- saving rejects to file b.rej
patch failed, unable to continue (try -v)
b: No such file or directory
b not tracked!
patch failed, rejects left in working dir
Errors during apply, please fix and refresh 2.diff
>rem Hmm, change 2 depends on file b. Fix things up. Luckily, this is easy, just delete change 2 as well.
>rem In reality, change 2 may contain other edits, and we'd need to do some further fixing.
>hg qdelete 2.diff
abort: cannot delete applied patch 2.diff
>rem Even this isn't as simple as all that. Back out change 2 so we can delete it.
>hg qpop -a
Patch queue now empty
>hg qdelete 2.diff
>rem And now we're good to go.
>hg qpush -a
applying 3.diff
Now at: 3.diff
>hg qdelete -r qbase:qtip
>rem No sign of file b, and the world is safe again.
>rem Except, of course, that evil Doctor Death pulled from us 5 minutes ago.
>rem But at least as we all get blown up, we can be glad that it's not a technical problem :-)
>hg log
changeset:   1:a50e33884959
tag:         tip
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:43:46 2008 +0000
summary:     Edited a

changeset:   0:5dd6949828e1
user:        "Paul Moore <user@example.com>"
date:        Sat Mar 22 16:42:40 2008 +0000
summary:     Added a

The process of going up through the patch stack, tidying up the debris (as in our example, where change 2 wouldn't apply as it depended on the obliterated file "b"), is what is generally referred to as "rebasing" the changes. It can be simple, in the case of a localised change, but it can be arbitrarily complex. Before you start editing history, you need to be sure that you know what to do to rebase.
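The failure in the session above can be reproduced in a toy model of a patch queue. This is only a sketch under simplifying assumptions: each patch either adds a file or appends a line to an existing one, and the function apply_queue is hypothetical and unrelated to MQ's real implementation.

```python
def apply_queue(files, patches):
    """Apply patches in order; stop at the first one that does not apply.

    Returns the resulting files and the name of the failed patch (or None).
    """
    state = dict(files)
    for name, (kind, path, line) in patches:
        if kind == "add":
            state[path] = [line]
        elif kind == "append":
            if path not in state:
                return state, name  # the patch depends on a missing file
            state[path] = state[path] + [line]
    return state, None


base = {"a": ["Line 1"]}
queue = [
    ("1.diff", ("add", "b", "Super secret")),
    ("2.diff", ("append", "b", "More secret stuff")),
    ("3.diff", ("append", "a", "More safe stuff")),
]

# Deleting only 1.diff leaves 2.diff without the file it edits.
without_1 = [p for p in queue if p[0] != "1.diff"]
_, failed = apply_queue(base, without_1)
print(failed)  # → 2.diff

# Deleting 2.diff as well lets the rest of the queue apply cleanly.
without_1_and_2 = [p for p in queue if p[0] not in ("1.diff", "2.diff")]
state, failed = apply_queue(base, without_1_and_2)
print(failed, state)  # → None {'a': ['Line 1', 'More safe stuff']}
```

Here the fix is trivial because 2.diff touches nothing but the obliterated file; a patch that mixed wanted and unwanted edits would have to be repaired by hand before the queue could be reapplied.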

7. Other options

There are other options that may be more appropriate in particular circumstances.

  • If you catch your mistake immediately (or reasonably soon), you can just use hg strip to roll back the latest (one or more) changes. This produces a bundle of the stripped changes as a backup, so you could strip the changes, then clone and fix up the problem and push the fixes. The effect would be the same as editing history in place, but the window of time for Doctor Death to grab the nuclear codes is limited.

  • If you want to remove file(s) that shouldn't have been added, use the ConvertExtension with --filemap option to "convert" your Mercurial repository to another Mercurial repository. You'll want to make sure that you set convert.hg.saverev to False if you want to keep in common the history prior to your removed file(s).

  • To easily reorder, accept, fold or reject changesets, there's also the HisteditExtension.


EditingHistory (last edited 2013-08-29 16:11:40 by EgorKuropatkin)
