How to stage a rename without subsequent edits in Git

Posted 2019-01-18 04:14

Question:

I have a file that I've renamed and then edited. I would like to tell Git to stage the rename, but not the content modifications. That is, I wish to stage the deletion of the old file name, and the addition of the old file contents with the new file name.

So I have this:

Changes not staged for commit:

        deleted:    old-name.txt

Untracked files:

        new-name.txt

but want either this:

Changes to be committed:

        new file:   new-name.txt
        deleted:    old-name.txt

Changes not staged for commit:

        modified:   new-name.txt

or this:

Changes to be committed:

        renamed:    old-name.txt -> new-name.txt

Changes not staged for commit:

        modified:   new-name.txt

(where the similarity measure should be 100%).

I can't think of a straightforward way to do this.

Is there syntax for getting the contents of a specific revision of a specific file, and adding this to the git staging area under a specified path?

The delete part, with git rm, is fine:

$ git rm old-name.txt

It's the add part of the rename I'm struggling with. (I could save the new contents, check out a fresh copy (for the old contents), mv in the shell, git add, and then recover the new contents, but that seems like a very long way around!)

Thanks!

Answer 1:

Git doesn't really do renames. They're all computed in an "after the fact" fashion: git compares one commit with another and, at compare time, decides if there was a rename. This means that whether git considers something "a rename" changes dynamically. I know you're asking about a commit you haven't even made yet, but bear with me, this really all does tie in (but the answer will be long).


When you ask git (via git show or git log -p or git diff HEAD^ HEAD) "what happened in the last commit", it runs a diff of the previous commit (HEAD^ or HEAD~1 or the actual raw SHA-1 for the previous commit; any of these will do to identify it) and the current commit (HEAD). In making that diff it may discover that there used to be an old.txt and there isn't any more; and there was no new.txt but there is now.

These file names (files that used to be there but aren't, and files that are there now that weren't) are put into a pile marked "candidates for rename". Then, for each name in the pile, git compares "old contents" and "new contents". The comparison for exact match is super-easy because of the way git reduces contents to SHA-1s; if the exact match fails, git switches to an optional "are the contents at least similar" diff to check for renames. With git diff this optional step is controlled by the -M flag. With other commands it's either set by your git config values, or hardcoded into the command.
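To see this compare-time detection in action, here is a minimal sketch that builds a throwaway repository and diffs the same two commits with rename detection off and on. The temporary directory, the demo identity, and the file contents are assumptions made up for the example.

```shell
#!/bin/sh
# Sketch: a rename is computed when two commits are compared, not stored.
set -eu
repo=$(mktemp -d)            # throwaway repository (assumption for the demo)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'line one\nline two\nline three\n' > old.txt
git add old.txt
git commit -q -m "add old.txt"

# Rename with plain mv, then stage the removal and the addition separately.
mv old.txt new.txt
git rm -q --cached old.txt
git add new.txt
git commit -q -m "rename old.txt to new.txt"

# Detection off: the commit looks like one addition plus one deletion.
plain=$(git diff --no-renames --name-status HEAD^ HEAD)
# Detection on (-M): the very same pair of commits shows a 100% rename.
detected=$(git diff -M --name-status HEAD^ HEAD)

printf 'without detection:\n%s\n\nwith -M:\n%s\n' "$plain" "$detected"
```

Nothing in the second commit says "rename"; only the diff options decide how it is reported.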

Now, back to the staging area and git status: what git stores in the index / staging-area is basically "a prototype for the next commit". When you git add something, git stores the file contents right at that point, computing the SHA-1 in the process and then storing the SHA-1 in the index. When you git rm something, git stores a note in the index saying "this path name is being deliberately removed on the next commit".
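A small sketch of that "contents are captured at git add time" behaviour, using a scratch repository and a hypothetical file note.txt: the object ID computed by git hash-object matches what the index records, and later edits to the work-tree file leave the index entry alone.

```shell
#!/bin/sh
# Sketch: git add stores the contents as of that moment and records their ID.
set -eu
repo=$(mktemp -d)            # scratch repository (assumption for the demo)
cd "$repo"
git init -q
printf 'hello\n' > note.txt

# Object ID git would assign to the current contents.
expected=$(git hash-object note.txt)

git add note.txt
# Object ID recorded in the index (second field of ls-files --stage output).
staged=$(git ls-files --stage -- note.txt | cut -d' ' -f2)

# Editing the work-tree file afterwards does not touch the index entry.
printf 'more\n' >> note.txt
after_edit=$(git ls-files --stage -- note.txt | cut -d' ' -f2)

printf 'hash-object: %s\nindex:       %s\nafter edit:  %s\n' \
  "$expected" "$staged" "$after_edit"
```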

The git status command, then, simply does a diff—or really, two diffs: HEAD vs index, for what is going to be committed; and index vs work-tree, for what could be (but isn't yet) going to be committed.
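The two comparisons can also be run by hand. The sketch below assumes a scratch repository with a staged rename and an unstaged edit, and runs each diff separately; the repository setup is invented for the example.

```shell
#!/bin/sh
# Sketch: the two diffs behind git status, run individually.
set -eu
repo=$(mktemp -d)            # scratch repository (assumption for the demo)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
printf 'one\ntwo\nthree\n' > old-name.txt
git add old-name.txt
git commit -q -m "add old-name.txt"

git mv old-name.txt new-name.txt     # staged change: the rename
printf 'four\n' >> new-name.txt      # unstaged change: an edit

# Diff 1, HEAD vs index: "Changes to be committed".
to_commit=$(git diff --cached -M --name-status)
# Diff 2, index vs work-tree: "Changes not staged for commit".
not_staged=$(git diff --name-status)

printf 'HEAD vs index:\n%s\n\nindex vs work-tree:\n%s\n' "$to_commit" "$not_staged"
```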

In that first diff, git uses the same mechanism as always to detect renames. If there's a path in the HEAD commit that is gone in the index, and a path in the index that's new and not in the HEAD commit, it's a candidate for rename-detection. The git status command hardwires rename detection to "on" (and the file count limit to 200; with just one candidate for rename detection this limit is plenty).


What does all this mean for your case? Well, you renamed a file (without using git mv, but it doesn't really matter because git status finds the rename, or fails to find it, at git status time), and now have a newer, different version of the new file.

If you git add the new version, that newer version goes into the repo, and its SHA-1 is in the index, and when git status does a diff it will compare the new and old. If they're at least "50% similar" (the hardwired value for git status), git will tell you the file is renamed.
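As an illustration, the following sketch stages a renamed and lightly edited file and checks what git status reports. The ten-line file and the single appended line are assumptions chosen so the contents stay well above the 50% threshold.

```shell
#!/bin/sh
# Sketch: staging the edited file under its new name still shows as a rename.
set -eu
repo=$(mktemp -d)            # scratch repository (assumption for the demo)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Ten lines of original content.
i=1
while [ "$i" -le 10 ]; do
  printf 'line %s of the original file\n' "$i"
  i=$((i + 1))
done > old-name.txt
git add old-name.txt
git commit -q -m "add old-name.txt"

# Rename outside git, then make a small edit.
mv old-name.txt new-name.txt
printf 'one extra line added after the rename\n' >> new-name.txt

git add -A                   # stage both the deletion and the new file

status_line=$(git status --porcelain)
score_line=$(git diff --cached -M --name-status)

printf '%s\n%s\n' "$status_line" "$score_line"
```

The status line reports a rename, while the name-status line carries a similarity score below 100 because the staged contents differ from the old ones.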

Of course, git add-ing the modified contents is not quite what you asked for: you wanted to do an intermediate commit where the file is only renamed, i.e., a commit with a tree with the new name, but the old contents.

You don't have to do this, because of all of the above dynamic rename detection. If you want to do it (for whatever reason) ... well, git doesn't make it all that easy.

The most straightforward way is just as you suggest: move the modified contents somewhere out of the way, use git checkout -- old-name.txt, then git mv old-name.txt new-name.txt, then commit. The git mv will both rename the file in the index/staging-area, and rename the work-tree version.
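Spelled out as commands, that long way around might look like the sketch below. The scratch repository, the temporary file used to park the edits, and the commit messages are all assumptions for the demo.

```shell
#!/bin/sh
# Sketch: park the edits, commit a pure rename, then restore the edits.
set -eu
repo=$(mktemp -d)            # scratch repository (assumption for the demo)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
printf 'one\ntwo\nthree\n' > old-name.txt
git add old-name.txt
git commit -q -m "add old-name.txt"

# Reproduce the question's state: renamed outside git, then edited.
mv old-name.txt new-name.txt
printf 'edited after the rename\n' >> new-name.txt

saved=$(mktemp)                      # hypothetical parking spot for the edits
mv -f new-name.txt "$saved"          # 1. move the modified contents aside
git checkout -- old-name.txt         # 2. restore the old contents
git mv old-name.txt new-name.txt     # 3. rename in index and work-tree
git commit -q -m "rename old-name.txt to new-name.txt"
mv -f "$saved" new-name.txt          # 4. bring the edits back, unstaged

committed=$(git diff -M --name-status HEAD^ HEAD)
pending=$(git diff --name-status)
printf 'committed: %s\npending:   %s\n' "$committed" "$pending"
```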

If git mv had a --cached option like git rm does, you could just git mv --cached old-name.txt new-name.txt and then git commit. The first step would rename the file in the index, without touching the work-tree. But it doesn't: it insists on overwriting the work-tree version, and it insists that the old name must exist in the work-tree to start.

The single step method for doing this without touching the work-tree is to use git update-index --index-info, but that, too, is somewhat messy (I'll show it in a moment anyway). Fortunately, there's one last thing we can do. I've set up the same situation you had, by renaming the old name to the new one and modifying the file:

$ git status
On branch master
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    deleted:    old-name.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)

    new-name.txt

What we do now is, first, manually put the file back under its old name, then use git mv to switch again to the new name:

$ mv new-name.txt old-name.txt
$ git mv old-name.txt new-name.txt

This time git mv renames the entry in the index but keeps the SHA-1 of the original contents there, while moving the work-tree file (with the new contents) into place under the new name:

$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    renamed:    old-name.txt -> new-name.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   new-name.txt

Now just git commit to make a commit with the rename in place, but not the new contents.

(Note that this depends on there not being a new file with the old name!)
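Here is the whole trick as one self-checking sketch, including a guard for the caveat just mentioned. The scratch repository and file contents are assumptions; the final checks confirm that the index holds the old blob under the new name.

```shell
#!/bin/sh
# Sketch: stage only the rename by moving the file back, then using git mv.
set -eu
repo=$(mktemp -d)            # scratch repository (assumption for the demo)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
printf 'one\ntwo\nthree\n' > old-name.txt
git add old-name.txt
git commit -q -m "add old-name.txt"

# The question's state: renamed outside git, then edited.
mv old-name.txt new-name.txt
printf 'edited after the rename\n' >> new-name.txt

# Guard: the trick would clobber a new file that reuses the old name.
if [ -e old-name.txt ]; then
  echo "old-name.txt exists in the work-tree; aborting" >&2
  exit 1
fi

mv new-name.txt old-name.txt         # put the file back under its old name
git mv old-name.txt new-name.txt     # let git rename index entry and file

staged=$(git diff --cached -M --name-status)
unstaged=$(git diff --name-status)
index_blob=$(git rev-parse :new-name.txt)      # blob staged under new name
head_blob=$(git rev-parse HEAD:old-name.txt)   # blob committed under old name

printf 'staged:   %s\nunstaged: %s\n' "$staged" "$unstaged"
```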


What about using git update-index? Well, first let's get things back to the "changed in work-tree, index matches HEAD commit" state:

$ git reset --mixed HEAD  # set index=HEAD, leave work-tree alone

Now let's see what's in the index for old-name.txt:

$ git ls-files --stage -- old-name.txt
100644 2b27f2df63a3419da26984b5f7bafa29bdf5b3e3 0   old-name.txt

So, what we need git update-index --index-info to do is to wipe out the entry for old-name.txt but make an otherwise identical entry for new-name.txt:

$ (git ls-files --stage -- old-name.txt;
   git ls-files --stage -- old-name.txt) |
  sed -e \
'1s/^[0-9]* [0-9a-f]*/000000 0000000000000000000000000000000000000000/' \
      -e '2s/old-name.txt$/new-name.txt/' | 
  git update-index --index-info

(Note: I broke the above up for posting purposes; it was all one line when I typed it in. In sh/bash it should work broken up like this, given the backslashes I added to continue the sed command.)

There are some other ways to do this, but simply extracting the index entry twice, turning the first copy into a removal and giving the second the new name, seemed the easiest here, hence the sed command. The first substitution replaces the file mode (100644 here, but any mode would be turned into all-zeros) and the SHA-1 (match any SHA-1, replace with git's special all-zeros SHA-1); the second leaves the mode and SHA-1 alone and replaces the name.
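To preview what the two sed substitutions produce without touching any index, the sketch below feeds sed a hard-coded copy of the ls-files line shown earlier instead of piping into git update-index. The variable names are made up for the example.

```shell
#!/bin/sh
# Sketch: dry run of the sed rewrite on a sample ls-files --stage line.
set -eu
zero=0000000000000000000000000000000000000000   # all-zeros SHA-1 (40 digits)
# Same shape as `git ls-files --stage` output: mode, SHA-1, stage, TAB, path.
entry=$(printf '100644 2b27f2df63a3419da26984b5f7bafa29bdf5b3e3 0\told-name.txt')

result=$(printf '%s\n%s\n' "$entry" "$entry" |
  sed -e "1s/^[0-9]* [0-9a-f]*/000000 $zero/" \
      -e '2s/old-name\.txt$/new-name.txt/')

# Line 1: mode and SHA-1 zeroed, meaning "remove old-name.txt".
# Line 2: mode and SHA-1 untouched, path changed to new-name.txt.
printf '%s\n' "$result"
```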

When the update-index finishes, the index has recorded the removal of the old path and the addition of the new path (with same mode and SHA-1 as were in the old path).

Note that this could fail badly if the index had unmerged entries for old-name.txt since there might be other stages (1 to 3) for the file.
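One of the other ways, sketched here as an alternative rather than as the method used above, is to add the entry with git update-index --add --cacheinfo and drop the old one with git rm --cached. This assumes Git 2.0 or later for the comma-separated --cacheinfo form, and it refuses to run when the old path has unmerged entries. The scratch repository is an assumption for the demo.

```shell
#!/bin/sh
# Sketch: index-only rename via --cacheinfo, leaving the work-tree untouched.
set -eu
repo=$(mktemp -d)            # scratch repository (assumption for the demo)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
printf 'one\ntwo\nthree\n' > old-name.txt
git add old-name.txt
git commit -q -m "add old-name.txt"

# The question's state: renamed outside git, then edited.
mv old-name.txt new-name.txt
printf 'edited after the rename\n' >> new-name.txt

# Refuse to continue if the old path has stage 1 to 3 (unmerged) entries.
if [ -n "$(git ls-files --unmerged -- old-name.txt)" ]; then
  echo "old-name.txt is unmerged; resolve that first" >&2
  exit 1
fi

# Pull mode and blob ID out of "mode SHA-1 stage<TAB>path".
entry=$(git ls-files --stage -- old-name.txt)
mode=${entry%% *}
rest=${entry#* }
blob=${rest%% *}

git update-index --add --cacheinfo "$mode,$blob,new-name.txt"
git rm -q --cached -- old-name.txt

staged=$(git diff --cached -M --name-status)
unstaged=$(git diff --name-status)
printf 'staged:   %s\nunstaged: %s\n' "$staged" "$unstaged"
```

Because --cacheinfo takes the blob ID directly, the edited work-tree file is never read, so the new contents stay unstaged.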



Answer 2:

@torek gave a very clear and full answer. There's lots of very useful detail there; it's well worth a proper read.

But, for the sake of those in a rush, the crux of the very simple solution given was this:

What we do now is, first, manually put the file back under its old name, then use git mv to switch again to the new name:

$ mv new-name.txt old-name.txt
$ git mv old-name.txt new-name.txt

(It was just the mv back that I was missing, to make the git mv possible.)

Please upvote @torek's answer if you find this helpful.