Change file name case using git filter-branch

Posted 2020-07-22 10:20

Question:

I've got a git repo where some files differ in name by case only across branches.

As a simplified example, in master, there's a file alpha/beta/foo.cpp and in branch bar, there's a file alpha/beta/Foo.cpp.

The problem is that when I attempt to switch branches, git won't allow me to do it. There's an error that I don't have handy at the moment, but it basically looks like

changes to file alpha/beta/Foo.cpp would be overwritten -- aborting

even though a subsequent git status shows the working directory is clean.

Since this repo is not yet shared (it's actually a mirror of a large Perforce depot that I'm working on migrating), I see no problem with using git filter-branch to rewrite the history, but when I do so, any case-sensitive changes I make are lost.

When I use

git filter-branch -f -d /tmp/tmpfs/filter-it \
--tree-filter path/to/script \
--tag-name-filter cat --prune-empty -- --all

with the script looking like this

#!/bin/bash
if [ -e alpha/beta/foo.cpp ] ; then
    mv alpha/beta/foo.cpp alpha/beta/Foo.cpp
fi

the end result winds up with rewritten refs (expected) but the files themselves are not actually renamed across both branches as I would expect.

Any suggestions?

Answer 1:

The Short Answer

The following solution was modified from multiple sources:

  1. filter-branch --index-filter always failing with "fatal: bad source".

  2. Renaming The Past With Git.

Here is a filter-branch invocation that uses an index-filter to rewrite the commits without checking out a working copy, so it should run much faster than a tree-filter. As an example, I'm renaming the file alpha/beta/foo.cpp to alpha/beta/Foo.cpp.

As with any potentially destructive Git operation, it is highly recommended that you make a backup clone of your repo before you use this:

git filter-branch --index-filter '
git ls-files --stage | \
sed "s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:" | \
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info && \
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD

Note that HEAD is optional, because it should be the default for filter-branch. It will rewrite all commits from the root commit to the commit pointed to by HEAD. If you want to increase the speed of the filter-branch even more, you can pass a range of commits instead of HEAD, such as

HEAD~20..HEAD

to rewrite just the last 20 commits. The beginning of the range is exclusive, i.e. it's not rewritten, only its descendants are, and the ending HEAD is again optional, since it's the default.
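To see how many commits a range would touch before rewriting anything, the same range can be passed to git rev-list first. The following is only a sketch in a throwaway repository; the scratch directory, the identity settings, and the five empty commits are made up for illustration:

```shell
#!/bin/sh
# Build a scratch repository with 5 empty commits (illustrative only).
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3 4 5; do
    git commit -q --allow-empty -m "commit $i"
done

# The start of the range is exclusive: HEAD~2..HEAD covers HEAD~1 and HEAD only.
range_count=$(git rev-list --count HEAD~2..HEAD)
all_count=$(git rev-list --count HEAD)
echo "range: $range_count, whole history: $all_count"
```

The smaller the count for the range, the fewer commits filter-branch has to process.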

Verification

It's a good idea to do some quick sanity-checks to verify that the filter-branch did what you expected it to do. First, compare the current history with the previous history:

git diff --name-status refs/original/refs/heads/master
D       foo.cpp
A       Foo.cpp

Notice that, relative to the original history saved under refs/original, the rewritten history no longer has foo.cpp (it shows as deleted), while Foo.cpp has been added.

Now confirm that foo.cpp contains the exact same content as Foo.cpp:

git diff refs/original/refs/heads/master:foo.cpp Foo.cpp

The output should be empty, meaning that there are no differences between the two versions.
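The checks above only look at the tip of the branch. As a further sanity check, one could walk every rewritten commit and confirm that none of them still lists the old path. This is a sketch only: the scratch repository, its two commits, and the leftover counter are invented for the example, and FILTER_BRANCH_SQUELCH_WARNING is set on the assumption that a recent Git is installed (older versions ignore it):

```shell
#!/bin/sh
# Scratch repository with two commits that both contain alpha/beta/foo.cpp.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir -p alpha/beta
echo "int main() { return 0; }" > alpha/beta/foo.cpp
git add . && git commit -q -m "add foo.cpp"
echo "// edited" >> alpha/beta/foo.cpp
git commit -q -am "edit foo.cpp"

# Same invocation as in the answer.
git filter-branch --index-filter '
git ls-files --stage | \
sed "s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:" | \
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info && \
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD >/dev/null 2>&1

# Count commits whose tree still lists the old (lowercase) path.
leftover=0
for commit in $(git rev-list HEAD); do
    if git ls-tree -r --name-only "$commit" | grep -qFx "alpha/beta/foo.cpp"; then
        leftover=$((leftover + 1))
    fi
done
echo "commits still containing the old path: $leftover"
```

git ls-tree reads the stored trees directly, so the comparison is case-sensitive even on a file system that is not.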

Detailed Explanation

The following breakdown is also available in more detail in the blog post "Renaming The Past With Git"; I am summarizing it here. The basic idea of the script is to create a new index file that contains the new name for the file (i.e. foo.cpp becomes Foo.cpp), and then replace the old index with the new one.

Step 1: Get the Index File Contents

First, the current index file contents are output in a form that can then be fed into git update-index, using the --stage option:

git ls-files --stage
100644 195ff081f7d0d37a60181de790ae1c6b9f177be8 0       alpha/beta/foo.cpp
100644 0504de8997941bf10bcfb5af9a0bf472d6c061d3 0       LICENSE
100644 6293167f0eb7389b2f6f6b73e838d3a547787cbf 0       README.md
...etc...

Step 2: Rename the File

Since we want to rename foo.cpp to Foo.cpp, we use sed with a regular expression to replace the old path with the new one:

"s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:"

In the above command, I'm using a colon : to delimit the pattern and the replacement in the sed command, but you can use other characters as delimiters too, such as a pipe |. I chose a colon instead of the more standard forward slash / as a delimiter so that it wasn't necessary to escape the forward slashes used in the file paths. Strictly speaking, the unescaped . in the pattern matches any character; write \. if you need it to match only a literal dot.

After piping git ls-files --stage through sed, you should get the following:

git ls-files --stage | sed "s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:"
100644 195ff081f7d0d37a60181de790ae1c6b9f177be8 0       alpha/beta/Foo.cpp
100644 0504de8997941bf10bcfb5af9a0bf472d6c061d3 0       LICENSE
100644 6293167f0eb7389b2f6f6b73e838d3a547787cbf 0       README.md
...etc...

Step 3: Create a New Index with the Renamed File

Now the modified output of git ls-files --stage can be piped into git update-index --index-info to rename the file in the index. Because we want to create an entirely new index to replace the old one, the GIT_INDEX_FILE environment variable is pointed at a new index file for the duration of the git update-index command:

GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info

Step 4: Replace the Old Index

Now we just replace the old index with the new one, which effectively "renames" the file:

mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
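The four steps can also be tried by hand, outside of filter-branch, to watch the index change. A minimal sketch, assuming a throwaway repository; the scratch directory and the staged_paths variable are made up for the example, and GIT_INDEX_FILE is set manually here because filter-branch is not around to set it:

```shell
#!/bin/sh
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
mkdir -p alpha/beta
echo "int main() { return 0; }" > alpha/beta/foo.cpp
git add alpha/beta/foo.cpp

# filter-branch sets GIT_INDEX_FILE itself; here we point it at the
# repository's default index by hand.
GIT_INDEX_FILE="$scratch/.git/index"
export GIT_INDEX_FILE

# Steps 1-3: dump the index, rename the path, load the result into a
# fresh index file next to the old one.
git ls-files --stage |
    sed "s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:" |
    GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info

# Step 4: swap the new index in place of the old one.
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"

# The index now lists the file under its new name only.
staged_paths=$(git ls-files)
echo "$staged_paths"
```

Only the index is touched; the file on disk keeps its old name, which is why this approach works without a working copy.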

Summary

Here's the whole command again, when everything is put together:

git filter-branch --index-filter '
git ls-files --stage | \
sed "s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:" | \
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info && \
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD
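The command above rewrites only the history reachable from HEAD. The question asked about several branches, and its own invocation used --tag-name-filter cat with -- --all, which can be combined with the index filter in the same way. The sketch below tries this in a throwaway repository; the scratch directory, the branch name bar (borrowed from the question), and the branch_paths variable are illustrative assumptions, and I have not tried it on a large real repository:

```shell
#!/bin/sh
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir -p alpha/beta
echo "int main() { return 0; }" > alpha/beta/foo.cpp
git add . && git commit -q -m "add foo.cpp"
git checkout -q -b bar
echo "// bar" >> alpha/beta/foo.cpp
git commit -q -am "edit on bar"

# Index filter from the answer, applied to every ref instead of just HEAD;
# --tag-name-filter cat keeps tag names unchanged while rewriting them.
git filter-branch --index-filter '
git ls-files --stage | \
sed "s:alpha/beta/foo.cpp:alpha/beta/Foo.cpp:" | \
GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
git update-index --index-info && \
mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' --tag-name-filter cat -- --all >/dev/null 2>&1

# Collect the tracked paths at the tip of every local branch.
branch_paths=$(for ref in $(git for-each-ref --format='%(refname)' refs/heads); do
    git ls-tree -r --name-only "$ref"
done | sort -u)
echo "$branch_paths"
```

If every branch was rewritten, the only path printed is the new mixed-case one.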

Documentation

  1. git filter-branch.

  2. git ls-files.

  3. git update-index.

  4. Git environment variables.



Answer 2:

Here is my .profile alias, based on @cupcake's answer, which fixes how the variables get expanded inside the filter.

Example usage:

mvidx src/myfile.cs src/myfolder/myfile.cs origin/develop..feature/myfeature

In the ~/.profile Bash config file:

alias mvidx=rewriteIndexToMoveFile

# ANSI color codes used by the help and status messages
red="\e[0;31m"
green="\e[0;32m"
nc="\e[0m"   # reset to the default color

rewriteIndexToMoveFile() {
    if [ $# -ne 3 ] ; then
        echo -e "Rewrite index to move a file in a range of commits."
        echo -e "Args: <from file path> <to file path> <range of commits>"
        echo -e "${green}Examples:"
        echo -e "mvidx src/myproject/myfile.cs src/myproject/subfolder/myfile.cs origin/develop..feature/myfeature${nc}"
        return 1
    fi

    local fromFilePath=$1
    local toFilePath=$2
    local revisionRange=$3

    echo -e "Renaming ${red}$fromFilePath${nc} to ${red}$toFilePath${nc}."

    # The two paths are spliced into the single-quoted filter. The \t anchors
    # the match to the start of the path column of `git ls-files -s` and needs
    # a sed that understands \t (e.g. GNU sed).
    git filter-branch --index-filter 'git ls-files -s \
        | sed "s|\t'"$fromFilePath"'|\t'"$toFilePath"'|" \
        | GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info \
            && mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' \
        "$revisionRange"
}
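One thing to keep in mind is that the from-path is used as a regular expression by the sed command above, so characters such as . or [ in a file name are not matched literally. A possible workaround is to escape the path before passing it in. The helper name escapeSedPattern below is hypothetical, and this is only a sketch covering the common special characters:

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape the characters that are special in a
# basic regular expression, plus the | delimiter used by the sed command.
escapeSedPattern() {
    printf '%s\n' "$1" | sed 's/[][\.*^$|]/\\&/g'
}

escaped=$(escapeSedPattern "src/my.file[1].cs")
echo "$escaped"
```

Paths containing backslashes, quotes, or $ would need more care, because the pattern is parsed again by the shell inside the double-quoted sed expression of the filter. The to-path is a replacement string rather than a pattern, so it has different special characters (& and \) and is not covered by this helper.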