How to extract one file with commit history from a

Posted 2019-01-31 07:12

Question:

My situation: I had a Git repo that had been converted from SVN to HG to Git, and I wanted to extract just one source file. The filenames also contained weird characters like aÌ (a Unicode ä corrupted by an encoding mismatch) and spaces.

This turned out not to be particularly easy, which is why I am answering my own question despite the many similar questions about git [index-filter|subdirectory-filter|tree-filter]: I needed all of them to achieve this!

So the question is: "How can I extract one file from a repository and place it at the root of the new repo?"

Answer 1:

A faster and easier-to-understand filter that accomplishes the same thing:

git filter-branch --index-filter '
        git read-tree --empty
        git reset "$GIT_COMMIT" -- $your $files $here
    ' -- --all -- $your $files $here
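To see what this filter does, here is a minimal sketch that runs it on a throwaway repository. The repository layout, the file name src/model/trie.lisp, and the demo identity are made up for illustration; FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer Git versions print before running filter-branch.

```shell
#!/bin/sh
set -eu

# Assumptions for the demo: a throwaway identity and a quiet filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Build a small repository with three commits; only two touch the file we want.
demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir -p src/model
echo one > src/model/trie.lisp
echo readme > README
git add .
git commit -q -m "add trie and README"
echo two >> src/model/trie.lisp
git commit -q -am "change trie"
echo more >> README
git commit -q -am "change README only"

# For each commit: empty the index, then restore only the wanted path from it.
# The trailing path limits the rewrite to commits that touch that path.
git filter-branch --index-filter '
    git read-tree --empty
    git reset -q "$GIT_COMMIT" -- src/model/trie.lisp
' -- --all -- src/model/trie.lisp

files=$(git ls-tree -r --name-only HEAD)
commits=$(git rev-list --count HEAD)
echo "files: $files"
echo "commits: $commits"
```

Note that the file keeps its original path here; this filter alone does not move it to the root of the repository.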


Answer 2:

First, a quick note: even a spell like the one given in a comment on "Splitting a set of files within a git repo into their own repository, preserving relevant history"

SPELL='git ls-tree -r --name-only --full-tree "$GIT_COMMIT" | grep -v "trie.lisp" | tr "\n" "\0" | xargs -0 git rm --cached -r --ignore-unmatch'
git filter-branch --prune-empty --index-filter "$SPELL" -- --all

will not help with files named like imaging/DrinkkejaI<0300>$'\302\210'.txt_74x2032.gif. The aI<0300>$'\302\210' part once was a single letter: ä.
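One likely reason the spell misses such files is that git ls-tree, by default, prints paths containing non-ASCII bytes in a quoted, octal-escaped form, so the names passed on to git rm no longer match anything. The sketch below shows the difference on a throwaway repository; the file name and the demo identity are made up for illustration.

```shell
#!/bin/sh
set -eu

# Assumption for the demo: a throwaway identity.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"
git init -q .

# Hypothetical file name: "Drinkkejä 1.txt" (ä as UTF-8 bytes 0xC3 0xA4, plus a space).
name=$(printf 'Drinkkej\303\244 1.txt')
echo data > "$name"
git add -- "$name"
git commit -q -m "add file with a non-ASCII name"

# Default behaviour (core.quotePath=true): the path is quoted and escaped.
quoted=$(git -c core.quotePath=true ls-tree -r --name-only HEAD)
# With -z the path is printed verbatim and terminated by a NUL byte.
raw=$(git ls-tree -r -z --name-only HEAD | tr '\0' '\n')

printf 'default: %s\n' "$quoted"
printf 'with -z: %s\n' "$raw"
```

If your grep supports -z (a GNU extension), the spell can probably be kept NUL-delimited end to end by using git ls-tree -z and grep -zv and dropping the tr step, though I have not tested that on the repository described here.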

So in order to extract a single file, in addition to filter-branch I also needed to do:

git filter-branch -f --subdirectory-filter lisp/source/model HEAD

Alternatively, you can use --tree-filter. The test is needed because the file was in a different directory in earlier commits (see: How can I move a directory in a Git repo for all commits?):

MV_FILTER='test -f source/model/trie.lisp && mv ./source/model/trie.lisp . || echo "Nothing to do."'
git filter-branch --tree-filter "$MV_FILTER" HEAD --all
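The guard in MV_FILTER matters because filter-branch aborts when a filter exits with a non-zero status. The following sketch runs the same filter string outside Git in two scratch directories, one with the file and one without, to show that it succeeds in both cases. The directory names with_file and without_file are made up for illustration.

```shell
#!/bin/sh
set -eu

MV_FILTER='test -f source/model/trie.lisp && mv ./source/model/trie.lisp . || echo "Nothing to do."'

# Two scratch trees standing in for commits with and without the file.
work=$(mktemp -d)
mkdir -p "$work/with_file/source/model" "$work/without_file"
echo '(defun lookup ())' > "$work/with_file/source/model/trie.lisp"

# Run the filter the way a tree filter would be run: as a shell command in the tree.
out_with=$(cd "$work/with_file" && sh -c "$MV_FILTER"; echo "status=$?")
out_without=$(cd "$work/without_file" && sh -c "$MV_FILTER"; echo "status=$?")

printf 'with file:    %s\n' "$out_with"
printf 'without file: %s\n' "$out_without"
```

Without the `|| echo ...` fallback, the filter would return the failing status of `test -f` for every commit that predates the file.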

To see all the names a file has had, use:

git log --pretty=oneline --follow --name-only git-path/to/file | grep -v ' ' | sort -u
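One caveat with the command above: grep -v ' ' drops the commit summary lines, but it also drops any file name that contains a space. A possible variant, sketched here on a throwaway repository, uses an empty --pretty=format: so that only names and blank lines are printed. The file names and the demo identity are assumptions for illustration.

```shell
#!/bin/sh
set -eu

# Assumptions for the demo: a throwaway identity and byte-wise sort order.
export LC_ALL=C
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"
git init -q .

# Add the file in one place, then move it to the root in a second commit.
mkdir -p source/model
printf '%s\n' '(defun insert ())' '(defun lookup ())' '(defun delete ())' \
    > source/model/trie.lisp
git add .
git commit -q -m "add trie"
git mv source/model/trie.lisp trie.lisp
git commit -q -m "move trie to the root"

# Empty format: no commit lines at all, so only blank lines need filtering out.
names=$(git log --follow --name-only --pretty=format: -- trie.lisp | grep . | sort -u)
printf '%s\n' "$names"
```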

As described at http://whileimautomaton.net/2010/04/03012432

Afterwards, also run these cleanup steps:

$ git reset --hard
$ git gc --aggressive
$ git prune
$ git remote rm origin # Otherwise changes will be pushed to where the repo was cloned from
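Keep in mind that filter-branch saves the pre-rewrite refs under refs/original/, and the reflog also still points at the old commits, so the old objects stay reachable until both are cleared. The sketch below shows one way to do that on a throwaway repository; the file names and the demo identity are made up, and you should only run the cleanup part once you are sure the rewrite is what you want.

```shell
#!/bin/sh
set -eu

# Assumptions for the demo: a throwaway identity and a quiet filter-branch.
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

demo=$(mktemp -d)
cd "$demo"
git init -q .
echo keep > keep.txt
echo drop > drop.txt
git add .
git commit -q -m "add two files"

# Any rewrite will do; this one removes drop.txt from every commit.
git filter-branch --index-filter \
    'git rm -q --cached --ignore-unmatch drop.txt' -- --all

backups_before=$(git for-each-ref --format='%(refname)' refs/original/)

# Delete the backup refs, expire the reflog, then let gc prune the old objects.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done
git reflog expire --expire=now --all
git gc --quiet --prune=now

backups_after=$(git for-each-ref --format='%(refname)' refs/original/)
echo "backup refs before cleanup: $backups_before"
echo "backup refs after cleanup:  ${backups_after:-none}"
```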


Answer 3:

Note that things get much easier if you combine this with the additional step of moving the desired file(s) into a new directory.

This might be a quite common use case (e.g. moving the desired single file to the root dir).
I did it (using git 1.9) like this (first moving the file(s), then deleting the old tree):

git filter-branch -f --tree-filter 'mkdir -p new_path && git mv -k -f old_path/to/file new_path/'
git filter-branch -f --prune-empty --index-filter 'git rm -r --cached --ignore-unmatch old_path'

You can even easily use wildcards for the desired files (without messing around with grep -v).

I'd think that this ('mv' and 'rm') could also be done in a single filter-branch run, but it didn't work for me.

I didn't try it with weird characters, but I hope this helps anyway. Making things easier always seems like a good idea to me.

Hint:
This is a time-consuming operation on large repos. So if you want to perform several actions (like extracting a bunch of files and then rearranging them in 'new_path/subdirs'), it's a good idea to do the 'rm' part as early as possible to get a smaller and faster tree.