We recently completed a conversion from Mercurial to Git. Everything went smoothly, and we were even able to get the transforms needed to make everything look and work relatively correctly in the repository. We added a .gitignore and got underway.
However, we're experiencing extreme slowdowns as soon as we incorporate or work with any of our old feature branches. After a little exploring, we found that because the .gitignore was only added to the develop branch, Git chugs whenever we look at other commits without merging develop up into them: it chokes trying to analyze all our build artifacts (binary files, etc.), since these old branches have no .gitignore file.
What we'd like to do is effectively insert a new root commit with the .gitignore so that it retroactively appears in all heads and tags. We're comfortable with rewriting history. Our team is relatively small, so getting everyone to halt for this operation and re-pull their repositories once the rewrite is done is no problem.
I've found information about rebasing master onto a new root commit, and this works for master. The problem is that it leaves our feature branches detached on the old history tree, and it also replays the entire history with a new commit date/time.
Any ideas or are we out of luck on this one?
What you want to do will involve two phases: retroactively add a new root with a suitable .gitignore, and scrub your history to remove files that should not have been added. The git filter-branch command can do both.
Setup
Consider a representative sample of your history.
$ git lola --name-status
* f1af2bf (HEAD, bar-feature) Add bar
| A .gitignore
| A bar.c
| D main.o
| D module.o
| * 71f711a (master) Add foo
|/
| A foo.c
| A foo.o
* 7f1a361 Commit 2
| A module.c
| A module.o
* eb21590 Commit 1
A main.c
A main.o
For clarity, the *.c files represent C source files and *.o are compiled object files that should have been ignored.
On the bar-feature branch, you added a suitable .gitignore and deleted object files that should not have been tracked, but you want that policy reflected everywhere in your import.
Note that git lola is a non-standard but useful alias.
git config --global alias.lola \
'log --graph --decorate --pretty=oneline --abbrev-commit --all'
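To experiment safely before touching the real repository, the example history can be rebuilt in a scratch directory. This is only a minimal sketch: the temporary directory, the committer identity, and the file contents are placeholders, not part of the original setup.

```shell
#!/bin/sh
# Sketch: rebuild the example history in a throwaway repository.
# The identity and file contents below are placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name 'Example User'
git config user.email 'user@example.invalid'

# Commit 1 and Commit 2: sources plus object files that should have been ignored.
echo 'int main(void) { return 0; }' >main.c
echo 'fake object code' >main.o
git add main.c main.o
git commit -q -m 'Commit 1'

echo 'int helper(void) { return 1; }' >module.c
echo 'fake object code' >module.o
git add module.c module.o
git commit -q -m 'Commit 2'

# bar-feature forks here; master then gains foo.c and foo.o.
git branch bar-feature
echo 'int foo(void) { return 2; }' >foo.c
echo 'fake object code' >foo.o
git add foo.c foo.o
git commit -q -m 'Add foo'

# On bar-feature: add .gitignore and bar.c, and stop tracking the object files.
git checkout -q bar-feature
git rm -q main.o module.o
printf '%s\n' '*.o' '*.obj' >.gitignore
echo 'int bar(void) { return 3; }' >bar.c
git add .gitignore bar.c
git commit -q -m 'Add bar'

# Same view as the git lola alias, with per-commit file status.
git log --graph --decorate --pretty=oneline --abbrev-commit --all --name-status
```

The commit SHAs will differ from the ones shown in this answer, since they depend on the author, the timestamps, and the file contents.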
New Root Commit
Create the new root commit as follows.
$ git checkout --orphan new-root
Switched to a new branch 'new-root'
The git checkout documentation notes a possible unanticipated state of the new orphan branch.
If you want to start a disconnected history that records a set of paths that is totally different from the one of <start_point>, then you should clear the index and the working tree right after creating the orphan branch by running git rm -rf . from the top level of the working tree. Afterwards you will be ready to prepare your new files, repopulating the working tree, by copying them from elsewhere, extracting a tarball, etc.
Continuing our example:
$ git rm -rf .
rm 'foo.c'
rm 'foo.o'
rm 'main.c'
rm 'main.o'
rm 'module.c'
rm 'module.o'
$ echo '*.o' >.gitignore
$ git add .gitignore
$ git commit -m 'Create .gitignore'
[new-root (root-commit) 00c7780] Create .gitignore
1 file changed, 1 insertion(+)
create mode 100644 .gitignore
Now the history looks like
$ git lola
* 00c7780 (HEAD, new-root) Create .gitignore
* f1af2bf (bar-feature) Add bar
| * 71f711a (master) Add foo
|/
* 7f1a361 Commit 2
* eb21590 Commit 1
That is slightly misleading because it makes new-root look like it is a descendant of bar-feature, but it really has no parent.
$ git rev-parse HEAD^
HEAD^
fatal: ambiguous argument 'HEAD^': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Make note of the SHA for the orphan because you will need it later. In this example, it is
$ git rev-parse HEAD
00c778087723ae890e803043493214fb09706ec7
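Rather than copying the SHA by hand, it can be captured in a shell variable. The following is a small sketch in a scratch repository; the variable name new_root and the placeholder commit are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: create a parentless .gitignore commit and capture its SHA.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.invalid'

# Stand-in for the existing history.
echo 'placeholder' >main.c
git add main.c
git commit -q -m 'Commit 1'

# Start the disconnected branch and clear the index and working tree.
git checkout -q --orphan new-root
git rm -rf -q .
echo '*.o' >.gitignore
git add .gitignore
git commit -q -m 'Create .gitignore'

new_root=$(git rev-parse HEAD)
# A root commit is printed alone, with no parent SHAs after it.
git rev-list --parents -n 1 "$new_root"
```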
Rewriting History
We want git filter-branch to make three broad changes.
- Splice in the new root commit.
- Delete all the temporary files.
- Use the .gitignore from the new root unless one already exists.
On the command line, that is incanted as
git filter-branch \
--parent-filter '
test $GIT_COMMIT = eb215900cd15ca2cf9ded74f1a0d9d25f65eb2bf && \
echo "-p 00c778087723ae890e803043493214fb09706ec7" \
|| cat' \
--index-filter '
git rm --cached --ignore-unmatch "*.o"; \
git ls-files --cached --error-unmatch .gitignore >/dev/null 2>&1 ||
git update-index --add --cacheinfo \
100644,$(git rev-parse new-root:.gitignore),.gitignore' \
--tag-name-filter cat \
-- --all
Explanation:
- The --parent-filter option hooks in your new root commit. The eb215... value is the full SHA of the old root commit, cf. git rev-parse eb215.
- The --index-filter option has two parts:
  - Running git rm as above deletes anything matching *.o from the entire tree because the glob pattern is quoted and interpreted by git rather than the shell.
  - Check for an existing .gitignore with git ls-files, and if it is not there, point to the one in new-root.
- If you have any tags, they will be mapped over with the identity operation, cat.
- The lone -- terminates options, and --all is shorthand for all refs.
The output you see will resemble
Rewrite eb215900cd15ca2cf9ded74f1a0d9d25f65eb2bf (1/5)rm 'main.o'
Rewrite 7f1a361ee918f7062f686e26b57788dd65bb5fe1 (2/5)rm 'main.o'
rm 'module.o'
Rewrite 71f711a15fa1fc60542cc71c9ff4c66b4303e603 (3/5)rm 'foo.o'
rm 'main.o'
rm 'module.o'
Rewrite f1af2bf89ed2236fdaf2a1a75a34c911efbd5982 (5/5)
Ref 'refs/heads/bar-feature' was rewritten
Ref 'refs/heads/master' was rewritten
WARNING: Ref 'refs/heads/new-root' is unchanged
Your originals are still safe. The master branch now lives under refs/original/refs/heads/master, for example. Review the changes in your newly rewritten branches. When you are ready to delete the backup, run
git update-ref -d refs/original/refs/heads/master
You could cook up a command to cover all backup refs in one command, but I recommend careful review for each one.
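For reference, one way such a bulk cleanup might look is sketched below. It runs in a scratch repository where the backup refs are simulated with git update-ref, so the ref names are stand-ins. Listing the refs before deleting them keeps a review step in the loop.

```shell
#!/bin/sh
# Sketch: list the filter-branch backup refs, then delete them one by one.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name 'Example User'
git config user.email 'user@example.invalid'
echo 'placeholder' >main.c
git add main.c
git commit -q -m 'Commit 1'

# Simulate the backups that git filter-branch leaves behind.
git update-ref refs/original/refs/heads/master HEAD
git update-ref refs/original/refs/heads/bar-feature HEAD

# Review what is about to go away.
git for-each-ref refs/original/

# Delete every ref under refs/original/.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done
```

Once the backup refs are gone, the old commits are reachable only through the reflog, so run this only after the rewritten branches have been checked.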
Conclusion
Finally, the new history is
$ git lola --name-status
* ab8cb1c (bar-feature) Add bar
| M .gitignore
| A bar.c
| * 43e5658 (master) Add foo
|/
| A foo.c
* 6469dab Commit 2
| A module.c
* 47f9f73 Commit 1
| A main.c
* 00c7780 (HEAD, new-root) Create .gitignore
A .gitignore
Observe that all the object files are gone. The modification to .gitignore in bar-feature is because I used different contents to make sure it would be preserved. For completeness:
$ git diff new-root:.gitignore bar-feature:.gitignore
diff --git a/new-root:.gitignore b/bar-feature:.gitignore
index 5761abc..c395c62 100644
--- a/new-root:.gitignore
+++ b/bar-feature:.gitignore
@@ -1 +1,2 @@
*.o
+*.obj
The new-root ref is no longer useful, so dispose of it with
$ git checkout master
$ git branch -d new-root
Disclaimer: This is theoretical (based on documentation), I have not done this.
Clone and try.
From what I understand, you have never committed files that would now be filtered by the .gitignore you want to add at the root of your history.
Therefore, if you rebase your master branch onto a new root commit containing only the .gitignore, you won't actually modify the content of the commits. Afterwards you should be able to rebase any and all of your other branches onto the new history, and rebase will do the work for you.
Because the content of the commits is the same, the patch ID should remain the same, and rebase will only apply that which is necessary.
You will need to rebase each branch one by one though, but that can easily be scripted.
More info can be found in the git rebase documentation, in the section RECOVERING FROM UPSTREAM REBASE at the end of the page.
EDIT: OK, never mind. I tested this and it doesn't work exactly this way. You have to give the point of rebase for each branch in the new history "manually", which is a pain.
It could still be made to work, but it is clearly a worse solution than the accepted answer.