I accidentally committed an unwanted file (filename.orig while resolving a merge) to my repository several commits ago, without me noticing it until now. I want to completely delete the file from the repository history.
Is it possible to rewrite the change history such that filename.orig was never added to the repository in the first place?
Please don't use this recipe if your situation is not the one described in the question. This recipe is for fixing a bad merge, and replaying your good commits onto a fixed merge.
Although filter-branch will do what you want, it is quite a complex command and I would probably choose to do this with git rebase. It's probably a personal preference. filter-branch can do it in a single, slightly more complex command, whereas the rebase solution is performing the equivalent logical operations one step at a time.
Try the following recipe:
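A minimal sketch of such a recipe, run against a throwaway repository so it is self-contained. The branch names (main, feature, tmpfix), the helper add_commit and the file names other than filename.orig are assumptions made for illustration; in a real repository you would start at the "recipe itself" part and substitute the id of your own bad merge.

```shell
# Throwaway repository so the sketch runs on its own (all names are illustrative).
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"
add_commit() { echo "$1" > "$1" && git add "$1" && git commit -q -m "add $1"; }

add_commit base.txt
git checkout -q -b feature && add_commit feature.txt
git checkout -q main && add_commit main.txt

# The bad merge: filename.orig slips into the merge commit.
git merge -q --no-ff --no-commit feature
echo leftover > filename.orig && git add filename.orig
git commit -q -m "Merge branch 'feature'"
bad_merge=$(git rev-parse HEAD)

# Good commits made after the bad merge.
add_commit good1.txt
add_commit good2.txt

# The recipe itself.
git checkout -q -b tmpfix "$bad_merge"   # temporary branch at the bad merge
git rm -q filename.orig                  # remove the incorrectly added file
git commit -q --amend --no-edit          # commit the amended merge
git checkout -q main                     # go back to the original branch
git rebase -q tmpfix                     # replay the good commits onto the fixed merge
git branch -q -d tmpfix                  # delete the temporary branch

git --no-pager log --oneline --graph     # inspect the rewritten history
```

The rebase skips the original merge commit and replays only the ordinary commits that came after it, which is why the amended merge can take its place.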
(Note that you don't actually need a temporary branch; you can do this with a 'detached HEAD', but you need to take a note of the commit id generated by the git commit --amend step to supply to the git rebase command rather than using the temporary branch name.)
Definitely, git filter-branch is the way to go.
Sadly, this will not suffice to completely remove filename.orig from your repo, as it can still be referenced by tags, reflog entries, remotes and so on.
I recommend removing all these references as well, and then calling the garbage collector. You can use the git forget-blob script from this website to do all this in one step.
git forget-blob filename.orig
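To illustrate only the reflog and garbage-collection part of that cleanup using plain Git commands, here is a minimal sketch on a throwaway repository. It is not the forget-blob script, and it does not deal with tags or remotes; the repository layout and file names other than filename.orig are assumptions.

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"

echo keep > wanted.txt && echo leftover > filename.orig
git add wanted.txt filename.orig && git commit -q -m "add files"
blob=$(git rev-parse HEAD:filename.orig)   # object id of the unwanted file's content

# Rewrite the commit without the file; the old commit now survives only via the reflog.
git rm -q filename.orig && git commit -q --amend --no-edit
git cat-file -e "$blob" && echo "blob still stored after the rewrite"

# Drop the reflog entries, then let the garbage collector prune unreachable objects.
git reflog expire --expire=now --all
git gc -q --prune=now
git cat-file -e "$blob" 2>/dev/null || echo "blob is gone"
```

Until the reflog entries are expired, the rewritten-away commit still counts as reachable, so a plain gc would keep the file's content around.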
Intro: You Have 5 Solutions Available
The original poster states:
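There are many different ways to remove the history of a file completely from git:
1. Amending commits
2. Hard resets (possibly plus a rebase)
3. Non-interactive rebase
4. Interactive rebases
5. Filtering branches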
In the case of the original poster, amending the commit isn't really an option by itself, since he made several additional commits afterwards, but for the sake of completeness, I will also explain how to do it, for anyone else who just wants to amend their previous commit.
Note that all of these solutions involve altering/re-writing history/commits in one way or another, so anyone with old copies of the commits will have to do extra work to re-sync their history with the new history.
Solution 1: Amending Commits
If you accidentally made a change (such as adding a file) in your previous commit, and you don't want the history of that change to exist anymore, then you can simply amend the previous commit to remove the file from it:
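A minimal sketch of the amend, assuming the unwanted file is filename.orig and that it was added in the most recent commit. The throwaway repository and the other file names are only there to make the example self-contained.

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"

echo base > base.txt && git add base.txt && git commit -q -m "base"
echo keep > wanted.txt && echo leftover > filename.orig
git add wanted.txt filename.orig && git commit -q -m "add wanted.txt"   # filename.orig came along by mistake

git rm -q filename.orig            # stage the removal of the unwanted file
git commit -q --amend --no-edit    # rewrite the previous commit without it, keeping its message
```

Use git rm --cached instead of git rm if you want to keep the file in your working directory while dropping it from the commit.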
Solution 2: Hard Reset (Possibly Plus a Rebase)
Like solution #1, if you just want to get rid of your previous commit, then you also have the option of simply doing a hard reset to its parent:
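A minimal sketch of that hard reset on a throwaway repository (the repository and file names are assumptions for illustration). Keep in mind that a hard reset also discards any uncommitted changes in the working tree.

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"

echo base > base.txt && git add base.txt && git commit -q -m "base"
echo leftover > filename.orig && git add filename.orig && git commit -q -m "unwanted commit"

git reset -q --hard HEAD~1   # move the branch back to the parent; the last commit is dropped
```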
That command will hard-reset your branch to the previous 1st parent commit.
However, if, like the original poster, you've made several commits after the commit you want to undo the change to, you can still use hard resets to modify it, but doing so also involves using a rebase. Here are the steps that you can use to amend a commit further back in history:
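One way these steps could look, sketched on a throwaway repository. The variable names old and tip, the branch name main and the file names other than filename.orig are assumptions; the exact commands of the original steps may have differed (for example by using a temporary branch instead of resetting the branch itself).

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"
add_commit() { echo "$1" > "$1" && git add "$1" && git commit -q -m "add $1"; }

add_commit base.txt
echo keep > wanted.txt && echo leftover > filename.orig
git add wanted.txt filename.orig && git commit -q -m "add wanted.txt"
add_commit good1.txt
add_commit good2.txt

old=$(git rev-parse HEAD~2)   # the commit to amend
tip=$(git rev-parse HEAD)     # where the branch pointed before the rewrite

git reset -q --hard "$old"                # move the branch back to the commit to amend
git rm -q filename.orig
git commit -q --amend --no-edit           # main now ends at the fixed commit
git rebase -q --onto main "$old" "$tip"   # replay the later commits on top (detached HEAD)
git checkout -q -B main                   # point main at the replayed history
```

Saving the old tip before the reset matters here: once the branch has been moved back, the later commits are only reachable through that saved id (or the reflog).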
Solution 3: Non-interactive Rebase
This will work if you just want to remove a commit from history entirely:
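A minimal sketch using git rebase --onto, assuming the commit to remove only added filename.orig and that the branch is called main. The throwaway repository, the helper add_commit and the variable bad are illustrative.

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"
add_commit() { echo "$1" > "$1" && git add "$1" && git commit -q -m "add $1"; }

add_commit base.txt
add_commit filename.orig     # the commit to remove
add_commit good1.txt
add_commit good2.txt

bad=$(git rev-parse HEAD~2)  # id of the commit to remove

# Replay everything after the bad commit onto the bad commit's parent.
git rebase -q --onto "${bad}~1" "$bad" main
```

The three arguments read as: new base, old base (exclusive), branch to move. Everything in the range after the old base is replayed, so the bad commit itself is left out.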
Solution 4: Interactive Rebases
This solution will allow you to accomplish the same things as solutions #2 and #3, i.e. modify or remove commits further back in history than your immediately previous commit, so which solution you choose to use is sort of up to you. Interactive rebases are not well-suited to rebasing hundreds of commits, for performance reasons, so I would use non-interactive rebases or the filter branch solution (see below) in those sorts of situations.
To begin the interactive rebase, use the following:
This will cause git to rewind the commit history back to the parent of the commit that you want to modify or remove. It will then present you a list of the rewound commits in reverse order in whatever editor git is set to use (this is Vim by default):
The commit that you want to modify or remove will be at the top of this list. To remove it, simply delete its line in the list. Otherwise, replace "pick" with "edit" on the 1st line, like so:
Next, enter git rebase --continue. If you chose to remove the commit entirely, then that is all you need to do (other than verification, see final step for this solution). If, on the other hand, you wanted to modify the commit, then git will reapply the commit and then pause the rebase.
At this point, you can remove the file and amend the commit, then continue the rebase:
That's it. As a final step, whether you modified the commit or removed it completely, it's always a good idea to verify that no other unexpected changes were made to your branch by diffing it with its state before the rebase:
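The whole "edit" flow of this solution can be sketched end to end on a throwaway repository. Because an example script cannot type into an editor, GIT_SEQUENCE_EDITOR is used here to change the first "pick" to "edit" automatically; that scripting trick, the saved id in before (used for the final diff) and all names other than filename.orig are assumptions of this sketch, not part of the interactive workflow itself.

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"
add_commit() { echo "$1" > "$1" && git add "$1" && git commit -q -m "add $1"; }

add_commit base.txt
echo keep > wanted.txt && echo leftover > filename.orig
git add wanted.txt filename.orig && git commit -q -m "add wanted.txt"
add_commit good1.txt
add_commit good2.txt

bad=$(git rev-parse HEAD~2)     # the commit to modify
before=$(git rev-parse HEAD)    # branch state before the rebase, for the final check

# Start the rebase at the parent of the commit to modify. Normally an editor opens here;
# the sequence editor below stands in for it and turns the first "pick" into "edit".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -q -i "${bad}~1"

# The rebase is now paused at the commit to modify.
git rm -q filename.orig
git commit -q --amend --no-edit
GIT_EDITOR=true git rebase --continue

# Verify: list the files that differ from the pre-rebase state.
git --no-pager diff --name-status "$before" HEAD
```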
Solution 5: Filtering Branches
Finally, this solution is best if you want to completely wipe out all traces of a file's existence from history, and none of the other solutions are quite up to the task.
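A minimal sketch of such a filter-branch call, with filename.orig standing in for <file> and run on a throwaway repository. The explicit HEAD argument, the FILTER_BRANCH_SQUELCH_WARNING setting and the saved id in before are choices made for this example.

```shell
# Throwaway repository so the sketch runs on its own.
repo=$(mktemp -d) && cd "$repo" && git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com" && git config user.name "Example"
add_commit() { echo "$1" > "$1" && git add "$1" && git commit -q -m "add $1"; }

add_commit base.txt
echo keep > wanted.txt && echo leftover > filename.orig
git add wanted.txt filename.orig && git commit -q -m "add wanted.txt"
add_commit good1.txt
before=$(git rev-parse HEAD)    # branch state before filtering, for the final check

export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the advisory pause that newer Git versions add

# Rewrite every commit reachable from HEAD, dropping the file from each commit's index.
git filter-branch --index-filter \
  'git rm -q --cached --ignore-unmatch filename.orig' HEAD

# Verify: list the files that differ from the pre-filter state.
git --no-pager diff --name-status "$before" HEAD
```

The --ignore-unmatch flag keeps git rm from failing on commits that do not contain the file, which would otherwise abort the rewrite.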
That will remove <file> from all commits, starting from the root commit. If instead you just want to rewrite the commit range HEAD~5..HEAD, then you can pass that as an additional argument to filter-branch, as pointed out in this answer:
Again, after the filter-branch is complete, it's usually a good idea to verify that there are no other unexpected changes by diffing your branch with its previous state before the filtering operation:
Filter-Branch Alternative: BFG Repo Cleaner
I've heard that the BFG Repo Cleaner tool runs faster than git filter-branch, so you might want to check that out as an option too. It's even mentioned officially in the filter-branch documentation as a viable alternative:
Additional Resources
Rewriting Git history demands changing all the affected commit ids, and so everyone who's working on the project will need to delete their old copies of the repo, and do a fresh clone after you've cleaned the history. The more people it inconveniences, the more you need a good reason to do it - your superfluous file isn't really causing a problem, but if only you are working on the project, you might as well clean up the Git history if you want to!
To make it as easy as possible, I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for removing files from Git history. One way in which it makes your life easier here is that it actually handles all refs by default (all tags, branches, etc) but it's also 10 - 50x faster.
You should carefully follow the steps here: http://rtyley.github.com/bfg-repo-cleaner/#usage - but the core bit is just this: download the BFG jar (requires Java 6 or above) and run this command:
Your entire repository history will be scanned, and any file named filename.orig (that's not in your latest commit) will be removed. This is considerably easier than using git-filter-branch to do the same thing!
Full disclosure: I'm the author of the BFG Repo-Cleaner.
The simplest way I found was suggested by leontalbot (as a comment), which is a post published by Anoopjohn. I think it's worth its own space as an answer:
(I converted it to a bash script)
All credit goes to Anoopjohn, and to leontalbot for pointing it out.
NOTE
Be aware that the script doesn't include validations, so be sure you don't make mistakes and that you have a backup in case something goes wrong. It worked for me, but it may not work in your situation. USE IT WITH CAUTION (follow the link if you want to know what is going on).
You can also use:
git reset HEAD file/path