Remove sensitive files and their commits from Git

Posted 2018-12-31 01:25

I would like to put a Git project on GitHub but it contains certain files with sensitive data (usernames and passwords, like /config/deploy.rb for capistrano).

I know I can add these filenames to .gitignore, but this would not remove their history within Git.

I also don't want to start over again by deleting the /.git directory.

Is there a way to remove all traces of a particular file in your Git history?

10 answers
路过你的时光
#2 · 2018-12-31 02:11

To be clear: The accepted answer is correct. Try it first. However, it may be unnecessarily complex for some use cases, particularly if you encounter obnoxious errors such as 'fatal: bad revision --prune-empty', or really don't care about the history of your repo.

An alternative would be:

  1. cd into the project's root directory, on its base branch
  2. Remove the sensitive code / file
  3. rm -rf .git/ # Remove all Git info from your code
  4. Go to GitHub and delete your repository
  5. Follow this guide to push your code to a new repository as you normally would - https://help.github.com/articles/adding-an-existing-project-to-github-using-the-command-line/

This will of course remove all commit history, branches, and issues from both your GitHub repo and your local Git repo. If this is unacceptable, you will have to use an alternate approach.

Call this the nuclear option.
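
The local half of those steps can be tried in a throwaway directory first. This is only a sketch: the file names and the password are made up, and the GitHub side (deleting and recreating the repository) is not covered.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway project standing in for the real one (hypothetical file names).
demo=$(mktemp -d)
cd "$demo"
git init -q
mkdir config
echo 'password = hunter2' > config/deploy.rb
echo 'puts "hello"' > app.rb
git add .
git commit -q -m 'initial commit with secret'

# Steps 2-3: remove the sensitive file, then discard all Git metadata.
rm config/deploy.rb
rm -rf .git

# Step 5 (local part): start a fresh repository with a single clean commit.
git init -q
git add .
git commit -q -m 'fresh start without sensitive data'

commit_count=$(git rev-list --count HEAD)
secret_hits=$(git log --oneline -- config/deploy.rb | wc -l | tr -d ' ')
echo "commits: $commit_count, commits touching the secret: $secret_hits"
```

The new repository has exactly one commit and no record of the old file, which is the whole point and also the whole cost.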

爱死公子算了
#3 · 2018-12-31 02:14

Here is my solution on Windows:

git filter-branch --tree-filter "rm -f 'filedir/filename'" HEAD

git push --force

Make sure that the path is correct; otherwise it won't work.

I hope it helps

君临天下
#4 · 2018-12-31 02:15

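So, it looks something like this:

git rm --cached config/deploy.rb
echo /config/deploy.rb >> .gitignore

This removes the file from Git's index (it stays on disk) and adds it to .gitignore so it is not tracked again. Note that the path given to git rm is relative to the repository, without a leading slash, and that earlier commits still contain the file.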
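
A quick way to see what this does and does not achieve is to run it in a scratch repository. This is a sketch with a made-up password and a temporary directory, not something to run inside the real project.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q
mkdir config
echo 'password = hunter2' > config/deploy.rb
git add .
git commit -q -m 'add deploy config'

# Stop tracking the file but keep it on disk, then ignore it.
git rm -q --cached config/deploy.rb
echo /config/deploy.rb >> .gitignore
git add .gitignore
git commit -q -m 'stop tracking deploy config'

# The file is no longer tracked in the latest commit...
tracked_now=$(git ls-files config/deploy.rb | wc -l | tr -d ' ')
# ...but the first commit still holds its full contents.
first_commit=$(git rev-list --max-parents=0 HEAD)
old_contents=$(git show "$first_commit:config/deploy.rb")
echo "tracked now: $tracked_now; still readable from history: $old_contents"
```

So this is fine for keeping a file out of future commits, but it does not answer the original question about removing it from history.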

低头抚发
#5 · 2018-12-31 02:16

For all practical purposes, the first thing you should be worried about is CHANGING YOUR PASSWORDS! It's not clear from your question whether your git repository is entirely local or whether you have a remote repository elsewhere yet; if it is remote and not secured from others you have a problem. If anyone has cloned that repository before you fix this, they'll have a copy of your passwords on their local machine, and there's no way you can force them to update to your "fixed" version with it gone from history. The only safe thing you can do is change your password to something else everywhere you've used it.


With that out of the way, here's how to fix it. GitHub answered exactly that question as an FAQ:

Note for Windows users: use double quotes (") instead of singles in this command

git filter-branch --index-filter \
'git update-index --remove filename' <introduction-revision-sha1>..HEAD
git push --force --verbose --dry-run
git push --force

Keep in mind that once you've pushed this code to a remote repository like GitHub and others have cloned that remote repository, you're now in a situation where you're rewriting history. When others try to pull down your latest changes after this, they'll get a message indicating that the changes can't be applied because it's not a fast-forward.

To fix this, they'll have to either delete their existing repository and re-clone it, or follow the instructions under "RECOVERING FROM UPSTREAM REBASE" in the git-rebase manpage.
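
To make the situation concrete, here is a sketch that plays it out locally with a bare repository standing in for GitHub and two clones named alice and bob (all names hypothetical). For the recovery it uses git fetch followed by git reset --hard, which is a swapped-in shortcut with the same effect as re-cloning; it throws away bob's local commits and is not the procedure from the git-rebase manpage.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git          # stands in for the GitHub repository

# "alice" publishes a commit that contains a secret.
git clone -q remote.git alice 2>/dev/null
cd alice
echo 'readme' > README
git add README
git commit -q -m 'base'
echo 'password = hunter2' > settings.conf
git add settings.conf
git commit -q -m 'add settings'
git push -q -u origin HEAD
branch=$(git symbolic-ref --short HEAD)
cd ..

# "bob" clones before the history is rewritten.
git clone -q remote.git bob

# alice rewrites the last commit and force-pushes it.
cd alice
echo 'password = REDACTED' > settings.conf
git commit -q -a --amend --no-edit
git push -q --force
alice_head=$(git rev-parse HEAD)
cd ..

# bob's fast-forward pull is refused: his commit is not an ancestor of the new one.
cd bob
if git pull -q --ff-only 2>/dev/null; then pull_result=ok; else pull_result=rejected; fi

# Recovery, equivalent in effect to re-cloning: adopt the rewritten branch.
git fetch -q origin
git reset -q --hard "origin/$branch"
bob_head=$(git rev-parse HEAD)
echo "pull: $pull_result; bob now at: $bob_head"
```

Note that bob's old commit, secret included, is still sitting in his local object database afterwards, which is exactly why changing the password comes first.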


In the future, if you accidentally commit some changes with sensitive information but you notice before pushing to a remote repository, there are some easier fixes. If your last commit is the one that added the sensitive information, you can simply remove the sensitive information, then run:

git commit -a --amend

That will amend the previous commit with any new changes you've made, including entire file removals done with a git rm. If the changes are further back in history but still not pushed to a remote repository, you can do an interactive rebase:

git rebase -i origin/master

That opens an editor with the commits you've made since your last common ancestor with the remote repository. Change "pick" to "edit" on any lines representing a commit with sensitive information, and save and quit. Git will walk through the changes, and leave you at a spot where you can:

$EDITOR file-to-fix
git commit -a --amend
git rebase --continue

Repeat this for each change with sensitive information. Eventually, you'll end up back on your branch, and you can safely push the new changes.
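
The interactive rebase can be rehearsed in a scratch repository. In this sketch the todo list is edited by sed through GIT_SEQUENCE_EDITOR instead of by hand, a saved commit hash plays the role of origin/master, and the file name and password are invented; treat it as an illustration of the flow rather than a script for a real repo.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_EDITOR=true                 # never open an interactive editor in this sketch

demo=$(mktemp -d)
cd "$demo"
git init -q
echo 'readme' > README
git add README
git commit -q -m 'base'
base=$(git rev-parse HEAD)             # plays the role of origin/master

echo 'password = hunter2' > settings.conf
git add settings.conf
git commit -q -m 'add settings'
echo 'more docs' >> README
git commit -q -am 'update readme'

# Non-interactive stand-in for editing the todo list by hand:
# change "pick" to "edit" on the first line (the commit with the secret).
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i "$base"

# The rebase is now paused on that commit: fix the file, amend, continue.
echo 'password = REDACTED' > settings.conf
git commit -q -a --amend --no-edit
git rebase --continue

leaks=$(git log -p "$base"..HEAD | grep -c 'hunter2' || true)
kept=$(git rev-list --count "$base"..HEAD)
echo "commits after base: $kept; lines still mentioning the secret: $leaks"
```

Both commits survive the rebase, and the diff history after the base no longer mentions the old password.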

高级女魔头
#6 · 2018-12-31 02:18

If you have already pushed to GitHub, the data is compromised even if you force push it away one second later, because the overwritten commits can remain reachable on GitHub as dangling commits.

To test this out, I have created a repo: https://github.com/cirosantilli/test-dangling and done:

git init
git remote add origin git@github.com:cirosantilli/test-dangling.git

touch a
git add .
git commit -m 0
git push -u origin HEAD

touch b
git add .
git commit -m 1
git push

touch c
git rm b
git add .
git commit --amend --no-edit
git push -f
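
What "dangling" means can be seen without GitHub at all. The sketch below repeats the amend locally (file contents are made up) and shows that the replaced commit is still in the object database even though no branch leads to it. It only illustrates the local mechanism; how long a server keeps such objects is not something this code demonstrates.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

demo=$(mktemp -d)
cd "$demo"
git init -q
touch a
git add .
git commit -q -m 0

echo 'secret' > b
git add .
git commit -q -m 1
leaked=$(git rev-parse HEAD)           # the commit that is about to be amended away

touch c
git rm -q b
git add .
git commit -q --amend --no-edit

# The old commit is no longer part of the branch history...
if git merge-base --is-ancestor "$leaked" HEAD; then reachable=yes; else reachable=no; fi
# ...yet it can still be read by anyone who knows its hash.
object_type=$(git cat-file -t "$leaked")
recovered=$(git show "$leaked:b")
echo "reachable from HEAD: $reachable; type: $object_type; contents of b: $recovered"
```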

If you delete the repository, however, the commits disappear immediately, even from the API, which then returns 404, e.g. https://api.github.com/repos/cirosantilli/test-dangling-delete/commits/8c08448b5fbf0f891696819f3b2b2d653f7a3824. This holds even if you recreate another repository with the same name.

So my recommended course of action is:

  • change your credentials

  • if that is not enough (e.g. naked pics):

    • delete the repository
    • contact support
人间绝色
#7 · 2018-12-31 02:18

I've had to do this a few times to date. Note that this only works on 1 file at a time.

  1. Get a list of all commits that modified a file. The one at the bottom will be the first commit:

    git log --pretty=oneline --branches -- pathToFile

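  2. To remove the file from history, fill the first commit's sha1 and the path to the file from the previous command into the command below. A range written as <sha1>.. excludes <sha1> itself, so start the range at its parent (<sha1>~1) to have the commit that added the file rewritten too. If the file was added in the repository's very first commit, pass HEAD instead of a range:

    git filter-branch --index-filter 'git rm --cached --ignore-unmatch <path-to-file>' -- <sha1-where-the-file-was-first-added>~1..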
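
Both steps can be tried end to end in a scratch repository before touching the real one. This sketch uses an invented file and password, starts the range at the parent of the introducing commit so that commit is rewritten as well, and sets FILTER_BRANCH_SQUELCH_WARNING only to silence the notice that newer Git versions print.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q
echo 'readme' > README
git add README
git commit -q -m 'first commit'

mkdir config
echo 'password = hunter2' > config/deploy.rb
git add config
git commit -q -m 'add deploy config'

echo 'more' >> README
git commit -q -am 'unrelated change'

# Step 1: the last line of this log is the commit that introduced the file.
first_sha=$(git log --format=%H --branches -- config/deploy.rb | tail -n 1)

# Step 2: rewrite from the parent of that commit, so the commit itself is included.
git filter-branch --index-filter \
  'git rm -q --cached --ignore-unmatch config/deploy.rb' -- "$first_sha"~1..

# The rewritten branch no longer has any commit touching the file.
hits=$(git log --format=%H HEAD -- config/deploy.rb | wc -l | tr -d ' ')

# filter-branch keeps the old history under refs/original/ as a backup,
# so the old commits exist locally until that ref is deleted and pruned.
backups=$(git for-each-ref --format='%(refname)' refs/original/ | wc -l | tr -d ' ')
echo "commits touching the file: $hits; backup refs left behind: $backups"
```

The leftover backup ref is worth knowing about: until it is removed, the secret is still recoverable from the local clone.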
