I have a Git repository which contains a number of subdirectories. Now I have found that one of the subdirectories is unrelated to the others and should be detached to a separate repository.
How can I do this while keeping the history of the files within the subdirectory?
I guess I could make a clone and remove the unwanted parts of each clone, but I suppose this would give me the complete tree when checking out an older revision etc. This might be acceptable, but I would prefer to be able to pretend that the two repositories don't have a shared history.
Just to make it clear, I have the following structure:
XYZ/
    .git/
    XY1/
    ABC/
    XY2/
But I would like this instead:
XYZ/
    .git/
    XY1/
    XY2/
ABC/
    .git/
    ABC/
Update: This process is so common that the git team made it much simpler with a new tool, `git subtree`. See here: Detach (move) subdirectory into separate Git repository
You want to clone your repository and then use `git filter-branch` to mark everything but the subdirectory you want in your new repo to be garbage-collected.
To clone your local repository:
(Note: the repository will be cloned using hard-links, but that is not a problem since the hard-linked files will not be modified in themselves - new ones will be created.)
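A minimal sketch of the clone step. The helper name `clone_for_split` and both path arguments are placeholders of mine, not names from the original answer:

```shell
# clone_for_split SRC DST
# Make a local clone of SRC at DST. All rewriting then happens in DST,
# so the original repository stays untouched.
clone_for_split() {
    git clone "$1" "$2"
}
```

With the layout from the question, this would be called as something like `clone_for_split /path/to/XYZ /path/to/ABC`.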
Now, let us preserve the interesting branches which we want to rewrite as well, and then remove the origin to avoid pushing there and to make sure that old commits will not be referenced by the origin:
or for all remote branches:
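One way to sketch the "all remote branches" variant, assuming the remote is called `origin` and you run it inside the fresh clone. The function name is hypothetical:

```shell
# keep_branches_and_drop_origin
# Create a local branch for every branch on origin, then remove the remote
# so nothing can be pushed back and no old commits stay referenced by it.
keep_branches_and_drop_origin() {
    git for-each-ref --format='%(refname)' refs/remotes/origin/ |
    while read -r ref; do
        name=${ref#refs/remotes/origin/}
        # origin/HEAD is only a pointer to the default branch, not a branch.
        [ "$name" = HEAD ] && continue
        # Skip branches that already exist locally (usually the checked-out one).
        git show-ref --verify --quiet "refs/heads/$name" && continue
        git branch --track "$name" "origin/$name"
    done
    git remote rm origin
}
```

To keep only a few branches, replace the `for-each-ref` pipeline with a loop over the names you care about.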
Now you might also want to remove tags which have no relation to the subproject; you can also do that later, but you might need to prune your repo again. I did not do so and got a `WARNING: Ref 'refs/tags/v0.1' is unchanged` for all tags (since they were all unrelated to the subproject); additionally, after removing such tags more space will be reclaimed. Apparently `git filter-branch` should be able to rewrite other tags, but I could not verify this. If you want to remove all tags, use `git tag -l | xargs git tag -d`.
Then use filter-branch and reset to exclude the other files, so they can be pruned. Let's also add `--tag-name-filter cat --prune-empty` to remove empty commits and to rewrite tags (note that this will have to strip their signature):
or alternatively, to only rewrite the HEAD branch and ignore tags and other branches:
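The two variants described above can be sketched as follows. The function names are mine, and the directory (`ABC` in the question) is passed as an argument. Run them from the top level of the clone with a clean working tree:

```shell
# rewrite_all_refs DIR
# Rewrite every branch and tag so that DIR becomes the repository root.
# Commits that never touched DIR are dropped by --prune-empty.
rewrite_all_refs() {
    git filter-branch --tag-name-filter cat --prune-empty \
        --subdirectory-filter "$1" -- --all
}

# rewrite_head_only DIR
# Same idea, but only the current branch is rewritten; tags and other
# branches are left alone.
rewrite_head_only() {
    git filter-branch --prune-empty --subdirectory-filter "$1" HEAD
}
```

For the question's layout that would be `rewrite_all_refs ABC`.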
Then delete the backup reflogs so the space can be truly reclaimed (note that from this point on the operation is destructive).
You now have a local git repository of the ABC sub-directory with all its history preserved.
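The cleanup step can be sketched roughly like this. The function name is hypothetical, and it assumes the backups are where `git filter-branch` normally puts them, under `refs/original/`:

```shell
# purge_rewrite_backups
# Drop the refs/original/ backups left by filter-branch, expire all reflog
# entries, and garbage-collect so the old objects are really deleted.
# There is no way back to the pre-rewrite history after this.
purge_rewrite_backups() {
    git reset --hard
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
}
```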
Note: For most uses, `git filter-branch` should indeed have the added parameter `-- --all`. Yes, that's really `--`, a space, then `--all`. These need to be the last parameters for the command. As Matli discovered, this keeps the project branches and tags included in the new repo.
Edit: various suggestions from comments below were incorporated to make sure, for instance, that the repository is actually shrunk (which was not always the case before).
Put this into your gitconfig:
The Easy Way™
It turns out that this is such a common and useful practice that the overlords of git made it really easy, but you have to have a newer version of git (>= 1.7.11 May 2012). See the appendix for how to install the latest git. Also, there's a real-world example in the walkthrough below.
Prepare the old repo
Note: `<name-of-folder>` must NOT contain leading or trailing characters. For instance, the folder named `subproject` MUST be passed as `subproject`, NOT `./subproject/`
Note for Windows users: when your folder depth is > 1, `<name-of-folder>` must have *nix style folder separator (/). For instance, the folder named `path1\path2\subproject` MUST be passed as `path1/path2/subproject`
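For the "Prepare the old repo" step, a minimal sketch using `git subtree split`. The function name and both arguments are placeholders. It has to be run from the top level of the big repository, and it assumes your git installation ships the `subtree` command:

```shell
# split_folder_to_branch FOLDER BRANCH
# Create BRANCH containing only the history of FOLDER, with FOLDER's
# contents at the root of the tree. The existing branches are not changed.
split_folder_to_branch() {
    git subtree split -P "$1" -b "$2"
}
```

For example, `split_folder_to_branch subproject subproject-only`.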
Create the new repo
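A sketch of this step, assuming the split branch already exists in the big repository. The function name and all three arguments are hypothetical, and the path to the big repository should be absolute because the helper changes directory:

```shell
# create_repo_from_branch NEW_DIR BIG_REPO BRANCH
# Initialise an empty repository in NEW_DIR and pull BRANCH from BIG_REPO
# into it. The subshell keeps the caller's working directory unchanged.
create_repo_from_branch() {
    (
        mkdir -p "$1" && cd "$1" || exit 1
        git init
        git pull "$2" "$3"
    )
}
```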
Link the new repo to Github or wherever
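Roughly, linking and pushing looks like the sketch below. The function name is mine and the URL is whatever your host gives you. It pushes the branch that is currently checked out rather than assuming a particular branch name:

```shell
# publish_to_remote URL
# Register URL as the remote named "origin" and push the current branch,
# setting it as the upstream for later pushes and pulls.
publish_to_remote() {
    git remote add origin "$1"
    git push -u origin "$(git symbolic-ref --short HEAD)"
}
```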
Cleanup, if desired
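A small sketch of the cleanup in the big repository. The function name is hypothetical, and committing right away with this message is my own assumption; you may prefer to review the staged removal first:

```shell
# remove_split_folder FOLDER
# Delete FOLDER from the working tree and the index of the big repository,
# then record the removal as an ordinary commit.
remove_split_folder() {
    git rm -rf -- "$1"
    git commit -m "Remove $1 (moved to its own repository)"
}
```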
Note: This leaves all the historical references in the repository. See the Appendix below if you're actually concerned about having committed a password or you need to decrease the file size of your `.git` folder.
...
Walkthrough
These are the same steps as above, but following my exact steps for my repository instead of using `<meta-named-things>`.
Here's a project I have for implementing JavaScript browser modules in node:
I want to split out a single folder, `btoa`, into a separate git repository.
I now have a new branch, `btoa-only`, that only has commits for `btoa` and I want to create a new repository.
Next I create a new repo on Github or bitbucket, or whatever and add it as the `origin` (btw, "origin" is just a convention, not part of the command - you could call it "remote-server" or whatever you like)
Happy day!
Note: If you created a repo with a `README.md`, `.gitignore` and `LICENSE`, you will need to pull first:
Lastly, I'll want to remove the folder from the bigger repo
...
Appendix
Latest git on OS X
To get the latest version of git:
To get brew for OS X:
http://brew.sh
Latest git on Ubuntu
If that doesn't work (you have a very old version of ubuntu), try
If that still doesn't work, try
Thanks to rui.araujo from the comments.
clearing your history
By default, removing files from git doesn't actually remove them from git; it just commits that they aren't there anymore. If you want to actually remove the historical references (e.g. you have committed a password), you need to do this:
After that you can check that your file or folder no longer shows up in the git history at all
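A sketch of both steps, the rewrite and the check, using a `--tree-filter` on the current branch. The function names and the `PURGE_PATH` variable are made up for this example. The path is handed to the filter through the environment so that it does not have to be quoted into the filter string:

```shell
# purge_path_from_history PATH
# Rewrite the current branch so that PATH never existed in any commit.
# Commits that only touched PATH become empty and are dropped.
purge_path_from_history() {
    PURGE_PATH=$1 git filter-branch --prune-empty \
        --tree-filter 'rm -rf -- "$PURGE_PATH"' HEAD
}

# path_still_in_history PATH
# Succeeds if any commit reachable from HEAD still touches PATH.
path_still_in_history() {
    [ -n "$(git log --oneline -- "$1")" ]
}
```

After `purge_path_from_history <name-of-folder>`, `path_still_in_history <name-of-folder>` should fail, meaning `git log` no longer lists any commit for that path.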
However, you can't "push" deletes to github and the like. If you try you'll get an error and you'll have to `git pull` before you can `git push` - and then you're back to having everything in your history.
So if you want to delete history from the "origin" - meaning to delete it from github, bitbucket, etc - you'll need to delete the repo and re-push a pruned copy of the repo. But wait - there's more! - If you're really concerned about getting rid of a password or something like that you'll need to prune the backup (see below).
making `.git` smaller
The aforementioned delete history command still leaves behind a bunch of backup files - because git is all too kind in helping you to not ruin your repo by accident. It will eventually delete orphaned files over the days and months, but it leaves them there for a while in case you realize that you accidentally deleted something you didn't want to.
So if you really want to empty the trash to reduce the clone size of a repo immediately you have to do all of this really weird stuff:
That said, I'd recommend not performing these steps unless you know that you need to - just in case you did prune the wrong subdirectory, y'know? The backup files shouldn't get cloned when you push the repo, they'll just be in your local copy.
Credit
Use this filter command to remove a subdirectory, while preserving your tags and branches:
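A sketch of such a filter command, based on `--index-filter` with `git rm --cached`. The function name and the `REMOVE_DIR` variable are my own placeholders. Run it from the top level of the repository:

```shell
# remove_dir_everywhere DIR
# Remove DIR from every commit on every branch and tag. Tags are rewritten
# under the same names (--tag-name-filter cat) and commits that become
# empty are dropped (--prune-empty).
remove_dir_everywhere() {
    REMOVE_DIR=$1 git filter-branch \
        --index-filter 'git rm -r -f --cached --ignore-unmatch -- "$REMOVE_DIR"' \
        --prune-empty --tag-name-filter cat -- --all
}
```

An index filter only edits the index and never checks files out, so it is usually much faster than a tree filter on large histories.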