I have a Git repository which contains a number of subdirectories. Now I have found that one of the subdirectories is unrelated to the others and should be detached to a separate repository.
How can I do this while keeping the history of the files within the subdirectory?
I guess I could make a clone and remove the unwanted parts of each clone, but I suppose this would give me the complete tree when checking out an older revision etc. This might be acceptable, but I would prefer to be able to pretend that the two repositories don't have a shared history.
Just to make it clear, I have the following structure:
XYZ/
    .git/
    XY1/
    ABC/
    XY2/
But I would like this instead:
XYZ/
    .git/
    XY1/
    XY2/
ABC/
    .git/
    ABC/
I had exactly this problem, but all the standard solutions based on git filter-branch were extremely slow. If you have a small repository this may not be a problem, but it was for me. I wrote another git filtering program based on libgit2 which, as a first step, creates a branch for each filtering of the primary repository and then, as the next step, pushes these to clean repositories. On my repository (500 MB, 100,000 commits) the standard git filter-branch methods took days. My program takes minutes to do the same filtering.
It has the fabulous name of git_filter and lives on GitHub:
https://github.com/slobobaby/git_filter
I hope it is useful to someone.
As I mentioned above, I had to use the reverse solution (deleting all commits not touching my dir/subdir/targetdir), which seemed to work pretty well, removing about 95% of the commits (as desired). There are, however, two small issues remaining.
FIRST, filter-branch did a bang-up job of removing commits which introduce or modify code, but apparently merge commits are beneath its station in the Gitiverse. This is a cosmetic issue which I can probably live with (he says... backing away slowly with eyes averted).
SECOND, the few commits that remain are pretty much ALL duplicated! I seem to have acquired a second, redundant timeline that spans just about the entire history of the project. The interesting thing (which you can see from the picture below) is that my three local branches are not all on the same timeline (which is certainly why it exists and isn't just garbage collected).
The only thing I can imagine is that one of the deleted commits was, perhaps, the single merge commit that filter-branch actually did delete, and that created the parallel timeline as each now-unmerged strand took its own copy of the commits. (shrug Where's my TARDIS?) I'm pretty sure I can fix this issue, though I'd really love to understand how it happened.
In the case of crazy mergefest-O-RAMA, I'll likely be leaving that one alone since it has so firmly entrenched itself in my commit history (menacing at me whenever I come near), it doesn't seem to be actually causing any non-cosmetic problems, and it is quite pretty in Tower.app.
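For anyone wanting to check whether their branches really sit on separate timelines after a rewrite, here is a minimal sketch. It builds a throwaway repository with two deliberately unrelated branches (the names repo, f.txt and second are made up for the demo) and then asks Git three questions: how many root commits exist, how many merge commits survive, and whether the two branches share any ancestor. It only confirms the symptom; it does not explain how the split happened.

```shell
#!/bin/sh
# Sketch: detect parallel, unrelated timelines in a repository.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Demo repository with two branches that share no history (hypothetical names).
git init -q repo
cd repo
echo a > f.txt
git add f.txt
git commit -q -m "timeline one"
first=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
git checkout -q --orphan second          # new branch with no parent commit
echo b > f.txt
git add f.txt
git commit -q -m "timeline two"

# More than one root commit means more than one independent timeline.
roots=$(git rev-list --max-parents=0 --all | wc -l | tr -d ' ')

# Number of merge commits reachable from any ref.
merges=$(git rev-list --merges --count --all)

# merge-base exits non-zero when two branches have no common ancestor.
if git merge-base "$first" second > /dev/null; then
    shared=yes
else
    shared=no
fi

echo "roots=$roots merges=$merges shared=$shared"
```

On a real repository, only the last three checks are needed, with your own branch names in place of the demo ones; git log --graph --oneline --decorate --all gives the same picture visually.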
I'm sure git subtree is all fine and wonderful, but the subdirectories of Git-managed code that I wanted to move were all in Eclipse. So if you're using EGit, it's painfully easy. Take the project you want to move and Team -> Disconnect it, and then Team -> Share it to the new location. It will default to trying to use the old repo location, but you can uncheck the use-existing selection and pick the new place to move it. All hail EGit.
Check out the git_split project at https://github.com/vangorra/git_split
Turn git directories into their very own repositories in their own location. No subtree funny business. This script will take an existing directory in your git repository and turn that directory into an independent repository of its own. Along the way, it will copy over the entire change history for the directory you provided.
The proper way now is the following:
git filter-branch --prune-empty --subdirectory-filter FOLDER_NAME [first_branch] [another_branch]
GitHub now even has a small article about such cases.
But be sure to clone your original repo to a separate directory first (as it would delete all the other files and directories, and you probably need to work with them).
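The clone-then-filter workflow can be sketched end to end as below. This is only an illustration under stated assumptions: the repository layout (XYZ with XY1/ and ABC/) is borrowed from the question, the clone name ABC-split and the local bare repository ABC.git standing in for the "new remote" are made up, and FILTER_BRANCH_SQUELCH_WARNING is set only to silence the deprecation notice that newer Git versions print for filter-branch.

```shell
#!/bin/sh
# Sketch: split ABC/ out of a throwaway demo repository into its own repository.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# 1. Stand-in for the original repository (layout taken from the question).
git init -q XYZ
cd XYZ
mkdir XY1 ABC
echo one > XY1/a.txt
git add .
git commit -q -m "add XY1"
echo two > ABC/b.txt
git add .
git commit -q -m "add ABC"
echo three >> ABC/b.txt
git commit -q -am "edit ABC"
cd "$work"

# 2. Work on a clone so the original stays untouched (ABC-split is a made-up name).
git clone -q --no-hardlinks XYZ ABC-split
cd ABC-split

# 3. Keep only ABC/, make it the new root, and drop commits that become empty.
git filter-branch --prune-empty --subdirectory-filter ABC HEAD > /dev/null

# 4. Push the rewritten history to a new remote (a local bare repository here).
git init -q --bare "$work/ABC.git"
git remote set-url origin "$work/ABC.git"
git push -q origin HEAD

commit_count=$(git rev-list --count HEAD)
tracked_files=$(git ls-files)
echo "commits kept: $commit_count, files: $tracked_files"
```

In the demo, the commit that only touched XY1/ disappears and the files from ABC/ end up at the top level of the new repository. To rewrite several branches at once, list them in place of HEAD, as in the command above.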
So your algorithm should be: clone the repository to a separate directory, use git filter-branch to leave only the files under the subdirectory, and push the result to a new remote.
Update: The git-subtree module was so useful that the Git team pulled it into core and made it git subtree. See here: Detach (move) subdirectory into separate Git repository
git-subtree may be useful for this:
http://github.com/apenwarr/git-subtree/blob/master/git-subtree.txt (deprecated)
http://psionides.jogger.pl/2010/02/04/sharing-code-between-projects-with-git-subtree/
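As a rough illustration of the git subtree route, the sketch below splits the history of ABC/ onto its own branch and pulls that branch into a fresh repository. It assumes your Git installation ships the subtree command (some distributions package it separately); the demo layout comes from the question, and the branch name abc-only is arbitrary.

```shell
#!/bin/sh
# Sketch: extract ABC/ with git subtree split, using a throwaway demo repository.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the original repository.
git init -q XYZ
cd XYZ
mkdir XY1 ABC
echo one > XY1/a.txt
git add .
git commit -q -m "add XY1"
echo two > ABC/b.txt
git add .
git commit -q -m "add ABC"

# Put the history of ABC/ on its own branch; the original branches are not rewritten.
git subtree split --prefix=ABC -b abc-only > /dev/null

# Create the new repository and pull the split branch into it.
git init -q "$work/ABC"
cd "$work/ABC"
git pull -q "$work/XYZ" abc-only

split_count=$(git rev-list --count HEAD)
split_files=$(git ls-files)
echo "commits: $split_count, files: $split_files"
```

Unlike filter-branch, this leaves the original repository's existing branches as they were, so removing ABC/ from XYZ afterwards is a separate, ordinary commit.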