How to import existing Git repository into another

Posted 2019-01-01 12:12

Question:

I have a Git repository in a folder called XXX, and I have a second Git repository called YYY.

I want to import the XXX repository into the YYY repository as a subdirectory named ZZZ and add all of XXX's change history to YYY.

Folder structure before:

XXX
 |- .git
 |- (project files)
YYY
 |- .git
 |- (project files)

Folder structure after:

YYY
 |- .git  <-- This now contains the change history from XXX
 |-  ZZZ  <-- This was originally XXX
      |- (project files)
 |-  (project files)

Can this be done, or must I resort to using sub-modules?

Answer 1:

Probably the simplest way would be to pull the XXX stuff into a branch in YYY and then merge it into master:

In YYY:

git remote add other /path/to/XXX
git fetch other
git checkout -b ZZZ other/master
mkdir ZZZ
git mv stuff ZZZ/stuff                      # repeat as necessary for each file/dir
git commit -m "Moved stuff to ZZZ"
git checkout master
git merge ZZZ --allow-unrelated-histories   # should add ZZZ/ to master
git commit                                  # only needed if the merge stopped, e.g. on conflicts
git remote rm other
git branch -d ZZZ                           # to get rid of the extra branch before pushing
git push                                    # if you have a remote, that is

I actually just tried this with a couple of my repos and it works. Unlike Jörg's answer it won't let you continue to use the other repo, but I don't think you specified that anyway.

Note: Since this was originally written in 2009, git has added the subtree merge mentioned in the answer below. I would probably use that method today, although of course this method does still work.
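
The "repeat as necessary" step can be scripted. The following is a minimal sketch, not part of the original answer: it builds two throwaway repositories in a temporary directory (the file names, the demo identity and the commit messages are all assumptions made for illustration) and then runs the same sequence, using a loop over git ls-tree to move every top-level entry into ZZZ/. It assumes Git 2.9 or later and that XXX has no top-level entry already named ZZZ.

```shell
set -eu
# Throwaway identity so commits work without a global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build two unrelated demo repositories, XXX and YYY.
work=$(mktemp -d)
for name in XXX YYY; do
    git init -q "$work/$name"
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master   # force the branch name
    echo "content of $name" > "$work/$name/$name.txt"
    git -C "$work/$name" add .
    git -C "$work/$name" commit -q -m "Initial $name commit"
done

cd "$work/YYY"
git remote add other "$work/XXX"
git fetch -q other
git checkout -q -b ZZZ other/master
mkdir ZZZ
# Move every tracked top-level entry of XXX into ZZZ/.
git ls-tree --name-only HEAD | while IFS= read -r entry; do
    git mv "$entry" "ZZZ/$entry"
done
git commit -q -m "Moved everything to ZZZ"
git checkout -q master
git merge -q --allow-unrelated-histories -m "Merge XXX as ZZZ" ZZZ
git remote rm other
git branch -d ZZZ
```

Afterwards, git log in YYY should list the commits of both repositories plus the move and the merge commit.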



Answer 2:

If you want to retain the exact commit history of the second repository and therefore also retain the ability to easily merge upstream changes in the future then here is the method you want. It results in unmodified history of the subtree being imported into your repo plus one merge commit to move the merged repository to the subdirectory.

git remote add XXX_remote <path-or-url-to-XXX-repo>
git fetch XXX_remote
git merge -s ours --no-commit XXX_remote/master
git read-tree --prefix=ZZZ/ -u XXX_remote/master
git commit -m "Imported XXX as a subtree."

You can track upstream changes like so:

git pull -s subtree XXX_remote master

Git figures out on its own where the roots are before doing the merge, so you don't need to specify the prefix on subsequent merges.

Git 2.9+: the git merge command above requires the --allow-unrelated-histories option. Thanks @stuXnet!

The method in the other answer that uses read-tree and skips the merge -s ours step is effectively no different than copying the files with cp and committing the result.

The original source was GitHub's "Subtree Merge" help article.
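
To see the import and a later upstream update work together, here is a small sketch on two throwaway repositories. The repository contents, the demo identity and the commit messages are assumptions made for illustration, and the pull is run with --no-rebase and --no-edit only so that it runs unattended on recent Git versions. It assumes Git 2.9 or later.

```shell
set -eu
# Throwaway identity so commits work without a global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build two unrelated demo repositories, XXX and YYY.
work=$(mktemp -d)
for name in XXX YYY; do
    git init -q "$work/$name"
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master   # force the branch name
    echo "content of $name" > "$work/$name/$name.txt"
    git -C "$work/$name" add .
    git -C "$work/$name" commit -q -m "Initial $name commit"
done

# Import XXX into YYY under ZZZ/ with an "ours" merge plus read-tree.
cd "$work/YYY"
git remote add XXX_remote "$work/XXX"
git fetch -q XXX_remote
git merge -q -s ours --no-commit --allow-unrelated-histories XXX_remote/master
git read-tree --prefix=ZZZ/ -u XXX_remote/master
git commit -q -m "Imported XXX as a subtree."

# Simulate a later upstream change in XXX ...
echo "upstream change" >> "$work/XXX/XXX.txt"
git -C "$work/XXX" commit -q -am "Update XXX"

# ... and pull it in; the subtree strategy maps it onto ZZZ/.
git pull -q --no-rebase --no-edit -s subtree XXX_remote master
```

After the pull, the new line should appear in ZZZ/XXX.txt and no XXX.txt should exist at the top level of YYY.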



Answer 3:

git-subtree is a script designed for exactly this use case: merging multiple repositories into one while preserving history (and/or splitting out the history of subtrees, though that seems to be irrelevant to this question). It has been distributed as part of the Git source tree since release 1.7.11.

To merge a repository <repo> at revision <rev> as subdirectory <prefix>, use git subtree add as follows:

git subtree add -P <prefix> <repo> <rev>

git-subtree implements the subtree merge strategy in a more user-friendly manner.

For your case, inside repository YYY, you would run:

git subtree add -P ZZZ /path/to/XXX.git master


Answer 4:

There is a well-known instance of this in the Git repository itself, collectively known in the Git community as "the coolest merge ever" (after the subject line Linus Torvalds used in the e-mail to the Git mailing list describing this merge). In this case, the gitk Git GUI, which is now part of Git proper, used to be a separate project. Linus managed to merge that repository into the Git repository in a way that

  • it appears in the Git repository as if it had always been developed as part of Git,
  • all the history is kept intact and
  • it can still be developed independently in its old repository, with changes simply being git pulled.

The e-mail contains the steps needed to reproduce, but it is not for the faint of heart: first, Linus wrote Git, so he probably knows a bit more about it than you or me, and second, this was almost 5 years ago and Git has improved considerably since then, so maybe it is now much easier.

In particular, I guess nowadays one would use a gitk submodule, in that specific case.



Answer 5:

The simple way to do that is to use git format-patch.

Assume we have 2 git repositories foo and bar.

foo contains:

  • foo.txt
  • .git

bar contains:

  • bar.txt
  • .git

and we want to end up with foo containing the bar history and these files:

  • foo.txt
  • .git
  • foobar/bar.txt

So to do that:

 1. create a temporary directory, e.g. PATH_YOU_WANT/patch-bar
 2. go into the bar directory
 3. git format-patch --root HEAD --no-stat -o PATH_YOU_WANT/patch-bar --src-prefix=a/foobar/ --dst-prefix=b/foobar/
 4. go into the foo directory
 5. git am PATH_YOU_WANT/patch-bar/*

And if we want to rewrite all the commit messages from bar we can do, e.g. on Linux:

git filter-branch --msg-filter 'sed "1s/^/\[bar\] /"' COMMIT_SHA1_OF_THE_PARENT_OF_THE_FIRST_BAR_COMMIT..HEAD

This will add "[bar] " at the beginning of each commit message.
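
The five format-patch steps can be tried end to end on throwaway repositories. This is only a sketch: the temporary directory, the demo identity, the file contents and the commit messages are assumptions made for illustration, and a temporary directory stands in for PATH_YOU_WANT.

```shell
set -eu
# Throwaway identity so commits work without a global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build the two demo repositories, foo and bar.
work=$(mktemp -d)
for name in foo bar; do
    git init -q "$work/$name"
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master   # force the branch name
    echo "content of $name" > "$work/$name/$name.txt"
    git -C "$work/$name" add .
    git -C "$work/$name" commit -q -m "Initial $name commit"
done
# Give bar a second commit so there is some history to carry over.
echo "second line" >> "$work/bar/bar.txt"
git -C "$work/bar" commit -q -am "Extend bar.txt"

# Steps 1-3: export bar's whole history as patches whose paths start with foobar/.
mkdir "$work/patch-bar"
cd "$work/bar"
git format-patch -q --root HEAD --no-stat -o "$work/patch-bar" \
    --src-prefix=a/foobar/ --dst-prefix=b/foobar/

# Steps 4-5: replay the patches on top of foo.
cd "$work/foo"
git am -q "$work/patch-bar"/*
```

Note that git am creates new commits in foo (with the original authors and messages), so the commit hashes differ from those in bar.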



Answer 6:

Based on this article, using subtree is what worked for me and only applicable history was transferred. Posting here in case anyone needs the steps (make sure to replace the placeholders with values applicable to you):

In your source repository, split the subfolder into a new branch:

git subtree split --prefix=<source-path-to-merge> -b subtree-split-result

In your destination repo, merge in the split result branch:

git remote add merge-source-repo <path-to-your-source-repository>
git fetch merge-source-repo
git merge -s ours --no-commit merge-source-repo/subtree-split-result
git read-tree --prefix=<destination-path-to-merge-into> -u merge-source-repo/subtree-split-result

Verify your changes and commit:

git status
git commit

Don't forget to

Clean up by deleting the subtree-split-result branch

git branch -D subtree-split-result

Remove the remote you added to fetch the data from source repo

git remote rm merge-source-repo



Answer 7:

This function clones the remote repo into a directory of the local repo. After merging, all commits are preserved, and git log shows the original commits with the proper paths:

function git-add-repo
{
    repo="$1"
    dir="$(echo "$2" | sed 's/\/$//')"
    path="$(pwd)"

    tmp="$(mktemp -d)"
    remote="$(echo "$tmp" | sed 's/\///g' | sed 's/\./_/g')"

    git clone "$repo" "$tmp"
    cd "$tmp"

    # Rewrite every commit so that all paths live under $dir/
    git filter-branch --index-filter '
        git ls-files -s |
        sed "s,\t,&'"$dir"'/," |
        GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info &&
        mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
    ' HEAD

    cd "$path"
    git remote add -f "$remote" "file://$tmp/.git"   # -f fetches right away
    git merge --allow-unrelated-histories -m "Merge repo $repo into master" --edit "$remote/master"
    git remote remove "$remote"
    rm -rf "$tmp"
}

How to use:

cd current/package
git-add-repo https://github.com/example/example dir/to/save

With a few small changes you can even move files/dirs of the merged repo to different paths, for example:

repo="https://github.com/example/example"
path="$(pwd)"

tmp="$(mktemp -d)"
remote="$(echo "$tmp" | sed 's/\///g' | sed 's/\./_/g')"

git clone "$repo" "$tmp"
cd "$tmp"

GIT_ADD_STORED=""

function git-mv-store
{
    from="$(echo "$1" | sed 's/\./\\./')"
    to="$(echo "$2" | sed 's/\./\\./')"

    GIT_ADD_STORED+='s,\t'"$from"',\t'"$to"',;'
}

# NOTE: these paths are only examples; use your own instead.
git-mv-store 'public/index.php' 'public/admin.php'
git-mv-store 'public/data' 'public/x/_data'
git-mv-store 'public/.htaccess' '.htaccess'
git-mv-store 'core/config' 'config/config'
git-mv-store 'core/defines.php' 'defines/defines.php'
git-mv-store 'README.md' 'doc/README.md'
git-mv-store '.gitignore' 'unneeded/.gitignore'

git filter-branch --index-filter '
    git ls-files -s |
    sed "'"$GIT_ADD_STORED"'" |
    GIT_INDEX_FILE="$GIT_INDEX_FILE.new" git update-index --index-info &&
    mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"
' HEAD

GIT_ADD_STORED=""

cd "$path"
git remote add -f "$remote" "file://$tmp/.git"   # -f fetches right away
git merge --allow-unrelated-histories -m "Merge repo $repo into master" --edit "$remote/master"
git remote remove "$remote"
rm -rf "$tmp"

Notes
Paths are rewritten with sed, so check after merging that the files ended up in the right places.
The --allow-unrelated-histories parameter exists only in Git 2.9 and later.



Answer 8:

Adding another answer as I think this is a bit simpler: pull repo_dest into repo_to_import, then run git push --set-upstream url:repo_dest master.

This method has worked for me importing several smaller repos into a bigger one.

How to import: repo1_to_import to repo_dest

# check out your repo1_to_import if you don't have it already
git clone url:repo1_to_import repo1_to_import
cd repo1_to_import

# now pull all of repo_dest (on Git 2.9+ add --allow-unrelated-histories)
git pull url:repo_dest
ls
git status # shows Your branch is ahead of 'origin/master' by xx commits.
# now push to repo_dest
git push --set-upstream url:repo_dest master

# repeat for other repositories you want to import

Rename or move files and dirs into the desired position in the original repo before you do the import, e.g.:

cd repo1_to_import
mkdir topDir
git add topDir
git mv this that and the other topDir/
git commit -m "move things into topDir in preparation for exporting into new repo"
# now do the pull and push to import

The method described at the following link inspired this answer; I liked it because it seemed simpler:

https://help.github.com/articles/importing-an-external-git-repository

BUT beware! There be dragons! git push --mirror url:repo_dest pushes your local repo history and state to the remote (url:repo_dest), BUT it deletes the old history and state of the remote. Fun ensues! :-E



Answer 9:

I wanted to import only some files from the other repository (XXX) in my case. The subtree approach was too complicated for me and the other solutions didn't work. This is what I did:

ALL_COMMITS=$(git log --reverse --pretty=format:%H -- ZZZ | tr '\n' ' ')

Run in the source repository (XXX), this gives you a space-separated list of all the commits that affect the files I wanted to import (ZZZ), oldest first (you might have to add --follow to capture renames as well). I then went into the target repository (YYY), added the other repository (XXX) as a remote, did a fetch from it and finally:

git cherry-pick $ALL_COMMITS

which adds all the commits to your branch. You'll thus have all the files with their history and can do whatever you want with them, as if they had always been in this repository.
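
The whole cherry-pick sequence can be sketched on throwaway repositories as follows. The repository contents, the demo identity, the remote name XXX_remote and the commit messages are assumptions made for illustration; XXX gets two commits that touch ZZZ/ and one that does not, so only two commits should be carried over.

```shell
set -eu
# Throwaway identity so commits work without a global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Build two unrelated demo repositories, XXX and YYY.
work=$(mktemp -d)
for name in XXX YYY; do
    git init -q "$work/$name"
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master   # force the branch name
    echo "content of $name" > "$work/$name/$name.txt"
    git -C "$work/$name" add .
    git -C "$work/$name" commit -q -m "Initial $name commit"
done

# In XXX: two commits that touch ZZZ/ and one that does not.
mkdir "$work/XXX/ZZZ"
echo one > "$work/XXX/ZZZ/a.txt"
git -C "$work/XXX" add ZZZ
git -C "$work/XXX" commit -q -m "Add ZZZ/a.txt"
echo "unrelated edit" >> "$work/XXX/XXX.txt"
git -C "$work/XXX" commit -q -am "Touch a file outside ZZZ"
echo two >> "$work/XXX/ZZZ/a.txt"
git -C "$work/XXX" commit -q -am "Extend ZZZ/a.txt"

# Collect the commits that affect ZZZ, oldest first.
ALL_COMMITS=$(git -C "$work/XXX" log --reverse --pretty=format:%H -- ZZZ | tr '\n' ' ')

# In YYY: fetch XXX and replay only those commits.
cd "$work/YYY"
git remote add XXX_remote "$work/XXX"
git fetch -q XXX_remote
git cherry-pick $ALL_COMMITS   # unquoted on purpose: one argument per commit
```

Keep in mind that a picked commit brings along all of its changes, so a commit that touches both ZZZ and other files would also import those other changes.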



Answer 10:

I was in a situation where I was looking for -s theirs but of course, this strategy doesn't exist. My history was that I had forked a project on GitHub, and now for some reason, my local master could not be merged with upstream/master although I had made no local changes to this branch. (Really don't know what happened there -- I guess upstream had done some dirty pushes behind the scenes, maybe?)

What I ended up doing was

# as per https://help.github.com/articles/syncing-a-fork/
git fetch upstream
git checkout master
git merge upstream/master
....
# Lots of conflicts, ended up just abandoning this approach
git reset --hard   # Ditch failed merge
git checkout upstream/master
# Now in detached state
git branch -d master # !
git checkout -b master   # create new master from upstream/master

So now my master is again in sync with upstream/master (and you could repeat the above for any other branch you also want to sync similarly).



Answer 11:

See the basic example in this article and consider the following mapping between repositories:

  • A <-> YYY,
  • B <-> XXX

After completing all the steps described in that chapter (after merging), remove the B-master branch:

$ git branch -d B-master

Then, push changes.

It works for me.



Answer 12:

I can suggest another solution to your problem (an alternative to git submodules): the gil (git links) tool.

It lets you describe and manage complex Git repository dependencies.

Also it provides a solution to the git recursive submodules dependency problem.

Suppose you have the following project dependencies: [image: sample git repository dependency graph]

Then you can define a .gitlinks file describing the relations between the repositories:

# Projects
CppBenchmark CppBenchmark https://github.com/chronoxor/CppBenchmark.git master
CppCommon CppCommon https://github.com/chronoxor/CppCommon.git master
CppLogging CppLogging https://github.com/chronoxor/CppLogging.git master

# Modules
Catch2 modules/Catch2 https://github.com/catchorg/Catch2.git master
cpp-optparse modules/cpp-optparse https://github.com/weisslj/cpp-optparse.git master
fmt modules/fmt https://github.com/fmtlib/fmt.git master
HdrHistogram modules/HdrHistogram https://github.com/HdrHistogram/HdrHistogram_c.git master
zlib modules/zlib https://github.com/madler/zlib.git master

# Scripts
build scripts/build https://github.com/chronoxor/CppBuildScripts.git master
cmake scripts/cmake https://github.com/chronoxor/CppCMakeScripts.git master

Each line describes a git link in the following format:

  1. Unique name of the repository
  2. Relative path of the repository (starting from the path of the .gitlinks file)
  3. Git repository which will be used in the git clone command
  4. Repository branch to check out

Empty lines and lines starting with # are not parsed (they are treated as comments).

Finally you have to update your root sample repository:

# Clone and link all git links dependencies from .gitlinks file
gil clone
gil link

# The same result with a single command
gil update

As a result, you'll clone all required projects and link them to each other in the proper way.

If you want to commit all changes in some repository with all changes in child linked repositories you can do it with a single command:

gil commit -a -m "Some big update"

The pull and push commands work in a similar way:

gil pull
gil push

The gil (git links) tool supports the following commands:

usage: gil command arguments
Supported commands:
    help - show this help
    context - command will show the current git link context of the current directory
    clone - clone all repositories that are missed in the current context
    link - link all repositories that are missed in the current context
    update - clone and link in a single operation
    pull - pull all repositories in the current directory
    push - push all repositories in the current directory
    commit - commit all repositories in the current directory

More about git recursive submodules dependency problem.



Answer 13:

I don't know of an easy way to do that. You COULD do this:

  1. Use git filter-branch to add a ZZZ super-directory on the XXX repository
  2. Push the new branch to the YYY repository
  3. Merge the pushed branch with YYY's trunk.

I can edit with details if that sounds appealing.
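
One possible way to flesh out these three steps is sketched below; it is not the answerer's own script. The throwaway repositories, the demo identity, the scratch branch name zzz-import and the commit messages are assumptions made for illustration. It uses a --tree-filter for readability, assumes Git 2.9 or later, and assumes XXX has no top-level entry already named ZZZ.

```shell
set -eu
# Throwaway identity so commits work without a global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause on newer Git

# Build two unrelated demo repositories, XXX and YYY.
work=$(mktemp -d)
for name in XXX YYY; do
    git init -q "$work/$name"
    git -C "$work/$name" symbolic-ref HEAD refs/heads/master   # force the branch name
    echo "content of $name" > "$work/$name/$name.txt"
    git -C "$work/$name" add .
    git -C "$work/$name" commit -q -m "Initial $name commit"
done

# 1. Rewrite XXX's history on a scratch branch so every file lives under ZZZ/.
cd "$work/XXX"
git checkout -q -b zzz-import
git filter-branch --tree-filter '
    mkdir -p ZZZ
    git ls-tree --name-only "$GIT_COMMIT" |
    while IFS= read -r entry; do mv "$entry" "ZZZ/$entry"; done
' zzz-import

# 2. Push the rewritten branch to the YYY repository.
git push -q "$work/YYY" zzz-import

# 3. Merge the pushed branch into YYY's main line, then drop the scratch branch.
cd "$work/YYY"
git merge -q --allow-unrelated-histories -m "Import XXX under ZZZ/" zzz-import
git branch -d zzz-import
```

Because filter-branch rewrites every commit, the imported commits get new hashes; XXX's own master branch is left untouched since the rewrite happens on the scratch branch.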



Answer 14:

I think you can do this using 'git mv' and 'git pull'.

I'm a fair git noob - so be careful with your main repository - but I just tried this in a temp dir and it seems to work.

First - rename the structure of XXX to match how you want it to look when it's within YYY:

cd XXX
mkdir tmp
git mv ZZZ tmp/ZZZ
git mv tmp ZZZ
git commit -m "Move ZZZ into place"   # git pull only transfers committed changes

Now XXX looks like this:

XXX
 |- ZZZ
     |- ZZZ

Now use 'git pull' to fetch the changes across:

cd ../YYY
git pull ../XXX

Now YYY looks like this:

YYY
 |- ZZZ
     |- ZZZ
 |- (other folders that already were in YYY)