
Can git-svn be used on large, branched repositories?

Posted 2019-03-17 13:43

Question:

I am trying to use Git as a frontend to an SVN repository so that I can use Git's nice features, such as simple branching and stashing.

The problem is that the SVN repository is quite large (8,000 revs) and contains lots of branches and tags (old as well as new).

It's a near-standard layout, with a config containing fetch, branches and tags directives.

Since the oldest branch and tag refer to revision 10, every git svn fetch reads the entire repository history from revision 10 onwards, which can take hours on the slow connection.

If I only track trunk, then it's fine, but I still want to make git aware of new branches and tags.

I usually look at git log -1 on the branch I'm on and get the SVN revision from the commit message, so I can do git svn fetch -r7915:HEAD or similar. I guess that's what git svn fetch --parent does. But why do I need to do this?
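The manual lookup described above can be scripted. The following is a minimal sketch, assuming the commit was created by git-svn and therefore carries a git-svn-id line of the form "git-svn-id: URL@REVISION UUID". The function name svn_rev_from_message is made up for illustration and is not part of git-svn.

```shell
# Hypothetical helper: read a commit message on stdin and print the SVN
# revision recorded in its git-svn-id line ("git-svn-id: URL@REV UUID").
# Leading spaces are tolerated because plain `git log` indents the message.
svn_rev_from_message() {
  sed -n 's/^ *git-svn-id: .*@\([0-9][0-9]*\) .*$/\1/p' | tail -n 1
}

# Intended use inside a git-svn clone (not executed here):
#   rev=$(git log -1 --format=%B | svn_rev_from_message)
#   git svn fetch -r "$rev:HEAD"
```

If the commit has no git-svn-id line (for example, a local commit that has not been dcommitted yet), the function prints nothing, so the caller should check for an empty result before fetching.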

I'm on Windows, and use TortoiseGit which has quite nice support for git-svn, but since TortoiseGit only runs git svn fetch I'm kind of stuck.

Am I doing something wrong? I expect git svn fetch to be a fast operation once the initial git svn clone -s is complete.

Answer 1:

Thanks for the answers. They did not really help me, though.

This command is the best solution so far:

git svn log --all -1 | \
  sed -n '2s/r\([0-9]*\).*/\1/p' | \
  xargs --replace=from git svn fetch -r from:HEAD

It uses git svn log --all to find the highest SVN revision number fetched so far, and fetches everything from that point onwards.
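For shells where GNU xargs --replace is not available, the same idea can be split into a small parsing step and an explicit fetch. This is only a sketch under the assumption that git svn log prints svn-style header lines such as "r7915 | author | date | 1 line"; the function name latest_rev_from_svn_log is hypothetical.

```shell
# Hypothetical helper: read `git svn log` output on stdin and print the
# highest revision number found in its "rNNNN | author | date | ..." headers.
latest_rev_from_svn_log() {
  sed -n 's/^r\([0-9][0-9]*\) |.*/\1/p' | sort -n | tail -n 1
}

# Intended use inside a git-svn clone (not executed here):
#   from=$(git svn log --all -1 | latest_rev_from_svn_log)
#   [ -n "$from" ] && git svn fetch -r "$from:HEAD"
```

A commit message line that itself starts with something like "r123 |" would be picked up as well, so this is a convenience rather than a robust parser.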

I wish git svn fetch had an option to behave like this. Unless the SVN revisions have changed, there is no reason for git svn to fetch the same revisions over and over again.



Answer 2:

If you do not need to have full history in the git repository, I recommend you take a look at the "git + svn" approach, detailed in the link below, instead of the standard git-svn integration. Your initial import into git should be very quick, since you will not be importing history.

Make sure to read the section entitled "Benefits, Drawbacks, and Lessons Learned".

http://www.lostechies.com/blogs/derickbailey/archive/2010/02/03/branch-per-feature-how-i-manage-subversion-with-git-branches.aspx



Answer 3:

You're using it correctly: the initial import of a Subversion repository with lots of history will be very slow.

The bad news is that, because Subversion's branches and tags are only directories, git-svn is forced to take the pessimistic route of reading each branch from its head all the way back to the first revision. If you've been disciplined in your use of Subversion, this does mean fetching the same data many times, but real-world usage patterns make such a clean history unlikely.

Start the clone in the evening and come back to a nice git repo the next morning!

Once you've cloned, git svn fetch even warns you:

This may take a while on large repositories

Subversion is simple and stupid, so git has to take things slowly.



Answer 4:

Do you have symlinks in the SVN repo? If not, have you tried this setting:

svn.brokenSymlinkWorkaround

This disables potentially expensive checks to workaround broken symlinks checked into SVN by broken clients. Set this option to "false" if you track a SVN repository with many empty blobs that are not symlinks. This option may be changed while git svn is running and take effect on the next revision fetched. If unset, git svn assumes this option to be "true".
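As a sketch of how the setting described above might be applied, assuming the commands are run inside the existing git-svn clone and that the repository really contains no symlinks:

```shell
# Turn off the broken-symlink checks for this clone only (stored in .git/config).
git config svn.brokenSymlinkWorkaround false

# Read the value back to confirm it was stored.
git config --get svn.brokenSymlinkWorkaround
```

Whether this shortens the fetch noticeably depends on how many empty blobs the repository contains, so treat it as something to try rather than a guaranteed fix.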