How can I recover after a checksum mismatch with &

Posted 2020-06-17 14:50

Question:

I'm cloning an SVN repository to git as part of our migration plan. I've hit various snags along the way, forcing me to continue the clone with a git svn fetch command. The most recent failure I can't figure out how to solve:

$ git svn fetch
Checksum mismatch: dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t 8ce3aea3f47dc115e8fe53bd62d0f074cfe93ec6
expected: 59de969022e46135fa6dc7599fc2f3b4
     got: 4334926a01c905cdb7fce71265e370c1

I found this related answer, but that solution doesn't work here: git svn log is not yet functional because the repo is not fully in place:

$ git svn log dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
log --no-color --first-parent --pretty=medium HEAD: command returned error: 128

How can I proceed?

Answer 1:

I know this is old, but maybe it will be helpful for future reference, as none of the search results on this are helpful.

I've hit a similar issue on our huge repository, which takes days to clone, and unfortunately at one point I had to restart my machine. I am currently working out how to resolve the problem, so please keep in mind this is more a suggestion than a tested solution.

I think you need to try creating a branch and checking out the commits you already have from the previous fetch:

git checkout -b master git-svn

Once that is done you should have a working tree up to that commit. Further fetches will probably fail due to the object mismatch, but at that point it should at least be possible to use "git svn reset" to revert the faulty svn fetches (see the OP's related answer link). If so, find the offending commit, reset to before it, and then continue fetching.

You might want to rebase and revert to the state before that broken commit on your master branch, or convert back to a bare repository if that's what you're after (in my case it is).

Hope this works. I'll post an update when my checkout is done (it will take at least a few hours... sigh).

Edit: That seemed to work. I successfully discarded some git-svn commits and was able to re-fetch them. :)

Edit2: Make sure to keep resetting until git svn fetch no longer prints any object mismatch warnings (otherwise you will soon run into the same issue).
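
The sequence described above could be sketched as a small shell function. This is only a sketch under assumptions: git-svn is the ref name used in the command above and may differ in your clone (git branch -a shows the actual name), the revision number passed in is a placeholder for whichever SVN revision triggers the mismatch, and the run helper, the recover_from_mismatch name and the DRY_RUN switch are illustrative additions so the commands can be previewed before they touch anything.

```shell
#!/usr/bin/env bash
# Sketch: rewind git-svn's fetch state to before a bad revision, then re-fetch.
# Assumptions (not from the answer): run, recover_from_mismatch and DRY_RUN
# are hypothetical helper names introduced for illustration.

run() {
  # Print the command when DRY_RUN=1, otherwise execute it.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

recover_from_mismatch() {
  local bad_rev="$1"          # SVN revision that triggers the mismatch
  local ref="${2:-git-svn}"   # ref holding what has been fetched so far

  # Give git-svn a HEAD to work with, based on the commits already fetched.
  run git checkout -b master "$ref" || return 1
  # Discard the bad revision and everything after it; -p keeps its parent.
  run git svn reset -r "$bad_rev" -p || return 1
  # Fetch again from the rewound position.
  run git svn fetch
}
```

Calling it as DRY_RUN=1 recover_from_mismatch 1234 only prints the three commands. If the fetch still reports a mismatch, the same function can be called again with an earlier revision.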

Cheers,

Henryk



Answer 2:

Another answer to an old question, but straightforward solutions are tough to find for this problem, so hopefully this helps others.

I think this issue is caused by a file being corrupted during transfer. I'm not sure how or why it happens, but in my case I get the same error at a different revision every time I do a new clone, and sometimes not at all.

Using the questioner's error message:

$ git svn fetch
Checksum mismatch: dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t 8ce3aea3f47dc115e8fe53bd62d0f074cfe93ec6
expected: 59de969022e46135fa6dc7599fc2f3b4
     got: 4334926a01c905cdb7fce71265e370c1

The following steps allowed me to resume and make progress:

  1. View all branches (these will all be remote branches): git branch -a
  2. Check out the affected branch: git checkout remotes/origin/trunk-4632-jh

    This will take some time to complete.

  3. Find the last revision in which the problematic file was changed: git svn log dc-smtpd/lib/Qpsmtpd/Address.pm.t

    Note the highest revision #.

  4. Reset back to this revision: git svn reset -r (rev #) -p

    The -p option discards the given revision as well and keeps its nearest parent.

  5. Carry on: git svn fetch
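
Steps 3 to 5 can be strung together in a short script. The sketch below assumes that git svn log prints svn-style header lines such as "r1234 | author | date | 1 line"; highest_rev and resume_after_mismatch are hypothetical helper names, not part of git-svn.

```shell
#!/usr/bin/env bash
# Sketch: find the newest revision that touched a file, reset to its parent,
# and resume fetching.
# Assumption: log headers look like "r1234 | author | date | 1 line".

highest_rev() {
  # Read log text on stdin and print the largest revision number found.
  sed -n 's/^r\([0-9][0-9]*\) |.*/\1/p' | sort -n | tail -n 1
}

resume_after_mismatch() {
  # Usage: resume_after_mismatch PATH_RELATIVE_TO_CHECKED_OUT_BRANCH
  local path="$1" rev
  rev="$(git svn log "$path" | highest_rev)"
  if [ -z "$rev" ]; then
    echo "no revision found for $path" >&2
    return 1
  fi
  # -p also discards $rev itself, so the next fetch retrieves it again.
  git svn reset -r "$rev" -p && git svn fetch
}
```

With the affected branch checked out, this would be run as resume_after_mismatch dc-smtpd/lib/Qpsmtpd/Address.pm.t.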

Good luck.



Answer 3:

See also: Git svn rebase : checksum mismatch

In our case, additional server-side processing of the files (server-side includes in Apache) caused the checksum problem.

We disabled SSI in Apache's /etc/httpd.conf file for the duration of the migration by commenting out these directives:

 AddType text/html .shtml
 AddOutputFilter INCLUDES .shtml

That solved the problem. The front-end Apache server had been interpreting the .shtml files, which produced new content (and thus a new hash) that no longer matched the hash of the original file.
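
As an illustration, the two directives could be commented out with sed instead of by hand. This is a sketch that assumes each directive sits on its own line, spelled as above; comment_out_ssi is a hypothetical helper name. It writes the result to stdout and leaves the original file alone, so the output can be reviewed before it replaces anything.

```shell
#!/usr/bin/env bash
# Sketch: print an Apache config with the two SSI directives commented out.
# Assumption: the directives appear one per line, spelled as in the answer.

comment_out_ssi() {
  # $1 = path to the config file; the file itself is not modified.
  sed -E 's/^([[:space:]]*)(AddType[[:space:]]+text\/html[[:space:]]+\.shtml|AddOutputFilter[[:space:]]+INCLUDES[[:space:]]+\.shtml)[[:space:]]*$/\1# \2/' "$1"
}
```

For example, comment_out_ssi /etc/httpd.conf > httpd.conf.migration writes a modified copy. Apache has to be restarted or reloaded with the changed config before the fetch is retried, and the directives should be restored once the migration is finished.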



Answer 4:

That means some files in the repository got corrupted. This can have various causes, such as software bugs or bit rot on the drives. I was recently migrating a very old ~10GB svn repository to git, so some corruption was expected.

To fix the corruption, you basically need to dump the entire repository and import it into a new one while filtering the errors out. Note that our goal is to complete the import process no matter why or how the repository got corrupted. You cannot simply fix the corruption without having a backup and diffing through the revision files.

The first, basic one-off command you could use is:

svnadmin create repo2
svnadmin dump repo | sed '/^Text-content-md5/d' | svnadmin load repo2

This strips the stored MD5 checksum lines from the dump stream, so svnadmin load has nothing to verify the file contents against and the new repo records freshly computed checksums.

If you encounter more errors during the dump and load (which is expected), try an incremental approach so that you can continue from the point where you left off. The command below dumps revisions 101 to 150 (inclusive).

svnadmin dump --incremental -r101:150 repo | sed '/^Text-content-md5/d' | svnadmin load repo2
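
To avoid typing each range by hand, the incremental approach could be wrapped in a loop. This is a sketch, not a tested migration script: rev_ranges and migrate_in_chunks are hypothetical helper names, and the chunk size is arbitrary. Incremental dumps have to be loaded in order, so the first range should start right after the last revision already present in repo2.

```shell
#!/usr/bin/env bash
# Sketch: split a revision span into fixed-size chunks for incremental dump/load.
# Assumption: rev_ranges and migrate_in_chunks are illustrative helper names.

rev_ranges() {
  # Usage: rev_ranges FIRST LAST STEP -> prints inclusive "from:to" lines.
  local from="$1" last="$2" step="$3" to
  while [ "$from" -le "$last" ]; do
    to=$((from + step - 1))
    if [ "$to" -gt "$last" ]; then
      to="$last"
    fi
    printf '%s:%s\n' "$from" "$to"
    from=$((to + 1))
  done
}

migrate_in_chunks() {
  # Usage: migrate_in_chunks SRC_REPO DST_REPO FIRST LAST STEP
  local src="$1" dst="$2" range
  for range in $(rev_ranges "$3" "$4" "$5"); do
    echo "loading r$range" >&2
    # Only the exit status of svnadmin load is checked here.
    if ! svnadmin dump --incremental -r"$range" "$src" \
        | sed '/^Text-content-md5/d' \
        | svnadmin load "$dst"; then
      echo "stopped at r$range" >&2
      return 1
    fi
  done
}
```

For instance, migrate_in_chunks repo repo2 101 250 50 would process 101:150, 151:200 and 201:250 in turn and stop at the first range that fails to load, which is the point to continue from after fixing the error.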

Some common errors and solutions:

  • 'Premature end of content data in dumpstream': The Content-length of some file does not match the version stored in the repository, so some of that file's data has been lost. The file must be skipped. Add | svndumpfilter exclude path/to/file.jar to the pipeline, like this:

    svnadmin dump --incremental -r101:150 repo | svndumpfilter exclude path/to/file.jar | sed '/^Text-content-md5/d' | svnadmin load repo2
    
  • Property errors: Add --bypass-prop-validation to the svnadmin load command.
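
Both workarounds can be combined into one helper that loads a single revision range. This is a sketch under assumptions: load_range is a hypothetical name, and it always passes --bypass-prop-validation, which you may prefer to add only after a property error actually shows up.

```shell
#!/usr/bin/env bash
# Sketch: one dump/load step with optional path exclusions.
# Assumption: load_range is an illustrative helper name.

load_range() {
  # Usage: load_range SRC_REPO DST_REPO FROM:TO [PATH_TO_EXCLUDE...]
  local src="$1" dst="$2" range="$3"
  shift 3
  if [ "$#" -gt 0 ]; then
    # Drop the unreadable paths from the stream before loading.
    svnadmin dump --incremental -r"$range" "$src" \
      | svndumpfilter exclude "$@" \
      | sed '/^Text-content-md5/d' \
      | svnadmin load --bypass-prop-validation "$dst"
  else
    svnadmin dump --incremental -r"$range" "$src" \
      | sed '/^Text-content-md5/d' \
      | svnadmin load --bypass-prop-validation "$dst"
  fi
}
```

A call such as load_range repo repo2 101:150 path/to/file.jar retries the failing range with the damaged file left out. Further paths can be appended as more 'Premature end' errors appear.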

After populating your second repo, simply run svnserve -d -r repo2 and try git svn fetch again.

Good luck!