Testing what is about to be committed in a pre-com

Posted 2019-03-18 14:12

Question:

The internet is absolutely littered with incorrect and non-ideal answers to this question. This is unfortunate because you would think this would be a common thing you would want to do.

The problem: When a pre-commit hook runs, the repository might not be clean. So if you naively run your tests, they will not be against what you're committing, but whatever dirt happens to be in your working tree.
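To make the problem concrete, here is a minimal sketch in a throwaway repository. The file name app.txt and its contents are made up for illustration. It stages one version of a file and then dirties the working tree, so anything that reads the file from disk sees something other than what would be committed.

```shell
#!/bin/sh
# Illustration only: throwaway repo, hypothetical file name and contents.
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

echo "staged version" > app.txt
git add app.txt                      # this is what the commit would contain
echo "unstaged edit" > app.txt       # dirt left behind in the working tree

on_disk=$(cat app.txt)               # what a naive test run would read
in_index=$(git show :app.txt)        # what is actually about to be committed

echo "working tree: $on_disk"
echo "index:        $in_index"
```

A test suite started from the hook reads the file on disk, so it would exercise the unstaged edit rather than the staged version.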

The obvious thing to do is to run git stash --keep-index --include-untracked at the start of the pre-commit hook and git stash pop on exit. That way you are testing against the (pure) index, which is what we want.

Unfortunately, this generates merge conflict markers if you use git add --patch (especially if you edit hunks), since the contents of stash@{0} might not match the work tree after the commit.

Another common solution is to clone the repository and run the tests in a new temporary one. There are two issues with that. First, we haven't committed yet, so we can't easily get a copy of the repository in the state we're about to commit (I'm sure there is a way of doing that, but the second issue makes it moot). Second, my tests might be sensitive to the location of the current working directory, for example because of local environment configuration.

So: How can I restore my work-tree to whatever state it was in before the git stash --keep-index --include-untracked, without introducing merge conflict markers, and without modifying the post-commit HEAD?

Answer 1:

git write-tree is useful in pre-commit hooks. It writes the current contents of the index into the repository as a tree object and prints its ID (the same tree will be reused if and when the commit is finalised).

Once the tree is written to the repo, you can use git archive | tar -x to write the tree to a temporary directory.

E.g.:

#!/bin/bash

TMPDIR=$(mktemp -d)
TREE=$(git write-tree)
git archive "$TREE" | tar -x -C "$TMPDIR"

# Run tests in $TMPDIR (the test command must be the last one before RESULT=$?)

RESULT=$?
rm -rf "$TMPDIR"
exit $RESULT
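As a quick check that this exports the index rather than the working tree, here is a small sketch in a throwaway repository. The file names and contents are hypothetical. It stages one version of a file, leaves an unstaged edit and an untracked file lying around, and then looks at what git write-tree plus git archive actually put in the export directory.

```shell
#!/bin/sh
# Illustration only: throwaway repo, hypothetical file names and contents.
set -e
repo=$(mktemp -d)
export_dir=$(mktemp -d)
trap 'rm -rf "$repo" "$export_dir"' EXIT
cd "$repo"
git init -q .

echo "staged version" > app.txt
git add app.txt
echo "unstaged edit" > app.txt       # not staged, so it should not be exported
echo "scratch" > notes.txt           # untracked, so it should not be exported

tree=$(git write-tree)               # snapshot of the index as a tree object
git archive "$tree" | tar -xf - -C "$export_dir"

exported=$(cat "$export_dir/app.txt")
echo "exported app.txt: $exported"
ls "$export_dir"
```

Only the staged content should end up in the export directory. The working tree and the index are left untouched, so there is nothing to restore afterwards.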


Answer 2:

If cloning the entire repo is too expensive, perhaps you just need a copy of the working directory. Making a copy would be simpler than trying to deal with conflicts. For example:

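#!/bin/sh -e

trap 'rm -rf "$TMPD"' 0
mkdir "${TMPD=$PWD/.tmpdir}"
# Recreate every blob recorded in HEAD under $TMPD
git ls-tree -r HEAD | while read -r mod type sha name; do
    if test "$type" = blob; then
        mkdir -p "$TMPD/$( dirname "$name" )"
        git show "$sha" > "$TMPD/$name"
        # ls-tree prints modes such as 100644; chmod wants only the 644 part
        chmod "${mod#100}" "$TMPD/$name"
    fi
done
cd "$TMPD"
# Apply the staged changes on top of the HEAD snapshot
git diff --cached HEAD | patch -p1
# Run tests here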

This will dump the state of the tree as it will be after the commit in $TMPD, so you can run your tests there. You should get a temporary directory in a more secure fashion than is done here, but in order for the final diff to work (or to simplify the script and cd earlier), it must be a child of the working directory.



Answer 3:

If you can afford to make a complete copy of the current checkout, you can use a temporary directory like so:

tmpdir=$(mktemp -d) # Or put it wherever you like
git archive HEAD | tar -xf - -C "$tmpdir"
git diff --staged | patch -p1 -d "$tmpdir"
cd "$tmpdir"
...

This is basically William Pursell's solution, but it takes advantage of git archive, which makes the code simpler and which I expect to be faster.

Alternatively, by cd'ing first:

cd somewhere
git -C path/to/repo archive HEAD | tar -xf -
git -C path/to/repo diff --staged | patch -p1
...

git -C requires Git 1.8.5 or later.
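To see that "HEAD plus the staged diff" really matches what is about to be committed, here is a small sketch in a throwaway repository. The file names, contents and the demo identity passed to git commit are all made up. It stages a modification and a new file, leaves an extra unstaged edit in the working tree, and then rebuilds the staged state in a temporary directory.

```shell
#!/bin/sh
# Illustration only: throwaway repo, hypothetical file names and contents.
set -e
repo=$(mktemp -d)
tmpdir=$(mktemp -d)
trap 'rm -rf "$repo" "$tmpdir"' EXIT
cd "$repo"
git init -q .

echo "v1" > app.txt
git add app.txt
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "initial"

echo "v2" > app.txt
echo "new" > extra.txt
git add app.txt extra.txt            # staged: one modification, one new file
echo "v3 (unstaged)" > app.txt       # dirt that should not reach $tmpdir

git archive HEAD | tar -xf - -C "$tmpdir"      # start from the HEAD snapshot
git diff --staged | patch -p1 -d "$tmpdir"     # replay only the staged changes

app=$(cat "$tmpdir/app.txt")
extra=$(cat "$tmpdir/extra.txt")
echo "app.txt: $app, extra.txt: $extra"
```

One caveat with this approach: by default git diff describes changes to binary files only as a "Binary files differ" line, which patch cannot apply, so staged binary changes would not be reproduced in the copy.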



Answer 4:

I have found the following to be useful:

## bash
# Staged additions, copies and modifications
declare -a files
readarray -t files < <(git status --porcelain | perl -ane 'print $F[1],qq(\n) if m/^[ACM] /')
# Staged deletions
declare -a delfiles
readarray -t delfiles < <(git status --porcelain | perl -ane 'print $F[1],qq(\n) if m/^D /')
# Untracked files (porcelain marks them with "??")
declare -a huhfiles
readarray -t huhfiles < <(git status --porcelain | perl -ane 'print $F[1],qq(\n) if m/^\?\? /')

It may be inefficient to call git status three times, but this code is less complex than calling once, storing in memory and looping over the results. And I don't think putting the results to a temp file and reading it off the disk three times would be faster. Maybe. I don't know. This was the first pass. Feel free to critique.
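For comparison, the "call once and loop over the results" variant mentioned above could look roughly like the sketch below. This is not the code from this answer: it is written in plain sh and collects newline-separated strings instead of bash arrays, and the throwaway repository, file names and demo identity are made up for illustration. Like the original, it assumes paths without spaces or other characters that git status would quote.

```shell
#!/bin/sh
# Illustration only: throwaway repo, hypothetical file names.
set -e
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q .

echo a > kept.txt
echo b > gone.txt
git add kept.txt gone.txt
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "initial"

echo changed > kept.txt
git add kept.txt                     # staged modification: "M  kept.txt"
git rm -q gone.txt                   # staged deletion:     "D  gone.txt"
echo c > stray.txt                   # untracked file:      "?? stray.txt"

nl='
'
files='' delfiles='' huhfiles=''
status=$(git status --porcelain)     # a single call, results kept in memory
while IFS= read -r line; do
    path=${line#???}                 # drop the two status columns and the space
    case $line in
        [ACM]" "*) files="$files$path$nl" ;;
        "D "*)     delfiles="$delfiles$path$nl" ;;
        "?? "*)    huhfiles="$huhfiles$path$nl" ;;
    esac
done <<EOF
$status
EOF

printf 'changed:\n%sdeleted:\n%suntracked:\n%s' "$files" "$delfiles" "$huhfiles"
```

Whether this is noticeably faster than three separate git status calls probably depends on the size of the repository; the main difference is that all three lists come from the same snapshot of the status output.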



Answer 5:

I have finally found the solution I was looking for. Only the state of the index before the commit is tested, and the index and working tree are left exactly as they were before the commit.

If you see any problems or a better way, please do reply, either as a comment or your own answer.

This assumes that nothing else will try to stash or otherwise modify the git repository or working tree whilst it is running. This comes with no warranty, might be wrong and throw your code into the wind. USE WITH CAUTION.

# pre-commit.sh
REPO_PATH=$PWD
git stash save -q --keep-index --include-untracked # (stash@{1})
git stash save -q                                  # (stash@{0})

# Our state at this point:
# * clean worktree
# * stash@{0} contains what is to be committed
# * stash@{1} contains everything, including dirt

# Now reintroduce the changes to be committed so that they can be tested
git stash apply stash@{0} -q

git_unstash() {
    G="git --work-tree \"$REPO_PATH\" --git-dir \"$REPO_PATH/.git\"" 
    eval "$G" reset -q --hard             # Clean worktree again
    eval "$G" stash pop -q stash@{1}      # Put worktree to original dirty state
    eval "$G" reset -q stash@{0} .        # Restore index, ready for commit
    eval "$G" stash drop -q stash@{0}     # Clean up final remaining stash
}
trap git_unstash EXIT

# ... tests against what is being committed go here ...