Git 'pre-receive' hook and 'git-clang-

2020-07-05 05:37发布

问题:

Let's immediately start with a scrap of the pre-receive hook that I've already written:

#!/bin/sh
##
  format_bold='\033[1m'
   format_red='\033[31m'
format_yellow='\033[33m'
format_normal='\033[0m'
##
  format_error="${format_bold}${format_red}%s${format_normal}"
format_warning="${format_bold}${format_yellow}%s${format_normal}"
##
stdout() {
  format="${1}"
  shift
  printf "${format}" "${@}"
}
##
stderr() {
  stdout "${@}" 1>&2
}
##
output() {
  format="${1}"
  shift
  stdout "${format}\n" "${@}"
}
##
error() {
  format="${1}"
  shift
  stderr "${format_error}: ${format}\n" 'error' "${@}"
}
##
warning() {
  format="${1}"
  shift
  stdout "${format_warning}: ${format}\n" 'warning' "${@}"
}
##
die() {
  error "${@}"
  exit 1
}
##
git() {
  command git --no-pager "${@}"
}
##
list() {
  git rev-list "${@}"
}
##
clang_format() {
  git clang-format --style='file' "${@}"
}
##
while read sha1_old sha1_new ref; do
  case "${ref}" in
  refs/heads/*)
    branch="$(expr "${ref}" : 'refs/heads/\(.*\)')"
    if [ "$(expr "${sha1_new}" : '0*$')" -ne 0 ]; then # delete
      unset sha1_new
      # ...
    else # update
      if [ "$(expr "${sha1_old}" : '0*$')" -ne 0 ]; then # create
        unset sha1_old
        sha1_range="${sha1_new}"
      else
        sha1_range="${sha1_old}..${sha1_new}"
        # ...
        fi
      fi
      # ...
             GIT_WORK_TREE="$(mktemp --tmpdir -d 'gitXXXXXX')"
      export GIT_WORK_TREE
             GIT_DIR="${GIT_WORK_TREE}/.git"
      export GIT_DIR
      mkdir -p "${GIT_DIR}"
      cp -a * "${GIT_DIR}/"
      ln -s "${PWD}/../.clang-format" "${GIT_WORK_TREE}/"
      error=
      for sha1 in $(list "${sha1_range}"); do
        git checkout --force "${sha1}" > '/dev/null' 2>&1
        if [ "$(list --count "${sha1}")" -eq 1 ]; then
          # What should I put here?
        else
          git reset --soft 'HEAD~1' > '/dev/null' 2>&1
        fi
        diff="$(clang_format --diff)"
        if [ "${diff%% *}" = 'diff' ]; then
          error=1
          error '%s: %s\n%s'                                                   \
                'Code style issues detected'                                   \
                "${sha1}"                                                      \
                "${diff}"                                                      \
                1>&2
        fi
      done
      if [ -n "${error}" ]; then
        die '%s' 'Code style issues detected'
      fi
    fi
    ;;
  refs/tags/*)
    tag="$(expr "${ref}" : 'refs/tags/\(.*\)')"
    # ...
    ;;
  *)
    # ...
    ;;
  esac
done
exit 0

NOTE:
Places with irrelevant code are stubbed with # ....

NOTE:
If you are not familiar with git-clang-format, take a look here.

That hook works as expected, and so far, I didn't notice any bugs, but if you spot any problem or have an improvement suggestion, I'd appreciate any report. Probably, I should give a comment on what's the intention behind this hook. Well, it does check every pushed revision for compliance with code style conventions using git-clang-format, and if any of them does not comply, it will output the relevant diff (the one telling developers what should be fixed) for each of them. Basically, I have two in-depth questions regarding this hook.

First, notice that I perform copy of the remote's (server) bare repository to some temporary directory and check out the code for analysis there. Let me explain the intention of this. Note that I do several git checkouts and git resets (due to for loop) in order to analyze all of the pushed revisions individually with git-clang-format. What I am trying to avoid here, is the (possible) concurrency issue on push access to the remote's (server) bare repository. That is, I'm under impression that if multiple developers will try to push at the same time to a remote with this pre-receive hook installed, that might cause problems if each of these push "sessions" does not do git checkouts and git resets with its private copy of the repository. So, to put it simple, does git-daemon have built-in lock management for concurrent push "sessions"? Will it execute the corresponding pre-receive hook instances strictly sequentially or there is a possibility of interleaving (which can potentially cause undefined behavior)? Something tells me that there should be a built-in solution for this problem with concrete guarantees, otherwise how would remotes work in general (even without complex hooks) being subjected to concurrent pushes? If there is such a built-in solution, then the copy is redundant and simply reusing the bare repository would actually speed up the processing. By the way, any reference to official documentation regarding this question is very welcome.

Second, git-clang-format processes only staged (but not committed) changes vs. specific commit (HEAD by default). Thus, you can easily see where a corner case lies. Yes, it's with the root commits (revisions). In fact, git reset --soft 'HEAD~1' cannot be applied to root commits as they have no parents to reset to. Hence, the following check with my second question is there:

        if [ "$(list --count "${sha1}")" -eq 1 ]; then
          # What should I put here?
        else
          git reset --soft 'HEAD~1' > '/dev/null' 2>&1
        fi

I've tried git update-ref -d 'HEAD' but this breaks the repository in such a way that git-clang-format is not able to process it anymore. I believe this is related to the fact that all of these pushed revisions that are being analyzed (including this root one) do not really belong to any branch yet. That is, they are in detached HEAD state. It would be perfect to find a solution to this corner case as well, so that initial commits can also undergo the same check by git-clang-format for compliance with code style conventions.

Peace.

回答1:

NOTE:
For those looking for an up-to-date, (more or less) comprehensive, and well-tested solution, I host the corresponding public repository [1]. Currently, the two important hooks relying on git-clang-format are implemented: pre-commit and pre-receive. Ideally, you get the most automation and fool-proof workflow when using both of them simultaneously. As usual, improvement suggestions are very welcome.

NOTE:
Currently, the pre-commit hook [1] requires the git-clang-format.diff patch (authored by me as well) [1] to be applied to git-clang-format. The motivation and use case examples for this patch are summarized in the official patch review submission to LLVM/Clang [2]. Hopefully, it will be accepted and merged upstream soon.


I've managed to implement a solution for the second question. I have to admit that it was not easy to find due to scarce Git documentation and absence of examples. Let's take a look at the corresponding code changes first:

# ...
clang_format() {
  git clang-format --commit="${commit}" --style='file' "${@}"
}
# ...
      for sha1 in $(list "${sha1_range}"); do
        git checkout --force "${sha1}" > '/dev/null' 2>&1
        if [ "$(list --count "${sha1}")" -eq 1 ]; then
          commit='4b825dc642cb6eb9a060e54bf8d69288fbee4904'
        else
          commit='HEAD~1'
        fi
        diff="$(clang_format --diff)"
        # ...
      done
      # ...

As you can see, instead of repeatedly doing git reset --soft 'HEAD~1', I now explicitly instruct git-clang-format to operate against HEAD~1 with the --commit option (whereas its default is HEAD that was implied in the initial version presented in my question). However, that still does not solve the problem on its own because when we would hit root commit this would again result in error as HEAD~1 would not refer to a valid revision anymore (similarly to how it would not be possible to do git reset --soft 'HEAD~1'). That's why for this particular case, I instruct git-clang-format to operate against the (magic) 4b825dc642cb6eb9a060e54bf8d69288fbee4904 hash [3, 4, 5, 6]. To learn more about this hash, consult the references, but, in brief, it refers to the Git empty tree object — the one that has nothing staged or committed, which is exactly what we need git-clang-format to operate against in our case.

NOTE:
You don't have to remember 4b825dc642cb6eb9a060e54bf8d69288fbee4904 by heart and it's better not to hard code it (just in case this magic hash ever changes in future). It turns out that it can always be retrieved with git hash-object -t tree '/dev/null' [5, 6]. Thus, in my final version of the above pre-receive hook, I have commit="$(git hash-object -t tree '/dev/null')" instead.

P.S. I'm still looking for a good quality answer on my first question. By the way, I asked these questions on the official Git mailing list and received no answers so far, what a shame...



回答2:

Condensed

I had a little bit of trouble understanding the first example, in part due to the length and extra tidbits that make it useful for the OP's specific use case. I combed through and condensed it down to this:

ref_name=$1
new_rev=$3

# only check branches, not tags or bare commits
if [ -z $(echo $ref_name | grep "refs/heads/") ]; then
  exit 0
fi

# don't check empty branches
if [ "$(expr "${new_rev}" : '0*$')" -ne 0 ]; then
  exit 0
fi

# Checkout a copy of the branch (but also changes HEAD)
my_work_tree=$(mktemp -d -t git-work-tree.XXXXXXXX) 2>/dev/null
git --work-tree="${my_work_tree}" --git-dir="." checkout $new_rev -f >/dev/null

# Do the formatter check
echo "Checking code formatting..."
pushd ${my_work_tree} >/dev/null
prettier './**/*.{js,css,html,json,md}' --list-different
my_status=$?
popd >/dev/null

# reset HEAD to master, and cleanup
git --work-tree="${my_work_tree}" --git-dir="." checkout master -f >/dev/null
rm -rf "${my_work_tree}"

# handle error, if any
if [ "0" != "$my_status" ]; then
  echo "Please format the files listed above and re-commit."
  echo "(and don't forget your .prettierrc, if you have one)"
  exit 1
fi

This example is using Prettier, but it'll map pretty well to clang-format, eslint, etc. There are a few limitations to the (perhaps oversimplified, but working) example above. I'd recommend diving deeper...

Better, but longer

Once you've grok'd that I'd also recommend taking a scroll down towards the bottom of this one:

  • Reject Ugly Commits with Server-Side Git Hooks