View commits on a new branch in the update hook

Posted 2020-03-30 04:56

I wrote a server-side update hook that checks every commit message for an Issue Id.

Here is an extract of my Python code (update.py):

[...]
if newrev == "0000000000000000000000000000000000000000":
  newrev_type = "delete"
elif oldrev == "0000000000000000000000000000000000000000":  
  newrev_type = "create"

  # HERE IS MY QUESTION, I want to get the commits SHA-1 :-)

else:
  POPnewrev_type = os.popen("git cat-file -t " + newrev)
  newrev_type = POPnewrev_type.read()[0:-1]
  # get the SHA-1
  POPanalyzelog = os.popen("git log " + oldrev + ".." + newrev + " --pretty=#%H")
  analyzelog = POPanalyzelog.read().split('#')
[...]

So, when newrev_type = "delete", the user wants to delete a branch => no problem.
When the user pushes to an existing branch, we get the SHA-1 of each commit => OK.
But when the user creates a branch, I don't know how to get the SHA-1s...

Do you have any ideas?

1 answer
一夜七次
Reply #2 · 2020-03-30 05:23

Before I answer, let's note some reminders. There are several "stumbling blocks" that get people when they write hooks. You have hit my "third" in the list below.

In both pre-receive and update, you are given three arguments (in different orders and through different methods, arguments vs stdin; but the same three arguments, in the end, with the same "deal" as it were). Two are old and new sha1 and the third is the reference name. Let's call them oldrev and newrev (as you did) and the third refname.

When you finish your script, a return value of 0 allows git to update refname, and a nonzero return forbids it. That is, the script is called with a proposal: "I (the git update operation now running) propose to make a change in some label(s)". For the update hook you get each label individually, and each return value allows or disallows one change; for the pre-receive hook, you get them all in a batch, one per line, on standard input, and your return value allows or disallows the change as a whole. (If you reject the change in pre-receive, no updates will happen. After pre-receive OKs them, or is absent, update gets a chance one at a time.)
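
As a rough illustration of that calling convention, here is a minimal Python sketch of an update hook's entry point. The names parse_update_args, check_update and main are made up for this example, and check_update is only a placeholder for whatever policy you want to enforce.

```python
import sys

def parse_update_args(argv):
    """Return (refname, oldrev, newrev) from an update hook's argv.

    Git runs the update hook as: update <refname> <oldrev> <newrev>.
    """
    if len(argv) != 4:
        raise SystemExit("usage: update <refname> <oldrev> <newrev>")
    _, refname, oldrev, newrev = argv
    return refname, oldrev, newrev

def check_update(refname, oldrev, newrev):
    """Hypothetical placeholder policy: return True to allow the change."""
    return True

def main(argv):
    refname, oldrev, newrev = parse_update_args(argv)
    # 0 lets git update refname; any nonzero value rejects this one ref.
    return 0 if check_update(refname, oldrev, newrev) else 1

# In the real hook script you would finish with: sys.exit(main(sys.argv))
```

A pre-receive hook would use the same check, but it reads the (oldrev, newrev, refname) triples from standard input and returns a single status for the whole push.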

If the refname starts with "refs/heads/", it is a branch name. Other possibilities include "refs/tags/" and "refs/notes/" although note references are relatively new. Most refnames will point to commit objects, except that tags often (but not always) point to annotated-tag objects.

So here's the first stumbling block: the refname might not be a branch. Make sure that it's OK to apply your logic to tags (and maybe notes), or handle them separately (whichever is appropriate).
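
One way to guard against this, sketched in Python: classify the refname by its prefix before applying any branch-specific logic. The function name ref_kind and the returned labels are invented for this example.

```python
def ref_kind(refname):
    """Classify a full refname by its namespace prefix.

    The labels returned here are illustrative, not git terminology.
    """
    for prefix, kind in (("refs/heads/", "branch"),
                         ("refs/tags/", "tag"),
                         ("refs/notes/", "note")):
        if refname.startswith(prefix):
            return kind
    return "other"
```

The hook can then skip the commit-message check, or apply a different one, whenever ref_kind(refname) is not "branch".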

If the old and new sha1 are both "non-null" (not "0" * 40), the proposal is to move the label. It used to name oldrev and now it will (if you allow it) name newrev.

Here's the second stumbling block: when a label moves, there's no guarantee that the old revision and new revision are related at all. Watch out for "nonsense" results from oldrev..newrev, which occur in that case. You may (or may not, depending on what you're doing) want to verify that oldrev is an ancestor of newrev. (See git merge-base --is-ancestor.)
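
A minimal Python sketch of that ancestry check follows. The helper name is_ancestor and its call parameter are assumptions made for this example; the parameter exists only so the command runner can be replaced when testing.

```python
import subprocess

def is_ancestor(oldrev, newrev, call=subprocess.call):
    """True if oldrev is an ancestor of newrev (the move is a fast-forward).

    git merge-base --is-ancestor exits with 0 for "yes" and 1 for "no".
    Any other status signals an error and is treated as "no" here.
    """
    argv = ["git", "merge-base", "--is-ancestor", oldrev, newrev]
    return call(argv) == 0
```

If this returns False, oldrev..newrev still lists commits, but they are not simply "what was added on top of oldrev", so a hook may want to reject the push or treat it like a newly created label.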

When the new sha1 is null, the proposal is to remove the label, which is pretty straightforward (everyone seems to get this right instinctively :-) ).

When the old sha1 is null, the proposal is to set a new label. Here's the third stumbling block: That label did not exist before. That tells you nothing about which commit(s), if any, you want to consider to be "part of" the new label. Labels only name one commit and it's up to someone interpreting them, at some future point, what that label "means".

As an example, suppose I have a copy of your repo (I did a git clone earlier) and am allowed to git push back to it. I decide: gosh, rev 1234567 should have a tag, and rev 5555555 should have a branch label:

git tag new-tag 1234567
git branch new-branch 5555555
git push --tags origin refs/heads/new-branch:refs/heads/new-branch

If 1234567 refers to a commit object, I have created a new lightweight tag pointing to that; if it's an annotated tag, I've made a name (probably "another" name) for the annotated tag.

Assuming 5555555 refers to a commit object, I have in fact created a new branch, but what is its "history"? In this case, it probably has none at all, I probably just added the label "in the middle of" some existing branch. (But maybe not: maybe I added it where my master now points, and I am going to rewind master back to origin/master in a moment, after my push finishes.)

The most common answer seems to be "the new branch names any commit starting from newrev but not already named by, or through the parents of, any other branch-name". There is a way to find a list of such commits. In sh form (see notes below):

git rev-list $newrev --not \
    $(git for-each-ref refs/heads/ --format='%(refname)')

In this case, since you're in a pre-receive or update hook, the new refname has not actually been born yet, so it should not be necessary to exclude it, but a comment on this answer suggests that sometimes it might, in which case (again in sh):

git rev-list $newrev --not \
    $(git for-each-ref refs/heads/ --format='%(refname)' |
    grep -v "^$refname\$")

would do the trick. But there's another potential stumbling block here, which you can't do anything about in an update hook: if a push is creating more than one branch, the resulting list could depend on the multiple new branch names and/or the order of their creation. In a pre-receive hook, which sees the whole push at once and can still reject it, you can find all new branch creations, and:

  • reject if there is more than one, or
  • add more --not arguments to git rev-list as needed.

If you do the latter, beware of the case of creating two or more new branch labels at the same revision: they'll each refer to all of the others' commits.
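
For a Python hook, the rev-list invocation above might be sketched as follows. The names git_lines and new_branch_commits, and the extra_exclude and run parameters, are assumptions made for this example; extra_exclude stands for the additional --not revisions mentioned in the list, such as the tips of other branches created by the same push.

```python
import subprocess

def git_lines(argv, run=subprocess.check_output):
    """Run a git command without a shell and return its output lines."""
    return run(argv).decode().splitlines()

def new_branch_commits(newrev, refname, extra_exclude=(),
                       run=subprocess.check_output):
    """List commits reachable from newrev but from no existing branch.

    refname is the branch being created. It is filtered out of the
    exclusion list in case it already shows up among the heads.
    """
    heads = git_lines(
        ["git", "for-each-ref", "--format=%(refname)", "refs/heads/"], run)
    exclude = [h for h in heads if h != refname] + list(extra_exclude)
    # "--not" negates every revision that follows it on the command line.
    return git_lines(["git", "rev-list", newrev, "--not"] + exclude, run)
```

The run parameter is there only so the git calls can be replaced by a stub; in a hook you would leave it at its default.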

A final stumbling block (rarely hit): in the pre-receive and post-receive hooks, the input stream listing revisions and reference names comes from a pipe, and can only be read once. If you want to read it multiple times, you must save it to a temporary file (so that you can seek back to offset 0, or close and re-open it).
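
In Python, a simple alternative to a temporary file is to read the stream once into a list and loop over the list as often as needed; this assumes the input fits comfortably in memory. The function names below are invented for this sketch.

```python
import sys

def read_updates(stream):
    """Read every "<oldrev> <newrev> <refname>" line from stream, once."""
    updates = []
    for line in stream:
        # Refnames cannot contain spaces, so a plain split() is enough.
        oldrev, newrev, refname = line.split()
        updates.append((oldrev, newrev, refname))
    return updates

def new_branches(updates):
    """Pick out the proposals that create a branch (old hash all zeros)."""
    return [u for u in updates
            if set(u[0]) == {"0"} and u[2].startswith("refs/heads/")]

# In a real hook: updates = read_updates(sys.stdin)
```

With the list in hand, a pre-receive hook can first count new_branches(updates) and then make a second pass to check each proposal.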

A few final notes:

  • I'd recommend doing:

    NULL_SHA1 = "0" * 40

    earlier in the python code, and then using rev == NULL_SHA1 as the test. If nothing else, it makes it easy to see that there are exactly forty 0s, and that the point is to check for a "null sha1".

    Git is moving toward SHA-256 object names, now that SHA-1 has been broken by example. (This is not fatal to Git, but shows that compute power has advanced to the point that it's perhaps unwise to keep depending on it.) In a SHA-256 repository the null hash is 64 zeros rather than 40, so you might want to match against any number of 0s as long as they are all zeros, using:

    re.match('0+$', hash)
    

    (or re.search('^0+$', ...) if you prefer re.search for some reason). You can pre-compile this as nullhash = re.compile('^0+$') and then use nullhash.match or nullhash.search (as before, the prefix hat is only required if you are using the general search rather than the left-anchored match).

  • Use subprocess.Popen with shell=False for a little bit more efficiency (save firing up "sh") and safety (not a problem with refnames, see git check-ref-format, but just a general rule).

  • Use git rev-list directly, rather than log with format %H (and study the manual page for git rev-list closely; it's highly relevant to most hooks).

  • Leave in the refs/heads/ and/or refs/tags/ prefixes: git rev-list is happy with these prefixes, and they serve to make sure that you get the right reference. For instance, if there are both a tag and a branch named master, which one do you get? (You get the tag—but why not use the full name, and not have to remember that?)
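
The first three notes could be combined roughly as follows. This is only a sketch: the names NULL_HASH, is_null, rev_list_argv and commits_between are invented for the example, and error handling is kept to a minimum.

```python
import re
import subprocess

# match() is anchored at the start already, so no leading "^" is needed.
NULL_HASH = re.compile(r"0+$")

def is_null(rev):
    """True if rev consists only of zeros, whatever the hash length."""
    return NULL_HASH.match(rev) is not None

def rev_list_argv(oldrev, newrev):
    """argv that lists the commits in oldrev..newrev, one hash per line."""
    return ["git", "rev-list", oldrev + ".." + newrev]

def commits_between(oldrev, newrev):
    """Run git rev-list without a shell and return the commit hashes."""
    proc = subprocess.Popen(rev_list_argv(oldrev, newrev),
                            stdout=subprocess.PIPE, shell=False)
    out, _ = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError("git rev-list failed")
    return out.decode().split()
```

In the question's code, is_null(newrev) and is_null(oldrev) would replace the two comparisons against the forty-zero string, and commits_between(oldrev, newrev) would replace the os.popen call to git log.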
