Switching remote to a specific branch

Posted 2019-08-04 09:40

Question:

So I understand the value of branching, but one thing is confusing me. I have my local repo. I push changes to a remote, which has a post receive hook set up to write files on a website. So I create a branch (new-branch) to try out something. I edit files, commit, and push to the remote. Great. Problem is, while the repo on my remote is up to date, I guess it's still set to master, as the changes in new-branch aren't reflected on the site. How do I set the remote to a particular branch so that branch is driving the post receive hook, and not just defaulting to the master branch? I would just merge on master but I'm not ready to as I'm still working out the kinks on the branch.

Answer 1:

OK, based on comments, it sounds like you control the server too, which is needed here. (That is, where I write "they" below, it's just you yourself, but wearing a different hat, as it were.)

Let's start with some background: git push and "bare" repositories

From git's point of view, you work on your repo, and they (whoever they are) work on their repo. So when you do a git push, which delivers commits you've made directly to their repo, they're probably working on their repo. They probably have some branch checked out and are in the middle of editing their copy of xyz.html or whatever.

If your push overwrote their xyz.html, wouldn't that annoy them? They're in the middle of editing! So, for a conventional repository, git will reject an attempt to push to whatever branch is currently checked out (the one git status reports as "On branch ...", or the one shown by git symbolic-ref --short HEAD). [1] You can only push to other branches, which, by definition, they're not working on, so they won't get annoyed.

Now, in fact, a lot of these centralized "push here" repositories don't have anyone working on them. They're often set up as "bare" repositories, which means they have no work-tree at all. This in turn means the "they"—whoever "they" may be—cannot possibly be working on their copy of xyz.html, since they don't have a working copy of xyz.html in the first place. Cloning with --bare, or converting a regular repository to a bare one, disables the "can't push to current branch" check, which is what they want here.


[1] In particular, you get a "refusing to update checked out branch" error. This is actually configurable, via receive.denyCurrentBranch, in the (non-bare) repository on the server.
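
To make the distinction concrete, here is a small sketch that creates each kind of repository and inspects the setting named in the footnote. The temporary directory and the names site.git and work are placeholders of mine, not anything from the question.

```shell
#!/bin/sh
# Sketch only: the paths and repository names are made up for illustration.
tmp=$(mktemp -d)

# A bare repository has no work-tree, so nobody can be "in the middle of editing".
git init -q --bare "$tmp/site.git"
git -C "$tmp/site.git" rev-parse --is-bare-repository      # prints "true"

# A conventional repository has a work-tree. receive.denyCurrentBranch decides
# what happens when someone pushes to its checked-out branch; "refuse" is the
# behavior described above, set explicitly here so it can be read back.
git init -q "$tmp/work"
git -C "$tmp/work" config receive.denyCurrentBranch refuse
git -C "$tmp/work" config --get receive.denyCurrentBranch  # prints "refuse"
```

On a real server the bare repository would live wherever the remote URL points; the hook files discussed below go in its hooks directory.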


Auto-deploy

So far so good; but now they may want their central push-server bare repo to also automatically deploy to a web site (or, for code repositories, to a test system like Jenkins, or whatever)—the key point is that "they", whoever they are again, have a post-receive hook that deploys one particular branch. But we just described the bare repository as having no work-tree. How does this auto-deploy work? Let's put on our other hat and become "them".

Because this is git, there are actually lots of ways, but the usual (and maybe-best) method is with a post-receive script that checks which branch(es) are being updated, and if the interesting one(s) is/are pushed-to, does the deployment.

Here's an example of a ridiculously simple post-receive hook shell script that has an unimplemented deploy shell function to deploy pushes to master:

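#!/bin/sh

# Placeholder: a real deployment would go here.
deploy()
{
    local newref=$1 branch=$2

    echo "deploy invoked: $newref $branch"
}

# git feeds one "oldsha newsha refname" line per updated reference on stdin.
while read oldsha newsha refname; do
    case "$refname" in
    refs/heads/master) deploy "$newsha" "${refname#refs/heads/}";;
    esac
done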

Git runs the post-receive hook with its stdin being fed a series of input lines. Each line has three items. The last one is the full name of the reference, so branch master is refs/heads/master. The first two are the "old" SHA-1 and the "new" SHA-1 respectively, for that reference.

At most one of the old and new SHA-1 values can be all zeros (40 "0" characters): this means the reference is being created (old SHA-1 is all zeros) or deleted (new SHA-1 is all zeros). Otherwise, the reference currently exists and is simply being updated (pointed to a new commit, normally): it used to point to the old ID and now points to the new.
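
The three cases can be told apart with a couple of string comparisons. The following is a minimal sketch, assuming a SHA-1 repository (40-character IDs); the function name ref_action and the sample input line are mine.

```shell
#!/bin/sh
# Sketch: classify one hook input line by its old and new SHA-1.
# ref_action is an illustrative name, not a git command.
NULL_SHA1=0000000000000000000000000000000000000000   # 40 zeros

ref_action() {
    # $1 = old SHA-1, $2 = new SHA-1
    if [ "$1" = "$NULL_SHA1" ]; then
        echo create
    elif [ "$2" = "$NULL_SHA1" ]; then
        echo delete
    else
        echo update
    fi
}

# Example: one made-up line in the "old new refname" format git supplies.
printf '%s %s %s\n' "$NULL_SHA1" 1111111111111111111111111111111111111111 refs/heads/new-branch |
while read oldsha newsha refname; do
    echo "$refname: $(ref_action "$oldsha" "$newsha")"
done
# prints "refs/heads/new-branch: create"
```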

(In a pre-receive hook, which gets exactly the same stuff, you can reject the attempt to update the reference. In a post-receive hook the update has already happened and the only thing you can do is report on it or make use of it somehow.)
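
As an illustration of that difference, here is a sketch of a pre-receive check that refuses to delete master. Protecting master this way is a made-up policy, and check_updates is a name I chose; the only git behavior relied on is that a non-zero exit status from pre-receive rejects the push.

```shell
#!/bin/sh
# Sketch of a pre-receive policy: refuse to delete master.
# The policy and the function name are illustrative assumptions.
NULL_SHA1=0000000000000000000000000000000000000000   # 40 zeros

check_updates() {
    # Reads "oldsha newsha refname" lines on stdin.
    # Returns non-zero as soon as one update violates the policy.
    while read oldsha newsha refname; do
        if [ "$refname" = refs/heads/master ] && [ "$newsha" = "$NULL_SHA1" ]; then
            echo "rejected: deleting master is not allowed" >&2
            return 1
        fi
    done
    return 0
}

# In an actual hooks/pre-receive file the last line would simply be:
#   check_updates
# so that the function's status becomes the hook's exit status.
```

The same loop in a post-receive hook could only report the deletion, since by then it has already happened.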

In our case, we really don't care about the old value. It doesn't matter if branch master never existed before, or what was deployed then. (Well, we might want it, in which case we could add it: this is just a shell script, after all.) We don't even really need the new value, because we can read that right out of the repository with a git command, but it's nice to have it, especially if we want to check and complain if the branch has been deleted. (The branch is probably deleted if the attempt to read the master branch reference fails, but that could happen if the server has caught fire and the repository is half-destroyed, too. Although in this case we might not care any more. :-) )

Mainly, though, what we need to do is check out all the files from branch master, sticking them into a deployment area. As it turns out, we can do this with a simple git checkout, even in a bare repository, by specifying an alternate work-tree:

NULL_SHA1=0000000000000000000000000000000000000000 # 40 0s

deploy() {
    local newref=$1 branch=$2

    if [ "$newref" = "$NULL_SHA1" ]; then
        echo "deploy: branch $branch is deleted!"
        return 1
    fi
    # next bit is stupid, hardcodes "master" even though
    # we have "$branch", but read on...
    git --work-tree=/deploy/master checkout master
}

(Side note: sometimes you will see this as GIT_WORK_TREE=/deploy/master git checkout master or similar. That does exactly the same thing. If you don't specify --work-tree git uses $GIT_WORK_TREE if it's set.)

If we want more than just one (master) auto-deploy branch, we can add more branch names to the set that invoke deploy in our script, and fix the dumb bit: check out $branch to /deploy/$branch, for instance.
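
For the situation in the question, the branch selection might look roughly like this sketch. The branch list (master plus new-branch), the DEPLOY_ROOT variable and the function names are my assumptions, and deploy is left as a stub so that only the routing is shown.

```shell
#!/bin/sh
# Sketch: route more than one branch to deploy. The branch list and
# DEPLOY_ROOT are assumptions made for illustration.
DEPLOY_ROOT=/deploy

deploy() {
    # Stub: $1 = new SHA-1, $2 = branch name. The real checkout into
    # "$DEPLOY_ROOT/$2" would replace this echo.
    echo "deploy $2 ($1) into $DEPLOY_ROOT/$2"
}

route_pushes() {
    # Reads "oldsha newsha refname" lines, as post-receive receives them.
    while read oldsha newsha refname; do
        case "$refname" in
        refs/heads/master|refs/heads/new-branch)
            deploy "$newsha" "${refname#refs/heads/}" ;;
        esac
    done
}

# Example input: two made-up updates; only the first names a deployable branch.
printf '%s\n' \
    'aaaa bbbb refs/heads/new-branch' \
    'cccc dddd refs/heads/scratch' | route_pushes
# prints "deploy new-branch (bbbb) into /deploy/new-branch"
```

Listing the branches explicitly, rather than deploying every pushed branch, keeps scratch branches from creating deployment directories by accident.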

There are a number of potential glitches here though:

  1. The target directory (here /deploy/master) must exist.
  2. This kind of git checkout updates git's notion of "current branch". A bare repository still has such a thing: there's still a file named HEAD that contains the current branch name. If we git checkout otherbranch we'll change it; and this affects git clone operations that clone from this bare repository.
  3. Git will make sure we don't clobber modified files. Depending on the deployment directory and what gets done there, this may not be an issue.
  4. Git likes to optimize checkouts.

Item 1 is easy enough to fix by first doing a mkdir -p.

Item 2 is not a problem if we only deploy master, or if we don't mind new clones potentially checking out branch otherbranch automatically. But we can fix it by using a different form of git checkout. We can add -f as well to fix item 3.

git --work-tree=/deploy/$branch checkout -f $branch -- .

This all introduces a new problem, though, which goes along with item 4.

Item 4 is the trickiest. When you do a checkout without the -- . at the end, so that you're switching branches, git updates its index (staging area) file to keep track of what's already in the work tree. Then when you do a new git checkout that replaces the previous one, it can tell which files can be left alone, which ones must be removed if any, and which ones must be rewritten or added. Some deployment scripts (that check out one branch, like master) depend on this behavior.

If we switch the deployment code to use the -- . form, git still updates the index, but it won't remove files for us. This means we need to clean out files that went away. (It also winds up making a very messy index, but usually that's OK: this is a --bare repository that no one works in anyway.)

Cleaning out just the right files is tricky. We can use the $oldsha method, comparing the old and new commits to determine what files to remove. Or we can simply blow away the /deploy/$branch directory entirely:

# ${branch:?} aborts if $branch is empty, so we never rm -rf /deploy itself
rm -rf "/deploy/${branch:?}"
mkdir -p "/deploy/$branch"
git --work-tree="/deploy/$branch" checkout -f "$branch" -- .

This is often sufficient. It still has a bug: there's a short period when the old deployed version is being removed, and the new deployed version is being built up, when things are a mess in the deploy directory. But sometimes that's OK.
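
For comparison, the first option (comparing the old and new commits) could be sketched as below. The function names are mine, and the sketch assumes ordinary file names: git quotes unusual ones in this output unless -z is used, and the comparison does not work when the old SHA-1 is all zeros, that is, when the branch was just created.

```shell
#!/bin/sh
# Sketch: remove from the deployment directory the files that were deleted
# between two commits. Function names are illustrative.

remove_listed() {
    # $1 = deployment directory; stdin = one repository-relative path per line.
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            rm -f -- "$1/$path"
        fi
    done
}

prune_deleted() {
    # $1 = old SHA-1, $2 = new SHA-1, $3 = deployment directory.
    # --diff-filter=D limits the listing to paths deleted in the new commit.
    git diff --name-only --diff-filter=D "$1" "$2" | remove_listed "$3"
}
```

A hook would call something like prune_deleted "$oldsha" "$newsha" "/deploy/$branch" after the checkout. This leaves the directory in place, though directories that become empty are not removed, and like the wipe-and-recreate approach it has a moment when the deployed tree is partly old and partly new.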

We can mostly fix the bug by switching the order around: make a new empty directory, populate it, then mv the new directory into place (moving the old one out of the way first), and only then rm -rf the moved-aside old one. There's still a tiny window during which the deploy directory does not exist, but it's as small as possible. (Well, there's a potential to close it entirely with symlinks.)
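
A rough sketch of that reordering follows. The names swap_into_place and deploy_branch, and the .new and .old suffixes, are my own choices; /deploy is the root used in the examples above.

```shell
#!/bin/sh
# Sketch: populate a fresh directory, then swap it into place.
# Function names and the ".new"/".old" suffixes are illustrative.

swap_into_place() {
    # $1 = freshly populated directory, $2 = live deployment directory
    new=$1 live=$2 old=$2.old
    rm -rf "$old"
    if [ -e "$live" ]; then
        mv "$live" "$old"      # from here the live directory does not exist...
    fi
    mv "$new" "$live"          # ...until this rename completes
    rm -rf "$old"
}

deploy_branch() {
    # $1 = branch name. Builds /deploy/<branch>.new, then swaps it in
    # only if the checkout succeeded.
    target=/deploy/$1
    rm -rf "$target.new"
    mkdir -p "$target.new"
    git --work-tree="$target.new" checkout -f "$1" -- . &&
        swap_into_place "$target.new" "$target"
}
```

Because the swap only happens after a successful checkout, a failed checkout leaves the previously deployed version untouched.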

There's one last trick we can do, which I have never actually tested: we can get git to do most of the work, by using a different index file for each deploy directory. Git uses $GIT_INDEX_FILE, if it's set, or $GIT_DIR/index if not. So if we were to set GIT_INDEX_FILE=$GIT_DIR/index.$branch, we should get a file unique to the particular branch name. Then we can go back to the "check out a specific branch" form of git checkout and let it remove files if needed.

This last method opens the inconsistency window a bit wider: if git has to remove a dozen files and update or create another 100 files, whatever's using the deployed version has a lot more chances to see partial removes and/or updates as git checkout progresses. But it is a lot simpler and has less "wasted motion" in the work tree.
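
In the same untested spirit, the per-branch index idea might be sketched like this. The helper names are mine; flattening slashes in branch names and resolving the git directory to an absolute path are precautions I am assuming are useful, not something git requires. Note that this form of checkout also moves HEAD, as in item 2.

```shell
#!/bin/sh
# Sketch (as untested in production as the idea itself): one index file
# per deployed branch. Helper names are illustrative.

index_file_for() {
    # $1 = git directory, $2 = branch name.
    # Slashes (as in feature/x) are flattened so the result is one file name.
    echo "$1/index.$(printf '%s' "$2" | tr '/' '_')"
}

deploy_with_own_index() {
    # $1 = branch name. Meant to run from a hook, where git sets GIT_DIR.
    # An absolute path avoids any doubt about what a relative index path
    # would be relative to once a separate work-tree is involved.
    gitdir=$(cd "${GIT_DIR:-.}" && pwd) || return 1
    mkdir -p "/deploy/$1"
    GIT_INDEX_FILE=$(index_file_for "$gitdir" "$1") \
        git --work-tree="/deploy/$1" checkout -f "$1"
}
```

With a private index per branch, git remembers what it last wrote into each deployment directory, which is what lets it remove files that have gone away.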

One final ironic musing, as it were

Note that our deploy function sets up a work tree (or maybe one of several, if we have multiple deployable branches) as a literal git work tree, possibly even using $GIT_WORK_TREE. It then clobbers whatever is in that work tree, replacing it with the latest version in the interesting branch. This is exactly what we said would be annoying, back in the "background" section. Here, though, it's not annoying at all: it's just what we want!

Well, sort of. The nice thing about this work tree, vs that bare repository, is that you cannot accidentally cd /deploy/master, see a .git directory, and start working in it. There is no .git directory here. Nonetheless, the combination of "work directory here" plus "bare repository there" really just equals "regular, non-bare repository", in almost every sense.



Tags: git, branch