Finding a branch point with Git?

2018-12-31 12:49发布

I have a repository with branches master and A and lots of merge activity between the two. How can I find the commit in my repository when branch A was created based on master?

My repository basically looks like this:

-- X -- A -- B -- C -- D -- F  (master) 
          \     /   \     /
           \   /     \   /
             G -- H -- I -- J  (branch A)

I'm looking for revision A, which is not what git merge-base (--all) finds.

标签: git branch
21条回答
低头抚发
2楼-- · 2018-12-31 12:50

I recently needed to solve this problem as well and ended up writing a Ruby script for this: https://github.com/vaneyckt/git-find-branching-point

查看更多
其实,你不懂
3楼-- · 2018-12-31 12:52

I was looking for the same thing, and I found this question. Thank you for asking it!

However, I found that the answers I see here don't seem to quite give the answer you asked for (or that I was looking for) -- they seem to give the G commit, instead of the A commit.

So, I've created the following tree (letters assigned in chronological order), so I could test things out:

A - B - D - F - G   <- "master" branch (at G)
     \   \     /
      C - E --'     <- "topic" branch (still at E)

This looks a little different than yours, because I wanted to make sure that I got (referring to this graph, not yours) B, but not A (and not D or E). Here are the letters attached to SHA prefixes and commit messages (my repo can be cloned from here, if that's interesting to anyone):

G: a9546a2 merge from topic back to master
F: e7c863d commit on master after master was merged to topic
E: 648ca35 merging master onto topic
D: 37ad159 post-branch commit on master
C: 132ee2a first commit on topic branch
B: 6aafd7f second commit on master before branching
A: 4112403 initial commit on master

So, the goal: find B. Here are three ways that I found, after a bit of tinkering:


1. visually, with gitk:

You should visually see a tree like this (as viewed from master):

gitk screen capture from master

or here (as viewed from topic):

gitk screen capture from topic

in both cases, I've selected the commit that is B in my graph. Once you click on it, its full SHA is presented in a text input field just below the graph.


2. visually, but from the terminal:

git log --graph --oneline --all

(Edit/side-note: adding --decorate can also be interesting; it adds an indication of branch names, tags, etc. Not adding this to the command-line above since the output below doesn't reflect its use.)

which shows (assuming git config --global color.ui auto):

output of git log --graph --oneline --all

Or, in straight text:

*   a9546a2 merge from topic back to master
|\  
| *   648ca35 merging master onto topic
| |\  
| * | 132ee2a first commit on topic branch
* | | e7c863d commit on master after master was merged to topic
| |/  
|/|   
* | 37ad159 post-branch commit on master
|/  
* 6aafd7f second commit on master before branching
* 4112403 initial commit on master

in either case, we see the 6aafd7f commit as the lowest common point, i.e. B in my graph, or A in yours.


3. With shell magic:

You don't specify in your question whether you wanted something like the above, or a single command that'll just get you the one revision, and nothing else. Well, here's the latter:

diff -u <(git rev-list --first-parent topic) \
             <(git rev-list --first-parent master) | \
     sed -ne 's/^ //p' | head -1
6aafd7ff98017c816033df18395c5c1e7829960d

Which you can also put into your ~/.gitconfig as (note: trailing dash is important; thanks Brian for bringing attention to that):

[alias]
    oldest-ancestor = !zsh -c 'diff -u <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | sed -ne \"s/^ //p\" | head -1' -

Which could be done via the following (convoluted with quoting) command-line:

git config --global alias.oldest-ancestor '!zsh -c '\''diff -u <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | sed -ne "s/^ //p" | head -1'\'' -'

Note: zsh could just as easily have been bash, but sh will not work -- the <() syntax doesn't exist in vanilla sh. (Thank you again, @conny, for making me aware of it in a comment on another answer on this page!)

Note: Alternate version of the above:

Thanks to liori for pointing out that the above could fall down when comparing identical branches, and coming up with an alternate diff form which removes the sed form from the mix, and makes this "safer" (i.e. it returns a result (namely, the most recent commit) even when you compare master to master):

As a .git-config line:

[alias]
    oldest-ancestor = !zsh -c 'diff --old-line-format='' --new-line-format='' <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | head -1' -

From the shell:

git config --global alias.oldest-ancestor '!zsh -c '\''diff --old-line-format='' --new-line-format='' <(git rev-list --first-parent "${1:-master}") <(git rev-list --first-parent "${2:-HEAD}") | head -1'\'' -'

So, in my test tree (which was unavailable for a while, sorry; it's back), that now works on both master and topic (giving commits G and B, respectively). Thanks again, liori, for the alternate form.


So, that's what I [and liori] came up with. It seems to work for me. It also allows an additional couple of aliases that might prove handy:

git config --global alias.branchdiff '!sh -c "git diff `git oldest-ancestor`.."'
git config --global alias.branchlog '!sh -c "git log `git oldest-ancestor`.."'

Happy git-ing!

查看更多
萌妹纸的霸气范
4楼-- · 2018-12-31 12:52

surely I'm missing something, but IMO, all the problems above are caused because we are always trying to find the branch point going back in the history, and that causes all sort of problems because of the merging combinations available.

Instead, I've followed a different approach, based in the fact that both branches share a lot of history, exactly all the history before branching is 100% the same, so instead of going back, my proposal is about going forward (from 1st commit), looking for the 1st difference in both branches. The branch point will be, simply, the parent of the first difference found.

In practice:

#!/bin/bash
diff <( git rev-list "${1:-master}" --reverse --topo-order ) \
     <( git rev-list "${2:-HEAD}" --reverse --topo-order) \
--unified=1 | sed -ne 's/^ //p' | head -1

And it's solving all my usual cases. Sure there are border ones not covered but... ciao :-)

查看更多
余欢
5楼-- · 2018-12-31 12:53

Given that so many of the answers in this thread do not give the answer the question was asking for, here is a summary of the results of each solution, along with the script I used to replicate the repository given in the question.

The log

Creating a repository with the structure given, we get the git log of:

$ git --no-pager log --graph --oneline --all --decorate
* b80b645 (HEAD, branch_A) J - Work in branch_A branch
| *   3bd4054 (master) F - Merge branch_A into branch master
| |\  
| |/  
|/|   
* |   a06711b I - Merge master into branch_A
|\ \  
* | | bcad6a3 H - Work in branch_A
| | * b46632a D - Work in branch master
| |/  
| *   413851d C - Merge branch_A into branch master
| |\  
| |/  
|/|   
* | 6e343aa G - Work in branch_A
| * 89655bb B - Work in branch master
|/  
* 74c6405 (tag: branch_A_tag) A - Work in branch master
* 7a1c939 X - Work in branch master

My only addition, is the tag which makes it explicit about the point at which we created the branch and thus the commit we wish to find.

The solution which works

The only solution which works is the one provided by lindes correctly returns A:

$ diff -u <(git rev-list --first-parent branch_A) \
          <(git rev-list --first-parent master) | \
      sed -ne 's/^ //p' | head -1
74c6405d17e319bd0c07c690ed876d65d89618d5

As Charles Bailey points out though, this solution is very brittle.

If you branch_A into master and then merge master into branch_A without intervening commits then lindes' solution only gives you the most recent first divergance.

That means that for my workflow, I think I'm going to have to stick with tagging the branch point of long running branches, since I can't guarantee that they can be reliably be found later.

This really all boils down to gits lack of what hg calls named branches. The blogger jhw calls these lineages vs. families in his article Why I Like Mercurial More Than Git and his follow-up article More On Mercurial vs. Git (with Graphs!). I would recommend people read them to see why some mercurial converts miss not having named branches in git.

The solutions which don't work

The solution provided by mipadi returns two answers, I and C:

$ git rev-list --boundary branch_A...master | grep ^- | cut -c2-
a06711b55cf7275e8c3c843748daaa0aa75aef54
413851dfecab2718a3692a4bba13b50b81e36afc

The solution provided by Greg Hewgill return I

$ git merge-base master branch_A
a06711b55cf7275e8c3c843748daaa0aa75aef54
$ git merge-base --all master branch_A
a06711b55cf7275e8c3c843748daaa0aa75aef54

The solution provided by Karl returns X:

$ diff -u <(git log --pretty=oneline branch_A) \
          <(git log --pretty=oneline master) | \
       tail -1 | cut -c 2-42
7a1c939ec325515acfccb79040b2e4e1c3e7bbe5

The script

mkdir $1
cd $1
git init
git commit --allow-empty -m "X - Work in branch master"
git commit --allow-empty -m "A - Work in branch master"
git branch branch_A
git tag branch_A_tag     -m "Tag branch point of branch_A"
git commit --allow-empty -m "B - Work in branch master"
git checkout branch_A
git commit --allow-empty -m "G - Work in branch_A"
git checkout master
git merge branch_A       -m "C - Merge branch_A into branch master"
git checkout branch_A
git commit --allow-empty -m "H - Work in branch_A"
git merge master         -m "I - Merge master into branch_A"
git checkout master
git commit --allow-empty -m "D - Work in branch master"
git merge branch_A       -m "F - Merge branch_A into branch master"
git checkout branch_A
git commit --allow-empty -m "J - Work in branch_A branch"

I doubt the git version makes much difference to this, but:

$ git --version
git version 1.7.1

Thanks to Charles Bailey for showing me a more compact way to script the example repository.

查看更多
旧时光的记忆
6楼-- · 2018-12-31 12:53

In general, this is not possible. In a branch history a branch-and-merge before a named branch was branched off and an intermediate branch of two named branches look the same.

In git, branches are just the current names of the tips of sections of history. They don't really have a strong identity.

This isn't usually a big issue as the merge-base (see Greg Hewgill's answer) of two commits is usually much more useful, giving the most recent commit which the two branches shared.

A solution relying on the order of parents of a commit obviously won't work in situations where a branch has been fully integrated at some point in the branch's history.

git commit --allow-empty -m root # actual branch commit
git checkout -b branch_A
git commit --allow-empty -m  "branch_A commit"
git checkout master
git commit --allow-empty -m "More work on master"
git merge -m "Merge branch_A into master" branch_A # identified as branch point
git checkout branch_A
git merge --ff-only master
git commit --allow-empty -m "More work on branch_A"
git checkout master
git commit --allow-empty -m "More work on master"

This technique also falls down if an integration merge has been made with the parents reversed (e.g. a temporary branch was used to perform a test merge into master and then fast-forwarded into the feature branch to build on further).

git commit --allow-empty -m root # actual branch point
git checkout -b branch_A
git commit --allow-empty -m  "branch_A commit"
git checkout master
git commit --allow-empty -m "More work on master"
git merge -m "Merge branch_A into master" branch_A # identified as branch point
git checkout branch_A
git commit --allow-empty -m "More work on branch_A"

git checkout -b tmp-branch master
git merge -m "Merge branch_A into tmp-branch (master copy)" branch_A
git checkout branch_A
git merge --ff-only tmp-branch
git branch -d tmp-branch

git checkout master
git commit --allow-empty -m "More work on master"
查看更多
残风、尘缘若梦
7楼-- · 2018-12-31 12:54

After a lot of research and discussions, it's clear there's no magic bullet that would work in all situations, at least not in the current version of Git.

That's why I wrote a couple of patches that add the concept of a tail branch. Each time a branch is created, a pointer to the original point is created too, the tail ref. This ref gets updated every time the branch is rebased.

To find out the branch point of the devel branch, all you have to do is use devel@{tail}, that's it.

https://github.com/felipec/git/commits/fc/tail

查看更多
登录 后发表回答