Given the following branch structure:

  *------*---*
Master        \
               *---*--*------*
               A       \
                        *-----*-----*
                        B         (HEAD)
If I want to merge my B changes (and only my B changes, no A changes) into master, what is the difference between these two sets of commands?

>(B) git rebase master
>(B) git checkout master
>(master) git merge B

>(B) git rebase --onto master A B
>(B) git checkout master
>(master) git merge B
I'm mainly interested in learning if code from Branch A could make it into master if I use the first way.
You can try it yourself and see. You can create a local git repository to play with:
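One way to try this is sketched below. It is only an illustration under stated assumptions: the commit and file names (C0 to C9), the helper functions make_repo and commit, and the temporary directories are all made up for the example. It builds the question's graph twice, runs one command set in each copy, and then checks whether a file added on A (before B forked) ends up in master.

```shell
#!/bin/sh
set -e

# Helper (hypothetical): one commit that adds one file named after the commit.
commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

# Helper (hypothetical): build the question's graph in a fresh repository at $1.
make_repo() {
    mkdir -p "$1"
    cd "$1"
    git init -q
    git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
    git config user.name "Example"
    git config user.email "example@example.com"
    for n in C0 C1 C2; do commit "$n"; done
    git checkout -q -b A
    for n in C3 C4 C5; do commit "$n"; done
    git checkout -q -b B                      # B forks from the third commit on A
    for n in C7 C8 C9; do commit "$n"; done
    git checkout -q A
    commit C6
    git checkout -q B
}

work=$(mktemp -d)
first="$work/first"
second="$work/second"

# First set: plain rebase, then merge.
make_repo "$first"
git rebase -q master
git checkout -q master
git merge -q B

# Second set: rebase --onto, then merge.
make_repo "$second"
git rebase -q --onto master A B
git checkout -q master
git merge -q B

# C3.txt was added on A before B forked; C7.txt was added only on B.
for repo in "$first" "$second"; do
    cd "$repo"
    if [ -f C3.txt ]; then a="yes"; else a="no"; fi
    if [ -f C7.txt ]; then b="yes"; else b="no"; fi
    echo "$(basename "$repo"): C3.txt (from A) in master: $a, C7.txt (from B) in master: $b"
done
```

With this setup the first copy should report that C3.txt is present in master and the second that it is not, while C7.txt is present in both.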
Before any of the given operations your repository looks like this
After a standard rebase (without --onto master) the structure will be:

...where the x' are commits from the A branch. (Note how they're now duplicated at the base of branch B.)

Instead, a rebase with --onto master will create the following cleaner and simpler structure:

Bear with me for a while before I answer the question as asked. One of the earlier answers is right, but there are labeling and other relatively minor (but potentially confusing) issues, so I want to start with branch drawings and branch labels. Also, people coming from other systems, or maybe even just new to revision control and git, often think of branches as "lines of development" rather than "traces of history" (git implements them as the latter, rather than the former, so a commit is not necessarily on any specific "line of development").
First, there is a minor problem with the way you drew your graph:
Here's the exact same graph, but with the labels drawn in differently and some more arrow-heads added (and I've numbered the commit nodes for use below):

0 <- 1 <- 2                   <-- master
           \
            3 <- 4 <- 5 <- 6  <-- A
                       \
                        7 <- 8 <- 9   <-- B (HEAD)
Why this matters is that git is quite loose about what it means for a commit to be "on" some branch—or perhaps a better phrase is to say that some commit is "contained in" some set of branches. Commits cannot be moved or changed, but branch labels can and do move.
More specifically, a branch name like master, A, or B points to one specific commit. In this case, master points to commit 2, A points to commit 6, and B points to commit 9. The first few commits 0 through 2 are contained within all three branches; commits 3, 4, and 5 are contained within both A and B; commit 6 is contained only within A; and commits 7 through 9 are contained only in B. (Incidentally, multiple names can point to the same commit, and that's normal when you make a new branch.)

Before we proceed, let me re-draw the graph yet one more way:
This just emphasizes that it's not a horizontal line of commits that matters, but rather the parent/child relationships. The branch label points to a starting commit, and then (at least the way these graphs are drawn) we move left, maybe also going up or down as needed, to find parent commits.
When you rebase commits, you're actually copying those commits.
Git can never change any commit
There's one "true name" for any commit (or indeed any object in a git repository), which is its SHA-1: that 40-hex-digit string like 9f317ce... that you see in git log, for instance. The SHA-1 is a cryptographic[1] checksum of the contents of the object. The contents are the author and committer (name and email), time stamps, a source tree, and the list of parent commits. The parent of commit #7 is always commit #5. If you make a mostly-exact copy of commit #7, but set its parent to commit #2 instead of commit #5, you get a different commit with a different ID. (I've run out of single digits at this point. Normally I use single uppercase letters to represent commit IDs, but with branches named A and B I thought that would be confusing. So I'll call a copy of #7, #7a, below.)

What git rebase does

When you ask git to rebase a chain of commits, such as commits #7-8-9 above, it has to copy them, at least if they're going to move anywhere (if they're not moving it can just leave the originals in place). It defaults to copying commits from the currently-checked-out branch, so git rebase needs just two extra pieces of information:

- Which commits should it copy?
- Where should the copies land (the target)?

When you run git rebase <upstream>, you let git figure out both parts from one single piece of information. When you use --onto, you get to tell git about the two parts separately: you still supply an <upstream>, but git doesn't compute the target from <upstream>; it only computes the commits to copy from <upstream>. (Incidentally, I think <upstream> is not a good name, but it's what rebase uses and I don't have anything much better, so let's stick with it here. Rebase calls the target <newbase>, but I think target is a much better name.)

Let's take a look at these two options first. Both assume that you're on branch B in the first place:

git rebase master
git rebase --onto master A
With the first command, the <upstream> argument to rebase is master. With the second, it's A.

Here's how git computes which commits to copy: it hands the current branch to git rev-list, and it also hands <upstream> to git rev-list, but using --not (or more precisely, with the equivalent of the two-dot exclude..include notation). This means we need to know how git rev-list works.

While git rev-list is extremely complicated (most git commands end up using it; it's the engine for git log, git bisect, rebase, filter-branch, and so on), this particular case is not too hard: with the two-dot notation, rev-list lists every commit reachable from the right-hand side (including that commit itself), excluding every commit reachable from the left-hand side.

In this case, git rev-list HEAD finds all commits reachable from HEAD (that is, almost all commits: commits 0-5 and 7-9), and git rev-list master finds all commits reachable from master, which is commits 0, 1, and 2. Subtracting 0-through-2 from 0-5,7-9 leaves 3-5,7-9. These are the candidate commits to copy, as listed by git rev-list master..HEAD.

For our second command, we have A..HEAD instead of master..HEAD, so the commits to subtract are 0-6. Commit #6 doesn't appear in the HEAD set, but that's fine: subtracting away something that's not there leaves it not there. The resulting set of candidates to copy is therefore 7-9.

That still leaves us with figuring out the target of the rebase, i.e., where should copied commits land? With the second command, the answer is "the commit identified by the --onto argument". Since we said --onto master, that means the target is commit #2.

rebase #1
git rebase master
With the first command, though, we didn't specify a target directly, so git uses the commit identified by <upstream>. The <upstream> we gave was master, which points to commit #2, so the target is commit #2.

The first command is therefore going to start by copying commit #3 with whatever minimal changes are needed so that its parent is commit #2. Its parent is already commit #2. Nothing has to change, so nothing changes, and rebase just re-uses the existing commit #3. It must then copy #4 so that its parent is #3, but the parent is already #3, so it just re-uses #4. Likewise, #5 is already good. It completely ignores #6 (that's not in the set of commits to copy); it checks #s 7-9 but they're all good as well, so the whole rebase ends up just re-using all the original commits. You can force copies anyway with -f, but you didn't, so this whole rebase ends up doing nothing.

rebase #2
git rebase --onto master A
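The candidate sets worked out above for both forms of the command can be checked in a scratch repository. The following is a minimal sketch, assuming a throwaway repository whose commit messages are simply the commit numbers used in this answer (0 through 9); the helper function commit and the one-file-per-commit layout are assumptions made for the example.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example"
git config user.email "example@example.com"

# Helper (hypothetical): one commit whose message is the commit number.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for n in 0 1 2; do commit "$n"; done
git checkout -q -b A
for n in 3 4 5; do commit "$n"; done
git checkout -q -b B
for n in 7 8 9; do commit "$n"; done
git checkout -q A
commit 6
git checkout -q B

# Candidates for "git rebase master": reachable from HEAD but not from master.
from_master=$(git log --reverse --format=%s master..HEAD | tr '\n' ' ')
count_master=$(git rev-list --count master..HEAD)

# Candidates for "git rebase --onto master A": reachable from HEAD but not from A.
from_a=$(git log --reverse --format=%s A..HEAD | tr '\n' ' ')
count_a=$(git rev-list --count A..HEAD)

echo "master..HEAD: $from_master($count_master commits)"
echo "A..HEAD:      $from_a($count_a commits)"
```

The first range should list 3, 4, 5, 7, 8, 9 and the second only 7, 8, 9, matching the subtraction done by hand above.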
The second rebase command used --onto to select #2 as its target, but told git to copy just commits 7-9. Commit #7's parent is commit #5, so this copy really has to do something.[2] So git makes a new commit (let's call this #7a) that has commit #2 as its parent. The rebase moves on to commit #8: the copy now needs #7a as its parent. Finally, the rebase moves on to commit #9, which needs #8a as its parent. With all commits copied, the last thing rebase does is move the label (remember, labels move and change!). This gives a graph like this:

OK, but what about git rebase --onto master A B?

This is almost the same as git rebase --onto master A. The difference is that extra B at the end. Fortunately, this difference is very simple: if you give git rebase that one extra argument, it runs git checkout on that argument first.[3]

Your original commands

In your first set of commands, you ran git rebase master while on branch B. As noted above, this is a big no-op: since nothing needs to move, git copies nothing at all (unless you use -f / --force, which you didn't). You then checked out master and used git merge B, which (if it is told to[4]) creates a new commit with the merge. Therefore Dherik's answer, as of the time I saw it at least, is correct here: the merge commit has two parents, one of which is the tip of branch B, and that branch reaches back through three commits that are on branch A, and therefore some of what's on A winds up being merged into master.

With your second command sequence, you first checked out B (you were already on B so this was redundant, but was part of the git rebase). You then had rebase copy three commits, producing the final graph above, with commits 7a, 8a, and 9a. You then checked out master and made a merge commit with B (see footnote 4 again). Again Dherik's answer is correct: the only thing missing is that the original, abandoned commits are not drawn in and it's not as obvious that the new merged-in commits are copies.

[1] This only matters in that it's extraordinarily difficult to target a particular checksum. That is, if someone you trust tells you "I trust the commit with ID 1234567...", it's almost impossible for someone else (someone you may not trust so much) to come up with a commit that has that same ID, but has different contents. The chances of it happening by accident are 1 in 2^160, which is much less likely than you having a heart attack while being struck by lightning while drowning in a tsunami while being abducted by space aliens. :-)
[2] The actual copy is made using the equivalent of git cherry-pick: git compares the commit's tree with its parent's tree to get a diff, then applies the diff to the new parent's tree.

[3] This is actually, literally true at this time: git rebase is a shell script that parses your options, then decides which kind of internal rebase to run: the non-interactive git-rebase--am or the interactive git-rebase--interactive. After it's figured out all the arguments, if there's the one left-over branch name argument, the script does git checkout <branch-name> before starting the internal rebase.

[4] Since master points to commit 2 and commit 2 is an ancestor of commit 9, this would normally not make a merge commit after all, but instead do what Git calls a fast-forward operation. You can instruct Git not to do these fast-forwards using git merge --no-ff. Some interfaces, such as GitHub's web interface and perhaps some GUIs, may separate the different kinds of operations, so that their "merge" forces a true merge like this.

With a fast-forward merge, the final graph for the first case is:
In either case, commits 1 through 9 are now on both branches, master and B. The difference, compared to the true merge, is that, from the graph, you can see the history that includes the merge.

In other words, the advantage to a fast-forward merge is that it leaves no trace of what is otherwise a trivial operation. The disadvantage of a fast-forward merge is, well, that it leaves no trace. So the question of whether to allow the fast-forward is really a question of whether you want to leave an explicit merge in the history formed by the commits.
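The two merge behaviours from footnote 4 can be compared side by side. This is a rough sketch, assuming the same scratch graph as in the question; the branch names ff and noff, the helper function commit, and the commit names are invented for the example. Both scratch branches start at master's commit, and B is merged into each one in a different way.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example"
git config user.email "example@example.com"

# Helper (hypothetical): one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for n in C0 C1 C2; do commit "$n"; done
git checkout -q -b A
for n in C3 C4 C5; do commit "$n"; done
git checkout -q -b B
for n in C7 C8 C9; do commit "$n"; done
git checkout -q A
commit C6
git checkout -q B

# Two scratch branches (hypothetical names), both pointing at master's commit.
git branch ff master
git branch noff master

git checkout -q ff
git merge -q B                          # fast-forward: the label just moves
git checkout -q noff
git merge -q --no-ff -m "Merge B" B     # forces a real merge commit

b_tip=$(git rev-parse B)
ff_tip=$(git rev-parse ff)
noff_tip=$(git rev-parse noff)
# "rev-list --parents" prints the commit followed by its parents.
noff_parents=$(git rev-list --parents -n 1 noff | awk '{print NF - 1}')

echo "B tip:    $b_tip"
echo "ff tip:   $ff_tip (same commit as B, no new commit)"
echo "noff tip: $noff_tip (new commit with $noff_parents parents)"
```

After the fast-forward, ff and B name the same commit; after --no-ff, noff points at a new commit whose second parent is B's tip.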
The differences:

First set

(B) git rebase master

Nothing happened. There are no new commits in the master branch since the creation of the B branch.

(B) git checkout master
(master) git merge B

Second set

(B) git rebase --onto master A B
(B) git checkout master
(master) git merge B
Be careful about what you mean by "only my B changes".

In the first set, the B branch is (before the final merge):

And in the second set your B branch is:

If I understand correctly, what you want is only the B commits that are not in the A branch. So, the second set is the right choice for you before the merge.
git log --graph --decorate --oneline A B master (or an equivalent GUI tool) can be used after each git command to visualize the changes.

This is the initial state of the repository, with B as the current branch.

Here is a script to create a repository in this state.
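The original script is not reproduced here, so what follows is a stand-in sketch rather than the author's own. It assumes one small file per commit, commit messages C0 through C9, and a hypothetical helper function commit. The commit IDs it produces will differ from the ones quoted in this answer.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example"
git config user.email "example@example.com"

# Helper (hypothetical): one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# master: C0, C1, C2
for n in C0 C1 C2; do commit "$n"; done

# A forks from master's tip: C3, C4, C5 (C6 is added after B forks)
git checkout -q -b A
for n in C3 C4 C5; do commit "$n"; done

# B forks from C5: C7, C8, C9
git checkout -q -b B
for n in C7 C8 C9; do commit "$n"; done

# Finish A with C6, then return to B as the current branch.
git checkout -q A
commit C6
git checkout -q B

# Show the resulting graph with branch labels.
git log --graph --decorate --oneline A B master
```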
The first rebase command does nothing.
Checking out master and merging B simply points master at the same commit as B (i.e. 9a90b7c). No new commits are created.

The second rebase command copies the commits in the range A..B and places the copies on top of master. The three commits in this range are 9a90b7c C9, 2968483 C8, and 187c9c8 C7. The copies are new commits with their own commit IDs: 7c0e241, 40b105d, and 5b0bda1. The branches master and A are unchanged.

As before, checking out master and merging B simply points master at the same commit as B (i.e. 7c0e241). No new commits are created.

The original chain of commits that B was pointing at still exists.
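That last point can be checked by remembering B's old tip before the rebase and looking it up afterwards. This is a small sketch under the same assumptions as before (scratch repository, commit names C0 to C9, hypothetical helper function commit); the variable names old_tip and new_tip are made up for the example.

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example"
git config user.email "example@example.com"

# Helper (hypothetical): one commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for n in C0 C1 C2; do commit "$n"; done
git checkout -q -b A
for n in C3 C4 C5; do commit "$n"; done
git checkout -q -b B
for n in C7 C8 C9; do commit "$n"; done
git checkout -q A
commit C6
git checkout -q B

old_tip=$(git rev-parse B)          # the original C9
git rebase -q --onto master A B
new_tip=$(git rev-parse B)          # the copy of C9

# B now names the copies, but the original commit object is still stored.
echo "old tip $old_tip is a $(git cat-file -t "$old_tip")"
echo "history of the old tip:"
git log --oneline "$old_tip"

# The branch's reflog also records where B pointed before the rebase.
echo "B@{1} is $(git rev-parse 'B@{1}')"
```

The old chain stays reachable by its ID (and through the reflog) even though no branch name points at it any more.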