I have two development branches, and I found that branch B depends on code from branch A.
I want to rebase A into B so I can keep on developing B.
Soon A will be merged to master (before B), but not now. Then, when I merge B, will it break the reference from having A rebased on it?
Can I then just rebase B on master and all will be fine, or do I need to do any special step?
Note that git (and pretty much all other version control systems) calls this rebasing onto, not into.
You may indeed do this kind of rebase. However, it's important to know that when you rebase some commit(s), git actually copies them to new commits. This affects everyone who shares the code, if they already have the old commits.
The picture
Let's suppose that what you have right now can be drawn this way:
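The same graph can also be written down as data. The sketch below is a toy Python model reconstructed from the description that follows; it is not how git stores anything. The names PARENTS, BRANCHES, and commits_on are invented for this illustration, and commit N stands in for all the older history hidden to its left.

```python
# Toy model of the commit graph (assumption: N stands in for all older history).
#
#         E - F - G          <-- B
#        /
#   ... N - O - P - Q - R    <-- master
#                \
#                 J - K      <-- A

# Each commit maps to the list of its parent commits.
PARENTS = {
    "N": [],
    "O": ["N"], "P": ["O"], "Q": ["P"], "R": ["Q"],   # master
    "J": ["P"], "K": ["J"],                           # A
    "E": ["N"], "F": ["E"], "G": ["F"],               # B
}

# A branch is just a label pointing at one commit (its tip).
BRANCHES = {"master": "R", "A": "K", "B": "G"}


def commits_on(branch):
    """Return every commit reachable from the branch tip via parent links."""
    seen = set()
    stack = [BRANCHES[branch]]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


for name in ("master", "A", "B"):
    print(name, sorted(commits_on(name)))
```

In this model, "being on a branch" simply means "reachable from that branch's tip", which is why N is on all three branches at once.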
Here branch A points to commit K, branch B points to commit G, and branch master points to commit R (each commit now has one of these one-letter codes so that we can identify it more easily; in reality, their names are SHA-1 hashes like bfcd84f529b...). Commits N through R are therefore on master (along with all commits hidden off to the left of N); commits N, O, P, J, and K (and all commits left of N as before) are on branch A; and commits N, E, F, and G (and all commits left of N as always) are on branch B.
Merge bases
Commit N is the merge base of branches B and master, and commit P is the merge base of branches A and master. The merge base is simply[1] the commit after which the branches diverge, which becomes easy to see if you draw the graph as we just did. This matters for your git rebase because the default method git uses is to find this merge base commit, and copy the commits that come after that point. (This also means that before you do a rebase, it may be a good idea to draw the graph, so that you can see what you will be doing.)
Copying commits
To copy a single commit yourself, use the command git cherry-pick. You don't need to do this now (rebase will do it for you), but it's a good idea to know how it works, so let's cover that briefly.
In git, each commit is a fully self-contained snapshot. For instance, consider commit F on branch B. Commit F contains the complete source tree associated with your project, as of the time you made commit F. This is also true of commits E and G. If we compare commit E vs commit F, we will see what changed between "the tree for commit E" and "the tree for commit F". Similarly, if we compare F vs G, we will find out what changed in going from F to G.
Converting a commit to a set of changes (called a changeset) allows us to copy a commit. Suppose, for instance, we see what happened from N to E. Then suppose further that we check out commit K and make the same change, and then commit it on a new, temporary branch, so that the new commit sits directly after K.
I called the new commit E', rather than giving it a new single letter, because it's a copy of commit E. It's not exactly the same as commit E, of course. For one thing, it has a different parent (it has commit K as its parent); and it probably has a bunch of other differences from E: specifically whatever happened in commits O, P, J, and K, since we've now applied "the changes from N to E" onto "what was in K".
Rebase is just repeated copies
git rebase works by:
1. identifying which commits to copy;
2. copying those commits, in their original order, onto the target; and
3. moving the current branch label to point to the last copied commit.
If you git checkout B and then git rebase A, the commits identified in step 1 are those after the merge base between B and A, up to the tip of branch B. The A and B here are because your current branch is B and you said git rebase A.
Exercise: look at the graph and identify the merge base of B and A. It might be a bit hard to spot, but what if we re-draw the graph so that A's commits N, O, P, J, and K run in one straight row, with Q and R (master) branching off at P, and E, F, and G (B) branching off at N?
(This drawing is, topologically speaking, exactly the same as the original graph we drew, but now it's super-obvious that commit N is the merge base.)
In short, the commits identified in step 1 are E, F, and G. So rebase goes on to step 2 and copies those three commits, placing the copies after ("onto") the tip of branch A (because you said git rebase A).
Thus, at this point in the rebase process, we now have:
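As a rough sketch of steps 1 and 2, here is a toy Python model of the graph at this point. It is only an illustration under stated assumptions: the helper names (ancestors, merge_base, commits_to_copy, copy_onto) are made up, history is linear between the merge base and the tip of B, and there is exactly one merge base. Real git rebase replays the actual changes and can stop on conflicts, none of which is modeled here.

```python
# Toy model of rebase steps 1 and 2 (not git's real implementation).
# State after step 2, before the branch label moves:
#
#                     E' - F' - G'   <-- copies (B not moved yet)
#                    /
#   N - O - P - J - K                <-- A
#   |        \
#   |         Q - R                  <-- master
#    \
#     E - F - G                      <-- B
parents = {
    "N": [],
    "O": ["N"], "P": ["O"], "Q": ["P"], "R": ["Q"],   # master
    "J": ["P"], "K": ["J"],                           # A
    "E": ["N"], "F": ["E"], "G": ["F"],               # B
}
branches = {"master": "R", "A": "K", "B": "G"}


def ancestors(tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_base(a, b):
    """The common ancestor that no other common ancestor descends from."""
    common = ancestors(a) & ancestors(b)
    best = [c for c in common
            if not any(d != c and c in ancestors(d) for d in common)]
    if len(best) != 1:
        raise ValueError("this sketch only handles a single merge base")
    return best[0]


def commits_to_copy(base, tip):
    """Step 1: commits after base up to tip, oldest first (linear history only)."""
    chain, commit = [], tip
    while commit != base:
        chain.append(commit)
        commit = parents[commit][0]
    chain.reverse()
    return chain


def copy_onto(commits, target):
    """Step 2: make a copy of each commit, chained one after another onto target."""
    new_tip = target
    for commit in commits:
        copy = commit + "'"          # E becomes E', and so on
        parents[copy] = [new_tip]
        new_tip = copy
    return new_tip


# Equivalent of being on B and asking to rebase onto A.
base = merge_base(branches["B"], branches["A"])
todo = commits_to_copy(base, branches["B"])
head = copy_onto(todo, branches["A"])

print("merge base:", base)
print("commits copied:", todo)
print("tip of the copies:", head)
print("B still points to:", branches["B"])
```

Note that the originals E, F, and G are untouched in this model; only new entries were added, which mirrors the point that rebase copies rather than moves.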
There's just one last thing for rebase to do now, which is to re-point the current branch, B, to the last commit just copied. That's pretty easy: B now points to G'.
The original commits, E through G, are mostly gone at this point (they're still retained through the reflog for branch B, but unless you specifically look there, you won't see them any more). The new copies, E' through G', have been added on just past the tip of branch A.
Why all this matters
All of this copying and shuffling-about takes place in your repository, and not in anyone else's copy of your repository. This means that any co-worker or centralized server does not have these copies yet. If you git push the copies to a centralized server, the server will get the copies, but if the server had the original, now abandoned, commits, and if the server called that "branch B", you will have to force-push to the server.
Doing this force-push will make the server discard its copies of the original commits, and put in place the copied commits and call those "branch B". This is true even if, on the server, there is a commit H subsequent to commit G, with branch B pointing to commit H. In other words, this can "lose" commits from the server. That's the first wrinkle: you must coordinate with everyone else using the server, so that you do not lose commits.
Further, after doing a force push, and/or if your co-workers are sharing your repository (by fetching/pulling from it), your co-workers may need to adjust their repositories to account for the copy-commits-and-move-branch-label. To them, this is an upstream rebase and they must recover from it. This is the second wrinkle: you must coordinate with everyone else using the branch, so that they know to recover from the upstream rebase.
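The "lost commit" scenario can be sketched with a toy Python model of the server. This is an illustration only: the push function and its force parameter are invented for the sketch, and the only rule modeled is that a branch label may normally move only to a descendant of its current commit (a fast-forward). How and when a real server discards unreachable commits is not modeled.

```python
# Toy model of the server's view (not git's real implementation).
server_parents = {
    "N": [],
    "O": ["N"], "P": ["O"], "Q": ["P"], "R": ["Q"],   # master
    "J": ["P"], "K": ["J"],                           # A
    "E": ["N"], "F": ["E"], "G": ["F"],               # original B commits
    "H": ["G"],   # a co-worker's commit that we never fetched
}
server_branches = {"master": "R", "A": "K", "B": "H"}


def ancestors(parents, tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def push(parents, branches, branch, new_tip, force=False):
    """Move the server's branch label; refuse non-fast-forward moves unless forced."""
    old_tip = branches.get(branch)
    fast_forward = old_tip is None or old_tip in ancestors(parents, new_tip)
    if not fast_forward and not force:
        return False
    branches[branch] = new_tip
    return True


# The rebased copies travel with the push.
server_parents.update({"E'": ["K"], "F'": ["E'"], "G'": ["F'"]})

print(push(server_parents, server_branches, "B", "G'"))               # False
print(push(server_parents, server_branches, "B", "G'", force=True))   # True

# After the forced update, no branch on the server leads to H any more.
reachable = set()
for tip in server_branches.values():
    reachable |= ancestors(server_parents, tip)
print("H" in reachable)   # False
```

The plain push is refused because H is not an ancestor of G'; forcing it succeeds, and the co-worker's commit H drops out of every branch, which is exactly the coordination problem described above.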
In short, make sure all your co-developers are OK with this.
If you have others working with you, and they are sharing the branches you will rebase, just make sure they all know about it. If these commits that you will rebase are not published—if they're purely private to your own repository—then this coordination requires no effort at all, as there is no one else to inform, so there is no one who can object to it.
Merging, and maybe rebasing again
Again, the way to answer these questions is to start by drawing the commit graph. Let's use the new, rebased drawing, then add the merge commit that merges A into master. Since we want to show the merge, we need to re-draw the graph a bit yet again, to make it easier to connect the rows.
Now we'll run git checkout master and git merge A to make new merge commit M, whose first parent is R and whose second parent is K.
We did not have to change anything in B here, and we can continue developing on it as usual. But, if we want to (and if all our co-developers agree), we can now rebase B again, this time onto master.
If we git checkout B; git rebase master, the rebase will, as usual, start by finding the merge base between B and master. What is the merge base? Look at the graph and find the point where master and B join up. It's definitely not the new merge commit M, but it's not N any more either.
The commits that are on branch B are N, O, P, J, K, E', F', and G' (and any commits we can't see that are to the left of N). This part is straightforward enough. The tricky part is the commits that are on branch master. These are N, O, P, Q, R, and M, but also J and K. This is because those two commits can be reached by following commit M's second parent.
Therefore, the merge base, which is defined as the "closest" commit that's on both branches, is actually commit K. This is exactly what we want! To rebase B onto master, we want git to copy E', F', and G'. The new copies E'', F'', and G'' will go after M, and B will then point to G''.
The real lesson here is: draw the graph, as it will help you figure out what is going on.
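To check this reasoning mechanically, here is a toy Python model of the merge followed by the second rebase. It is a sketch under assumptions, not git's implementation: the merge and rebase helpers are invented names, the rebased part of B is linear, and there is a single merge base.

```python
# Toy model: graph after the first rebase (B already sits on top of A).
parents = {
    "N": [],
    "O": ["N"], "P": ["O"], "Q": ["P"], "R": ["Q"],   # master
    "J": ["P"], "K": ["J"],                           # A
    "E'": ["K"], "F'": ["E'"], "G'": ["F'"],          # B, rebased onto A
}
branches = {"master": "R", "A": "K", "B": "G'"}


def ancestors(tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_base(a, b):
    """The common ancestor that no other common ancestor descends from."""
    common = ancestors(a) & ancestors(b)
    best = [c for c in common
            if not any(d != c and c in ancestors(d) for d in common)]
    if len(best) != 1:
        raise ValueError("this sketch only handles a single merge base")
    return best[0]


def merge(into, other, name):
    """Add a merge commit: first parent is into's tip, second parent is other's tip."""
    parents[name] = [branches[into], branches[other]]
    branches[into] = name


def rebase(branch, onto):
    """Find the commits after the merge base, copy them onto the target, move the label."""
    target = branches[onto]
    base = merge_base(branches[branch], target)
    chain, commit = [], branches[branch]
    while commit != base:                 # walk back to the merge base
        chain.append(commit)
        commit = parents[commit][0]
    chain.reverse()                       # oldest first
    new_tip = target
    for commit in chain:
        copy = commit + "'"               # E' becomes E'', and so on
        parents[copy] = [new_tip]
        new_tip = copy
    branches[branch] = new_tip
    return base, chain


merge("master", "A", "M")                 # on master, merge A
on_master = ancestors(branches["master"])
base, copied = rebase("B", "master")      # on B, rebase onto master

print("on master:", sorted(on_master))
print("merge base:", base)
print("commits copied:", copied)
print("B now points to:", branches["B"])
```

In this model J and K show up on master only because M's second parent is followed, and that is what moves the merge base from N to K.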
[1] At least, it's simple when there is a single such point. In more complex graphs, there can be more than one merge base. Rebasing with complex graphs requires a lot more than I can write in this particular SO posting, though.