I understand the scenario presented in Pro Git about the risks of git rebase. The author basically tells you how to avoid duplicated commits:
Do not rebase commits that you have pushed to a public repository.
I am going to tell you my particular situation because I think it does not exactly fit the Pro Git scenario and I still end up with duplicated commits.
Let's say I have two remote branches with their local counterparts:
origin/master origin/dev
| |
master dev
All four branches contain the same commits, and I am going to start development in dev:
origin/master : C1 C2 C3 C4
master : C1 C2 C3 C4
origin/dev : C1 C2 C3 C4
dev : C1 C2 C3 C4
After a couple of commits I push the changes to origin/dev:
origin/master : C1 C2 C3 C4
master : C1 C2 C3 C4
origin/dev : C1 C2 C3 C4 C5 C6 # (2) git push
dev : C1 C2 C3 C4 C5 C6 # (1) git checkout dev, git commit
I have to go back to master to make a quick fix:
origin/master : C1 C2 C3 C4 C7 # (2) git push
master : C1 C2 C3 C4 C7 # (1) git checkout master, git commit
origin/dev : C1 C2 C3 C4 C5 C6
dev : C1 C2 C3 C4 C5 C6
Back on dev, I rebase the changes to include the quick fix in my actual development:
origin/master : C1 C2 C3 C4 C7
master : C1 C2 C3 C4 C7
origin/dev : C1 C2 C3 C4 C5 C6
dev : C1 C2 C3 C4 C7 C5' C6' # git checkout dev, git rebase master
If I display the history of commits with GitX/gitk, I notice that origin/dev now contains two identical commits, C5' and C6', which are different to Git. Now if I push the changes to origin/dev, this is the result:
origin/master : C1 C2 C3 C4 C7
master : C1 C2 C3 C4 C7
origin/dev : C1 C2 C3 C4 C5 C6 C7 C5' C6' # git push
dev : C1 C2 C3 C4 C7 C5' C6'
Maybe I don't fully understand the explanation in Pro Git, so I would like to know two things:
- Why does Git duplicate these commits while rebasing? Is there a particular reason to do that instead of just applying C5 and C6 after C7?
- How can I avoid that? Would it be wise to do it?
I found out that in my case, this issue was the consequence of a Git configuration problem (involving pull and merge).
Description of the problem:
Symptoms: Commits duplicated on child branch after rebase, implying numerous merges during and after rebase.
Workflow: Here are steps of the workflow I was performing:
As a consequence of this workflow, all commits of "Feature-branch" since the previous rebase were duplicated... :-(
The issue was due to pulling the changes of the child branch before the rebase. Git's default pull mode is "merge", which changes the indexes of the commits performed on the child branch.
The solution: in the Git configuration file, configure pull to work in rebase mode:
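The configuration snippet from this answer is not reproduced above. As a hedged sketch, one setting that makes git pull rebase instead of merge is pull.rebase; whether this is the exact setting the author used is an assumption. The throwaway repository below exists only so the sketch does not touch any real configuration:

```shell
set -e
# Throwaway repository; the path comes from mktemp and is purely illustrative.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
# Assumption: pull.rebase is the setting meant by "rebase mode".
# Without --global it is stored in this repository's .git/config;
# adding --global would apply it to every repository of the current user.
git config pull.rebase true
pull_mode=$(git config --get pull.rebase)
echo "pull.rebase = $pull_mode"
```

With this set, a plain git pull replays local commits on top of the fetched branch instead of creating a merge commit.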
Hope it can help JN Grx
You should not be using rebase here, a simple merge will suffice. The Pro Git book that you linked basically explains this exact situation. The inner workings might be slightly different, but here's how I visualize it:
- C5 and C6 are temporarily pulled out of dev
- C7 is applied to dev
- C5 and C6 are played back on top of C7, creating new diffs and therefore new commits
So, in your dev branch, C5 and C6 effectively no longer exist: they are now C5' and C6'. When you push to origin/dev, git sees C5' and C6' as new commits and tacks them on to the end of the history. Indeed, if you look at the differences between C5 and C5' in origin/dev, you'll notice that though the content is the same, the parent commit is different (and the line numbers probably are too) -- which makes the hash of the commit different.
I'll restate the Pro Git rule: never rebase commits that have ever existed anywhere but your local repository. Use merge instead.
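To see the "same content, new commit" effect directly, here is a minimal sketch in a throwaway repository. The commit helper, the file names, and the use of git patch-id to compare the two diffs are illustrative choices, not part of the answer above:

```shell
set -e
# Throwaway repository; all names and file contents are made up for illustration.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git symbolic-ref HEAD refs/heads/master   # fix the initial branch name
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one commit per file, so the rebase cannot conflict.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit C4
git checkout -q -b dev
commit C5
commit C6
old_c6=$(git rev-parse dev)      # C6 as it exists before the rebase

git checkout -q master
commit C7                        # the quick fix on master

git checkout -q dev
git rebase -q master             # replays C5 and C6 on top of C7
new_c6=$(git rev-parse dev)      # this is C6'

# patch-id hashes only the diff, so it is equal when the change is the same.
old_patch=$(git show "$old_c6" | git patch-id --stable | cut -d' ' -f1)
new_patch=$(git show "$new_c6" | git patch-id --stable | cut -d' ' -f1)
echo "C6  commit: $old_c6  patch: $old_patch"
echo "C6' commit: $new_c6  patch: $new_patch"
```

The two commit hashes differ while the two patch ids match, which is why a history viewer shows what looks like the same commit twice.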
Short answer
You omitted the fact that you ran git push, got the following error, and then proceeded to run git pull:
Despite Git trying to be helpful, its 'git pull' advice is most likely not what you want to do.
If you are:
- Working on the branch alone, then you can run git push --force to update the remote with your post-rebase commits (as per user4405677's answer).
- Working on the branch with other developers, then you probably should not be using git rebase in the first place. To update dev with changes from master, you should, instead of running git rebase master dev, run git merge master whilst on dev (as per Justin's answer).
A slightly longer explanation
Each commit hash in Git is based on a number of factors, one of which is the hash of the commit that comes before it.
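A small sketch of that dependency, using a throwaway repository with two empty commits (the repository and the commit messages are made up for illustration). It prints the parent line stored inside the commit object and then re-hashes the raw object to show that the commit hash is derived from that content:

```shell
set -e
# Throwaway repository; names are illustrative only.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

first=$(git rev-parse HEAD~1)
head=$(git rev-parse HEAD)

# The commit object names its parent explicitly...
parent_line=$(git cat-file -p HEAD | grep '^parent ')
# ...and hashing the raw object content again reproduces the commit hash,
# so a different parent necessarily gives a different hash.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

echo "$parent_line"
echo "HEAD:       $head"
echo "recomputed: $recomputed"
```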
If you reorder commits you will change commit hashes; rebasing (when it does something) will change commit hashes. With that, running git rebase master dev, where dev is out of sync with master, will create new commits (and thus hashes) with the same content as those on dev but with the commits on master inserted before them.
You can end up in a situation like this in multiple ways. Two ways I can think of:
- You could have commits on master that you want to base your dev work on
- You could have commits on dev that have already been pushed to a remote, which you then proceed to change (reword commit messages, reorder commits, squash commits, etc.)
Let's better understand what happened. Here is an example:
You have a repository:
You then proceed to change commits.
(This is where you'll have to take my word for it: there are a number of ways to change commits in Git. In this example I changed the time of C3, but you could be inserting new commits, changing commit messages, reordering commits, squashing commits together, etc.)
This is where it is important to notice that the commit hashes are different. This is expected behaviour since you have changed something (anything) about them. This is okay, BUT:
Trying to push will show you an error (and hint that you should run git pull).
If we run git pull, we see this log:
Or, shown another way:
And now we have duplicate commits locally. If we were to run git push we would send them up to the server.
To avoid getting to this stage, we could have run git push --force (where we instead ran git pull). This would have sent our commits with the new hashes to the server without issue. To fix the issue at this stage, we can reset back to before we ran git pull:
Look at the reflog (git reflog) to see what the commit hash was before we ran git pull.
Above we see that ba7688a was the commit we were at before running git pull. With that commit hash in hand we can reset back to that (git reset --hard ba7688a) and then run git push --force.
And we're done.
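The whole sequence (rewrite a pushed commit, pull, end up with duplicates, then recover) can be reproduced end to end in a sandbox. This is a sketch under assumptions: the bare "remote", the dev branch, and the file are invented for illustration, and HEAD@{1} stands in for reading the pre-pull hash off git reflog by eye as described above:

```shell
set -e
# Throwaway bare "remote" plus a working repository; every name is illustrative.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/dev
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/remote.git"

echo one > file.txt
git add file.txt
git commit -q -m "C1"
echo two >> file.txt
git commit -q -am "C2"
git push -q -u origin dev

# Rewrite an already-pushed commit (here: reword it), which changes its hash.
git commit -q --amend -m "C2 (reworded)"
rewritten=$(git rev-parse HEAD)

# Pulling, as the push error hints, merges the old C2 back in: duplicates.
git pull -q --no-rebase --no-edit origin dev
after_pull=$(git rev-list --count HEAD)

# HEAD@{1} is the reflog entry for where HEAD pointed before the pull.
before_pull=$(git rev-parse 'HEAD@{1}')
git reset -q --hard "$before_pull"
git push -q --force origin dev

after_reset=$(git rev-list --count HEAD)
remote_head=$(git --git-dir="$work/remote.git" rev-parse refs/heads/dev)
echo "commits after pull: $after_pull, after reset: $after_reset"
```

After the reset and forced push, both the local branch and the remote point at the rewritten commit, and the merge plus the old copy of C2 are no longer part of the branch history.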
But wait, I continued to base work off of the duplicated commits
If you somehow didn't notice that the commits were duplicated and continued working atop the duplicate commits, you've really made a mess for yourself. The size of the mess is proportional to the number of commits you have atop the duplicates.
What this looks like:
Or, shown another way:
In this scenario we want to remove the duplicate commits, but keep the commits that we have based on them—we want to keep C6 through C10. As with most things, there are a number of ways to go about this:
Either:
- Create a new branch from one of the duplicated commits1, cherry-pick each commit (C6 through C10 inclusive) onto that new branch, and treat that new branch as canonical.
- Run git rebase --interactive $commit, where $commit is the commit prior to both the duplicated commits2. Here we can outright delete the lines for the duplicates.
1 It doesn't matter which of the two you choose, either ba7688a or 2a2e220 work fine.
2 In the example it would be 85f59ab.
TL;DR
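Set advice.pushNonFastForward to false:
I think you skipped an important detail when describing your steps. More specifically, your last step, git push on dev, would have actually given you an error, as you can not normally push non-fastforward changes.
So you did git pull before the last push, which resulted in a merge commit with C6 and C6' as parents, which is why both will remain listed in log. A prettier log format might have made it more obvious they are merged branches of duplicated commits.
Or you made a git pull --rebase (or without explicit --rebase if it is implied by your config) instead, which pulled the original C5 and C6 back in your local dev (and further re-rebased the following ones to new hashes, C7' C5'' C6'').
One way out of this could have been git push -f to force the push when it gave the error and wipe C5 C6 from origin, but if anyone else also had them pulled before you wiped them, you'd be in for a whole lot more trouble... basically everyone that has C5 C6 would need to do special steps to get rid of them. Which is exactly why they say you should never rebase anything that's already published. It's still doable if said "publishing" is within a small team, though.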
to force the push when it gave the error and wipe C5 C6 from origin, but if anyone else also had them pulled before you wiped them, you'd be in for a whole lot more trouble... basically everyone that has C5 C6 would need to do special steps to get rid of them. Which is exactly why they say you should never rebase anything that's already published. It's still doable if said "publishing" is within a small team, though.