I would like to know how to squash all the commits of a merged branch, like this:
feature |    c3 - merge_master - c6
            /    /                 \
master  | c1 - c2 - c5 ------------- merge_feature - c7
and I aim to have this:
master | c1 - c2 - c5 - squash_c3_c6 - c7
I found that git rebase c6 c5 --onto c6 lets me replay c3 and c6 to get c3' and c6', but I always still have c3 and c6 in my history.
I must do this in a script, to process a large repository (over 6k branches!), so I can't use git rebase -i.
Any ideas?
You can't (quite) get what you want, but what you do get may well be fine.
In Git, "squash" means "copy". More specifically, a "squash merge" uses Git's merge machinery, performing the merging action (the word merge as a verb), but then makes an ordinary commit. Suppose, for instance, you had this:
   C3--C6   <-- feature
  /
C1
  \
   C2--C5   <-- master
(note that a branch name like master or feature actually points only to one commit, namely the tip commit of the branch; and some commits are on both branches, such as C1). Running git checkout master && git merge --squash feature would then:
- choose C1 as the last place the branches shared a commit;
- choose C5 as the tip of the current branch master;
- choose C6 as the tip of feature.
These are the inputs to the merging process ("merge" as a verb). Then Git will:
- compare C1 vs C5: this is "what we changed";
- compare C1 vs C6: this is "what they changed".
Git then attempts to combine both sets of changes. If all goes well, the result is placed in the index and the work-tree (that is, it is "staged for commit"). Since you used --squash, which automatically turns on --no-commit, Git stops at this point, without making a commit, but does build a pre-loaded commit message.
You can now run git commit, which makes an ordinary commit (not a merge commit; here the word merge is an adjective, describing a type of commit). That ordinary commit would be just what you want:
   C3--C6   <-- feature
  /
C1
  \
   C2--C5--C7_which_is_3_plus_6   <-- master (HEAD)
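If you want to try this on a throwaway repository first, here is a minimal sketch. The temporary directory, the demo identity, and the commit helper are all made up for illustration; each commit just adds one file, so nothing can conflict.

```shell
# Scratch repository; the directory, identity and helper are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: "commit C1" adds C1.txt and commits it with message "C1".
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit C1
git checkout -q -b feature
commit C3
commit C6
git checkout -q master
commit C2
commit C5

git merge --squash feature                   # merge as a verb; stops before committing
git commit -q -m "C7: squash of C3 and C6"   # ordinary commit with a single parent

git log -1 --format='%s  parents: %p'        # lists exactly one parent hash
```

The new commit sits directly on top of C5 and carries the files from C3 and C6, but records nothing about feature in its ancestry.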
At this point the only sensible thing to do with the branch name feature
is delete it, abandoning original commits C3
and C6
(they will eventually, or perhaps even very soon, be garbage-collected: deleting the branch also delete's the branch's reflogs that protect them).
The log message for C7 is anything you want it to be, but depending on how you configure Git you can get it to default to combining existing log messages from C3 and C6 (set merge.log to true or a number that is at least 2, or use --log).
Note that the new commit C7 takes what was done in the original C3 and C6 (minus anything already duplicated by C5). This is what, presumably, makes it safe to delete feature entirely like this.
Unfortunately, that's not what you have. You already have a merge commit reachable from the tip of feature, giving the graph that you drew:
      C3--C4--C6   <-- feature
     /   /
   C1--C2--C5   <-- master
The merge base of feature and master is now not commit C1, but rather commit C2. This is because when we start from master and work backwards, we find commits C5, then C2, then C1; and when we start from feature and work backwards, we find C6, then C4, then both C3 and C2 simultaneously; and then C1.
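You can check this with git merge-base. The sketch below rebuilds the graph above in a scratch repository; the temporary directory, the demo identity, and the commit helper are assumptions for illustration only.

```shell
# Scratch repository; the directory, identity and helper are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: "commit C1" adds C1.txt and commits it with message "C1".
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit C1
git checkout -q -b feature
commit C3
git checkout -q master
commit C2
git checkout -q feature
git merge -q -m C4 master          # C4: merge of master (at C2) into feature
commit C6
git checkout -q master
commit C5

base=$(git merge-base master feature)
git log -1 --format=%s "$base"     # prints: C2
```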
This means that C2, not C1, is the "nearest" commit to both branch tips that is on both branches. Using git merge --squash when on master will then:
- diff C2 vs C5: what we changed;
- diff C2 vs C6: what they changed;
- combine the changes, and stop due to the implied --no-commit.
You can then make a new commit C7:
      C3--C4--C6   <-- feature
     /   /
   C1--C2--C5----C7   <-- master
Again, the only sensible thing to do with feature is to delete it. The result will be the sequence ...--C1--C2--C5--C7. This is the same sequence of commits as before, and more importantly, the tree (source contents) associated with C7 should be the same as the tree you'd get without C4, as long as C4 itself is not an evil merge and C6 does not undo part of C4.
The reason for this is that C4 includes the changes from C3, and obviously C6 includes the changes from C6. This means that when Git ran git diff C2 C6 it saw the changes in both C3 and C6: those are part of one of the two inputs to the merge-as-a-verb process. Hence the new C7 contains all the changes you want.
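Here is the whole sequence as a minimal sketch on a scratch repository, again with a made-up temporary directory, demo identity, and commit helper, and with one file per commit so that the diff is easy to read. Note that it uses git branch -D rather than -d: since the squash commit records no ancestry, Git does not consider feature merged.

```shell
# Scratch repository; the directory, identity and helper are illustrative only.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: "commit C1" adds C1.txt and commits it with message "C1".
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit C1
git checkout -q -b feature
commit C3
git checkout -q master
commit C2
c2=$(git rev-parse HEAD)           # remember C2, the merge base
git checkout -q feature
git merge -q -m C4 master          # C4
commit C6
git checkout -q master
commit C5

# "What they changed": merge base vs tip of feature covers both C3 and C6.
changed=$(git diff --name-only "$c2" feature)
echo "$changed"                    # prints C3.txt and C6.txt, one per line

git merge --squash feature
git commit -q -m "C7: squash of C3 and C6"
git branch -D feature              # -d would refuse: feature is not an ancestor of master
git log --format=%s                # C7..., C5, C2, C1
```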
What it doesn't have, depending on --log and merge.log settings, is automatic population of the log message. But you can edit the log message for C7 any way you like, including finding an earlier commit like C1 and using git log --pretty=short --no-merges <hash>..feature.
I think what you are looking for is git filter-branch, which would remove the second parents from the merge commits.
Something like (not tested!):
git filter-branch --parent-filter 'read pkey pvalue rest && echo "$pkey $pvalue"' branch_first_commit..branch
(it might be faster with sed or awk but I'm afraid to make some mistake with quoting. It may also fail on the first commit)
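To see what that filter body does without touching a repository, you can feed it sample parent strings by hand. git filter-branch passes the parent filter a string of the form "-p parent1 -p parent2 ..." on stdin (empty for a root commit). The function name and the fake hashes below are made up for illustration.

```shell
# The filter body from above, wrapped in a function (name is hypothetical)
# so it can be tried on hand-written input.
first_parent_only() {
    read pkey pvalue rest && echo "$pkey $pvalue"
}

# A merge commit: two parents come in, only the first goes out.
echo "-p 1111111 -p 2222222" | first_parent_only    # prints: -p 1111111

# An ordinary commit: the single parent passes through unchanged.
echo "-p 1111111" | first_parent_only               # prints: -p 1111111

# A root commit: empty input, so read fails and the filter exits non-zero.
printf '' | first_parent_only || echo "filter failed on empty input"
```

The last case is the "may fail on the first commit" caveat: on empty input, read returns a non-zero status and nothing is echoed.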
Explanation: the older history could have conflicts, so replaying the branch's history cannot be done reliably and automatically. But the "squashed merge" is just the merge with its second parent removed. And you already have the merges, so you only need to remove the parents.
So I played with git filter-branch --parent-filter. It lets me flatten all the merged branch commits onto my master, which is useful, but it loses the right ordering of commits => C1--C3--C2--C4--C5. The result is good but the history becomes inconsistent; I was expecting something like C1--C2--C3--C4--C5, and even then I would still have to squash C3 and C4.
@torek thanks for the explanations. I now better understand that C2 and C3 are equally ancestors of C4, without any notion of origin like "commit from feature branch" or "commit from merge branch". And there is no special "merge commit", just a commit that does a merge.
I found that there is a way to script the interactive rebase mode, and since I can achieve what I want with an interactive rebase, I'm going to script this.
I'll keep you informed.