I have a remote with a history that looks like this:
As you can see O and P are merge commits and both of them closed their old branch so now there's only one branch.
I want to squash C-D-E-G-J-K-L-N into one commit and F-H-I-M into another commit because they are just tiny commits cluttering the history.
Locally I managed to squash C-D-E-G-J-K-L-N using the method described in the answer by John O'M. to this question, like this:
git checkout -b squashing-1 N
git reset --soft C~
git commit -m "Squashed history"
git replace N [ID_of_the_commit_i_just_made]
and this works: locally, git log from main-branch correctly reports Q, P, O, X, M, I, etc. (X is the new squashed commit).
From here the next steps would be to (1) check out the main branch and merge in the changes, (2) delete the temporary local branch, then (3) push the changes to the remote repo. But (1) and (3) report "Already up to date" or "Everything up-to-date", since there are no actual changes to the tree, which is exactly the point of all this.
I've tried using git push --force origin main-branch and git push --force-with-lease origin main-branch too, but I got the same result: "Everything up-to-date".
How can I correctly merge in these history changes and push them to BitBucket without having to re-create the entire repo?
You essentially have a choice to make: do you wish to make everyone use the replacement references, or do you prefer to rewrite the entire repository and make everyone have a big flag-day during which they switch from "old repository" to "new repository"? Neither method is particularly fun or profitable. :-)
How replacements work
What git replace does is to add a new object into the Git repository and give it a name in the refs/replace/ name-space. The name in this name-space is the hash ID of the object that the new object replaces. For instance, if you're replacing commit 1234567..., the new object (whose ID is not 1234567...; for concreteness, let's say it's fedcba9... instead) gets the name refs/replace/1234567... in that name-space.
The rest of Git, when looking for objects, checks first to see if there is a refs/replace/<hash-id> reference. If so (and replacing is not disabled), the rest of Git then returns the object to which the refs/replace/ name points, instead of the actual object. So when some other part of Git reads some commit that says "my parent commit is 1234567...", that other part of Git goes to find 1234567..., sees that refs/replace/1234567... exists, and returns object fedcba9... instead. You then see the replacement.
If you do not have the reference refs/replace/1234567..., though, your Git never swaps in the replacement object. (This is true whether or not you have the replacement object. It's the reference itself that causes the replacement to occur. Having the reference guarantees that you have the object.)
Hence, for some other Git to execute this same replacement process, you must deliver the refs/replace/ reference to that other Git.
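To see this mechanism in isolation, here is a minimal sketch in a throwaway repository. The directory, file name, branch name squashing and the four-commit history A-B-C-D are all made up for illustration; they are not the history from the question.

```shell
set -e
# Scratch repository; all names below are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Linear history A - B - C - D on main.
for msg in A B C D; do
    echo "$msg" >> file.txt
    git add file.txt
    git commit -q -m "$msg"
done

old=$(git rev-parse main~1)          # commit C, the tip of the range to squash

# Squash B and C into one commit X on a temporary branch.
git checkout -q -b squashing main~1
git reset -q --soft main~3           # back to A, keeping C's content staged
git commit -q -m "X (B and C squashed)"
new=$(git rev-parse HEAD)
git checkout -q main

git replace "$old" "$new"

# The replace ref is named after the OLD hash and points at the NEW commit.
ref_target=$(git rev-parse "refs/replace/$old")

# Commits reachable from main, with and without the replacement applied.
with_replace=$(git rev-list --count main)
without_replace=$(git --no-replace-objects rev-list --count main)

echo "refs/replace/$old -> $ref_target"
echo "commits on main: $with_replace with replacement, $without_replace without"
```

In this sketch main itself never moves: only the extra reference under refs/replace/ changes what history walking reports (3 commits instead of 4).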
Transferring replacements from one Git to another
In general, you would push such objects with:
git push <repository> 'refs/replace/*:refs/replace/*'
(or specifically list the one replace reference you wish to push). To fetch these objects:
git fetch <repository> 'refs/replace/*:refs/replace/*'
(You can add this fetch refspec to the fetch configuration in each clone. Using git fetch or git fetch <repository> will then automatically pick up any new replacement objects pushed. Pushing is still a pain, and of course this step has to be repeated on each new clone.)
Note that neither refspec here sets the force flag. It's up to you whether you want to force-overwrite existing refs/replace/ references, should such a thing happen.
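As a rough end-to-end sketch of that round trip, the following uses a local bare repository as a stand-in for the real remote. The names server.git, alice and bob, and the three-commit history, are assumptions made purely for the demonstration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# "Server": a bare repository standing in for the shared remote (hypothetical name).
git init -q --bare server.git
git -C server.git symbolic-ref HEAD refs/heads/main

# First clone: create history A - B - C and replace C with a squashed commit.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
for msg in A B C; do
    echo "$msg" >> file.txt
    git add file.txt
    git commit -q -m "$msg"
done
old=$(git rev-parse main)
# New commit with C's tree but A as parent, i.e. B and C squashed.
new=$(git commit-tree "main^{tree}" -p main~2 -m "B and C squashed")
git replace "$old" "$new"
git remote add origin ../server.git
git push -q origin main
git push -q origin 'refs/replace/*:refs/replace/*'
cd ..

# Second clone: an ordinary clone does not receive refs/replace/*.
git clone -q server.git bob
cd bob
before=$(git for-each-ref refs/replace | wc -l | tr -d ' ')

# Add the refspec once; later plain "git fetch" runs pick up replacements too.
git config --add remote.origin.fetch 'refs/replace/*:refs/replace/*'
git fetch -q origin
after=$(git for-each-ref refs/replace | wc -l | tr -d ' ')
fetched=$(git rev-parse "refs/replace/$old")

echo "replace refs in second clone: $before before fetch, $after after"
```

The refspec is added without a leading +, matching the non-forced form above; prefix it with + if you want fetches to overwrite a replace ref that has changed upstream.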
Rewriting a repository
Alternatively, once you have replacements in place, you can run a repository-copying operation such as git filter-branch (by this, I mean a commit-by-commit copy, not a fast copy like git clone --mirror). If this copying operation is run without disabling replacements, the replaced objects are not copied; instead, their replacements are copied. Hence:
git filter-branch --tag-name-filter cat -- --all
has the side effect of "cementing replacements" forever in the copied repository. You may then discard all the original references and all the replacement references. (The easy way to do this is to clone the filtered repository.)
Of course, since this is a new and different repository, it is not compatible with the original repository or any of its clones. But it no longer requires careful coordination of the refs/replace/ name-space (since it no longer has any replacement objects!).
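A small sketch of the "cementing" effect, again in a scratch repository with a made-up A-B-C-D history. It assumes git filter-branch is installed alongside git; FILTER_BRANCH_SQUELCH_WARNING only silences the notice that newer Git versions print before running it.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's advisory notice
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
for msg in A B C D; do
    echo "$msg" >> file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# Replace C with a commit that has C's tree but A as its parent (B and C squashed).
old=$(git rev-parse main~1)
new=$(git commit-tree "main~1^{tree}" -p main~3 -m "B and C squashed")
git replace "$old" "$new"

real_before=$(git --no-replace-objects rev-list --count main)

# Copy every commit with replacements enabled; the copy of D records the
# squashed commit as its real parent.
git filter-branch --tag-name-filter cat -- --all >/dev/null 2>&1

# The replace ref is no longer needed; the rewritten history stands on its own.
git replace -d "$old" >/dev/null
real_after=$(git --no-replace-objects rev-list --count main)

echo "real commits on main: $real_before before, $real_after after"
```

The old commits are still reachable through the refs/original/ backups that filter-branch leaves behind, which is why cloning the filtered repository is the easy way to drop them.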
From here the next steps would be to (1) check out the main branch and merge in the changes, (2) delete the temporary local branch, then (3) push the changes to the remote repo.
It seems you misunderstand what git replace really did. There is nothing to merge, because the true history isn't changed in any way by git replace. Rather, replace makes a note off to the side that says "by default, when browsing the history, if you find this object, substitute this one instead". You actually can still see the real history, e.g. with git --no-replace-objects log.
So replace creates the illusion of a rewritten history. Because it isn't a true rewrite, it doesn't create an "upstream rebase" situation for other developers, which is pretty cool. On the other hand, it cannot be trusted as a way to scrub sensitive data from the repo, since the rewrite really is just an illusion. And the output you get from git commands can be misleading, in that it can imply that the "real object" SHA ID is associated with the "replacement object" content (when in fact it's essentially certain that said content would not hash to said SHA).
What you really need to do if you decide to go ahead and share the replacement with origin is
git push origin 'refs/replace/*'
Be aware that there are a few known bugs/quirks, and the documentation suggests that there may be unknown bugs/quirks.
Note: you will need to make sure the server allows it: a new configuration variable core.usereplacerefs has been added with Git 2.19 (Q3 2018), primarily to help server installations that want to ignore the replace mechanism altogether.
See commit da4398d, commit 6ebd1ca, commit 72470aa (18 Jul 2018) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 1689c22, 15 Aug 2018)
add core.usereplacerefs config option
We can already disable replace refs using a command line option or environment variable, but those are awkward to apply universally. Let's add a config option to do the same thing.
That raises the question of why one might want to do so universally. The answer is that replace refs violate the immutability of objects. For instance, if you wanted to cache the diff between commit XYZ and its parent, then in theory that never changes; the hash XYZ represents the total state.
But replace refs violate that; pushing up a new ref may create a completely new diff.
The obvious "if it hurts, don't do it" answer is not to create replace refs if you're doing this kind of caching.
But for a site hosting arbitrary repositories, they may want to allow users to share replace refs with each other, but not actually respect them on the site (because the caching is more important than the replace feature).
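To illustrate what the option does, here is a minimal sketch on a scratch repository (made-up A-B-C-D history, Git 2.19 or later assumed). The replace ref stays in the repository, but with the setting turned off Git reads the original objects.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
for msg in A B C D; do
    echo "$msg" >> file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# Replace C with a squashed commit whose parent is A.
old=$(git rev-parse main~1)
new=$(git commit-tree "main~1^{tree}" -p main~3 -m "B and C squashed")
git replace "$old" "$new"

# Default behaviour: the replacement is honoured.
default_view=$(git rev-list --count main)

# Repository-wide opt-out, equivalent to passing --no-replace-objects everywhere.
git config core.useReplaceRefs false
ignoring_view=$(git rev-list --count main)
still_present=$(git for-each-ref refs/replace | wc -l | tr -d ' ')

echo "commits on main: $default_view by default, $ignoring_view with core.useReplaceRefs=false"
```

A host that sets this can still store and serve refs/replace/ references for its users, while its own history views and caches keep showing the original objects.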