Git stash apply did not return working directory?

Posted 2020-06-06 06:34

Question:

I committed and pushed some files to the remote. Then I found that something in the commit was wrong and wanted to revert the push to make some edits. I stashed, reverted, and tried to re-apply the stash, but after applying it, my working directory is still missing the files. Please help. Here's the history.

$ git commit -m "Model package"
[dev ec5e61d] Model package
 40 files changed, 1306 insertions(+), 110 deletions(-)

$ git push

$ git log
commit ec5e61d2064a351f59f99480f1bf95927abcd419
Author: Me
Date:   Mon Feb 6 

    Model package


$ git revert ec5e61d2064a351f59f99480f1bf95927abcd419
error: Your local changes to the following files would be overwritten by merge:
        model/R/train.R
Please, commit your changes or stash them before you can merge.
Aborting



$ git stash
Saved working directory and index state WIP on dev: ec5e61d Model package
HEAD is now at ec5e61d Model package


$ git revert ec5e61d2064a351f59f99480f1bf95927abcd419
[dev 062f107] Revert "Model package"
 40 files changed, 135 insertions(+), 1331 deletions(-)


$ git stash apply
CONFLICT (modify/delete): model/R/train.R deleted in Updated upstream and modified in Stashed changes. Version Stashed changes of model/R/train.R left in tree.


$ git stash apply
model/R/train.R: needs merge
unable to refresh index

$ git commit model/R/train.R
[dev ed41d20] Resolve merge conflict
 1 file changed, 138 insertions(+)
 create mode 100644 model/R/train.R


$ git stash apply
On branch dev
Your branch is ahead of 'origin/dev' by 2 commits.
  (use "git push" to publish your local commits)
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   scripts/gm.r

But my files didn't return from the stash!

$ git stash list
stash@{0}: WIP on dev: ec5e61d Model package

$ git stash show
 model/R/train.R | 79 ++++++++++++++++++++++----------------------
 scripts/gm.r     | 87 ++++++++++++++++++++++++++++++++-----------------
 2 files changed, 97 insertions(+), 69 deletions(-)

Answer 1:

You have mixed together quite a few things here. In particular, you have used git stash and git revert, while also making various commits. We need to pick these apart.

What git stash does

While git stash can become very complicated,[1] its normal use is fairly simple. You create a stash, which is a variety of commit. This has the side effect of "cleaning" the work-tree and index, a la git reset --hard HEAD (in fact there's a literal git reset --hard HEAD command inside git stash). Then you switch to some other commit, git stash apply the stash, and if all goes well, git stash drop the stash.

The apply step converts the special stash commit you made to a patch, and then applies it like git apply -3 would, doing a three-way merge if necessary. Hence, if you wanted to emulate this simplified process with ordinary git commit, you could do the following. You don't need the git reset --hard part, because you make a real branch for the new commit:

git checkout -b tempbranch
git add -A
git commit -m 'stash my work-in-progress'
git checkout <something else>
git show tempbranch | git apply -3

To drop the commit (like dropping the stash) you would then simply delete the temporary branch:

git branch -D tempbranch

Nonetheless, however you saved and applied the stash, extracting the stash can result in performing a three-way merge. As a result, it can have a merge conflict. This is what you encountered.
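As a minimal sketch of that normal save / apply / drop cycle, the following runs in a throwaway repository. The file name notes.txt, its contents, and the demo identity are all made up for illustration:

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "initial"

echo "two" >> notes.txt                    # uncommitted work in progress
git stash                                  # save it, then clean index and work-tree
lines_after_stash=$(wc -l < notes.txt | tr -d ' ')

git stash apply                            # replay the saved change as a patch
lines_after_apply=$(wc -l < notes.txt | tr -d ' ')

git stash drop                             # discard the stash once you are done
remaining=$(git stash list | wc -l | tr -d ' ')

echo "lines after stash: $lines_after_stash"
echo "lines after apply: $lines_after_apply"
echo "stashes remaining: $remaining"
```

Here the apply happens on the same commit the stash was made from, so there is nothing to merge and no conflict is possible.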


[1] The main complication is this: git stash saves, separately, both the current index and the current work-tree, as two commits (on no branch). It is therefore a lot like running git commit twice. But, when you use git stash apply, it combines those two commits into one change in the work-tree, by default.

If you did not stage anything before making the stash, this complication becomes irrelevant. Or, if you staged everything, and then did not change the work-tree, that also makes this complication irrelevant. In any case, if you apply the stash without --index, the apply step will, if needed, favor the work-tree changes over the index changes.

As another complication, if you add the -u or -a option to your git stash save step, Git saves a third commit. These three-commit stashes are harder to apply and should be used with care.
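If you want to see the two commits for yourself, something like the following sketch should do it. It relies on the work-tree commit of a stash being a merge-like commit whose first parent is the HEAD at stash time and whose second parent is the saved index. The file f.txt and the demo identity are invented for the example:

```shell
# Throwaway repository; f.txt and the identity are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > f.txt
git add f.txt
git commit -q -m "base"

echo "staged" >> f.txt
git add f.txt                              # this change is recorded in the index
echo "unstaged" >> f.txt                   # this one exists only in the work-tree
git stash

head_commit=$(git rev-parse HEAD)
first_parent=$(git rev-parse 'stash@{0}^1')     # the commit the stash hangs off
parent_count=$(git log -1 --format=%P 'stash@{0}' | wc -w | tr -d ' ')

# The index commit holds the staged state, the stash commit the work-tree state.
index_lines=$(git show 'stash@{0}^2:f.txt' | wc -l | tr -d ' ')
worktree_lines=$(git show 'stash@{0}:f.txt' | wc -l | tr -d ' ')

echo "parents: $parent_count, index lines: $index_lines, work-tree lines: $worktree_lines"
```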


Commits and patches

Before we go on to the merge conflict, let's take a very brief look at what is in a Git commit, and what a patch (or diff) is.

A commit is a snapshot. It's all the files you told Git to keep track of, in the form they had when you ran git commit. More specifically, it's what was in the index at that time. The index, also called the staging area, is where you tell Git what to save in the next commit, using git add to update items in the index.

A commit is not a patch. It's not a change: it's a snapshot. However, every[2] commit has a parent (or "previous") commit, and Git can easily compare a commit against its parent. This turns the commit into a patch:

  • You saved A, B, and C last time.
  • You saved A, B, and D this time.
  • Therefore, you deleted C, and added D.

Hence a commit can become a patch. A patch is simply instructions: add these things, delete these other things. You (or Git) can apply the patch to some other commit, and then make a new commit from the result. If the application deletes C and adds D, and you make a new commit, then the new commit's patch is "delete C and add D"—which is also the old commit's patch.

Turning a commit into a patch, and then playing it again elsewhere, repeats the change.
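The A, B, C / A, B, D example above can be reproduced in a scratch repository. This is only a sketch: the file names follow the bullets and the contents are arbitrary. I pass --no-renames so that Git reports a plain delete plus add rather than guessing at a rename:

```shell
# Throwaway repository; file contents are arbitrary placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "a" > A; echo "b" > B; echo "c" > C
git add A B C
git commit -q -m "save A, B, and C"

git rm -q C
echo "something entirely different" > D
git add D
git commit -q -m "save A, B, and D"

# Compare the snapshot against its parent: this is the commit "as a patch".
patch_summary=$(git diff --no-renames --name-status HEAD~1 HEAD)
echo "$patch_summary"
```

The summary lists C with status D (deleted) and D with status A (added), which is exactly the "delete C and add D" instruction set.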


[2] At least one commit (the first one you ever make) is special because it has no previous commit, i.e., no parent. In addition, merge commits are special because they have two (or even more) parents. But we're mostly interested in ordinary, one-parent commits here.

Because git stash makes commits, the stash commits also have parents. This is, in fact, how Git turns stashes into patches, much the same way Git turns ordinary commits into patches.


Using git cherry-pick and git revert

(Note: you only used git revert directly, but I include both here because they are very similar.)

When we turn a commit into a patch, as above, and then apply it to another branch,[3] this is a git cherry-pick operation. We're saying "the change we made last time over there, is a good idea: let's make the same change again, over here." We can make the same change to identical files (this always works) or to files that are merely sufficiently similar, so that we can in fact make the same changes.

The essential difference between "apply a patch" and "make a cherry-pick" is that "make a cherry-pick" takes a source commit, instead of just a patch. By default, git cherry-pick also makes a new commit from it automatically—so this can re-use the original commit's commit message, too.

Once you understand patches and how git cherry-pick works, what git revert does becomes trivially obvious: it simply reverse-applies the patch. You point Git at some particular commit and tell it to undo whatever that commit did. Git turns that commit into a patch; the patch says "delete C and add D"; and Git then deletes D and adds C. As with git cherry-pick, git revert makes a new commit from the result, by default.
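To make the "reverse-apply" idea concrete, here is a small sketch in a scratch repository (file names and contents are made up). After reverting the most recent commit, the new commit's snapshot should be identical to the snapshot from before that commit, which we can check by comparing tree IDs:

```shell
# Throwaway repository; names are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "a" > A; echo "c" > C
git add A C
git commit -q -m "base"

git rm -q C
echo "d" > D
git add D
git commit -q -m "delete C, add D"

# Reverse-apply that commit's patch: add C back, delete D, commit the result.
git revert --no-edit HEAD > /dev/null

tree_before=$(git rev-parse 'HEAD~2^{tree}')    # snapshot before the change
tree_after=$(git rev-parse 'HEAD^{tree}')       # snapshot after the revert
echo "before: $tree_before"
echo "after:  $tree_after"
```

Note that the history now has three commits. The revert does not remove anything from history, it adds a new commit that undoes the earlier one.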


[3] Technically, we need only apply it to another commit. But this gets into the question of what we mean by "branch": see What exactly do we mean by "branch"?


Merge conflicts

The idea behind a merge is easy enough. We—whoever "we" are—made some changes, and they—whoever "they" are—made some changes—but we both started from the same snapshot. We ask Git to combine our changes and their changes. For instance, suppose that you both started with F1, F2, and F3, and you changed file F1 and they didn't, and they changed file F2 and you didn't. That's really easy to combine: take your modified F1 and their modified F2. In F3, you changed a line near the top of the file, and they changed a line near the bottom. That's a little harder to combine, but still not a problem: Git just takes both changes.

Some changes, though, can't be combined at all. These changes are merge conflicts.

Merge conflicts have a number of forms. The most common ones occur when both "you" (in your commits) and "they" (in their commits) modify the same line(s) in the same file(s). This happens when we take our commit(s) and turn it/them into a patch—"delete C and add D"—but their change says to keep C and add E instead. Git doesn't know which change to use: should we keep C and add both D and E, or delete C and keep only E, or what?

This is not what you got, though: you got a modify/delete conflict. A modify/delete conflict occurs when you said that, in file train.R, we should delete C and add D—and they said "throw away the whole train.R file".

In all cases of merge conflicts, Git throws up its metaphorical hands and temporarily gives up doing the merge. It writes into your work-tree the conflicted file(s), declares the merge conflict, and stops. It is now your job to finish the merge—by coming up with the correct snapshot—and then git add and git commit the result.
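A modify/delete conflict of this kind is easy to provoke in a scratch repository. The sketch below is my reconstruction of the general shape of your situation, not your actual files: train.R and its contents are placeholders.

```shell
# Throwaway repository; train.R and its contents are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > README
git add README
git commit -q -m "base"

printf 'v1\n' > train.R
git add train.R
git commit -q -m "Model package"           # this commit adds train.R

printf 'v2\n' > train.R                    # uncommitted modification
git stash                                  # stash says: "change train.R"
git revert --no-edit HEAD > /dev/null      # revert says: "delete train.R"

# Git cannot combine "delete the file" with "modify the file", so this
# stops with CONFLICT (modify/delete) and a non-zero exit status.
git stash apply || echo "stash apply stopped with a conflict"

unmerged=$(git ls-files -u -- train.R | wc -l | tr -d ' ')
content=$(cat train.R)
echo "unmerged index entries for train.R: $unmerged"
echo "work-tree content of train.R: $content"
```

The file left in the work-tree is the stashed version, and the index records train.R as unmerged until you git add (or git rm) it.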

Merge conflicts can, of course, occur when you're doing a git merge (and at this time it's pretty clear which is "ours" and which is "theirs").[4] But they can also occur during git apply -3 (the -3 means "use three way merge"), git cherry-pick, and git revert. If the patch does not apply cleanly, Git can figure out which common base to use and then figure out the two parts of the changes: what's in the patch you are applying, and what changed from the common base to the commit you are on right now, that you're trying to patch with git cherry-pick or git revert.


[4] The "ours" version is, in all cases, the HEAD version. This is true during rebase as well, but rebase works by doing a repeated series of cherry-picks, and at this time, HEAD is the new replacement branch you're building. See this comment by CommaToast: "since the head is the seat of the mind, which is the source of identity, which is the source of self, it makes a bit more sense to think of whatever HEAD's pointing to as being ‘mine’ ...".


One last note about git commit

I mentioned above that you normally git add files to copy them from the work-tree to the index / staging area. This also marks the file as resolved, during a merge conflict, which is why it's your job to edit the file first, then use git add on it. Once you have fixed up and git add-ed all the files, you can git commit the resulting merge. This finishes the merge, or the cherry pick or revert or rebase step or whatever it was you were doing that had the conflict.

When you run git commit like this, with no specific named files, Git commits the index (staging-area) contents. As a result, the new HEAD commit matches the index. We'll use that fact in a moment.

If you run git commit <path>, though, this automatically stages the named file (as if you had run git add <path>), then commits... well, something. This part gets a bit tricky. The default is to behave as though you ran git commit --only <path>.

With --only, Git doesn't take any other already-staged files. That is, if you've git added five files, none named README.txt, and then you run git commit README.txt, Git makes a new commit using the work-tree version of README.txt but the HEAD version of the other five files. In other words, it only changes the named files, preserving the previous versions of all other files.

With --include, Git stages the named <file> too, and commits the entire result. That is, this is like doing git add <file> before running git commit. This is much easier to explain than the --only behavior, but of course if you want this, you can just git add <file>.

In either case, though, once the new commit is made, <file> has been staged, as if you had git add-ed it. So in our --only example, with five files add-ed, the index now has all six files updated. Of course, there's a new HEAD commit, and README.txt now matches the new HEAD commit, so there are only five differences staged.
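The default --only behavior can be watched in a scratch repository. This sketch uses two staged files instead of five to keep it short; the file names a.txt and b.txt are invented, and README.txt plays the same role as in the text above:

```shell
# Throwaway repository; a.txt and b.txt are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 1 > a.txt; echo 1 > b.txt; echo 1 > README.txt
git add a.txt b.txt README.txt
git commit -q -m "base"

echo 2 > a.txt; echo 2 > b.txt
git add a.txt b.txt                        # two files staged
echo 2 > README.txt                        # modified but NOT staged

# Path form of git commit: behaves like "git commit --only README.txt".
git commit -q -m "only README" README.txt

committed=$(git diff --name-only HEAD~1 HEAD)   # what the new commit changed
still_staged=$(git diff --cached --name-only)   # what is still waiting in the index
echo "committed: $committed"
echo "still staged:"
echo "$still_staged"
```

The new commit changes only README.txt, while a.txt and b.txt remain staged for some later commit.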

So, now let's look at what you did

The first command you show is this:

$ git commit -m "Model package"
[dev ec5e61d] Model package
 40 files changed, 1306 insertions(+), 110 deletions(-)

We don't see any of your git adds, but you must have added 40 files so that Git said "40 files changed". There must also have been something you did not git add, as we will see in a moment.

Next, you show:

$ git push

with no output. It's not immediately clear what, if anything, this pushed (this depends on your push.default configuration, and probably whether the current branch has an upstream setting, and so on). However, no matter what you may have pushed, your own local repository contents are not changed, and later evidence shows that the push succeeded, pushing commit ec5e61d to origin.

Now you show two commands with some output:

$ git log
commit ec5e61d2064a351f59f99480f1bf95927abcd419
Author: Me
Date:   Mon Feb 6 

    Model package

This is the commit you made: ec5e61d, on branch dev, as shown by the git commit output.

$ git revert ec5e61d2064a351f59f99480f1bf95927abcd419
error: Your local changes to the following files would be overwritten by merge:
        model/R/train.R
Please, commit your changes or stash them before you can merge.
Aborting

This is interesting, because it shows that you have changes in your work-tree to model/R/train.R even though you ran git commit recently, making the ec5e61d commit.

This means that whatever was in the index for model/R/train.R did not match what is now in the work-tree for model/R/train.R. We can't tell, from this, precisely what was in the index, just that it does not match the work-tree.

When git revert said Aborting, it meant that it did nothing at all: it found that your work-tree was not "clean", i.e., did not match the HEAD commit ec5e61d. By default git revert requires that the work-tree match the HEAD commit.
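You can confirm that an aborted revert really changes nothing with a sketch like this one, again in a scratch repository with placeholder content. It records HEAD before and after the failed revert and checks that the uncommitted edit is still there:

```shell
# Throwaway repository; train.R and its contents are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > README
git add README
git commit -q -m "base"

printf 'line1\nline2\n' > train.R
git add train.R
git commit -q -m "Model package"

echo "uncommitted edit" >> train.R         # work-tree no longer matches HEAD

head_before=$(git rev-parse HEAD)
if git revert --no-edit HEAD 2> /dev/null; then
    revert_status="reverted"
else
    revert_status="refused"                # expected: local changes would be overwritten
fi
head_after=$(git rev-parse HEAD)
lines_now=$(wc -l < train.R | tr -d ' ')

echo "revert: $revert_status"
echo "HEAD unchanged: $([ "$head_before" = "$head_after" ] && echo yes || echo no)"
```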

Next, you ran:

$ git stash
Saved working directory and index state WIP on dev: ec5e61d Model package
HEAD is now at ec5e61d Model package

The git stash command made its commit—really, two commits, one for the index and one for the work-tree; but the index is the same as the HEAD commit, so all we really have to care about is the work-tree commit—and then "cleaned up" the work-tree to match the HEAD commit.

In other words, the stash commit—or rather, the one of the two that we care about—has the work-tree version of model/R/train.R. It also has the work-tree version of any other files that were modified (and tracked) but not yet committed, and the work-tree version of all remaining files that still match the HEAD or ec5e61d commit as well. (Every commit, as always, has every file in it. That's what "being a snapshot" means. We'll be able to see, in a moment, more about what's different between the stashed work-tree commit and the ec5e61d commit.)

Once git stash has stashed all of these, it uses git reset --hard HEAD to discard the index and work-tree changes. (They are safely saved away in the stash, hence no longer needed here in the index and work-tree.) So now your model/R/train.R file in your work-tree matches that in HEAD.

Now you re-run your git revert:

$ git revert ec5e61d2064a351f59f99480f1bf95927abcd419
[dev 062f107] Revert "Model package"
 40 files changed, 135 insertions(+), 1331 deletions(-)

and this time, since you're reverting the very change you just committed, "reverse patching" easily succeeds (it's trivial to undo what you just did). Git makes a new commit, 062f107, on branch dev, so that the last two commits are:

... <- ec5e61d <- 062f107   <-- dev

The stash itself is attached to commit ec5e61d (each stash is directly attached to the commit that was HEAD at the time you made the stash).

Now you try to apply the stash, but this fails with a merge conflict:

$ git stash apply
CONFLICT (modify/delete): model/R/train.R deleted in Updated upstream
and modified in Stashed changes. Version Stashed changes of
model/R/train.R left in tree.

This is particularly interesting, as it means that model/R/train.R must have been deleted by the revert commit 062f107, which means it must have been added in commit ec5e61d. Meanwhile, the stash commit, when converted to a patch, says "make this change to model/R/train.R."

Because Git did not know how to combine "remove this file completely" with "make this change", it left the entire stashed work-tree version of model/R/train.R in the work-tree. It's now your job to figure out what should be in the work-tree, and git add that.

Meanwhile, any other changes Git should apply as a patch, Git did apply as a patch, to all the other files. These changes are staged for commit, i.e., the index is updated already for these files.

Next, you ran:

$ git stash apply
model/R/train.R: needs merge
unable to refresh index

This tries to apply the stash again, but can't, so (fortunately) it does nothing. This leaves the unfinished merge unfinished.

Then you ran:

$ git commit model/R/train.R
[dev ed41d20] Resolve merge conflict
 1 file changed, 138 insertions(+)
 create mode 100644 model/R/train.R

Note that this has the git commit <file> form. This git adds model/R/train.R from the work-tree, then skips the other added files and makes a new HEAD commit ed41d20 from the result. So this new commit has all the same files as the old HEAD 062f107 except for model/R/train.R:

... <- ec5e61d <- 062f107 <- ed41d20   <-- dev

Now you ran git stash apply yet again:

$ git stash apply
On branch dev
Your branch is ahead of 'origin/dev' by 2 commits.
  (use "git push" to publish your local commits)
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   scripts/gm.r

Just as last time, this turns the stash commit into a patch and attempts to apply the patch. The patch doesn't apply properly, so Git tries a three-way merge: find the base version, find what's in your HEAD (ed41d20) version ("ours"), diff those, and combine that diff with the patch. That diff turns out to be exactly the same as the patch, i.e., the patch is already applied. So you get the output of git status at the end, which shows one modified file, and the two commits—these must, of course, be 062f107 and ed41d20—that you have that origin does not yet have.

Last, you show these:

$ git stash list
stash@{0}: WIP on dev: ec5e61d Model package

$ git stash show
 model/R/train.R | 79 ++++++++++++++++++++++----------------------
 scripts/gm.r     | 87 ++++++++++++++++++++++++++++++++-----------------
 2 files changed, 97 insertions(+), 69 deletions(-)

which finish out our knowledge of what's in the stash: it's just like commit ec5e61d except for two files, model/R/train.R and scripts/gm.r. The git diff instructions for converting from commit ec5e61d into the work-tree commit in the stash[5] say to delete 69 original lines, and insert 97 replacement lines, in those two files.

That's what git stash apply tried to do, only it was unable to delete any lines from a completely-deleted model/R/train.R, so it just put the entire saved-work-tree stash@{0} version of file model/R/train.R in the work-tree.
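If you would rather look at the stash directly than reason about it, git stash show is, as far as I know, just a diff between the commit the stash hangs off and the stash's work-tree commit. The sketch below checks that in a scratch repository. The paths mirror yours, but the contents are placeholders:

```shell
# Throwaway repository; paths mirror the question, contents are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p model/R scripts
echo v1 > model/R/train.R; echo v1 > scripts/gm.r; echo v1 > README
git add .
git commit -q -m "Model package"

echo v2 > model/R/train.R
echo v2 > scripts/gm.r
git stash

# Two ways of asking "which files does the stash change?"
from_stash_show=$(git stash show --name-only)
from_plain_diff=$(git diff --name-only 'stash@{0}^1' 'stash@{0}')
echo "$from_stash_show"

# Full patch and single-file content, for closer inspection:
#   git stash show -p
#   git show 'stash@{0}:model/R/train.R'
```

The last two commands are left as comments because their output is long. The second one prints the complete stashed version of one file without touching the work-tree.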


[5] Again, earlier, we proved that the index commit in the stash matches the ec5e61d commit exactly. Thus, we can ignore the index commit, and concentrate only on the work-tree commit.


What to do next

What you should do, at this point, depends on what result you want. I cannot give you the correct answers, but I can give you the important questions:

  • What commits do you want to have? What source tree snapshots do you want in each commit?
  • Does anyone else already have a copy of commit ec5e61d? (I.e., does anyone else have access to the upstream repository to which you pushed commit ec5e61d?)

Commit ec5e61d exists in both your repository, and the repository you pushed it to way back at the git push command. In that other repository, the name dev probably points to ec5e61d. It has file model/R/train.R as a new file, and additional changes to, or new creations of, 39 other files as well, when compared to its parent commit.

Commit 062f107 exists only in your repository. It's an exact backing-out of commit ec5e61d, and as such, it's probably useless, unless you need to make sure you exactly back out ec5e61d from any other repository.

Commit ed41d20 also exists only in your repository. It's a copy of whatever is in the parent of ec5e61d except that it has the stashed version of model/R/train.R added.

Your current work-tree mostly matches ed41d20 except that it has scripts/gm.r modified, with whatever was changed in the patch between ec5e61d and the stash commit, also changed the same way in the work-tree. And, since ed41d20 mostly matches the parent of ec5e61d, the difference between the parent of ec5e61d and the work-tree amounts to whatever you changed in scripts/gm.r in the stash, plus the stashed version of model/R/train.R.

You still have the stash, too. If that's useful, keep it until it is no longer useful. Once you are sure it is no longer useful, use git stash drop to drop it (this step is extremely hard to undo, so be very sure you are done with it).

You may well want to discard the most recent two commits (the revert, and the commit of just the one added file version of model/R/train.R) entirely. In that case you can git reset --hard origin/dev to discard the two commits and put your work-tree back the way it was when you made commit ec5e61d.
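If that is what you want, one possible sequence is sketched below. It rebuilds a miniature version of your history in throwaway directories (a bare repository stands in for your remote, and train.R with placeholder content stands in for your files), then resets and re-applies the stash. I resolve the conflict with git add plus a plain git commit here rather than your git commit <path>, purely to keep the sketch simple. Whether this matches what you actually want is an assumption on my part, and note that git reset --hard throws away uncommitted work-tree changes, so only do this while the stash still exists.

```shell
# Throwaway directories; the bare repo plays the role of the real remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local" && cd "$work/local"
git symbolic-ref HEAD refs/heads/dev       # name the unborn branch "dev"
git config user.email "demo@example.com"
git config user.name "Demo"

echo "base" > README
git add README
git commit -q -m "base"
printf 'v1\n' > train.R
git add train.R
git commit -q -m "Model package"
git remote add origin "$work/remote.git"
git push -q -u origin dev                  # origin/dev now names "Model package"

# Recreate the situation: stash, revert, conflicted apply, one-file commit.
printf 'v2\n' > train.R
git stash
git revert --no-edit HEAD > /dev/null
git stash apply || true                    # modify/delete conflict
git add train.R
git commit -q -m "Resolve merge conflict"

# The recovery itself: drop the two local-only commits, then re-apply.
git reset -q --hard origin/dev
git stash apply                            # applies cleanly on its original base

ahead=$(git rev-list --count origin/dev..HEAD)
final=$(cat train.R)
echo "commits ahead of origin/dev: $ahead"
echo "train.R now contains: $final"
```

Because the stash was made on top of the commit that origin/dev names, applying it there needs no merge at all, so the conflict does not come back.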

If you're the only user of the upstream repository (this is a very big "if"), and commit ec5e61d itself is not really useful, you may want to discard that commit as well—but be careful, not just because of the very big "if" part, but also because that particular commit becomes harder to recover—though not as hard as the stash. As long as you still have the stash, the revert and the git stash apply results are quite easy to repeat: you just run the same git commands.



Tags: git