Git - Discard case only changes

Posted 2020-04-08 14:18

Question:

So, I'm using Git (with Git Extensions 2 on Windows) with a large VB6 codebase. For anyone who isn't familiar with VB6, it is case-insensitive and has a habit of changing the case of variable names whenever you save a file. There are steps which can be taken to minimise this behaviour (see Stop Visual Basic 6 from changing my casing), but it is infeasible to eliminate the problem completely in this way. The problem, of course, is that the case changes show up as changes in Git, cluttering the commit history to the point where actual changes are almost impossible to find.

I'm looking for a way to handle this from the source control side and would appreciate any input. The avenues I'm currently pursuing are, in order of preference:

  1. Make the Git diff case-insensitive - I can't seem to find a way to do this. It would also fail to pick up case changes inside strings, but that's a price I'm willing to pay for an easy fix.
  2. Reset hunks with only case-based changes before committing.
  3. Move to Visual SourceSafe, which has an option for case-insensitive diff - No...

I've got a feeling that option number 2 is probably my best bet, but I'm not really sure of the best way to handle it. My current line of thinking is:

  • Create a tool to automate the Git command line
  • Use the interactive prompt to iterate over all changes, splitting down to the smallest hunks
  • For each hunk, if it contains only case changes, reset it

I'm pretty sure this is about as good of a solution as is possible. Running this tool before staging will solve all of the problems. Does anyone have any thoughts on this method?
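The test at the heart of the steps above ("are the only changes to case?") can be sketched in a few lines of shell. This is a minimal illustration, not a finished tool: the function name `differs_only_by_case` is hypothetical, it assumes GNU `diff` and `cmp` are available, and it compares two whole files rather than individual hunks.

```shell
#!/bin/bash

# differs_only_by_case OLD NEW  (hypothetical helper name)
# Succeeds when the two files are different, but become identical
# once upper and lower case are treated as equal.
differs_only_by_case() {
    # Byte-identical files contain no change at all, so they do not count.
    cmp -s "$1" "$2" && return 1
    # diff -i ignores case; -q suppresses the listing. Exit status 0 means
    # no differences remain when case is ignored.
    diff -i -q "$1" "$2" > /dev/null
}
```

To use it against Git, the staged version of a file could first be written to a temporary file with `git show ":path" > old.tmp` and then compared with the working copy.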

Also, if I do go down this route, it would be preferable to have a Git hook to prevent any case-only changes. I have absolutely no idea how to implement something like this, so any help towards creating such a script would be great.
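A rough sketch of what such a hook might look like follows. It assumes a Bash environment where `git`, GNU `diff`, `cmp` and `mktemp` are available, and the function names `staged_case_only_files` and `check_commit` are made up for illustration. It also works at file level only: it rejects a commit when the staged version of a modified file differs from `HEAD` purely by case, while a file that mixes real changes with case changes passes through.

```shell
#!/bin/bash

# Print every staged, modified file whose staged content differs from
# HEAD only by letter case. (Hypothetical helper name.)
staged_case_only_files() {
    local f old new
    old=$(mktemp)
    new=$(mktemp)
    # --diff-filter=M limits the list to files modified relative to HEAD.
    git diff --cached --name-only --diff-filter=M |
    while IFS= read -r f; do
        git show "HEAD:$f" > "$old"   # committed version
        git show ":$f" > "$new"       # staged version
        # Content must differ, yet match once case is ignored.
        if ! cmp -s "$old" "$new" && diff -i -q "$old" "$new" > /dev/null; then
            printf '%s\n' "$f"
        fi
    done
    rm -f "$old" "$new"
}

# Return non-zero (and explain why) if any such file is staged.
check_commit() {
    local offenders
    offenders=$(staged_case_only_files)
    if [ -n "$offenders" ]; then
        echo "Commit rejected, these files contain only case changes:" >&2
        printf '%s\n' "$offenders" >&2
        return 1
    fi
}

# When installed as .git/hooks/pre-commit, the file would end with:
#   check_commit || exit 1
```

Git runs `.git/hooks/pre-commit` before each commit and aborts the commit if the hook exits with a non-zero status, so the file must be executable.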

To give some idea of the scale of the problem, when the case of a variable changes, it will change EVERY instance of every variable with the same name in open files. Every commit, this will have happened to several variables and it will appear as if ~30% of each modified file has changed. This makes a manual process (which is what I'm currently doing) quite impractical and only useful for really small commits.

Many thanks for any help!

Answer 1:

#!/bin/bash

# Script to discard any case only changes that haven't been staged.
# Run it from the top level of the working tree.

# Create a backup of the current changes
BACKUPFILE="$( date +%Y%m%d%H%M%S.backup.patch )"
git diff > "$BACKUPFILE"

# For each file that has been changed (read line by line so that
# file names containing spaces are handled)
git diff --name-only --ignore-submodules=all | while IFS= read -r f; do

    # Create a patch that ignores case (-i) and whitespace (-w)
    git show ":$f" | diff -uiw - "$f" > temp.patch

    # Reset the file
    git checkout -- "$f"

    # An empty patch means every change was case only, so the reset is enough
    if [ -s temp.patch ]; then

        # Apply the case insensitive patch, hence discarding case only changes
        git apply --whitespace=nowarn temp.patch

        # git apply doesn't respect autocrlf so replace all LF with CRLF
        # $ matches the end of line
        # GNU sed reads \r as a carriage return, placed just before the LF
        sed 's/$/\r/' "$f" > temp.txt
        mv temp.txt "$f"

    fi

    # clean up temporary files
    rm temp.patch

done

echo "If everything is as expected you can delete $BACKUPFILE.
If things have gone wrong you can revert to your previous state by executing:

    git reset --hard HEAD
    git apply $BACKUPFILE"


Answer 2:

Confronted with this I wrote a routine that knew all the nouns used in my code and would flip them to the appropriate case - it would be something to run in a pre-commit hook.

I think it also had a discovery routine that would add to the library when it found new Dim statements.

It does mean you have to be consistent in case for EVERY variable name (unless you add a means of ignoring some words).
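As a rough idea of what such a routine could look like (the original is lost, so this is a reconstruction under assumptions, not the author's code): one function collects names from `Dim` statements and another rewrites every occurrence to the spelling in the word list. The names `discover_names` and `apply_canonical_case` are hypothetical, GNU `grep` and GNU `sed` are assumed (for `\b` and the `I` flag), only the first name in each `Dim` statement is picked up, and text inside strings and comments is rewritten too.

```shell
#!/bin/bash

# discover_names FILE...
# Print the first identifier declared in each Dim statement, one per line.
discover_names() {
    grep -ohiE '\bDim[[:space:]]+[A-Za-z][A-Za-z0-9_]*' "$@" |
        awk '{ print $2 }' | sort -u
}

# apply_canonical_case WORDLIST
# Filter stdin to stdout, rewriting every whole-word, case-insensitive
# match of a listed name to the spelling used in WORDLIST.
apply_canonical_case() {
    local script="" name
    while IFS= read -r name; do
        # One substitution per name: \b marks word boundaries, I ignores case.
        [ -n "$name" ] && script="${script}s/\\b${name}\\b/${name}/gI;"
    done < "$1"
    sed -e "$script"
}
```

A pre-commit hook could then pipe each staged VB6 file through `apply_canonical_case words.txt` (where `words.txt` is whatever file holds the library of names) before the commit is recorded.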

Alas, I don't have it lying around any more... most of my VB6 toolkit has gone to the great /dev/null in the sky. There are some peculiarities with line endings at the beginning of the files, as I recall.