git gui - can it be made to display UTF16?

Posted 2019-01-24 19:29

Question:

Is there any way to make git gui display and show diffs for UTF16 files somehow?

I found some information, but it mostly refers to the command line rather than the GUI.

Answer 1:

I have been working on a much better solution with help from the msysGit people, and have come up with this clean/smudge filter. The filter uses the GNU file and iconv commands to detect each file's encoding and convert it to and from msysGit's internal UTF-8 format.

This type of Clean/Smudge Filter gives you much more flexibility. It should allow Git to treat your mixed-format files as UTF-8 text in most cases: diffs, merge, git-grep, as well as gitattributes properties like eol-conversion, ident-replacement, and built-in diff patterns.

The textconv diff filter described in another answer only works for diffs, and so is much more limited.

To set up this filter:

  1. Download GNU libiconv and GNU file, and install both.
  2. Ensure that the GnuWin32\bin directory (usually "C:\Program Files\GnuWin32\bin") is in your %PATH%.
  3. Add the following to ~\Git\etc\gitconfig:

    [filter "mixedtext"]
        clean = iconv -sc -f $(file -b --mime-encoding %f) -t utf-8
        smudge = iconv -sc -f utf-8 -t $(file -b --mime-encoding %f)
        required
    
  4. Add a line to your global ~/Git/etc/gitattributes or local ~/.gitattributes to handle mixed format text, for example:

    *.txt filter=mixedtext
    

I have used this on a directory with sql files in ANSI, UTF-16, and UTF-8 formats. It is working so far. Barring any surprises, this looks like the 20% effort that could cover 80% of all Windows text format problems.
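
To see what the clean and smudge commands do to a single file, the round trip can be tried outside Git. This is a minimal sketch, not part of the original setup: clean_like and smudge_like are hypothetical helper names, the sample file is invented, and the encoding is hard-coded where the real filter would ask file -b --mime-encoding. The -s and -c flags from the filter are left out so that conversion errors stay visible.

```shell
#!/bin/sh
# Sketch only: emulate the "mixedtext" filter on one file, outside Git.
# clean_like and smudge_like are hypothetical names, not Git commands.

# Working tree -> repository: convert from the source encoding to UTF-8.
clean_like() {    # $1 = source encoding, data on stdin
    iconv -f "$1" -t utf-8
}

# Repository -> working tree: convert UTF-8 back to the original encoding.
smudge_like() {   # $1 = target encoding, data on stdin
    iconv -f utf-8 -t "$1"
}

# Invented sample: "hi" plus newline as UTF-16LE with a byte order mark.
sample=$(mktemp)
printf '\377\376h\000i\000\n\000' > "$sample"

# Assumption: the real filter would obtain this value from
#   file -b --mime-encoding "$sample"
enc=utf-16le

clean_like "$enc" < "$sample" > "$sample.utf8"
smudge_like "$enc" < "$sample.utf8" > "$sample.back"

# The round trip should reproduce the original bytes exactly.
cmp -s "$sample" "$sample.back" && echo "round trip OK"
```

If the final comparison fails for one of your own files, the filter would not be lossless for that file, which is worth knowing before marking it as required.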



Answer 2:

This method is for MSysGit 1.8.1, and is tested on Windows XP. I use Git Extensions 2.44, but since the changes are at the Git level, they should work for Git Gui as well. Steps:

  1. Install GNU iconv.

  2. Create the following script, name it astextutf16, and place it in the /bin directory of your Git installation (this is based on the existing astextplain script):

    #!/bin/sh -e
    # converts Windows Unicode (UTF-16 / UCS-2) to Git-friendly UTF-8
    # notes:
    # * requires Gnu iconv:
    #       http://gnuwin32.sourceforge.net/packages/libiconv.htm
    # * this script must be placed in: ~/Git/bin
    # * modify global ~/Git/etc/gitconfig or local ~/.git/config:
    #       [diff "astextutf16"]
    #           textconv = astextutf16
    # * or, from command line:
    #       $ git config diff.astextutf16.textconv astextutf16
    # * modify global ~/Git/etc/gitattributes or local ~/.gitattributes:
    #       *.txt diff=astextutf16
    if test "$#" != 1 ; then
        echo "Usage: astextutf16 <file>" 1>&2
        exit 1
    fi
    # -f(rom) utf-16 -t(o) utf-8
    "\Program Files\GnuWin32\bin\iconv.exe" -f utf-16 -t utf-8 "$1"
    exit 0
    
  3. Modify the global ~/Git/etc/gitconfig or your local ~/.git/config file, and add these lines:

    [diff "astextutf16"]  
        textconv = astextutf16
    
  4. Or, from command line:

    $ git config diff.astextutf16.textconv astextutf16

  5. Modify the global ~/Git/etc/gitattributes or your local ~/.gitattributes file, and map your extensions to be converted:

    *.txt diff=astextutf16

  6. Test. UTF-16 files should now be visible.
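
The conversion behind step 6 can also be checked by hand before involving Git. This is a rough sketch, assuming iconv is on the PATH: as_text is a hypothetical stand-in for the astextutf16 script, and the two sample files are invented. It converts two UTF-16 versions of a file and diffs the results, which is roughly what Git does with a textconv driver.

```shell
#!/bin/sh
# Two invented UTF-16 files (little-endian, with a byte order mark).
old=$(mktemp)
new=$(mktemp)
printf '\377\376o\000l\000d\000\n\000' > "$old"
printf '\377\376n\000e\000w\000\n\000' > "$new"

# Hypothetical stand-in for astextutf16: one file argument, UTF-8 on stdout.
as_text() {
    iconv -f utf-16 -t utf-8 "$1"
}

as_text "$old" > "$old.txt"
as_text "$new" > "$new.txt"

# diff exits with status 1 when the inputs differ, which is expected here.
diff "$old.txt" "$new.txt" || true
```

If the converted files look right here but Git still reports the file as binary, the likely culprits are the gitattributes mapping or the location of the script rather than iconv.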



Answer 3:

I ran into a similar issue.

I would like to improve on the accepted answer, since it has a small flaw. The problem I ran into was that if the file did not exist, I received this error:

conversion to cannot unsupported

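The error arises because file cannot open the missing path and prints a message starting with "cannot", which the command substitution then hands to iconv as the encoding name. I changed the commands so that a file is not required; they use only stdin/stdout. This fixed the issue. My .git/config file now looks like this: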

[filter "mixedtext"]
    clean = "GITTMP=$(mktemp);TYPE=$( tee $GITTMP|file -b --mime-encoding - ); cat $GITTMP | iconv -sc -f $TYPE -t utf-8; rm -f $GITTMP"
    smudge = "GITTMP=$(mktemp);TYPE=$( tee $GITTMP|file -b --mime-encoding - ); cat $GITTMP | iconv -sc -f utf-8 -t $TYPE; rm -f $GITTMP"
    required = true

To create the entries in your .git/config file, use these commands:

git config --replace-all filter.mixedtext.clean 'GITTMP=$(mktemp);TYPE=$( tee $GITTMP|file -b --mime-encoding - ); cat $GITTMP | iconv -sc -f $TYPE -t utf-8; rm -f $GITTMP'
git config --replace-all filter.mixedtext.smudge 'GITTMP=$(mktemp);TYPE=$( tee $GITTMP|file -b --mime-encoding - ); cat $GITTMP | iconv -sc -f utf-8 -t $TYPE; rm -f $GITTMP'
git config --replace-all filter.mixedtext.required true
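
The one-line commands above are dense, so here is the same idea spread out as a shell function. It is only a sketch for reading and experimenting: the name clean_from_stdin is hypothetical, the variables are quoted, and the -s and -c flags are dropped so that errors show up. One deliberate change from the commands above is that stdin is written to the temporary file first instead of going through tee, so the buffered copy is complete before file inspects it.

```shell
#!/bin/sh
# Hypothetical helper: read text in any detectable encoding on stdin,
# write UTF-8 to stdout. Assumes the file and iconv commands are installed.
clean_from_stdin() {
    tmp=$(mktemp) || return 1
    cat > "$tmp"                             # buffer stdin; it is read twice
    enc=$(file -b --mime-encoding "$tmp")    # e.g. utf-16le, utf-8, us-ascii
    iconv -f "$enc" -t utf-8 < "$tmp"        # fails if enc is not a name iconv knows
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (file names are placeholders):
#   clean_from_stdin < some-utf16-file.txt > as-utf8.txt
```

A smudge counterpart would swap the -f and -t arguments, as in the configuration above.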

My .gitattributes file looks like this:

*.txt filter=mixedtext
*.ps1 filter=mixedtext
*.sql filter=mixedtext

Specify only the files that might be an issue; otherwise the clean/smudge filter has to do more work (temp files).

We also bulk converted the UTF-16le files in git to UTF-8, since UTF-8 is the most portable Unicode encoding and, for mostly ASCII text, the most compact. The same iconv command used in clean and smudge was perfect for permanently converting the files.
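
A bulk conversion along those lines might look like the following. It is a sketch under assumptions: to_utf8_in_place is a hypothetical helper, the source encoding is passed explicitly instead of being detected with file, and the *.sql glob in the usage comment is only an example. Files that fail to convert are left untouched.

```shell
#!/bin/sh
# Hypothetical helper: to_utf8_in_place <source-encoding> <file>...
to_utf8_in_place() {
    enc=$1
    shift
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        if iconv -f "$enc" -t utf-8 "$f" > "$tmp"; then
            cat "$tmp" > "$f"    # overwrite contents, keep the file's permissions
        else
            echo "skipped (conversion failed): $f" >&2
        fi
        rm -f "$tmp"
    done
}

# Usage, e.g. for every SQL file in the current directory:
#   to_utf8_in_place utf-16le ./*.sql
```

Running it on a clean working tree makes the result easy to review with a diff before committing.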

The nice thing about the clean/smudge commands is that even if a file is checked in with, say, UTF-16le, the diff will still work.