In another post I found the very neat
git diff --color-words='[^[:space:]]|([[:alnum:]]|UTF_8_GUARD)+'
which does a great job at compressing git-diff's output to the essential while remaining legible (especially when adding --word-diff=plain for additional [-/-] and {+/+} surrounding deletions/additions). While this does include whitespace changes, the output does not highlight them in any noticeable way. For example, changing the indentation of a line of Python code (which is a severe change) shows up as that line with the longer indentation (before or after), but with no highlighting whatsoever.
How can whitespace changes be highlighted correctly, maybe by replacing whitespace with Unicode characters such as ·, ⇥ and ↵, or with something closer to git diff --word-diff-regex=.'s {+ +} etc. but with the smarter word separation?
I couldn't solve your problem, but I worry that Git might be working against you here. Recall that --color-words=<regex> is a combination of --word-diff=color and --word-diff-regex=<regex>. The man git diff documentation says:
--word-diff-regex=<regex>
Use <regex> to decide what a word is, instead of considering runs
of non-whitespace to be a word. Also implies --word-diff unless it
was already enabled.
Every non-overlapping match of the <regex> is considered a word.
Anything between these matches is considered whitespace and
ignored(!) for the purposes of finding differences. You may want
to append |[^[:space:]] to your regular expression to make sure
that it matches all non-whitespace characters. A match that
contains a newline is silently truncated(!) at the newline.
The regex can also be set via a diff driver or configuration
option, see gitattributes(1) or git-config(1). Giving it
explicitly overrides any diff driver or configuration setting.
Diff drivers override configuration settings.
Note this part of the middle paragraph: "Anything between these matches is considered whitespace and ignored(!) for the purposes of finding differences." So it sounds like Git tries to treat whitespace specially here, and that might be a problem.
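To see what that means for the regex from the question, here is a rough sketch that emulates the word splitting with grep -oE, which likewise prints every non-overlapping match. The words helper and the sample line are made up for illustration, the UTF_8_GUARD alternative is left out, and grep is only a stand-in for Git's own matcher:

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print each match of regex $1 in line $2,
# one per output line, roughly the way --word-diff-regex splits a line into words.
words() {
    printf '%s\n' "$2" | grep -oE "$1"
}

# Regex from the question, minus the UTF_8_GUARD alternative.
words '[^[:space:]]|[[:alnum:]]+' '    return x+1'
```

The four leading spaces never show up as a word, so a pure indentation change leaves the list of words identical.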
The best I can get so far is
git diff --color-words='[[:space:]]|([[:alnum:]]|UTF_8_GUARD)+' --word-diff=plain
Note the removed ^ in front of [:space:]!
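A quick way to check what this regex treats as a word is to emulate the matching with grep -oE, which prints every non-overlapping match. This is only a sketch: the tokens helper and the sample line are hypothetical, the UTF_8_GUARD alternative is left out, and grep merely approximates Git's matcher.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): list the matches of regex $1 in line $2,
# one per output line, with a lone space rendered as ␣ for readability.
tokens() {
    printf '%s\n' "$2" | grep -oE "$1" | sed 's/^ $/␣/'
}

# Regex from this answer, minus the UTF_8_GUARD alternative.
tokens '[[:space:]]|[[:alnum:]]+' '    return x+1'
```

Each space is now a word of its own, which is why indentation changes become visible. The + is no longer matched by anything, though, so under this emulation changes to punctuation fall into the ignored gaps between words.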
Here's an alternative using the substitution suggested at the question's end. First, set the pager with
git config --global core.pager 'less --raw-control-chars'
so that Unicode symbols are displayed correctly instead of some weird <c3>-ish output. Add the following to your git configuration:
[diff "txt"]
textconv = unwhite.sh
and, lacking a global solution, add something like the following to .gitattributes:
*.py diff=txt
Finally, unwhite.sh:
#!/bin/bash
# Append an inverse-video "\n" marker to every line, then make spaces and tabs visible.
awk 1 ORS='\033[7m\\n\033[27m\n' "$1" | sed -e 's/ /␣/g' -e 's/\t/→/g'
The escape characters before the [s are written as the octal escape \033 because awk fails to support \e; raw escape characters pasted into the script tend to get lost on copy and paste. I display the newline-indicating \n in inverted colors to distinguish them from literal \ns. Alternatively, try your luck with a Unicode symbol such as ↵ instead.
I deviated from the original unicode symbols since they failed to display correctly on msysgit.
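For terminals that do render them, a minimal sketch of the same idea with the symbols from the question (·, ⇥ and ↵) could look like this. The show_whitespace name is made up, and the filter reads standard input rather than a file argument:

```shell
#!/bin/sh
# Hypothetical filter: make whitespace visible with the symbols from the question.
# A literal tab is produced with printf because not every sed understands \t.
tab=$(printf '\t')
show_whitespace() {
    sed -e 's/ /·/g' -e "s/${tab}/⇥/g" -e 's/$/↵/'
}

# Sample input: one line with spaces, one line indented with a tab.
printf 'def f():\n\treturn 1\n' | show_whitespace
```

To use it as a textconv script, it would have to read the file passed as the first argument, e.g. show_whitespace < "$1".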