There are plenty of answers with great command line fu to find changes (or change statistics), but I'd like to find the opposite: how many lines (per file) have not changed since a particular commit?
The closest I could find is this: How to find which files have not changed since commit? but I'd like to know how many lines (ideally: in each file) have survived unchanged, not which files.
So, basically: can git diff --stat output unchanged lines in addition to insertions and deletions?
Alternatively, I'd imagine that git ls-files, git blame and some awk magic might do the trick, but I haven't been able to figure it out quite yet. -- For example, rather than label each line with the commit number of the last change, can I get git-blame to indicate if this change occurred before or after a given commit? Together with grep and wc -l that would get me there.
Figured it out. The key is that git blame can be restricted to a revision range (see https://git-scm.com/docs/git-blame, section "SPECIFYING RANGES"). Assume 123456 is the commit I want to compare to. With
git blame 123456..
"lines that have not changed since the range boundary [...] are blamed for that range boundary commit", that is, it will show everything that hasn't changed since that commit as "^123456". Thus, per file, the answer to my question is
git blame 123456.. $file | grep -P "^\^123456" | wc -l # unchanged since
git blame 123456.. $file | grep -Pv "^\^123456" | wc -l # new since
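For readers who would rather not rely on the grep pipeline, the same per-file count can be sketched in Python. This is only an illustration under stated assumptions: the function names `split_blame_counts` and `blame_counts` are made up here, the commit is assumed to be given as a hexadecimal hash, and the default `git blame` output format is assumed, in which boundary lines start with `^` followed by an abbreviated hash of at least 7 characters.

```python
import subprocess


def split_blame_counts(blame_text, boundary):
    """Return (unchanged, new) line counts from default `git blame` output.

    A line counts as unchanged when it starts with "^" followed by the
    boundary commit's hash, mirroring the grep commands above.
    Assumption: blame prints at least 7 hex characters after the "^",
    so only the first 7 characters of `boundary` are compared.
    """
    marker = "^" + boundary[:7]
    lines = blame_text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    unchanged = sum(1 for line in lines if line.startswith(marker))
    return unchanged, len(lines) - unchanged


def blame_counts(path, commit):
    """Run `git blame <commit>.. -- <path>` and count unchanged/new lines."""
    out = subprocess.run(
        ["git", "blame", commit + "..", "--", path],
        capture_output=True, text=True, errors="replace", check=True,
    ).stdout
    return split_blame_counts(out, commit)
```

Inside a repository, `blame_counts("main.c", "123456")` would return the two numbers that the two grep commands print separately; `main.c` is just a placeholder file name.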
Wrapped into a bash script that goes over all files in the repo (git ls-files) and pretty-prints the result:
#!/bin/bash
total_lines=0
total_lines_unchanged=0
total_lines_new=0
echo "--- total unchanged new filename ---"
# To restrict the set of files, append a filter such as "| grep PATTERN" inside $(...).
# Note: this loop splits on whitespace, so file names containing spaces are not handled.
for file in $(git ls-files)
do
    # calculate stats for this file
    lines=$(wc -l < "$file")
    lines_unchanged=$(git blame 123456.. -- "$file" | grep -c '^\^123456')
    lines_new=$(git blame 123456.. -- "$file" | grep -vc '^\^123456')
    # print pretty
    printf "%6d %6d %6d %s\n" "$lines" "$lines_unchanged" "$lines_new" "$file"
    # add to total
    total_lines=$((total_lines + lines))
    total_lines_unchanged=$((total_lines_unchanged + lines_unchanged))
    total_lines_new=$((total_lines_new + lines_new))
done
# print total
echo "--- total unchanged new ---"
printf "%6d %6d %6d TOTAL\n" "$total_lines" "$total_lines_unchanged" "$total_lines_new"
Thanks to Gregg for his answer, which had me look into the options for git-blame more!
git diff HEAD~ HEAD && echo files that changed
git rev-parse HEAD && echo hash of current rev
wc -l <filename> && echo total lines
git blame <filename> | grep -v -c -e"<first8bytesofhash>" && echo unchanged lines
git blame <filename> | grep -c -e"<first8bytesofhash>" && echo changed lines
I tried it with Python:
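import subprocess  # the Python 2 "commands" module no longer exists in Python 3

# Tag the current commit so it can be restored at the end.
s, o = subprocess.getstatusoutput('git tag start')
# List the root commits (commits without parents).
s, o = subprocess.getstatusoutput('git log --pretty=%H --max-parents=0')
roots = o.split()
result = set()
for root in roots:
    # Move HEAD and the index to the root commit; the working tree is kept.
    s, o = subprocess.getstatusoutput('git reset ' + root)
    s, o = subprocess.getstatusoutput('git ls-files')
    all_files = set(o.split())
    s, o = subprocess.getstatusoutput('git ls-files --modified')
    modified = set(o.split())
    unchanged = all_files - modified
    result = result | unchanged
print(result)
# Go back to the commit tagged at the start.
s, o = subprocess.getstatusoutput('git reset start --hard')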
$ wc -l main.c
718 main.c
$ git diff --numstat v2.0.0 main.c
152 70 main.c
That is, 152 lines of the current main.c were changed or added since v2.0.0, so 718 - 152 = 566 lines are unchanged since then.
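lines-unchanged-in-since () {
    # usage: lines-unchanged-in-since FILE REV
    # after "set": $1=REV, $2=total lines, $3=FILE, $4=lines added since REV
    set -- $2 `wc -l $1` `git diff --numstat $2 $1`
    # $4 is empty when the file has no changes at all, so default it to 0
    echo $(($2-${4:-0})) lines unchanged in $3 since $1
}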
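The arithmetic behind this answer (total lines minus the "added" column of `git diff --numstat`) can also be written out as a small Python sketch. The helper name `unchanged_from_numstat` is hypothetical, and the sketch assumes it is handed the line count from `wc -l` and a single numstat line for the file in question.

```python
def unchanged_from_numstat(total_lines, numstat_line):
    """Return total_lines minus the 'added' count of one --numstat line.

    A numstat line has the form "<added> <deleted> <path>", separated by
    whitespace. An empty line means the file did not change, so every line
    is unchanged. Binary files show "-" for both counts; None is returned
    for those because no line count is available.
    """
    fields = numstat_line.split(None, 2)
    if not fields:
        return total_lines
    added = fields[0]
    if added == "-":
        return None
    return total_lines - int(added)


# Numbers taken from the main.c example above.
print(unchanged_from_numstat(718, "152\t70\tmain.c"))
```

With the figures from the example, this prints 566, matching the count stated in the answer.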