How to count total lines changed by a specific aut

2019-01-03 00:12发布

Is there a command I can invoke which will count the lines changed by a specific author in a Git repository? I know that there must be ways to count the number of commits as Github does this for their Impact graph.

21条回答
我想做一个坏孩纸
2楼-- · 2019-01-03 01:01

I found the following to be useful to see who had the most lines that were currently in the code base:

git ls-files -z | xargs -0n1 git blame -w | ruby -n -e '$_ =~ /^.*\((.*?)\s[\d]{4}/; puts $1.strip' | sort -f | uniq -c | sort -n

The other answers have mostly focused on lines changed in commits, but if commits don't survive and are overwritten, they may just have been churn. The above incantation also gets you all committers sorted by lines instead of just one at a time. You can add some options to git blame (-C -M) to get some better numbers that take file movement and line movement between files into account, but the command might run a lot longer if you do.

Also, if you're looking for lines changed in all commits for all committers, the follow little script is helpful:

http://git-wt-commit.rubyforge.org/#git-rank-contributors

查看更多
▲ chillily
3楼-- · 2019-01-03 01:03

This gives some statistics about the author, modify as required.

Using Gawk:

git log --author="_Your_Name_Here_" --pretty=tformat: --numstat \
| gawk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s removed lines: %s total lines: %s\n", add, subs, loc }' -

Using Awk on Mac OSX:

git log --author="_Your_Name_Here_" --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }' -

EDIT (2017)

There is a new package on github that looks slick and uses bash as dependencies (tested on linux). It's more suitable for direct usage rather than scripts.

It's git-quick-stats (github link).

Copy git-quick-stats to a folder and add the folder to path.

mkdir ~/source
cd ~/source
git clone git@github.com:arzzen/git-quick-stats.git
mkdir ~/bin
ln -s ~/source/git-quick-stats/git-quick-stats ~/bin/git-quick-stats
chmod +x ~/bin/git-quick-stats
export PATH=${PATH}:~/bin

Usage:

git-quick-stats

enter image description here

查看更多
ら.Afraid
4楼-- · 2019-01-03 01:03

@mmrobins @AaronM @ErikZ @JamesMishra provided variants that all have an problem in common: they ask git to produce a mixture of info not intended for script consumption, including line contents from repository on the same line, then match the mess with a regexp.

This is a problem when some lines aren't valid UTF-8 text, and also when some lines happen to match the regexp (this happened here).

Here's a modified line that doesn't have these problems. It requests git to output data cleanly on separate lines, which makes it easy to filter what we want robustly:

git ls-files -z | xargs -0n1 git blame -w --line-porcelain | grep -a "^author " | sort -f | uniq -c | sort -n

You can grep for other strings, like author-mail, committer, etc.

Perhaps first do export LC_ALL=C (assuming bash) to force byte-level processing (this also happens to speed up grep tremendously from the UTF-8-based locales).

查看更多
冷血范
5楼-- · 2019-01-03 01:06

A solution was given with ruby in the middle, perl being a little more available by default here is an alternative using perl for current lines by author.

git ls-files -z | xargs -0n1 git blame -w | perl -n -e '/^.*\((.*?)\s*[\d]{4}/; print $1,"\n"' | sort -f | uniq -c | sort -n
查看更多
神经病院院长
6楼-- · 2019-01-03 01:06

The question asked for information on a specific author, but many of the answers were solutions that returned ranked lists of authors based on their lines of code changed.

This was what I was looking for, but the existing solutions were not quite perfect. In the interest of people that may find this question via Google, I've made some improvements on them and made them into a shell script, which I display below. An annotated one (which I will continue to maintain) can be found on my Github.

There are no dependencies on either Perl or Ruby. Furthermore, whitespace, renames, and line movements are taken into account in the line change count. Just put this into a file and pass your Git repository as the first parameter.

#!/bin/bash
git --git-dir="$1/.git" log > /dev/null 2> /dev/null
if [ $? -eq 128 ]
then
    echo "Not a git repository!"
    exit 128
else
    echo -e "Lines  | Name\nChanged|"
    git --work-tree="$1" --git-dir="$1/.git" ls-files -z |\
    xargs -0n1 git --work-tree="$1" --git-dir="$1/.git" blame -C -M  -w |\
    cut -d'(' -f2 |\
    cut -d2 -f1 |\
    sed -e "s/ \{1,\}$//" |\
    sort |\
    uniq -c |\
    sort -nr
fi
查看更多
三岁会撩人
7楼-- · 2019-01-03 01:07

In case anyone wants to see the stats for every user in their codebase, a couple of my coworkers recently came up with this horrific one-liner:

git log --shortstat --pretty="%cE" | sed 's/\(.*\)@.*/\1/' | grep -v "^$" | awk 'BEGIN { line=""; } !/^ / { if (line=="" || !match(line, $0)) {line = $0 "," line }} /^ / { print line " # " $0; line=""}' | sort | sed -E 's/# //;s/ files? changed,//;s/([0-9]+) ([0-9]+ deletion)/\1 0 insertions\(+\), \2/;s/\(\+\)$/\(\+\), 0 deletions\(-\)/;s/insertions?\(\+\), //;s/ deletions?\(-\)//' | awk 'BEGIN {name=""; files=0; insertions=0; deletions=0;} {if ($1 != name && name != "") { print name ": " files " files changed, " insertions " insertions(+), " deletions " deletions(-), " insertions-deletions " net"; files=0; insertions=0; deletions=0; name=$1; } name=$1; files+=$2; insertions+=$3; deletions+=$4} END {print name ": " files " files changed, " insertions " insertions(+), " deletions " deletions(-), " insertions-deletions " net";}'

(Takes a few minutes to crunch through our repo, which has around 10-15k commits.)

查看更多
登录 后发表回答