How can I list all modified files by an author bet

Posted 2019-05-09 03:29

Question:

The command:

 git log --oneline --name-status \
         --author="$AUTHOR" "$COMMIT_RANGE" | grep -vE '[a-fA-F0-9]{5} ' \
         | sort | uniq | cat -n

Returns a list of the files modified by an author within a range of commits, each with its status, e.g. M for modified.

 1  M   a_file
 2  M   another_file
 3  M   file
 4  D   file

How can I show only the last thing that happened to the file named file, e.g. here it was deleted (D)?

I don't want to see the previous modifications to the file (i.e. the M), only the last thing that happened in that range of commits.

Thanks for the attention!

Answer 1:

You can tweak uniq to look only at a certain field:

git log --oneline --name-status \
         --author="$AUTHOR" "$COMMIT_RANGE" | grep -vE '[a-fA-F0-9]{5} ' \
         | sort -r | uniq -f 1 | tac | cat -n

I used tac as well to reverse the output again; you can leave it out. Note that tac is a GNU utility, so you won't find it by default on BSD / OS X machines.
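
To see what uniq -f 1 does in isolation, here is a minimal sketch on hand-written sample lines rather than real git output. The function names dedupe_by_path and reverse_lines are made up for illustration; reverse_lines is one possible awk stand-in for tac on systems that lack it, not something the answer above uses.

```shell
# Collapse adjacent lines that are identical after the first field.
# For "STATUS<TAB>path" lines the first field is the status letter,
# so only the path takes part in the comparison.
dedupe_by_path() {
  uniq -f 1
}

# Portable stand-in for tac (hypothetical helper): buffer every line,
# then print the buffer from the last line back to the first.
reverse_lines() {
  awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }'
}

# The two "file" lines are adjacent, so only the first one survives.
printf 'D\tfile\nM\tfile\nM\tother\n' | dedupe_by_path

# Here another path sits between the "file" lines, so uniq keeps all three.
printf 'D\tfile\nM\tother\nM\tfile\n' | dedupe_by_path
```

uniq only compares neighbouring lines, so the input has to be grouped by path before it reaches uniq -f 1. Also keep in mind that any sort step discards git's newest-first order, so which status survives then depends on the sort order rather than on commit order.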



Answer 2:

Here is what you want to do:

git log --oneline \
        --name-status \
        --author="$AUTHOR" "$COMMIT_RANGE" \
        | grep -vE '[a-fA-F0-9]{5} ' | sort -r | uniq -f 1 | head -1

What I modified:

sort -r

Reverses the sort, so the result is printed in reverse order.

head -1

Takes the first line of the result.

uniq -f 1

Keeps unique lines while skipping the first field (the status letter) in the comparison, so lines are compared by filename only.



Answer 3:

This is the command I ended up with which works for me:

git log --oneline --name-status --author="$AUTHOR" "$COMMIT_RANGE" | \
 grep -vE '^[a-fA-F0-9]{5,} ' | cat -n | tr '\t' ' ' | sed -e 's/^  *//' | \
 sort -k 3,3 -k 1n,1n | uniq -f 2 | sed -e 's/^[0-9]\{1,\} //' | cat -n

Where $AUTHOR is the author name and $COMMIT_RANGE is a range of commits in the form OLDER_COMMIT_SHA1..NEWER_COMMIT_SHA1 (HEAD can be used too for NEWER_COMMIT_SHA1).

Command explanation:

  1. First, after removing the unwanted lines from the git log output with grep -vE, I number each line with cat -n;
  2. I replace the TABs that git inserts with --name-status (and the one cat -n adds after the line number) with spaces, and strip the leading spaces inserted by cat -n, using the two stages that follow cat -n;
  3. Then I sort by the third column first, so lines with the same filename end up adjacent (I sort by filename); after that I sort by the first column numerically (the cat -n result). This way, within each filename, the first occurrence from the git log output comes first, which is the one I'm interested in;
  4. After that it's time to keep the unique rows, ignoring the first 2 columns (so only the filenames are compared). uniq keeps the first line of each run of identical filenames and drops the rest; that first line is the one I'm interested in because it carries the most recent --name-status flag, which tells me the last thing that happened to that file (was it a modification? A deletion? etc.);
  5. Finally I strip the old line numbers and number the remaining rows again with cat -n.
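
The steps above amount to "keep the first line git prints for each filename", since git log lists the newest commit first. As a hedged alternative sketch, the same idea can be written with awk instead of the number/sort/uniq pipeline. This is a swapped-in technique, not the command from this answer; the function name last_status_per_file and the sample data are made up for illustration, and the code assumes TAB-separated "STATUS<TAB>path" lines.

```shell
# Reads name-status lines (newest commit first) on stdin and prints only
# the first line seen for each path, i.e. its most recent status.
# Assumption: fields are TAB-separated and the path is the last field
# (for rename lines, which carry two paths, that is the new path).
# Lines without a TAB, such as blank lines, are skipped.
last_status_per_file() {
  awk -F '\t' 'NF >= 2 && !seen[$NF]++'
}

# Hand-written sample standing in for filtered git output (hypothetical data):
# "file" was modified in an older commit and deleted in the newest one.
printf 'D\tfile\nM\ta_file\nM\tfile\nM\tanother_file\n' | last_status_per_file
```

Against real history it could be used as git log --oneline --name-status --author="$AUTHOR" "$COMMIT_RANGE" | grep -vE '[a-fA-F0-9]{5} ' | last_status_per_file | cat -n. Unlike the sort-based pipeline, the output keeps git's newest-first order instead of being sorted by filename, and filenames containing spaces are handled because only TABs separate fields.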

I want to thank codeWizard, Arne and VonC for the uniq -f advice which helped me to work out the solution.