I'm looking for an efficient way to get the list of unique commit authors for an SVN repository as a whole, or for a given resource path. I haven't been able to find an SVN command specifically for this (and don't expect one) but I'm hoping there may be a better way than what I've tried so far in Terminal (on OS X):
svn log --quiet | grep "^r" | awk '{print $3}'
svn log --quiet --xml | grep author | sed -E "s:</?author>::g"
Either of these will give me one author name per line, but they both require filtering out a fair amount of extra information. They also don't handle duplicates of the same author name, so for lots of commits by few authors, there's tons of redundancy flowing over the wire. More often than not I just want to see the unique author usernames. (It actually might be handy to infer the commit count for each author on occasion, but even in these cases it would be better if the aggregated data were sent across instead.)
I'm generally working with client-only access, so svnadmin commands are less useful, but I might be able to ask a special favor of the repository admin if that is strictly necessary or much more efficient. The repositories I'm working with have tens of thousands of commits and many active users, and I don't want to inconvenience anyone.
PowerShell has built-in support for XML, which eliminates the need to parse string output.
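The same idea carries over to any language with an XML parser. As a minimal sketch in Python (not the PowerShell from this answer), assuming the XML shape produced by `svn log --quiet --xml`, where each `<logentry>` holds an `<author>` element; the function name `authors_from_xml` and the sample data are made up for illustration:

```python
import xml.etree.ElementTree as ET

def authors_from_xml(xml_text):
    """Return the sorted unique author names found in `svn log --xml` output."""
    root = ET.fromstring(xml_text)
    # Entries with a missing or empty <author> element are skipped.
    return sorted({a.text for a in root.iter("author") if a.text})

# Hypothetical sample in the shape that `svn log --quiet --xml` emits.
SAMPLE_XML = """<log>
<logentry revision="3"><author>alice</author><date>2011-01-03T10:00:00.000000Z</date></logentry>
<logentry revision="2"><author>bob smith</author><date>2011-01-02T10:00:00.000000Z</date></logentry>
<logentry revision="1"><author>alice</author><date>2011-01-01T10:00:00.000000Z</date></logentry>
</log>"""

print(authors_from_xml(SAMPLE_XML))
```

With a real working copy, the text could come from `subprocess.run(["svn", "log", "--quiet", "--xml"], capture_output=True, text=True).stdout` instead of the sample string.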
Here's a quick script I used on a Mac to get a unique list of users across multiple repositories.
To filter out duplicates, take your output and pipe it through `sort | uniq`. Thus:
svn log --quiet | grep "^r" | awk '{print $3}' | sort | uniq
I would not be surprised if this is the way to do what you ask. Unix tools often expect the user to do fancy processing and analysis with other tools.
P.S. Come to think of it, you can merge the `grep` and `awk`...
P.P.S. Per Kevin Reid...
P3.S. Per kan, using the vertical bars instead of spaces as field separators, to properly handle names with spaces (also updated the Python examples)...
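To see why the separator matters, here is a small Python sketch comparing the two ways of splitting a `svn log --quiet` line. The sample line and the names `author_by_whitespace` and `author_by_bars` are invented for illustration:

```python
def author_by_whitespace(line):
    # Mirrors awk '{print $3}': the third whitespace-separated field.
    return line.split()[2]

def author_by_bars(line):
    # Splits on the " | " delimiter, so the whole author field is kept.
    return line.split(" | ")[1]

LINE = "r42 | bob smith | 2011-01-02 10:00:00 -0500 (Sun, 02 Jan 2011)"

print(author_by_whitespace(LINE))  # bob
print(author_by_bars(LINE))        # bob smith
```

Splitting on whitespace truncates the name at its first space, while splitting on the bars returns the full author field.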
For more efficiency, you could do a Perl one-liner. I don't know Perl that well, so I'd wind up doing it in Python:
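The script itself did not survive in this copy of the answer. A minimal sketch of what such a filter might look like, assuming it is fed the lines of `svn log --quiet` output; the function name `unique_authors` and the sample log are hypothetical:

```python
def unique_authors(lines):
    """Collect the distinct authors from `svn log --quiet` output lines."""
    authors = set()
    for line in lines:
        # Revision lines look like "r42 | author | date"; rules are all dashes.
        if line.startswith("r") and " | " in line:
            authors.add(line.split(" | ")[1].strip())
    return sorted(authors)

# Hypothetical sample of `svn log --quiet` output.
SAMPLE = """\
------------------------------------------------------------------------
r3 | alice | 2011-01-03 10:00:00 -0500 (Mon, 03 Jan 2011)
------------------------------------------------------------------------
r2 | bob smith | 2011-01-02 10:00:00 -0500 (Sun, 02 Jan 2011)
------------------------------------------------------------------------
r1 | alice | 2011-01-01 10:00:00 -0500 (Sat, 01 Jan 2011)
------------------------------------------------------------------------
"""

for name in unique_authors(SAMPLE.splitlines()):
    print(name)
```

To use it as a filter, pass `sys.stdin` to `unique_authors` instead of the sample and pipe `svn log --quiet` into the script. The set keeps memory proportional to the number of authors rather than the number of commits.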
Or, if you wanted counts:
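The counting variant is likewise missing here. One possible sketch, assuming the same `svn log --quiet` line format and using `collections.Counter`; the name `author_counts` and the sample log are made up for illustration:

```python
from collections import Counter

def author_counts(lines):
    """Count commits per author from `svn log --quiet` output lines."""
    counts = Counter()
    for line in lines:
        # Only "r42 | author | date" lines carry an author field.
        if line.startswith("r") and " | " in line:
            counts[line.split(" | ")[1].strip()] += 1
    return counts

# Hypothetical sample of `svn log --quiet` output.
SAMPLE = """\
------------------------------------------------------------------------
r3 | alice | 2011-01-03 10:00:00 -0500 (Mon, 03 Jan 2011)
------------------------------------------------------------------------
r2 | bob smith | 2011-01-02 10:00:00 -0500 (Sun, 02 Jan 2011)
------------------------------------------------------------------------
r1 | alice | 2011-01-01 10:00:00 -0500 (Sat, 01 Jan 2011)
------------------------------------------------------------------------
"""

# Print the busiest authors first.
for author, n in author_counts(SAMPLE.splitlines()).most_common():
    print(f"{n:6d} {author}")
```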
Then you'd run:
In PowerShell, set your location to the working copy and use this command.
The output of `svn.exe log --quiet` alternates horizontal rules (lines of dashes) with one-line records of the form revision | author | date.
Filter out the horizontal rules with `? { $_ -notlike '-*' }`.
Split by `' \| '` to turn a record into an array. The second element is the name.
Make an array of each line and select the second element with `% { ($_ -split ' \| ')[1] }`.
Return unique occurrences with `Sort -Unique`. This sorts the output as a side effect.
A simpler alternative:
I had to do this on Windows, so I used the Windows port of Super Sed ( http://www.pement.org/sed/ ) and replaced the awk and grep commands:
This uses the Windows "sort" command, which might not be present on all machines.
This command has an additional `grep '|'` that eliminates false matches. Otherwise, commit-message lines that happen to start with 'r' are included, and words from commit messages are returned as author names.
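The failure mode is easy to reproduce. As a rough Python sketch (not the command from this answer), assuming full `svn log` output in which commit messages follow each header line; the sample log and the name `header_authors` are invented for illustration:

```python
import re

# A header line is "r<digits> | author | date | N line(s)"; requiring the
# revision number and the bars rejects message lines that merely start with "r".
HEADER = re.compile(r"^r\d+ \| (.+?) \| ")

def header_authors(lines):
    """Return sorted unique authors, ignoring commit-message lines."""
    return sorted({m.group(1) for m in map(HEADER.match, lines) if m})

# Hypothetical sample of non-quiet `svn log` output.
SAMPLE = """\
------------------------------------------------------------------------
r2 | bob smith | 2011-01-02 10:00:00 -0500 (Sun, 02 Jan 2011) | 1 line

refactor the parser
------------------------------------------------------------------------
r1 | alice | 2011-01-01 10:00:00 -0500 (Sat, 01 Jan 2011) | 1 line

remove dead code
------------------------------------------------------------------------
"""

# Matching on a leading "r" alone also catches the two message lines.
naive = [line for line in SAMPLE.splitlines() if line.startswith("r")]
print(len(naive))
print(header_authors(SAMPLE.splitlines()))
```

Here the naive filter keeps four lines (two headers and two message lines), while the stricter pattern yields only the two authors.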