I have a ksh script that returns a long list of values, newline separated, and I want to see only the unique/distinct values. Is it possible to do this?
For example, say my output is file suffixes in a directory:
tar gz java gz java tar class class
I want to see a list like:
tar gz java class
With zsh you can do this:
Or you can use AWK:
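A minimal sketch of the AWK approach, assuming the common "seen array" idiom rather than the exact command from the original answer. The function name `unique_awk` is hypothetical:

```shell
# Hypothetical helper: print each distinct input line once, in first-seen order.
unique_awk() {
    # seen[$0]++ evaluates to 0 (false) the first time a line appears,
    # so !seen[$0]++ is true only for first occurrences, which awk then prints.
    awk '!seen[$0]++' "$@"
}

# Usage with the sample suffixes from the question:
printf '%s\n' tar gz java gz java tar class class | unique_awk
```

This keeps the input order and needs no sort, at the cost of holding every distinct line in memory.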
Unique, as requested (but not sorted);
uses fewer system resources for fewer than ~70 elements (as tested with time);
written to take input from stdin
(or modify and include in another script):
(Bash)
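A minimal sketch of a pure-shell filter along these lines, reading from stdin and keeping first-seen order. The function name `unique_values` and the string-scan approach are assumptions for illustration, not the answer's own code. It sticks to POSIX constructs, so it should run under Bash and ksh alike:

```shell
# Hypothetical helper: read lines from stdin, print each distinct line once.
unique_values() {
    nl='
'
    seen=$nl                      # newline-delimited record of lines already printed
    while IFS= read -r line || [ -n "$line" ]; do
        case $seen in
            *"$nl$line$nl"*) ;;   # already printed, skip it
            *)
                printf '%s\n' "$line"
                seen=$seen$line$nl
                ;;
        esac
    done
}

# Usage with the sample suffixes from the question:
printf '%s\n' tar gz java gz java tar class class | unique_values
```

No external process is started, which is plausibly where the saving on short lists comes from. Each line is compared against everything seen so far, so for long lists an awk or sort based filter is likely the better choice.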
This is the same as monoxide's answer, but a bit more concise.
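As a sketch of such a concise form, assuming it refers to `sort -u` (a standard option of sort that sorts the lines and drops duplicates in one step):

```shell
# sort -u sorts the lines and keeps a single copy of each.
printf '%s\n' tar gz java gz java tar class class | sort -u
```

Unlike the order-preserving filters, the result comes out sorted.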
For larger data sets where sorting may not be desirable, you can also use the following perl script:
This basically just remembers every line output so that it doesn't output it again.
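A minimal sketch of such a filter, written as a Perl one-liner wrapped in a shell function. The name `unique_perl` and the `%seen` hash are illustrative, not the answer's own script:

```shell
# Hypothetical helper: pass through only the first occurrence of each line.
unique_perl() {
    # %seen counts how often each line has been read so far;
    # a line is printed only when its count was still 0.
    perl -ne 'print unless $seen{$_}++' "$@"
}

# Usage with the sample suffixes from the question:
printf '%s\n' tar gz java gz java tar class class | unique_perl
```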
It has the advantage over the "sort | uniq" solution in that there's no sorting required up front.
You might want to look at the uniq and sort applications.
(FYI, yes, the sort is necessary in this command line; uniq only strips duplicate lines that are immediately after each other.)
EDIT:
Contrary to what has been posted by Aaron Digulla in relation to uniq's command-line options:
Given the following input:
uniq will output all lines exactly once:
uniq -d will output all lines that appear more than once, and it will print them once:
uniq -u will output all lines that appear exactly once, and it will print them once:
Pipe them through sort and uniq. This removes all duplicates.
uniq -d gives only the duplicates,
uniq -u gives only the unique ones (strips duplicates).
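The three uniq modes can be compared side by side with a short sketch. The `suffixes` helper and its sample data are made up for illustration; the input is sorted first because uniq only compares adjacent lines:

```shell
# Hypothetical sample input: gz appears twice, tar three times, java once.
suffixes() { printf '%s\n' tar gz java gz tar tar; }

echo "uniq:";    suffixes | sort | uniq      # every distinct line, once each
echo "uniq -d:"; suffixes | sort | uniq -d   # only lines that occur more than once
echo "uniq -u:"; suffixes | sort | uniq -u   # only lines that occur exactly once
```

With this input, plain uniq prints gz, java and tar; uniq -d prints gz and tar; uniq -u prints only java.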