Sort a find command to respect a custom order in U

Posted 2019-06-01 12:21

I have a script that outputs file paths (via find), which I want to sort based on very specific custom logic:

  • 1st sort key: I want the 2nd and, if present, the 3rd field (fields are separated by -) to be sorted using custom ordering based on a list of keys I supply, excluding any numerical suffix.
    With the sample input below, the list of keys is:
    rp,alpha,beta-ri,beta-rs,RC

  • 2nd sort key: numeric sorting by the trailing number on each line.

Given the following sample input (note that the /foo/bar/test/example/8.2.4.0 prefix of each line is incidental):

/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-rp2

I expect:

/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10

3 answers
贪生不怕死
#2 · 2019-06-01 12:52

I found a solution that is completely different from what @mklement0 suggested.

#!/bin/bash

echo "Enter a version :"
read -r VERSION

# whatever.txt lists one directory to search per line
while read -r line
do

  # Quote the variables so paths with spaces survive word splitting
  find "$line" -type d | grep "$VERSION" | sort -n >> outfile.txt

  grep '.*-alpha[0-9]' outfile.txt | sort -n >> outfile2.txt
  grep '.*-beta-ri[0-9]' outfile.txt | sort -n >> outfile2.txt
  grep '.*-beta-rs[0-9]' outfile.txt | sort -n >> outfile2.txt
  grep '.*-RC[0-9]' outfile.txt | sort -n >> outfile2.txt
  rm outfile.txt

done <whatever.txt

Content of outfile2.txt :

/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-alpha8
/foo/bar/test/example/8.2.4.0-alpha9
/foo/bar/test/example/8.2.4.0-beta-ri1
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs1
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs3
/foo/bar/test/example/8.2.4.0-RC1

The only thing wrong with this is that alpha10 comes before alpha8.

Any clue?
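
A likely explanation: sort -n looks for a number at the start of each line. Every line here starts with /, so the numeric value is treated as 0 for all of them, and sort falls back to ordinary string comparison, where alpha10 precedes alpha8 because 1 sorts before 8. The following is only a minimal sketch of one workaround: it decorates each line with its trailing number, sorts on that, and strips the decoration again. The helper name sort_by_trailing_number is hypothetical.

```shell
# Hypothetical helper: sort lines from stdin by the number at the end of each line.
sort_by_trailing_number() {
  # Decorate: prepend the trailing digits (0 if there are none) plus a tab.
  awk '{
    n = 0
    if (match($0, /[0-9]+$/)) n = substr($0, RSTART)
    print n "\t" $0
  }' |
    # Sort numerically on the decoration only, then cut it off again.
    sort -t "$(printf '\t')" -k1,1n |
    cut -f2-
}

# Usage: prints the alpha8, alpha9 and alpha10 lines, in that order.
printf '%s\n' /p/8.2.4.0-alpha10 /p/8.2.4.0-alpha8 /p/8.2.4.0-alpha9 |
  sort_by_trailing_number
```

In the script above, this helper would take the place of sort -n after each grep.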

三岁会撩人
#3 · 2019-06-01 12:56

Using a variant of my answer to your original question:

./your-script | awk -v keysInOrder='rp,alpha,beta-ri,beta-rs,RC' '
    BEGIN {
      FS=OFS="-"
      keyCount = split(keysInOrder, a, ",")
      for (i = 1; i <= keyCount; ++i) keysToOrdinal[a[i]] = i
    }
    { 
      sortKey = $2
      if (NF == 3) sortKey = sortKey FS $3
      sub(/[0-9]+$/, "", sortKey)
      auxFieldPrefix = "|" FS
      if (NF == 2) auxFieldPrefix = auxFieldPrefix FS
      sub(/[0-9]/, auxFieldPrefix "&", $NF)
      sortOrdinal = sortKey in keysToOrdinal ? keysToOrdinal[sortKey] : keyCount + 1
      print sortOrdinal, $0
    }
'  | sort -t- -k1,1n -k3,3 -k5,5n | sed 's/^[^-]*-//; s/|-\{1,2\}//'

./your-script represents whatever command produces the output you want to sort.

Note that an auxiliary character, |, is used to facilitate sorting, and the assumption is that this character doesn't appear in the input, which should be reasonably safe, given that filesystem paths usually don't contain pipe characters.

Any field 2 values (sans numeric suffix) that aren't in the list of sort keys sort after the field 2/3 values that are, and are sorted alphabetically among themselves.
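
The same decorate, sort, undecorate idea can also be sketched with tab-separated sort columns, which avoids the auxiliary | character. This is not the command above, only a simplified variant under the same assumption that the first - in each line separates the incidental prefix from the key. The function name custom_sort is hypothetical, and lines are assumed to contain no tab characters.

```shell
# Hypothetical helper: custom_sort KEYS
# Reads lines on stdin; KEYS is a comma-separated list giving the key order.
custom_sort() {
  awk -v keysInOrder="$1" '
    BEGIN {
      keyCount = split(keysInOrder, keys, ",")
      for (i = 1; i <= keyCount; ++i) ordinal[keys[i]] = i
    }
    {
      key = $0
      sub(/^[^-]*-/, "", key)        # drop everything up to the first "-"
      num = 0
      if (match(key, /[0-9]+$/)) {   # split off the trailing number, if any
        num = substr(key, RSTART)
        key = substr(key, 1, RSTART - 1)
      }
      # Keys missing from the list get an ordinal past the end of the list.
      ord = (key in ordinal) ? ordinal[key] : keyCount + 1
      print ord "\t" key "\t" num "\t" $0
    }
  ' |
    # Column 1: position in the key list; 2: key text; 3: trailing number.
    sort -t "$(printf '\t')" -k1,1n -k2,2 -k3,3n |
    cut -f4-
}
```

It would be called as ./your-script | custom_sort 'rp,alpha,beta-ri,beta-rs,RC'. The second sort column only matters for keys that are not in the list, which all share the same ordinal.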

chillily
#4 · 2019-06-01 13:06

While this does not match what the OP is looking for, it is worth pointing out that the sort command has a -V option for version sorting. It orders letters by their position in the ASCII table (i.e. UPPERCASE letters first, lowercase letters next).

For example:

cat test.sort.txt 
/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-rp2

And sorting:

 % sort -V test.sort.txt              
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10

So, it is useful to be aware of this when giving version names.

That said, if you insist, here is a one-liner that uses sed to enforce the custom order:

cat test.sort.txt|sed -e 's/-rp/-x1xrp/;s/-alpha/-x2xalpha/;s/-beta-ri/-x3xbeta-ri/;s/-beta-rs/-x4xbeta-rs/;s/-RC/-x5xRC/'|sort -V|sed -e 's/x.x//'
/foo/bar/test/example/8.2.4.0-rp2
/foo/bar/test/example/8.2.4.0-rp10
/foo/bar/test/example/8.2.4.0-alpha2
/foo/bar/test/example/8.2.4.0-alpha10
/foo/bar/test/example/8.2.4.0-beta-ri2
/foo/bar/test/example/8.2.4.0-beta-ri10
/foo/bar/test/example/8.2.4.0-beta-rs2
/foo/bar/test/example/8.2.4.0-beta-rs10
/foo/bar/test/example/8.2.4.0-RC1
/foo/bar/test/example/8.2.4.0-RC2
/foo/bar/test/example/8.2.4.0-RC10
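
If the key list changes often, the sed substitutions above could be generated from the list instead of being typed by hand. The following is only a sketch of that idea, assuming a sort that supports -V, at most 99 keys, and keys without regex metacharacters. The function name tag_sort and the two-digit x01x-style tags are hypothetical. Unlike the one-liner, it anchors each substitution at the end of the line so that a matching string earlier in the path is left alone.

```shell
# Hypothetical helper: tag_sort KEYS
# Reads lines on stdin; KEYS is a comma-separated list giving the key order.
tag_sort() {
  # Build one substitution per key, e.g. s/-alpha\([0-9]*\)$/-x02xalpha\1/
  tag_script=$(printf '%s\n' "$1" | tr ',' '\n' |
    awk '{ printf "s/-%s\\([0-9]*\\)$/-x%02dx%s\\1/\n", $0, NR, $0 }')
  # Tag each key with its position, version-sort, then remove the tag
  # from the last path component only.
  sed -e "$tag_script" |
    sort -V |
    sed -e 's/-x[0-9][0-9]x\([^/]*\)$/-\1/'
}
```

With the sample file, tag_sort 'rp,alpha,beta-ri,beta-rs,RC' < test.sort.txt should give the same order as the one-liner.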