Get specific string

Posted 2019-07-12 12:20

Question:

I need to get a specific string from a bigger string:

The input is either Abcd1234_Tot9012_tore.dr or Abcd1234_Tot9012.tore.dr

I want to get the digits that sit between Tot and the following _ or . , so the result should be 9012. The important thing is that the number of characters before and after these digits may vary.

Could anyone give me a nice solution for this? Thanks in advance!

Answer 1:

This should also work if you only need the digits that follow Tot. Note that the three-argument form of match() is a GNU awk (gawk) extension:

[srikanth@myhost ~]$ echo "Abcd1234_Tot9012_tore.dr" | awk ' { match($0,/Tot([0-9]*)/,a); print a[1]; } '
9012
[srikanth@myhost ~]$ echo "Abcd1234_Tot9012.tore.dr" | awk ' { match($0,/Tot([0-9]*)/,a); print a[1]; } '
9012
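For systems where awk is not gawk, the same idea can be sketched with the two-argument match(), which sets RSTART and RLENGTH in any POSIX awk. The function name extract_tot_awk is only an illustrative assumption, not part of the original answer:

```shell
# Hypothetical helper: print the digits that follow "Tot" in each argument.
extract_tot_awk() {
  printf '%s\n' "$@" | awk '
    # match() records where the match starts (RSTART) and how long it is (RLENGTH).
    match($0, /Tot[0-9]+/) {
      # Skip the 3-character "Tot" prefix and keep only the digits.
      print substr($0, RSTART + 3, RLENGTH - 3)
    }'
}

extract_tot_awk "Abcd1234_Tot9012_tore.dr" "Abcd1234_Tot9012.tore.dr"
# prints 9012 twice, one per line
```

Lines without Tot followed by at least one digit produce no output at all, which differs from the gawk version above (it prints an empty line for them).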


Answer 2:

I know this is tagged bash/sed, but in my opinion perl is clearer for this kind of task. In case you're interested:

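perl -lne 'print $1 if /Tot([0-9]+)[._]/' input.txt

-n tells perl to loop over the input file line by line without printing anything by default, -e supplies the one-liner to run, and -l appends a newline to each print so that matches from several lines do not run together.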

The regex reads as: match Tot, followed by one or more digits, followed by either a dot or an underscore; capture the digits (that's what the parens are for). Since that is the first capture group, its text is assigned to the $1 variable, which is then printed.
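Since the question is tagged bash, the same regex can also be tried with bash's own [[ =~ ]] operator, where capture groups land in the BASH_REMATCH array. This is a minimal sketch that assumes bash (not plain sh); the function name extract_tot is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number between "Tot" and the next "_" or ".".
extract_tot() {
  # Keep the pattern in a variable so the shell does not reinterpret it.
  local re='Tot([0-9]+)[._]'
  if [[ $1 =~ $re ]]; then
    # BASH_REMATCH[1] holds the first capture group, like $1 in perl.
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1   # no match: print nothing and report failure
  fi
}

extract_tot "Abcd1234_Tot9012_tore.dr"   # prints 9012
extract_tot "Abcd1234_Tot9012.tore.dr"   # prints 9012
```

The non-zero return status makes it possible to tell "no match" apart from an empty result in a calling script.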



Answer 3:

Pure Bash:

string="Abcd1234_Tot9012_tore.dr"        # or ".tore.dr"

string=${string##*_Tot}
string=${string%%[_.]*}

echo "$string"

The first expansion removes the longest leading part ending with '_Tot'.

The second expansion removes the longest trailing part beginning with '_' or '.'.

Result:

9012
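The two expansions can be wrapped in a small function. The sketch below is an illustration under stated assumptions: the name tot_number and the case guard are additions, not part of the original answer. The guard is there because, without it, a string that contains no "_Tot" would pass through the first expansion unchanged and yield a misleading result.

```shell
#!/usr/bin/env bash
# Hypothetical helper built on the two parameter expansions shown above.
tot_number() {
  local s=$1
  case $s in
    *_Tot[0-9]*) ;;      # continue only if some "_Tot" is followed by a digit
    *) return 1 ;;       # otherwise print nothing and report failure
  esac
  s=${s##*_Tot}          # "Abcd1234_Tot9012_tore.dr" -> "9012_tore.dr"
  s=${s%%[_.]*}          # "9012_tore.dr"             -> "9012"
  printf '%s\n' "$s"
}

tot_number "Abcd1234_Tot9012_tore.dr"   # prints 9012
tot_number "Abcd1234_Tot9012.tore.dr"   # prints 9012
```

No external process is started, which matters when this runs inside a loop over many file names.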


Answer 4:

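awk (relies on the number being the third field, so the part before _Tot must not contain any other _ or .)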

string="Abcd1234_Tot9012_tore.dr"
num=$(awk -F'Tot|[._]' '{print $3}' <<<"$string")

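sed (captures the last four consecutive digits on the line, so it relies on the number being exactly four digits long)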

string="Abcd1234_Tot9012_tore.dr"
num=$(sed 's/.*\([0-9]\{4\}\).*$/\1/' <<<"$string")

Example

$ string="Abcd1234_Tot9012_tore.dr"; awk -F'Tot|[._]' '{print $3}' <<<"$string"
9012

$ string="Abcd1234_Tot9013.tore.dr"; sed 's/.*\([0-9]\{4\}\).*$/\1/' <<<"$string"
9013


Answer 5:

You can use a perl one-liner:

perl -pe 's/.*(?<=Tot)([0-9]{4}).*/$1/' file

Test:

[jaypal:~/Temp] cat file
Abcd1234_Tot9012_tore.dr
Abcd1234_Tot9012.tore.dr

[jaypal:~/Temp] perl -pe 's/.*(?<=Tot)([0-9]{4}).*/$1/' file
9012
9012


Answer 6:

Using grep, you can do:

str=Abcd1234_Tot9012.tore.dr; grep -o "Tot[0-9]*" <<< "$str" | grep -o "[0-9]*$"

OUTPUT:

9012


Answer 7:

This might work for you (GNU sed, since it relies on \n in the replacement):

echo -e "Abcd1234_Tot9012_tore.dr\nAbcd1234_Tot9012.tore.dr" | 
sed 's/Tot[^0-9]*\([0-9]*\)[_.].*/\n\1/;s/.*\n//'
9012
9012

This works equally well:

echo -e "Abcd1234_Tot9012_tore.dr\nAbcd1234_Tot9012.tore.dr" |
sed 's/.*Tot\([0-9]*\).*/\1/'
9012
9012
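One caveat with the substitution above: a line that does not contain Tot is printed unchanged, and [0-9]* also accepts zero digits. A slightly stricter sketch uses -n with the p flag, so only lines where the substitution succeeded are printed, and requires at least one digit. The function name tot_sed is a hypothetical wrapper added for illustration:

```shell
# Hypothetical helper: print the digits after "Tot" for each argument that has them.
tot_sed() {
  # -n suppresses automatic printing; the trailing p prints only substituted lines.
  # [0-9][0-9]* means "one or more digits" in basic regular expressions.
  printf '%s\n' "$@" | sed -n 's/.*Tot\([0-9][0-9]*\).*/\1/p'
}

tot_sed "Abcd1234_Tot9012_tore.dr" "no match here" "Abcd1234_Tot9012.tore.dr"
# prints 9012 twice; the middle argument produces no output
```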


Tags: bash shell sed