I want to find a string that comes before another string

Posted 2019-07-26 12:29

I want to use the Bash shell to split strings like:

Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet

I want to get the brand names "Daivonex Cream" and "DBL Aspirin", that is, the name in front of the pattern ***mg, ***mcg or ***g.

How can I do it?

3 answers
你好瞎i
Reply #2 · 2019-07-26 13:08

You can use sed this way:

sed -E 's/^[[:alpha:]]+ - ([[:alpha:] ]+) [[:digit:]]+.*/\1/' <<< "Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]"

=> Daivonex Cream
  • ^[[:alpha:]]+ - => matches the leading word and the " - " separator that come before the part we need to extract
  • ([[:alpha:] ]+) => this is the part we want to extract: one or more letters or spaces
  •  [[:digit:]]+.* => this is everything that comes after; we assume this part starts with a space and one or more digits, followed by any number of characters
  • \1 => the part captured by the (...) expression above; we replace the entire string with the captured part

You can check out this site to learn more about regexes: http://regexr.com/
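
The question asks for the name in front of a ***mg, ***mcg or ***g dosage, while the expression above stops at any run of digits. As a rough sketch, the same sed pattern can be tightened so the digits must be followed by g, mg or mcg, and lines without such a dosage are skipped. The function name brand_by_unit and the third sample line are made up for illustration; -E is assumed to be available (GNU and BSD sed both accept it).

```shell
# Hypothetical helper (not part of the answer above): reads product lines on
# stdin and prints the brand name only for lines containing a g/mg/mcg dosage.
brand_by_unit() {
    # -n suppresses default output; the trailing p prints only lines where
    # the substitution succeeded. (mc?)?g matches "g", "mg" or "mcg".
    sed -nE 's/^[[:alpha:]]+ - ([[:alpha:] ]+) [[:digit:]]+(mc?)?g.*/\1/p'
}

brand_by_unit <<'EOF'
Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet
Foo - Bar 22 tablets
EOF
# prints:
# Daivonex Cream
# DBL Aspirin
```

The last input line has digits but no unit, so it produces no output instead of a wrong brand name.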

▲ chillily
Reply #3 · 2019-07-26 13:24

If your sample input is representative, awk may offer the simplest solution:

awk -F'- | [0-9]+(mc?)?g' '{ print $2 }' <<'EOF'
Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet
Foo - Foo Bar 22g [1] other
EOF

yields:

Daivonex Cream
DBL Aspirin
Foo Bar
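
To see why $2 is the brand name, it can help to print every field that the separator regex produces. The sketch below does that for one sample line; the function name show_fields is hypothetical and only wraps the same -F expression used above.

```shell
# Hypothetical helper: print each awk field of one line, numbered and
# bracketed, using the same field separator regex as the command above.
show_fields() {
    printf '%s\n' "$1" |
        awk -F'- | [0-9]+(mc?)?g' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }'
}

show_fields 'Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]'
# prints:
# 1:[Calcipotriol ]
# 2:[Daivonex Cream]
# 3:[/1g 30 g [1]]
```

The separator matches either "- " or a space followed by digits and g, mg or mcg, so the line is cut once after the generic name and once at " 50mcg". "/1g" and "30 g" are not separators because the first has no leading space and the second has a space between the number and the unit.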
▲ chillily
Reply #4 · 2019-07-26 13:27

In Bash you can do:

while IFS= read -r line || [[ -n "$line" ]]; do
    if [[ "$line" =~ ^([[:alpha:]]+)[[:space:][:punct:]]+([[:alpha:][:space:]]+)[[:space:]](.*)$ ]]
    then
        printf "1:'%s' 2:'%s' 3:'%s'\n" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
    fi
done <<<"Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet"

Prints:

1:'Calcipotriol' 2:'Daivonex Cream' 3:'50mcg/1g 30 g [1]'
1:'Aspirin' 2:'DBL Aspirin' 3:'100mg [1] tablet'
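
If only the brand name is needed, the same =~ approach can be wrapped in a small function that also requires the g, mg or mcg dosage mentioned in the question. This is a minimal sketch, not the answer's own code: the name brand_name and the non-zero return value on a failed match are assumptions, and it needs Bash because of [[ ... =~ ... ]] and BASH_REMATCH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the brand name from one product line, or
# return 1 if the line has no "<digits>g", "<digits>mg" or "<digits>mcg".
brand_name() {
    # Keeping the regex in a variable avoids quoting problems inside [[ ]].
    local re='^[[:alpha:]]+ - ([[:alpha:] ]+) [[:digit:]]+(mc?)?g'
    if [[ $1 =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

brand_name 'Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]'   # prints: Daivonex Cream
brand_name 'Aspirin - DBL Aspirin 100mg [1] tablet'            # prints: DBL Aspirin
```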