Question:

I want to use the Bash shell to split strings like:

Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet

I want to get the brand names "Daivonex Cream" and "DBL Aspirin", that is, the name that comes in front of a pattern like ***mg, ***mcg or ***g.

How can I do it?

Answer 1:

In Bash you can do:

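# Read the input line by line; the "|| [[ -n ... ]]" part also handles a last line without a trailing newline.
while IFS= read -r line || [[ -n "$line" ]]; do
    # Group 1: generic name, group 2: brand name (letters and spaces), group 3: the rest.
    if [[ "$line" =~ ^([[:alpha:]]+)[[:space:][:punct:]]+([[:alpha:][:space:]]+)[[:space:]](.*)$ ]]
    then
        printf "1:'%s' 2:'%s' 3:'%s'\n" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
    fi
done <<<"Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet"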

Prints:

1:'Calcipotriol' 2:'Daivonex Cream' 3:'50mcg/1g 30 g [1]'
1:'Aspirin' 2:'DBL Aspirin' 3:'100mg [1] tablet'
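
The regex above stops the brand name at the first non-letter rather than at the dose itself. If only the brand name is needed, a minimal sketch that anchors on the ***mg / ***mcg / ***g token from the question could look like the following. The function name brand_name is hypothetical, and the pattern assumes that the generic name contains no hyphen and that the brand name contains no digits.

```shell
#!/usr/bin/env bash
# brand_name: hypothetical helper, not part of the answer above.
# Assumes lines look like "<generic> - <brand> <dose>...", where the dose is
# digits followed directly by mcg, mg or g, and the brand contains no digits.
brand_name() {
    local line=$1
    # Store the ERE in a variable so Bash passes it to =~ unmodified.
    local re='^[^-]+ - ([^0-9]+) [0-9]+(mcg|mg|g)'
    if [[ $line =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1   # no dose token found on this line
    fi
}

brand_name 'Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]'
brand_name 'Aspirin - DBL Aspirin 100mg [1] tablet'
```

With the two sample lines this should print "Daivonex Cream" and "DBL Aspirin". Lines without a dose token produce no output and a non-zero exit status, so the caller can decide how to handle them.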

Answer 2:

If your sample input is representative, awk may offer the simplest solution:

awk -F'- | [0-9]+(mc?)?g' '{ print $2 }' <<'EOF'
Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet
Foo - Foo Bar 22g [1] other
EOF

yields:

Daivonex Cream
DBL Aspirin
Foo Bar
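
To see why $2 is the brand name, it can help to print every field that the separator pattern produces. The sketch below reuses the same -F expression; show_fields is a hypothetical helper name introduced only for this illustration.

```shell
#!/usr/bin/env bash
# show_fields: hypothetical helper that prints every field awk produces
# for one input line, using the same -F pattern as the answer above.
show_fields() {
    awk -F'- | [0-9]+(mc?)?g' '{
        for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i
    }' <<< "$1"
}

show_fields 'Aspirin - DBL Aspirin 100mg [1] tablet'
```

For this line the fields should be "Aspirin " (with a trailing space), "DBL Aspirin" and " [1] tablet". The separator consumes both the "- " after the generic name and the space plus dose token, which is why the second field comes out without surrounding whitespace.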

Answer 3:

You can use sed this way:

sed -E 's/^[[:alpha:]]+ - ([[:alpha:] ]+) [[:digit:]]+.*/\1/' <<< "Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]"

=> Daivonex Cream

^[[:alpha:]]+ - => matches the leading generic name and the " - " separator, i.e. everything before the part we want to extract
([[:alpha:] ]+) => the part we want to extract: one or more letters or spaces
[[:digit:]]+.* => everything that comes after; in the command this part is preceded by a space, so it matches a space, one or more digits, and then any number of characters
\1 => the text captured by the (...) group above; the whole line is replaced with the captured text
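
The breakdown above can be turned into a small filter for several lines at once. This is only a sketch: extract_brands is a hypothetical name, the (mcg|mg|g) suffix is an addition that ties the match to the dose units named in the question, and -n together with the p flag is used so that lines which do not match are dropped instead of being printed unchanged.

```shell
#!/usr/bin/env bash
# extract_brands: hypothetical filter; reads lines on stdin and prints
# only the brand name of each line that contains a dose such as 50mcg.
extract_brands() {
    # -n suppresses default output; the trailing p prints only substituted lines.
    sed -nE 's/^[[:alpha:]]+ - ([[:alpha:] ]+) [[:digit:]]+(mcg|mg|g).*/\1/p'
}

extract_brands <<'EOF'
Calcipotriol - Daivonex Cream 50mcg/1g 30 g [1]
Aspirin - DBL Aspirin 100mg [1] tablet
a line without a dose
EOF
```

With this input the filter should print "Daivonex Cream" and "DBL Aspirin" and skip the third line.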

You can check out this site to learn more about regexes: http://regexr.com/