Bash: remove words from string containing numbers

Posted 2019-09-04 08:30

Question:

In bash, how can I rename a string by deleting every word that contains a number?

name_befor_proc="art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061.jpg"

result:

name_after_proc="art-of-medusa.jpg"

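Answer 1:

With sed, remove every "-"-delimited piece that contains a digit, then clean up the "-" left in front of the extension (here test is a file holding the name).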

sed 's/[^-]*[0-9][^-\.]*-\{0,1\}//g;s/-\././' test
art-of-medusa.jpg
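
Since the question works with shell variables rather than a file, here is a minimal sketch of wrapping the same sed expression in a function. The function name strip_numbered_words is hypothetical and not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the sed expression from this answer on a string argument.
strip_numbered_words() {
    # 1st expression: drop each "-"-delimited piece containing a digit (plus its trailing "-")
    # 2nd expression: turn the leftover "-." before the extension into "."
    printf '%s\n' "$1" | sed 's/[^-]*[0-9][^-\.]*-\{0,1\}//g;s/-\././'
}

name_befor_proc="art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061.jpg"
name_after_proc=$(strip_numbered_words "$name_befor_proc")
echo "$name_after_proc"
```

For the sample name this prints art-of-medusa.jpg. The sketch assumes the extension itself contains no digit; the sed expression would treat something like .mp4 as part of a numbered word.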


Answer 2:

I don't think there is a generic solution, but you can use the following Python script for your particular use case:

name = "art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061.jpg"
stem, ext = name.rsplit(".", 1)  # separate the extension from the rest

def contains_number(word):
    # True if the word has at least one digit
    return any(ch.isdigit() for ch in word)

# keep only the "-"-separated words that contain no digit
final = '-'.join(word for word in stem.split('-') if not contains_number(word))
final += "." + ext

print(final)

output:

art-of-medusa.jpg


Answer 3:

It is not trivial!

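awk -F"." -v sep="-" '
      {n=split($1,a,sep)                       # split the part before the dot on "-"
       for (i=1; i<=n; i++)
            {if (a[i] ~ /[0-9]/) delete a[i]}  # drop pieces that contain a digit
       m=length(a); c=0                        # pieces left; reset the counter for each line
       for (i=1; i<=n; i++)                    # walk in index order (for-in order is unspecified)
            if (i in a) printf "%s%s", a[i], (++c<m?sep:"")
       printf "%s%s\n", FS, $2}'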

Split the string (up to the dot) and loop through the pieces. If one contains a digit, remove it. Then, rejoin the array and print accordingly.

Test

$ awk -F"." -v sep="-" '{n=split($1,a,sep); for (i=1; i<=n; i++) {if (a[i] ~ /[0-9]/) delete a[i]}; n=length(a); for (i in a) printf "%s%s", a[i], (++c<n?sep:""); printf "%s%s\n", FS, $2}' <<< "art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061.jpg"
art-of-medusa.jpg

Testing with "art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061-a-23-b.jpg" to make sure other words are also matched:

$ awk -F"." -v sep="-" '{n=split($1,a,sep); for (i=1; i<=n; i++) {if (a[i] ~ /[0-9]/) delete a[i]}; n=length(a); for (i in a) printf "%s%s", a[i], (++c<n?sep:""); printf "%s%s\n", FS, $2}' <<< "art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061-a-23-b.jpg"
art-of-medusa-a-b.jpg
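
The same split, filter, and rejoin steps can also be sketched in pure bash, without awk. This is a minimal sketch: the function name remove_digit_words is hypothetical, and it assumes the name has exactly one extension after the last dot.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pure-bash version of the split / filter / rejoin idea.
# Assumes the input contains a dot followed by an extension.
remove_digit_words() {
    local name=$1 stem ext word
    local -a parts kept=()
    stem=${name%.*}                       # everything before the last dot
    ext=${name##*.}                       # extension after the last dot
    IFS=- read -ra parts <<< "$stem"      # split the stem on "-"
    for word in "${parts[@]}"; do
        # keep only the pieces that contain no digit
        [[ $word == *[0-9]* ]] || kept+=("$word")
    done
    local IFS=-                           # "${kept[*]}" joins on the first IFS character
    printf '%s.%s\n' "${kept[*]}" "$ext"  # rejoin with "-" and re-attach the extension
}

remove_digit_words "art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061-a-23-b.jpg"
```

With the second test string above this prints art-of-medusa-a-b.jpg, matching the awk result.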


Answer 4:

You can use GNU awk for this (RT is a gawk extension):

s="art-of-medusa-feefacc0-c75e-4846-9ccf-7463d5944061.jpg"
name_after_proc=$(awk -v RS='[.-]' '!/[[:digit:]]/{printf "%s%s", r, $1} {r=RT}' <<< "$s")

echo "$name_after_proc"
art-of-medusa.jpg


Answer 5:

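Two possible solutions (filename and foo are files holding the names):

  1. Using sed: delete the words that contain a digit, squeeze the leftover hyphens, and fix the "-." before the extension:

    sed 's/[a-zA-Z0-9]*[0-9][a-zA-Z0-9]*//g; s/--*/-/g; s/-\././' filename

  2. Using grep: extract the letter-only words, join them with "-", and turn the last "-" back into the dot before the extension:

    grep -wo -E '[a-zA-Z]+' foo | paste -s -d '-' - | sed 's/-\([^-]*\)$/.\1/'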



Tags: bash shell