Shell script: iterate through directories and split f

Posted 2019-06-11 17:49

I need to extract 2 things from filenames - the extension and a number.

I have a folder "/var/www/html/MyFolder/" that contains several subfolders, and each subfolder holds some files. The files are named like "a_X_mytest.jpg" or "a_X_mytest.png". The "a_" prefix is fixed and is the same in every folder; I need the "X" and the file extension.

My script looks like this:

#!/bin/bash
for dir in /var/www/html/MyFolder/*/
do
  dir=${dir%*/}
  find "/var/www/html/MyFolder/${dir##*/}/a_*.*" -maxdepth 1 -mindepth 1 -type f
done

That's only the beginning of my script.

There is a mistake in my script; running it prints:

find: `/var/www/html/MyFolder/first/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/sec/a_*.*': No such file or directory
find: `/var/www/html/MyFolder/test/a_*.*': No such file or directory

Does anybody know where the mistake is? The next step, once the lines above work, is to split the names of the found files and get the two parts.

To split, I would use this:

arrFIRST=(${IN//_/ })
echo ${arrFIRST[1]}
arrEXT=(${IN//./ })
echo ${arrEXT[1]}

Can anybody help me with my problem?

Tags: bash shell unix
2 answers
\"骚年 ilove
Reply #2 · 2019-06-11 18:22

tl;dr:

Your script can be simplified to the following:

for file in /var/www/html/MyFolder/*/a_*.*; do
  [[ -f $file ]] || continue
  [[ "${file##*/}" =~ _(.*)_.*\.(.*)$ ]] && 
    x=${BASH_REMATCH[1]} ext=${BASH_REMATCH[2]}
  echo "$x"
  echo "$ext"
done
  • A single glob (filename pattern, wildcard pattern) is sufficient in your case, because a glob can have multiple wildcards across levels of the hierarchy: /var/www/html/MyFolder/*/a_*.* finds files matching a_*.* in any immediate subfolder (*/) of folder /var/www/html/MyFolder.
    You only need find to match files located on different levels of a subtree (but you may also need it for more complex matching needs).
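  • [[ -f $file ]] || continue ensures that only files are considered. It also makes the loop do nothing if NO matches are found: by default bash then leaves the pattern unexpanded, and the literal pattern string is not an existing file.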
  • [[ ... =~ ... ]] uses bash's regex-matching operator, =~, to extract the tokens of interest from the filename part of each matching file (${file##*/}).
  • The results of the regex matching are stored in the reserved array variable BASH_REMATCH: element 0 holds the entire match, element 1 holds what the 1st parenthesized subexpression ((...) - a.k.a. capture group) captured, and so on.

    • Alternatively, you could have used read with an array to parse matching filenames into their components:

      IFS='_.' read -ra tokens <<<"${file##*/}"
      x="${tokens[1]}"   # tokens[0] is the fixed "a" prefix
      ext="${tokens[@]: -1}"
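
To try the regex approach without touching the filesystem, here is a minimal sketch. The function name parse_name is hypothetical, and the regex is a slightly stricter variant of the one above (it assumes X contains no underscore and the extension contains no dot), so treat it as an illustration rather than a drop-in replacement:

```bash
#!/bin/bash
# parse_name (hypothetical helper): prints "<X> <ext>" for names like a_X_mytest.jpg.
parse_name() {
  local name=${1##*/}                    # strip any leading directory part
  local re='^a_([^_]+)_.*\.([^.]+)$'     # assumption: X has no "_", extension has no "."
  if [[ $name =~ $re ]]; then
    # BASH_REMATCH[0] is the whole match; [1] and [2] are the capture groups.
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1                             # name does not follow the expected pattern
  fi
}

parse_name /var/www/html/MyFolder/first/a_12_mytest.jpg
parse_name a_7_mytest.png
```

Keeping the regex in a variable avoids quoting surprises inside [[ ... =~ ... ]], and returning non-zero on a mismatch prevents stale values from a previous iteration from being reused.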
      

As for why what you tried didn't work:

  • find does NOT support globs as filename arguments, and because the argument is double-quoted the shell does not expand it either, so find interprets "/var/www/html/MyFolder/${dir##*/}/a_*.*" literally.
  • Also, you have to separate the root folder for your search from the filename pattern to look for on any level of the root folder's subtree:
    • the root folder becomes the filename argument
    • the filename pattern is passed (always quoted) via the -name or -iname (for case-insensitive matching) options
    • Ergo: find "/var/www/html/MyFolder/${dir##*/}" -name 'a_*.*' ..., analogous to @konsolebox' answer.
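
The corrected loop can be tried on a throwaway tree. This is only a sketch: the temporary directory from mktemp stands in for /var/www/html/MyFolder, and the sample file names are made up for illustration.

```bash
#!/bin/bash
root=$(mktemp -d)                        # stand-in for /var/www/html/MyFolder (assumption)
mkdir -p "$root/first" "$root/sec"
touch "$root/first/a_1_mytest.jpg" "$root/sec/a_2_mytest.png" "$root/sec/other.txt"

found=$(
  for dir in "$root"/*/; do
    dir=${dir%/}                         # drop the trailing slash
    # The start directory and the name pattern are separate arguments;
    # the pattern stays quoted so that find, not the shell, interprets it.
    find "$dir" -mindepth 1 -maxdepth 1 -type f -name 'a_*.*'
  done
)
printf '%s\n' "$found"

rm -rf "$root"                           # clean up the sample tree
```

Only the two a_*.* files are listed; other.txt is filtered out by -name.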
Animai°情兽
Reply #3 · 2019-06-11 18:26

I'm not sure about the needed complexity but perhaps what you want is

find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*'

Thus:

while IFS= read -r FILE; do
    # Do something with "$FILE"...
done < <(exec find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*')

Or

readarray -t FILES < <(exec find /var/www/html/MyFolder/ -mindepth 2 -maxdepth 2 -type f -name 'a_*.*')
for FILE in "${FILES[@]}"; do
    # Do something with "$FILE"...
done
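
As a rough sketch of how the loop body might be filled in, the following combines the find-based loop with splitting each name on "_" and ".". The temporary directory and the sample file names are assumptions for the demo, and sort is added only to make the order predictable:

```bash
#!/bin/bash
root=$(mktemp -d)                        # stand-in for /var/www/html/MyFolder (assumption)
mkdir -p "$root/first" "$root/test"
touch "$root/first/a_3_mytest.jpg" "$root/test/a_40_mytest.png"

results=()
while IFS= read -r file; do
  # Split the bare file name into tokens on "_" and ".".
  IFS='_.' read -ra tokens <<<"${file##*/}"
  x=${tokens[1]}                         # tokens[0] is the fixed "a" prefix
  ext=${tokens[@]: -1}                   # last token is the extension
  results+=("$x $ext")
done < <(find "$root" -mindepth 2 -maxdepth 2 -type f -name 'a_*.*' | sort)

printf '%s\n' "${results[@]}"
rm -rf "$root"                           # clean up the sample tree
```

Reading line by line assumes file names contain no newlines, which holds for the names described in the question.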