`find -name` with regex pattern and filename repla

Posted 2019-07-12 02:08

Currently I'm using this command in cron to copy *.data files from the source path to the target path:

find /source_path -name *.data -exec cp {} /target_path \;

The source structure is:

    /source_path/category1/001.data
    /source_path/category1/002.data
    /source_path/category2/003.data
    /source_path/category3/004.data
    /source_path/categorya/005.data
    /source_path/categoryb/006.data

After the above cron command, the target will contain:

    /target_path/001.data
    /target_path/002.data
    /target_path/003.data
    /target_path/004.data
    /target_path/005.data
    /target_path/006.data

I need a one-line solution to replace my current cron command, so that after execution, the target will contain:

    /target_path/category1_001.data
    /target_path/category1_002.data
    /target_path/category2_003.data
    /target_path/category3_004.data
    /target_path/categorya_005.data
    /target_path/categoryb_006.data

In other words, the sub-directory name should be prepended to the target filename.

Thanks.

1 answer
我欲成王,谁敢阻挡
#2 · 2019-07-12 02:15

First try this command, which only prints the cp commands it would run:

$ find /source_path -name \*.data  | while read -r filename; do printf "print version: cp %s %s\n" "${filename}" "$(printf "%s\n" "${filename}" | sed "s/^.*[/]\(category[^/]*\)[/]\(.*[.]data\)$/\/target_path\/\1_\2/")"; done

The find command prints the filenames it finds, one per line.

read -r filename reads one line of text and stores it in the variable filename.

Taken together, find ... | while read -r filename writes the list of filenames, one per line, into the pipe, and the loop reads one filename at a time. The commands inside the while block are executed once for each filename read.
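The read loop can be tried on its own, with printf standing in for find. This is only a sketch: list_names and show_names are made-up helper names, and the two pathnames are taken from the question.

```shell
# Hypothetical stand-in for find: emits one pathname per line.
list_names() {
    printf '%s\n' /source_path/category1/001.data /source_path/category2/003.data
}

# read -r takes one line per iteration; -r keeps backslashes literal.
show_names() {
    list_names | while read -r filename; do
        printf 'got: %s\n' "${filename}"
    done
}

show_names
```

Because the loop is line based, a filename that itself contains a newline would be split into two reads.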

The sed command changes a pathname /source_path/category1/001.data into /target_path/category1_001.data.
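To see the rewrite on a single pathname without touching any files, the sed expression can be wrapped in a small function. to_target is a hypothetical name used only here; the expression is the one from the command above, in single quotes.

```shell
# Hypothetical helper wrapping the sed expression from the answer.
to_target() {
    printf '%s\n' "$1" |
        sed 's/^.*[/]\(category[^/]*\)[/]\(.*[.]data\)$/\/target_path\/\1_\2/'
}

to_target /source_path/category1/001.data
# prints /target_path/category1_001.data
```

A pathname whose parent directory does not start with category does not match the pattern, so sed prints it unchanged.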

I have tried my best to explain the string argument of sed in the lines below; if you are interested in these topics, they are worth reading about in more depth.

s/ is the sed search-and-replace command. It is followed by three elements: "s/regex pattern/replacement/flag"

^ at the very start of the pattern means the start of the line.

. means any one char.

* means zero or more repetitions of the char specified just before it.

[/] means one char, the char /. The [] are used to escape /, which would otherwise be interpreted as a delimiter between the regex pattern, the replacement, and the flag.

Altogether, ^.*[/] means a line starting with zero or more chars of any kind, and this starting sequence must end with /.

[^/] means one char; ^ at the start of the brackets means "not one of the chars listed". So it means any one char except /.

[abc], with the chars between [], means one char: either a, b, or c.
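The bracket forms described above can be checked with two tiny substitutions. The function names are hypothetical and exist only for this illustration.

```shell
# [/] is a literal slash; it does not end the pattern part of s///.
slash_to_underscore() {
    printf '%s\n' "$1" | sed 's/[/]/_/'
}

# ^[^/]* is the run of non-slash chars at the start of the line.
replace_first_component() {
    printf '%s\n' "$1" | sed 's/^[^/]*/X/'
}

slash_to_underscore 'a/b'                      # prints a_b
replace_first_component 'category1/001.data'   # prints X/001.data
```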

The first \( ... \) group encountered in the regex pattern can be referenced as \1 in the replacement, the second as \2, and so on. Without the \ escape char, ( means a literal ( char, and the content cannot be referenced.
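A short way to convince yourself how \1 and \2 work is to swap two parts of a string. swap_parts is a made-up name for this sketch.

```shell
# \(.*\) before the hyphen becomes \1, \(.*\) after it becomes \2.
swap_parts() {
    printf '%s\n' "$1" | sed 's/^\(.*\)-\(.*\)$/\2-\1/'
}

swap_parts left-right
# prints right-left
```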

When you are happy with the output, use cp instead to actually copy the files:

find /source_path -name \*.data  | while read -r filename; do cp "${filename}" "$(printf "%s\n" "${filename}" | sed "s/^.*[/]\(category[^/]*\)[/]\(.*[.]data\)$/\/target_path\/\1_\2/")"; done
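As an alternative, not the command above, the same renaming can be sketched without sed by letting find hand the files to a small sh script that builds the prefix with parameter expansion. copy_prefixed is a hypothetical function name; this version assumes the prefix should be whatever directory directly contains each file, whether or not its name starts with category.

```shell
# Usage: copy_prefixed SOURCE_DIR TARGET_DIR
copy_prefixed() {
    find "$1" -type f -name '*.data' -exec sh -c '
        dst=$1; shift
        for f do
            dir=${f%/*}                          # e.g. /source_path/category1
            cp "$f" "$dst/${dir##*/}_${f##*/}"   # e.g. category1_001.data
        done
    ' sh "$2" {} +
}
```

It would be called as copy_prefixed /source_path /target_path. Since the pathnames are passed as arguments rather than read line by line, unusual characters in filenames are not a problem here.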