A lot of the files I download have crap/spam in their filenames, e.g.
[ www.crap.com ] file.name.ext
www.crap.com - file.name.ext
I've come up with two ways of dealing with them, but they both seem pretty clunky:
With parameter expansion:
if [[ ${base_name} != ${base_name//\[+([^\]])\]} ]]
then
    mv -v "${dir_name}/${base_name}" "${dir_name}/${base_name//\[+([^\]])\]}" &&
        base_name="${base_name//\[+([^\]])\]}"
fi
if [[ ${base_name} != ${base_name//www.*.com - /} ]]
then
    mv -v "${dir_name}/${base_name}" "${dir_name}/${base_name//www.*.com - /}" &&
        base_name="${base_name//www.*.com - /}"
fi
# more statements of this type, one for each frequently encountered pattern
And then with echo/sed:
tmp=$(echo "${base_name}" | sed -e 's/\[[^][]*\]//g' -e 's/\s-\s//g')
mv "${base_name}" "${tmp}"
I feel like the parameter expansion is the worse of the two but I like it because I'm able to keep the same variable assigned to the file for further processing after the rename (the above code is used in a script that's called for each file after the file download is complete).
So anyway I was hoping there's a better/cleaner way to do the above that someone more knowledgeable than myself could show me, preferably in a way that would allow me to easily reassign the old/original variable to the new/renamed file.
Thanks
Take advantage of the following classical pattern, a pipeline of three stages (job_select | job_strategy | job_process), where job_select is responsible for selecting the objects of your job, job_strategy prepares a processing plan for these objects, and job_process eventually executes the plan. This assumes that filenames contain neither a vertical bar (|) nor a newline character.

The job_select function
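A minimal sketch of what job_select could look like for the two patterns in the question. The name patterns and the list of pruned directories are assumptions made for illustration, not the answerer's code:

```shell
# job_select DIR: print, one per line, the regular files under DIR whose
# names carry one of the two spam patterns from the question.
# Assumption: directories used by version control tools are skipped entirely.
job_select() {
    find "$1" \
        \( -name CVS -o -name .svn -o -name .svk -o -name .git \) -prune \
        -o -type f \( -name '\[*\]*' -o -name 'www.*.com - *' \) -print
}
```

Calling job_select on a download directory lists only the files that need renaming, which keeps the later stages simple.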
The find command can examine all properties of a file maintained by the file system, such as creation time, access time and modification time. It is also possible to control how the filesystem is explored, by telling find not to descend into mounted filesystems or by limiting how many levels of recursion are allowed. It is common to append pipes to the find command to perform more complicated selections based on the filename.

Avoid the common pitfall of including the contents of hidden directories in the output of the job_select function. For instance, the directories CVS, .svn, .svk and .git are used by the corresponding source control management tools, and it is almost always wrong to include their contents in the output of the job_select function. By inadvertently batch processing these files, one can easily make the affected working copy unusable.

The job_strategy function
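A minimal sketch of a sed-based job_strategy for the two patterns in the question; the regular expressions are assumptions made for illustration. The hold space keeps the old name while the pattern space is edited into the new one:

```shell
# job_strategy: read file names on stdin, print "oldname|newname" lines.
job_strategy() {
    sed -e '
        # keep a copy of the old name in the hold space
        h
        # strip a "[ ... ] " tag, then a "www.*.com - " prefix, from the basename
        s,\[[^]/]*] *\([^/]*\)$,\1,
        s,www\.[^/]*\.com - \([^/]*\)$,\1,
        # swap: pattern space = old name, hold space = new name
        x
        # append the new name, then join both names with a vertical bar
        G
        s/\n/|/
        # a line whose two fields are equal describes no rename: drop it
        /^\(.*\)|\1$/d
    '
}
```

Running the function on its own, without the final stage, is a cheap way to review the plan before anything is renamed.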
This command reads the output of job_select and makes a plan for our renaming job. The plan is represented by text lines having two fields separated by the character |, the first field being the old name of the file and the second being the newly computed name, so each line looks like oldname|newname.

The particular program used to produce the plan is essentially irrelevant, but it is common to use sed, as in the example, awk or perl for this. Let us walk through the sed script used here:

It can be easier to use several filters to prepare the plan. Another common case is the use of the stat command to add creation times to file names.

The job_process function
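A minimal sketch of job_process, assuming the oldname|newname plan format described above. The check that refuses to overwrite an existing target is one possible way to report problematic items, not necessarily the answerer's:

```shell
# job_process: read "oldname|newname" lines on stdin and perform the renames.
job_process() {
    local oldname newname
    # IFS is set to "|" for read only, so spaces inside names are preserved.
    while IFS='|' read -r oldname newname; do
        if [ -e "$newname" ]; then
            # never overwrite: report the problematic item and move on
            printf 'skipped (target exists): %s\n' "$oldname" >&2
        else
            mv -- "$oldname" "$newname"
        fi
    done
}
```

Skipped items are reported on standard error, so they can be collected separately from any normal output.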
The input field separator IFS is adjusted to let the function read the output of job_strategy. Declaring oldname and newname as local is useful in large programs but can be omitted in very simple scripts. The job_process function can be adjusted to avoid overwriting existing files and report the problematic items.

About data structures in shell programs

Note the use of pipes to transfer data from one stage to the other: apprentices often rely on variables to represent such information, but it turns out to be a clumsy choice. Instead, it is preferable to represent data as tabular files or as tabular data streams moving from one process to the other. In this form, data can be easily processed by powerful tools like sed, awk, join, paste and sort, to cite only the most common ones.

Two answers: using perl rename or using pure bash
As some people dislike perl, I also wrote a bash-only version.

Renaming files by using the rename command

Introduction

Yes, this is a typical job for the rename command, which was designed for precisely this kind of task.

More oriented samples
Simply drop all spaces and square brackets:

Rename all .jpg files, numbering them from 1:

Demo:
Full syntax for matching the SO question, in a safe way

There is a robust and safe way using the rename utility.

As this is a common perl tool, we have to use perl syntax:
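As a sketch of such a rule, assuming the Perl rename (File::Rename, as packaged on Debian and Ubuntu) is the rename found on the PATH: the two substitutions below target the patterns from the question, and -n only previews the result. The function name preview_spam_rename is hypothetical:

```shell
# preview_spam_rename FILE...: show what would be renamed without touching
# any file (-n); drop -n to rename for real.
# Assumption: names are given without a directory part, because both
# substitutions are anchored at the start of the name (^).
preview_spam_rename() {
    rename -n 's/^\[[^\]]*\]\s*//; s/^www\.\S+\.com\s+-\s+//' "$@"
}
```

From inside the download directory, `preview_spam_rename *` prints the planned renames and leaves every file in place.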
Testing rule:

... and so on...

... and it's safe as long as you don't pass the -f flag to the rename command: files won't be overwritten and you will get an error message if something goes wrong.

Renaming files by using bash and so-called bashisms:
I prefer doing this with a dedicated utility, but it can also be done in pure bash (that is, without any fork).

No binary other than bash is used for the name handling (no sed, awk, tr or other). The script is run with the files to rename as arguments.
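A minimal sketch of the bash-only approach, assuming bash with extglob enabled. The function names clean_name and rename_clean and the three rules are illustrative assumptions; mv is the only external command involved:

```shell
#!/usr/bin/env bash
shopt -s extglob   # enables +(...) and *(...) in patterns

# clean_name NAME: leave the cleaned name in REPLY (no subshell, no fork).
clean_name() {
    local name=$1
    name=${name//\[+([^\]])\]*( )/}    # "[ www.crap.com ] " style tags
    name=${name//www.+([^ ]).com - /}  # "www.crap.com - " style prefixes
    name=${name//[.-]+([.-])/-}        # runs such as ".-", "--" or ".." become "-"
    REPLY=$name
}

# rename_clean FILE...: rename each file to its cleaned name, never overwriting.
rename_clean() {
    local file dir base new
    for file; do
        if [[ $file == */* ]]; then dir=${file%/*}/; else dir=; fi
        base=${file##*/}
        clean_name "$base"
        new=$REPLY
        # nothing to do if the name is unchanged or would become empty
        [[ -z $new || $new == "$base" ]] && continue
        if [[ -e $dir$new ]]; then
            printf 'not overwriting %s\n' "$dir$new" >&2
        else
            mv -- "$file" "$dir$new"
        fi
    done
}
```

Because clean_name leaves its result in REPLY, a caller that tracks the file in a variable, as in the question, can update it with `clean_name "$base_name"; base_name=$REPLY` once the rename has succeeded.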
Any sequence of .-, -., -- or .. is replaced by a single -.

If you want something that does not depend on perl, you can use the following code (let's call it sanitizeNames.sh). It only shows a few cases, but it is easily extended using string substitution, tr (and sed too). Then run it on the files you want to rename.
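A minimal sketch of what such a sanitizeNames.sh could contain, using only sed and tr. The function names and the three rules (bracketed tags, www.*.com prefixes, spaces to underscores) are assumptions chosen for illustration:

```shell
#!/bin/sh
# sanitize NAME: print a cleaned-up version of NAME.
sanitize() {
    printf '%s\n' "$1" |
        sed -e 's/\[[^]]*] *//g' \
            -e 's/www\.[^ ]*\.com - //g' |
        tr -s ' ' '_'    # spaces become underscores, repeated ones are squeezed
}

# sanitize_files FILE...: rename each file, refusing to overwrite a target.
sanitize_files() {
    for f in "$@"; do
        case $f in
            */*) dir=${f%/*} ;;
            *)   dir=. ;;
        esac
        base=${f##*/}
        new=$(sanitize "$base")
        # skip names that are unchanged or would become empty
        if [ -z "$new" ] || [ "$new" = "$base" ]; then
            continue
        fi
        if [ -e "$dir/$new" ]; then
            printf 'not overwriting %s\n' "$dir/$new" >&2
        else
            mv -- "$f" "$dir/$new"
        fi
    done
}

sanitize_files "$@"
```

Saved as sanitizeNames.sh, it could be called as `sh sanitizeNames.sh ./*`; further rules are added as extra -e expressions or extra pipeline stages.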
If you are using Ubuntu or Debian, use the rename command to rename multiple files at a time.
You can use rnm:

The above will remove [crap] or [spam] from the filename.

You can pass multiple regex patterns by terminating them with a semicolon (;) or by overloading the -rs option.

The general format of this replace string is /search_part/replace_part/modifier

uppercase/lowercase:

A replace string of the form /search_part/\c/modifier will make the part of the filename selected by the regex search_part lowercase, while \C (capital \C) in the replace part will make it uppercase.

If you have many regex patterns that need to be dealt with, put those patterns in a file and pass the file with the -rs/f option.

You can find some other examples here.
Note:
rnm -u
P.S: I am the author of this tool.