I am writing a bash script that needs to parse filenames.
It will need to remove all special characters (including space): "!?.-_ and change all uppercase letters to lowercase. Something like this:
Some_randoM data1-A
More Data0
to:
somerandomdata1a
moredata0
I have seen lots of questions to do this in many different programming languages, but not in bash. Is there a good way to do this?
cat yourfile.txt | tr -dc '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]'
The first tr deletes special characters. d means delete, c means complement (invert the character set). So, -dc means delete all characters except those specified. The \n and \r are included to preserve Linux or Windows style newlines, which I assume you want.
The second one translates uppercase characters to lowercase.
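As a quick check, here is a minimal sketch that runs the same pipeline on the sample input from the question. The `input` and `result` variable names are only for illustration, and `printf` stands in for `yourfile.txt`.

```shell
# Sample input taken from the question; printf stands in for yourfile.txt.
input='Some_randoM data1-A
More Data0'

# Step 1: delete everything except alphanumerics, \n and \r.
# Step 2: translate uppercase letters to lowercase.
result=$(printf '%s\n' "$input" | tr -dc '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]')

printf '%s\n' "$result"
```

Because the newline is kept by the first tr, each input line still ends up on its own output line.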
Pure BASH 4+ solution:
$ filename='Some_randoM data1-A'
$ f=${filename//[^[:alnum:]]/}
$ echo "$f"
SomerandoMdata1A
$ echo "${f,,}"
somerandomdata1a
A function for this:
clean() {
local a=${1//[^[:alnum:]]/}
echo "${a,,}"
}
Try it:
$ clean "More Data0"
moredata0
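The `${a,,}` expansion needs Bash 4 or later, so it will not work on older shells (macOS ships Bash 3.2 by default, for example). If that matters, one possible fallback is to do the same two steps with tr inside the function. This is only a sketch, and `clean_portable` is a made-up name, not part of the answer above:

```shell
# Hypothetical helper: same result as clean(), but without Bash 4 expansions.
clean_portable() {
    # printf '%s' is used instead of echo so names starting with "-" are safe.
    # The first tr drops every non-alphanumeric character, the second lowercases.
    printf '%s' "$1" | tr -dc '[:alnum:]' | tr '[:upper:]' '[:lower:]'
    # tr removed any newline too, so print one to end the line.
    printf '\n'
}

clean_portable 'Some_randoM data1-A'
clean_portable 'More Data0'
```

The trade-off is that this version starts two tr processes per call, while the pure Bash function does everything inside the shell.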
If you are using the approach from mkelement0 and Dan Bliss, you can also look into sed with a POSIX regular expression.
cat yourfile.txt | sed 's/[^a-zA-Z0-9]//g'
sed matches every character that is not a letter or a digit (the ^ at the start of the brackets negates the set) and removes it.
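Note that the sed command on its own only strips characters; it does not lowercase anything. A minimal sketch of one way to cover both requirements from the question, assuming it is acceptable to pipe through tr afterwards (`[^[:alnum:]]` is used here as the POSIX class equivalent of `[^a-zA-Z0-9]` in the C locale, and `result` is just an illustrative variable):

```shell
# printf stands in for yourfile.txt; the two lines are the question's sample input.
result=$(printf '%s\n' 'Some_randoM data1-A' 'More Data0' |
    sed 's/[^[:alnum:]]//g' |
    tr '[:upper:]' '[:lower:]')

printf '%s\n' "$result"
```

sed works line by line, so the newlines between lines are preserved without having to list them explicitly.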
I've used tr to remove any characters that are not part of the [:print:] class
cat file.txt | tr -dc '[:print:]'
or
echo "..." | tr -dc '[:print:]'
Additionally you might want to | (pipe) the output to od -c to confirm the result
cat file.txt | tr -dc '[:print:]' | od -c
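To see what this filter actually removes, here is a small sketch with a made-up input containing a tab and a trailing newline. Neither character belongs to `[:print:]`, so both are deleted, which means this filter also joins all lines of a file into one. The `remaining` variable is only for illustration.

```shell
# Before filtering: od -c shows the tab and the newline as \t and \n.
printf 'a\tb\n' | od -c

# After filtering: only the printable bytes "a" and "b" are left.
printf 'a\tb\n' | tr -dc '[:print:]' | od -c

# Count the bytes that survive; the tab and the newline are both gone.
remaining=$(printf 'a\tb\n' | tr -dc '[:print:]' | wc -c)
echo "bytes remaining: $remaining"
```

If the line breaks should survive, adding `\n` to the set, as in `tr -dc '[:print:]\n'`, keeps them.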