Objective
Change these filenames:
- F00001-0708-RG-biasliuyda
- F00001-0708-CS-akgdlaul
- F00001-0708-VF-hioulgigl
to these filenames:
- F0001-0708-RG-biasliuyda
- F0001-0708-CS-akgdlaul
- F0001-0708-VF-hioulgigl
Shell Code
To test:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/'
To perform:
ls F00001-0708-*|sed 's/\(.\).\(.*\)/mv & \1\2/' | sh
My Question
I don't understand the sed code. I understand what the substitution command
$ sed 's/something/mv'
means. And I understand regular expressions somewhat. But I don't understand what's happening here:
\(.\).\(.*\)
or here:
& \1\2/
The former, to me, just looks like it means: "a single character, followed by a single character, followed by any length sequence of a single character"--but surely there's more to it than that. As far as the latter part:
& \1\2/
I have no idea. I really want to understand this code. Please help me out here, guys.
I wrote a small post with examples on batch renaming using sed a couple of years ago: http://www.guyrutenberg.com/2009/01/12/batch-renaming-using-sed/
For example:
If the regex contains groups (e.g. \(subregex\)), then you can use them in the replacement text as \1, \2, etc.
If all you're really doing is removing the second character, regardless of what it is, you can do this:
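A minimal sketch of one way to drop only the second character, assuming sed's numeric substitution flag is what is meant: in s/regex/replacement/N, the N applies the substitution to the Nth match only. The sample filename comes from the question; the variable names are just for illustration.

```shell
# "." matches any single character; the trailing 2 tells sed to replace
# only the 2nd match on the line, i.e. to delete the second character.
name='F00001-0708-RG-biasliuyda'
shortened=$(printf '%s\n' "$name" | sed 's/.//2')
printf '%s\n' "$shortened"
```

With the name above this prints F0001-0708-RG-biasliuyda.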
But your command is building a mv command and piping it to the shell for execution.
This is no more readable than your version:
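A rough sketch of what a find-based variant could look like, not necessarily the command originally posted. The function name preview_renames is made up for this example, and -maxdepth 1 is an assumption: it is not in POSIX, but GNU and BSD find both support it.

```shell
# Dry run: print the mv commands instead of executing them.
# -maxdepth 1 keeps every path in the form ./NAME, so character 4 of the
# path is always the second character of NAME.
preview_renames() {
    find . -maxdepth 1 -type f -name 'F00001-0708-*' | while IFS= read -r path; do
        new=$(printf '%s\n' "$path" | sed 's/.//4')
        printf 'mv %s %s\n' "$path" "$new"
    done
}
```

Calling preview_renames in the directory holding the files prints one mv line per file; piping that output to sh would perform the renames, with the same caveat as the original pipeline that names containing spaces or shell metacharacters would break.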
The fourth character is removed because find is prepending each filename with "./".
The parentheses capture particular strings for use by the backslashed numbers.
The sed command means to replace:
\(.\).\(.*\)
with:
mv & \1\2
just like a regular sed command. However, the parentheses, & and \n markers (where n is a digit, as in \1 and \2) change it a little.
The search string matches (and remembers as pattern 1) the single character at the start, followed by a single character, followed by the rest of the string (remembered as pattern 2).
In the replacement string, you can refer to these matched patterns to use them as part of the replacement. You can also refer to the whole matched portion as &.
So what that sed command is doing is creating a mv command based on the original file (for the source) and character 1 and 3 onwards, effectively removing character 2 (for the destination). It will give you a series of lines along the following format:
mv F00001-0708-RG-biasliuyda F0001-0708-RG-biasliuyda
mv F00001-0708-CS-akgdlaul F0001-0708-CS-akgdlaul
and so on.
The backslash-paren stuff means, "while matching the pattern, hold on to the stuff that matches in here." Later, on the replacement text side, you can get those remembered fragments back with "\1" (first parenthesized block), "\2" (second block), and so on.
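To see the remembered fragments separately, here is a small sketch that reuses the pattern from the question but labels each piece in the replacement. The labels whole=, first= and rest= and the variable names are only for illustration.

```shell
# \(.\) is remembered as \1, the bare "." is matched but not remembered,
# and \(.*\) is remembered as \2. "&" stands for the entire matched text.
name='F00001-0708-CS-akgdlaul'
parts=$(printf '%s\n' "$name" | sed 's/\(.\).\(.*\)/whole=& first=\1 rest=\2/')
printf '%s\n' "$parts"
```

This prints whole=F00001-0708-CS-akgdlaul first=F rest=0001-0708-CS-akgdlaul, so gluing \1 and \2 back together gives the name without its second character.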
The easiest way would be:
or, portably,
This replaces the F00001 prefix in the filenames with F0001. Credits to mahesh here: http://www.debian-administration.org/articles/150
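A prefix replacement of this kind can be sketched with a plain shell loop. This is an illustration under assumptions, not necessarily the commands from the linked article: the function name rename_prefix_preview is hypothetical, and it only prints the mv commands. It uses the POSIX expansion ${f#F00001}; in bash, ${f/F00001/F0001} would do the same substitution but is not portable.

```shell
# Dry run: print one mv command per matching file.
# ${f#F00001} strips the leading "F00001" from $f, and the new prefix
# "F0001" is put in front of what remains.
rename_prefix_preview() {
    for f in F00001-0708-*; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        echo mv "$f" "F0001${f#F00001}"
    done
}
```

Once the printed commands look right, dropping the echo makes the loop perform the renames directly, with no sed and no second shell involved.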