That title doesn't explain much but I couldn't summarize it quickly. Let's say I have files like this (all in the same directory)...
abc_foo_file1_morestuff.ext
abc_foo_file2_morestuff.ext
efg_goo_file1_morestuff.ext
jkl_zoo_file0_morestuff.ext
jkl_zoo_file1_morestuff.ext
jkl_zoo_file4_morestuff.ext
xyz_roo_file6_morestuff.ext
And I want them renamed to:
abc-1.ext
abc-2.ext
efg-1.ext
jkl-1.ext
jkl-2.ext
jkl-3.ext
xyz-1.ext
So basically some files in sets (abc, jkl, xyz) got removed and some got renamed to have a zero in them so they'd be listed first. But I want to resequence them to start at 1 and not have any gaps in the sequence.
I tagged this with Python because that's what I've attempted before, but if there's a simpler or cleaner approach, I'm all for it!
My general approach to this problem would be the following steps:
- Get a list of all of the files you want renamed
- Create a list of the starting sequences
- Go through a list of the files that start with each sequence changing their names
- Optionally, sort the list of files in each sequence
So, in Python that would look like:
from os import listdir
from os.path import isfile, join, splitext
import shutil
import re

mypath = "./somewhere/"

# this function taken from an answer to http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort
def natural_sort_key(s, _nsre=re.compile('([0-9]+)')):
    return [int(text) if text.isdigit() else text.lower()
            for text in re.split(_nsre, s)]

# Get a list of all files in a directory
infiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

# Create a set of starting sequences
sequences = {splitext(file)[0].split('_')[0] for file in infiles}

# Now go about changing the files
for seq in sequences:
    # compare the whole prefix, so "abc" does not also pick up "abcd_..."
    files = sorted([f for f in infiles
                    if splitext(f)[0].split('_')[0] == seq],
                   key=natural_sort_key)  # sort the files, if you like
    for idx, file in enumerate(files):
        # splitext keeps the leading dot of the extension
        new_name = seq + '-' + str(idx + 1) + splitext(file)[1]
        print(file, " -> ", new_name)
        # copy or move
        #shutil.copy2(join(mypath,file), join(mypath,new_name))
        #shutil.move(join(mypath,file), join(mypath,new_name))
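It can help to work out the full old-to-new mapping before touching the disk, so the result can be checked first. Here is a minimal sketch of that idea, assuming the file names follow the pattern from the question; `plan_renames` is a hypothetical helper name, not part of any library:

```python
import re
from collections import defaultdict
from os.path import splitext


def natural_sort_key(s):
    # Split into digit and non-digit runs so "file10" sorts after "file2".
    return [int(t) if t.isdigit() else t.lower()
            for t in re.split(r'([0-9]+)', s)]


def plan_renames(names):
    """Return a dict mapping each original name to '<prefix>-<n><ext>'."""
    groups = defaultdict(list)
    for name in names:
        # The prefix is everything before the first underscore.
        prefix = splitext(name)[0].split('_')[0]
        groups[prefix].append(name)

    plan = {}
    for prefix, members in groups.items():
        ordered = sorted(members, key=natural_sort_key)
        for idx, name in enumerate(ordered, start=1):
            plan[name] = prefix + '-' + str(idx) + splitext(name)[1]
    return plan


# Sample names taken from the question.
sample = [
    "abc_foo_file1_morestuff.ext",
    "abc_foo_file2_morestuff.ext",
    "efg_goo_file1_morestuff.ext",
    "jkl_zoo_file0_morestuff.ext",
    "jkl_zoo_file1_morestuff.ext",
    "jkl_zoo_file4_morestuff.ext",
    "xyz_roo_file6_morestuff.ext",
]

for old, new in sorted(plan_renames(sample).items()):
    print(old, "->", new)
```

Once the printed mapping looks right, each pair could be passed to `shutil.move` in the same way as the commented-out lines above.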
You can do this with a batch file:
@ECHO OFF
SETLOCAL EnableExtensions EnableDelayedExpansion
REM Process each EXT file in the specified directory.
FOR /F "usebackq tokens=* delims=" %%A IN (`DIR /B /S "C:\Path\To\Files\*.ext"`) DO (
    REM Extract the prefix.
    FOR /F "usebackq tokens=1 delims=_" %%X IN ('%%~nA') DO SET Prefix=%%X
    REM Running tally of each prefix.
    SET /A Count[!Prefix!] += 1
    REM Rename the file using the prefix.
    CALL SET NewName=!Prefix!-%%Count[!Prefix!]%%
    REN "%%A" "!NewName!%%~xA"
)
ENDLOCAL
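A running tally like this hands out numbers in whatever order the names arrive, so the result depends on how the list is ordered. Here is a rough Python sketch of the same tally idea, using made-up sample names and a hypothetical `file_index` helper. It shows how plain string order and numeric order differ once an index reaches two digits:

```python
import re
from collections import defaultdict
from os.path import splitext


def tally_names(names):
    """Assign '<prefix>-<count><ext>' in the order the names are given."""
    counts = defaultdict(int)
    result = []
    for name in names:
        prefix = splitext(name)[0].split('_')[0]
        counts[prefix] += 1  # running tally per prefix
        result.append((name, prefix + '-' + str(counts[prefix]) + splitext(name)[1]))
    return result


def file_index(name):
    # Hypothetical helper: the number after "file", or 0 if there is none.
    match = re.search(r'_file(\d+)_', name)
    return int(match.group(1)) if match else 0


names = ["abc_foo_file10_x.ext", "abc_foo_file2_x.ext"]

# Plain string order puts "file10" before "file2".
plain = tally_names(sorted(names))

# Ordering by prefix and then by numeric index puts "file2" first.
numeric = tally_names(sorted(names, key=lambda n: (n.split('_')[0], file_index(n))))

print(plain)
print(numeric)
```

If the listing order matters for your files, sort the names numerically before handing out the counts.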
Here is a pure batch-file solution that meets all of your requirements (see also the `rem` comments):
@echo off
setlocal EnableExtensions

rem definition of temporary file:
set "TEMPFILE=%TMP%\~AlphaNum.tmp"

rem first loop structure for listing the files with `dir`
rem and creating the temporary file for later sorting:
> "%TEMPFILE%" (
    for /F "eol=| tokens=1,2,3,* delims=_" %%I in ('
        dir /B /A:-D "*_*_file*_*.ext"
    ') do (
        rem store different parts of the file name into variables:
        setlocal DisableDelayedExpansion
        set "LINI=%%~I" & rem this is the very first part (prefix)
        set "LINL=%%~J" & rem this is the part left to `file*`
        set "LINM=%%~K" & rem this is the part `file*` (`*` is numeric)
        set "LINR=%%~L" & rem this is the part right to `file*`
        setlocal EnableDelayedExpansion
        rem extract the numeric part of `file*` and pad with leading `0`s;
        rem so the index is now of fixed width (12 figures at most here):
        set "LINN=000000000000!LINM:*file=!" & set "LINN=!LINN:~-12!"
        rem write file name with replaced fixed-width index and without
        rem word `file`, followed by `|`, followed by original file name,
        rem to the temporary file (the `echo` is redirected `>`):
        echo !LINI!_!LINL!_!LINN!_!LINR!^|!LINI!_!LINL!_!LINM!_!LINR!
        endlocal
        endlocal
    )
)

rem second loop structure for reading the temporary file,
rem sorting its lines with `sort`, generating new indexes
rem (sorting the previously built fixed-width indexes as text
rem results in the same order as sorting them as numbers) and
rem building the new file names (original first part + new index):
set "PREV="
for /F "eol=| tokens=2 delims=|" %%I in ('
    sort "%TEMPFILE%"
') do (
    setlocal DisableDelayedExpansion
    rem get the full original file name, and its extension:
    set "FILE=%%~I"
    set "FEXT=%%~xI"
    rem this loop iterates once only and extracts the part before the first `_`:
    for /F "eol=| tokens=1 delims=_" %%X in ("%%~I") do (
        set "CURR=%%~X"
    )
    setlocal EnableDelayedExpansion
    rem if the current prefix equals the previous one, increment the index;
    rem otherwise, reset the index to `1`:
    if /I "!CURR!"=="!PREV!" (
        set /A SNUM+=1
    ) else (
        set /A SNUM=1
    )
    rem remove `ECHO` from the following line to actually rename files:
    ECHO ren "!FILE!" "!CURR!-!SNUM!!FEXT!"
    rem this loop iterates once only and transports the values of
    rem some variables past the `setlocal`/`endlocal` barrier:
    for /F "tokens=1,* delims=|" %%X in ("!SNUM!|"!CURR!"") do (
        endlocal
        endlocal
        set "SNUM=%%X"
        set "PREV=%%~Y"
    )
)

rem remove `REM` from the following line to delete temporary file:
REM del "%TEMPFILE%"
endlocal
The toggling of `EnableDelayedExpansion` and `DisableDelayedExpansion` is required to make the script robust against any special characters that might occur in file names, like `%`, `!`, `(`, `)`, `&` and `^`. Type `set /?` into the command prompt to find brief information about delayed variable expansion.
This approach relies on the following assumptions:
- the files match the pattern `*_*_file*_*.ext`;
- none of the file name parts matched by `*` contains a `_` character itself;
- the indexes after the word `file` are decimal numbers with up to 12 digits;
- the part between the first and second `_` takes precedence over the index after the word `file` with respect to the sort order; so, for instance, given two files `abc_AAA_file8_*.ext` and `abc_BBB_file5_*.ext`, `abc_AAA_file8_*.ext` will be renamed to `abc-1.ext` and `abc_BBB_file5_*.ext` to `abc-2.ext`, because `AAA` comes before `BBB`; if this is not the desired behaviour, replace the `echo` command line in the first loop structure with this one:
    echo !LINI!_!LINN!_!LINL!_!LINR!^|!LINI!_!LINL!_!LINM!_!LINR!
- the newly generated indexes are not padded with leading zeros.
With the sample files from your question, the temporary file contains the following lines:
abc_foo_000000000001_morestuff.ext|abc_foo_file1_morestuff.ext
abc_foo_000000000002_morestuff.ext|abc_foo_file2_morestuff.ext
efg_goo_000000000001_morestuff.ext|efg_goo_file1_morestuff.ext
jkl_zoo_000000000000_morestuff.ext|jkl_zoo_file0_morestuff.ext
jkl_zoo_000000000001_morestuff.ext|jkl_zoo_file1_morestuff.ext
jkl_zoo_000000000004_morestuff.ext|jkl_zoo_file4_morestuff.ext
xyz_roo_000000000006_morestuff.ext|xyz_roo_file6_morestuff.ext
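The zero-padding trick is easy to try outside of batch. Here is a minimal Python sketch that builds the same kind of `sortkey|original` line, assuming names of the form `prefix_left_fileN_right.ext`; `sort_line` is a hypothetical helper name and the width of 12 mirrors the script above:

```python
def sort_line(name, width=12):
    """Build 'padded-name|original-name' for a name like a_b_fileN_rest.ext."""
    # Split into prefix, left part, `fileN` part and the remainder.
    prefix, left, middle, right = name.split('_', 3)
    # Assumes the third part starts with the word "file".
    digits = middle[len('file'):]
    # Left-pad with zeros and keep the last `width` characters.
    padded = digits.zfill(width)[-width:]
    return prefix + '_' + left + '_' + padded + '_' + right + '|' + name


names = [
    "jkl_zoo_file10_morestuff.ext",
    "jkl_zoo_file4_morestuff.ext",
    "jkl_zoo_file0_morestuff.ext",
]

# Sorting the padded lines as plain text yields numeric order.
lines = sorted(sort_line(n) for n in names)

# The original name is whatever follows the `|` separator.
originals = [line.split('|', 1)[1] for line in lines]

for line in lines:
    print(line)
```

Because every index has the same width, a plain text sort compares the digits position by position, which is why `file4` ends up before `file10` here.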