I need the help of you programming savants in creating a batch script or powershell script that will move and divide a group of files from one directory into 4 subdirectories based on an average total filesize. After the sort, the sub-directories should be roughly equal in terms of folder size.
Why do I need this?
I have 4 computers that I would like to utilize for encoding via FFMPEG and it would be helpful for a script to divide a folder into 4 parts (sub-directories) based on a total average size.
So let's say there is an assortment of movie files of varying sizes totaling 100 GB; the script would divvy up the movie files and move them into 4 subfolders, each holding around 25 GB. Doing this will allow the 4 machines to share the encoding work equally and efficiently.
After all that encoding I'll have 2 files, XYZ.(original extension) and XYZ.264. A script that could compare the 2 files and delete the larger one would be extremely helpful and cut down on manual inspection.
Thank you, I hope this is possible.
@ECHO Off
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
PUSHD "%sourcedir%"
:: number of subdirectories
SET /a parts=4
:: make subdirs and initialise totalsizes
FOR /L %%a IN (1,1,%parts%) DO MD "%destdir%\sub%%a" 2>nul&SET /a $%%a=0
:: directory of sourcefiles, sort in reverse-size order
FOR /f "tokens=1*delims=" %%a IN (
'dir /b /a-d /o-s * '
) DO (
REM find smallest subdir by size-transferred-in
SET /a smallest=2000000000
FOR /L %%p IN (1,1,%parts%) DO IF !$%%p! lss !smallest! SET /a smallest=!$%%p!&SET part=%%p
REM transfer the file and count the size
ECHO(MOVE "%%a" "%destdir%\sub!part!"
REM divide by 100 as actual filelength possibly gt 2**31
SET "size=%%~za"
IF "!size:~0,-2!" equ "" (SET /a $!part!+=1) ELSE (SET /a $!part!=!size:~0,-2! + $!part!)
)
popd
GOTO :EOF
I believe the remarks should explain the method. The principle is to record the length-transferred to each subdirectory and select the least-filled as the destination for the file (processed in reverse-size order)
Since batch arithmetic is limited to 2^31, I chose to roughly divide the filesize by 100 by lopping off the last 2 digits. For files <100 bytes, I arbitrarily recorded that as 100 bytes.
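For readers who want to try the least-filled selection on plain numbers before touching any files, here is a minimal sketch in POSIX shell with awk. It is not the batch script above, only an illustration of the same principle; the function name `greedy_split` and the sample sizes are made up for the example. awk keeps its totals as floating-point numbers, so the divide-by-100 workaround is not needed here.

```shell
# Greedy split: hand each size (largest first) to the group whose
# running total is currently the smallest.
# Usage: greedy_split PARTS SIZE [SIZE...]   (hypothetical helper)
greedy_split() {
    parts=$1
    shift
    printf '%s\n' "$@" | sort -rn | awk -v n="$parts" '
        {
            # find the least-filled group; ties go to the lowest number
            best = 1
            for (i = 2; i <= n; i++)
                if (total[i] + 0 < total[best] + 0) best = i
            total[best] += $1
            printf "%s -> sub%d\n", $1, best
        }
        END {
            for (i = 1; i <= n; i++)
                printf "sub%d total: %.0f\n", i, total[i]
        }'
}

# Sample sizes in GB (made up for illustration)
greedy_split 4 40 25 15 10 5 3 2
```

With these sample sizes the four totals come out as 40, 25, 18 and 17: the single 40 GB file dominates its group, and the remaining files are spread over the other three.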
You would need to change the settings of sourcedir and destdir to suit your circumstances.
The required MOVE commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO(MOVE to MOVE to actually move the files. Append >nul to suppress report messages (e.g. 1 file moved).
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "destdir=U:\destdir"
:: at least 15 spaces, used to left-pad the sizes for the string comparison
SET "spaces=               "
FOR /f "delims=" %%a IN (
'dir /b /ad "%destdir%\*"'
) DO (
PUSHD "%destdir%\%%a"
FOR /f "delims=" %%f IN (
'dir /b /a-d "*.xyz" 2^>nul'
) DO (
IF EXIST "%%f.264" (
FOR %%k IN ("%%f.264") DO (
SET "sizexyz=%spaces%%%~zf"
SET "size264=%spaces%%%~zk"
IF "!sizexyz:~-15!" gtr "!size264:~-15!" (ECHO(DEL /F /Q "%%f") ELSE (ECHO(DEL /F /Q "%%f.264")
)
)
)
popd
)
GOTO :EOF
This second batch scans the directory names into %%a, then switches temporarily to the destination directory %destdir%\%%a.
Once there, we look for .xyz files and, for each one found, look for the corresponding .xyz.264 file.
If that exists, we find the sizes of the two files (%%~zk and %%~zf) and append each to a long string of spaces. Because the sizes may exceed the 2^31 limit of batch arithmetic, they are compared as strings: the last 15 characters of each result are right-aligned numbers, so the string comparison tells us which file is larger.
The required DEL commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO(DEL to DEL to actually delete the files.
If the .264 file is filename.264 instead of filename.xyz.264, then replace each "%%f.264" with "%%~nf.264" (the ~n selects the name-part only).
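The same "report the larger of the pair" check can be sketched in POSIX shell, where the arithmetic is not limited to 2^31 and no space-padding is needed. This is only an illustration under assumptions: the helper name `larger_of` is invented, and like the ECHOed DEL above it only prints the candidate for deletion rather than removing anything. On a tie it names the second file, matching the ELSE branch of the batch.

```shell
# Print the larger of two files (the candidate for deletion); dry run only.
# Usage: larger_of ORIGINAL ENCODED   (hypothetical helper)
larger_of() {
    [ -f "$1" ] && [ -f "$2" ] || return 1
    # wc -c reports the size in bytes; $(( )) strips any padding it prints
    size_a=$(( $(wc -c < "$1") ))
    size_b=$(( $(wc -c < "$2") ))
    if [ "$size_a" -gt "$size_b" ]; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "$2"
    fi
}
```

A call such as `larger_of movie.xyz movie.xyz.264` prints the name of whichever file is bigger; once the output looks right, it could be fed to `rm`.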
To manually enter a source directoryname, use
SET /p "sourcedir=Source directory "
To accept the source directoryname as a parameter, use
SET "sourcedir=%~1"
To process all files, except .h264 files, change
FOR /f "delims=" %%f IN (
'dir /b /a-d "*.xyz" 2^>nul'
) DO (
to
FOR /f "delims=" %%f IN (
'dir /b /a-d "*.*" 2^>nul'
) DO if /i "%%~xf" neq ".h264" (
where *.* means "all files" and the extra if statement checks whether the extension of the filename %%f (%%~xf) is not equal to (neq) .h264; the /i makes the comparison case-insensitive.
This might seem like a simple request, but exact partitioning is actually a really hard problem.
The easiest way to approximate a somewhat fair partitioning is simply to sort all files (from biggest to smallest) and then distribute them one-by-one into n groups (a bit like if you were giving out cards for a card game):
# Define number of subgroups/partitions
$n = 4
# Create your destination folders:
$TargetFolders = 1..$n |ForEach-Object {
mkdir "C:\path\to\movies\sub$_"
}
# Find the movie files and sort them by length, descending
$Files = Get-ChildItem "C:\path\to\movies" -File -Recurse |Where-Object {'.mp4','.mpg','.xyz' -contains $_.Extension} |Sort-Object Length -Descending
for($i = 0; $i -lt $Files.Count; $i++)
{
    # Move files into sub folders, using modulo $n to "rotate" the target folder
    Move-Item $Files[$i].FullName -Destination $TargetFolders[$i % $n].FullName
}
If you have multiple file types that you want to include, use a Where-Object filter instead of the Filter parameter with Get-ChildItem:
$Files = Get-ChildItem "C:\path\to\movies" -File -Recurse |Where-Object {'.mp4','.mpg','.xyz' -contains $_.Extension} |Sort-Object Length -Descending
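To get a feel for how approximate the card-dealing approach is, the sketch below deals a list of sizes into n groups and prints the group totals. It is written in POSIX shell with awk rather than PowerShell, and the function name `round_robin_split` and the sample sizes are invented for illustration.

```shell
# Deal sizes (largest first) into PARTS groups in rotation, like cards,
# then print each group's total.
# Usage: round_robin_split PARTS SIZE [SIZE...]   (hypothetical helper)
round_robin_split() {
    parts=$1
    shift
    printf '%s\n' "$@" | sort -rn | awk -v n="$parts" '
        { total[(NR - 1) % n + 1] += $1 }   # line 1 -> group 1, line n+1 -> group 1 again
        END {
            for (i = 1; i <= n; i++)
                printf "sub%d total: %.0f\n", i, total[i]
        }'
}

# Sample sizes in GB (made up for illustration)
round_robin_split 4 40 25 15 10 5 3 2
```

For these sample sizes the totals are 45, 28, 17 and 10, so the first group always receives the largest file of every round. The split gets more even when there are many files of similar size.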
#!/bin/bash
nbr_of_dirs=4
# Go to directory if specified, otherwise execute in current directory
if [ -n "$1" ]; then
    cd "$1" || exit 1
fi
# Create output directories and store them in an array
for i in $(seq 1 "$nbr_of_dirs"); do
    dir=dir_$i
    mkdir -p "$dir"
    dirs[i]=$dir
done
# For every non-directory, in decreasing size:
# find out the current smallest directory and move the file there
# (quoting keeps filenames with spaces intact)
ls -pS | grep -v / | while IFS= read -r line; do
    smallest_dir=$(du -S "${dirs[@]}" | sort -n | head -n 1 | cut -f2)
    mv -- "$line" "$smallest_dir"
done
Remember to keep the script file in a different directory when executing this. The script iterates over every file, so if the script were in the directory too, it would be moved to one of the sub-directories.