I have data that is mostly organized in a way that I can convert and import into a spreadsheet. But certain records are split across several lines and contain extra text that my current batch file can't handle.
Good Data:
Pipers Cove × 2 $25.00
Pipers Cove Petite × 2 $25.00
Pipers Cove Plus × 2 $25.00
Nordic Club × 2 $25.00
Whiteout × 1 $12.50
Bad Data:
Pipers Cove Kids × 2
Size:
Large - ages 10 to 12
$20.00
Pipers Cove Kids × 2
Size:
Medium - ages 6 to 8
$20.00
Pipers Cove Kids × 2
Size:
Small - ages 2 to 4
$20.00
I need to remove the 2 lines starting with Size, Small, Medium, or Large and have the dollar amount follow the quantity number so my batch file can convert it to a CSV file and so on.
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q40953616.txt"
SET "outfile=%destdir%\outfile.txt"
SET "part1="
(
    FOR /f "usebackqdelims=" %%i IN ("%filename1%") DO (
        ECHO %%i|FIND "$" >NUL
        IF ERRORLEVEL 1 (
            REM $ not found - set part1 on first such line
            IF NOT DEFINED part1 SET "part1=%%i"
        ) ELSE (
            REM $ found - see whether at start or not
            FOR /f "tokens=1*delims=$" %%a IN ("%%i") DO (
                IF "%%b"=="" (
                    REM at start - combine and output and reset part1
                    CALL ECHO %%part1%% %%i
                    SET "part1="
                ) ELSE (
                    ECHO %%i
                )
            )
        )
    )
)>"%outfile%"
GOTO :EOF
You would need to change the settings of sourcedir and destdir to suit your circumstances. I used a file named q40953616.txt containing your data for my testing.
The script produces the file defined as %outfile%.
Scan each line of the file. If the line does not contain $ then save the first such line in part1. Otherwise, tokenise the line. If there is only 1 token, then the $ is at the start of the line, so it needs to be output combined with part1. Otherwise, just regurgitate the line.
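The tokenising step can be tried on its own. The following is only a minimal sketch: the two sample strings are made up for the demonstration (with a plain x in place of ×, to sidestep code page issues), and the messages merely describe what the main script would do with such a line.

```batch
@ECHO OFF
SETLOCAL
REM Illustration only: both sample strings are invented for this demo.
FOR %%s IN ("$20.00" "Whiteout x 1 $12.50") DO (
    REM Leading delimiters are skipped, so a line starting with $ yields one token only
    FOR /f "tokens=1* delims=$" %%a IN ("%%~s") DO (
        IF "%%b"=="" (
            ECHO "%%~s" : $ at start - would be combined with part1
        ) ELSE (
            ECHO "%%~s" : $ inside the line - would be output unchanged
        )
    )
)
GOTO :EOF
```

Because %%b receives everything after the first $ that follows token 1, it is empty only when nothing precedes the $.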
Although you did not show any effort of your own, I decided to provide a solution, as the task at hand does not appear that trivial to me.
The following script (let us call it clean-up-text-file.bat) ignores only lines that begin with the words you specified. Any other lines are appended to the previous one until a $ sign is encountered, in which case a new line is started. With this method, no lines can get lost unintentionally.
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set _WORDS="Size","Small","Medium","Large"
for %%F in (%*) do (
    set "COLL=" & set "FILE=%%~F"
    for /F delims^=^ eol^= %%L in ('type "%%~F" ^& ^> "%%~F" rem/') do (
        set "LINE=%%L"
        (echo("%%L" | > nul find "$") && (
            setlocal EnableDelayedExpansion
            >> "!FILE!" echo(!COLL!!LINE!
            endlocal
            set "COLL="
        ) || (
            set "FLAG="
            for %%K in (%_WORDS%) do (
                (echo("%%L" | > nul findstr /I /R /B /C:^^^"\"%%~K\>") && (
                    set "FLAG=#"
                )
            )
            if not defined FLAG (
                setlocal EnableDelayedExpansion
                rem // The whitespace before the closing quote in the following line must be a TAB character!
                for /F "delims=" %%E in (^""!COLL!!LINE!	"^") do (
                    endlocal
                    set "COLL=%%~E"
                )
            )
        )
    )
)
endlocal
exit /B
To use the script, provide your text file(s) as (a) command line argument(s):
clean-up-text-file.bat "good.txt" "bad.txt"
Every specified file is modified directly, so take care when testing!
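Since the files are overwritten in place, one cautious way to test is to run the script on a copy first. A minimal sketch, assuming the script and bad.txt are in the current directory; bad-copy.txt is just a scratch name chosen for this example:

```batch
@echo off
rem // Work on a copy so that the original file stays untouched:
copy /Y "bad.txt" "bad-copy.txt" > nul
rem // `call` is needed so that control returns here afterwards:
call clean-up-text-file.bat "bad-copy.txt"
rem // Inspect the result before running the script on the real file:
type "bad-copy.txt"
```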