I have data that is mostly organized in a way that I can convert and import into a spreadsheet. But certain lines have carriage returns and text that my current batch file won't use.
Good Data:
Pipers Cove × 2 $25.00
Pipers Cove Petite × 2 $25.00
Pipers Cove Plus × 2 $25.00
Nordic Club × 2 $25.00
Whiteout × 1 $12.50
Bad Data:
Pipers Cove Kids × 2
Size:
Large - ages 10 to 12
$20.00
Pipers Cove Kids × 2
Size:
Medium - ages 6 to 8
$20.00
Pipers Cove Kids × 2
Size:
Small - ages 2 to 4
$20.00
I need to remove the 2 lines starting with Size, Small, Medium, or Large and have the dollar amount follow the quantity number, so my batch file can convert it to a CSV file and so on.
Although you did not show any effort of your own, I decided to provide a solution, as the task does not seem that trivial to me.
The following script (let us call it clean-up-text-file.bat) ignores only lines that begin with the words you specified. Any other lines are appended to the previous one until a $ sign is encountered, in which case a new line is started. With this method, no lines can get lost unintentionally.

To use the script, provide your text file(s) as command line argument(s):
Every specified file is modified directly, so take care when testing!
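A minimal sketch of such a script, assuming the behaviour described above. The variable names (REC, LINE, KEEP) and the .tmp temporary file are illustrative and not taken from the original script:

```batch
@echo off
rem Hypothetical sketch of the clean-up approach; not the original script.
rem Limitations: with delayed expansion enabled, exclamation marks and carets
rem in the data are lost, and FOR /F skips empty lines and lines starting with a semicolon.
setlocal EnableExtensions EnableDelayedExpansion

rem Process every existing file named on the command line, rewriting it in place.
for %%F in (%*) do if exist "%%~fF" (
    set "REC="
    > "%%~fF.tmp" (
        for /F "usebackq delims=" %%L in ("%%~fF") do (
            set "LINE=%%L"
            rem Drop lines that begin with one of the unwanted words.
            set "KEEP=1"
            if /I "!LINE:~0,4!"=="Size"   set "KEEP="
            if /I "!LINE:~0,5!"=="Small"  set "KEEP="
            if /I "!LINE:~0,6!"=="Medium" set "KEEP="
            if /I "!LINE:~0,5!"=="Large"  set "KEEP="
            if defined KEEP (
                rem Append the line to the record that is being built.
                if defined REC (set "REC=!REC! !LINE!") else set "REC=!LINE!"
                rem A dollar sign completes the record: write it, start a new one.
                if not "!LINE:$=!"=="!LINE!" (
                    echo(!REC!
                    set "REC="
                )
            )
        )
        rem Write any trailing text that was never closed by a dollar amount.
        if defined REC echo(!REC!
    )
    move /Y "%%~fF.tmp" "%%~fF" > nul
)
endlocal
```

Saved as clean-up-text-file.bat, it could be called as clean-up-text-file.bat "orders.txt", where orders.txt is a placeholder file name. Since the file is overwritten, try it on a copy first.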
You would need to change the settings of sourcedir and destdir to suit your circumstances. I used a file named q40953616.txt containing your data for my testing.

The script produces the file defined as %outfile%.
Scan each line of the file. If the line does not contain $ then save the first such line in part1.

Otherwise, tokenise the line. If there is only 1 token, then the $ is at the start of the line, so it needs to be output combined with part1.

Otherwise, just regurgitate the line.
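The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the original script: the directory values, the infile variable and the output file name outfile.txt are placeholders, and only sourcedir, destdir, outfile, part1 and q40953616.txt come from the description.

```batch
@echo off
rem Hypothetical sketch of the steps described above; not the original script.
rem Limitation: with delayed expansion enabled, exclamation marks in the data are lost.
setlocal EnableExtensions EnableDelayedExpansion

rem Placeholder locations: change these to suit your circumstances.
set "sourcedir=C:\sourcedir"
set "destdir=C:\destdir"
set "infile=%sourcedir%\q40953616.txt"
set "outfile=%destdir%\outfile.txt"

set "part1="
> "%outfile%" (
    for /F "usebackq delims=" %%A in ("%infile%") do (
        set "line=%%A"
        if "!line:$=!"=="!line!" (
            rem No dollar sign in this line: remember only the first such line.
            if not defined part1 set "part1=%%A"
        ) else (
            rem Split on the dollar sign. Leading delimiters are skipped, so a
            rem line that starts with the dollar sign yields a single token.
            for /F "tokens=1* delims=$" %%B in ("%%A") do (
                if "%%C"=="" (
                    rem Price on its own line: join it to the saved first part.
                    echo(!part1! $%%B
                    set "part1="
                ) else (
                    rem Already a complete record: output it unchanged.
                    echo(%%A
                )
            )
        )
    )
)
endlocal
```

Because only the first line without a $ is kept in part1, the Size and Small/Medium/Large lines that follow it are dropped without having to be matched by name.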