Still new to cmd batch scripting...
I've got a batch file that removes tab characters from a file. It usually works fine with this code:
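setlocal DisableDelayedExpansion
rem findstr /n prefixes every line with "N:" so that empty lines survive for /f.
for /f "delims=" %%A in ('"findstr /n ^^ %FILENAME%"') do (
    set "line=%%A"
    setlocal EnableDelayedExpansion
    rem Strip the "N:" prefix added by findstr.
    set "line=!line:*:=!"
    if defined line (
        rem The character between ":" and "=" must be a literal TAB.
        set "line=!line:	=!"
        (echo(!line!)>>%TEMPFILE%
    ) else (
        rem Keep empty lines in the output file as well.
        echo(>>%TEMPFILE%
    )
    endlocal
)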
But recently it didn't simply delete the tab characters, it deleted the whole line! I figured out that it must have something to do with the unusual length of the line (>9500 characters). If I split the line manually, it works as usual.
Right now I'm looking for a way to either
- make the code above work for any line length or
- insert a check for lines that are too long to process, so the batch can stop the process and display an appropriate message.
`cmd.exe` can process lines of up to 8k characters. I also need to process longer lines, and after some research I found that the easiest way is to use an external program. I use `sed` from UnxUtils. This `sed` command should remove all tab characters:

The problem with long lines in Batch files is that environment variables can only store a maximum of 8 KB. However, it is possible to process longer lines in smaller chunks, because when the
`set /P` command reads a long line, it reads up to 1022 characters, and the remaining characters are read by the next `set /P` command. The Batch file below uses this method (combined with `findstr /O "^"`, which makes it possible to know the length of the lines) to copy a file with lines of unlimited size:

This method requires that the input lines end in CR+LF characters (the Windows standard), and it has the problems inherent to `set /P`: it may eliminate control characters from the end of the line or from the end of each chunk of 1022 characters, or spaces from the beginning of the line/chunk; further details at this post. You may modify this program by changing `set /P "=!chunk!" < NUL` to the corresponding `set /P "=!chunk:<TAB>=!" < NUL` (with `<TAB>` standing for a literal tab character) in order to eliminate tab characters.

VBS theoretical line length is 2,000,000,000 bytes (or 1 x 2^30 characters). You'll never get anywhere near that (the actual limit is the largest block of free contiguous memory, which will be millions of characters).
How to use.
Replace
Finds and replaces text using regular expressions.
Also used to extract substrings from a file.
Ampersands and brackets in the expression must be escaped with the caret. Do not escape carets. Use the hexadecimal code \x22 for quotes.
SearchOptions
Expression
https://msdn.microsoft.com/en-us/library/ae5bf541(v%3Dvs.90).aspx
Replace
The replacement text. Use $1, $2, $..., $n to specify submatches in the replacement string.
Example
This searches for text within square brackets and replaces the line with cat followed by the text within brackets
This searches for any text and prints from the 11th character to the end of the line.
This searches a CSV file and prints the second and fourth field
Filter reads and writes standard in and standard out only. These are only available in a command prompt.
Download full source here https://skydrive.live.com/redir?resid=E2F0CE17A268A4FA!121
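
If the download is not an option, the same idea (a VBScript regular-expression replace that reads standard in and writes standard out) can be sketched in a few lines. This is only a minimal sketch, not the source of the Filter tool above: the script name `striptabs.cmd`, the temporary file name, and the two-argument usage are made up for illustration, and it assumes plain ANSI text. Since VBScript strings are not bound by the 8 KB limit of `cmd.exe` variables, long lines should pass through intact.

```batch
@echo off
rem striptabs.cmd (hypothetical name): remove every tab character from a text file.
rem Usage: striptabs.cmd input-file output-file
setlocal DisableDelayedExpansion

if "%~2"=="" (
    echo Usage: %~nx0 input-file output-file 1>&2
    exit /b 1
)

rem Write a small VBScript filter to a temporary file.
rem Closing parentheses must be escaped with ^ inside this block.
set "VBS=%TEMP%\striptabs_%RANDOM%.vbs"
> "%VBS%" (
    echo Set re = New RegExp
    echo re.Global = True
    echo re.Pattern = "\t"
    echo Do Until WScript.StdIn.AtEndOfStream
    echo     WScript.StdOut.WriteLine re.Replace^(WScript.StdIn.ReadLine, ""^)
    echo Loop
)

rem cscript is needed here: WScript.StdIn and WScript.StdOut are not usable under wscript.
cscript //nologo "%VBS%" < "%~1" > "%~2"
set "RC=%ERRORLEVEL%"

del "%VBS%"
exit /b %RC%
```

Call it as `striptabs.cmd input.txt output.txt`, using two different file names, because redirecting to the input file would truncate it before it is read. Every output line ends in CR+LF, so a final line without a line break gains one.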