I have a file that uses \x01 as the line terminator. That is, the line terminator is NOT a newline but the byte value 001, whose ASCII representation is ^A.
I want to split the file into pieces of 10 MB each. Here is what I came up with:
size = 10 * 1024 * 1024  # 10 MB
i = 0
with open("in-file", "rb") as ifile:
    data = ifile.read(size)
    while data:
        # write each block of `size` bytes to its own numbered file
        with open("output%d.txt" % i, "wb") as ofile:
            ofile.write(data)
        data = ifile.read(size)
        i += 1
However, this results in files that are broken at arbitrary places.
I want each file to end only at a byte with the value 001, with the next read resuming from the following byte.
If it's just a one-byte terminator, you can read the file one byte at a time until you hit it.
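A minimal sketch of reading a single record that way. The name `read_record` and its `terminator` parameter are made up for illustration, and it assumes the file was opened in binary mode. The terminator is kept on the returned record so the pieces can later be concatenated back into the original file.

```python
def read_record(f, terminator=b"\x01"):
    """Read bytes up to and including the next terminator, or up to EOF."""
    record = bytearray()
    while True:
        byte = f.read(1)
        if not byte:            # end of file: return whatever was collected
            break
        record += byte
        if byte == terminator:  # stop right after the terminator
            break
    return bytes(record)
```

An empty return value means the end of the file was reached. Reading one byte per call is simple but slow on large files.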
Then make a helper function that reads all the lines in the file, one terminated record at a time.
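One possible shape for such a helper, as a sketch only. `iter_records` and `block_size` are hypothetical names, and this version reads the file in blocks and splits on the terminator instead of going byte by byte. It assumes a one-byte terminator and a file opened in binary mode.

```python
def iter_records(f, terminator=b"\x01", block_size=64 * 1024):
    """Yield each record from f, including its trailing terminator."""
    pending = b""  # tail of the previous block that has no terminator yet
    while True:
        block = f.read(block_size)
        if not block:
            break
        pending += block
        # Everything except the last piece is a complete record.
        *complete, pending = pending.split(terminator)
        for record in complete:
            yield record + terminator
    if pending:
        # Trailing data without a terminator at the end of the file.
        yield pending
```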
Then make a function that groups those lines into chunks of up to the size you want.
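The chunking step could look roughly like this. `make_chunks` and `max_size` are illustrative names, and `records` is assumed to be any iterable of `bytes` objects that each already end with the terminator. A chunk is closed before it would exceed `max_size`; a single record larger than `max_size` still goes out whole, as its own chunk.

```python
def make_chunks(records, max_size):
    """Group records into chunks of at most max_size bytes where possible."""
    chunk = []
    chunk_size = 0
    for record in records:
        # Close the current chunk if adding this record would overflow it.
        if chunk and chunk_size + len(record) > max_size:
            yield b"".join(chunk)
            chunk, chunk_size = [], 0
        chunk.append(record)
        chunk_size += len(record)
    if chunk:
        yield b"".join(chunk)
```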
Then just write each chunk to its own output file.
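To show the whole thing end to end, here is a self-contained sketch that folds the steps into a single function. `split_file` and `out_template` are hypothetical names. It assumes a one-byte terminator and a regular file on disk, so that a short read means end of file. It fills a buffer of up to `max_size` bytes, cuts after the last terminator in the buffer, and carries the leftover bytes into the next piece.

```python
def split_file(in_path, out_template, max_size, terminator=b"\x01"):
    """Split in_path into pieces that each end at a terminator.

    Pieces are at most max_size bytes, except that a single record
    longer than max_size is written whole. Returns the output paths.
    """
    paths = []
    carry = b""  # bytes already read that belong to the next piece
    with open(in_path, "rb") as f_in:
        while True:
            buf = carry + f_in.read(max_size - len(carry))
            if not buf:
                break
            if len(buf) < max_size:
                cut = len(buf)  # end of input: flush the remainder
            else:
                cut = buf.rfind(terminator) + 1
                while cut == 0:
                    # No terminator yet: the record is longer than max_size.
                    more = f_in.read(max_size)
                    if not more:
                        cut = len(buf)
                        break
                    start = len(buf)
                    buf += more
                    cut = buf.find(terminator, start) + 1
            path = out_template % len(paths)
            with open(path, "wb") as f_out:
                f_out.write(buf[:cut])
            paths.append(path)
            carry = buf[cut:]
    return paths
```

With the names from the question, a call would be something like `split_file("in-file", "output%d.txt", 10 * 1024 * 1024)`. Each piece is held in memory while it is written, which should be fine at 10 MB.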
There might be some way to do this with libraries (or even an awesome built-in way), but I'm not aware of any offhand.