In the code below I read a large file that has a special structure - among other things, two blocks that need to be processed at the same time. Instead of seeking back and forth in the file, I load these two blocks wrapped in memoryview calls:
import struct

with open(abs_path, 'rb') as bsa_file:
    # ...
    # load the file record block to parse later
    file_records_block = memoryview(bsa_file.read(file_records_block_size))
    # load the file names block
    file_names_block = memoryview(bsa_file.read(total_file_name_length))
# the file is closed here
file_records_index = names_record_index = 0
for folder_record in folder_records:
    name_size = struct.unpack_from('B', file_records_block, file_records_index)[0]
    # discard null terminator below
    folder_path = struct.unpack_from('%ds' % (name_size - 1),
                                     file_records_block, file_records_index + 1)[0]
    file_records_index += name_size + 1
    for __ in xrange(folder_record.files_count):
        # scan for the null terminator of the current file name
        file_name_len = 0
        for b in file_names_block[names_record_index:]:
            if b != '\x00': file_name_len += 1
            else: break
        file_name = unicode(struct.unpack_from('%ds' % file_name_len,
                                               file_names_block, names_record_index)[0])
        names_record_index += file_name_len + 1
The file is correctly parsed, but as this is my first use of the memoryview interface I am not sure I am using it right. As seen above, the file_names_block consists of null-terminated C strings packed back to back.
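For concreteness, the names block looks something like this (the names here are invented for illustration, not actual archive contents):

    # hypothetical contents of file_names_block: the file names stored
    # back to back as null-terminated C strings, in folder order
    sample_names_block = memoryview('chair01.nif\x00table01.nif\x00rug01.nif\x00')
    # the whole block is exactly total_file_name_length bytes, terminators included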
- Is my trick of slicing with file_names_block[names_record_index:] using the memoryview magic, or do I create some O(n^2) slices behind the scenes? Would I need to use islice here? (I include the small check I tried right after this list.)
- As seen, I just look for the null byte manually and then proceed to unpack_from. But I read in "How to split a byte string into separate bytes in python" that I can use cast() (docs?) on the memory view. Is there any way to use that (or another trick) to split the view into bytes? Could I just call split('\x00')? Would this preserve the memory efficiency? (A sketch of the split variant I have in mind is at the end of the post.)
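Here is the small check I ran to convince myself about the slicing (a minimal sketch on a throwaway bytearray, not the real archive data); I am not sure it actually proves that slicing avoids a copy, so please correct me if the reasoning is flawed:

    # slice a memoryview over a mutable buffer, then mutate the buffer:
    # if the slice reflects the change, it should be a view, not a copy
    data = bytearray('abc\x00def\x00')
    view = memoryview(data)
    tail = view[4:]                # same kind of slice as file_names_block[names_record_index:]
    print tail.tobytes()           # 'def\x00'
    data[4] = ord('D')
    print tail.tobytes()           # 'Def\x00' - the change shows through the slice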
I would appreciate insight on the one right way to do this (in Python 2).
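For the split('\x00') part of the second question, this is roughly what I have in mind (a sketch, not what my code above does); tobytes() obviously copies the whole names block once, and I am not sure whether that defeats the point of using memoryview in the first place:

    # copy the names block out once and split it on the null terminators;
    # the block ends in '\x00', so drop the trailing empty string from split()
    all_names = file_names_block.tobytes().split('\x00')[:-1]
    # the inner loop would then index into this list instead of scanning byte by byte:
    # file_name = unicode(all_names[name_index]); name_index += 1   # name_index counts names, not bytes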