I have this resource file which I need to process, which packs a set of files.
First, the resource file lists all the files contained within, plus some other data, such as in this struct:
struct FileEntry{
byte Value1;
char Filename[12];
byte Value2;
byte FileOffset[3];
float whatever;
}
So I would need to read blocks exactly this size.
I am using the Read function from FileStream, but how can I specify the size of the struct? I used:
int sizeToRead = Marshal.SizeOf(typeof(Header));
and then pass this value to Read, but then I can only read a set of byte[] which I do not know how to convert into the specified values (well I do know how to get the single byte values... but not the rest of them).
Also I need to specify an unsafe context which I don't know whether it's correct or not...
It seems to me that reading byte streams is tougher than I thought in .NET :)
Thanks!
If you can use unsafe code:
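A minimal sketch of what such an unsafe version could look like. The names `UnsafeEntryReader` and `FromBytes` are made up for illustration, and the sketch assumes the record is stored without padding (21 bytes), with one byte per filename character and the float in the machine's native byte order. It has to be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// ASSUMPTION: Pack = 1 because the on-disk record is taken to have no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public unsafe struct FileEntry
{
    public byte Value1;
    public fixed byte Filename[12];   // 12 * sizeof(byte) = 12 bytes
    public byte Value2;
    public fixed byte FileOffset[3];  // 3 * sizeof(byte) = 3 bytes
    public float Whatever;
}

public static class UnsafeEntryReader
{
    // Reinterprets the bytes starting at 'offset' as one FileEntry.
    public static unsafe FileEntry FromBytes(byte[] buffer, int offset)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || buffer.Length - offset < sizeof(FileEntry))
            throw new ArgumentException("Not enough data for a FileEntry.");

        // The buffer is pinned only for the duration of the copy.
        fixed (byte* p = &buffer[offset])
        {
            return *(FileEntry*)p;
        }
    }
}
```

Read `sizeof(FileEntry)` bytes from the stream into a buffer first, then call `UnsafeEntryReader.FromBytes(buffer, 0)`.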
The fixed keyword embeds the array directly in the struct. Pinning memory can get in the garbage collector's way if you are constantly pinning buffers and never letting them go. Keep in mind that the sizes are n * sizeof(T): a fixed char Filename[12] takes 24 bytes (each char is 2 bytes, UTF-16), while a fixed byte FileOffset[3] takes 3 bytes. This matters if the data on disk is not Unicode. I would recommend changing the filename to a byte[] and converting the struct to a usable class where you can convert the string.
If you can't use unsafe, you can do the whole BinaryReader approach:
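A rough sketch of the field-by-field approach. The type names `FileEntryData` and `EntryReader` are hypothetical, and the byte order of the 3-byte offset is an assumption (least significant byte first); adjust it if the format stores it the other way round.

```csharp
using System;
using System.IO;

public struct FileEntryData
{
    public byte Value1;
    public byte[] Filename;   // 12 raw bytes, to be decoded separately
    public byte Value2;
    public int FileOffset;    // 3 bytes on disk, widened to an int
    public float Whatever;
}

public static class EntryReader
{
    // Reads one 21-byte record, field by field, in the order it was written.
    public static FileEntryData Read(BinaryReader reader)
    {
        FileEntryData entry = new FileEntryData();
        entry.Value1 = reader.ReadByte();
        entry.Filename = reader.ReadBytes(12);
        entry.Value2 = reader.ReadByte();
        byte[] off = reader.ReadBytes(3);

        // ReadBytes returns fewer bytes than requested at end of stream.
        if (entry.Filename.Length < 12 || off.Length < 3)
            throw new EndOfStreamException();

        // ASSUMPTION: little-endian 24-bit offset.
        entry.FileOffset = off[0] | (off[1] << 8) | (off[2] << 16);
        entry.Whatever = reader.ReadSingle();   // BinaryReader reads little-endian
        return entry;
    }
}
```

Usage would be along the lines of `EntryReader.Read(new BinaryReader(fileStream))`, called once per entry.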
The unsafe way is nearly instant and far faster, especially when you're converting a lot of structs at once. The question is whether you want to use unsafe code. My recommendation is to use the unsafe method only if you absolutely need the performance boost.
Based on this article (only I have made it generic), this is how to marshal the data directly into the struct. Very useful for longer data types.
Example Usage:
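The generic helper described above might look roughly like the following sketch, shown together with an example call. `StructMarshal`, `RawFileEntry` and `Example.ReadFirstEntry` are hypothetical names, and `Pack = 1` assumes the on-disk record has no padding. It relies on `GCHandle` and `Marshal.PtrToStructure`, so no unsafe context is needed.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct RawFileEntry
{
    public byte Value1;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 12)]
    public byte[] Filename;
    public byte Value2;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 3)]
    public byte[] FileOffset;
    public float Whatever;
}

public static class StructMarshal
{
    // Copies the start of 'bytes' into a new T using the marshaller.
    public static T FromBytes<T>(byte[] bytes) where T : struct
    {
        if (bytes.Length < Marshal.SizeOf(typeof(T)))
            throw new ArgumentException("Buffer is smaller than the struct.");

        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            handle.Free();   // always unpin, even if marshalling throws
        }
    }

    // Reads exactly Marshal.SizeOf(T) bytes from the stream and converts them.
    public static T Read<T>(Stream stream) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] buffer = new byte[size];
        int total = 0;
        while (total < size)
        {
            int n = stream.Read(buffer, total, size - total);
            if (n == 0) throw new EndOfStreamException();
            total += n;
        }
        return FromBytes<T>(buffer);
    }
}

public static class Example
{
    // Example usage: read the first entry of a resource file.
    public static RawFileEntry ReadFirstEntry(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            return StructMarshal.Read<RawFileEntry>(fs);
        }
    }
}
```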
Not a full answer (it's been covered I think), but a specific note on the filename:
The Char type is not a one-byte type in C#: .NET characters are Unicode (UTF-16 code units, two bytes each), meaning they support character values far beyond 255, so interpreting your filename data as a Char[] array will give problems. So the first step is definitely to read that as Byte[12], not Char[12].
A straight conversion from byte array to char array is also not advised, though, since in binary indices like this, filenames that are shorter than the allowed 12 characters will probably be padded with 0 bytes, so a straight conversion will result in a string that's always 12 characters long and might end in these zero characters.
However, simply trimming these zeroes off is not advised, since reading systems for such data usually simply read up to the first encountered zero, and the data behind that in the array might actually contain garbage if the writing system doesn't bother to specifically clean its buffer with zeroes before putting the string into it. It's something a lot of programs don't bother doing, since they assume the reading system will only interpret the string up to the first zero anyway.
So, assuming this is indeed such a typical zero-terminated (C-style) string, saved in a one-byte-per-character text encoding (like ASCII or Win-1252), the second step is to cut off the string at the first zero. You can easily do this with Linq's TakeWhile function. Then the third and final step is to convert the resulting byte array to a string using whichever one-byte-per-character text encoding it was written with.
As I said, the encoding will probably be something like pure ASCII, which can be accessed from Encoding.ASCII, or Windows-1252, the standard US / western Europe Windows text encoding, which you can retrieve with Encoding.GetEncoding("Windows-1252").
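The steps above (take the raw bytes, cut at the first zero, decode) can be sketched as follows. `FixedStrings.Decode` is a hypothetical helper name, and which encoding to pass in is an assumption about the file format.

```csharp
using System;
using System.Linq;
using System.Text;

public static class FixedStrings
{
    // Decodes a zero-terminated string stored in a fixed-size byte field.
    public static string Decode(byte[] raw, Encoding encoding)
    {
        // Keep everything up to (not including) the first zero byte;
        // whatever follows may be uninitialised garbage.
        byte[] text = raw.TakeWhile(b => b != 0).ToArray();
        return encoding.GetString(text);
    }
}
```

For example, `FixedStrings.Decode(filenameBytes, Encoding.ASCII)`. Note that on .NET Core and .NET 5 or later, `Encoding.GetEncoding("Windows-1252")` only works after registering `CodePagesEncodingProvider.Instance` via `Encoding.RegisterProvider`.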
Wrapping your FileStream with a BinaryReader will give you dedicated Read*() methods for primitive types: http://msdn.microsoft.com/en-us/library/system.io.binaryreader.aspx
Off the top of my head, you could probably mark your struct with [StructLayout(LayoutKind.Sequential)] (to ensure proper representation in memory) and use a pointer in an unsafe block to actually fill the struct C-style. Going unsafe is not recommended if you don't really need it (interop, heavy operations like image processing and so on), however.
Assuming this is C#, I wouldn't create a struct as a FileEntry type. I would replace the char array with a string and use a BinaryReader - http://msdn.microsoft.com/en-us/library/system.io.binaryreader.aspx to read individual fields. You must read the data in the same order as it was written.
Something like:
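A sketch of that idea, with a class that stores the filename as a string. The class name `PackedFile` is hypothetical, and the ASCII encoding and the little-endian 3-byte offset are assumptions about the file format.

```csharp
using System;
using System.IO;
using System.Text;

public class PackedFile
{
    public byte Value1 { get; private set; }
    public string Filename { get; private set; }
    public byte Value2 { get; private set; }
    public int Offset { get; private set; }
    public float Whatever { get; private set; }

    // Fields are read in exactly the order they appear in the file.
    public static PackedFile ReadFrom(BinaryReader reader)
    {
        PackedFile file = new PackedFile();
        file.Value1 = reader.ReadByte();

        byte[] name = reader.ReadBytes(12);
        int end = Array.IndexOf(name, (byte)0);   // stop at the first zero byte
        if (end < 0) end = name.Length;
        file.Filename = Encoding.ASCII.GetString(name, 0, end);   // ASSUMPTION: ASCII

        file.Value2 = reader.ReadByte();

        byte[] off = reader.ReadBytes(3);
        if (off.Length < 3) throw new EndOfStreamException();
        file.Offset = off[0] | (off[1] << 8) | (off[2] << 16);    // ASSUMPTION: little-endian

        file.Whatever = reader.ReadSingle();
        return file;
    }
}
```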
If you insist on having a struct, you should make your struct immutable and create a constructor with arguments for each of your fields.
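As an illustration only, an immutable struct along those lines could look like this; the name `FileEntryInfo` and the choice of field types (a string filename and an int offset) are assumptions.

```csharp
using System;

public struct FileEntryInfo
{
    // readonly fields can only be assigned in the constructor.
    public readonly byte Value1;
    public readonly string Filename;
    public readonly byte Value2;
    public readonly int Offset;
    public readonly float Whatever;

    public FileEntryInfo(byte value1, string filename, byte value2, int offset, float whatever)
    {
        Value1 = value1;
        Filename = filename;
        Value2 = value2;
        Offset = offset;
        Whatever = whatever;
    }
}
```

The reading code would then collect all five values first and construct the struct once at the end.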