Why does TStringStream.Bytes differ from what one

Posted 2019-08-10 07:32

Question:

With TStringStream, the bytes obtained through its Bytes property differ from the bytes read using TStream.Read. As shown below:

  1. The bytes read using TStream.Read represent the correct data.
  2. The bytes obtained through the Bytes property contain more data. (The last byte of the correct bytes differs from the byte at the same index in the wrong bytes.)

Could you comment on the possible reason? Thank you very much for your help!

PS: Delphi XE, Windows 7. (It seems TStringStream back in Delphi 7 doesn't have LoadFromFile or SaveToFile.)

PS: The sample files can be downloaded from SkyDrive: REF_EncodedSample & REF_DecodedSample. (Zlib-compressed image file.)

procedure CompareBytes_2;
var
  ss_1: TStringStream;
  ss_2: TStringStream;
  sbytes_Read: TBytes;
  sbytes_Property: TBytes;
  len_sbytes_Read: Integer;
  len_sbytes_Property: Integer;
  filename: string;
begin
  filename := 'REF_EncodedSample';  // textual data
//  filename := 'REF_DecodedSample';  // non-textual data

  ss_1 := TStringStream.Create;
  ss_1.LoadFromFile(filename);
  ss_2 := TStringStream.Create;
  ss_2.LoadFromFile(filename);

  ss_1.SaveToFile(filename+ '_CopyByStringStream_1');
  ss_2.SaveToFile(filename+ '_CopyByStringStream_2');

  len_sbytes_Read := ss_1.Size;
  SetLength(sbytes_Read, len_sbytes_Read);
  ss_1.Read(sbytes_Read[0], len_sbytes_Read);

  sbytes_Property := ss_2.Bytes;

  ShowMessage(
    BoolToStr(
      Length(sbytes_Read) = Length(sbytes_Property),
      True));

  ShowMessage(
    BoolToStr(
      sbytes_Read[len_sbytes_Read - 1] = sbytes_Property[len_sbytes_Read - 1],
      True));

  ss_1.Free;
  ss_2.Free;
end;

Answer 1:

The string stream documentation states:

The Bytes property returns the buffer in which the data is stored. Use the Size property to find the actual amount of data in the buffer.

Presumably the buffer has been allocated with more space than the data actually needs. Only the first Size bytes of the buffer contain valid content.


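Also, the call to ss_1.Read has no bearing on the length comparison, since Length(sbytes_Read) is fixed by the call to SetLength and does not change afterwards. And when reading from a stream you should generally use ReadBuffer rather than Read, because ReadBuffer raises an exception if it cannot read the requested number of bytes. Likewise, prefer WriteBuffer to Write.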
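As a minimal sketch of the ReadBuffer advice, assuming a Delphi 2009+ console program (ReadAllBytes is a hypothetical helper name, not part of the RTL):

```pascal
program ReadBufferSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: reads the whole stream into a TBytes array.
function ReadAllBytes(Stream: TStream): TBytes;
var
  n: Integer;
begin
  n := Integer(Stream.Size);
  SetLength(Result, n);
  Stream.Position := 0;  // start from the beginning, whatever happened before
  if n > 0 then
    Stream.ReadBuffer(Result[0], n);  // raises EReadError on a short read
end;

var
  ss: TStringStream;
  data: TBytes;
begin
  ss := TStringStream.Create('hello', TEncoding.ASCII);
  try
    data := ReadAllBytes(ss);
    Writeln('Bytes read: ', Length(data));
  finally
    ss.Free;
  end;
end.
```

The length of the result is always taken from Size, so it never includes any unused tail of the stream's internal buffer.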



Answer 2:

It is impossible for Read() to return different bytes than what is in the Bytes property, as Read() reads from the same TBytes object in memory that the Bytes property uses.

In D2009+, TStringStream was changed to derive from TBytesStream, which in turn derives from TMemoryStream. That is why TStringStream has LoadFromFile() and SaveToFile() methods available now. In earlier versions, TStringStream derived directly from TStream instead. TStringStream now stores encoded bytes, not a String like it did in earlier versions. The TStringStream constructor takes a String as input and encodes it to a TBytes using the specified TEncoding (where TEncoding.Default is used if you do not specify a TEncoding). The DataString property getter decodes that same TBytes back to a String using the TEncoding that was specified in the constructor. The LoadFromFile() and SaveToFile() methods load/save the TBytes as-is without any encoding/decoding performed at all.
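The encode-on-construction and decode-on-DataString behavior described above can be illustrated with a short sketch, assuming Delphi 2009+ (the sample text and variable names are only for illustration):

```pascal
program EncodingSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  ss: TStringStream;
begin
  // #$00E9 is 'é'; it is written as a character code so the example
  // does not depend on the encoding of the source file.
  ss := TStringStream.Create('h'#$00E9'llo', TEncoding.UTF8);
  try
    // Size counts the encoded bytes. 'é' takes two bytes in UTF-8,
    // so Size is larger than the number of characters.
    Writeln('Encoded size in bytes: ', ss.Size);
    // DataString decodes the stored bytes with the same TEncoding.
    Writeln('Characters after decoding: ', Length(ss.DataString));
  finally
    ss.Free;
  end;
end.
```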

So, if you call LoadFromFile(), it will store the file data as-is in the stream. If the encoding of that data does not match the TEncoding that is passed to the constructor, the DataString property is going to return garbage, but the Bytes property and the Read() method will work just fine.

Your problem is that you are not taking into account that the Bytes property returns the entire memory block that TMemoryStream allocates for its data storage. TMemoryStream allocates memory in chunks based on deltas, so it is possible for it to allocate more memory than it actually needs. That is why it has separate Capacity and Size properties. The Capacity property indicates the total size of the allocated memory, whereas the Size property indicates how many bytes have been stored in the allocated memory. When you call LoadFromFile(), the Bytes property will almost always be larger than the size of the file that was loaded.
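A rough way to observe the gap between allocated and used memory is to compare Length(Bytes) with Size. This sketch assumes Delphi 2009+; SlackBytes is a hypothetical helper, and the exact numbers printed depend on the RTL's allocation delta, so none are claimed here:

```pascal
program CapacitySketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: number of allocated but unused bytes in the buffer.
function SlackBytes(Stream: TBytesStream): Integer;
begin
  Result := Length(Stream.Bytes) - Integer(Stream.Size);
end;

var
  ss: TStringStream;
begin
  ss := TStringStream.Create;
  try
    ss.WriteString('abc');  // forces the stream to grow its buffer
    Writeln('Size          : ', ss.Size);
    Writeln('Length(Bytes) : ', Length(ss.Bytes));
    Writeln('Unused bytes  : ', SlackBytes(ss));
  finally
    ss.Free;
  end;
end.
```

Length(Bytes) is used instead of Capacity because Capacity is declared protected in TMemoryStream and is not directly accessible from outside the class.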

So basically, your code is comparing the proverbial "apples" to "oranges", which is why you are getting bad results. You need to fix your code accordingly, to account for the difference between the Capacity and Size properties. The contents of the two TBytes variables will be identical up to Size bytes. You are comparing more than that, which is why your code is failing.
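One possible way to make the comparison fair is to limit it to the first Size bytes of the Bytes buffer. The following is a sketch under the assumption of Delphi 2009+; ReadMatchesBytes is a hypothetical helper name, and an in-memory string stands in for the sample files:

```pascal
program CompareUpToSize;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Hypothetical helper: checks that the data obtained via ReadBuffer equals
// the first Size bytes of the buffer exposed by the Bytes property.
function ReadMatchesBytes(Stream: TStringStream): Boolean;
var
  viaRead, viaProperty: TBytes;
  n: Integer;
begin
  n := Integer(Stream.Size);
  SetLength(viaRead, n);
  Stream.Position := 0;
  if n > 0 then
    Stream.ReadBuffer(viaRead[0], n);

  viaProperty := Stream.Bytes;  // may be longer than n
  Result := (Length(viaProperty) >= n) and
    ((n = 0) or CompareMem(@viaRead[0], @viaProperty[0], n));
end;

var
  ss: TStringStream;
begin
  ss := TStringStream.Create;
  try
    ss.WriteString('sample data');
    Writeln(BoolToStr(ReadMatchesBytes(ss), True));
  finally
    ss.Free;
  end;
end.
```

If a TBytes of exactly the right length is needed, Copy(Stream.Bytes, 0, Stream.Size) produces a trimmed copy.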