Inserting bytes in the middle of a binary file

Posted 2019-02-08 18:53

Question:

I want to add a string in the middle of an image metadata block, under a specific marker. I have to do it at the byte level, since .NET has no support for custom metadata fields.

The block is laid out as 1C 02 XX YY YY ZZ ZZ ZZ ..., where XX is the ID of the field I need to append to, YY YY is its size, and ZZ is the data.
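To illustrate that layout, here is a minimal sketch that decodes one field header from a byte array. The names `IptcHeader` and `ReadFieldLength` are hypothetical, and the sketch assumes the two size bytes are stored high byte first, which is how the answers below treat them.

```csharp
using System;

static class IptcHeader
{
    // Hypothetical helper: 'start' is the index of the 0x1C byte.
    // Returns the length stored in the YY YY bytes, read high byte first
    // (assumed big-endian).
    public static int ReadFieldLength(byte[] data, int start)
    {
        // The header occupies indices start .. start + 4.
        if (start < 0 || start + 5 > data.Length)
            throw new ArgumentOutOfRangeException(nameof(start));
        if (data[start] != 0x1C || data[start + 1] != 0x02)
            throw new FormatException("Not a 1C 02 field header.");
        return (data[start + 3] << 8) | data[start + 4];
    }
}
```

For the bytes 1C 02 78 00 03 41 42 43, `ReadFieldLength(data, 0)` returns 3, so the ZZ data is the three bytes 41 42 43.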

I imagine it should be possible to read all the image data up to this marker (1C 02 XX), increase the size bytes (YY YY), add the new data at the end of ZZ, and then append the rest of the original file. Is this correct?

How should I go about it? It needs to work as fast as possible with 4-5 MB JPEG files.

Answer 1:

In general there is no way to avoid the copying in this operation. You have to read at least the portion that needs to be moved and write it again in the updated file. Creating a new file and copying the content to it may be faster if you can parallelize the read and write operations.
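As a rough sketch of the "create a new file and copy" approach: the helper below streams the source into a new file and writes the extra bytes at the chosen position. `FileInsert` and `CopyWithInsert` are hypothetical names, the copy is sequential rather than parallel, and the 81920-byte buffer size is an arbitrary choice.

```csharp
using System;
using System.IO;

static class FileInsert
{
    // Hypothetical helper: writes a copy of sourcePath to destPath with
    // 'insert' placed at byte position 'offset'. The source file is not modified.
    public static void CopyWithInsert(string sourcePath, string destPath, long offset, byte[] insert)
    {
        using (var src = File.OpenRead(sourcePath))
        using (var dst = File.Create(destPath))
        {
            var buffer = new byte[81920];
            long remaining = offset;

            // Copy the first 'offset' bytes unchanged.
            while (remaining > 0)
            {
                int n = src.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
                if (n == 0)
                    throw new EndOfStreamException("offset is past the end of the file");
                dst.Write(buffer, 0, n);
                remaining -= n;
            }

            dst.Write(insert, 0, insert.Length); // the inserted bytes
            src.CopyTo(dst);                     // the rest of the source, unchanged
        }
    }
}
```

This only shifts bytes. Any size or offset fields that cover the insertion point still have to be patched separately.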

Note: In your particular case it may not be possible to just insert content in the middle of the file, as most file formats are not designed with such modifications in mind. Often there are offsets to portions of the file that become invalid when you shift part of the file. Specifying which file format you are trying to work with may help other people suggest better approaches.



Answer 2:

Solved the problem with this code:

            // Needs: using System.IO; using System.Text;
            byte[] data = File.ReadAllBytes(jpegFilePath);

            // Locate the field header 1C 02 XX; j ends up as the index of the 0x02 byte
            int j = -1;
            for (int i = 1; i + 3 < data.Length; i++)
            {
                if (data[i - 1] == 0x1C                 // 1C IPTC
                    && data[i] == 0x02                  // 02 IPTC
                    && data[i + 1] == (byte)fileByte)   // IPTC field_number, i.e. 0x78 = IPTC_120
                {
                    j = i;
                    break;
                }
            }
            if (j < 0)
                throw new InvalidDataException("IPTC field not found");

            int countOld = (data[j + 2] << 8) | data[j + 3]; // curr field length, high byte first
            byte[] newString = Encoding.ASCII.GetBytes(valueToAdd);
            int newfullSize = countOld + newString.Length; // sum
            int restStart = j + 4 + countOld; // first byte after the old field value
            if (newfullSize > short.MaxValue || restStart > data.Length) // size must still fit the 2 size bytes
                throw new InvalidDataException("Field length out of range");

            using (var dataNew = new MemoryStream(data.Length + newString.Length))
            {
                dataNew.Write(data, 0, j + 2);                 // data from file before this field's size bytes
                dataNew.WriteByte((byte)(newfullSize >> 8));   // changed size, high byte
                dataNew.WriteByte((byte)(newfullSize & 0xFF)); // changed size, low byte
                dataNew.Write(data, j + 4, countOld);          // old field value
                dataNew.Write(newString, 0, newString.Length); // append with new field value
                dataNew.Write(data, restStart, data.Length - restStart); // rest of the file, starting right after the old value
                // Note: length or offset fields elsewhere in the file that cover this block are not adjusted here
                File.WriteAllBytes(Path.Combine(Path.GetDirectoryName(jpegFilePath), "newfile.jpg"), dataNew.ToArray());
            }


Answer 3:

Here is an easy and quite fast solution. It moves all bytes after the given offset forward by extraBytes, so you can then write your data into the gap.

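// Needs: using System.IO;
public void ExpandFile(FileStream stream, long offset, int extraBytes)
{
  // http://stackoverflow.com/questions/3033771/file-io-with-streams-best-memory-buffer-size
  const int SIZE = 4096;
  var buffer = new byte[SIZE];
  var length = stream.Length;
  // Expand file
  stream.SetLength(length + extraBytes);
  // Move chunks starting from the end of the file, so bytes not yet moved are never overwritten
  var pos = length;
  while (pos > offset)
  {
    int to_read = pos - SIZE >= offset ? SIZE : (int)(pos - offset);
    pos -= to_read;
    stream.Position = pos;
    // Read may return fewer bytes than requested, so loop until the chunk is complete
    int filled = 0;
    while (filled < to_read)
    {
      int n = stream.Read(buffer, filled, to_read - filled);
      if (n == 0) throw new EndOfStreamException();
      filled += n;
    }
    stream.Position = pos + extraBytes;
    stream.Write(buffer, 0, to_read);
  }
}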

This needs to be checked, though...
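To show how the freed gap would then be used, here is a self-contained sketch that shifts the tail and writes the new bytes in one call. `InPlaceInsert` and `InsertBytes` are hypothetical names. It repeats the same back-to-front shifting idea rather than calling the method above, and it takes a plain `Stream` (assumed to be seekable and writable) so it can also be tried on a `MemoryStream`.

```csharp
using System;
using System.IO;

static class InPlaceInsert
{
    // Hypothetical helper: inserts 'data' at 'offset' in a seekable, writable stream.
    public static void InsertBytes(Stream stream, long offset, byte[] data)
    {
        long oldLength = stream.Length;
        if (offset < 0 || offset > oldLength)
            throw new ArgumentOutOfRangeException(nameof(offset));
        stream.SetLength(oldLength + data.Length);

        // Shift the tail towards the end, last chunk first, so that bytes
        // which have not been read yet are never overwritten.
        var buffer = new byte[4096];
        long pos = oldLength;
        while (pos > offset)
        {
            int chunk = (int)Math.Min(buffer.Length, pos - offset);
            pos -= chunk;
            stream.Position = pos;
            int filled = 0;
            while (filled < chunk) // Read may return fewer bytes than requested
            {
                int n = stream.Read(buffer, filled, chunk - filled);
                if (n == 0) throw new EndOfStreamException();
                filled += n;
            }
            stream.Position = pos + data.Length;
            stream.Write(buffer, 0, chunk);
        }

        // The gap at 'offset' is now free; write the new bytes into it.
        stream.Position = offset;
        stream.Write(data, 0, data.Length);
    }
}
```

For the metadata case in the question, the size bytes of the field would still have to be rewritten separately after the insert.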