Embedding binary data in a script efficiently

Posted 2019-02-24 23:10

I have seen some huge installation files for Unix-like systems (install.sh for Matlab or Mathematica, for example). They must embed quite a lot of binary data, such as icons, sounds, and graphics, in the script itself. I am wondering how that is done, since it could be useful for simplifying the file structure of a project.

I am particularly interested in doing this with Python and/or Bash.

Existing methods that I know of in Python:

  1. Just use a byte string: x = b'\x23\xa3\xef' ... This is terribly inefficient: each byte can take up to four characters (\xNN), so a 100 KB WAV file takes roughly half a megabyte.
  2. base64: better than option 1, but it still enlarges the data by a factor of 4/3.
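
To put numbers on the two options above, here is a rough sketch that counts how many characters each encoding needs. The helper names and the 102,400-byte sample are made up for illustration, and the byte-string figure is the worst case where every byte is written as a \xNN escape.

```python
import base64


def escaped_literal_size(data: bytes) -> int:
    """Characters needed if every byte is written as a \\xNN escape (worst case)."""
    return 4 * len(data)


def base64_size(data: bytes) -> int:
    """Characters needed for the standard base64 encoding, padding included."""
    return len(base64.b64encode(data))


# 102,400 bytes standing in for a 100 KB binary file (illustrative data only).
sample = bytes(range(256)) * 400

print(escaped_literal_size(sample))  # 409600
print(base64_size(sample))           # 136536
```

So the escaped literal costs 4 characters per byte, while base64 costs 4 characters per 3 bytes.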

Are there other (better) ways to do this?

2 answers
乱世女痞
#2 · 2019-02-24 23:48

You can use base64 + compression (using bz2 for instance) if that suits your data (e.g., if you're not embedding already compressed data).

For instance, to create your data (say your data consist of 100 null bytes followed by 200 bytes with value 0x01):

>>> import base64
>>> import bz2
>>> base64.b64encode(bz2.compress(b'\x00' * 100 + b'\x01' * 200)).decode('ascii')
'QlpoOTFBWSZTWcl9Q1UAAABBBGAAQAAEACAAIZpoM00SrccXckU4UJDJfUNV'

And to use it (in your script) to write the data to a file:

import base64
import bz2

data = 'QlpoOTFBWSZTWcl9Q1UAAABBBGAAQAAEACAAIZpoM00SrccXckU4UJDJfUNV'
# Open in binary mode: the decompressed payload is raw bytes, not text.
with open('/tmp/testfile', 'wb') as fdesc:
    fdesc.write(bz2.decompress(base64.b64decode(data)))
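
Since compression only pays off when the data is not already compressed, one possible refinement is to compress first and keep the result only if it is actually smaller. The sketch below assumes Python 3; encode_payload and decode_payload are hypothetical helper names, not part of any library.

```python
import base64
import bz2


def encode_payload(raw: bytes):
    """Return (base64 text, used_bz2); bz2 is kept only if it shrinks the data."""
    packed = bz2.compress(raw)
    used_bz2 = len(packed) < len(raw)
    body = packed if used_bz2 else raw
    return base64.b64encode(body).decode('ascii'), used_bz2


def decode_payload(text: str, used_bz2: bool) -> bytes:
    """Reverse encode_payload: base64-decode, then decompress if needed."""
    body = base64.b64decode(text)
    return bz2.decompress(body) if used_bz2 else body


raw = b'\x00' * 100 + b'\x01' * 200
text, used_bz2 = encode_payload(raw)
assert decode_payload(text, used_bz2) == raw
```

The flag has to be stored in the script next to the encoded string so that the decoding side knows whether to decompress.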
等我变得足够好
#3 · 2019-02-24 23:48

Here's a quick and dirty way. Create the following script called myInstaller:

#!/bin/bash

dd if="$0" of=payload bs=1 skip=54

exit

Then append your binary to the script, and make it executable:

cat myBinary >> myInstaller
chmod +x myInstaller

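When you run the script, dd skips the first 54 bytes of the script file and copies the rest, that is, the appended binary, to the file named by of= (here, payload). The payload could be a tar file or whatever, so you can do additional processing (unarchiving, setting execute permissions, etc.) after the dd command and before exit. The value of skip must equal the exact length in bytes of the script before the binary data starts; for the script above, counting the newline after exit, that is 54. Adjust it whenever you edit the script.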
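
Counting the bytes by hand is error-prone, so the skip value could instead be computed when the installer is built. The following is only a sketch in Python: STUB_TEMPLATE copies the script above, and render_stub and build_installer are hypothetical helper names. Because the number of digits in skip changes the script's own length, the stub is re-rendered until the two agree.

```python
import os

# Same script as above, with the skip value left as a placeholder.
STUB_TEMPLATE = '#!/bin/bash\n\ndd if="$0" of=payload bs=1 skip={skip}\n\nexit\n'


def render_stub(template: str = STUB_TEMPLATE) -> bytes:
    """Return the stub whose skip value equals the stub's own length in bytes."""
    skip = 0
    while True:
        stub = template.format(skip=skip).encode('ascii')
        if len(stub) == skip:
            return stub
        skip = len(stub)  # digit count may have changed the length; try again


def build_installer(payload: bytes, path: str) -> None:
    """Write stub + payload to path and mark the file executable."""
    with open(path, 'wb') as out:
        out.write(render_stub() + payload)
    os.chmod(path, 0o755)


print(len(render_stub()))  # 54
```

For this template the loop settles on 54, the same value used in the script above.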
