compress text before storing in mysql database

2019-08-02 00:30发布

I am developing a web application using php and MySQL. I am compressing the text using gzcompress() php function before storing in MySQL database / storing compressed form of text in database. My Question is that is this OK? to store compressed form? Or this method will create trouble or not for me? I am compressing text for saving disk space.

3条回答
神经病院院长
2楼-- · 2019-08-02 01:12

My approach. I needed to store email's text in MySQL table. Column type: TEXT.

Compress:

$compressed_string_for_db = base64_encode(gzcompress('big email text', 9));

Uncompress:

$email_text = gzuncompress(base64_decode($compressed_string_from_db));

My compression result:

income string length: 41072

compressed string length: 5312

查看更多
老娘就宠你
3楼-- · 2019-08-02 01:17

I honestly don't see any practical benefits of storing compressed texts. Sure you could probably save some space, but if your text is so big that you get a practical benefit out of compressing it, I think you need to worry more about your server's CPU having to compress / uncompress the humongous text whenever a visitor visits your pages (and not to mention searchable text means you gotta uncompress it to search).

查看更多
一纸荒年 Trace。
4楼-- · 2019-08-02 01:18

I can definitely see a LOT of practical benefits of storing compressed text ...
gz compression is fast, can be streamed and saves a LOT of space in some scenarios.

In my practical real life use I have data usage of about 50MB per 1000 records and I am getting millions of records.
Compressed I am able to reduce this to 10MB per 1000 records which means I pay 1/5th of the usual data storage fee to my cloud provider.
If that's not a practical benefit, what else ?;)

Also, did you consider that the CPU load of gzip is minimal ? gzip is already a HTTP standard, stackoverflow is being sent to your browser in a gzip stream just like almost any website if you are not using a browser from 1985.
I would guess that disk access (especially hard disks) have more impact than gzip on data retrieval. So by reducing the data through compression size you might actually gain performance.

Here is the code I use to compress the data in a mysql compatible format:
Here is my code for that scenario, for uncompress you can also use PHP and just substr the first 4 bytes away.
Output of mysql:

mysql : "select hex(compress('1234512345'))"
0A000000789C3334323631350411000AEB01FF

The php equivalent:

$string="1234512345";
$data=gzcompress($string);
$len=mb_strlen($string);
$head=pack('V',$len);

echo($head);
echo($data);

Output of PHP:

php test.php  | hexdump -C
00000000  0a 00 00 00 78 9c 33 34  32 36 31 35 04 11 00 0a
查看更多
登录 后发表回答