I am using MongoDB to store the raw HTML of web pages scraped with the Scrapy framework. A single day of scraping fills up 25 GB of disk space. Is there a way to store the raw data in a compressed format?
Starting with the 2.8 version of Mongo (released as 3.0), you can use compression. The WiredTiger engine gives you three levels of block compression: none, snappy (the default), and zlib. MMAPv1, the default engine in 2.6, does not provide compression.
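For example, a minimal mongod.conf sketch that selects zlib server-wide might look like this (the option names are from the MongoDB 3.0 documentation; the rest of the file is assumed):

    # mongod.conf (YAML format) -- minimal sketch, not a complete config
    storage:
      engine: wiredTiger
      wiredTiger:
        collectionConfig:
          blockCompressor: zlib   # one of: none, snappy (default), zlib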
As an example of how much space you can save for 16 GB of data, see the comparison chart in this article (the image is not reproduced here).
MongoDB itself (before 3.0) has nothing built in for compression. Some operating systems offer disk- or file-level compression, but if you want more control, I'd suggest compressing the data yourself with a library for whatever programming language you're using.
For example, Node.js offers simple convenience methods for this: http://nodejs.org/api/zlib.html#zlib_examples
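Since the question uses Scrapy, here is the Python equivalent: a minimal sketch that zlib-compresses each page before inserting it and decompresses it on the way out (the default localhost connection and the scraped.pages database/collection names are assumptions):

    import zlib

    import pymongo
    from bson.binary import Binary

    client = pymongo.MongoClient()     # assumed default localhost connection
    collection = client.scraped.pages  # hypothetical database/collection names

    def store_page(url, raw_html):
        # Compress the UTF-8 encoded HTML; level 9 favours size over CPU.
        packed = zlib.compress(raw_html.encode('utf-8'), 9)
        collection.insert_one({'url': url, 'html': Binary(packed)})

    def load_page(url):
        doc = collection.find_one({'url': url})
        # Decompress back to the original HTML string.
        return zlib.decompress(doc['html']).decode('utf-8')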
3.0 Update
If you choose to switch to the new WiredTiger storage engine, which ships with 3.0, you can choose between several types of compression, as documented here. You'll want to test the change under production-like workloads to see whether the additional CPU utilization is worth the space savings.
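The compressor can also be set per collection at creation time. A sketch with pymongo, assuming the server is already running WiredTiger (the scraped database and pages collection names are hypothetical):

    import pymongo

    db = pymongo.MongoClient().scraped  # hypothetical database name

    # Ask WiredTiger to use zlib for this collection even if the
    # server-wide default compressor is snappy.
    db.create_collection(
        'pages',
        storageEngine={'wiredTiger': {'configString': 'block_compressor=zlib'}},
    )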
You can compress your string before storing it like this: myhtml.encode('zlib'). Note that this codec shortcut only works on Python 2; on Python 3, use the zlib module directly, e.g. zlib.compress(myhtml.encode('utf-8')).
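To gauge whether this is worth doing for your pages, you can measure the ratio on a sample file first; a quick sketch (page.html is a hypothetical sample page, the code runs on both Python 2 and 3):

    import zlib

    raw = open('page.html', 'rb').read()  # hypothetical sample page
    packed = zlib.compress(raw, 9)
    saved = 100.0 * (1 - float(len(packed)) / len(raw))
    print('%d -> %d bytes (%.0f%% saved)' % (len(raw), len(packed), saved))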