Database for a web crawler in Python?

Posted 2019-04-03 00:26

Hi, I'm writing a web crawler in Python to extract news articles from news websites like nytimes.com. I want to know: what would be a good DB to use as a backend for this project?

Thanks in advance!

4 answers
别忘想泡老子 · 2019-04-03 00:59

This could be a great project to use a document database like CouchDB, MongoDB, or SimpleDB.

MongoDB has a hosted solution at http://mongohq.com, and there is a Python binding (PyMongo).

SimpleDB is a great choice if you are hosting this on Amazon Web Services.

CouchDB is an open source package from the Apache Foundation.
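Document stores like these keep each record as a JSON document, which maps naturally onto a scraped article. A minimal sketch of what a crawled article might look like as a document (the field names here are illustrative choices, not a required schema):

```python
import json

# An illustrative article document; field names are arbitrary choices.
article = {
    "url": "https://www.nytimes.com/example-article",
    "title": "Example headline",
    "published": "2019-04-03",
    "body": "Full article text...",
    "tags": ["politics", "example"],
}

# Document databases accept this nested shape directly; with PyMongo it
# would be roughly collection.insert_one(article). Here we just serialize
# it to show the stored form.
doc = json.dumps(article)
print(doc)
```

Because articles vary in structure (some have authors, some have galleries, some have corrections), not having to fix a schema up front is the main draw of this approach.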

虎瘦雄心在 · 2019-04-03 00:59

You can take a look at Firebird.

The Firebird Python driver is developed by the core team.

放我归山 · 2019-04-03 01:17

Personally, I love PostgreSQL -- but other free DBs such as MySQL (or, if you have reasonably small amounts of data -- a few GB at most -- even the SQLite that comes with Python) will be fine too.
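Since SQLite ships with Python's standard library, getting a crawler's storage running takes only a few lines. A minimal sketch (the table and column names are my own illustrative choices, not from the answer):

```python
import sqlite3

# In-memory DB for illustration; pass a file path like "crawler.db" in practice.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE IF NOT EXISTS articles (
           url        TEXT PRIMARY KEY,  -- avoids storing the same page twice
           title      TEXT,
           body       TEXT,
           fetched_at TEXT
       )"""
)
# INSERT OR IGNORE skips URLs the crawler has already seen.
conn.execute(
    "INSERT OR IGNORE INTO articles VALUES (?, ?, ?, ?)",
    ("https://www.nytimes.com/example", "Example headline",
     "Full article text...", "2019-04-03"),
)
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM articles").fetchone()[0]
print(count)  # → 1
```

Making the URL the primary key gives you crawl deduplication for free, which is usually the first feature a crawler's storage layer needs.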

2019-04-03 01:20

I think the database itself will probably be one of the easier aspects of a web crawler like this.

If you expect high read or write load on the database (for example, if you intend to run many crawlers at the same time), then you will want to steer in the direction of MySQL; otherwise something like SQLite will probably do you just fine.
