Is there any Python module that helps to crawl dat

Posted 2019-05-27 04:18

I want to scrape data from a page that loads its DOM elements using Ajax calls.

I have tried the older solutions, like PyQt4-based scraping, which reads the DOM once the page has fully loaded, but the problem is that I need to do a POST request and that approach only supports GET.

The newer Python module ghost.py has timeout issues: when it fetches a large DOM tree, it raises a timeout exception.

If anyone knows a specific approach or tool that lets me make a POST request and grab the data after the DOM has fully loaded, that would help me a lot.

2 answers
做自己的国王
#2 · 2019-05-27 04:59

You can use Selenium to automate a browser and access the DOM. Selenium has a Python driver, so you can write Python code to navigate to the page, click buttons, and wait for the Ajax calls to complete before you start scraping.
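
Since the question asks about POST specifically: Selenium has no direct "POST this URL" call that I know of, so one workaround is to inject a hidden form with JavaScript and submit it, then wait explicitly for an element that the Ajax call creates. The following is only a rough sketch under assumptions. The helper names `build_post_script` and `post_and_wait`, the URLs, the form fields and the CSS selector are all made up for illustration and need to be adapted to the real page.

```python
import json


def build_post_script(url, fields):
    """Return JavaScript that builds a hidden form and submits it as a POST."""
    # json.dumps produces valid JavaScript literals and escapes quotes for us.
    return (
        "var form = document.createElement('form');"
        "form.method = 'POST';"
        "form.action = %s;"
        "var fields = %s;"
        "for (var name in fields) {"
        "  var input = document.createElement('input');"
        "  input.type = 'hidden';"
        "  input.name = name;"
        "  input.value = fields[name];"
        "  form.appendChild(input);"
        "}"
        "document.body.appendChild(form);"
        # Call the prototype method so a field named 'submit' cannot shadow it.
        "HTMLFormElement.prototype.submit.call(form);"
    ) % (json.dumps(url), json.dumps(fields))


def post_and_wait(driver, start_url, post_url, fields, css_selector, timeout=30):
    """Submit a POST through the browser, then wait for Ajax-loaded content."""
    # Imported here so build_post_script stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # A document must be loaded before a form can be injected into it.
    driver.get(start_url)
    driver.execute_script(build_post_script(post_url, fields))

    # Block until the element exists, or raise TimeoutException after `timeout` seconds.
    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
    )
    return driver.page_source
```

You would call it with a real driver, for example `post_and_wait(webdriver.Firefox(), start_url, post_url, {"q": "term"}, "#results")`, where every argument is a placeholder. Pick a selector that only exists on the result page, otherwise the wait can return before the POST response has rendered. For a large DOM, raise `timeout` rather than relying on the default.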

Lonely孤独者°
#3 · 2019-05-27 04:59

To emulate JavaScript and automate a browser, I recommend `Spynner`. You can run it with or without an X server, and the syntax is quite simple. You can load jQuery too.
