Any idea on how to scrape pages which are behind _

Posted 2020-07-25 10:34

Question:

I am working on a PHP-based scraper/crawler, which works fine until it encounters a .NET-generated href link of the form __doPostBack(...). Any idea how to deal with this and crawl the pages behind those links?

Answer 1:

Instead of trying to automate clicking the JavaScript link, which requires additional libraries in PHP, try replicating the request your browser sends after the link is clicked. Various Firefox extensions will help you examine that request, such as Tamper Data, Firebug, and Live HTTP Headers.
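
To make the idea concrete, here is a minimal sketch in Python of what replicating the request usually involves. It assumes the page is a standard ASP.NET Web Forms page, where __doPostBack(target, argument) copies its two arguments into the hidden fields __EVENTTARGET and __EVENTARGUMENT and then submits the form along with the other hidden fields (such as __VIEWSTATE and __EVENTVALIDATION). The helper names, the control ID, and the URL are hypothetical; compare the payload against what your browser actually sends before relying on it. The same steps carry over to PHP with cURL.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request, urlopen


class HiddenFieldParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_payload(html, event_target, event_argument=""):
    """Build the form data a browser would submit for __doPostBack(target, argument).

    Assumption: the hidden fields found in the page (e.g. __VIEWSTATE) must be
    echoed back unchanged, as is typical for ASP.NET Web Forms.
    """
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


def fetch_postback_page(url, html, event_target, event_argument=""):
    """POST the payload back to the page URL and return the response body."""
    payload = build_postback_payload(html, event_target, event_argument)
    data = urlencode(payload).encode("utf-8")
    request = Request(
        url,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

For a link such as href="javascript:__doPostBack('ctl00$Main$lnkNext','')" (a made-up control ID), you would first download the page, then call fetch_postback_page(page_url, page_html, "ctl00$Main$lnkNext"). Some sites also expect the session cookies from the first request to be sent with the POST, which this sketch does not handle.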