headless internet browser? [closed]

Posted 2018-12-31 21:25

I would like to do the following: log into a website, click a couple of specific links, then click a download link. I'd like to run this as either a scheduled task on Windows or a cron job on Linux. I'm not picky about the language I use, but I'd like this to run without putting a browser window up on the screen if possible.

14 answers
余生无你
#2 · 2018-12-31 21:45

.NET contains System.Windows.Forms.WebBrowser. You can create an instance of this, navigate it to a URL, and then easily parse the HTML on that page. You could then follow any links you found, etc.

I have worked with this object only minimally, so I'm no expert, but if you're already familiar with .NET then it would probably be worth looking into.
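The parse-the-page-then-follow-links step is not tied to .NET. As a rough sketch in Python (standard library only, and not the WebBrowser control itself, so no JavaScript is executed), the following collects every link on a page and resolves it against the page URL. The names LinkCollector and extract_links, the sample HTML, and the example.com URL are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in html, as absolute URLs."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content and URL, used only to show the call.
SAMPLE_HTML = '<p><a href="/reports">Reports</a> <a href="files/data.csv">Download</a></p>'
print(extract_links(SAMPLE_HTML, "http://example.com/home/"))
# prints ['http://example.com/reports', 'http://example.com/home/files/data.csv']
```

In a real run the HTML would come from the page you fetched after logging in, and you would pick the link you need out of the returned list before requesting it.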

无色无味的生活
#3 · 2018-12-31 21:46

Node.js with YUI on the server. Check out this video: http://www.yuiblog.com/blog/2010/09/29/video-glass-node/

In the video, Dav Glass shows an example of how he uses Node to fetch a page from Digg. He then attaches YUI to the DOM he fetched and can manipulate it completely.

春风洒进眼中
#4 · 2018-12-31 21:46

If you use PHP, try Mink: http://mink.behat.org/

不流泪的眼
#5 · 2018-12-31 21:46

You can also use Live HTTP Headers (a Firefox extension) to record the headers that are sent to the site (login -> links -> download link) and then replicate them from PHP using fsockopen. The only thing you will probably need to vary is the value of the cookie you receive from the login page.
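The same record-and-replay idea can be sketched in Python instead of PHP. This is only an outline under assumptions: it uses the standard library's cookie jar rather than raw sockets, so the login cookie is captured and sent back automatically, and the form field names ("username", "password"), the User-Agent value, and the helper names are hypothetical. Take the real field names and headers from what you recorded.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, username, password):
    """Build the POST request for the login form.

    The field names below are assumptions; use the ones from the recorded request.
    """
    body = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        login_url,
        data=body.encode("ascii"),  # supplying data makes this a POST
        headers={"User-Agent": "Mozilla/5.0"},
    )


def download(opener, url, dest_path):
    """Fetch url with the session's cookies and write the body to dest_path."""
    with opener.open(url) as response, open(dest_path, "wb") as out:
        out.write(response.read())


def fetch_file(login_url, file_url, username, password, dest_path):
    """Log in, then download file_url using the cookie set by the login page."""
    opener, _jar = make_session()
    opener.open(build_login_request(login_url, username, password)).close()
    download(opener, file_url, dest_path)
```

A call would look like fetch_file("http://example.com/login", "http://example.com/files/report.csv", "me", "secret", "report.csv"), where every argument is a placeholder. If the site needs the intermediate link clicks to set up server-side state, request those URLs with the same opener before the download.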

一个人的天荒地老
#6 · 2018-12-31 21:49

Check out twill, a very convenient scripting language for precisely what you're looking for. From the examples:

setlocal username <your username>
setlocal password <your password>

go http://www.slashdot.org/
formvalue 1 unickname $username
formvalue 1 upasswd $password
submit

code 200     # make sure form submission is correct!

There's also a Python API if you're looking for more flexibility.

低头抚发
#7 · 2018-12-31 21:49

Apart from the automatic download of the file (which goes through a dialog box), a WinForms application with an embedded web browser control will do this.

You could look at WatiN and WatiN Recorder. They may help with C# code that can log in to your website, navigate to a URL, and possibly even help automate the file download.

YMMV though.
