How to Programmatically Take a Snapshot of a Crawled Webpage

Posted 2019-03-17 08:13

What is the best solution to programmatically take a snapshot of a webpage?

The situation is this: I would like to crawl a bunch of webpages and take thumbnail snapshots of them periodically, say once every few months, without having to manually go to each one. I would also like to be able to take JPG/PNG snapshots of websites that might be entirely Flash/Flex, so I'd somehow have to wait until the page has loaded before taking the snapshot.

It would be nice if there was no limit to the number of thumbnails I could generate (within reason, say 1000 per day).

Any ideas how to do this in Ruby? Seems pretty tough.

Browsers to do this in: Safari or Firefox, preferably Safari.

Thanks so much.

5 Answers
Luminary・发光体
#2 · 2019-03-17 08:19

There is no built-in library in Ruby for rendering a web page.

可以哭但决不认输i
#3 · 2019-03-17 08:21

As viewed by what... IE? Firefox? Opera? One of the myriad WebKit engines?

If only it were possible to automate http://browsershots.org :)

走好不送
#4 · 2019-03-17 08:22

Use Selenium RC; it comes with screenshot capabilities.
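
A minimal sketch of that approach, assuming the selenium-client gem and a Selenium RC server already running on localhost:4444 (the URL and output path are placeholders):

    require "rubygems"
    require "selenium/client"

    # Connect to a Selenium RC server assumed to be running locally.
    browser = Selenium::Client::Driver.new(
      :host              => "localhost",
      :port              => 4444,
      :browser           => "*firefox",
      :url               => "http://example.com",
      :timeout_in_second => 60
    )

    browser.start_new_browser_session
    begin
      browser.open "/"
      # captureEntirePageScreenshot is a Firefox-only RC command; the exact
      # snake_case wrapper name here is an assumption about the gem's API.
      browser.capture_entire_page_screenshot("/tmp/snapshot.png", "")
    ensure
      browser.close_current_browser_session
    end

Since the question mentions Flash content, you would probably also want a sleep or wait condition after open before capturing.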

狗以群分
#5 · 2019-03-17 08:29

With JRuby you can use SWT's Browser widget.
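
A rough sketch of that idea, assuming JRuby with swt.jar on the classpath; the URL, window size, and file name are placeholders, and capturing immediately on the "completed" event is a simplification (Flash-heavy pages would likely need an extra delay):

    require "java"
    # Run with the SWT jar on the classpath, e.g.: jruby -J-cp swt.jar snapshot.rb

    SWT         = org.eclipse.swt.SWT
    Display     = org.eclipse.swt.widgets.Display
    Shell       = org.eclipse.swt.widgets.Shell
    Browser     = org.eclipse.swt.browser.Browser
    GC          = org.eclipse.swt.graphics.GC
    Image       = org.eclipse.swt.graphics.Image
    ImageData   = org.eclipse.swt.graphics.ImageData
    ImageLoader = org.eclipse.swt.graphics.ImageLoader
    ImageLoader.field_accessor :data  # expose the public Java field to Ruby

    display = Display.new
    shell   = Shell.new(display)
    shell.set_layout(org.eclipse.swt.layout.FillLayout.new)
    browser = Browser.new(shell, SWT::NONE)
    shell.set_size(1024, 768)
    shell.open
    browser.set_url("http://example.com")

    # Copy the widget's rendered area into a PNG once loading completes.
    class SnapshotListener
      include org.eclipse.swt.browser.ProgressListener
      def initialize(display, shell, browser)
        @display, @shell, @browser = display, shell, browser
      end
      def changed(event); end
      def completed(event)
        image = Image.new(@display, 1024, 768)  # matches the shell size above
        gc = GC.new(@browser)
        gc.copy_area(image, 0, 0)
        gc.dispose
        loader = ImageLoader.new
        loader.data = [image.image_data].to_java(ImageData)
        loader.save("snapshot.png", SWT::IMAGE_PNG)
        image.dispose
        @shell.close
      end
    end
    browser.add_progress_listener(SnapshotListener.new(display, shell, browser))

    # Standard SWT event loop; exits when the shell is closed after capture.
    until shell.disposed?
      display.sleep unless display.read_and_dispatch
    end
    display.dispose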

我命由我不由天
#6 · 2019-03-17 08:37

This really depends on your operating system. What you need is a way to hook into a web browser and save what it renders to an image.

If you are on a Mac, I would imagine your best bet would be to use MacRuby (or RubyCocoa, although I believe that is going to be deprecated in the near future) and then to use the WebKit framework to load the page and render it as an image.

This is definitely possible; for inspiration you may wish to look at the Paparazzi! and webkit2png projects.
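
For example, here is a hypothetical MacRuby sketch modeled loosely on webkit2png; the URL, size, and output file are placeholders, and hosting the WebView in an offscreen borderless window is an assumption about what WebKit needs in order to paint:

    framework "Cocoa"
    framework "WebKit"

    NSApplication.sharedApplication  # initialize AppKit for offscreen drawing

    # Host the WebView in an offscreen window so WebKit lays out and paints.
    rect   = NSMakeRect(0, 0, 1024, 768)
    window = NSWindow.alloc.initWithContentRect(rect,
               styleMask: NSBorderlessWindowMask,
               backing: NSBackingStoreBuffered,
               defer: false)
    view = WebView.alloc.initWithFrame(rect)
    window.contentView = view

    # Watch for the main frame to finish loading.
    class LoadWatcher
      attr_reader :done
      def initialize; @done = false; end
      def webView(sender, didFinishLoadForFrame: frame)
        @done = true if frame == sender.mainFrame
      end
    end
    watcher = LoadWatcher.new
    view.frameLoadDelegate = watcher

    view.mainFrame.loadRequest(
      NSURLRequest.requestWithURL(NSURL.URLWithString("http://example.com")))

    # Spin the run loop until the page has loaded; Flash-heavy pages would
    # probably need an extra grace period after this before capturing.
    until watcher.done
      NSRunLoop.currentRunLoop.runMode(NSDefaultRunLoopMode,
        beforeDate: NSDate.dateWithTimeIntervalSinceNow(0.1))
    end

    # Render the view into a bitmap and write it out as a PNG.
    rep = view.bitmapImageRepForCachingDisplayInRect(view.bounds)
    view.cacheDisplayInRect(view.bounds, toBitmapImageRep: rep)
    png = rep.representationUsingType(NSPNGFileType, properties: nil)
    png.writeToFile("snapshot.png", atomically: true)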

Another option, which isn't dependent on the OS, might be to use the BrowserShots API.
