Scripting HTTP more efficiently

Posted 2019-03-11 14:57

Oftentimes I want to automate HTTP queries. I currently use Java (and Commons HttpClient), but would probably prefer a scripting-based approach: something really quick and simple, where I can set a header and go to a page without worrying about setting up the entire OO lifecycle, setting each header, calling up an HTML parser... I am looking for a solution in ANY language, preferably a scripting one.

12 answers
淡お忘
#2 · 2019-03-11 15:42

If you have simple needs (fetch a page and then parse it), it is hard to beat LWP::Simple and HTML::TreeBuilder.

use strict;
use warnings;

use LWP::Simple;
use HTML::TreeBuilder;

my $url = 'http://www.example.com';
my $content = get($url) or die "Couldn't get $url";

my $t = HTML::TreeBuilder->new_from_content( $content );
$t->eof;
$t->elementify;

# Get first match:
my $thing = $t->look_down( _tag => 'p', id => qr/match_this_regex/ );

print $thing ? $thing->as_text . "\n" : "No match found\n";

# Get all matches:
my @things = $t->look_down( _tag => 'p', id => qr/match_this_regex/ );

print $_->as_text, "\n" for @things;
print "No matches found\n" unless @things;
别忘想泡老子
#3 · 2019-03-11 15:47

Python urllib may be what you're looking for.

Alternatively, PowerShell exposes the full .NET HTTP library in a scripting environment.
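For example, a minimal sketch with Python 3's urllib.request (the URL and header value here are just placeholders):

import urllib.request

# Build a request with a custom header, fetch the page, and print the
# status code and body length.
req = urllib.request.Request(
    'http://www.example.com',                  # placeholder URL
    headers={'User-Agent': 'my-script/1.0'},   # example header
)
with urllib.request.urlopen(req) as resp:
    body = resp.read().decode('utf-8', errors='replace')
    print(resp.status, len(body))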

Viruses.
#4 · 2019-03-11 15:49

Depending on exactly what you're doing, the easiest solution looks to be bash + curl.

The man page for the latter is available here:

http://curl.haxx.se/docs/manpage.html

You can do POSTs as well as GETs, use HTTPS, show headers, work with cookies, do basic and digest HTTP authentication, and tunnel through all sorts of proxies (including NTLM on *nix), amongst other things.

curl is also available as a shared library with C and PHP support.

HTH

C.

Juvenile、少年°
#5 · 2019-03-11 15:49

Twill is pretty good and made for testing. It can be used as a script, in an interactive session, or from within a Python program.
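A rough sketch of the Python-program form, assuming the twill.commands interface (the URL is a placeholder):

from twill.commands import go, code, show

go('http://www.example.com')   # fetch the page
code(200)                      # assert the HTTP status, as in a test script
show()                         # dump the page source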

【Aperson】
#6 · 2019-03-11 15:49

Perl and WWW::Mechanize can make web scraping and the like simple and easy, including easy handling of forms (say you want to go to a login page, fill in a username and password, and submit the form, with cookies and hidden session identifiers handled just as a browser would).

Similarly, finding or extracting links from the fetched page is trivial.

If you need to parse stuff out of the resulting pages that WWW::Mechanize can't easily help with, then feed the result to HTML::TreeBuilder to make parsing easy.

孤傲高冷的网名
#7 · 2019-03-11 15:51

I'm testing ReST APIs at the moment and have found the ReST Client very nice. It's a GUI program, but you can nonetheless save and restore queries as XML files (or have them generated), embed it, write test scripts, and so on. And it's Java-based (not an advantage in itself, but you did mention Java).

Minus points for recording sessions, though: the ReST Client is good for stateless "one-shots".

If it doesn't suit your needs, I'd go for the already mentioned Mechanize (or WWW-Mechanize, as it is called at CPAN).
