我通常不Perl的编码器。 但是我必须完成这个任务。
下面的代码对我的作品:
#!/usr/bin/perl
use LWP::UserAgent;
use JSON;
use strict;
my $md5 = $ARGV[0];
$md5 =~ s/[^A-Fa-f0-9 ]*//g;
die "invalid MD5" unless ( length($md5) == 32 );
my $ua = LWP::UserAgent->new(ssl_opts => { verify_hostname => 1 }, timeout => 10);
my $key="12345...7890";
my $url='https://www.virustotal.com/vtapi/v2/file/report';
my $response = $ua->post( $url, ['apikey' => $key, 'resource' => $md5] );
die "$url error: ", $response->status_line unless $response->is_success;
my $results=$response->content;
my $json = JSON->new->allow_nonref;
my $decjson = $json->decode( $results);
print "md5: ",$md5,"\n";
print "positives: ", $decjson->{"positives"}, "\n";
print "total: ", $decjson->{"total"}, "\n";
print "date: ", $decjson->{"scan_date"}, "\n";
现在我想重新编写以上对使用魔异步HTTP。 我想这样的:
#!/usr/bin/perl
use warnings;
use strict;
use Mojo;
use Mojo::UserAgent;
my $md5 = $ARGV[0];
$md5 =~ s/[^A-Fa-f0-9 ]*//g;
die "invalid MD5" unless ( length($md5) == 32 );
my ($vt_positives, $vt_scandate, $response_vt);
my $url='https://www.virustotal.com/vtapi/v2/file/report';
my $key="12345...7890";
my $ua = Mojo::UserAgent->new;
my $delay = Mojo::IOLoop->delay;
$ua->max_redirects(0)->connect_timeout(3)->request_timeout(6);
$ua->max_redirects(5);
$delay->begin;
$response_vt = $ua->post( $url => ['apikey' => $key, 'resource' => $md5] => sub {
my ($ua, $tx) = @_;
$vt_positives=$tx->res->json->{"positives"};
print "Got response: $vt_positives\n";
});
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
第一个代码是OK,第二个是行不通的。 我必须在发送请求时做错了什么,因为我似乎得到403响应(不正确的API使用)。 我也试过 - > json的电话,但它没有工作。
即使我已经做了正确的要求,我不知道如果我与魔的JSON结果正确解码。
帮助将不胜感激!
编辑
看来我们错过了真正的问题,如何发布形式。 哎呀抱歉。
发布形式取决于您所使用的Mojolicious的版本。 直到最近 (v3.85 - 2013年2月13日)有一个post_form
方法。 经过思考然而,就决定应该或者是*_form
为每个请求类型的方法,还是应该做一些更聪明,因而form
发电机诞生了。
$response_vt = $ua->post(
$url,
form => {'apikey' => $key, 'resource' => $md5},
sub { ... }
);
它可以添加到任何请求方法,使比旧的形式,它更加一致。 另外请注意,它应该是一个hashref,而不是一个数组引用作为LWP允许。 BTW也有一个json
发生器太是这样工作的,或者你甚至可以添加你自己的 !
我要离开我原来的答复,显示出无阻塞使用,这你现在可以修改给出的上方。
原版的
建立关从creaktive的逻辑,这是我会怎么开始。 主要的区别是,没有一个监控器观看,以确保有作品去,而当一个人完成它检查,以确保没有闲人。
我也取得了解析逻辑的一些变化,但没有重大。
#!/usr/bin/env perl
use Mojo::Base -strict;
use utf8::all;
use Mojo::URL;
use Mojo::UserAgent;
# FIFO queue
my @urls = qw(
http://sysd.org/page/1/
http://sysd.org/page/2/
http://sysd.org/page/3/
);
# User agent following up to 5 redirects
my $ua = Mojo::UserAgent
->new(max_redirects => 5)
->detect_proxy;
start_urls($ua, \@urls, \&get_callback);
sub start_urls {
my ($ua, $queue, $cb) = @_;
# Limit parallel connections to 4
state $idle = 4;
state $delay = Mojo::IOLoop->delay(sub{say @$queue ? "Loop ended before queue depleated" : "Finished"});
while ( $idle and my $url = shift @$queue ) {
$idle--;
print "Starting $url, $idle idle\n\n";
$delay->begin;
$ua->get($url => sub{
$idle++;
print "Got $url, $idle idle\n\n";
$cb->(@_, $queue);
# refresh worker pool
start_urls($ua, $queue, $cb);
$delay->end;
});
}
# Start event loop if necessary
$delay->wait unless $delay->ioloop->is_running;
}
sub get_callback {
my ($ua, $tx, $queue) = @_;
# Parse only OK HTML responses
return unless
$tx->res->is_status_class(200)
and $tx->res->headers->content_type =~ m{^text/html\b}ix;
# Request URL
my $url = $tx->req->url;
say "Processing $url";
parse_html($url, $tx, $queue);
}
sub parse_html {
my ($url, $tx, $queue) = @_;
state %visited;
my $dom = $tx->res->dom;
say $dom->at('html title')->text;
# Extract and enqueue URLs
$dom->find('a[href]')->each(sub{
# Validate href attribute
my $link = Mojo::URL->new($_->{href});
return unless eval { $link->isa('Mojo::URL') };
# "normalize" link
$link = $link->to_abs($url)->fragment(undef);
return unless grep { $link->protocol eq $_ } qw(http https);
# Don't go deeper than /a/b/c
return if @{$link->path->parts} > 3;
# Access every link only once
return if $visited{$link->to_string}++;
# Don't visit other hosts
return if $link->host ne $url->host;
push @$queue, $link;
say " -> $link";
});
say '';
return;
}
看看这个并发请求基于Mojolicious,网络爬虫,我写来说明我的文章网络与现代的Perl刮 :
#!/usr/bin/env perl
use 5.010;
use open qw(:locale);
use strict;
use utf8;
use warnings qw(all);
use Mojo::UserAgent;
# FIFO queue
my @urls = map { Mojo::URL->new($_) } qw(
http://sysd.org/page/1/
http://sysd.org/page/2/
http://sysd.org/page/3/
);
# Limit parallel connections to 4
my $max_conn = 4;
# User agent following up to 5 redirects
my $ua = Mojo::UserAgent
->new(max_redirects => 5)
->detect_proxy;
# Keep track of active connections
my $active = 0;
Mojo::IOLoop->recurring(
0 => sub {
for ($active + 1 .. $max_conn) {
# Dequeue or halt if there are no active crawlers anymore
return ($active or Mojo::IOLoop->stop)
unless my $url = shift @urls;
# Fetch non-blocking just by adding
# a callback and marking as active
++$active;
$ua->get($url => \&get_callback);
}
}
);
# Start event loop if necessary
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
sub get_callback {
my (undef, $tx) = @_;
# Deactivate
--$active;
# Parse only OK HTML responses
return
if not $tx->res->is_status_class(200)
or $tx->res->headers->content_type !~ m{^text/html\b}ix;
# Request URL
my $url = $tx->req->url;
say $url;
parse_html($url, $tx);
return;
}
sub parse_html {
my ($url, $tx) = @_;
say $tx->res->dom->at('html title')->text;
# Extract and enqueue URLs
for my $e ($tx->res->dom('a[href]')->each) {
# Validate href attribute
my $link = Mojo::URL->new($e->{href});
next if 'Mojo::URL' ne ref $link;
# "normalize" link
$link = $link->to_abs($tx->req->url)->fragment(undef);
next unless grep { $link->protocol eq $_ } qw(http https);
# Don't go deeper than /a/b/c
next if @{$link->path->parts} > 3;
# Access every link only once
state $uniq = {};
++$uniq->{$url->to_string};
next if ++$uniq->{$link->to_string} > 1;
# Don't visit other hosts
next if $link->host ne $url->host;
push @urls, $link;
say " -> $link";
}
say '';
return;
}
LWP ::用户代理接受参数后如任一裁判阵列或ref散列格式。
http://search.cpan.org/~gaas/libwww-perl-6.04/lib/LWP/UserAgent.pm#REQUEST_METHODS
$ua->post( $url, \%form )
$ua->post( $url, \@form )
你在裁判阵列格式的第一脚本提供“\ @form”
my $response = $ua->post( $url, ['apikey' => $key, 'resource' => $md5] );
是因为它是这可能是更好的写在哈希结构哈希“\%的形式”
my $response = $ua->post( $url, {'apikey' => $key, 'resource' => $md5} );
在魔::用户代理的参数后是一个稍微复杂一些,但本质上似乎是一个哈希裁判的“串”到哈希键,与我是陌生的。 然而,你可能会发现使用散列裁判格式正确提供预期的参数。
http://search.cpan.org/~sri/Mojolicious-3.87/lib/Mojo/UserAgent.pm#post
POST
my $tx = $ua->post('kraih.com');
my $tx = $ua->post('http://kraih.com' => {DNT => 1} => 'Hi!');
my $tx = $ua->post('http://kraih.com' => {DNT => 1} => form => {a => 'b'});
my $tx = $ua->post('http://kraih.com' => {DNT => 1} => json => {a => 'b'});
试试这个 ?:
$response_vt = $ua->post( $url => form => {'apikey' => $key, 'resource' => $md5} => sub {... });