Cookies in Perl LWP

Posted 2019-05-30 11:00

Question:

I once wrote a simple 'crawler' in Java to download HTTP pages for me. Now I'm trying to rewrite the same thing in Perl, using the LWP module.

This is my Java code (which works fine):

String referer = "http://example.com";
String url = "http://example.com/something/cgi-bin/something.cgi";
String params= "a=0&b=1";

HttpState initialState = new HttpState();
HttpClient httpclient = new HttpClient();
httpclient.setState(initialState);
httpclient.getParams().setCookiePolicy(CookiePolicy.NETSCAPE);

PostMethod postMethod = new PostMethod(url);
postMethod.addRequestHeader("Referer", referer);
postMethod.addRequestHeader("User-Agent", " Mozilla/5.0 (Windows; U; Windows NT 6.1; pl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13");
postMethod.addRequestHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
postMethod.addRequestHeader("Content-Type", "application/x-www-form-urlencoded");

String length = String.valueOf(params.length());
postMethod.addRequestHeader("Content-Length", length);
postMethod.setRequestBody(params);

httpclient.executeMethod(postMethod);

And this is the Perl version:

my $referer = "http://example.com/something/cgi-bin/something.cgi?module=A";
my $url = "http://example.com/something/cgi-bin/something.cgi";
my @headers = (
  'User-Agent' => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; pl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13',
  'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
  'Referer' => $referer,
  'Content-Type' => 'application/x-www-form-urlencoded',
);

my @params = (
    'a' => '0',
    'b' => '1',
);

my $browser = LWP::UserAgent->new( );
$browser->cookie_jar({});

$response = $browser->post($url, @params, @headers);
print $response->content;

The POST request executes without errors, but I get a different page back (the main page), as if cookies were not working properly.

Any guesses as to what is wrong? Why am I getting different results from the Java and Perl programs?

Answer 1:

You should be building hashes rather than arrays, for both the parameters and the headers. For example, instead of:

my @params = ( 'a' => '0', 'b' => '1', );

You should use:

my %params = ( a => 0, b => 1, );

When passing the params to the LWP::UserAgent post method, you need to pass a reference to the hash; a flattened list of pairs after the URL is treated as request headers, not form fields. For example:

$response = $browser->post($url, \%params, %headers);
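To see the difference without contacting a server, you can build both requests with HTTP::Request::Common::POST, which the LWP::UserAgent post method uses to construct its request, and print them. This is only a minimal sketch: the single Referer header and the use of an array reference (chosen so the field order is predictable) are assumptions for illustration, not part of the original code.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

my $url     = 'http://example.com/something/cgi-bin/something.cgi';
my @params  = ( a => 0, b => 1 );
my @headers = ( Referer => 'http://example.com' );

# Flat list: every pair after the URL is read as a header,
# so "a" and "b" end up as header fields and the body stays empty.
my $broken = POST( $url, @params, @headers );

# Reference: the pairs are URL-encoded into the request body.
my $fixed = POST( $url, \@params, @headers );

print "--- flat list ---\n", $broken->as_string, "\n";
print "--- reference ---\n", $fixed->as_string, "\n";
```

With the flat list the server receives a POST with no form data at all, which would be consistent with getting the main page back regardless of cookies.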

You could also look at the request you're sending to the server with:

print $response->request->as_string;

You can also use a handler to automatically dump requests and responses for debugging purposes:

$browser->add_handler("request_send", sub { shift->dump; return });
$browser->add_handler("response_done", sub { shift->dump; return });
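If you would rather inspect the outgoing request without hitting the real server, a request_send handler can also return a response itself, in which case LWP skips the network step. The sketch below relies on that behaviour; the canned "stub" response and the @sent array are hypothetical names made up for this example.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

my $ua = LWP::UserAgent->new;
my @sent;    # request dumps captured by the handler

# Hypothetical stub: record each outgoing request, then return a canned
# response so that no connection to example.com is ever opened.
$ua->add_handler( request_send => sub {
    my ($request) = @_;
    push @sent, $request->as_string;
    return HTTP::Response->new( 200, 'OK', [ 'Content-Type' => 'text/plain' ], "stub\n" );
} );

my $response = $ua->post(
    'http://example.com/something/cgi-bin/something.cgi',
    [ a => 0, b => 1 ],                  # form fields, passed by reference
    Referer => 'http://example.com',     # extra header
);

print $sent[0];    # the exact request LWP would have sent
```

Once the printed request looks right (form fields in the body, headers where you expect them), remove the stub handler and run against the real site.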



Answer 2:

You can also use WWW::Mechanize, which is a subclass of LWP::UserAgent. It sets up a cookie jar for you automatically.
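For context, here is a minimal offline sketch of what a cookie jar does between two requests, whether it was set up by WWW::Mechanize or by cookie_jar({}). It drives HTTP::Cookies by hand instead of going through a user agent; the hand-built response and the session=abc123 cookie are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

my $jar = HTTP::Cookies->new;

# First exchange: pretend the server answered with a Set-Cookie header.
my $first = HTTP::Request->new( GET => 'http://example.com/something/' );
my $reply = HTTP::Response->new( 200, 'OK',
    [ 'Set-Cookie' => 'session=abc123; path=/' ] );
$reply->request($first);         # the jar needs to know which URL set the cookie
$jar->extract_cookies($reply);

# Second exchange: the jar adds the stored cookie to a request for the same host.
my $second = HTTP::Request->new(
    POST => 'http://example.com/something/cgi-bin/something.cgi' );
$jar->add_cookie_header($second);

print $second->header('Cookie'), "\n";
```

A user agent with a cookie jar performs these two steps around every request, so a cookie only helps if an earlier response in the same session actually set one.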



Answer 3:

I believe the problem is how the arguments are passed in $response = $browser->post($url, @params, @headers);

From the LWP::UserAgent documentation:

$ua->post( $url, \%form )
$ua->post( $url, \@form )
$ua->post( $url, \%form, $field_name => $value, ... )
$ua->post( $url, $field_name => $value,... Content => \%form )
$ua->post( $url, $field_name => $value,... Content => \@form )
$ua->post( $url, $field_name => $value,... Content => $content )
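As a rough illustration of how these call forms relate, the sketch below builds three requests offline with HTTP::Request::Common::POST and prints their bodies. The Referer value is a placeholder, and an array reference is used instead of a hash reference only to keep the field order predictable.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

my $url     = 'http://example.com/something/cgi-bin/something.cgi';
my $referer = 'http://example.com';

# Form reference first, headers after it.
my $form_first = POST( $url, [ a => 0, b => 1 ], Referer => $referer );

# Headers first, form reference given through the Content key.
my $content_ref = POST( $url, Referer => $referer, Content => [ a => 0, b => 1 ] );

# Headers first, body given as an already-encoded string.
my $content_raw = POST( $url, Referer => $referer, Content => 'a=0&b=1' );

print $_->content, "\n" for $form_first, $content_ref, $content_raw;
```

In every form the field data is either a reference or the value of the Content key; a bare list of pairs right after the URL is never read as form data.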

Since your params and headers are key/value pairs, I would store them in hashes and pass the form data by reference:

use strict;
use warnings;
use LWP::UserAgent;

my $referer = "http://example.com/something/cgi-bin/something.cgi?module=A";
my $url = "http://example.com/something/cgi-bin/something.cgi";
my %headers = (
  'User-Agent' => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; pl; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13',
  'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
  'Referer' => $referer,
  'Content-Type' => 'application/x-www-form-urlencoded',
);

my %params = (
    'a' => '0',
    'b' => '1',
);

# In-memory cookie jar, kept for the lifetime of $browser
my $browser = LWP::UserAgent->new( );
$browser->cookie_jar({});

# Form data goes in by reference; the headers follow as a flat list
my $response = $browser->post($url, \%params, %headers);