Call getPage from htmlunit WebClient with JavaScri

Posted 2020-02-06 10:51

Question:

I'm having problems with HtmlUnit. I disabled JavaScript and set the timeout to 10000 ms before calling getPage. I expected an exception once the timeout elapsed, but HtmlUnit waits forever.

After some searching I found that someone had the same problem back in 2009 ("Connection timeout not working"). They complained that the connection timeout had no effect and that certain timeout values did not work, but as of 2011 they still have no answer.

Someone here asked which exception is thrown, but I don't think it is always thrown. The question "Apache HttpClient setTimeout" doesn't give me an answer either. Another person asks about stopping on timeout in "Terminate or Stop HtmlUnit".

You can see how crazy it is if you try:

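// Start with a 10 ms timeout and raise it by 10 ms on every request.
int milisecReqTimeout = 10;
while (true)
{
    _webclient.setTimeout(milisecReqTimeout);
    milisecReqTimeout = milisecReqTimeout + 10;
    // Expected to fail fast with such small timeouts, but it can hang instead.
    _htmlpage = _webclient.getPage(url);
}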

Answer 1:

    // _TimeoutCliSocket and _TimeoutCliConnection are fields of the enclosing
    // class holding timeouts in milliseconds.
    _thewebclient.setWebConnection(new HttpWebConnection(_thewebclient) {
        @Override
        protected synchronized AbstractHttpClient getHttpClient() {
            AbstractHttpClient client = super.getHttpClient();
            if (_TimeoutCliSocket > 0) {
                // Sets the socket timeout (SO_TIMEOUT) in milliseconds to
                // be used when executing the method.
                // A timeout value of zero is interpreted as an infinite timeout.
                // This is the time a read operation will block for before
                // throwing a java.io.InterruptedIOException.
                client.getParams().setParameter("http.socket.timeout",
                                                _TimeoutCliSocket);
            }
            if (_TimeoutCliConnection > 0) {
                // The timeout in milliseconds used when retrieving an
                // HTTP connection from the HTTP connection manager.
                // Zero means to wait indefinitely.
                client.getParams().setParameter("http.connection-manager.timeout",
                                                _TimeoutCliConnection);
            }
            client.getParams().setParameter("http.tcp.nodelay", true);
            return client;
        }
    });
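
To make the SO_TIMEOUT comment above concrete, here is a minimal sketch that uses only the JDK's `java.net` classes rather than HtmlUnit or HttpClient. The class name `SoTimeoutDemo` and the method `readTimesOut` are made up for illustration. It connects to a local server that never sends a byte and shows that the blocked read ends with a `SocketTimeoutException` (a subclass of `InterruptedIOException`). Note that this timeout applies to each individual read, so a server that keeps trickling data could still hold a request open far longer than the configured value.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {

    /** Returns true if a read on a silent connection fails with SocketTimeoutException. */
    static boolean readTimesOut(int soTimeoutMillis) {
        // The server socket never calls accept() or writes anything; the OS still
        // completes the TCP handshake through the listen backlog.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            // SO_TIMEOUT: the longest a single read may block.
            client.setSoTimeout(soTimeoutMillis);
            InputStream in = client.getInputStream();
            in.read(); // blocks, because the server never sends data
            return false;
        } catch (SocketTimeoutException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```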


Answer 2:

I found that, with HttpUnit 1.6.2, setting these

    final HttpClient client = new HttpClient();
    final GetMethod method = new GetMethod(pUrl);

    client.setConnectionTimeout((int) timeout);
    client.setTimeout((int) timeout);

    final int statusCode = client.executeMethod(method);

seemed to do the trick. Both methods are deprecated, though. :(
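
If none of the library timeouts turn out to be reliable, one fallback (a different technique from the answers above, not something HtmlUnit or HttpClient provides) is to run the blocking call on a worker thread and stop waiting for it after a hard deadline. The sketch below uses only `java.util.concurrent`. `HardDeadline`, `callWithDeadline`, and `slowFetch` are hypothetical names, and `slowFetch` merely sleeps as a stand-in for a page request that hangs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HardDeadline {

    /** Runs task on a worker thread; throws TimeoutException if it is not done in time. */
    static <T> T callWithDeadline(Callable<T> task, long timeoutMillis)
            throws TimeoutException, InterruptedException, ExecutionException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fetch-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> result = worker.submit(task);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. Blocking socket I/O may not react to the
            // interrupt, so the caller stops waiting but the thread can linger.
            result.cancel(true);
            throw e;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a request that hangs; real code would fetch the page here.
        Callable<String> slowFetch = () -> {
            Thread.sleep(5_000);
            return "page";
        };
        try {
            callWithDeadline(slowFetch, 100);
        } catch (TimeoutException e) {
            System.out.println("gave up after deadline");
        }
    }
}
```

The caller gets its exception on time either way, but this only bounds how long the caller waits. It does not release the underlying connection, so it complements the socket and connection timeouts rather than replacing them.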