Reactive WebClient not emitting a response

2019-02-14 11:20发布

问题:

I have a question about Spring Reactive WebClient... Few days ago I decided to play with the new reactive stuff in Spring Framework and I made one small project for scraping data only for personal purposes. (making multiple requests to one webpage and combining the results).

I started using the new reactive WebClient for making requests but the problem I found is that the client not emitting response for every request. Sounds strange. Here is what I did for fetching data:

private Mono<String> fetchData(String uri) {
    return this.client
            .get()
            .uri(uri)
            .header("X-Fsign","SW9D1eZo")
            .retrieve()
            .bodyToMono(String.class)
            .timeout(Duration.ofSeconds(35))
            .log("category", Level.ALL, SignalType.ON_ERROR, SignalType.ON_COMPLETE, SignalType.CANCEL, SignalType.REQUEST);
}

And the function that calls fetchData:

public Mono<List<Stat>> fetch() {
    return fetchData(URL)
            .map(this::extractUrls)
            .doOnNext(System.out::println)
            .doOnNext(s-> System.out.println("all ids are "+s.size()))
            .flatMapIterable(q->q)
            .map(s -> s.substring(7, 15))
            .map(s -> "http://d.flashscore.com/x/feed/d_hh_" + s + "_en_1") // list of N-length urls
            .flatMap(this::fetchData)
            .map(this::extractHeadToHead)
            .collectList();
}

and the subscriber:

    FlashScoreService bean = ctx.getBean(FlashScoreService.class);
    bean.fetch().subscribe(s->{
        System.out.println("finished !!! " + s.size()); //expecting same N-length list size
    },Throwable::printStackTrace);

The problem is if I made a little bit more requests > 100. I didn't get responses for all of them, no error is thrown or error response code is returned and subscribe method is invoked with size different from the number of requests.

The requests I made are based on List of Strings (urls) and after all responses are emitted I should receive all of them as list because I'm using collectList(). When I execute 100 requests, I expect to receive list of 100 responses but actually I'm receiving sometimes 100, sometimes 96 etc ... May be something fails silently. This is easy reproducible here is my github project link.

Sample output:

all ids are 176
finished !!! 171

Please give me suggestions how I can debug or what I'm doing wrong. Help is appreciated.

Update:

The log shows if I pass 126 urls for example:

onNext(ReactorClientHttpResponse{request=[GET/some_url],status=200}) is called 121 times. May be here is the problem.
onComplete() is called 126 times which is the exact same length of the passed list of urls

but how it's possible some of the requests to be completed without calling onNext() or onError( ) ? (success and error in Mono)

I think the problem is not in the WebClient but somewhere else. Environment or server blocking the request, but may be I should receive some error log.

ps. Thanks for the help !

回答1:

This is a tricky one. Debugging the actual HTTP frames received, it seems we're really not getting responses for some requests. Debugging a little more with Wireshark, it looks like the remote server is requesting the end of the connection with a FIN, ACK TCP packet and that the client acknowledges it. The problem is this connection is still taken from the pool to send another GET request after the first FIN, ACK TCP packet.

Maybe the remote server is closing connections after they've served a number of requests; in any case it's perfectly legal behavior. Note that I'm not reproducing this consistently.

Workaround

You can disable connection pooling on the client; this will be slower and apparently doesn't trigger this issue. For that, use the following:

this.client = WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(new Consumer<HttpClientOptions.Builder>() {
                    @Override
                    public void accept(HttpClientOptions.Builder builder) {
                        builder.disablePool();
                    }
                }))
                .build();

Underlying issue

The root problem is that the HTTP client should not onComplete when the TCP connection is closed without sending a response. Or better, the HTTP client should not reuse a connection while it's being closed. I'll report back here when I'll know more.