Incrementally add requests to a Guzzle 5.0 Pool (R

Posted 2020-06-29 06:28

Question:

I'm using Guzzle to fetch a large number of URLs in parallel (or asynchronously) using a pool:

$client = new GuzzleHttp\Client([
    'base_url' => 'http://httpbin.org',
]);

$requests = [];

for ($i = 0; $i < 8; ++$i) {
    $requests[] = $client->createRequest('GET', '/get');
}

$pool = new GuzzleHttp\Pool($client, $requests, [
    'pool_size' => 4,
    'complete' => function (GuzzleHttp\Event\CompleteEvent $event) {
        var_dump($event->getRequest()->getUrl());
    },
]);

$pool->wait();

var_dump(count($requests));

If I run the above in the console it displays the expected output:

string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
int(8)

Now I would like to add further requests to the same pool based on some condition. I believe this behavior is usually known as rolling [parallel] requests, but after reading and re-reading the documentation I haven't managed to figure it out. Here's something I tried:

$client = new GuzzleHttp\Client([
    'base_url' => 'http://httpbin.org',
]);

$requests = [];

for ($i = 0; $i < 8; ++$i) {
    $requests[] = $client->createRequest('GET', '/get');
}

$i = 0;
$pool = new GuzzleHttp\Pool($client, $requests, [
    'pool_size' => 4,
    'complete' => function (GuzzleHttp\Event\CompleteEvent $event) use (&$i, $client, &$requests) {
        var_dump($event->getRequest()->getUrl());

        if (++$i % 3 == 0) {
            $requests[] = $client->createRequest('GET', '/ip');
        }
    },
]);

$pool->wait();

var_dump(count($requests));

Every third completed request to /get should add a new request to /ip. The $requests array does grow (to 10 elements, not the 11 that would be expected if the added requests also ran), but the added requests are never executed. Is there a way to make a Guzzle pool execute requests that are added after initialization?

Answer 1:

It is possible. See my comment on the Guzzle issue "Suggestions to GuzzleHttp\Pool" (#946) for a full example, or this gist for a more in-depth comparison of generator, retry, and sequential sends with Guzzle.

Regarding your example, see my inline comments:

$client = new GuzzleHttp\Client([
    'base_url' => 'http://httpbin.org',
]);

$requests = [];

for ($i = 0; $i < 8; ++$i) {
    $requests[] = $client->createRequest('GET', '/get');
}

$generator = new ArrayIterator($requests); // use an iterator instead of an array

$i = 0;
$pool = new GuzzleHttp\Pool($client, $generator, [ // use the iterator in the pool
    'pool_size' => 4,
    'complete' => function (GuzzleHttp\Event\CompleteEvent $event) use (&$i, $client, &$generator) {
        var_dump($event->getRequest()->getUrl());

        if (++$i % 3 == 0) {
            $generator->append($client->createRequest('GET', '/ip')); // append new requests on the fly
        }
    },
]);

$pool->wait();

This yields your expected output:

string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(22) "http://httpbin.org/get"
string(21) "http://httpbin.org/ip"
string(21) "http://httpbin.org/ip"
string(21) "http://httpbin.org/ip"

Note that the new requests are appended at the end of the queue. This differs from AbstractRetryableEvent::retry, which inserts the retried request somewhere in the middle of the current queue instead of appending it at the end.
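
The difference between the two attempts can be reproduced without Guzzle. The sketch below is only an illustration under assumptions: drain() is a hypothetical stand-in for the pool (it simply walks an iterator and fires a callback after each item), and the strings '/get' and '/ip' stand in for request objects. It assumes the original attempt fails because PHP arrays are passed by value, so whatever iterates the requests works on its own copy, whereas an ArrayIterator is an object shared by the loop and the callback.

```php
<?php
// Hypothetical stand-in for a pool: walks an iterator one item at a time and
// calls $onComplete after each item. No Guzzle or network access is involved.
function drain(Iterator $queue, callable $onComplete)
{
    $processed = [];
    for ($queue->rewind(); $queue->valid(); $queue->next()) {
        $processed[] = $queue->current();
        $onComplete();
    }
    return $processed;
}

// Case 1: append to the original array, as in the question.
$requests = array_fill(0, 8, '/get');
$copy = new ArrayIterator($requests); // receives the array by value
$i = 0;
$fromArray = drain($copy, function () use (&$i, &$requests) {
    if (++$i % 3 == 0) {
        $requests[] = '/ip'; // grows $requests, not the iterator's copy
    }
});

// Case 2: append to the iterator itself, as in the answer.
$queue = new ArrayIterator(array_fill(0, 8, '/get'));
$i = 0;
$fromIterator = drain($queue, function () use (&$i, $queue) {
    if (++$i % 3 == 0) {
        $queue->append('/ip'); // seen by the loop that is still running
    }
});

var_dump(count($requests));     // int(10): the array grew twice
var_dump(count($fromArray));    // int(8): but only the original items ran
var_dump(count($fromIterator)); // int(11): 8 x '/get' followed by 3 x '/ip'
```

The counts mirror the numbers in the question and answer: in the first case only the 3rd and 6th completions append, giving 10 array elements and 8 processed items. In the second case the appended items are processed too, so the 9th completion appends a third '/ip', and all three come out after the eight '/get' entries.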