PHP - json_encode a generator object (using yield)

2019-04-21 00:53发布

I have a very large array in PHP (5.6), generated dynamically, which I want to convert to JSON. The problem is that the array is too large that it doesn't fit in memory - I get a fatal error when I try to process it (exhausted memory). So I figured out that, using generators, the memory problem will disappear.

This is the code I've tried so far (this reduced example obvisously doesn't produce the memory error):

<?php 
function arrayGenerator()// new way using generators
{
    for ($i = 0; $i < 100; $i++) {
        yield $i;
    }
}

function getArray()// old way, generating and returning the full array
{
    $array = [];
    for ($i = 0; $i < 100; $i++) {
        $array[] = $i;
    }
    return $array;
}

$object = [
    'id' => 'foo',
    'type' => 'blah',
    'data' => getArray(),
    'gen'  => arrayGenerator(),
];

echo json_encode($object);

But PHP seems to not JSON-encode the values from the generator. This is the output I get from the previuos script:

{
    "id": "foo",
    "type": "blah",
    "data": [// old way - OK
        0,
        1,
        2,
        3,
        //...
    ],
    "gen": {}// using generator - empty object!
}

Is it even possible to JSON-encode an array produced by a generator without generating the full sequence before I call to json_encode?

2条回答
Root(大扎)
2楼-- · 2019-04-21 01:18

Unfortunately, json_encode cannot generate a result from a generator function. Using iterator_to_array will still try to create the whole array, which will still cause memory issues.

You will need to create your function that will generate the json string from the generator function. Here's an example of how that could look:

function json_encode_generator(callable $generator) {
    $result = '[';

    foreach ($generator as $value) {
        $result .= json_encode($value) . ',';
    }

    return trim($result, ',') . ']';
}

Instead of encoding the whole array at once, it encodes only one object at a time and concatenates the results into one string.

The above example only takes care of encoding an array, but it can be easily extended to recursively encoding whole objects.

If the created string is still too big to fit in the memory, then your only remaining option is to directly use an output stream. Here's how that could look:

function json_encode_generator(callable $generator, $outputStream) {
    fwrite($outputStream, '[');

    foreach ($generator as $key => $value) {
        if ($key != 0) {
            fwrite($outputStream, ','); 
        }

        fwrite($outputStream, json_encode($value));
    }

    fwrite($outputStream, ']');
}

As you can see, the only difference is that we now use fwrite to write to the passed in stream instead of concatenating strings, and we also need to take care of the trailing comma in a different way.

查看更多
地球回转人心会变
3楼-- · 2019-04-21 01:26

What is a generator function?

A generator function is effectively a more compact and efficient way to write an Iterator. It allows you to define a function that will calculate and return values while you are looping over it:

Also as per document from http://php.net/manual/en/language.generators.overview.php

Generators provide an easy way to implement simple iterators without the overhead or complexity of implementing a class that implements the Iterator interface.

A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate. Instead, you can write a generator function, which is the same as a normal function, except that instead of returning once, a generator can yield as many times as it needs to in order to provide the values to be iterated over.

What is yield?

The yield keyword returns data from a generator function:

The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that instead of stopping execution of the function and returning, yield instead provides a value to the code looping over the generator and pauses execution of the generator function.

So in your case to generate expected output you need to iterate output of arrayGenerator() function by using foreach loop or iterator before processind it to json (as suggested by @apokryfos)

查看更多
登录 后发表回答