
Convert JSON string to array WITHOUT json_decode

Published 2019-09-06 00:45

Question:

I am using PHP on a shared server to access an external site via an API that returns JSON containing two levels of data (level 1: performer; level 2: a category object inside each performer). I want to convert this to a multidimensional associative array WITHOUT USING the json_decode function (it uses too much memory for this usage!!!)

Example of JSON data:

[
{
    "performerId": 99999,
    "name": " Any performer name",
    "category": {
        "categoryId": 99,
        "name": "Some category name",
        "eventType": "Category Event"
    },
    "eventType": "Performer Event",
    "url": "http://www.novalidsite.com/something/performerspage.html",
    "priority": 0
},
{
    "performerId": 88888,
    "name": " Second performer name",
    "category": {
        "categoryId": 88,
        "name": "Second Category name",
        "eventType": "Category Event 2"
    },
    "eventType": "Performer Event 2",
    "url": "http://www.novalidsite.com/somethingelse/performerspage2.html",
    "priority": 7
}
]

I have tried using substr to strip the leading "[" and trailing "]".

Then I performed this call:

preg_match_all('/\{([^}]+)\}/', $input, $matches);

This gives me a string for each row BUT it truncates after the closing "}" of the nested category data, because [^}]+ stops at the first "}" it finds.

How can I return the FULL ROW of data AS AN ARRAY using something like preg_split, preg_match_all, etc., INSTEAD of a heavy-handed json_decode call on the overall JSON string?

Once I have an array with each row identified correctly, I CAN THEN perform json_decode on each row without overtaxing the memory on the shared server.
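
For what it's worth, a recursive PCRE pattern can match balanced braces, so the nested category object no longer cuts the match short. A rough sketch of that idea (not tested against the real feed, and it still assumes no "{" or "}" appears inside string values):

// (?R) recurses the whole pattern, so nested { ... } pairs are swallowed
// and each top-level performer object is captured as one match.
preg_match_all('/\{(?:[^{}]++|(?R))*\}/', $input, $matches);

foreach ($matches[0] as $row) {
    $performer = json_decode($row, true); // decode one small row at a time
    // ... work with $performer ...
}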


For those wanting more detail about the json_decode usage that causes the error:

// Fetch the raw JSON response from the API as one large string
$aryPerformersfile[] = file_get_contents('https://subdomain.domain.com/dir/getresults?id=1234');
$aryPerformers = $aryPerformersfile[0];
unset($aryPerformersfile);

// This call exhausts the memory limit on the shared server
$mytmpvar = json_decode($aryPerformers);
print_r($mytmpvar);
exit;

Answer 1:

If you have a limited amount of memory, you could read the data as a stream and parse the JSON one piece at a time, instead of parsing everything at once.

getresults.json:

[
    {
        "performerId": 99999,
        "name": " Any performer name",
        "category": {
            "categoryId": 99,
            "name": "Some category name",
            "eventType": "Category Event"
        },
        "eventType": "Performer Event",
        "url": "http://www.novalidsite.com/something/performerspage.html",
        "priority": 0
    },
    {
        "performerId": 88888,
        "name": " Second performer name",
        "category": {
            "categoryId": 88,
            "name": "Second Category name",
            "eventType": "Category Event 2"
        },
        "eventType": "Performer Event 2",
        "url": "http://www.novalidsite.com/somethingelse/performerspage2.html",
        "priority": 7
    }
]

PHP:

$stream = fopen('getresults.json', 'rb');

// Read one character at a time from $stream until
// $count occurrences of $char have been read
function readUpTo($stream, $char, $count)
{
    $str = '';
    $foundCount = 0;
    while (!feof($stream)) {
        $readChar = stream_get_contents($stream, 1);

        $str .= $readChar;
        if ($readChar == $char && ++$foundCount == $count)
            return $str;
    }
    return false;
}

// Read one JSON performer object
function readOneJsonPerformer($stream)
{
    if ($json = readUpTo($stream, '{', 1))
        return '{' . readUpTo($stream, '}', 2);
    return false;
}

while ($json = readOneJsonPerformer($stream)) {
    $performer = json_decode($json);

    echo 'Performer with ID ' . $performer->performerId
        . ' has category ' . $performer->category->name, PHP_EOL;
}
fclose($stream);

Output:

Performer with ID 99999 has category Some category name
Performer with ID 88888 has category Second Category name

This code could of course be improved, e.g. by reading into a buffer for faster reads and by taking into account that string values may themselves contain { and } characters.
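
As an illustration of the buffering point, one (untested) option is to wrap the stream in a small reader that pulls data in chunks and hands back one character at a time; readUpTo() could then call $reader->next() instead of stream_get_contents($stream, 1). A sketch, with the class name and chunk size chosen here purely for illustration:

// Hypothetical helper: reads the stream in 8 KB chunks, serves single characters.
class BufferedCharReader
{
    private $stream;
    private $buffer = '';
    private $pos = 0;
    private $chunkSize;

    public function __construct($stream, $chunkSize = 8192)
    {
        $this->stream = $stream;
        $this->chunkSize = $chunkSize;
    }

    // Return the next character, or false when the stream is exhausted.
    public function next()
    {
        if ($this->pos >= strlen($this->buffer)) {
            if (feof($this->stream)) {
                return false;
            }
            $this->buffer = (string) fread($this->stream, $this->chunkSize);
            $this->pos = 0;
            if ($this->buffer === '') {
                return false;
            }
        }
        return $this->buffer[$this->pos++];
    }
}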



Answer 2:

You have two options here, and neither of them involves writing your own decoder; don't over-complicate the solution with an unnecessary work-around.

1) Decrease the size of the JSON being decoded, or 2) increase the allowed memory on your server.

The first option requires access to the JSON as it is being created, which may or may not be possible depending on whether you are the one originally generating it. The easiest approach is to unset() any data you don't need; for example, if there is debug info you won't use, you can call unset($json_array['debug']); on it. See http://php.net/manual/en/function.unset.php

The second option requires access to the php.ini file on your server. Find the line that reads something like memory_limit = 128M and increase the value, e.g. to double what is already there (256M in this case). This might not solve your problem, though: very large JSON data could still be the core issue, and raising the limit only works around inefficient code.
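
If php.ini itself is locked down on shared hosting, some hosts honour a per-request override instead; a minimal sketch (whether it takes effect depends entirely on the host's configuration):

// Ask PHP to raise the limit for this script only; shared hosts may ignore this.
ini_set('memory_limit', '256M');
echo ini_get('memory_limit'), PHP_EOL; // check what actually took effect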