Preferred method to store PHP arrays (json_encode vs serialize)

Posted 2018-12-31 08:59

I need to store a multi-dimensional associative array of data in a flat file for caching purposes. I might occasionally come across the need to convert it to JSON for use in my web app but the vast majority of the time I will be using the array directly in PHP.

Would it be more efficient to store the array as JSON or as a PHP serialized array in this text file? I've looked around and it seems that in the newest versions of PHP (5.3), json_decode is actually faster than unserialize.

I'm currently leaning towards storing the array as JSON: it's easier for a human to read if necessary, it can be used in both PHP and JavaScript with very little effort, and from what I've read it might even be faster to decode (not sure about encoding, though).
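For what it's worth, the flat-file round trip being weighed here takes only a few lines either way. A minimal sketch (the /tmp paths are illustrative):

```php
<?php
// Sketch of the flat-file cache in both candidate formats.
$data = ['user' => ['name' => 'Frodo', 'roles' => ['hobbit']]];

// JSON: human-readable, directly usable from JavaScript.
file_put_contents('/tmp/cache.json', json_encode($data));
$fromJson = json_decode(file_get_contents('/tmp/cache.json'), true);

// PHP native serialization: PHP-only, preserves PHP types exactly.
file_put_contents('/tmp/cache.ser', serialize($data));
$fromSer = unserialize(file_get_contents('/tmp/cache.ser'));

var_dump($fromJson === $data); // bool(true)
var_dump($fromSer === $data);  // bool(true)
```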

Does anyone know of any pitfalls? Anyone have good benchmarks to show the performance benefits of either method?

19 Answers
人气声优 · 2018-12-31 09:00

JSON is simpler and faster than PHP's serialization format and should be used unless:

  • You're storing deeply nested arrays: json_decode() has a nesting-depth limit (the docs once stated: "This function will return false if the JSON encoded data is deeper than 127 elements."; newer PHP versions expose a $depth parameter, defaulting to 512)
  • You're storing objects that need to be unserialized as the correct class
  • You're interacting with old PHP versions that don't support json_decode
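The second bullet is the one that bites most often in practice. A minimal sketch (the class name User is illustrative) showing that only serialize() restores the original class:

```php
<?php
// serialize() round-trips an object back to its original class,
// while json_encode()/json_decode() loses it (you get a stdClass).
class User {
    public $name = 'Frodo';
}

$u = new User();

$fromSerialize = unserialize(serialize($u));
var_dump($fromSerialize instanceof User);   // bool(true)

$fromJson = json_decode(json_encode($u));
var_dump($fromJson instanceof User);        // bool(false): it is a stdClass
```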
回忆,回不去的记忆 · 2018-12-31 09:01

I've tested this very thoroughly on a fairly complex, mildly nested multi-hash containing all kinds of data (strings, NULL, integers), and serialize/unserialize ended up much faster than json_encode/json_decode.

The only advantage JSON had in my tests was its smaller 'packed' size.

These tests were run under PHP 5.3.3; let me know if you want more details.

Here are the test results, followed by the code that produced them. I can't provide the test data, since it would reveal information I can't let out into the wild.

JSON encoded in 2.23700618744 seconds
PHP serialized in 1.3434419632 seconds
JSON decoded in 4.0405561924 seconds
PHP unserialized in 1.39393305779 seconds

serialized size : 14549
json_encode size : 11520
serialize() was roughly 66.51% faster than json_encode()
unserialize() was roughly 189.87% faster than json_decode()
json_encode() string was roughly 26.29% smaller than serialize()

//  $test holds the test data array (withheld by the author)

//  Time json encoding
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    json_encode( $test );
}
$jsonTime = microtime( true ) - $start;
echo "JSON encoded in $jsonTime seconds<br>";

//  Time serialization
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    serialize( $test );
}
$serializeTime = microtime( true ) - $start;
echo "PHP serialized in $serializeTime seconds<br>";

//  Time json decoding
$test2 = json_encode( $test );
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    json_decode( $test2 );
}
$jsonDecodeTime = microtime( true ) - $start;
echo "JSON decoded in $jsonDecodeTime seconds<br>";

//  Time deserialization
$test2 = serialize( $test );
$start = microtime( true );
for($i = 0; $i < 10000; $i++) {
    unserialize( $test2 );
}
$unserializeTime = microtime( true ) - $start;
echo "PHP unserialized in $unserializeTime seconds<br>";

$jsonSize = strlen(json_encode( $test ));
$phpSize = strlen(serialize( $test ));

echo "<p>serialized size : " . $phpSize . "<br>";
echo "json_encode size : " . $jsonSize . "<br></p>";

//  Compare them
if ( $jsonTime < $serializeTime )
{
    echo "json_encode() was roughly " . number_format( ($serializeTime / $jsonTime - 1 ) * 100, 2 ) . "% faster than serialize()";
}
else if ( $serializeTime < $jsonTime )
{
    echo "serialize() was roughly " . number_format( ($jsonTime / $serializeTime - 1 ) * 100, 2 ) . "% faster than json_encode()";
} else {
    echo 'Unpossible!';
}
echo '<BR>';

//  Compare them
if ( $jsonDecodeTime < $unserializeTime )
{
    echo "json_decode() was roughly " . number_format( ($unserializeTime / $jsonDecodeTime - 1 ) * 100, 2 ) . "% faster than unserialize()";
}
else if ( $unserializeTime < $jsonDecodeTime )
{
    echo "unserialize() was roughly " . number_format( ($jsonDecodeTime / $unserializeTime - 1 ) * 100, 2 ) . "% faster than json_decode()";
} else {
    echo 'Unpossible!';
}
echo '<BR>';
//  Compare them
if ( $jsonSize < $phpSize )
{
    echo "json_encode() string was roughly " . number_format( ($phpSize / $jsonSize - 1 ) * 100, 2 ) . "% smaller than serialize()";
}
else if ( $phpSize < $jsonSize )
{
    echo "serialize() string was roughly " . number_format( ($jsonSize / $phpSize - 1 ) * 100, 2 ) . "% smaller than json_encode()";
} else {
    echo 'Unpossible!';
}
裙下三千臣 · 2018-12-31 09:05

JSON is better if you want to back up data and restore it on a different machine or transfer it via FTP.

For example, with serialize, if you store data on a Windows server, download it via FTP, and restore it on a Linux one, it may no longer work due to character re-encoding: serialize() stores the byte length of each string, and in a Unicode > UTF-8 transcoding some 1-byte characters can become 2 bytes long, breaking the algorithm.
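A minimal sketch of that failure mode (the string and encodings are illustrative): serialize() embeds each string's byte length, so any transcoding of the stored file invalidates it, while json_encode() escapes non-ASCII characters to \uXXXX by default and so survives re-encoding.

```php
<?php
// serialize() records byte lengths: 'café' is 5 bytes in UTF-8.
$payload = serialize('café');
echo $payload, "\n"; // s:5:"café";

// Simulate a transfer that re-encodes the file (UTF-8 -> ISO-8859-1):
// 'é' shrinks from 2 bytes to 1, so the recorded length no longer matches.
$transcoded = mb_convert_encoding($payload, 'ISO-8859-1', 'UTF-8');
var_dump(@unserialize($transcoded)); // bool(false)

// json_encode() produces "caf\u00e9": pure ASCII, immune to this transcoding.
var_dump(json_decode(json_encode('café')) === 'café'); // bool(true)
```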

临风纵饮 · 2018-12-31 09:06

Really nice topic, and after reading the few answers I want to share my experiments on the subject.

I have a use case where a "huge" table needs to be queried almost every time I talk to the database (don't ask why, it's just a fact). The database caching system isn't appropriate, as it won't cache the different requests, so I thought about PHP caching systems.

I tried APCu, but it didn't fit the needs; memory isn't reliable enough in this case. The next step was to cache into a file with serialization.

The table has 14355 entries with 18 columns; these are my tests and stats on reading the serialized cache:

JSON:

As you all said, the major inconvenience with json_encode/json_decode is that it transforms everything into stdClass instances (objects). If you need to loop over it, you'll probably convert it to an array, and yes, that increases the transformation time.

average time: 780.2 ms; memory use: 41.5MB; cache file size: 3.8MB

Msgpack

@hutch mentions msgpack. Pretty website. Let's give it a try, shall we?

average time: 497 ms; memory use: 32MB; cache file size: 2.8MB

That's better, but it requires a new extension; compiling sometimes scares people away...

IgBinary

@GingerDog mentions igbinary. Note that I set igbinary.compact_strings=Off because I care more about read performance than file size.

average time: 411.4 ms; memory use: 36.75MB; cache file size: 3.3MB

Better than msgpack. Still, this one requires compiling too.

serialize/unserialize

average time: 477.2 ms; memory use: 36.25MB; cache file size: 5.9MB

Better performance than JSON: the bigger the array, the slower json_decode gets, but you already knew that.

Those external extensions narrow down the file size and seem great on paper. Numbers don't lie*. But what's the point of compiling an extension if you get almost the same results as with a standard PHP function?

We can also deduce that, depending on your needs, you may choose differently than someone else:

  • IgBinary is really nice and performs better than msgpack
  • msgpack is better at compressing your data (note that I didn't try the igbinary compact_strings option).
  • Don't want to compile? Use the standards.

That's it: another comparison of serialization methods to help you choose the right one!

*Tested with PHPUnit 3.7.31, PHP 5.5.10; decoding only, on a standard hard drive and an old dual-core CPU; average numbers over 10 identical test runs. Your stats might differ.
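The cache layer benchmarked above might be sketched like this; the function and file names are hypothetical, and it falls back to plain serialize() when the igbinary extension isn't compiled in:

```php
<?php
// Hypothetical file cache: prefer igbinary when the extension is loaded,
// otherwise fall back to PHP's built-in serialize()/unserialize().
function cache_write(string $file, $data): void {
    $blob = function_exists('igbinary_serialize')
        ? igbinary_serialize($data)
        : serialize($data);
    file_put_contents($file, $blob, LOCK_EX);
}

function cache_read(string $file) {
    $blob = file_get_contents($file);
    return function_exists('igbinary_unserialize')
        ? igbinary_unserialize($blob)
        : unserialize($blob);
}

$rows = [['id' => 1, 'name' => 'a'], ['id' => 2, 'name' => 'b']];
cache_write('/tmp/table.cache', $rows);
var_dump(cache_read('/tmp/table.cache') === $rows); // bool(true)
```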

有味是清欢 · 2018-12-31 09:06

Before you make your final decision, be aware that a plain json_decode() call does not round-trip associative arrays; it returns them as objects instead:

$config = array(
    'Frodo'   => 'hobbit',
    'Gimli'   => 'dwarf',
    'Gandalf' => 'wizard',
    );
print_r($config);
print_r(json_decode(json_encode($config)));

Output is:

Array
(
    [Frodo] => hobbit
    [Gimli] => dwarf
    [Gandalf] => wizard
)
stdClass Object
(
    [Frodo] => hobbit
    [Gimli] => dwarf
    [Gandalf] => wizard
)
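That said, the conversion is easy to avoid: json_decode() takes a second parameter which, when true, returns associative arrays instead of stdClass objects:

```php
<?php
$config = array(
    'Frodo'   => 'hobbit',
    'Gimli'   => 'dwarf',
    'Gandalf' => 'wizard',
    );
// Passing true restores the original Array instead of a stdClass Object.
print_r(json_decode(json_encode($config), true));
```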
千与千寻千般痛. · 2018-12-31 09:06

Just an FYI: if you want to serialize your data to something easy to read and understand like JSON, but with more compression and higher performance, you should check out MessagePack.
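A sketch of what that looks like, assuming the PECL msgpack extension is installed (it exposes msgpack_pack() and msgpack_unpack() as drop-in counterparts to serialize()/unserialize()):

```php
<?php
// Hedged sketch: requires the PECL msgpack extension.
if (function_exists('msgpack_pack')) {
    $data = ['id' => 42, 'tags' => ['a', 'b']];

    $packed = msgpack_pack($data); // compact binary payload
    echo strlen($packed), ' vs ', strlen(serialize($data)), " bytes\n";

    var_dump(msgpack_unpack($packed) === $data); // bool(true)
} else {
    echo "msgpack extension not loaded\n";
}
```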
