Neo4jPhp too slow

Posted 2019-09-17 15:53

Question:

Today I wrote my first basic program for Neo4j in PHP. The goal was to check whether we could use Neo4j from PHP in our new project by using Neo4jPhp. https://github.com/jadell/neo4jphp

Here is my code:

<!DOCTYPE html>
<html>
<body>

<h1>My first PHP page</h1>

<?php

include 'neo4jphp.phar';
echo "Hello World!";

// Connecting to the default port 7474 on localhost
$client = new Everyman\Neo4j\Client();
$queryString =
    "MATCH (n) " .
    "RETURN n";
$query = new Everyman\Neo4j\Cypher\Query($client, $queryString);
$result = $query->getResultSet();


foreach ($result as $row) {
    echo $row['n']->getProperty('name') . "\n";
}

?>

</body>
</html>

Here I am just retrieving all the nodes and printing one property of each. Pretty simple.

If I run this query from the graphical console of Neo4j, it takes 86 ms. I have only 200 nodes, all with nearly the same properties.

match (n)
return n


Returned 50 rows in 86 ms

If I run it from the PHP file above, it takes 2-4 seconds in total to dump the data in the browser. Neo4j is running on the same machine.

Please note that I have not changed the configuration of either PHP or Neo4j. Everything is default. Please tell me whether this is the expected behaviour of Neo4j with PHP, or whether something is really wrong with my code or configuration.

Thanks a lot.

Answer 1:

I saw your question in the Neo4j Google Group as well, where I asked whether you could measure the execution time in PHP if, instead of

echo $row['n']->getProperty('name') . "\n";

you use

print_r($result);

Let me explain why. When I started to play around with Neo4j and PHP, I had some concerns about PHP's speed. I recreated your issue as follows. First I created 200 random nodes. Each node has a label and 10 properties, and each property has a 10-character value. This is the script I used:

for ($x = 1; $x <= 200; $x++) {
    $queryString = "CREATE (n:User { name : '".substr(md5(rand()), 0, 10)."' , city : '".substr(md5(rand()), 0, 10)."' , date : '".substr(md5(rand()), 0, 10)."', age : '".substr(md5(rand()), 0, 10)."', country : '".substr(md5(rand()), 0, 10)."', language : '".substr(md5(rand()), 0, 10)."', origin : '".substr(md5(rand()), 0, 10)."', preference : '".substr(md5(rand()), 0, 10)."', color : '".substr(md5(rand()), 0, 10)."', graduate : '".substr(md5(rand()), 0, 10)."'})";
    $query = new Everyman\Neo4j\Cypher\Query($client, $queryString);
    $result = $query->getResultSet();
}

I then read the results with a foreach loop, as you did:

foreach ($result as $row) {
    echo $row['n']->getProperty('name') . "\n";
}

and I measured the execution time using this code:

$time_start = microtime(true);

$queryString = "MATCH (n) RETURN n";
$query = new Everyman\Neo4j\Cypher\Query($client, $queryString);
$result = $query->getResultSet();

foreach ($result as $row) {
    echo $row['n']->getProperty('name') . "\n";
}

$time_end = microtime(true);


$execution_time = ($time_end - $time_start)*1000;

//execution time of the script
echo '<b>Total Execution Time:</b> '.$execution_time.' ms';

With 200 nodes, both webadmin and the PHP script took around 85 ms. That is too little data for accurate results, so I increased the node count to 500, and execution time went up to 115 ms for both webadmin and the PHP script. At 2000 nodes, execution time was 200 ms, still with no significant difference between webadmin and PHP. Finally I went up to 10000 nodes, and here the numbers diverge: webadmin returns the 10000 nodes in 1020 ms, while PHP is noticeably slower.

Total Execution Time: 1635.6329917908 ms

That is not what I expected. Instead of echoing each row in the foreach loop, I called print_r on the results, and the time increased to

Total Execution Time: 2452.4049758911 ms

So let's not print all the properties on screen. Instead, return the nodes together with count(n) and print only the count for each row, which will be "1".

$queryString = "MATCH (n) RETURN n AS n, count(n) AS x";
$query = new Everyman\Neo4j\Cypher\Query($client, $queryString);
$result = $query->getResultSet();

foreach ($result as $row) {
    echo $row['x'];
}

The output of the above code looks something like this:

1111111111111111111111...... Total Execution Time: 1084.1178894043 ms

As you can see, PHP and webadmin return 10000 results in about the same time (for 10000 nodes I do not think 60 ms is a major difference). To conclude this long answer: with PHP and Neo4j we do not lose time retrieving a large amount of data, but we lose a lot of time rendering that data in the browser from PHP.
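
One way to check this conclusion on your own setup is to time the fetch and the rendering as two separate phases. The following is only a minimal sketch, assuming PHP 7.1 or later: the timed() helper and the two closures are hypothetical names, and the fetch closure generates stand-in rows rather than calling Neo4j, so in real code its body would be the $query->getResultSet() call.

```php
<?php
// Hypothetical helper: runs $fn and returns [its return value, elapsed ms].
function timed(callable $fn): array
{
    $start = microtime(true);
    $value = $fn();
    return [$value, (microtime(true) - $start) * 1000];
}

// Stand-in for $query->getResultSet(): 10000 fake rows, no Neo4j involved.
$fetch = function (): array {
    $rows = [];
    for ($i = 0; $i < 10000; $i++) {
        $rows[] = ['name' => substr(md5((string) $i), 0, 10)];
    }
    return $rows;
};

// Render into an output buffer so the markup is collected as one string
// instead of being sent to the client in 10000 small writes.
$render = function (array $rows): string {
    ob_start();
    foreach ($rows as $row) {
        echo $row['name'], "\n";
    }
    return ob_get_clean();
};

[$rows, $fetchMs] = timed($fetch);
[$html, $renderMs] = timed(function () use ($render, $rows) {
    return $render($rows);
});

printf(
    "Fetched %d rows in %.1f ms, rendered %d bytes in %.1f ms\n",
    count($rows),
    $fetchMs,
    strlen($html),
    $renderMs
);
```

If the fetch phase stays close to the webadmin time while the render phase grows with the number of rows, the overhead is on the output side rather than in the query.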



Answer 2:

Can you debug and measure how long the REST request to the Neo4j server actually takes? It should be something like 86 ms; the rest of the time would then be spent in the PHP code. Also, please use parameters so you don't incur the overhead of creating query plans for repeated Cypher queries.
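
As a rough illustration of the parameter suggestion, the CREATE loop from the first answer could keep one constant query text and pass the property values separately. This is a sketch under assumptions: randomValue() and randomUserProps() are hypothetical helpers, the {props} placeholder is the legacy Cypher parameter syntax (newer Neo4j versions use $props), and passing the parameter array as the third constructor argument of Everyman\Neo4j\Cypher\Query is assumed rather than confirmed in this thread, so that call is left as a comment.

```php
<?php
// Hypothetical helper: one 10-character random hex string,
// matching the substr(md5(rand()), 0, 10) values used in the first answer.
function randomValue(): string
{
    return substr(md5((string) mt_rand()), 0, 10);
}

// Hypothetical helper: builds the property map for one node.
function randomUserProps(array $keys): array
{
    $props = [];
    foreach ($keys as $key) {
        $props[$key] = randomValue();
    }
    return $props;
}

// The query text never changes, so the server can reuse its plan.
// {props} is the legacy parameter syntax (an assumption about the server version).
$template = 'CREATE (n:User {props})';

$keys = ['name', 'city', 'date', 'age', 'country',
         'language', 'origin', 'preference', 'color', 'graduate'];

$params = ['props' => randomUserProps($keys)];

// Assumed neo4jphp usage, not executed here:
// $query = new Everyman\Neo4j\Cypher\Query($client, $template, $params);
// $result = $query->getResultSet();

echo $template, "\n", json_encode($params), "\n";
```

Besides plan reuse, sending the values as parameters avoids building Cypher by string concatenation, which matters once the values come from user input.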