Is there a way to detect circular arrays in pure P

Posted 2020-05-19 05:50

Question:

I'm trying to implement my own serialization / var_dump style function in PHP. It seems impossible if there is the possibility of circular arrays (which there is).

In recent PHP versions, var_dump seems to detect circular arrays:

php > $a = array();
php > $a[] = &$a;
php > var_dump($a);
array(1) {
  [0]=>
  &array(1) {
    [0]=>
    *RECURSION*
  }
}

How would I implement my own serialization-style function in PHP that can detect recursion in a similar way? I can't just keep track of which arrays I've visited, because strict comparison of arrays in PHP returns true for different arrays that contain the same elements, and comparing circular arrays causes a fatal error anyway.

php > $b = array(1,2);
php > $c = array(1,2);
php > var_dump($b === $c);
bool(true)
php > $a = array();
php > $a[] = &$a;
php > var_dump($a === $a);
PHP Fatal error:  Nesting level too deep - recursive dependency? in php shell code on line 1

I've looked for a way to find a unique id (pointer) for an array, but I can't find one. spl_object_hash only works on objects, not arrays. If I cast multiple different arrays to objects they all get the same spl_object_hash value (why?).

EDIT:

Calling print_r, var_dump, or serialize on each array and then using some mechanism to detect the presence of recursion as detected by those methods is an algorithmic complexity nightmare and will basically render any use too slow to be practical on large nested arrays.

ACCEPTED ANSWER:

I accepted the answer below that was the first to suggest temporarily altering an array to see whether it is in fact the same array as another one. That answers the question "how do I compare two arrays for identity?", and once that is possible, recursion detection is trivial.
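A minimal sketch of that identity check, assuming both arrays are passed by reference. The name isSameArray is hypothetical and does not come from any of the answers; it only illustrates the "temporarily alter one array and look at the other" idea.

```php
<?php
// Hypothetical helper (not from the thread): returns true when $a and $b
// are bound to the same underlying array, false when they are merely equal.
function isSameArray(array &$a, array &$b): bool
{
    // A fresh object is a unique marker: === on objects compares identity.
    $marker = new stdClass();

    // Temporarily alter $a ...
    $a[] = $marker;

    // ... and check whether the change is visible through $b.
    $same = ($b !== [] && end($b) === $marker);

    // Restore $a. array_pop() also resets the internal array pointer.
    array_pop($a);

    return $same;
}

$x = array(1, 2);
$y = array(1, 2);   // equal contents, different array
$z = &$x;           // reference to the same array

var_dump(isSameArray($x, $y));
var_dump(isSameArray($x, $z));
```

Note that end() moves the internal pointer of $b, so this sketch assumes the caller does not rely on that pointer.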

Answer 1:

The isRecursiveArray(array) method below detects circular/recursive arrays. It keeps track of which arrays have been visited by temporarily adding an element containing a known object reference to the end of the array.

If you want help writing the serialization method, please update your topic question and provide a sample serialization format in your question.

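// Removes the temporary marker again, but only if it is still the last element.
function removeLastElementIfSame(array & $array, $reference) {
    if(end($array) === $reference) {
        unset($array[key($array)]);
    }
}

function isRecursiveArrayIteration(array & $array, $reference) {
    // If the marker is already the last element, this array is being visited
    // further up the call stack, so we have come back around to it.
    $last_element   = end($array);
    if($reference === $last_element) {
        return true;
    }
    // Mark this array as "being visited".
    $array[]    = $reference;

    // Iterate by reference so nested arrays are marked in place, not as copies.
    foreach($array as &$element) {
        if(is_array($element)) {
            if(isRecursiveArrayIteration($element, $reference)) {
                removeLastElementIfSame($array, $reference);
                return true;
            }
        }
    }

    removeLastElementIfSame($array, $reference);

    return false;
}

function isRecursiveArray(array $array) {
    // Objects compare by identity with ===, so a fresh object is a unique marker.
    $some_reference = new stdClass();
    return isRecursiveArrayIteration($array, $some_reference);
}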



$array      = array('a','b','c');
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = $array;
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = &$array;
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = &$array;
$array      = array($array);
var_dump(isRecursiveArray($array));
print_r($array);
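The same marker trick can be carried over from a yes/no check to the var_dump-style output the question asks for. The following is only a sketch under a few assumptions: renderValue is a hypothetical name, the output format is made up for illustration, leaf values are assumed to be scalars or null, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical renderer (not from the answer above): prints nested arrays and
// emits *RECURSION* when it reaches an array that is already being rendered.
function renderValue(&$value, stdClass $marker, int $depth = 0): string
{
    $pad = str_repeat('  ', $depth);

    if (!is_array($value)) {
        // Assumption: leaves are scalars or null.
        return $pad . var_export($value, true) . "\n";
    }

    // The marker at the end means this array is an ancestor of itself.
    if ($value !== [] && end($value) === $marker) {
        return $pad . "*RECURSION*\n";
    }

    $value[] = $marker;             // mark as "being rendered"
    $out = $pad . "array(\n";

    foreach ($value as $key => &$item) {
        if ($item === $marker) {
            continue;               // do not print the temporary marker
        }
        $out .= $pad . '  [' . var_export($key, true) . "] =>\n"
              . renderValue($item, $marker, $depth + 2);
    }
    unset($item);                   // break the foreach reference

    array_pop($value);              // remove the marker again
    return $out . $pad . ")\n";
}

$array   = array('a', 'b', 'c');
$array[] = &$array;
echo renderValue($array, new stdClass());
```

Because the marker is removed on the way back up, an array that is merely referenced twice from sibling positions is rendered twice and is not reported as recursion.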


Answer 2:

A funny method (I know it is stupid :)), but you can modify it to track the "path" to the recursive element. This is just an idea :) It relies on a property of the serialized string: the serialized form of the element where recursion starts will be the same as the string for the original array. I tried it on many different variations; something might still be able to 'fool' it, but it 'detects' all of the recursions listed. I did not try recursive arrays containing objects.

$a = array('b1'=>'a1','b2'=>'a2','b4'=>'a3','b5'=>'R:1;}}}');
$a['a1'] = &$a;
$a['b6'] = &$a;
$a['b6'][] = array(1,2,&$a);
$b = serialize($a); 
print_r($a);
function WalkArrayRecursive(&$array_name, &$temp){
    if (is_array($array_name)){
        foreach ($array_name as $k => &$v){
           if (is_array($v)){
                if (strpos($temp, preg_replace('#R:\d+;\}+$#', '', 
                               serialize($v)))===0) 
                { 
                  echo "\n Recursion detected at " . $k ."\n"; 
                  continue; 
                }
                WalkArrayRecursive($v, $temp);
            }
        }
    }
}
WalkArrayRecursive($a, $b);

The regexp handles the situation where the recursive element is at the 'end' of the array. And yes, this only detects recursion that refers to the whole array. Sub-elements can be recursive too, but it is too late for me to think about them. Somehow every element of the array would have to be checked for recursion in its own sub-elements, either in the same way as above, through the output of the print_r function, or by looking for the specific recursion record in the serialized string (something like R:4;}). Tracing would then start from that element, comparing everything below it with my script. All of that is needed only if you want to detect where recursion starts, not just whether you have it or not.

PS: I think the best approach would be to write your own unserialize function that works on the serialized string created by PHP itself.



Answer 3:

My approach is to keep a temporary array that holds every object that has already been iterated, like this:

// We use this to detect recursion.
global $recursion;
$recursion = [];

function dump( $data, $label, $level = 0 ) {
    global $recursion;

    // Some nice output for debugging/testing...
    echo "\n";
    echo str_repeat( "  ", $level );
    echo $label . " (" . gettype( $data ) . ") ";

    // -- start of our recursion detection logic
    if ( is_object( $data ) ) {
        foreach ( $recursion as $done ) {
            if ( $done === $data ) {
                echo "*RECURSION*";
                return;
            }
        }

        // This is the key-line: Remember that we processed this item!
        $recursion[] = $data;
    }
    // -- end of recursion check

    if ( is_array( $data ) || is_object( $data ) ) {
        foreach ( (array) $data as $key => $item ) {
            dump( $item, $key, $level + 1 );
        }
    } else {
        echo "= " . $data;
    }
}

And here is some quick demo code to illustrate how it works:

$obj = new StdClass();
$obj->arr = [];
$obj->arr[] = 'Foo';
$obj->arr[] = $obj;
$obj->arr[] = 'Bar';
$obj->final = 12345;
$obj->a2 = $obj->arr;

dump( $obj, 'obj' );

This script will generate the following output:

obj (object) 
  arr (array) 
    0 (string) = Foo
    1 (object) *RECURSION*
    2 (string) = Bar
  final (integer) = 12345
  a2 (array) 
    0 (string) = Foo
    1 (object) *RECURSION*
    2 (string) = Bar
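A possible variation on the function above, as a sketch only: it keeps the visited set in a parameter instead of a global, keys it by spl_object_id() (available from PHP 7.2) so the lookup does not scan a list, and forgets an object once its subtree is finished. With that last change, an object that is simply shared between two properties is printed twice rather than flagged. The name describe is hypothetical. Like the original, it only tracks objects and would not stop on an array that contains a reference to itself.

```php
<?php
// Hypothetical variant (not the answer's code): $path holds the ids of the
// objects on the current branch only, so just real cycles are reported.
function describe($data, string $label, array &$path = [], int $level = 0): string
{
    $line = str_repeat('  ', $level) . $label . ' (' . gettype($data) . ')';

    if (is_object($data)) {
        $id = spl_object_id($data);
        if (isset($path[$id])) {
            return $line . " *RECURSION*\n";
        }
        $path[$id] = true;          // remember: this object is being processed
    }

    if (is_array($data) || is_object($data)) {
        $out = $line . "\n";
        foreach ((array) $data as $key => $item) {
            $out .= describe($item, (string) $key, $path, $level + 1);
        }
        if (is_object($data)) {
            unset($path[$id]);      // leaving this branch again
        }
        return $out;
    }

    // Assumption: remaining values can be converted to a string.
    return $line . ' = ' . $data . "\n";
}

$obj = new stdClass();
$obj->arr = ['Foo', $obj, 'Bar'];
$obj->final = 12345;

echo describe($obj, 'obj');
```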


Answer 4:

It's not elegant, but it solves your problem (at least if you don't have someone using *RECURSION* as a value).

<?php
$a[] = &$a;
if(strpos(print_r($a,1),'*RECURSION*') !== FALSE) echo 1;
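Wrapped in a function, the check and its caveat can be seen side by side. This is a sketch only; containsRecursionMarker is a hypothetical name. It renders the whole array with print_r on every call, which is the cost the question's EDIT objects to for large nested arrays.

```php
<?php
// Hypothetical wrapper (not from the answer): true when print_r's output
// contains the *RECURSION* marker.
function containsRecursionMarker(array $array): bool
{
    // Second argument true makes print_r return the text instead of printing it.
    return strpos(print_r($array, true), '*RECURSION*') !== false;
}

$recursive   = array();
$recursive[] = &$recursive;

$plain  = array('a', array('b', 'c'));
$fooled = array('*RECURSION*');     // not recursive, but contains the marker text

var_dump(containsRecursionMarker($recursive));
var_dump(containsRecursionMarker($plain));
var_dump(containsRecursionMarker($fooled));   // false positive
```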