Invalid JSON parsing using PHP

Posted 2020-01-26 11:40

I'm pulling a JSON feed that is invalid JSON. It's missing quotes entirely. I've tried a few things, like explode() and str_replace(), to get the string looking a little more like valid JSON, but with a nested associative structure inside, the result generally gets mangled.

Here's an example:

id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5

Are there any JSON parsers for php out there that can handle invalid JSON like this?

Edit: I'm trying to use json_decode() on this string. It returns nothing.
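
For context, json_decode() returns null when it cannot parse its input, which is easy to read as "nothing". A minimal sketch, assuming PHP 5.5 or later for json_last_error_msg(), that shows why the feed is rejected:

```php
<?php
// The feed exactly as received: unquoted keys, single-quoted strings, no outer braces.
$feed = "id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5";

$result = json_decode($feed);

// json_decode() signals failure by returning null; the reason is available separately.
if ($result === null && json_last_error() !== JSON_ERROR_NONE) {
    echo 'Decode failed: ' . json_last_error_msg() . "\n";
}
```

Checking json_last_error() matters because a valid JSON document containing only null also decodes to null.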

Tags: php json
6 answers
贼婆χ
#2 · 2020-01-26 11:54

In my experience, Marko's answer doesn't work anymore. For newer PHP versions, use this instead:

$a = "{id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:988}],classs:0,subclass:5}";
$a = preg_replace('/(,|\{)[ \t\n]*(\w+)[ ]*:[ ]*/','$1"$2":',$a);
$a = preg_replace('/":\'?([^\[\]\{\}]*?)\'?[ \n\t]*(,"|\}$|\]$|\}\]|\]\}|\}|\])/','":"$1"$2',$a);
print_r($a);
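
As a quick check, the rewritten string can be passed to json_decode(). This is only a sketch against the sample input above. Note that the second pattern wraps every scalar in double quotes, so numbers come back as strings.

```php
<?php
$a = "{id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:988}],classs:0,subclass:5}";
// Step 1: put double quotes around the keys.
$a = preg_replace('/(,|\{)[ \t\n]*(\w+)[ ]*:[ ]*/', '$1"$2":', $a);
// Step 2: wrap every scalar value in double quotes, dropping any single quotes.
$a = preg_replace('/":\'?([^\[\]\{\}]*?)\'?[ \n\t]*(,"|\}$|\]$|\}\]|\]\}|\}|\])/', '":"$1"$2', $a);

$data = json_decode($a, true);
var_dump($data['id']);               // a string, not an int
var_dump($data['systems'][0]['n']);
```

If numeric types matter, the values would need a cast after decoding.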
够拽才男人
#3 · 2020-01-26 11:54

I'd say your best bet is to download the source of a JSON decoder (they're not huge) and fiddle with it, especially if you know what's wrong with the JSON you're trying to decode.

The example you provided needs { } around it, too, which may help.

萌系小妹纸
#4 · 2020-01-26 11:56

To become valid JSON, the string needs three fixes:

  1. All quotes must be double quotes ", not single quotes '.
  2. All keys must be quoted.
  3. The whole element must be wrapped in an object.

A function that applies them before decoding:

    function my_json_decode($s) {
        $s = str_replace(
            array('"',  "'"),
            array('\"', '"'),
            $s
        );
        $s = preg_replace('/(\w+):/i', '"\1":', $s);
        return json_decode(sprintf('{%s}', $s));
    }
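
A usage sketch for the function above, run against the sample from the question. The function is repeated so the snippet runs on its own. The second call uses a made-up input to show a limitation: the key pattern also fires inside string values that contain a word followed by a colon.

```php
<?php
function my_json_decode($s) {
    // Escape existing double quotes, then turn single quotes into double quotes.
    $s = str_replace(array('"', "'"), array('\"', '"'), $s);
    // Quote every bare word that is followed by a colon.
    $s = preg_replace('/(\w+):/i', '"\1":', $s);
    // Wrap in braces so the result is a JSON object.
    return json_decode(sprintf('{%s}', $s));
}

$obj = my_json_decode("id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5");
echo $obj->name, "\n";           // John Doe
echo $obj->systems[0]->n, "\n";  // server

// Hypothetical input: "note:" inside the string is mistaken for a key,
// the rewritten text is no longer valid JSON, and json_decode() returns null.
$broken = my_json_decode("text:'note: hi'");
var_dump($broken);
```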
forever°为你锁心
#5 · 2020-01-26 11:56

I know this question is old, but I hope this helps someone.

I had a similar problem, in that I wanted to accept JSON as a user input, but didn't want to require tedious "quotes" around every key. Furthermore, I didn't want to require quotes around the values either, but still parse valid numbers.

The simplest way seemed to be writing a custom parser.

I came up with this, which parses to nested associative / indexed arrays:

function loose_json_decode($json) {
    $rgxjson = '%((?:\{[^\{\}\[\]]*\})|(?:\[[^\{\}\[\]]*\]))%';
    $rgxstr = '%("(?:[^"\\\\]*|\\\\\\\\|\\\\"|\\\\)*"|\'(?:[^\'\\\\]*|\\\\\\\\|\\\\\'|\\\\)*\')%';
    $rgxnum = '%^\s*([+-]?(\d+(\.\d*)?|\d*\.\d+)(e[+-]?\d+)?|0x[0-9a-f]+)\s*$%i';
    $rgxchr1 = '%^'.chr(1).'\\d+'.chr(1).'$%';
    $rgxchr2 = '%^'.chr(2).'\\d+'.chr(2).'$%';
    $chrs = array(chr(2),chr(1));
    $escs = array(chr(2).chr(2),chr(2).chr(1));
    $nodes = array();
    $strings = array();

    # escape use of chr(1)
    $json = str_replace($chrs,$escs,$json);

    # parse out existing strings
    $pieces = preg_split($rgxstr,$json,-1,PREG_SPLIT_DELIM_CAPTURE);
    for($i=1;$i<count($pieces);$i+=2) {
        $strings []= str_replace($escs,$chrs,str_replace(array('\\\\','\\\'','\\"'),array('\\','\'','"'),substr($pieces[$i],1,-1)));
        $pieces[$i] = chr(2) . (count($strings)-1) . chr(2);
    }
    $json = implode($pieces);

    # parse json
    while(1) {
        $pieces = preg_split($rgxjson,$json,-1,PREG_SPLIT_DELIM_CAPTURE);
        for($i=1;$i<count($pieces);$i+=2) {
            $nodes []= $pieces[$i];
            $pieces[$i] = chr(1) . (count($nodes)-1) . chr(1);
        }
        $json = implode($pieces);
        if(!preg_match($rgxjson,$json)) break;
    }

    # build associative array
    for($i=0,$l=count($nodes);$i<$l;$i++) {
        $obj = explode(',',substr($nodes[$i],1,-1));
        $arr = $nodes[$i][0] == '[';

        if($arr) {
            for($j=0;$j<count($obj);$j++) {
                if(preg_match($rgxchr1,$obj[$j])) $obj[$j] = $nodes[+substr($obj[$j],1,-1)];
                else if(preg_match($rgxchr2,$obj[$j])) $obj[$j] = $strings[+substr($obj[$j],1,-1)];
                else if(preg_match($rgxnum,$obj[$j])) $obj[$j] = +trim($obj[$j]);
                else $obj[$j] = trim(str_replace($escs,$chrs,$obj[$j]));
            }
            $nodes[$i] = $obj;
        } else {
            $data = array();
            for($j=0;$j<count($obj);$j++) {
                $kv = explode(':',$obj[$j],2);
                if(preg_match($rgxchr1,$kv[0])) $kv[0] = $nodes[+substr($kv[0],1,-1)];
                else if(preg_match($rgxchr2,$kv[0])) $kv[0] = $strings[+substr($kv[0],1,-1)];
                else if(preg_match($rgxnum,$kv[0])) $kv[0] = +trim($kv[0]);
                else $kv[0] = trim(str_replace($escs,$chrs,$kv[0]));
                if(preg_match($rgxchr1,$kv[1])) $kv[1] = $nodes[+substr($kv[1],1,-1)];
                else if(preg_match($rgxchr2,$kv[1])) $kv[1] = $strings[+substr($kv[1],1,-1)];
                else if(preg_match($rgxnum,$kv[1])) $kv[1] = +trim($kv[1]);
                else $kv[1] = trim(str_replace($escs,$chrs,$kv[1]));
                $data[$kv[0]] = $kv[1];
            }
            $nodes[$i] = $data;
        }
    }

    return $nodes[count($nodes)-1];
}

Note that it does not catch errors or bad formatting...

For your situation, it looks like you'd want to add {}'s around it (as json_decode also requires):

$data = loose_json_decode('{' . $json . '}');

which for me yields:

array(6) {
  ["id"]=>
  int(43015)
  ["name"]=>
  string(8) "John Doe"
  ["level"]=>
  int(15)
  ["systems"]=>
  array(1) {
    [0]=>
    array(5) {
      ["t"]=>
      int(6)
      ["glr"]=>
      int(1242)
      ["n"]=>
      string(6) "server"
      ["s"]=>
      int(185)
      ["c"]=>
      int(9)
    }
  }
  ["classs"]=>
  int(0)
  ["subclass"]=>
  int(5)
}
【Aperson】
#6 · 2020-01-26 12:04

This regex will do the trick

$json = preg_replace('/([{,])(\s*)([A-Za-z0-9_\-]+?)\s*:/','$1"$3":',$json);
迷人小祖宗
#7 · 2020-01-26 12:05

$json = preg_replace('/([{,])(\s*)([A-Za-z0-9_\-]+?)\s*:/','$1"$3":',$json); // wrap each bare key in double quotes
$json = str_replace("'",'"', $json); // replace single quotes with double quotes

This solution seems to be enough for most common purposes.
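
A short sketch of those two steps applied to the sample from the question, with the outer braces added first. The function name quote_loose_json is made up for illustration. It assumes no string value contains an apostrophe, since every single quote is replaced.

```php
<?php
// Hypothetical helper: wraps the two replacements above.
function quote_loose_json($json) {
    // Put double quotes around bare keys that follow "{" or ",".
    $json = preg_replace('/([{,])(\s*)([A-Za-z0-9_\-]+?)\s*:/', '$1"$3":', $json);
    // Convert single-quoted strings to double-quoted ones.
    return str_replace("'", '"', $json);
}

$feed = "id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5";
$data = json_decode(quote_loose_json('{' . $feed . '}'), true);

var_dump($data['id']);               // int(43015)
var_dump($data['systems'][0]['n']);  // string(6) "server"
```

Unlike the variant that quotes every value, this keeps numbers as integers because only the keys and the quote characters are touched.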
