How to solve JSON_ERROR_UTF8 error in php json_dec

2020-01-24 22:59发布

问题:

I am trying this code

$json = file_get_contents("http://www.google.com/alerts/preview?q=test&t=7&f=1&l=0&e");
print_r(json_decode(utf8_encode($json), true));

        //////////////

// Define the errors.
$constants = get_defined_constants(true);
$json_errors = array();
foreach ($constants["json"] as $name => $value) {
    if (!strncmp($name, "JSON_ERROR_", 11)) {
        $json_errors[$value] = $name;
    }
}

// Show the errors for different depths.
foreach (range(4, 3, -1) as $depth) {
    var_dump(json_decode($json, true, $depth));
    echo 'Last error: ', $json_errors[json_last_error()], PHP_EOL, PHP_EOL;
}

I've tried a lot of functions, html_entities_decode, utf8_encode and decode, decoding the hex codes, but I always get the error "JSON_ERROR_UTF8".

How could I solve this?

回答1:

There is a good function to sanitize your arrays.

I suggest you use a json_encode wrapper like this :

function safe_json_encode($value, $options = 0, $depth = 512, $utfErrorFlag = false) {
    $encoded = json_encode($value, $options, $depth);
    switch (json_last_error()) {
        case JSON_ERROR_NONE:
            return $encoded;
        case JSON_ERROR_DEPTH:
            return 'Maximum stack depth exceeded'; // or trigger_error() or throw new Exception()
        case JSON_ERROR_STATE_MISMATCH:
            return 'Underflow or the modes mismatch'; // or trigger_error() or throw new Exception()
        case JSON_ERROR_CTRL_CHAR:
            return 'Unexpected control character found';
        case JSON_ERROR_SYNTAX:
            return 'Syntax error, malformed JSON'; // or trigger_error() or throw new Exception()
        case JSON_ERROR_UTF8:
            $clean = utf8ize($value);
            if ($utfErrorFlag) {
                return 'UTF8 encoding error'; // or trigger_error() or throw new Exception()
            }
            return safe_json_encode($clean, $options, $depth, true);
        default:
            return 'Unknown error'; // or trigger_error() or throw new Exception()

    }
}

function utf8ize($mixed) {
    if (is_array($mixed)) {
        foreach ($mixed as $key => $value) {
            $mixed[$key] = utf8ize($value);
        }
    } else if (is_string ($mixed)) {
        return utf8_encode($mixed);
    }
    return $mixed;
}

In my application utf8_encode() works better than iconv()



回答2:

You need simple line of code:

$input = iconv('UTF-8', 'UTF-8//IGNORE', utf8_encode($input));
$json = json_decode($input);

Credit: Sang Le, my teamate gave me this code. Yeah!



回答3:

The iconv function is pretty worthless unless you can guarantee the input is valid. Use mb_convert_encoding instead.

mb_convert_encoding($value, "UTF-8", "auto");

You can get more explicit than "auto", and even specify a comma-separated list of expected input encodings.

Most importantly, invalid characters will be handled without causing the entire string to be discarded (unlike iconv).



回答4:

Decoding JSON in PHP Decoding JSON is as simple as encoding it. PHP provides you a handy json_decode function that handles everything for you. If you just pass a valid JSON string into the method, you get an object of type stdClass back. Here’s a short example:

<?php
$string = '{"foo": "bar", "cool": "attr"}';
$result = json_decode($string);

// Result: object(stdClass)#1 (2) { ["foo"]=> string(3) "bar" ["cool"]=> string(4) "attr" }
var_dump($result);

// Prints "bar"
echo $result->foo;

// Prints "attr"
echo $result->cool;
?>

If you want to get an associative array back instead, set the second parameter to true:

<?php
$string = '{"foo": "bar", "cool": "attr"}';
$result = json_decode($string, true);

// Result: array(2) { ["foo"]=> string(3) "bar" ["cool"]=> string(4) "attr" }
var_dump($result);

// Prints "bar"
echo $result['foo'];

// Prints "attr"
echo $result['cool'];
?>

If you expect a very large nested JSON document, you can limit the recursion depth to a certain level. The function will return null and stops parsing if the document is deeper than the given depth.

<?php
$string = '{"foo": {"bar": {"cool": "value"}}}';
$result = json_decode($string, true, 2);

// Result: null
var_dump($result);
?>

The last argument works the same as in json_encode, but there is only one bitmask supported currently (which allows you to convert bigints to strings and is only available from PHP 5.4 upwards).We’ve been working with valid JSON strings until now (aside fromt the null depth error). The next part shows you how to deal with errors.

Error-Handling and Testing If the JSON value could not be parsed or a nesting level deeper than the given (or default) depth is found, NULL is returned from json_decode. This means that no exception is raised by json_encode/json_deocde directly.

So how can we identify the cause of the error? The json_last_error function helps here. json_last_error returns an integer error code that can be one of the following constants (taken from here):

JSON_ERROR_NONE: No error has occurred. JSON_ERROR_DEPTH: The maximum stack depth has been exceeded. JSON_ERROR_STATE_MISMATCH: Invalid or malformed JSON. JSON_ERROR_CTRL_CHAR: Control character error, possibly incorrectly encoded. JSON_ERROR_SYNTAX: Syntax error. JSON_ERROR_UTF8: Malformed UTF-8 characters, possibly incorrectly encoded (since PHP 5.3.3). With those information at hand, we can write a quick parsing helper method that raises a descriptive exception when an error is found.

<?php
class JsonHandler {

    protected static $_messages = array(
        JSON_ERROR_NONE => 'No error has occurred',
        JSON_ERROR_DEPTH => 'The maximum stack depth has been exceeded',
        JSON_ERROR_STATE_MISMATCH => 'Invalid or malformed JSON',
        JSON_ERROR_CTRL_CHAR => 'Control character error, possibly incorrectly encoded',
        JSON_ERROR_SYNTAX => 'Syntax error',
        JSON_ERROR_UTF8 => 'Malformed UTF-8 characters, possibly incorrectly encoded'
    );

    public static function encode($value, $options = 0) {
        $result = json_encode($value, $options);

        if($result)  {
            return $result;
        }

        throw new RuntimeException(static::$_messages[json_last_error()]);
    }

    public static function decode($json, $assoc = false) {
        $result = json_decode($json, $assoc);

        if($result) {
            return $result;
        }

        throw new RuntimeException(static::$_messages[json_last_error()]);
    }

}
?>

We can now use the exception testing function from the last post about exception handling to test if our exception works correctly.

// Returns "Correctly thrown"
assertException("Syntax error", function() {
    $string = '{"foo": {"bar": {"cool": NONUMBER}}}';
    $result = JsonHandler::decode($string);
});

Note that since PHP 5.3.3, there is a JSON_ERROR_UTF8 error returned when an invalid UTF-8 character is found in the string. This is a strong indication that a different charset than UTF-8 is used. If the incoming string is not under your control, you can use the utf8_encode function to convert it into utf8.

<?php echo utf8_encode(json_encode($payload)); ?>

I’ve been using this in the past to convert data loaded from a legacy MSSQL database that didn’t use UTF-8.

source