Unit testing for unicode support

Posted 2019-06-05 05:27

Question:

I'm trying to convert to Unicode and create some unit tests to ensure that Unicode is working.

Here is my current code. It fails on the mb_detect_encoding() line, and I'm also not sure whether it is a valid test of Unicode support:

    function testMultiLingualEncodings(){
        // Create this string via a heredoc.
        $original = '
        A good day, World!
Schönen Tag, Welt!
Une bonne journée, tout le monde!
يوم جيد، العالم
좋은 일, 세계!
Một ngày tốt lành, thế giới!
こんにちは、世界!
'; // Contains international characters from utf-8
        $this->assertTrue(mb_detect_encoding($original, 'UTF-8', true) === true); // Fails regardless of whether strict is true or not.
        $returned = query_item("select :multi limit 10", array(':multi'=>$original)); // Select this exact string, parameterized, from the database
        //debug($returned, string_diff($returned, $original));
        $this->assertTrue((bool)$original); // test original isn't null.
        $this->assertTrue((bool)$returned); // Test returned string isn't null.
        $this->assertTrue($original === $returned); // Test original exactly matches returned string
    }

So mb_detect_encoding() says that the initial string above isn't UTF-8. I'm also trying to pass that string into the database, read it back, and compare it with the original string. I'm not sure whether that is a valid test of the database connection's encoding, however.

So in general, how can I create a unit test for UTF-8 support, and can the method above be modified to achieve that goal?

Answer 1:

Sorry, but that doesn't make sense. Your test file is saved in one particular encoding, and whatever you put into the test string is encoded the same way as the file. I also wouldn't rely on the mb_detect_encoding function. Take the string "abcde": it can be ASCII or UTF-8, and you can't tell which because it contains no non-ASCII characters. An encoding is just a way of interpreting data.
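To illustrate the point about interpretation, here is a minimal sketch, assuming the mbstring extension is loaded. The variable names are made up for the example. It shows that pure ASCII bytes are valid under both ASCII and UTF-8, and that the same two bytes count as one character or two depending on which encoding you read them with.

```php
<?php
// Assumes the mbstring extension is available.

// Pure ASCII bytes are valid under both encodings,
// so no detector can tell which one was "meant".
$plain = 'abcde';
$validAsAscii = mb_check_encoding($plain, 'ASCII');  // true
$validAsUtf8  = mb_check_encoding($plain, 'UTF-8');  // true

// The same two bytes, read under two different encodings.
// Written as byte escapes so the result does not depend on
// how this source file itself is saved.
$bytes = "\xC3\xB6";
$lenAsUtf8   = mb_strlen($bytes, 'UTF-8');       // 1 (the character "ö")
$lenAsLatin1 = mb_strlen($bytes, 'ISO-8859-1');  // 2 (two separate characters)
$byteLength  = strlen($bytes);                   // 2 (raw byte count)
```

Writing test strings as byte escapes like this is one way to keep a test independent of the encoding the test file happens to be saved in.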

//EDIT

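To make your test work, compare against the encoding name: mb_detect_encoding() returns the detected encoding as a string, or false on failure, never true. Do $this->assertTrue(mb_detect_encoding($original, 'UTF-8') === 'UTF-8')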
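As a rough sketch of what that comparison does, again assuming the mbstring extension, the snippet below feeds one well-formed and one malformed byte string to mb_detect_encoding() in strict mode, and also to mb_check_encoding(), which returns a plain boolean and may be a more direct fit for an assertion. The sample strings and variable names are illustrative only. As far as I understand the documented behaviour, without strict mode the function returns the closest match from the candidate list, so with 'UTF-8' as the only candidate a non-strict call says little about validity.

```php
<?php
// Assumes the mbstring extension is available.

// "ö" as the UTF-8 byte pair C3 B6: well-formed UTF-8.
$utf8 = "Sch\xC3\xB6nen Tag, Welt!";
// "ö" as the single ISO-8859-1 byte F6: not valid as UTF-8.
$latin1 = "Sch\xF6nen Tag, Welt!";

// Strict detection returns the encoding name or false, never true.
$detectedUtf8   = mb_detect_encoding($utf8, 'UTF-8', true);    // 'UTF-8'
$detectedLatin1 = mb_detect_encoding($latin1, 'UTF-8', true);  // false

// mb_check_encoding() answers the yes/no question directly.
$checkUtf8   = mb_check_encoding($utf8, 'UTF-8');    // true
$checkLatin1 = mb_check_encoding($latin1, 'UTF-8');  // false
```

In a test this could become $this->assertTrue(mb_check_encoding($original, 'UTF-8')), which only passes when the bytes in $original really are well-formed UTF-8.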