Character encoding problems with reading a CSV file

Posted 2019-08-04 23:35

Question:

My CSV file contains special characters such as 'æ' and 'å'. When I read and print the file, these characters are converted into '�'. I tried setting the page encoding to UTF-8 and to ISO 8859-1, but neither helped.

Could somebody advise a solution?

Answer 1:

I think you have to detect the file's original encoding and convert it to UTF-8 as follows (if you are using PHP):

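  <?php
        // Tell the browser that the page is UTF-8.
        header( "Content-Type: text/html; charset=utf-8");
        $csvContent = file_get_contents( $fileName );
        // Strict detection; returns false when no candidate encoding matches.
        $fileEncoding = mb_detect_encoding( $csvContent, 
                                        array("UTF-8","UTF-32","UTF-32BE","UTF-32LE","UTF-16","UTF-16BE","UTF-16LE"), 
                                        TRUE );

        // Convert only when a different encoding was actually detected.
        if( $fileEncoding !== false && $fileEncoding !== "UTF-8" ) {
             $csvContent = mb_convert_encoding($csvContent, "UTF-8", $fileEncoding );
        }

        // PHP_EOL is the server's line ending, which may differ from the file's.
        foreach( explode( PHP_EOL, $csvContent ) as $item ) {
           var_dump($item );
        }
 ?>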
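
The candidate list above contains only Unicode encodings, so a file saved as ISO 8859-1 (a common case for 'æ' and 'å') can never be reported as such. A minimal sketch of a fallback approach follows: keep the text if it is already valid UTF-8, otherwise convert it from an assumed legacy encoding. The helper name `toUtf8` and the choice of ISO-8859-1 as the fallback are assumptions for illustration, not something stated in the question.

```php
<?php
// Hypothetical helper: returns the given CSV text as UTF-8.
// Assumption: input that is not valid UTF-8 is encoded as $fallback.
function toUtf8(string $raw, string $fallback = 'ISO-8859-1'): string
{
    // Drop a UTF-8 byte order mark if the file starts with one.
    if (strncmp($raw, "\xEF\xBB\xBF", 3) === 0) {
        $raw = substr($raw, 3);
    }
    // Text that is already valid UTF-8 is returned unchanged.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    // Otherwise treat the bytes as the assumed legacy encoding.
    return mb_convert_encoding($raw, 'UTF-8', $fallback);
}

// "\xE6" and "\xE5" are the ISO-8859-1 bytes for 'æ' and 'å'.
$latin1 = "id,name\r\n1,\xE6ble\r\n2,\xE5l\r\n";

// Split on any common line ending rather than on the server's PHP_EOL.
foreach (preg_split('/\r\n|\r|\n/', toUtf8($latin1)) as $line) {
    if ($line !== '') {
        echo $line, PHP_EOL;
    }
}
```

If the file actually comes from a Windows program, Windows-1252 may be a better fallback than ISO-8859-1; either way the source encoding has to be known or guessed, because single-byte encodings cannot be told apart reliably from the bytes alone.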