Select MySQL rows with Japanese characters

Posted 2019-03-20 05:59

Question:

Does anyone know of a reliable method (with MySQL or otherwise) to select the rows in a database that contain Japanese characters? I have a lot of rows in my database; some contain only alphanumeric characters, and some contain Japanese characters.

Answer 1:

Rules to follow if you want to avoid problems with character sets:

  1. When creating the database, use the utf8 encoding:

    CREATE DATABASE  _test DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
    
  2. Make sure all text fields (VARCHAR and TEXT) use UTF-8:

    CREATE TABLE _test.test (
      id INT NOT NULL AUTO_INCREMENT,
      name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE = MyISAM;
    
  3. When you open a connection, run this before you query or update the database:

    SET NAMES utf8;
    
  4. With phpMyAdmin, choose UTF-8 when you log in.

  5. Set the web page encoding to UTF-8 so that all POST/GET data arrives in UTF-8 (otherwise you will have to convert it, which is painful). PHP code (put it on the first line of the PHP file, or at least before any output):

    header('Content-Type: text/html; charset=UTF-8');
    
  6. Make sure all your queries are written in UTF-8 encoding. If using PHP:

    6.1. If PHP supports code in UTF-8, just write your files in UTF-8.

    6.2. If PHP is compiled without UTF-8 support, convert your strings to UTF-8 like this:

    // Replace the placeholder with the encoding your source file is saved in.
    $str = mb_convert_encoding($str, 'UTF-8', '<put your file encoding here>');
    // Escape or bind $str before using it in a real query.
    $query = 'SELECT * FROM test WHERE name = "' . $str . '"';

That should do it.
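
The conversion in step 6.2 can also be done on the client side in other languages. Below is a minimal Python sketch of the same idea, not part of the original answer: the helper name `to_utf8` and the Shift_JIS input are assumptions chosen for illustration. It also shows what happens when UTF-8 bytes are read with the wrong character set, which is the kind of mismatch the rules above are meant to prevent.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Hypothetical input: the katakana "カ" as it would be stored in a Shift_JIS file.
sjis_bytes = "カ".encode("shift_jis")
utf8_bytes = to_utf8(sjis_bytes, "shift_jis")

# Interpreting the UTF-8 bytes as Latin-1 does not fail, but yields three
# unrelated characters instead of one katakana (mojibake).
garbled = utf8_bytes.decode("latin-1")

print(utf8_bytes.hex())  # e382ab
print(len(garbled))      # 3
```

If every layer (file, connection, table, page) agrees on UTF-8, no such conversion is needed at all.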



Answer 2:

Following on from NickSoft's helpful answer, I had to set the encoding on the DB connection to get it to work:

&characterEncoding=UTF8

With that in place, SET NAMES utf8; seemed to be redundant.



Answer 3:

As teneff stated, just use SELECT.

When installing MySQL, choose UTF-8 as the character set. Picking utf8_general_ci as the collation should then do the job.



Answer 4:

There is a limited number of Japanese characters. You can search for a specific one using LIKE:

SELECT ... LIKE '%カ%'

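Alternatively, you can specify a character by its hexadecimal value. Note that CHAR(0x30ab) on its own returns the two raw bytes 0x30 0xAB rather than カ (U+30AB), so give the character's UTF-8 bytes (E3 82 AB) instead:

SELECT ... LIKE CONCAT('%', _utf8 0xE382AB, '%')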

You may find this UTF-8 table of the Japanese subset useful: http://www.utf8-chartable.de/unicode-utf8-table.pl?start=12448

This assumes you are using the UTF-8 character set for fields, queries, and results.
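
To look up the hexadecimal values for other characters, a small script can print both the Unicode code point and the UTF-8 byte sequence. This is only a sketch in Python; the helper name `describe` is made up for illustration.

```python
def describe(ch: str) -> tuple:
    """Return (code point in hex, UTF-8 bytes in hex) for a single character."""
    return f"{ord(ch):04X}", ch.encode("utf-8").hex().upper()


code_point, utf8_hex = describe("カ")
print(code_point, utf8_hex)  # 30AB E382AB

# Going the other way: build the character from its code point.
print(chr(0x30AB))  # カ
```

The code point (30AB) is what the Unicode charts list; the UTF-8 bytes (E382AB) are what is actually stored in a utf8 column.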



Answer 5:

As Frosty stated, just use SELECT.

Look up the lowest and highest Japanese code points in the Unicode charts at http://www.unicode.org/roadmaps/bmp/ and use REGEXP. You will probably need several separate ranges to cover the whole Japanese character set. As long as you use the UTF-8 charset and the utf8_general_ci collation, you should be able to use something like REGEXP '[a-gk-nt-z]', where a-g stands for one range of Unicode characters from the charts, k-n for another range, and so on. Be aware that before MySQL 8.0 the REGEXP operator worked byte-wise and was not multibyte safe, so test this against your server version.
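
Since the question also allows methods outside MySQL, here is a minimal client-side sketch of the same range idea in Python. The ranges are an illustrative selection from the Unicode charts, not a complete definition of "Japanese", and the kanji block is shared with Chinese. The sample `rows` and the helper `contains_japanese` are made up for the example.

```python
import re

# Illustrative ranges only; extend them from the Unicode charts as needed.
JAPANESE_RE = re.compile(
    "["
    "\u3040-\u309F"  # Hiragana
    "\u30A0-\u30FF"  # Katakana
    "\u4E00-\u9FFF"  # CJK Unified Ideographs (kanji, shared with Chinese)
    "\uFF66-\uFF9F"  # Halfwidth Katakana
    "]"
)


def contains_japanese(text: str) -> bool:
    """Return True if text has at least one character in the ranges above."""
    return JAPANESE_RE.search(text) is not None


# Hypothetical (id, name) rows, as they might come back from a SELECT.
rows = [(1, "abc123"), (2, "カタカナ"), (3, "mixed 日本語 text"), (4, "café")]
japanese_rows = [row for row in rows if contains_japanese(row[1])]
print([row[0] for row in japanese_rows])  # [2, 3]
```

Filtering on the client means fetching every row, so this suits one-off cleanups better than queries that run often.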