I'm trying to rewrite an old website.
It's in Persian, which uses Perso-Arabic characters.
CREATE DATABASE `db` DEFAULT CHARACTER SET utf8 COLLATE utf8_persian_ci;
USE `db`;
Almost all of my tables and columns have their COLLATE set to utf8_persian_ci.
I'm using CodeIgniter for my new script, and I have
'char_set' => 'utf8',
'dbcollat' => 'utf8_persian_ci',
in the database settings, so there is no problem there.
Here is the strange part:
the old script uses some sort of database engine called TUBADBENGINE
(or TUBA DB ENGINE), nothing special.
When I enter Persian data into the database using the old script and then look at the database, the characters are stored like عمران.
The old script fetches and shows that data fine, but the new script shows it with the same weird characters as in the database.
So when I enter اااا, the data stored in the database looks like عمراÙ. When I fetch it in the new script I see عمراÙ, but in the old script I see اااا.
CREATE TABLE IF NOT EXISTS `tnewsgroups` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`fName` varchar(200) COLLATE utf8_persian_ci DEFAULT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci AUTO_INCREMENT=11 ;
--
-- Dumping data for table `tnewsgroups`
--
INSERT INTO `tnewsgroups` (`ID`, `fName`) VALUES
(1, 'عمران'),
(2, 'معماری'),
(3, 'برق'),
(4, 'مکانیک'),
(5, 'test'),
(6, 'test2');
On the other hand, when I enter ااااا directly in the database,
I of course have the same اااا stored in the database.
The new script shows it fine,
but in the old script I get ????
Can anyone make sense of this?
Here is the Tuba engine:
https://github.com/maxxxir/mz-codeigniter-crud/blob/master/tuba.php
Usage example from the old script:
define("database_type" , "MYSQL");
define("database_ip" , "localhost");
define("database_un" , "root");
define("database_pw" , "");
define("database_name" , "nezam2");
define("database_connectionstring" , "");
$db = new TUBADBENGINE(database_type , database_ip , database_un , database_pw , database_name , database_connectionstring);
$db->Select("SELECT * FROM tnews limit 3");
if ($db->Lasterror() != "") { echo "<B><Font color=red>ÎØÇ ! áØÝÇ ãÌÏøÏÇ ÊáÇÔ ˜äíÏ"; exit(); }
for ($i = 0; $i < $db->Count(); $i++) {
    $row = $db->Next();
    var_dump($row);
}
In short, because this has been discussed a thousand times before:
1. You send a string such as "漢字", encoded in UTF-8, to the database. The bytes for this are E6 BC A2 E5 AD 97.
2. The database connection encoding is set to latin1.
3. The database receives the bytes E6 BC A2 E5 AD 97, thinking those represent latin1 characters.
4. The database stores characters like æ¼¢å (the characters that E6 BC A2 E5 AD 97 maps to in latin1).
So the problem here was that the database connection was set incorrectly when the data was entered into the database. You'll have to convert the data in the database to the correct characters. Try this:
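A minimal sketch of such a conversion, assuming the old script wrote UTF-8 bytes over a latin1 connection into the utf8 column `fName` of the `tnewsgroups` table from the question. The aliases `stored_value` and `repaired_value` are only illustrative, and the nested CONVERT/CAST form is a common idiom for this kind of repair, not necessarily the exact query your data needs:

```sql
-- Preview only: nothing is modified.
-- Assumption: the old script sent UTF-8 bytes over a latin1 connection.
SELECT
    `ID`,
    `fName` AS stored_value,
    -- 1. utf8 -> latin1 turns each stored character back into a single byte
    -- 2. CAST ... AS BINARY drops the latin1 label, leaving raw bytes
    -- 3. those raw bytes are then interpreted as utf8
    CONVERT(CAST(CONVERT(`fName` USING latin1) AS BINARY) USING utf8) AS repaired_value
FROM `tnewsgroups`;
```

Rows that were entered correctly (for example directly in the database) should come out as question marks in `repaired_value`, because Persian characters have no latin1 equivalent. That is probably also why the old script shows ???? for them.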
Maybe utf8 isn't what you need here, experiment. If that works, change this into an UPDATE statement to update the data permanently.
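As a rough sketch, the permanent version could look like the statement below. It assumes the `tnewsgroups` table and `fName` column from the question, and that the broken rows were written over a latin1 connection. The WHERE clause is my own addition: it tries to touch only rows that survive a round trip through latin1 unchanged, so correctly stored Persian text is left alone. Back up the table and check the result with a SELECT first.

```sql
-- Assumption: broken rows hold UTF-8 bytes that were stored as latin1 characters.
UPDATE `tnewsgroups`
SET `fName` = CONVERT(CAST(CONVERT(`fName` USING latin1) AS BINARY) USING utf8)
-- Only rows whose characters all exist in latin1 are candidates;
-- genuine Persian text would turn into '?' here and fail the comparison.
WHERE CAST(`fName` AS BINARY)
    = CAST(CONVERT(CONVERT(`fName` USING latin1) USING utf8) AS BINARY);
```

Plain ASCII rows such as 'test' also match the guard, but converting them leaves them unchanged. After the data is repaired, the old script would still need its connection charset set to utf8, otherwise it will keep writing rows in the old broken form.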