PHP: case-insensitive preg_replace of a cyrillic s

Posted 2020-04-08 12:47

I have a PHP 5.3 script that displays the users of my web site, and I would like to replace a certain Russian city (stored as UTF-8 in a PostgreSQL 8.4.7 database on 64-bit CentOS 5.5 Linux) with its older name (it is an inside joke):

preg_replace('/Волгоград/iu', 'Сталинград', $city);

Unfortunately this only works for exact matches: Волгоград.

This does not work for other cases, like ВОЛГОГРАД or волгоград.

If I modify my source code to

preg_replace('/[Вв]олгоград/iu', 'Сталинград', $city);

then it also catches the second case above (волгоград).

Does anybody know what is going on and how to fix it (assuming I don't want to write [Xx] for every letter)?

Thank you! Alex

UPDATE:

# rpm -qa|grep php
php53-bcmath-5.3.3-1.el5
php53-gd-5.3.3-1.el5
php53-common-5.3.3-1.el5
php53-pdo-5.3.3-1.el5
php53-mbstring-5.3.3-1.el5
php53-xml-5.3.3-1.el5
php53-5.3.3-1.el5
php53-cli-5.3.3-1.el5
php53-pgsql-5.3.3-1.el5

# rpm -qa|grep pcre
pcre-6.6-2.el5_1.7
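
A quick way to narrow this down is to test whether the installed PCRE folds case for a single Cyrillic letter under the /u modifier. The sketch below is only a diagnostic: the helper name pcre_folds_cyrillic_case is hypothetical, and the idea that the PCRE build is the culprit here is an assumption, not something confirmed above. It rests on the PCRE documentation's note that, in UTF-8 mode, caseless matching of characters above 127 needs a build with Unicode property support.

```php
<?php
// Hypothetical helper: reports whether this PCRE build matches
// Cyrillic letters case-insensitively in UTF-8 mode.
function pcre_folds_cyrillic_case()
{
    // Lowercase "в" in the pattern, uppercase "В" in the subject.
    return preg_match('/в/iu', 'В') === 1;
}

echo 'PCRE version: ', PCRE_VERSION, "\n";
echo pcre_folds_cyrillic_case()
    ? "Case folding works for Cyrillic under /iu\n"
    : "Case folding is limited to ASCII under /iu\n";
```

If the second message is printed, the pattern itself is probably fine and the limitation is in the library build.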

9 answers
狗以群分
#2 · 2020-04-08 13:49

For those who support a huge legacy code base, struggle with charset and encoding issues, and have no option to convert the code's charset, here is an answer:

// For
setlocale(LC_ALL, 'ru_RU.cp1251');
// (or any other locale) to take effect,
// you MUST first generate the system locale, i.e.

sudo su
#view supported locales
#less /usr/share/i18n/SUPPORTED
echo "ru_RU.cp1251 CP1251" >> /var/lib/locales/supported.d/local
dpkg-reconfigure locales
exit

#and (for ubuntu/debian)

apt-get install php5-intl

While you could rewrite your regexp to use some UTF tricks or convert your code to UTF-8, that is not an option when you work with a huge codebase, database, etc.

放我归山
#3 · 2020-04-08 13:51

This one solved the problem:

setlocale(LC_ALL, 'ru_RU.CP1251', 'rus_RUS.CP1251', 'Russian_Russia.1251');
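
setlocale() accepts several candidate names and returns the one it actually applied, or false if none of them is installed, so it is worth checking the result instead of assuming the call succeeded. A minimal sketch, assuming the data is in single-byte CP1251 rather than UTF-8 and the pattern is used without the /u modifier; the variable names are illustrative:

```php
<?php
// Candidate locale names for Linux and Windows; the first installed one wins.
$candidates = array('ru_RU.CP1251', 'rus_RUS.CP1251', 'Russian_Russia.1251');

// setlocale() returns the applied locale name, or false on failure.
$active = setlocale(LC_ALL, $candidates);

if ($active === false) {
    echo "None of the candidate locales is installed on this system\n";
} else {
    echo "Active locale: ", $active, "\n";
}
```

A false result means the locale still has to be generated on the system before the call can have any effect.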
Evening l夕情丶
#4 · 2020-04-08 13:51

Perhaps try: mb_eregi_replace http://www.php.net/manual/en/function.mb-eregi-replace.php

mb_eregi_replace — Replace regular expression with multibyte support ignoring case
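
As a rough sketch of how that could look for the city name from the question, assuming the mbstring extension is loaded and both the source file and the data are UTF-8 (the helper name rename_city is hypothetical):

```php
<?php
// Tell the mbstring regex functions that patterns and subjects are UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical helper wrapping the replacement from the question.
function rename_city($city)
{
    // mb_ereg patterns take no delimiters or modifiers.
    return mb_eregi_replace('Волгоград', 'Сталинград', $city);
}

foreach (array('Волгоград', 'ВОЛГОГРАД', 'волгоград', 'Москва') as $city) {
    echo rename_city($city), "\n";
}
```

Because the mb_ereg family uses its own regex engine rather than PCRE, this sidesteps whatever the system PCRE library does with /iu.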
