How to match with regex unicode text ignoring diac

2019-08-24 03:07发布

What I am trying to achieve is - I want to use a preg-replace to highlight searched string in suggestions but ignoring diacritics on characters, spaces or apostrophe. So when I will for example search for ha my search suggestions will look like this:

  • O'Hara
  • Ó an Cintighe
  • H'aSOMETHING

I have done a loads of research but did not come up with any code yet. I just have an idea that I could somehow convert the characters with diacritics (e.g.: Á, É...) to character and modifier (A+´, E+´) but I am not sure how to do it.

1条回答
虎瘦雄心在
2楼-- · 2019-08-24 03:30

I finally found working solution thanks to this Tibor's answer here: Regex to ignore accents? PHP

My function highlights text ignoring diacritics, spaces, apostrophes and dashes:

  function highlight($pattern, $string)
  {
    $array = str_split($pattern);

    //add or remove characters to be ignored
    $pattern=implode('[\s\'\-]*', $array);  

    //list of letters with diacritics
    $replacements = Array("a" => "[áa]", "e"=>"[ée]", "i"=>"[íi]", "o"=>"[óo]", "u"=>"[úu]", "A" => "[ÁA]", "E"=>"[ÉE]", "I"=>"[ÍI]", "O"=>"[ÓO]", "U"=>"[ÚU]");

    $pattern=str_replace(array_keys($replacements), $replacements, $pattern);  

    //instead of <u> you can use <b>, <i> or even <div> or <span> with css class
    return preg_replace("/(" . $pattern . ")/ui", "<u>\\1</u>", $string);
  }
查看更多
登录 后发表回答