PHP: writing a simple removeEmoji function

Posted 2019-06-18 11:34

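I'm looking for a simple function that removes emoji characters from Instagram comments. This is what I have so far (pieced together from the many examples I found on SO and other sites):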

// PHP class
public static function removeEmoji($string)
{
    // split the string into UTF8 char array
    // for loop inside char array
        // if char is emoji, remove it
    // endfor
    // return newstring
}

Any help would be appreciated.

Answer 1:

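I think preg_replace() is the simplest solution.

As EaterOfCode suggested, I read the wiki page and wrote new regular expressions, because none of the answers on SO (or other sites) seemed to work for Instagram photo captions (in the format the API returns). Note: the /u modifier is mandatory for matching \x{...} Unicode characters.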

public static function removeEmoji($text) {

    $clean_text = "";

    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    return $clean_text;
}

This function does not remove every emoji, since there are many more of them, but you get the idea.

See the Full Emoji List on unicode.org (thanks to 太平洋业务中心).
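
The five passes above can also be folded into a single character class, so the string is scanned only once. This is a minimal sketch that assumes the same five Unicode blocks and PHP 7 or later; the name removeEmojiSinglePass is made up for illustration. The null check covers the case where preg_replace() fails, for example on input that is not valid UTF-8.

```php
<?php
// Hypothetical helper: the same five blocks as above, merged into one pattern.
function removeEmojiSinglePass(string $text): string
{
    $pattern = '/['
        . '\x{1F600}-\x{1F64F}'  // Emoticons
        . '\x{1F300}-\x{1F5FF}'  // Miscellaneous Symbols and Pictographs
        . '\x{1F680}-\x{1F6FF}'  // Transport and Map Symbols
        . '\x{2600}-\x{26FF}'    // Miscellaneous Symbols
        . '\x{2700}-\x{27BF}'    // Dingbats
        . ']/u';

    $clean = preg_replace($pattern, '', $text);

    // preg_replace() returns null on error; fall back to the input unchanged.
    return $clean === null ? $text : $clean;
}

echo removeEmojiSinglePass("Nice shot \u{1F600}\u{2708}!"), PHP_EOL;
```

Like the original, this only covers the listed blocks, so newer emoji still pass through.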



Answer 2:

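As Apple keeps adding emoji in new iOS releases, I will update and maintain this answer.

This answer has been updated for iOS 12.1. If you have problems, check the edit history for earlier versions of this answer (keeping several regular expressions in one answer exceeded SO's maximum post body length).

For the iOS 12.1 beta (2018):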

public static function removeEmoji($string) {
    return preg_replace('/[\x{1F3F4}](?:\x{E0067}\x{E0062}\x{E0077}\x{E006C}\x{E0073}\x{E007F})|[\x{1F3F4}](?:\x{E0067}\x{E0062}\x{E0073}\x{E0063}\x{E0074}\x{E007F})|[\x{1F3F4}](?:\x{E0067}\x{E0062}\x{E0065}\x{E006E}\x{E0067}\x{E007F})|[\x{1F3F4}](?:\x{200D}\x{2620}\x{FE0F})|[\x{1F3F3}](?:\x{FE0F}\x{200D}\x{1F308})|[\x{0023}\x{002A}\x{0030}\x{0031}\x{0032}\x{0033}\x{0034}\x{0035}\x{0036}\x{0037}\x{0038}\x{0039}](?:\x{FE0F}\x{20E3})|[\x{1F415}](?:\x{200D}\x{1F9BA})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F467}\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F467}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F466}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F466})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F467}\x{200D}\x{1F467})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F466}\x{200D}\x{1F466})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F467}\x{200D}\x{1F466})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F467})|[\x{1F468}](?:\x{200D}\x{1F468}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F467}\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F466}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F467}\x{200D}\x{1F466})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F467})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F469}\x{200D}\x{1F466})|[\x{1F469}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F469})|[\x{1F469}\x{1F468}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F468})|[\x{1F469}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F48B}\x{200D}\x{1F469})|[\x{1F469}\x{1F468}](?:\x{200D}\x{2764}\x{FE0F}\x{200D}\x{1F48B}\x{200D}\x{1F468})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9BD})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9BC})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9AF})|[\x{1F575}\x{1F3CC}\x{26F9}\x{1F3CB}](?:\x{FE0F}\x{200D}\x{2640}\x{FE0F})|[\x{1F575}\x{1F3CC}\x{26F9}\x{1F3CB}](?:\x{FE0F}\x{200D}\x{2642}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F692})|[\x{1F468}\x{1F469}](?:\x{2
00D}\x{1F680})|[\x{1F468}\x{1F469}](?:\x{200D}\x{2708}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3A8})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3A4})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F4BB})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F52C})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F4BC})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3ED})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F527})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F373})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F33E})|[\x{1F468}\x{1F469}](?:\x{200D}\x{2696}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F3EB})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F393})|[\x{1F468}\x{1F469}](?:\x{200D}\x{2695}\x{FE0F})|[\x{1F471}\x{1F64D}\x{1F64E}\x{1F645}\x{1F646}\x{1F481}\x{1F64B}\x{1F9CF}\x{1F647}\x{1F926}\x{1F937}\x{1F46E}\x{1F482}\x{1F477}\x{1F473}\x{1F9B8}\x{1F9B9}\x{1F9D9}\x{1F9DA}\x{1F9DB}\x{1F9DC}\x{1F9DD}\x{1F9DE}\x{1F9DF}\x{1F486}\x{1F487}\x{1F6B6}\x{1F9CD}\x{1F9CE}\x{1F3C3}\x{1F46F}\x{1F9D6}\x{1F9D7}\x{1F3C4}\x{1F6A3}\x{1F3CA}\x{1F6B4}\x{1F6B5}\x{1F938}\x{1F93C}\x{1F93D}\x{1F93E}\x{1F939}\x{1F9D8}](?:\x{200D}\x{2640}\x{FE0F})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B2})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B3})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B1})|[\x{1F468}\x{1F469}](?:\x{200D}\x{1F9B0})|[\x{1F471}\x{1F64D}\x{1F64E}\x{1F645}\x{1F646}\x{1F481}\x{1F64B}\x{1F9CF}\x{1F647}\x{1F926}\x{1F937}\x{1F46E}\x{1F482}\x{1F477}\x{1F473}\x{1F9B8}\x{1F9B9}\x{1F9D9}\x{1F9DA}\x{1F9DB}\x{1F9DC}\x{1F9DD}\x{1F9DE}\x{1F9DF}\x{1F486}\x{1F487}\x{1F6B6}\x{1F9CD}\x{1F9CE}\x{1F3C3}\x{1F46F}\x{1F9D6}\x{1F9D7}\x{1F3C4}\x{1F6A3}\x{1F3CA}\x{1F6B4}\x{1F6B5}\x{1F938}\x{1F93C}\x{1F93D}\x{1F93E}\x{1F939}\x{1F9D8}](?:\x{200D}\x{2642}\x{FE0F})|[\x{1F441}](?:\x{FE0F}\x{200D}\x{1F5E8}\x{FE0F})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1F0}\x{1F1F2}\x{1F1F3}\x{1F1F8}\x{1F1F9}\x{1F1FA}](?:\x{1F1FF})|[\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1F0}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1FA}](?:\x{1F1FE})|[\x{1F1E6}\x{1F1E8}\x{1F1F2}\x{1F1F8}](?:\x{1F1FD})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1F0}\x{1
F1F2}\x{1F1F5}\x{1F1F7}\x{1F1F9}\x{1F1FF}](?:\x{1F1FC})|[\x{1F1E7}\x{1F1E8}\x{1F1F1}\x{1F1F2}\x{1F1F8}\x{1F1F9}](?:\x{1F1FB})|[\x{1F1E6}\x{1F1E8}\x{1F1EA}\x{1F1EC}\x{1F1ED}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F7}\x{1F1FB}](?:\x{1F1FA})|[\x{1F1E6}\x{1F1E7}\x{1F1EA}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FE}](?:\x{1F1F9})|[\x{1F1E6}\x{1F1E7}\x{1F1EA}\x{1F1EC}\x{1F1EE}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F7}\x{1F1F8}\x{1F1FA}\x{1F1FC}](?:\x{1F1F8})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EA}\x{1F1EB}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1F0}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F8}\x{1F1F9}](?:\x{1F1F7})|[\x{1F1E6}\x{1F1E7}\x{1F1EC}\x{1F1EE}\x{1F1F2}](?:\x{1F1F6})|[\x{1F1E8}\x{1F1EC}\x{1F1EF}\x{1F1F0}\x{1F1F2}\x{1F1F3}](?:\x{1F1F5})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1EB}\x{1F1EE}\x{1F1EF}\x{1F1F2}\x{1F1F3}\x{1F1F7}\x{1F1F8}\x{1F1F9}](?:\x{1F1F4})|[\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1F0}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FB}](?:\x{1F1F3})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1EB}\x{1F1EC}\x{1F1ED}\x{1F1EE}\x{1F1EF}\x{1F1F0}\x{1F1F2}\x{1F1F4}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FF}](?:\x{1F1F2})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1EE}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F8}\x{1F1F9}](?:\x{1F1F1})|[\x{1F1E8}\x{1F1E9}\x{1F1EB}\x{1F1ED}\x{1F1F1}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FD}](?:\x{1F1F0})|[\x{1F1E7}\x{1F1E9}\x{1F1EB}\x{1F1F8}\x{1F1F9}](?:\x{1F1EF})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EB}\x{1F1EC}\x{1F1F0}\x{1F1F1}\x{1F1F3}\x{1F1F8}\x{1F1FB}](?:\x{1F1EE})|[\x{1F1E7}\x{1F1E8}\x{1F1EA}\x{1F1EC}\x{1F1F0}\x{1F1F2}\x{1F1F5}\x{1F1F8}\x{1F1F9}](?:\x{1F1ED})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1E9}\x{1F1EA}\x{1F1EC}\x{1F1F0}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FB}](?:\x{1F1EC})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F9}\x{1F1FC}](?:\x{1F1EB})|[\x{1F1E6}\x{1F1E7}\x{1F1E9}\x{1F1EA}\x{1F1EC}\x{1F1EE}\x{1F1EF}\x{1F1F0}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1
F7}\x{1F1F8}\x{1F1FB}\x{1F1FE}](?:\x{1F1EA})|[\x{1F1E6}\x{1F1E7}\x{1F1E8}\x{1F1EC}\x{1F1EE}\x{1F1F2}\x{1F1F8}\x{1F1F9}](?:\x{1F1E9})|[\x{1F1E6}\x{1F1E8}\x{1F1EA}\x{1F1EE}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F8}\x{1F1F9}\x{1F1FB}](?:\x{1F1E8})|[\x{1F1E7}\x{1F1EC}\x{1F1F1}\x{1F1F8}](?:\x{1F1E7})|[\x{1F1E7}\x{1F1E8}\x{1F1EA}\x{1F1EC}\x{1F1F1}\x{1F1F2}\x{1F1F3}\x{1F1F5}\x{1F1F6}\x{1F1F8}\x{1F1F9}\x{1F1FA}\x{1F1FB}\x{1F1FF}](?:\x{1F1E6})|[\x{00A9}\x{00AE}\x{203C}\x{2049}\x{2122}\x{2139}\x{2194}-\x{2199}\x{21A9}-\x{21AA}\x{231A}-\x{231B}\x{2328}\x{23CF}\x{23E9}-\x{23F3}\x{23F8}-\x{23FA}\x{24C2}\x{25AA}-\x{25AB}\x{25B6}\x{25C0}\x{25FB}-\x{25FE}\x{2600}-\x{2604}\x{260E}\x{2611}\x{2614}-\x{2615}\x{2618}\x{261D}\x{2620}\x{2622}-\x{2623}\x{2626}\x{262A}\x{262E}-\x{262F}\x{2638}-\x{263A}\x{2640}\x{2642}\x{2648}-\x{2653}\x{265F}-\x{2660}\x{2663}\x{2665}-\x{2666}\x{2668}\x{267B}\x{267E}-\x{267F}\x{2692}-\x{2697}\x{2699}\x{269B}-\x{269C}\x{26A0}-\x{26A1}\x{26AA}-\x{26AB}\x{26B0}-\x{26B1}\x{26BD}-\x{26BE}\x{26C4}-\x{26C5}\x{26C8}\x{26CE}-\x{26CF}\x{26D1}\x{26D3}-\x{26D4}\x{26E9}-\x{26EA}\x{26F0}-\x{26F5}\x{26F7}-\x{26FA}\x{26FD}\x{2702}\x{2705}\x{2708}-\x{270D}\x{270F}\x{2712}\x{2714}\x{2716}\x{271D}\x{2721}\x{2728}\x{2733}-\x{2734}\x{2744}\x{2747}\x{274C}\x{274E}\x{2753}-\x{2755}\x{2757}\x{2763}-\x{2764}\x{2795}-\x{2797}\x{27A1}\x{27B0}\x{27BF}\x{2934}-\x{2935}\x{2B05}-\x{2B07}\x{2B1B}-\x{2B1C}\x{2B50}\x{2B55}\x{3030}\x{303D}\x{3297}\x{3299}\x{1F004}\x{1F0CF}\x{1F170}-\x{1F171}\x{1F17E}-\x{1F17F}\x{1F18E}\x{1F191}-\x{1F19A}\x{1F201}-\x{1F202}\x{1F21A}\x{1F22F}\x{1F232}-\x{1F23A}\x{1F250}-\x{1F251}\x{1F300}-\x{1F321}\x{1F324}-\x{1F393}\x{1F396}-\x{1F397}\x{1F399}-\x{1F39B}\x{1F39E}-\x{1F3F0}\x{1F3F3}-\x{1F3F5}\x{1F3F7}-\x{1F3FA}\x{1F400}-\x{1F4FD}\x{1F4FF}-\x{1F53D}\x{1F549}-\x{1F54E}\x{1F550}-\x{1F567}\x{1F56F}-\x{1F570}\x{1F573}-\x{1F57A}\x{1F587}\x{1F58A}-\x{1F58D}\x{1F590}\x{1F595}-\x{1F596}\x{1F5A4}-\x{1F5A5}\x{1F5A8}\x{1F5B1}-\x{1F5B2}\x{1F5BC}\x{1F5C2}-\x{1F5C4}\x{1F5D1}-\x{1F5D
3}\x{1F5DC}-\x{1F5DE}\x{1F5E1}\x{1F5E3}\x{1F5E8}\x{1F5EF}\x{1F5F3}\x{1F5FA}-\x{1F64F}\x{1F680}-\x{1F6C5}\x{1F6CB}-\x{1F6D2}\x{1F6D5}\x{1F6E0}-\x{1F6E5}\x{1F6E9}\x{1F6EB}-\x{1F6EC}\x{1F6F0}\x{1F6F3}-\x{1F6FA}\x{1F7E0}-\x{1F7EB}\x{1F90D}-\x{1F93A}\x{1F93C}-\x{1F945}\x{1F947}-\x{1F971}\x{1F973}-\x{1F976}\x{1F97A}-\x{1F9A2}\x{1F9A5}-\x{1F9AA}\x{1F9AE}-\x{1F9CA}\x{1F9CD}-\x{1F9FF}\x{1FA70}-\x{1FA73}\x{1FA78}-\x{1FA7A}\x{1FA80}-\x{1FA82}\x{1FA90}-\x{1FA95}]/u', '', $string);
}
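
Much of the length of this pattern comes from multi-code-point sequences: a base character followed by U+200D (zero width joiner), U+FE0F (variation selector-16), a second regional indicator, or tag characters. To see what a given emoji is made of, a small debugging sketch like the following may help. codePointsOf is a hypothetical helper, and it assumes the mbstring extension and PHP 7.2 or later for mb_ord().

```php
<?php
// Hypothetical debugging helper: list the code points that make up a string.
function codePointsOf(string $text): array
{
    // Split on the empty pattern in UTF-8 mode to get one entry per code point.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // input was not valid UTF-8
    }

    $labels = [];
    foreach ($chars as $char) {
        $labels[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
    }
    return $labels;
}

// "Woman technologist" is three code points joined by U+200D.
echo implode(' ', codePointsOf("\u{1F469}\u{200D}\u{1F4BB}")), PHP_EOL;
// prints: U+1F469 U+200D U+1F4BB
```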


Answer 3:

I updated the accepted answer with more code ranges; only a few emoji still get through.

public static function removeEmoji($text) {

    $clean_text = "";

    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    // Match Flags
    $regexFlags = '/[\x{1F1E6}-\x{1F1FF}]/u';
    $clean_text = preg_replace($regexFlags, '', $clean_text);

    // Others
    $regexOthers = '/[\x{1F910}-\x{1F95E}\x{1F980}-\x{1F991}\x{1F9C0}\x{1F9F9}]/u';
    $clean_text = preg_replace($regexOthers, '', $clean_text);

    return $clean_text;
}
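
Range-based removal like this deletes the visible pictographs but can leave invisible companions behind: the zero width joiner (U+200D), variation selector-16 (U+FE0F), the combining keycap (U+20E3), and the tag characters used in subdivision flags. One possible follow-up pass is sketched below. The name stripEmojiJoiners is hypothetical, and the choice of code points is my assumption rather than part of the answer. U+200D also has a legitimate role in some Indic scripts, so this may be too aggressive for multilingual text.

```php
<?php
// Hypothetical follow-up pass: remove invisible code points that emoji
// sequences leave behind once the pictographs themselves are gone.
function stripEmojiJoiners(string $text): string
{
    $pattern = '/['
        . '\x{200D}'            // zero width joiner
        . '\x{FE0F}'            // variation selector-16
        . '\x{20E3}'            // combining enclosing keycap
        . '\x{E0020}-\x{E007F}' // tag characters (subdivision flags)
        . ']/u';

    $clean = preg_replace($pattern, '', $text);

    // preg_replace() returns null on error; keep the input in that case.
    return $clean === null ? $text : $clean;
}

// A keycap sequence keeps its base digit: "1" + U+FE0F + U+20E3 becomes "1".
echo stripEmojiJoiners("1\u{FE0F}\u{20E3}"), PHP_EOL;
```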


Answer 4:

I developed a function that relies on PHP's UTF-8 to ISO-8859-1 conversion, which returns a ? character for every character it cannot convert.

function removeEmojis( $string ) {
    $string = str_replace( "?", "{%}", $string );
    $string  = mb_convert_encoding( $string, "ISO-8859-1", "UTF-8" );
    $string  = mb_convert_encoding( $string, "UTF-8", "ISO-8859-1" );
    $string  = str_replace( array( "?", "? ", " ?" ), array(""), $string );
    $string  = str_replace( "{%}", "?", $string );
    return trim( $string );
}

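Explanation:

  • Replace any "?" already in the string with the placeholder {%}

  • Convert the string from UTF-8 to ISO-8859-1

  • Convert it back to UTF-8 (the mb_ functions have replaced every unconvertible character with "?")

  • Replace each "?" with nothing

  • Restore the "?" characters that were in the original string

Make sure you are working with UTF-8.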
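
Note that this round trip drops every character that ISO-8859-1 cannot represent, so Cyrillic, CJK and similar text disappears along with the emoji. If that is acceptable, a variation is sketched below. It assumes the mbstring extension; stripNonLatin1 is a hypothetical name. Setting the substitute character to "none" makes the converter drop unconvertible characters outright, so the "?" placeholder juggling is not needed.

```php
<?php
// Hypothetical variation: drop everything outside ISO-8859-1 without
// going through "?" substitution.
function stripNonLatin1(string $text): string
{
    // Remember the current setting so it can be restored afterwards.
    $previous = mb_substitute_character();
    mb_substitute_character('none');

    try {
        $latin1 = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    } finally {
        mb_substitute_character($previous);
    }

    return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}

echo stripNonLatin1("caf\u{E9} \u{1F600}ok?"), PHP_EOL;
```

Unlike the original, this version does not trim the result or remove spaces next to the dropped characters.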



Answer 5:

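At my work we fought with emoji for a long time. We found several regular expressions for this problem, but none of them worked. This one does:

Edit: this does not cover every emoji. I'm still looking for the holy grail of emoji regexes, but I haven't found it.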

return preg_replace('/([0-9#][\x{20E3}])|[\x{00ae}\x{00a9}\x{203C}\x{2047}\x{2048}\x{2049}\x{3030}\x{303D}\x{2139}\x{2122}\x{3297}\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F6FF}][\x{FE00}-\x{FEFF}]?/u', '', $text);


Answer 6:

For emoji that are encoded in the Unicode Private Use Area, you can use preg_replace() to remove the encoded characters across the whole range U+E000 to U+F8FF:

function removeEmoji($string) {
    return preg_replace('/&#x(e[0-9a-f][0-9a-f][0-9a-f]|f[0-8][0-9a-f][0-9a-f]);?/i', '', $string);
}
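
The pattern above targets text in which the characters arrive as hexadecimal HTML entities such as &#xe415;. If the input is raw UTF-8 instead (an assumption about your data), the same range can be matched directly with the /u modifier. removePrivateUse is a hypothetical name, and the sketch covers only the Private Use Area, not the standard emoji blocks handled in the other answers.

```php
<?php
// Hypothetical counterpart for raw UTF-8 input: strip U+E000 to U+F8FF.
function removePrivateUse(string $text): string
{
    $clean = preg_replace('/[\x{E000}-\x{F8FF}]/u', '', $text);

    // preg_replace() returns null on error; keep the input in that case.
    return $clean === null ? $text : $clean;
}

echo removePrivateUse("a\u{E415}b"), PHP_EOL;
```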


Answer 7:

function emojiFilter($text) {
    // json_encode() turns every non-ASCII character into a \uXXXX escape
    $text = json_encode($text);
    preg_match_all("/(\\\\ud83c\\\\u[0-9a-f]{4})|(\\\\ud83d\\\\u[0-9a-f]{4})|(\\\\u[0-9a-f]{4})/", $text, $matchs);
    if (!isset($matchs[0][0])) {
        return json_decode($text, true);
    }

    $emoji = $matchs[0];
    foreach ($emoji as $ec) {
        $hex = substr($ec, -4);
        if (strlen($ec) == 6) {
            // single escape: drop U+2600 to U+27FF
            if ($hex >= '2600' and $hex <= '27ff') {
                $text = str_replace($ec, '', $text);
            }
        } else {
            // surrogate pair: the last four digits are the low surrogate
            if ($hex >= 'dc00' and $hex <= 'dfff') {
                $text = str_replace($ec, '', $text);
            }
        }
    }

    return json_decode($text, true);
}
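
This function relies on how json_encode() escapes non-ASCII text by default: each character becomes \uXXXX, and a code point above U+FFFF becomes a UTF-16 surrogate pair whose high half lies in d800 to dbff and whose low half lies in dc00 to dfff. The arithmetic can be sketched as follows; surrogatePair is a hypothetical name, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical helper: compute the two \uXXXX escapes that json_encode()
// emits for a code point above U+FFFF.
function surrogatePair(int $codePoint): array
{
    if ($codePoint < 0x10000 || $codePoint > 0x10FFFF) {
        throw new InvalidArgumentException('Only code points above U+FFFF have surrogate pairs.');
    }

    $offset = $codePoint - 0x10000;
    $high = 0xD800 + ($offset >> 10);   // top 10 bits
    $low  = 0xDC00 + ($offset & 0x3FF); // bottom 10 bits

    return [sprintf('\u%04x', $high), sprintf('\u%04x', $low)];
}

echo implode('', surrogatePair(0x1F600)), PHP_EOL; // prints: \ud83d\ude00
echo json_encode("\u{1F600}"), PHP_EOL;            // prints: "\ud83d\ude00"
```

Emoji from U+1F900 upward encode with the high surrogate d83e, which the pattern's first two alternatives do not list, so those characters are left in place.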


Answer 8:

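I solved this problem with the same code WordPress uses to replace emoji with images.

Here is the code I used. It works perfectly because it contains a complete list of the most commonly used emoji.

The full code is available here: https://pastebin.com/8MqGdD6p

Here is how it works, but be sure to copy the code from Pastebin, because what follows is not the complete code: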

$content ='<span class="do">⚫</span> where emojis exist';
$partials = array('&#x1f469;&#x200d;'); // the list of emojis (abbreviated here)

foreach ( $partials as $emojum ) {
    if ( version_compare( phpversion(), '5.4', '<' ) ) {
        $emoji_char = html_entity_decode( $emojum, ENT_COMPAT, 'UTF-8' );
    } else {
        $emoji_char = html_entity_decode( $emojum );
    }
    if ( false !== strpos( $content, $emoji_char ) ) {
        $content = preg_replace( "/$emoji_char/", '', $content );
    }
}
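
Because each decoded entry is a literal string, the removal step does not strictly need a regular expression. The sketch below decodes the entities with an explicit charset and hands the whole list to str_replace(). stripListedEmoji is a hypothetical name, and the entity list passed in is a stand-in for the full list on Pastebin.

```php
<?php
// Hypothetical helper: decode a list of HTML entities to UTF-8 strings
// and remove every occurrence from the content.
function stripListedEmoji(string $content, array $entities): string
{
    $chars = [];
    foreach ($entities as $entity) {
        // Passing the charset explicitly avoids the PHP version check.
        $chars[] = html_entity_decode($entity, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    }

    return str_replace($chars, '', $content);
}

$content = "<span class=\"do\">\u{26AB}</span> where emojis exist";
echo stripListedEmoji($content, ['&#x26ab;']), PHP_EOL;
```

Using str_replace() also sidesteps the question of escaping regex metacharacters in the decoded strings.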


Answer 9:

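Since @sglessard's code is outdated, here is a complete list of all emoji as of 2018-07-12. You can regenerate it by running the source code I posted below.

Please let me know if you find any problems, thanks.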

public static function removeEmoji($text) {
    $regexEmoticons = [
        '/[\x{0023}]/u',
        '/[\x{002A}]/u',
        '/[\x{00A9}]/u',
        '/[\x{00AE}]/u',
        '/[\x{200D}]/u',
        '/[\x{203C}]/u',
        '/[\x{2049}]/u',
        '/[\x{20E3}]/u',
        '/[\x{2122}]/u',
        '/[\x{2139}]/u',
        '/[\x{2194}-\x{2199}]/u',
        '/[\x{21A9}-\x{21AA}]/u',
        '/[\x{231A}-\x{231B}]/u',
        '/[\x{2328}]/u',
        '/[\x{23CF}]/u',
        '/[\x{23E9}-\x{23F3}]/u',
        '/[\x{23F8}-\x{23FA}]/u',
        '/[\x{24C2}]/u',
        '/[\x{25AA}-\x{25AB}]/u',
        '/[\x{25B6}]/u',
        '/[\x{25C0}]/u',
        '/[\x{25FB}-\x{25FE}]/u',
        '/[\x{2600}-\x{2604}]/u',
        '/[\x{260E}]/u',
        '/[\x{2611}]/u',
        '/[\x{2614}-\x{2615}]/u',
        '/[\x{2618}]/u',
        '/[\x{261D}]/u',
        '/[\x{2620}]/u',
        '/[\x{2622}-\x{2623}]/u',
        '/[\x{2626}]/u',
        '/[\x{262A}]/u',
        '/[\x{262E}-\x{262F}]/u',
        '/[\x{2638}-\x{263A}]/u',
        '/[\x{2640}]/u',
        '/[\x{2642}]/u',
        '/[\x{2648}-\x{2653}]/u',
        '/[\x{265F}-\x{2660}]/u',
        '/[\x{2663}]/u',
        '/[\x{2665}-\x{2666}]/u',
        '/[\x{2668}]/u',
        '/[\x{267B}]/u',
        '/[\x{267E}-\x{267F}]/u',
        '/[\x{2692}-\x{2697}]/u',
        '/[\x{2699}]/u',
        '/[\x{269B}-\x{269C}]/u',
        '/[\x{26A0}-\x{26A1}]/u',
        '/[\x{26AA}-\x{26AB}]/u',
        '/[\x{26B0}-\x{26B1}]/u',
        '/[\x{26BD}-\x{26BE}]/u',
        '/[\x{26C4}-\x{26C5}]/u',
        '/[\x{26C8}]/u',
        '/[\x{26CE}-\x{26CF}]/u',
        '/[\x{26D1}]/u',
        '/[\x{26D3}-\x{26D4}]/u',
        '/[\x{26E9}-\x{26EA}]/u',
        '/[\x{26F0}-\x{26F5}]/u',
        '/[\x{26F7}-\x{26FA}]/u',
        '/[\x{26FD}]/u',
        '/[\x{2702}]/u',
        '/[\x{2705}]/u',
        '/[\x{2708}-\x{270D}]/u',
        '/[\x{270F}]/u',
        '/[\x{2712}]/u',
        '/[\x{2714}]/u',
        '/[\x{2716}]/u',
        '/[\x{271D}]/u',
        '/[\x{2721}]/u',
        '/[\x{2728}]/u',
        '/[\x{2733}-\x{2734}]/u',
        '/[\x{2744}]/u',
        '/[\x{2747}]/u',
        '/[\x{274C}]/u',
        '/[\x{274E}]/u',
        '/[\x{2753}-\x{2755}]/u',
        '/[\x{2757}]/u',
        '/[\x{2763}-\x{2764}]/u',
        '/[\x{2795}-\x{2797}]/u',
        '/[\x{27A1}]/u',
        '/[\x{27B0}]/u',
        '/[\x{27BF}]/u',
        '/[\x{2934}-\x{2935}]/u',
        '/[\x{2B05}-\x{2B07}]/u',
        '/[\x{2B1B}-\x{2B1C}]/u',
        '/[\x{2B50}]/u',
        '/[\x{2B55}]/u',
        '/[\x{3030}]/u',
        '/[\x{303D}]/u',
        '/[\x{3297}]/u',
        '/[\x{3299}]/u',
        '/[\x{FE0F}]/u',
        '/[\x{1F004}]/u',
        '/[\x{1F0CF}]/u',
        '/[\x{1F170}-\x{1F171}]/u',
        '/[\x{1F17E}-\x{1F17F}]/u',
        '/[\x{1F18E}]/u',
        '/[\x{1F191}-\x{1F19A}]/u',
        '/[\x{1F1E6}-\x{1F1FF}]/u',
        '/[\x{1F201}-\x{1F202}]/u',
        '/[\x{1F21A}]/u',
        '/[\x{1F22F}]/u',
        '/[\x{1F232}-\x{1F23A}]/u',
        '/[\x{1F250}-\x{1F251}]/u',
        '/[\x{1F300}-\x{1F321}]/u',
        '/[\x{1F324}-\x{1F393}]/u',
        '/[\x{1F396}-\x{1F397}]/u',
        '/[\x{1F399}-\x{1F39B}]/u',
        '/[\x{1F39E}-\x{1F3F0}]/u',
        '/[\x{1F3F3}-\x{1F3F5}]/u',
        '/[\x{1F3F7}-\x{1F3FA}]/u',
        '/[\x{1F400}-\x{1F4FD}]/u',
        '/[\x{1F4FF}-\x{1F53D}]/u',
        '/[\x{1F549}-\x{1F54E}]/u',
        '/[\x{1F550}-\x{1F567}]/u',
        '/[\x{1F56F}-\x{1F570}]/u',
        '/[\x{1F573}-\x{1F57A}]/u',
        '/[\x{1F587}]/u',
        '/[\x{1F58A}-\x{1F58D}]/u',
        '/[\x{1F590}]/u',
        '/[\x{1F595}-\x{1F596}]/u',
        '/[\x{1F5A4}-\x{1F5A5}]/u',
        '/[\x{1F5A8}]/u',
        '/[\x{1F5B1}-\x{1F5B2}]/u',
        '/[\x{1F5BC}]/u',
        '/[\x{1F5C2}-\x{1F5C4}]/u',
        '/[\x{1F5D1}-\x{1F5D3}]/u',
        '/[\x{1F5DC}-\x{1F5DE}]/u',
        '/[\x{1F5E1}]/u',
        '/[\x{1F5E3}]/u',
        '/[\x{1F5E8}]/u',
        '/[\x{1F5EF}]/u',
        '/[\x{1F5F3}]/u',
        '/[\x{1F5FA}-\x{1F64F}]/u',
        '/[\x{1F680}-\x{1F6C5}]/u',
        '/[\x{1F6CB}-\x{1F6D2}]/u',
        '/[\x{1F6E0}-\x{1F6E5}]/u',
        '/[\x{1F6E9}]/u',
        '/[\x{1F6EB}-\x{1F6EC}]/u',
        '/[\x{1F6F0}]/u',
        '/[\x{1F6F3}-\x{1F6F9}]/u',
        '/[\x{1F910}-\x{1F93A}]/u',
        '/[\x{1F93C}-\x{1F93E}]/u',
        '/[\x{1F940}-\x{1F945}]/u',
        '/[\x{1F947}-\x{1F970}]/u',
        '/[\x{1F973}-\x{1F976}]/u',
        '/[\x{1F97A}]/u',
        '/[\x{1F97C}-\x{1F9A2}]/u',
        '/[\x{1F9B0}-\x{1F9B9}]/u',
        '/[\x{1F9C0}-\x{1F9C2}]/u',
        '/[\x{1F9D0}-\x{1F9FF}]/u',
        '/[\x{E0062}-\x{E0063}]/u',
        '/[\x{E006C}]/u',
        '/[\x{E006E}]/u',
        '/[\x{E007F}]/u'
    ];

    return preg_replace($regexEmoticons, '', $text);
}

And here is the code that generates it:

<?php

$emojisAsHex = [];
$emojisasAsDecHex = [];

preg_match_all(
    "/(?:>|\s)+(U\+)(?'emojis'[0-9ABCDEF]{4,5})(?:<|\s)+/",
    file_get_contents('http://unicode.org/emoji/charts/full-emoji-list.html'),
    $emojisAsHex
);

//flip it, to remove duplication
$emojisAsHex = array_flip(array_flip($emojisAsHex['emojis']));


foreach ($emojisAsHex as $emojiAsHex) {
    $emojisasAsDecHex[hexdec($emojiAsHex)] = $emojiAsHex;
}

ksort($emojisasAsDecHex);




$outputHexa = '';
$else = '';

$startI = key($emojisasAsDecHex);
$endI =max(array_keys($emojisasAsDecHex)) + 1;

for ($i = $startI; $i < $endI; $i++) {
    if (isset($emojisasAsDecHex[$i]) && isset($emojisasAsDecHex[(1 + $i)])) {

        $outputHexa .=  "'/[\x{" . $emojisasAsDecHex[$i] . '}';
        while (isset($emojisasAsDecHex[(1 + $i)])) {

            $i++;
        }

        $outputHexa .=  '-\x{' . $emojisasAsDecHex[$i] . "}]/u'," . PHP_EOL;
    } else if (isset($emojisasAsDecHex[$i])) {
        $outputHexa .= "'/[\x{" . $emojisasAsDecHex[$i] . "}]/u'," . PHP_EOL;
    }
}


var_dump($outputHexa);
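
The core of the generator is collapsing a sorted set of code points into contiguous ranges. That step can be tried without downloading the chart; the sketch below works on an in-memory list and emits a single character class instead of one pattern per range. The function names collapseToRanges and rangesToPattern are hypothetical, and PHP 7.1 or later is assumed.

```php
<?php
// Hypothetical helper: turn a list of integer code points into
// [start, end] pairs covering each run of consecutive values.
function collapseToRanges(array $codePoints): array
{
    $codePoints = array_unique($codePoints);
    sort($codePoints);

    $ranges = [];
    foreach ($codePoints as $cp) {
        $last = count($ranges) - 1;
        if ($last >= 0 && $ranges[$last][1] === $cp - 1) {
            $ranges[$last][1] = $cp;   // extend the current run
        } else {
            $ranges[] = [$cp, $cp];    // start a new run
        }
    }
    return $ranges;
}

// Hypothetical helper: build one /u pattern from the ranges.
function rangesToPattern(array $ranges): string
{
    $class = '';
    foreach ($ranges as list($start, $end)) {
        $class .= sprintf('\x{%04X}', $start);
        if ($end > $start) {
            $class .= sprintf('-\x{%04X}', $end);
        }
    }
    return '/[' . $class . ']/u';
}

$pattern = rangesToPattern(collapseToRanges([0x2195, 0x23, 0x2194, 0x2196, 0x2328]));
echo $pattern, PHP_EOL; // prints: /[\x{0023}\x{2194}-\x{2196}\x{2328}]/u
```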


Answer 10:

You can simply use str_replace():

$emojiArray = array("&0123", "&0234" /* etc. for all emoji */);
$strippedComment = str_replace($emojiArray,"",$originalComment);
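
One detail matters with this approach: str_replace() works through the search array in order, so longer sequences should come before their own prefixes. A minimal sketch with literal code points follows, assuming PHP 7 or later for the "\u{...}" escapes. The three entries are illustrative only, and stripKnownEmoji is a hypothetical name.

```php
<?php
// Illustrative list only; a real list would hold every emoji to remove.
$emojiList = [
    "\u{2764}\u{FE0F}", // red heart with variation selector, listed first
    "\u{2764}",         // bare heart
    "\u{1F600}",        // grinning face
];

// Hypothetical helper wrapping the str_replace() call.
function stripKnownEmoji(string $comment, array $emojiList): string
{
    return str_replace($emojiList, '', $comment);
}

echo stripKnownEmoji("I \u{2764}\u{FE0F} PHP \u{1F600}", $emojiList), PHP_EOL;
```

If the bare heart came first, the variation selector U+FE0F would be left behind as an invisible character.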


Source: PHP : writing a simple removeEmoji function