Can't remove special characters with str_repla

Posted 2019-01-15 21:21

Question:

I have a very trivial problem with str_replace.

I have a string with the En Dash character ( - ) like this:

I want to remove - the dash

The html output is

I want to remove the &ndash; the dash

I want to do this:

$new_string = str_replace ('-','',$string);

I've tried decoding the string with html_entity_decode and encoding the character to remove with htmlspecialchars, but without any result.

What am I doing wrong?

-EDIT- This is the full code of my script:

$title = 'Super Mario Galaxy 2 - Debut Trailer'; // Fetched from the DB, in the DB the character is - (minus) not –

$new_title = str_replace(' - ', '', $title);
$new_title = str_replace(" - ", '', $title);
$new_title = str_replace(html_entity_decode('&ndash;'),'',$title);

None of them work. Basically, the problem is that in the DB the dashes are stored as "minus" (I enter the value with the minus key), but for some strange reason the output is &ndash;

I'm running on WordPress and the charset is UTF-8, the same as the DB collation.

Answer 1:

Try something like this:

str_replace(html_entity_decode('&ndash;', ENT_COMPAT, 'UTF-8'), '', $string);

My guess is it's not really an ndash, but a very similar character. I'd suggest pulling the byte values of each character in the string to see what it looks like:

function decodeString($str) {
    //Fix for mb overloading strlen option
    if (function_exists('mb_strlen')) { 
        $len = mb_strlen($str, '8bit');
    } else {
        $len = strlen($str);
    }
    $ret = '';
    for ($i = 0; $i < $len; $i++) {
        $ret .= sprintf('%02X ', ord($str[$i])); // two uppercase hex digits per byte
    }
    return trim($ret);
}

That converts the string into its individual byte values, giving a hex string such as 48 65 6C 6C 6F for "Hello". Check whether the dash is in fact the same character in both cases. If you see 2D where the dash is, that's a literal minus sign. If you see the three-byte sequence E2 80 93, that's the UTF-8 encoding of the en dash (&ndash;). Anything else means a different character.

EDIT: And if you see 26 6E 64 61 73 68 3B, that means a literal &ndash; entity, so you'd need to do str_replace('&ndash;', '', $str);
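
Once the byte dump shows which form is present, the matching needle can be passed to str_replace. As a rough sketch (the function name removeDashVariants is hypothetical and not from the original answer, and it assumes the string is UTF-8), all of the forms discussed here can be removed in one call by passing an array of needles:

```php
<?php
// Hypothetical helper, not part of the original answer: removes a dash
// in each of the representations discussed above. Assumes UTF-8 input.
function removeDashVariants(string $str): string
{
    $needles = [
        "\xE2\x80\x93", // UTF-8 bytes of the en dash (E2 80 93)
        '&ndash;',      // named HTML entity stored as literal text
        '&#8211;',      // numeric HTML entity for the same character
        '-',            // plain hyphen-minus (2D)
    ];

    // str_replace accepts an array of search strings and replaces each one.
    return str_replace($needles, '', $str);
}

echo removeDashVariants('Super Mario Galaxy 2 &ndash; Debut Trailer'), "\n";
```

Note that the surrounding spaces are left in place, so the result contains two consecutive spaces where the dash used to be.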



Answer 2:

I managed to do this by calling remove_filter( 'the_title', 'wptexturize' ); in functions.php. After that you can perform a str_replace (or whatever else) on the "-" sign.



Answer 3:

There's &ndash; (–) and there's the minus sign (-). Make sure you are not trying to replace the wrong character.



Answer 4:

I tried everything and nothing worked, but in the end, with the help of http://www.ascii.cl/htmlcodes.htm, this code worked for me:

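        // Low nibble (0-F) and high nibble (B-F): together they cover
        // every single byte value from 0xB0 to 0xFF.
        $arr1 = explode(",","0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F");
        $arr2 = explode(",","B,C,D,E,F");

        // Remove each of those bytes from $desc, one value at a time.
        foreach($arr2 as $t1){
            foreach($arr1 as $t2){
                $val = $t1.$t2;
                $desc = str_replace(chr(hexdec($val)),"",$desc);
            }
        }

        // To remove one specific byte value instead:
        $desc = str_replace(chr(hexdec('A2')),"",$desc);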


Answer 5:

Try this:

$new_string = str_replace('&ndash;','',$string);

Or:

$new_string = str_replace(html_entity_decode('&ndash;'),'',$string);

It is basically the same as:

$new_string = str_replace ('-','',$string);


Answer 6:

This was my solution for an invalid ndash:

$string = str_replace(chr(hexdec('3f')), '-', $string);


Answer 7:

Only this solution worked for me:

$string = str_replace("\x96", "-", $string);
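
The single byte 0x96 is the en dash in Windows-1252, which suggests the string in this case was not UTF-8. A minimal sketch of an alternative, assuming the input really is Windows-1252 and that the mbstring extension is installed (the sample title is borrowed from the question), is to convert to UTF-8 first and then replace the resulting three-byte sequence:

```php
<?php
// Assumption: $raw is Windows-1252 text, where the single byte 0x96 is the en dash.
$raw = "Super Mario Galaxy 2 \x96 Debut Trailer";

// Convert to UTF-8 (requires the mbstring extension).
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

// The en dash is now the three-byte UTF-8 sequence E2 80 93.
$clean = str_replace("\xE2\x80\x93", '-', $utf8);

echo $clean, "\n";
```

Converting first keeps the rest of the pipeline in one encoding, so later str_replace calls can all use the UTF-8 byte sequence.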


Answer 8:

To anyone who has tried all of the above and still has no joy: this worked for me (on the output of WordPress's get_the_title() function).

$new_string = str_replace('&#8211;', 'or', $string);
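
&#8211; is the numeric entity for the same en dash character as &ndash;. One possible way to avoid guessing which entity form is present, sketched here with a hypothetical helper name (stripEnDash) and assuming UTF-8 text, is to decode all entities first and then remove the decoded character:

```php
<?php
// Hypothetical helper: decode HTML entities, then remove the en dash itself.
function stripEnDash(string $html): string
{
    // html_entity_decode handles both named (&ndash;) and numeric (&#8211;)
    // references; the third argument sets the output encoding.
    $decoded = html_entity_decode($html, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // "\xE2\x80\x93" is the UTF-8 byte sequence of the en dash.
    return str_replace("\xE2\x80\x93", '', $decoded);
}

echo stripEnDash('Super Mario Galaxy 2 &#8211; Debut Trailer'), "\n";
```

Be aware that this decodes every other entity in the string as well (for example &amp; becomes &), so it only fits when the result is re-escaped before being output as HTML.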