Replace with empty string replaces newChar around

Posted 2019-02-25 23:04

Question:

I was working on some Java code that uses the String.replace method. While testing it, I planned to pass a junk value in one situation: String.replace("","");

During testing I ended up replacing the empty string with some other value, i.e. String.replace("","p"), which inserted "p" around every character of the original String.

Example:

String strSample = "val";
strSample = strSample.replace("","p");
System.out.println(strSample);

Output:

pvpaplp

Can anyone please explain why it works like this?

Answer 1:

replace looks for every position in your String at which the string being replaced starts. E.g. if you replace "a" in "banana", it finds "a" 3 times.

However, the empty string is found everywhere: before the first letter, between every pair of letters, and after the last letter.
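
The scan described in this answer can be sketched roughly as follows. This is only an illustration built on String.indexOf, not the actual JDK implementation of replace; the class name EmptyMatchPositions and the helper matchPositions are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyMatchPositions {

    // Hypothetical helper: collects every index at which `target` is found in `text`,
    // scanning left to right without overlapping matches.
    static List<Integer> matchPositions(String text, String target) {
        List<Integer> positions = new ArrayList<>();
        int from = 0;
        while (from <= text.length()) {
            int idx = text.indexOf(target, from);
            if (idx < 0) {
                break; // no further match
            }
            positions.add(idx);
            // Move past the match; an empty match still has to advance by one char,
            // otherwise the loop would never terminate.
            from = idx + Math.max(target.length(), 1);
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(matchPositions("banana", "a")); // [1, 3, 5]
        System.out.println(matchPositions("val", ""));     // [0, 1, 2, 3]
    }
}
```

For "val" the empty target is found at four positions, which is why four "p" characters appear in pvpaplp.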



Answer 2:

Below is the definition from the Java docs for the overload of replace used in your case.

String java.lang.String.replace(CharSequence target, CharSequence replacement)

Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence. The replacement proceeds from the beginning of the string to the end, for example, replacing "aa" with "b" in the string "aaa" will result in "ba" rather than "ab".

Parameters:
target The sequence of char values to be replaced
replacement The replacement sequence of char values

Now, since you pass "" (i.e. empty) as the target value, it matches at every location in the string, and the value given as the replacement is inserted at each one.

It is worth noting that if you use strSample = strSample.replace(" ","p"); i.e. a single whitespace character as the target value, nothing will be replaced, because the replace method now searches for a whitespace character and "val" does not contain one.
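
A small side-by-side run may make the difference between these targets easier to see. The class name ReplaceTargetDemo and the wrapper replaced are hypothetical; the wrapper only forwards to String.replace.

```java
public class ReplaceTargetDemo {

    // Hypothetical thin wrapper so every case below reads the same way.
    static String replaced(String text, String target, String replacement) {
        return text.replace(target, replacement);
    }

    public static void main(String[] args) {
        // Left-to-right, non-overlapping matching, as in the Javadoc example.
        System.out.println(replaced("aaa", "aa", "b"));  // ba
        // A single space is a real character: "val" has none, so nothing changes.
        System.out.println(replaced("val", " ", "p"));   // val
        // With spaces present, only those spaces are replaced.
        System.out.println(replaced("v a l", " ", "p")); // vpapl
        // The empty target matches at every position.
        System.out.println(replaced("val", "", "p"));    // pvpaplp
    }
}
```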



Answer 3:

The native Java java.lang.String implementation (like Ruby and Python) considers the empty string "" a valid character sequence when performing string operations. Therefore the "" character sequence is effectively everywhere: before the first character, between every two characters, and after the last character.

It works coherently with all java.lang.String operations. See :

String abc = "abc";
System.out.println(abc.replace("", "a")); // aaabaca instead of "abc"
System.out.println(abc.indexOf(""));      // 0 instead of -1
System.out.println(abc.contains(""));     // true instead of false

As a side note :

This behavior might be misleading because many other languages / implementations do not behave like this. For instance, SQL (MySQL, MSSQL, Oracle and PostgreSQL) and PHP do not consider "" a valid character sequence for string replacement. .NET goes further and throws System.ArgumentException: String cannot be of zero length. when calling, for instance, abc.Replace("", "a").

Even the popular Apache Commons Lang Java library works differently :

System.out.println(org.apache.commons.lang3.StringUtils.replace("abc", "", "a")); // abc
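
If you want the "empty target changes nothing" behavior without adding a dependency, a guard along these lines is one possible approach. The class SafeReplace and the method replaceIfNonEmpty are hypothetical names for this sketch; it is not the Commons Lang implementation, only a wrapper around String.replace.

```java
public class SafeReplace {

    // Hypothetical helper: returns the input unchanged when there is nothing
    // meaningful to search for, otherwise delegates to String.replace.
    static String replaceIfNonEmpty(String text, String target, String replacement) {
        if (text == null || target == null || target.isEmpty() || replacement == null) {
            return text;
        }
        return text.replace(target, replacement);
    }

    public static void main(String[] args) {
        System.out.println(replaceIfNonEmpty("abc", "", "a"));     // abc
        System.out.println(replaceIfNonEmpty("banana", "a", "o")); // bonono
    }
}
```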


Answer 4:

Take a look at this example:

"" + "abc" + ""

What is the result of this code?
Answer: it is still "abc". So, as you can see, we can say that every string has an empty string before and after it.

The same rule applies in between characters, e.g.

"a"+""+"b"+""+"c"

will still create "abc"

So empty strings also exist between characters.

In your code

"val".replace("","p")

all these empty strings were replaced with p, which results in pvpaplp.


As for a run of adjacent empty strings such as ""+""+..+""+"", assume that Java is smart enough to see it as one "", so each position is matched only once.
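
One way to check this picture is to count: a string of n chars has n + 1 such positions (before, between, after), so the result should have n + (n + 1) * r chars for a replacement of length r. The sketch below tests that arithmetic; the class EmptyReplaceLength and the helper expectedLength are made-up names, and the inputs are plain ASCII strings.

```java
public class EmptyReplaceLength {

    // Hypothetical helper: predicted length of text.replace("", replacement),
    // assuming exactly one empty match at each of the text.length() + 1 positions.
    static int expectedLength(String text, String replacement) {
        return text.length() + (text.length() + 1) * replacement.length();
    }

    public static void main(String[] args) {
        String result = "val".replace("", "p");
        System.out.println(result);                                        // pvpaplp
        System.out.println(result.length() == expectedLength("val", "p")); // true
        // Even an empty string has one position, so one replacement is inserted.
        System.out.println("".replace("", "xy"));                          // xy
    }
}
```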