Regarding Java String Manipulation

Posted 2019-01-12 01:16

Question:

After the split command, items[1] holds the string "MO""RET". I then call replaceAll on it, which removes all the double quotes, but I want the value stored as MO"RET. In the CSV file I process with split, double quotes inside a text field are doubled (example: "This account is a ""large"" one"). So I want to keep one quote of each doubled pair in the middle of the string and drop the enclosing quotes if they are present. How can I do that?

String[] items = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
// items[1] now holds "MO""RET"
String recordType = items[1].replaceAll("\"", "");

After this, recordType holds MORET, but I want it to hold MO"RET.

Answer 1:

Don't use regex to split a CSV line. This is asking for trouble ;) Just parse it character-by-character. Here's an example:

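import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public static List<List<String>> parseCsv(InputStream input, char separator) throws IOException {
    List<List<String>> csv = new ArrayList<List<String>>();
    // Quote the separator so that characters such as '|' are not read as regex syntax.
    String trailingSeparator = Pattern.quote(String.valueOf(separator)) + "$";
    // try-with-resources closes the reader even if reading fails.
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(input, StandardCharsets.UTF_8))) {
        // Records are read line by line, so quoted fields containing line breaks are not supported.
        for (String record; (record = reader.readLine()) != null;) {
            boolean quoted = false; // true while inside a quoted field
            StringBuilder fieldBuilder = new StringBuilder();
            List<String> fields = new ArrayList<String>();
            for (int i = 0; i < record.length(); i++) {
                char c = record.charAt(i);
                fieldBuilder.append(c);
                if (c == '"') {
                    quoted = !quoted;
                }
                // A field ends at an unquoted separator or at the end of the record.
                if ((!quoted && c == separator) || i + 1 == record.length()) {
                    fields.add(fieldBuilder.toString().replaceAll(trailingSeparator, "")
                        .replaceAll("^\"|\"$", "").replace("\"\"", "\"").trim());
                    fieldBuilder = new StringBuilder();
                }
                // A trailing separator means the record ends with an empty field.
                if (c == separator && i + 1 == record.length()) {
                    fields.add("");
                }
            }
            csv.add(fields);
        }
    }
    return csv;
}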

Yes, there's a little regex involved, but it only trims off the trailing separator and the surrounding quotes of a single field.

You can however also grab any 3rd party Java CSV API.
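
For comparison with the parser above, here is a rough sketch of the same character-by-character idea with no regex at all: doubled quotes are unescaped while scanning instead of afterwards. The class name CsvLineSketch and the method parseLine are made up for illustration, and the sketch handles a single record only (no quoted fields spanning several lines).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class CsvLineSketch {

    /** Splits one CSV record into fields, unescaping doubled quotes as it goes. */
    static List<String> parseLine(String record, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false; // true while inside a quoted field
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (quoted) {
                if (c == '"') {
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        field.append('"'); // "" inside quotes is one literal quote
                        i++;               // skip the second quote of the pair
                    } else {
                        quoted = false;    // closing quote, not part of the value
                    }
                } else {
                    field.append(c);       // separators inside quotes are plain text
                }
            } else if (c == '"') {
                quoted = true;             // opening quote, not part of the value
            } else if (c == separator) {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString()); // last field (empty if the record ends with a separator)
        return fields;
    }

    public static void main(String[] args) {
        String line = "A,\"MO\"\"RET\",\"This account is a \"\"large\"\" one\"";
        System.out.println(parseLine(line, ','));
        // prints [A, MO"RET, This account is a "large" one]
    }
}
```

Unlike parseCsv, this version does not trim whitespace, so what you get back is exactly what was between the separators.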



Answer 2:

How about:

String recordType = items[1].replaceAll( "\"\"", "\"" );
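
One thing to watch: applied to the raw field "MO""RET", the line above collapses the doubled quote but leaves the enclosing quotes in place, giving "MO"RET". A small sketch of doing both steps, assuming the field is either fully enclosed in quotes or not quoted at all (UnquoteSketch and unquoteField are hypothetical names, not from the question):

```java
// Hypothetical helper, not part of the original answer.
class UnquoteSketch {

    /** Drops one pair of enclosing quotes, if present, then turns "" into ". */
    static String unquoteField(String raw) {
        String s = raw;
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1); // strip the enclosing quotes first
        }
        return s.replace("\"\"", "\"");         // literal replace, no regex needed
    }

    public static void main(String[] args) {
        System.out.println(unquoteField("\"MO\"\"RET\"")); // prints MO"RET
        System.out.println(unquoteField("MORET"));         // prints MORET
    }
}
```

The order matters: stripping the outer quotes before collapsing the pairs keeps a field such as """" (a single escaped quote) from being mangled.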


Answer 3:

I'd suggest using replace instead of replaceAll, because replaceAll treats its first argument as a regular expression.

The requirement is to replace two consecutive quotes with one quote:

String recordType = items[1].replace( "\"\"", "\"" );

To see the difference between replace and replaceAll, run the code below:

recordType = items[1].replace( "$$", "$" );
recordType = items[1].replaceAll( "$$", "$" );
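
To make that comparison something you can run as-is, here is a minimal sketch (the class and method names are invented for the demo). As far as I know, on current JDKs the replaceAll call fails with an IllegalArgumentException, because "$$" is read as two end-of-input anchors and a lone "$" in the replacement is an incomplete group reference. The sketch catches RuntimeException in case older runtimes throw something else.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class ReplaceVsReplaceAll {

    // Literal replacement: "$$" and "$" are taken as plain text.
    static String literal(String s) {
        return s.replace("$$", "$");
    }

    // Regex replacement with the same arguments: reports whether the call throws.
    static boolean regexVersionThrows(String s) {
        try {
            s.replaceAll("$$", "$");
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    // replaceAll behaves literally once both arguments are escaped.
    static String regexEscaped(String s) {
        return s.replaceAll(Pattern.quote("$$"), Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        System.out.println(literal("a$$b"));            // prints a$b
        System.out.println(regexVersionThrows("a$$b")); // prints true
        System.out.println(regexEscaped("a$$b"));       // prints a$b
    }
}
```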


Answer 4:

Here you can use a regular expression:

recordType = items[1].replaceAll( "\\B\"", "" ); 
recordType = recordType.replaceAll( "\"\\B", "" ); 

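The first statement removes every quote that is not directly preceded by a word character; for "MO""RET" that is the opening quote and the second quote of the doubled pair. The second statement removes every remaining quote that is not directly followed by a word character, which here is the closing quote.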
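
A quick way to check how far this carries is to run both statements against the two sample fields from the question. In the sketch below (class and method names are made up), MO"RET comes out as wanted, but the "large" example loses all of its quotes, because there the doubled quotes sit next to spaces instead of word characters. So this approach seems safe only when every inner quote is surrounded by letters, digits or underscores.

```java
// Hypothetical demo class, not part of the original answer.
class NonWordBoundaryDemo {

    // The two statements from the answer, applied in order.
    static String stripWithBoundaries(String field) {
        String s = field.replaceAll("\\B\"", ""); // quotes not preceded by a word character
        return s.replaceAll("\"\\B", "");         // quotes not followed by a word character
    }

    public static void main(String[] args) {
        System.out.println(stripWithBoundaries("\"MO\"\"RET\""));
        // prints MO"RET
        System.out.println(stripWithBoundaries("\"This account is a \"\"large\"\" one\""));
        // prints This account is a large one
    }
}
```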