Html.fromHtml Line Breaks Disappearing

Posted 2019-02-09 06:02

Question:

I am taking Spanned text from an EditText and converting it to an HTML-tagged string using Html.toHtml. This works fine: I have verified that the string is correct and contains a <br> in the appropriate location. However, when I convert the tagged string back to spanned text with Html.fromHtml to populate a TextView or EditText, the <br> (or several, if more are present) at the end of the first paragraph disappears. This means that if a user entered text with multiple line breaks and wanted to keep that formatting, it gets lost.

I attached a picture to help illustrate this. The first EditText is the user input, the TextView below it shows the Html.toHtml result for that EditText, and the EditText below that is populated using Html.fromHtml with the string from the TextView above it. As you can see, the line breaks have disappeared and so have the extra lines. Furthermore, when the spanned text of the second EditText is run through Html.toHtml, it now produces a different HTML-tagged string.

I would like to be able to take the HTML tagged String from the first EditText and populate other TextViews or EditTexts without losing line breaks and formatting.

Thanks

Answer 1:

I also had this problem and could not find a simple built-in conversion for it. One important point: when the user presses Enter, Java produces the newline character \n, but HTML does not treat that character as a line break. It uses the <br /> tag instead.

So what I did was replace those specific character sequences in the plain text with their HTML equivalents. In my case only the newline character was involved, so it was not that messy.
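
The answer does not include its code, so the following is only a minimal sketch of the idea, assuming the input is plain text that uses \n for line breaks. The class name PlainTextToHtml and the method convert are hypothetical, and escaping &, < and > is an addition not mentioned in the answer (without it, user-typed angle brackets would be read as tags).

```java
// Hypothetical helper, not from the original answer.
final class PlainTextToHtml {
    /** Returns plain text as HTML: escapes &, <, > and turns each '\n' into "<br />". */
    static String convert(String plain) {
        StringBuilder out = new StringBuilder(plain.length() + 16);
        for (int i = 0; i < plain.length(); i++) {
            char c = plain.charAt(i);
            switch (c) {
                case '\n': out.append("<br />"); break; // one tag per newline, so runs of breaks survive
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, PlainTextToHtml.convert("a\n\nb") returns "a<br /><br />b". Any span formatting (bold, italic and so on) is not handled here; this only covers the plain-text case the answer describes.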



Answer 2:

I had a similar problem when trying to save and restore EditText content to a database. The problem is in Html.toHtml, which somehow skips line breaks:

    String src = "<p dir=\"ltr\">First line</p><p dir=\"ltr\">Second<br/><br/><br/></p><p dir=\"ltr\">Third</p>";
    EditText editText = new EditText(getContext());
    // All line breaks are correct after this
    editText.setText(new SpannedString(Html.fromHtml(src)));
    String result = Html.toHtml(editText.getText()); // Here the breaks are lost
    // Output: <p dir="ltr">First line</p><p dir="ltr">Second<br></p><p dir="ltr">Third</p>

I solved this with a custom toHtml function that serializes the spanned text itself and replaces every '\n' with "<br/>":

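    import android.graphics.Typeface;
    import android.text.Spannable;
    import android.text.SpannableStringBuilder;
    import android.text.style.StyleSpan;
    import android.text.style.UnderlineSpan;

    /** Serializes bold, italic and underline spans to HTML tags, keeping every line break. */
    public class HtmlParser {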
        public static String toHtml(Spannable text) {
            final SpannableStringBuilder ssBuilder = new SpannableStringBuilder(text);
            int start, end;

            // Replace Style spans with <b></b> or <i></i>
            StyleSpan[] styleSpans = ssBuilder.getSpans(0, text.length(), StyleSpan.class);
            for (int i = styleSpans.length - 1; i >= 0; i--) {
                StyleSpan span = styleSpans[i];
                start = ssBuilder.getSpanStart(span);
                end = ssBuilder.getSpanEnd(span);
                ssBuilder.removeSpan(span);
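                if (span.getStyle() == Typeface.BOLD) {
                    ssBuilder.insert(start, "<b>");
                    ssBuilder.insert(end + 3, "</b>"); // +3 accounts for the "<b>" just inserted
                } else if (span.getStyle() == Typeface.ITALIC) {
                    ssBuilder.insert(start, "<i>");
                    ssBuilder.insert(end + 3, "</i>");
                } else if (span.getStyle() == Typeface.BOLD_ITALIC) {
                    // Bold-italic spans would otherwise be dropped silently
                    ssBuilder.insert(start, "<b><i>");
                    ssBuilder.insert(end + 6, "</i></b>");
                }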
            }

            // Replace underline spans with <u></u>
            UnderlineSpan[] underSpans = ssBuilder.getSpans(0, ssBuilder.length(), UnderlineSpan.class);
            for (int i = underSpans.length - 1; i >= 0; i--) {
                UnderlineSpan span = underSpans[i];
                start = ssBuilder.getSpanStart(span);
                end = ssBuilder.getSpanEnd(span);
                ssBuilder.removeSpan(span);
                ssBuilder.insert(start, "<u>");
                ssBuilder.insert(end + 3, "</u>");
            }
            replace(ssBuilder, '\n', "<br/>");

            return ssBuilder.toString();
        }

        private static void replace(SpannableStringBuilder b, char oldChar, String newStr) {
            for (int i = b.length() - 1; i >= 0; i--) {
                if (b.charAt(i) == oldChar) {
                    b.replace(i, i + 1, newStr);
                }
            }
        }
    }
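
The replace helper scans from the end of the buffer towards the start. A plain-Java sketch of the same loop, using StringBuilder instead of SpannableStringBuilder so it runs outside Android, shows why: each replacement lengthens the text only to the right of index i, so the indices still to be visited stay valid. The class name ReverseReplaceDemo is made up for this illustration.

```java
// Illustration only: same loop as the answer's replace(), on a plain StringBuilder.
final class ReverseReplaceDemo {
    /** Replaces every occurrence of oldChar in b with newStr, scanning backwards. */
    static void replace(StringBuilder b, char oldChar, String newStr) {
        for (int i = b.length() - 1; i >= 0; i--) {
            if (b.charAt(i) == oldChar) {
                // Only text at or after index i shifts, so lower indices are unaffected.
                b.replace(i, i + 1, newStr);
            }
        }
    }

    /** Convenience wrapper that returns the converted string. */
    static String newlinesToBr(String text) {
        StringBuilder b = new StringBuilder(text);
        replace(b, '\n', "<br/>");
        return b.toString();
    }
}
```

Calling ReverseReplaceDemo.newlinesToBr("Second\n\n\nThird") returns "Second<br/><br/><br/>Third", so all three breaks from the example above are kept.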

It also turned out that this approach is about 4 times faster than the default Html.toHtml(). I ran a benchmark on about 20 pages of text with 200 spans:

    Editable ed = editText.getText(); // Here is a Tao Te Ching :)
    String result = "";
    // DebugHelper is the answerer's own timing utility (not shown)
    DebugHelper.startMeasure("Custom");
    for (int i = 0; i < 10; i++) {
        result = HtmlParser.toHtml(ed);
    }
    DebugHelper.stopMeasure("Custom"); // 19 ms

    DebugHelper.startMeasure("Def");
    for (int i = 0; i < 10; i++) {
        result = Html.toHtml(ed);
    }
    DebugHelper.stopMeasure("Def"); // 85 ms
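
DebugHelper is not shown in the answer. A minimal stand-in, assuming it only records a start time per label and reports the elapsed milliseconds, might look like the sketch below. The class SimpleTimer and everything inside it are hypothetical and are not the answerer's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the answer's DebugHelper; not thread-safe.
final class SimpleTimer {
    private static final Map<String, Long> STARTS = new HashMap<>();

    /** Records the current time under the given label. */
    static void startMeasure(String label) {
        STARTS.put(label, System.nanoTime());
    }

    /** Prints and returns the milliseconds elapsed since startMeasure(label). */
    static long stopMeasure(String label) {
        Long start = STARTS.remove(label);
        if (start == null) {
            throw new IllegalStateException("No running measurement: " + label);
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        System.out.println(label + ": " + elapsedMs + " ms");
        return elapsedMs;
    }
}
```

With only 10 iterations and no warm-up pass, timings taken this way are rough, so the 19 ms versus 85 ms figures are best read as an order-of-magnitude comparison.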


Answer 3:

Replace \n with <br><br>.

Example:

<p>hi</p> <p>j</p>

becomes:

<p>hi</p><br><br><p>j</p>