How can I replace "text" in each tag using Jsoup

Posted 2019-02-25 21:10

Question:

I have the following HTML:

<html>
<head>
</head>
<body>
    <div id="content" >
         <p>text <strong>text</strong> text <em>text</em> text </p>
    </div>
</body>    
</html>

How can I replace "text" with "word" in each tag using the Jsoup library? I want to get:

<html>
<head>
</head>
<body>
    <div id="content" >
         <p>word <strong>word</strong> word <em>word</em> word </p>
    </div>
</body>    
</html>

Thank you for any suggestions!

UPD: Thanks for the answers, but I found a more general way:

    Element entry = doc.select("div").first();
    Elements tags = entry.getAllElements();
    for (Element tag : tags) {
        for (Node child : tag.childNodes()) {
            if (child instanceof TextNode && !((TextNode) child).isBlank()) {
                System.out.println(child); // prints the text node's current content
                ((TextNode) child).text("word"); // replaces the node's whole text with "word"
            }
        }
    }
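
The loop above overwrites each non-blank text node with the bare string "word", so the spaces around the inline tags would be lost. A minimal sketch of a variant that replaces only the matching substring inside each text node, assuming a recent Jsoup version on the classpath; the class name, the helper replaceInTextNodes, and its parameters are hypothetical names chosen for illustration:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

class TextNodeReplaceSketch {

    // Hypothetical helper: replaces `target` inside every text node under the
    // first element matched by `scopeSelector`, leaving the markup untouched.
    static Document replaceInTextNodes(String html, String scopeSelector,
                                       String target, String replacement) {
        Document doc = Jsoup.parse(html);
        Element scope = doc.select(scopeSelector).first();
        if (scope == null) {
            return doc; // nothing matched, so there is nothing to replace
        }
        // getAllElements() includes the scope element and all its descendants;
        // textNodes() returns only the direct text children of each element.
        for (Element el : scope.getAllElements()) {
            for (TextNode node : el.textNodes()) {
                // getWholeText() keeps the original whitespace of the node
                node.text(node.getWholeText().replace(target, replacement));
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        String html = "<div id=\"content\">"
                + "<p>text <strong>text</strong> text <em>text</em> text </p></div>";
        Document doc = replaceInTextNodes(html, "div#content", "text", "word");
        System.out.println(doc.select("div#content").first().html());
    }
}
```

Because only text nodes are touched, tag names and attribute values that happen to contain "text" are left alone.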

Answer 1:

    Document doc = Jsoup.connect(url).get();
    String str = doc.toString();
    str = str.replace("text", "word");

Try it.
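
One thing to keep in mind with a replacement on the serialized string is that it does not distinguish text from markup. A small sketch with a made-up input (the class attribute below is an assumption for illustration and is not part of the HTML in the question):

```java
class RawStringReplaceSketch {

    // Hypothetical helper: the same plain String.replace call as above.
    static String rawReplace(String html) {
        return html.replace("text", "word");
    }

    public static void main(String[] args) {
        // Made-up input: the attribute value happens to contain "text" too
        String html = "<p class=\"text\">text</p>";
        System.out.println(rawReplace(html)); // prints <p class="word">word</p>
    }
}
```

The result is also a plain String, so it would have to be passed through Jsoup.parse again to get a Document back.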



Answer 2:

A quick search turned up this code:

    Elements strongs = doc.select("strong");
    Element f = strongs.first();
    Element l = strongs.last();
    ... 1,siblings.lastIndexOf(l));

etc.

First, understand how the library works and what features it offers; then work out how to use it to do what you need. The code above selects the strong elements, at which point you could update the inner text of each one, but I'm sure there are a number of other ways to accomplish the same thing.
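
As a rough sketch of that idea, assuming Jsoup is on the classpath: select the strong elements and update the text of each one. The class and method names are hypothetical; select, text() and text(String) are the Jsoup calls doing the work.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class StrongTextSketch {

    // Hypothetical helper: replaces "text" with "word" inside <strong> only.
    static Document replaceInStrong(String html) {
        Document doc = Jsoup.parse(html);
        for (Element strong : doc.select("strong")) {
            // text(String) replaces all children of the element with one text
            // node, so this is only safe when <strong> holds plain text.
            strong.text(strong.text().replace("text", "word"));
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = replaceInStrong("<p>text <strong>text</strong> text</p>");
        System.out.println(doc.select("p").first().html());
    }
}
```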

In general, most libraries that parse XML can select any given element in the document object model, or any list of elements, and manipulate either the elements themselves or their inner text, attributes, and the like.

Once you gain more experience working with different libraries, your starting point should be the library's documentation, to see what the library does. If a method is documented as doing something, that is what it does, and you can expect to use it for that purpose. Then, instead of writing a question on Stack Overflow, you only need to read through the functionality of the library you're using and figure out how to apply it to what you want.



Answer 3:

    String html = "<html> ...";
    Document doc = Jsoup.parse(html);
    Elements p = doc.select("div#content > p");
    p.html(p.html().replaceAll("text", "word"));
    System.out.println(doc.toString());

The selector div#content > p matches the <p> elements that are direct children of the <div> whose id is content.

If you want to replace the text only in <strong>text</strong>:

    Elements p = doc.select("div#content > p > strong");
    p.html(p.html().replaceAll("text", "word"));
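
Note that html() on an Elements collection returns the combined inner HTML of all matches, and html(String) sets that same string on every match, so the snippets above behave as intended only when the selector matches a single element. A sketch of a per-element variant, assuming Jsoup is on the classpath; the class name, method name, and the two-paragraph input are made up for illustration:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class PerElementHtmlSketch {

    // Hypothetical helper: rewrites the inner HTML of each matched <p> separately.
    static Document replacePerParagraph(String html) {
        Document doc = Jsoup.parse(html);
        for (Element p : doc.select("div#content > p")) {
            // String.replace does a literal replacement, whereas replaceAll
            // treats its first argument as a regular expression.
            p.html(p.html().replace("text", "word"));
        }
        return doc;
    }

    public static void main(String[] args) {
        String html = "<div id=\"content\">"
                + "<p>text <strong>text</strong></p><p>other text</p></div>";
        Document doc = replacePerParagraph(html);
        for (Element p : doc.select("div#content > p")) {
            System.out.println(p.text());
        }
    }
}
```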