Simple HTML Dom - find text between divs

Posted 2019-04-13 05:43

I need to extract the text between the divs here ("The third of four...") using the Simple HTML DOM PHP library.

I think I have tried everything. next_sibling() returns the comment, and next_sibling()->next_sibling() returns the <br/> tag. Ideally, I would like to get all the text from the end of the first comment up to the next </div> tag.

<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society's Morning Melodies series features...<a href='index.php?page=tickets&month=20140201'>&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->

The code below prints <!--/end of div.float-->, which is the comment tag.

// Find the content that follows the div with class "float". There is a comment in between.
$div_float = $html->find("div.float");
$betweendivs = $div_float[0]->next_sibling();
$actual_content = $betweendivs->outertext;
echo $actual_content;

My next step would be to get the innertext of div.left and then delete all the divs inside it, but that seems like a major hassle. Is there an easier way?

3 answers
何必那么认真
#2 · 2019-04-13 05:51

Why don't you use ->plaintext on the div with class left? It outputs the text as needed.

// Pass index 0: without an index, find() returns an array of elements.
echo $html->find("div[class=left]", 0)->plaintext;

Martti

做自己的国王
#3 · 2019-04-13 05:55

Use find('text') to get all the text blocks, or find('text', $index) to get a single one, where $index is the zero-based index of the wanted text.

So in this case, it's:

echo $html->find('text', 3);

// OUTPUT:
The third of four performances in the Society's Morning Melodies series features...

You can read more in the Manual

EDIT:

Here's working code:

$input = '<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201">&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->';

//Create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load($input);

// Using $index
echo $html->find('text', 3);

echo "<hr>";

// Or, it's the 3rd element starting from the end
$text = $html->find('text');
echo $text[count($text)-3];

// Clear DOM object
$html->clear();
unset($html);

// OUTPUT
The third of four performances in the Society's Morning Melodies series features...
The third of four performances in the Society's Morning Melodies series features...

Working DEMO
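
If a hard-coded index feels brittle, one alternative is to select the text by its position relative to div.float. The sketch below is not Simple HTML DOM: it swaps in PHP's built-in DOM extension (DOMDocument and DOMXPath) and assumes that the wanted text is a direct child of div.left and follows div.float. The variable names ($doc, $xpath, $between) are only illustrative.

```php
<?php
// Sketch only: uses PHP's bundled DOM extension instead of Simple HTML DOM.
$input = <<<'HTML'
<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
    <br />The third of four performances in the Society's Morning Melodies series features...<a href="index.php?page=tickets&month=20140201">&lt;&lt; Back to full event listing</a>
</div><!--/end of div.left-->
HTML;

// Collect libxml parse warnings internally (e.g. for the bare "&" in the href)
// instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($input);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Text nodes that are siblings of div.float and come after it.
// Comments, <br> and <a> are not text nodes, so they are skipped.
$nodes = $xpath->query(
    '//div[@class="left"]/div[@class="float"]/following-sibling::text()'
);

// Drop whitespace-only nodes and join whatever is left.
$parts = [];
foreach ($nodes as $node) {
    $text = trim($node->nodeValue);
    if ($text !== '') {
        $parts[] = $text;
    }
}
$between = implode(' ', $parts);

echo $between, PHP_EOL;
```

With the sample markup this should print only the "The third of four performances..." sentence, because the link text lives inside the <a> element rather than in a sibling text node.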

Viruses.
#4 · 2019-04-13 05:59

I actually think Simple HTML DOM does not provide tools to do this, as there are no "get before" or "get after" types of commands. If I am wrong, please let me know.
