I need to extract the text between the divs here ("The third of four...") using the Simple HTML DOM PHP library.
I think I have tried everything: next_sibling() returns the comment, and next_sibling()->next_sibling() returns the <br/> tag. Ideally, I would like to get all the text from the end of the first comment up to the next </div> tag.
<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
<br />The third of four performances in the Society's Morning Melodies series features...<a href='index.php?page=tickets&month=20140201'><< Back to full event listing</a>
</div><!--/end of div.left-->
The code below prints <!--/end of div.float-->, which is the comment tag.
// Find the content that follows the div with class "float". There is a comment in between.
$div_float = $html->find("div.float");
$betweendivs = $div_float[0]->next_sibling();
$actual_content = $betweendivs->outertext;
echo $actual_content;
My next step would be to get the innertext of div.left and then delete all the divs inside it, but that seems like a major hassle. Is there anything easier I can do?
Use find('text') to get all the text blocks, or find('text', $index) to get a single one, where $index is the zero-based index of the wanted text.
So in this case, it's:
echo $html->find('text', 3);
// OUTPUT:
The third of four performances in the Society's Morning Melodies series features...
You can read more in the Manual
EDIT:
Here is working code:
$input = '<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
<br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201"><< Back to full event listing</a>
</div><!--/end of div.left-->';
//Create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load($input);
// Using $index
echo $html->find('text', 3);
echo "<hr>";
// Or, it's the 3rd element starting from the end
$text = $html->find('text');
echo $text[count($text)-3];
// Clear DOM object
$html->clear();
unset($html);
// OUTPUT
The third of four performances in the Society's Morning Melodies series features...
The third of four performances in the Society's Morning Melodies series features...
Working DEMO
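Why don't you use ->plaintext on div.left? It outputs the text of the div. Note that find() without an index returns an array, so pass 0 to get the first match, and that plaintext also includes the text of nested elements such as div.float.
$html->find("div[class=left]", 0)->plaintext;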
Martti
I actually think Simple HTML DOM does not provide tools to do this, as there are no "get before" or "get after" types of commands. If I am wrong, please let me know.
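For comparison, here is a minimal sketch of a "get after" style lookup that uses PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than Simple HTML DOM. This is a swapped-in technique, not something the library above provides. It relies on the XPath following-sibling axis to pick the first non-blank text node after div.float. The function name textAfterFloatDiv is made up for illustration, and the markup is the sample from the question.

```php
<?php
// Hypothetical helper (name chosen for this sketch): returns the first
// non-blank text node that follows div.float among its siblings.
function textAfterFloatDiv(string $html): string
{
    $doc = new DOMDocument();

    // The sample markup is not strictly valid HTML (raw "<<" and "&"),
    // so collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // following-sibling::text() selects text nodes after div.float at the
    // same level; [normalize-space()] skips whitespace-only nodes.
    $nodes = $xpath->query(
        '//div[@class="float"]/following-sibling::text()[normalize-space()]'
    );

    return $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : '';
}

$input = '<div class="left">
Bla-bla..
<div class="float">Bla-bla...
</div><!--/end of div.float-->
<br />The third of four performances in the Society\'s Morning Melodies series features...<a href="index.php?page=tickets&month=20140201"><< Back to full event listing</a>
</div><!--/end of div.left-->';

echo textAfterFloatDiv($input);
```

The appeal of this approach is that it does not depend on a hard-coded text index, so it should keep working if more text is added before div.float. The trade-off is that it leaves Simple HTML DOM behind for this one lookup.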