When using PHP Simple HTML DOM Parser, is it normal that line break
tags are stripped out?
Was struggling with this as well, since I needed the HTML to be easily editable after processing.
Apparently there's a boolean in the SimpleHTMLDOM script, `$stripRN`, that's set to `true` by default. It strips the `\r`, `\n` or `\r\n` sequences in the HTML. Set the var to `false` (several occurrences in the script) and your problem is solved.

I know this is old, but I was looking for this as well, and realized there is actually a built-in option to turn off the removal of line breaks. No need to go editing the source.
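A minimal sketch of that built-in option, assuming the v1.5 signatures, where `$stripRN` is the third parameter of `load` and the fifth of `str_get_html`. The include path and the sample markup are assumptions for illustration, and `DEFAULT_TARGET_CHARSET` is the constant the library itself defines:

```php
<?php
// Assumed location of the library file; adjust to your own setup.
require_once 'simple_html_dom.php';

// Sample input with line breaks we want to keep.
$html = "<div>\n  <p>first</p>\n  <p>second</p>\n</div>";

// Helper function: $stripRN is the fifth parameter.
$dom = str_get_html($html, true, true, DEFAULT_TARGET_CHARSET, false);
echo $dom->save();

// Object style: $stripRN is the third parameter of load().
$dom2 = new simple_html_dom();
$dom2->load($html, true, false);
echo $dom2->save();
```

Passing the other arguments at their default values keeps the rest of the parser's behavior unchanged.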
The PHP Simple HTML DOM Parser's `load` function supports multiple useful parameters. When calling the `load` function, simply pass `false` as the third parameter. If using `file_get_html`, it's the ninth parameter.

Edit: For `str_get_html`, it's the fifth parameter (thanks, yitwail).

If you were passing by here wondering if you can do the same thing in DomDocument, then I'm pleased to say you can! But it's a bit dirty :(
I had a snippet of code I wanted to tidy but retain the exact line breaks it contained (\n). This is what I did....
It's important to note that I know, without a shadow of a doubt, that my input contained only \n. You may want your own variations if \r\n or \t need to be accounted for, e.g. slash.T or slash.RN etc.
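A minimal sketch of one way this could be done, assuming a placeholder-swap approach. The `slash.N` token, the function name and the wrapper `<div>` are illustrative guesses, not the original code, and the token must not occur in the input:

```php
<?php
// Hypothetical helper: tidy a fragment with DOMDocument while keeping its \n characters.
function tidyKeepingNewlines(string $html): string
{
    $token = 'slash.N'; // assumed placeholder; must not appear in the input

    // Swap each \n for the token so it travels through the parser as plain text.
    $marked = str_replace("\n", $token, $html);

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    // Wrap in a single root element so no <html>/<body> wrappers are implied.
    $doc->loadHTML('<div>' . $marked . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Serialize only the children of the wrapper element.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }

    // Restore the original line breaks.
    return str_replace($token, "\n", $out);
}

echo tidyKeepingNewlines("<p>one\ntwo</p>");
```

The serializer may still add line breaks of its own around some elements, so compare the output with the input if exact whitespace matters.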
Another option, should one wish to preserve other formatting such as paragraphs and headings, is to use `innertext` rather than `plaintext`, then perform your own string cleaning on the result. I realise there is a performance hit, but it does allow for more granular control.
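As a rough illustration of such string cleaning, here is a sketch that turns an `innertext` fragment into text while keeping line and paragraph breaks. The helper name and the choice of which tags map to which breaks are assumptions, not part of the library:

```php
<?php
// Hypothetical helper: convert an innertext HTML fragment to plain text, keeping breaks.
function innerTextToPlain(string $innerHtml): string
{
    // <br> becomes a single newline.
    $text = preg_replace('~<br\s*/?>~i', "\n", $innerHtml);
    // Closing paragraph and heading tags become a blank line.
    $text = preg_replace('~</(p|h[1-6])\s*>~i', "\n\n", $text);
    // Drop the remaining tags, then decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($text), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return trim($text);
}

// With Simple HTML DOM this might be called as innerTextToPlain($node->innertext).
echo innerTextToPlain('<h2>Title</h2><p>one<br>two &amp; three</p>');
```

Regex-based cleaning like this is only suitable for simple, predictable fragments; anything more complex is better handled through the parsed nodes.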
You don't have to change every `$stripRN`
to `false`; the only one that affects this behavior is at line 816. Also consider changing line 988, because multibyte functions are often not installed on machines that do not deal with non-Western-European languages. The original line in v1.5 breaks the script immediately.