Hey guys, I'm using Simple HTML DOM to retrieve content from another website, but there's a character encoding issue with the content it retrieves. The characters are showing up as the little diamond with the question mark inside.
The character encoding issue only happens with the content retrieved, and all other text on my site is displaying fine.
If anyone could help that would be great.
Go to the website and check its charset by viewing the page info.
Try using iconv to convert the charset of the scraped text to the charset you use on your page.
Signature: string iconv(string $in_charset, string $out_charset, string $str)
Example:
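A minimal sketch of the iconv approach. It assumes the remote site serves ISO-8859-1 and your own page uses UTF-8; check the real charset first and swap it in. The helper name to_page_charset is made up for illustration and is not part of Simple HTML DOM.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): convert scraped text
// from the remote site's charset to the charset your own page uses.
function to_page_charset(string $text, string $sourceCharset, string $pageCharset = 'UTF-8'): string
{
    $converted = iconv($sourceCharset, $pageCharset, $text);
    // iconv() returns false on failure; fall back to the unconverted text.
    return $converted === false ? $text : $converted;
}

// "caf\xE9" is the word "café" encoded as ISO-8859-1 (assumed source charset).
$scraped = "caf\xE9";
echo to_page_charset($scraped, 'ISO-8859-1');
```

In practice you would pass in whatever text you pulled out of the scraped page instead of the hard-coded string.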
I had this problem too, but it was not a charset problem. It was gzip compression, which simple html dom doesn't handle. Here is my solution. Use the function file_get_html2 instead of file_get_html.
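The original code for file_get_html2 is not shown in the post, so what follows is only a rough sketch of how such a function could look, not the poster's actual code. It assumes simple_html_dom.php has already been included (so str_get_html() is available) and that PHP's zlib extension is loaded. The helper maybe_gunzip is a made-up name.

```php
<?php
// Hypothetical helper: decompress the body only when it starts with the
// gzip magic bytes (0x1f 0x8b); otherwise return it untouched.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

// Sketch of a gzip-aware replacement for file_get_html().
// Assumes str_get_html() from simple_html_dom.php is already loaded.
function file_get_html2(string $url)
{
    $body = file_get_contents($url);
    if ($body === false) {
        return false; // the request failed
    }
    return str_get_html(maybe_gunzip($body));
}
```

You would then call file_get_html2($url) wherever you were calling file_get_html($url) before. Checking the magic bytes instead of the response headers keeps the sketch short, at the cost of not looking at what the server actually declared.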