The issue: a WordPress blog's error log is flooded with "charset not supported, assuming utf-8" messages; it grows from 0 bytes to 450 MB in 24 hours (~28k page views, if the stats are correct).
Details: I have a WordPress-powered blog on a shared hosting account. It has been running for years, and this was never an issue until fairly recently, though I can't pinpoint exactly when it started. A few months ago I started to exceed my allowed resources (mostly memory), so the host moved me to a different server, and I had to upgrade the account to get a higher resource allowance. The old server was running PHP 5; this one runs PHP 7. I'm on the latest WP plus around 15 popular plugins, all at their respective latest versions. The theme is ancient; it's been there from the beginning.
Yesterday I deleted the 9 GB(!) error log in the site's root; today, 24 hours later, it's 500 MB. All lines are similar:
[datetime] PHP Warning: html_entity_decode(): charset `keep-ali0' not supported, assuming utf-8 in /home/accountname/public_html/wp-includes/formatting.php on line 5124
[datetime] PHP Warning: htmlentities(): charset `/[^0-9\.]/' not supported, assuming utf-8 in /home/accountname/public_html/wp-content/plugins/wp-super-cache/wp-cache-base.php on line 5
... etc.
I parsed the older 2 GB log:
- they came from 13 files: 3 core WP files, others from 6 different plugins
- only from these functions: htmlentities(), htmlspecialchars(), html_entity_decode()
- over 1000 unique "charsets": all are garbage. Most include non-printable chars; others are just weird stuff (paths that are not mine, regexes, integers, hex values), for example:
  #^[a-z]:[/\\]#i
  meta_value
  0x7fe858ae2920
  /home/someone-elses-account-name/public_html/includes/functions.php
  ...
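For anyone who wants to produce the same kind of tally from their own log, here is a minimal sketch in PHP. It assumes the log lines look exactly like the two samples above; the function names and the log path in the usage comment are illustrative only and are not part of WordPress or any plugin.

```php
<?php
// Tally "charset ... not supported" warnings by function, charset and file.
// Assumption: each warning sits on one line, in the format shown above.

function parse_charset_warning(string $line): ?array
{
    // Greedy groups let the charset itself contain quotes or spaces.
    $pattern = '/PHP Warning:\s+(\w+)\(\): charset `(.*)\' not supported, '
             . 'assuming utf-8 in (.+) on line (\d+)/s';
    if (!preg_match($pattern, $line, $m)) {
        return null; // not one of the charset warnings
    }
    return [
        'function' => $m[1],
        'charset'  => $m[2],
        'file'     => $m[3],
        'line'     => (int) $m[4],
    ];
}

function tally_charset_warnings(iterable $lines): array
{
    $tally = ['function' => [], 'charset' => [], 'file' => []];
    foreach ($lines as $line) {
        $hit = parse_charset_warning(rtrim($line, "\r\n"));
        if ($hit === null) {
            continue;
        }
        foreach (array_keys($tally) as $key) {
            $value = $hit[$key];
            $tally[$key][$value] = ($tally[$key][$value] ?? 0) + 1;
        }
    }
    // Most frequent entries first.
    foreach ($tally as &$counts) {
        arsort($counts);
    }
    unset($counts);
    return $tally;
}

// Stream the file line by line so a multi-gigabyte log never sits in memory.
function read_lines(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield $line;
        }
    } finally {
        fclose($handle);
    }
}

// Usage (hypothetical path):
// print_r(tally_charset_warnings(read_lines('/home/accountname/public_html/error_log')));
```

Matching without the u modifier is deliberate: the garbage "charsets" are often not valid UTF-8, and a byte-wise match still captures them.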
Where do these values come from?
Where do I even start troubleshooting this?
Edit: Solution
There's a great answer below that explains why this is happening. Unfortunately, being on shared hosting and using third-party applications, I couldn't use any of the workarounds. But after I talked to our hosting provider, they added internal_encoding utf-8 to the Apache web server config via an include config (? something like that). And it worked.
It appears this is a known bug in PHP, which is difficult to reproduce so it's stuck around a while.
https://bugs.php.net/bug.php?id=71876
Various workarounds have been suggested, including:
- setting internal_encoding=utf-8 in php.ini, or calling ini_set('internal_encoding', 'utf-8');
- making sure default_charset is not set in php.ini
- passing the encoding explicitly on each call: html_entity_decode($x, null, 'utf-8');
These workarounds appear to have mixed results.
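As a rough illustration of the first and last workarounds, here is a minimal sketch. The wrapper names are hypothetical, and where the ini_set() call belongs (for example near the top of wp-config.php) is an assumption, not something confirmed by the bug report. The flags are written out as ENT_COMPAT | ENT_HTML401, the PHP 7 default, instead of null, because newer PHP versions complain about null for that argument.

```php
<?php
// Workaround: set internal_encoding at runtime, as early in the request
// as possible, so PHP has a valid value to read.
ini_set('internal_encoding', 'UTF-8');

// Workaround: pass the encoding explicitly so the function never has to
// look up a default charset. Wrapper names are hypothetical.
function decode_entities_utf8(string $text): string
{
    return html_entity_decode($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');
}

function encode_entities_utf8(string $text): string
{
    return htmlspecialchars($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');
}
```

The explicit-encoding approach only helps for calls you control; calls inside WordPress core or plugins would still rely on the configured default.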
This answer is wrong; please see the comment on the question by @miken32.
I won't be returning to Stack Overflow for a while, so I can give you only the first iteration of the process for solving your problem. Put the following in your functions.php file.
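A minimal sketch of such a handler is shown here. It is an illustration, not necessarily the exact code this answer originally contained; the function names are hypothetical, and it assumes the warning text matches the log lines quoted in the question.

```php
<?php
// Turn debug_backtrace() frames into one readable line per frame.
function format_backtrace(array $frames): string
{
    $out = [];
    foreach ($frames as $i => $frame) {
        $callee = ($frame['class'] ?? '') . ($frame['type'] ?? '') . $frame['function'];
        $out[] = sprintf(
            '#%d %s() called at %s:%s',
            $i,
            $callee,
            $frame['file'] ?? '?',
            $frame['line'] ?? '?'
        );
    }
    return implode("\n", $out);
}

// Hypothetical handler: logs a backtrace for the charset warnings only.
function log_charset_warning_backtrace(
    int $errno,
    string $errstr,
    string $errfile = '',
    int $errline = 0
): bool {
    if (strpos($errstr, 'not supported, assuming utf-8') === false) {
        return false; // fall through to PHP's normal error handling
    }
    // Skip arguments and cap the depth to keep each log entry small.
    $frames = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 15);
    error_log("$errstr in $errfile on line $errline\n" . format_backtrace($frames));
    return true; // handled here, so the default log entry is suppressed
}

// These warnings are raised by PHP itself, so they arrive as E_WARNING.
set_error_handler('log_charset_warning_backtrace', E_WARNING);
```

Because every matching warning now writes several lines, it is probably wise to remove the handler once a few samples have been collected.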
This will give you a backtrace when the error occurs. Based on this backtrace, you may or may not need to gather additional data to understand your problem. The set_error_handler and debug_backtrace functions are documented in the PHP manual, as is the php.ini error_reporting option. If there are other E_WARNING errors, you will need to filter additionally on the file and line number.
My guess is that this problem is caused by bad data that was recently entered into your database. Hopefully, the backtrace will tell you where that data is.
Hope this helps, good luck.
mc