I have a large set of nested directories containing PHP, HTML, and JavaScript files that should all be encoded as UTF-8. However, someone edited several of the files and saved them with ISO-8859-1 encoding. Unfortunately, they're all mixed in with the UTF-8 files.
I'd like to use the iconv tool to convert the incorrectly-encoded files to UTF-8 (as described here). Primarily, the problems occur with characters that are valid ISO-8859-1 but invalid UTF-8.
I think an appropriate starting point would be to find all files that contain invalid UTF-8. What's a good way to do this?
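To make the idea concrete, here is a rough sketch in Python of the kind of check I have in mind: walk the tree, read each file as raw bytes, and report the ones that fail a strict UTF-8 decode. The function names (is_valid_utf8, find_invalid_utf8) and the list of extensions are just my own assumptions for illustration, not part of any existing tool.

```python
import os


def is_valid_utf8(path):
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")  # strict decoding raises on any invalid byte sequence
    except UnicodeDecodeError:
        return False
    return True


def find_invalid_utf8(root, extensions=(".php", ".html", ".js")):
    """Yield paths under root (recursively) whose contents are not valid UTF-8.

    The default extension list is an assumption; adjust it to the project.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                if not is_valid_utf8(path):
                    yield path
```

Something like `for path in find_invalid_utf8("."): print(path)` would then list the candidates for conversion. Pure-ASCII files pass this check as well, since ASCII is a subset of UTF-8, so they would be left alone.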
I realise this won't catch all of the cases where the wrong character might be displayed. Any further tips on how I might fix this mess?