Blacklist of words on content to filter message [c

Posted 2019-01-26 02:08

For a website that takes input from kids, we need to filter any naughty / bad words that they use when they enter their comments on the website (running PHP).

The comment box is a free-text field, and users can enter whatever they want. The solution I can think of is to keep a word list like BLACKLIST: bad,bad,word,woord,craap,craaaap, (we can fill this with all the blacklisted words).

Then, when the form is saved, we check the comment against the list, and if any of the words are present we do not allow the comment to be saved.
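A minimal sketch of that check, written in Python for illustration only (the site itself runs PHP, and the same logic ports over directly). The `BLACKLIST` contents and the `contains_blacklisted` and `save_comment` names are hypothetical, and a plain list stands in for whatever storage the site uses:

```python
import re

# Hypothetical blacklist; the real list would be maintained by the site.
BLACKLIST = {"bad", "word", "woord", "craap", "craaaap"}

def contains_blacklisted(comment: str) -> bool:
    """Return True if any whole word in the comment is on the blacklist."""
    # Lowercase first so that "BAD" and "bad" are treated the same.
    words = re.findall(r"[a-z']+", comment.lower())
    return any(word in BLACKLIST for word in words)

def save_comment(comment: str, store: list) -> bool:
    """Append the comment to `store` unless it contains a blacklisted word."""
    if contains_blacklisted(comment):
        return False  # reject: the form should show an error instead
    store.append(comment)
    return True
```

Calling `save_comment(text, comments)` on form submission returns `False` for a rejected comment, which the form handler can turn into an error message.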

BUT the problem with this method is that users can get around it by adding letters to a word so that it slips past the filter, e.g. shiiiiit
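One partial workaround I can imagine is collapsing runs of repeated letters before the lookup. This is only a sketch in Python (the idea is the same in PHP); the `BLACKLIST` entries and function names are hypothetical. Both the comment and the blacklist are collapsed, so a banned word that legitimately contains a double letter still matches:

```python
import re

# Hypothetical blacklist, for illustration only.
BLACKLIST = {"shit", "crap"}

def collapse_repeats(word: str) -> str:
    """Reduce every run of the same character to a single character."""
    return re.sub(r"(.)\1+", r"\1", word)

# Collapse the blacklist too, so both sides are compared in the same form.
COLLAPSED_BLACKLIST = {collapse_repeats(w) for w in BLACKLIST}

def is_blocked(comment: str) -> bool:
    """Return True if any word, after collapsing repeats, is blacklisted."""
    words = re.findall(r"[a-z]+", comment.lower())
    return any(collapse_repeats(w) in COLLAPSED_BLACKLIST for w in words)
```

The trade-off is that collapsing also merges innocent words: "good" becomes "god", so any blacklist entry that collapses to the same string as a harmless word will produce false positives.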

Let me know what you think is the best way to create some filter for these words.

6 answers
唯我独甜
#2 · 2019-01-26 02:22

You're never going to be able to filter every permutation. Perhaps the most feasible solution is to filter the obvious, and implement a "Report Abuse" mechanism so someone can manually look over (and reject) suspect comments.

手持菜刀,她持情操
#3 · 2019-01-26 02:22

Thanks to too much php I've found some links which might be a solution for your case:

淡お忘
#4 · 2019-01-26 02:26

If you have enough time, it is worthwhile reading about the Scunthorpe problem.

Jeff Atwood also has a post on the futility of obscenity filters.

Bombasti
#5 · 2019-01-26 02:29

There is also always the risk of filtering an innocent word like "bass", which happens to contain one of the banned words. At the moment, a few good moderators seem like the best solution to such a problem.
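To illustrate the "bass" case, here is a small Python sketch contrasting naive substring matching with whole-word matching. The one-entry `BANNED` list and both function names are made up for the example:

```python
import re

# Hypothetical one-entry list, just enough to reproduce the "bass" case.
BANNED = ["ass"]

def substring_match(text: str) -> bool:
    """Naive check: flags the text if a banned word appears anywhere in it."""
    lowered = text.lower()
    return any(word in lowered for word in BANNED)

# \b anchors the match at word boundaries, so "bass" no longer triggers it.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in BANNED) + r")\b",
    re.IGNORECASE,
)

def whole_word_match(text: str) -> bool:
    """Stricter check: flags the text only if a banned word stands alone."""
    return PATTERN.search(text) is not None
```

Whole-word matching removes this class of false positive, but it also makes the filter easier to dodge (for example by gluing the word onto another one), which is why moderation is still needed.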

贼婆χ
#6 · 2019-01-26 02:37

Use uClassify to train a classifier on bad comments. Once the system is trained well enough, you can flag the offending comments for moderation.

Deceive 欺骗
#7 · 2019-01-26 02:38

So you are going to ban shit, shït, shıt, śhit, and śhiŧ?

Blacklisting is not a viable solution in the Unicode age. Yet banning € outright seems excessive.
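To show how far normalization gets you, here is a rough Python sketch that decomposes the text (NFKD) and drops combining marks before any blacklist lookup. The `fold_accents` name is hypothetical; `unicodedata` is in the standard library:

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Variants from the list above, written as escapes to keep the source ASCII.
variants = ["sh\u00eft", "sh\u0131t", "\u015bhit", "\u015bhi\u0167"]
caught = [v for v in variants if fold_accents(v) == "shit"]
```

Only the accented forms (ï, ś) fold back to plain letters. The dotless ı and the stroked ŧ have no Unicode decomposition, so they pass through unchanged, and catching them would need a hand-maintained look-alike table that is never complete.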
