Making email addresses safe from bots on a webpage

Published 2019-01-10 05:46

Question:

When placing email addresses on a webpage do you place them as text like this:

joe.somebody@company.com

or use a clever trick to try and fool the email address harvester bots? For example:

HTML Escape Characters:

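&#106;&#111;&#101;&#46;&#115;&#111;&#109;&#101;&#98;&#111;&#100;&#121;&#64;&#99;&#111;&#109;&#112;&#97;&#110;&#121;&#46;&#99;&#111;&#109;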
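
The entity-encoded form does not have to be typed by hand. A minimal sketch, assuming a hypothetical helper named toHtmlEntities (not part of the original question) that encodes every character as a decimal character reference:

```javascript
// Hypothetical helper: encode each character as a decimal HTML entity.
// Assumes plain ASCII input such as a typical email address.
function toHtmlEntities(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        out += "&#" + text.charCodeAt(i) + ";";
    }
    return out;
}

// Paste the printed string into the page source; a browser displays it as the plain address.
console.log(toHtmlEntities("joe.somebody@company.com"));
```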

JavaScript Decrypter:

function XOR_Crypt(EmailAddress)
{
    // XOR each character code with 128 to recover the original character
    var Result = "";
    for (var i = 0; i < EmailAddress.length; i++)
    {
        Result += String.fromCharCode(EmailAddress.charCodeAt(i) ^ 128);
    }
    document.write(Result);
}

XOR_Crypt("êïå®óïíåâïäùÀãïíðáîù®ãïí");
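
The encoded string has to come from somewhere. A minimal sketch of the matching encoder, assuming a hypothetical function name xorEncode and the same key (128) as the decrypter; since XOR is its own inverse, the same routine also decodes:

```javascript
// Hypothetical counterpart to the decrypter: XOR each character code with the key.
function xorEncode(text, key) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        out += String.fromCharCode(text.charCodeAt(i) ^ key);
    }
    return out;
}

var encoded = xorEncode("joe.somebody@company.com", 128);
// Applying the same XOR a second time restores the input.
var decoded = xorEncode(encoded, 128);
console.log(decoded);
```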

Human Decode:

joe.somebodyNOSPAM@company.com

joe.somebody AT company.com

What do you use or do you even bother?

Answer 1:

I generally don't bother. I used to be on a mailing list that got several thousand spams every day. Our spam filter (SpamAssassin) let maybe 1 or 2 a day through. With filters this good, why make it difficult for legitimate people to contact you?



Answer 2:

Invent your own crazy email address obfuscation scheme. It doesn't matter what it is, really, as long as it's not too similar to any of the commonly known methods.

The problem is that there really isn't a good solution to this. The known methods are all either relatively simple to bypass or rather irritating for the user. If any one method becomes prevalent, then someone will find a way around it.

So rather than looking for the One True email address obfuscation technique, come up with your own. Count on the fact that these bot authors don't care enough about your site to sit around writing a thing to bypass your slightly crazy rendering-text-with-css-and-element-borders or your completely bizarre, easily-cracked JavaScript encryption. It doesn't matter if it's trivial; nobody will bother trying to bypass it just so they can spam you.



Answer 3:

I've written an encoder (source) that uses all kinds of parsing tricks that I could think of (different kinds of HTML entities, URL encoding, comments, multiline attributes, soft hyphens, non-obvious structure of mailto: URL, etc.).

It doesn't stop all harvesters, but OTOH it's completely standards-compliant and transparent to the users.

Another IMHO good approach (which you can use in addition to tricky encoding) is along the lines of:

<a href="mailto:userhatestogetspam@example.com"
   onclick="this.href=this.href.replace(/hatestogetspam/,'')">Email me</a>
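
If there are many addresses, the decoy link can be generated. A rough sketch, assuming a hypothetical helper buildDecoyLink, a decoy made only of letters (so it is safe inside the regular expression), and a label that is already safe HTML:

```javascript
// Hypothetical helper: put a decoy word into the local part of the address
// and emit a link whose onclick handler removes it again before the mail client opens.
function buildDecoyLink(user, domain, decoy, label) {
    var href = "mailto:" + user + decoy + "@" + domain;
    return '<a href="' + href + '" ' +
        'onclick="this.href=this.href.replace(/' + decoy + "/,'')\">" +
        label + "</a>";
}

var html = buildDecoyLink("user", "example.com", "hatestogetspam", "Email me");
console.log(html);
```

A harvester that reads the raw href gets the decoy address; a visitor who clicks gets the real one.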


Answer 4:

You can protect your email address with reCAPTCHA. They offer a free service so that people have to enter a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) to see your email: https://www.google.com/recaptcha/admin#mailhide



Answer 5:

I wouldn't bother -- it is fighting the spam war at the wrong level. Particularly for company web sites, I think it makes things look very unprofessional if you have anything other than the straight text on the page with a mailto hyperlink.

There is so much spam flying around that you need good filtering anyway, and any bot is going to end up understanding all the common tricks.



Answer 6:

HTML:

<a href="#" class="--mailto--john--domain--com-- other classes go here">Email John</a>

JavaScript, using jQuery:

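// match all a-elements with "--mailto--" somewhere in the class attribute
$("a[class*='--mailto--']").each(function ()
{
    /*
    for each of those elements use a regular expression to pull
    out the data you need to construct a valid e-mail address
    (this pattern is one possibility; it assumes the
    --mailto--user--domain--tld-- layout shown in the HTML above)
    */
    var parts = this.className.match(/--mailto--([^\s-]+)--([^\s-]+)--([^\s-]+)--/);
    if (!parts) { return; }
    var validEmailAddress = "mailto:" + parts[1] + "@" + parts[2] + "." + parts[3];

    $(this).click(function ()
    {
        window.location = validEmailAddress;
        return false; // do not follow the "#" href
    });
});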


Answer 7:

I don't bother. You'll only annoy sophisticated users and confuse unsophisticated users. As others have said, Gmail provides very effective spam filters for a personal/small business domain, and corporate filters are generally also very good.



Answer 8:

The only truly safe way is, of course, not to put the email address on the web page in the first place.



Answer 9:

Use a contact form instead. Put all of your email addresses into a database and create an HTML form (subject, body, from ...) that submits the contents of the email that the user fills out in the form (along with an id or name that is used to look up that person's email address in your database) to a server-side script that then sends an email to the specified person. At no time is the email address exposed. You will probably want to implement some form of CAPTCHA to deter spambots as well.
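
The lookup step can be sketched as follows. This is only an illustration under assumptions: the DIRECTORY object stands in for the database, the field names (recipientId, from, subject, body) are invented, and actually sending the message is left to whatever mailer the server uses:

```javascript
// Hypothetical server-side directory: recipient ids are public, addresses are not.
var DIRECTORY = {
    sales: "sales@example.com",
    support: "support@example.com"
};

// Turn a submitted form into a message object for the server's mailer.
// Returns null when the recipient id is unknown or a field is missing.
function buildMessage(form) {
    var known = Object.prototype.hasOwnProperty.call(DIRECTORY, form.recipientId);
    var to = known ? DIRECTORY[form.recipientId] : null;
    if (!to || !form.from || !form.subject || !form.body) {
        return null;
    }
    // Collapse line breaks in header fields so a submitter cannot inject extra headers.
    var clean = function (s) { return String(s).replace(/[\r\n]+/g, " "); };
    return {
        to: to,
        replyTo: clean(form.from),
        subject: clean(form.subject),
        body: String(form.body)
    };
}

var message = buildMessage({
    recipientId: "sales",
    from: "visitor@example.org",
    subject: "Pricing question",
    body: "Hello, could you send me a quote?"
});
console.log(message.to);
```

The page only ever contains the id ("sales"), never the address it maps to.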



Answer 10:

A response of mine on a similar question:

I use a very simple combination of CSS and jQuery which displays the email address correctly to the user and also works when the anchor is clicked:

HTML:

<a href="mailto:me@example.spam" id="lnkMail">moc.elpmaxe@em</a>

CSS:

#lnkMail {
  unicode-bidi: bidi-override;
  direction: rtl;
}

jQuery:

$('#lnkMail').hover(function(){
  // here you can use whatever replace you want
  var newHref = $(this).attr('href').replace('spam', 'com');
  $(this).attr('href', newHref);
});

Here is a working example.
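
The reversed link text in the HTML can be produced with a one-liner. A small sketch, assuming a hypothetical helper reverseText and an ASCII-only address (splitting on "" would break characters outside the Basic Multilingual Plane):

```javascript
// Hypothetical helper: reverse the characters so that the CSS rules
// (direction: rtl with unicode-bidi: bidi-override) display them in the right order.
function reverseText(text) {
    return text.split("").reverse().join("");
}

console.log(reverseText("me@example.com")); // prints "moc.elpmaxe@em"
```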



Answer 11:

Try an email icon generator: http://services.nexodyne.com/email/

Of course, there are still some OCR bots which might get this.



Answer 12:

I make mine whateverDOC@whatever.com and then next to it I write "Remove the capital letters".



Answer 13:

Gmail, which is free, has an awesome spam filter.

If you don't want to use Gmail directly, you could send the email to Gmail and use Gmail forwarding to send it back to you after it has gone through their spam filter.

In a more complex situation, when you need to show a @business.com address, you could show public@business.com and have all of its mail forwarded to a Gmail account, which then forwards it back to real@business.com.

I guess it's not a direct solution to your question, but it might help. Gmail being free and having such a good spam filter makes using it a very wise choice IMHO.

I receive about 100 spam messages per day in my Gmail account, but I can't remember the last time one of them got to my inbox.

To sum up, use a good spam filter, whether Gmail or another. Having the user retype or modify the email address that is shown is like using DRM to protect against piracy. Putting the burden on the "good" guy shouldn't be the way to go about doing anything. :)



Answer 14:

For your own email address I'd recommend not worrying about it too much. If you need to make your email address available to thousands of users, then I would recommend either using a Gmail address (vanilla or via Google Apps) or using a high-quality spam filter.

However, when displaying other users' email addresses on your website, I think some level of due diligence is required. Luckily, a blogger named Silvan Mühlemann has done all the difficult work for you. He tested different methods of obfuscation over a period of 1.5 years and determined the best ones. Most of them involve CSS or JavaScript tricks that allow the address to be presented correctly in the browser but confuse automated scrapers.



Answer 15:

Another, possibly unique, technique might be to use multiple images and a few plain-text letters to display the address. That might confuse the bots.



Answer 16:

A script that saves email addresses to PNG files would be a secure solution (if you have enough space and you are allowed to embed images in your page).



Answer 17:

This is what we use (VB.NET):

Dim rxEmailLink As New Regex("<a\b[^>]*mailto:\b[^>]*>(.*?)</a>")
Dim m As Match = rxEmailLink.Match(Html)
While m.Success
    Dim strEntireLinkOrig As String = m.Value
    Dim strEntireLink As String = strEntireLinkOrig
    strEntireLink = strEntireLink.Replace("'", """") ' replace any single quotes with double quotes to make sure the javascript is well formed
    Dim rxLink As New Regex("(<a\b[^>]*mailto:)([\w.\-_^@]*@[\w.\-_^@]*)(\b[^>]*?)>(.*?)</a>")
    Dim rxLinkMatch As Match = rxLink.Match(strEntireLink)
    Dim strReplace As String = String.Format("<script language=""JavaScript"">document.write('{0}{1}{2}>{3}</a>');</script>", _
                RandomlyChopStringJS(rxLinkMatch.Groups(1).ToString), _
                ConvertToAsciiHex(rxLinkMatch.Groups(2).ToString), _
                rxLinkMatch.Groups(3), _
                ConvertToHtmlEntites(rxLinkMatch.Groups(4).ToString))
    Result = Result.Replace(strEntireLinkOrig, strReplace)
    m = m.NextMatch()
End While

and

    Public Function RandomlyChopStringJS(ByVal s As String) As String
        Dim intChop As Integer = Int(6 * Rnd()) + 1
        Dim intCount As Integer = 0
        RandomlyChopStringJS = ""
        If Not s Is Nothing AndAlso Len(s) > 0 Then
            For Each c As Char In s.ToCharArray()
                If intCount = intChop Then
                    RandomlyChopStringJS &= "'+'"
                    intChop = Int(6 * Rnd()) + 1
                    intCount = 0
                End If
                RandomlyChopStringJS &= c
                intCount += 1
            Next
        End If
    End Function

We override Render and run the outgoing HTML through this before it goes out the door. The email addresses render normally in a browser, but look like this in the source:

<script language="JavaScript">document.write('<a '+'clas'+'s='+'"Mail'+'Link'+'" hr'+'ef'+'="ma'+'ilto:%69%6E%66%6F%40%62%69%63%75%73%61%2E%6F%72%67">&#105;&#110;&#102;&#111;&#64;&#98;&#105;&#99;&#117;&#115;&#97;&#46;&#111;&#114;&#103;</a>');</script>

Obviously not foolproof, but hopefully cuts down on a certain amount of harvesting without making things hard for the visitor.
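
The ConvertToAsciiHex helper is not shown above. Judging only from the sample output, it percent-encodes every character of the address. A JavaScript sketch of that behavior, under the assumption that the input is plain ASCII (the function name toPercentHex is made up for this illustration):

```javascript
// Hypothetical stand-in for ConvertToAsciiHex: percent-encode every character,
// including letters that would normally be left as-is in a URL.
function toPercentHex(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var hex = text.charCodeAt(i).toString(16).toUpperCase();
        out += "%" + (hex.length < 2 ? "0" + hex : hex);
    }
    return out;
}

console.log(toPercentHex("info@bicusa.org"));
```

Browsers decode the percent-escapes when the mailto: link is followed, so the visitor's mail client still receives the plain address.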



Answer 18:

It depends on what exactly your needs are. For most sites with which I work, I have found it far more useful to put in a "contact me/us" form which sends an email from the system to whoever needs to be contacted. I know that this isn't exactly the solution that you are seeking, but it does completely protect against harvesting, and so far I have never seen spam sent through a form like that. It will happen, but it is very rare and you are never harvested.

This also gives you a chance to log the messages before sending them giving you an extra level of protection against losing a contact, if you so desire.



Answer 19:

Spam bots will have their own JavaScript and CSS engines over time, so I think you shouldn't look in this direction.



Answer 20:

Option 1 : Split the email address into multiple parts and create an array in JavaScript out of these parts. Next, join these parts in the correct order and use the .innerHTML property to add the email address to the web page.

 <span id="email">  </span>   <!-- blank tag -->

 <script>
 var parts = ["info", "XXXXabc", "com", "&#46;", "&#64;"];
 var email = parts[0] + parts[4] + parts[1] + parts[3] + parts[2];
 document.getElementById("email").innerHTML=email; 
 </script>

Option 2 : Use image instead of email text

Image creator website from text : http://www.chxo.com/labelgen/

Option 3 : We can use AT instead of "@" and DOT instead of " . "

i.e :

 info(AT)XXXabc(DOT)com 
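
Option 3 can be applied mechanically when a page lists many addresses. A minimal sketch, assuming hypothetical helper names toHumanReadable and fromHumanReadable; the second function shows the substitution a reader (or a script) performs to get the address back:

```javascript
// Hypothetical helper: replace "@" and "." with the (AT) and (DOT) placeholders.
function toHumanReadable(address) {
    return address.replace(/@/g, "(AT)").replace(/\./g, "(DOT)");
}

// The reverse substitution, as a person would do it by hand.
function fromHumanReadable(text) {
    return text.replace(/\(AT\)/g, "@").replace(/\(DOT\)/g, ".");
}

console.log(toHumanReadable("info@XXXabc.com")); // prints "info(AT)XXXabc(DOT)com"
```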


Answer 21:

I just coded the following. I don't know if it's good, but it's better than just writing the email in plain text. Many robots will be fooled, but not all of them.

<script type="text/javascript">
    $(function () {
        setTimeout(function () {
            var m = ['com', '.', 'domain', '@', 'info', ':', 'mailto'].reverse().join('');

            /* Set the contact email url for each "contact us" links.*/
            $('.contactUsLink').prop("href", m);
        }, 200);
    });
</script>

If the robot solves this, then there's no need to add more "simple logic" code like "if (1 == 1 ? '@' : '')" or to add the array elements in another order, since the robot just evals the code anyway.



Answer 22:

Font-awesome works!

<link rel="stylesheet" href="path/to/font-awesome/css/font-awesome.min.css">

<p>myemail<i class="fa fa-at" aria-hidden="true"></i>mydomain.com</p>

http://fontawesome.io/