How to prevent robots from automatically filling u

Published 2019-01-02 20:08

与风俱净

Question:

I'm trying to come up with a good enough anti-spamming mechanism to prevent automatically generated input. I've read that techniques like captcha, 1+1=? stuff work well, but they also present an extra step impeding the free quick use of the application (I'm not looking for anything like that please).

I've tried setting some hidden fields in all of my forms, with display: none; However, I'm certain a script can be configured to trace that form field id and simply not fill it.

Do you implement/know of a good anti automatic-form-filling-robots method? Is there something that can be done seamlessly with HTML AND/OR server side processing, and be (almost) bulletproof? (without JS as one could simply disable it).

I'm trying not to rely on sessions for this (i.e. counting how many times a button is clicked to prevent overloads).

Answer 1:

An easy-to-implement but not foolproof (especially against targeted attacks) way of fighting spam is to track the time between page load and form submission.

Bots request a page, parse the page and submit the form. This is fast.

Humans type in a URL, load the page, wait until the page is fully loaded, scroll down, read the content, decide whether to comment or fill in the form, take time to fill in the form, and submit.

The difference in time can be subtle, and tracking it without cookies requires some kind of server-side storage such as a database, which may affect performance.
You also need to tune the threshold time.
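
A rough sketch of the server-side bookkeeping this would need. It assumes an in-memory Map stands in for the database; the function names, the token format and the 5-second default are made up for illustration and are not from the answer:

```javascript
// Hypothetical in-memory store; a real deployment would use a database or cache.
const servedAt = new Map();
let nextId = 0;

// Call when the page is served; embed the returned token in a hidden field.
// Math.random() is not cryptographically strong; it is only used for the sketch.
function issueFormToken(now = Date.now()) {
  const token = "t" + (++nextId) + "-" + Math.random().toString(36).slice(2);
  servedAt.set(token, now);
  return token;
}

// Call on submit; unknown or reused tokens count as suspicious.
function isSubmissionTooFast(token, thresholdMs = 5000, now = Date.now()) {
  const start = servedAt.get(token);
  if (start === undefined) return true;
  servedAt.delete(token); // one submission per served form
  return now - start < thresholdMs;
}
```

The threshold is the part that needs tuning: too low and bots slip through, too high and fast typists get rejected.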

Answer 2:

I actually find that a simple Honey Pot field works well. Most bots fill in every form field they see, hoping to get around required field validators.

http://haacked.com/archive/2007/09/11/honeypot-captcha.aspx

If you create a text box, hide it with JavaScript, then verify on the server that the value is blank, this weeds out 99% of robots out there and doesn't cause 99% of your users any frustration at all. The remaining 1% who have JavaScript disabled will still see the text box, but you can add a message like "Leave this field blank" for such cases (if you care about them at all).

(Also, note that if you do style="display:none" on the field, then it's way too easy for a robot to just see that and discard the field, which is why I prefer the JavaScript approach).
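
The server-side half of the honeypot could look roughly like this. It is only a sketch: `fields` is assumed to be the already-parsed form body, and "website" is a hypothetical name for the hidden field.

```javascript
// `fields` is the parsed form body, e.g. { name: "Ann", website: "" }.
// "website" is a hypothetical name for the hidden honeypot input.
function looksLikeBot(fields, honeypotName = "website") {
  const value = fields[honeypotName];
  // A missing field is treated as suspicious too: browsers send empty text inputs.
  if (typeof value !== "string") return true;
  return value.trim() !== "";
}
```

A submission such as `{ name: "Ann", website: "" }` passes, while anything with text in the honeypot is rejected.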

Answer 3:

What if the bot does not find any form at all?

3 examples:

1. Insert your form using AJAX

If you're OK with users who have JS disabled not being able to see or submit the form, you can always notify them using <noscript><p class="error">ERROR: The form could not be loaded. Please, re-enable JavaScript in your browser to fully enjoy our services.</p></noscript>. Then:

Create a form.html and place your form inside a <div id="formContainer"> element.
Then, inside the page where you need that form, use an empty <div id="dynamicForm"></div> and this jQuery:

$("#dynamicForm").load("form.html #formContainer");

2. Build your form entirely using JS

// THE FORM
var $form = $("<form/>", {
  appendTo : $("#formContainer"),
  class    : "myForm",
  submit   : AJAXSubmitForm
});

// EMAIL INPUT
$("<input/>",{
  name        : "Email", // Needed for serialization
  placeholder : "Your Email",
  appendTo    : $form,
  on          : {        // Yes, the jQuery's on() Method 
    input : function() {
      console.log( this.value );
    }
  }
});

// MESSAGE TEXTAREA
$("<textarea/>",{
  name        : "Message", // Needed for serialization
  placeholder : "Your message",
  appendTo    : $form
});

// SUBMIT BUTTON
$("<input/>",{
  type        : "submit",
  value       : "Send",
  name        : "submit",
  appendTo    : $form
});

function AJAXSubmitForm(event) {
  event.preventDefault(); // Prevent Default Form Submission
  // do AJAX instead:
  var serializedData = $(this).serialize();
  alert( serializedData );
  $.ajax({
    url: '/mail.php',
    type: "POST",
    data: serializedData,
    success: function (data) {
      // log the data sent back from PHP
      console.log( data );
    }
  });
}

.myForm input,
.myForm textarea{
  font: 14px/1 sans-serif;
  box-sizing: border-box;
  display:block;
  width:100%;
  padding: 8px;
  margin-bottom:12px;
}
.myForm textarea{
  resize: vertical;
  min-height: 120px;
}

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="formContainer"></div>

3. Bot-bait input

bots like (really like) saucy input elements like:

<input
  type="text"
  name="email"
  id="email"
  placeholder="Your email"
  autocomplete="nope"
  tabindex="-1" />

They'll happily enter some value, such as dsaZusil@kddGDHsj.com

Then, after adding the HTML above, use CSS like this:

input[name=email]{ /* bait input */
    /*
         don't use display:none or visibility:hidden
         cause that will not fool the bot
    */
    position:absolute;
    left:-2000px;
}

Now that your input is not visible to the user, expect in PHP that $_POST["email"] is empty (without any value). Otherwise, don't process the form.

Now all you need to do is create another input like <input name="sender" type="text" placeholder="Your email"> after (!) the "bot-bait" input for the actual user email address.

Acknowledgments:

Developer.Mozilla - Turning off form autocompletition
StackOverflow - Ignore Tabindex

Answer 4:

What I did was use a hidden field containing a timestamp, and then compare it to the timestamp on the server using PHP.

If the form came back in less than 15 seconds (the threshold depends on how big or small your form is), it was a bot.

Hope this helps.
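
A bare timestamp in a hidden field can be edited by the client, so one possible variation is to sign it. The sketch below is in JavaScript (Node) rather than the PHP the answer used. The HMAC signature is an addition, not part of the answer, and `SECRET`, `makeStamp` and `isHumanPaced` are illustrative names:

```javascript
const crypto = require("crypto");

// Hypothetical secret; keep the real one out of source control.
const SECRET = "change-me";

function sign(timestamp) {
  return crypto.createHmac("sha256", SECRET).update(String(timestamp)).digest("hex");
}

// Value for the hidden field: "<timestamp>.<signature>".
function makeStamp(now = Date.now()) {
  return now + "." + sign(now);
}

// Returns true only if the stamp is authentic and at least minSeconds old.
function isHumanPaced(stamp, minSeconds = 15, now = Date.now()) {
  const [ts, sig] = String(stamp).split(".");
  if (!ts || !sig) return false;
  const given = Buffer.from(sig);
  const expected = Buffer.from(sign(ts));
  // timingSafeEqual requires equal lengths, so check that first.
  if (given.length !== expected.length) return false;
  if (!crypto.timingSafeEqual(given, expected)) return false;
  return now - Number(ts) >= minSeconds * 1000;
}
```

This keeps the check stateless: nothing is stored on the server between serving the form and receiving it.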

Answer 5:

A very effective way to virtually eliminate spam is to have a text field that has text in it such as "Remove this text in order to submit the form!" and that text must be removed in order to submit the form.

Upon form validation, if the text field contains the original text, or any random text for that matter, do not submit the form. Bots can read form names and automatically fill in Name and Email fields but do not know if they have to actually remove text from a certain field in order to submit.
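
As a sketch, the validation rule described above comes down to accepting only a completely emptied field. The function name and the returned labels are made up for illustration:

```javascript
const PROMPT_TEXT = "Remove this text in order to submit the form!";

// Classify the field value; only "ok" should let the submission through.
function checkRemoveTextField(value) {
  if (typeof value !== "string") return "missing";
  if (value === PROMPT_TEXT) return "untouched"; // the visitor never cleared it
  if (value.trim() !== "") return "filled";      // a bot typed something else
  return "ok";
}
```

Returning a reason instead of a boolean makes it easier to log which kind of bot is hitting the form.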

I implemented this method on our corporate website and it totally eliminated the spam we were getting on a daily basis. It really works!

Answer 6:

How about creating a text input box that is the same color as the background and must remain blank? This gets around the problem of a bot reading display:none.

Answer 7:

~~http://recaptcha.net/~~

reCAPTCHA is a free anti-bot service that helps digitize books.

It was acquired by Google in 2009:

https://www.google.com/recaptcha
https://developers.google.com/recaptcha/

Also see

https://en.wikipedia.org/wiki/ReCAPTCHA
https://en.wikipedia.org/wiki/CAPTCHA for more general information

Answer 8:

Many of those spam bots are just server-side scripts that prowl the web. You can combat many of them by using some JavaScript to manipulate the form request before it's sent (i.e., setting an additional field based on some client variable). This isn't a full solution, and can lead to many problems (e.g., users without JavaScript, on mobile devices, etc.), but it can be part of your plan of attack.

Here is a trivial example...

<script>
// Requires jQuery to be loaded on the page.
function checkForm()
{
    // When a user submits the form, the secretField's value is changed
    $('input[name=secretField]').val('goodValueEqualsGoodClient');

    return true;
}
</script>

<form id="cheese" onsubmit="return checkForm()">
<input type="text" name="burger">

<!-- Check that this value isn't the default value in your php script -->
<input type="hidden" name="secretField" value="badValueEqualsBadClient">

<input type="submit">
</form>

Somewhere in your php script...

<?php

if (($_REQUEST['secretField'] ?? '') !== 'goodValueEqualsGoodClient')
{
    die('you are a bad client, go away pls.');
}

?>

Also, captchas are great, and really the best defense against spam.

Answer 9:

I'm surprised no one has mentioned this method yet:

On your page, include a small, hidden image.
Place a cookie when serving this image.
When processing the form submission, check for the cookie.
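
The last step, checking for the cookie, might be sketched as below. It assumes the image response set a cookie named "seen_pixel" with value "1"; both are hypothetical and not taken from any plugin:

```javascript
// Hypothetical cookie set by the image response, e.g.
// "Set-Cookie: seen_pixel=1; Path=/; HttpOnly".
const COOKIE_NAME = "seen_pixel";

// Parse a raw Cookie request header into a prototype-free object.
function parseCookies(header) {
  const cookies = Object.create(null);
  for (const part of String(header || "").split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    cookies[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  return cookies;
}

// True when the client fetched the hidden image before submitting.
function loadedHiddenImage(cookieHeader) {
  return parseCookies(cookieHeader)[COOKIE_NAME] === "1";
}
```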

Pros:

convenient for user and developer
seems to be reliable
no JavaScript

Cons:

adds one HTTP request
requires cookies to be enabled on the client

For instance, this method is used by the WordPress plugin Cookies for Comments.

Answer 10:

With the emergence of headless browsers (like PhantomJS), which can emulate anything, you can no longer assume that:

spam bots do not use javascript,

you can track mouse events to detect bots,

they won't see that a field is visually hidden,

they won't wait a given time before submitting.

If that used to be true, it is no longer true.

If you want a user-friendly solution, just give them a beautiful "I am a spammer" submit button:

 <input type="submit" name="ignore" value="I am a spammer!" />
 <input type="image" name="accept" src="submit.png" alt="I am not a spammer" />

Of course you can play with two image input[type=image] buttons, changing the order after each load, the text alternatives, the content of the images (and their size) or the name of the buttons; which will require some server work.

 <input type="image" name="random125454548" src="random125454548.png"
      alt="I perfectly understand that clicking on this link will send the
      e-mail to the expected person" />
 <input type="image" name="random125452548" src="random125452548.png"
      alt="I really want to cancel the submission of this form" />

For accessibility reasons, you have to provide a correct text alternative, but I think a long sentence is better for screen reader users than being treated as a bot.
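
The "server work" for randomized button names could be sketched like this, assuming Node and an already-parsed form body. `makeButtonNames` and `clickedAccept` are illustrative names, and storing the generated names between requests is left out:

```javascript
const crypto = require("crypto");

// Create one form's worth of button names; store them server-side
// (keyed by a form id, for example) until the submission arrives.
function makeButtonNames() {
  const random = () => "b" + crypto.randomBytes(6).toString("hex");
  return { accept: random(), cancel: random() };
}

// Browsers submit a clicked image button as "<name>.x" and "<name>.y" coordinates.
function clickedAccept(fields, names) {
  const pressed = (name) => name in fields || (name + ".x") in fields;
  return pressed(names.accept) && !pressed(names.cancel);
}
```

Note that PHP rewrites the dot in such field names to an underscore, so a PHP handler would look for "<name>_x" instead.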

Answer 11:

A very simple way is to provide some fields like <textarea style="display:none;" name="input"></textarea> and discard all replies that have this filled in.

Another approach is to generate the whole form (or just the field names) using Javascript; few bots can run it.

Anyway, you won't do much against live "bots" from Taiwan or India, that are paid $0.03 per one posted link, and make their living that way.

Answer 12:

I have a simple approach to stopping spammers which is 100% effective, at least in my experience, and avoids the use of reCAPTCHA and similar approaches. I went from close to 100 spams per day on one of my sites' html forms to zero for the last 5 years once I implemented this approach.

It works by taking advantage of the e-mail ALIAS capabilities of most html form handling scripts (I use FormMail.pl), along with a graphic submission "code", which is easily created in the most simple of graphics programs. One such graphic includes the code M19P17nH and the prompt "Please enter the code at left".

This particular example uses a random sequence of letters and numbers, but I tend to use non-English versions of words familiar to my visitors (e.g. "pnofrtay"). Note that the prompt for the form field is built into the graphic, rather than appearing on the form. Thus, to a robot, that form field presents no clue as to its purpose.

The only real trick here is to make sure that your form html assigns this code to the "recipient" variable. Then, in your mail program, make sure that each such code you use is set as an e-mail alias, which points to whatever e-mail addresses you want to use. Since there is no prompt of any kind on the form for a robot to read and no e-mail addresses, it has no idea what to put in the blank form field. If it puts nothing in the form field or anything except acceptable codes, the form submission fails with a "bad recipient" error. You can use a different graphic on different forms, although it isn't really necessary in my experience.

Of course, a human being can solve this problem in a flash, without all the problems associated with reCAPTCHA and similar, more elegant, schemes. If a human spammer does respond to the recipient failure and programs the image code into the robot, you can change it easily, once you realize that the robot has been hard-coded to respond. In five years of using this approach, I've never had a spam from any of the forms on which I use it nor have I ever had a complaint from any human user of the forms. I'm certain that this could be beaten with OCR capability in the robot, but I've never had it happen on any of my sites which use html forms. I have also used "spam traps" (hidden "come hither" html code which points to my anti-spam policies) to good effect, but they were only about 90% effective.

Answer 13:

I'm thinking of many things here:

using JS (although you don't want it) to track mouse move, key press, mouse click
getting the referral url (which in this case should be one from the same domain) ... the normal user must navigate through the website before reaching the contact form: PHP: How to get referrer URL?
using a $_SESSION variable to acquire the IP and check the form submit against that list of IPs
Fill in one text field with some dummy text that you can check on server side if it had been overwritten
Check the browser version: http://chrisschuld.com/projects/browser-php-detecting-a-users-browser-from-php.html ... It's clear that a bot won't use a browser but just a script.
Use AJAX to send the fields one by one and check the difference in time between submissions
Use a fake page before/after the form, just to send another input
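
Of the ideas above, the referrer check is the easiest to sketch. The helper below is hypothetical and treats the header as a weak signal only, since clients can omit or forge it:

```javascript
// Returns true when the Referer header points at the given host.
// The header is optional and trivially forged, so treat this as a weak signal.
function isSameSiteReferer(referer, expectedHost) {
  if (!referer) return false;
  try {
    return new URL(referer).host === expectedHost;
  } catch (err) {
    return false; // not a parseable absolute URL
  }
}
```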

Answer 14:

Another option instead of doing random letters and numbers like many websites do, is to do random pictures of recognizable objects. Then ask the user to type in either what color something in the picture is, or what the object itself is.

All in all, every solution is going to have its advantages and disadvantages. You are going to have to find a happy medium between an antispam mechanism that is too hard for users to pass and the number of spam bots that get through.

Answer 15:

The best solution I've found to avoid getting spammed by bots is using a very trivial question or field on your form.

Try adding a field like these :

Copy "hello" in the box aside
1+1 = ?
Copy the website name in the box

These tricks require the user to understand what must be entered in the form, making it much harder to be the target of massive bot form-filling.
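
A sketch of how the server might pick and verify such a question. The question bank, ids and function names are invented for illustration:

```javascript
// Hypothetical question bank; answers are compared case-insensitively.
const QUESTIONS = [
  { id: "copy-hello", prompt: 'Copy "hello" in the box', answer: "hello" },
  { id: "one-plus-one", prompt: "1+1 = ?", answer: "2" },
];

// Choose a question to render; the id goes into a hidden field.
function pickQuestion(index = Math.floor(Math.random() * QUESTIONS.length)) {
  const { id, prompt } = QUESTIONS[index];
  return { id, prompt }; // never send the answer to the client
}

// Check the visitor's reply against the stored answer.
function isAnswerCorrect(id, reply) {
  const question = QUESTIONS.find((q) => q.id === id);
  if (!question || typeof reply !== "string") return false;
  return reply.trim().toLowerCase() === question.answer;
}
```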

EDIT

The downside of this method, as you stated in your question, is the extra step the user must take to validate the form. But, in my opinion, it is far simpler than a captcha, and the overhead when filling in the form is no more than 5 seconds, which seems acceptable from the user's point of view.

Answer 16:

There is a tutorial about this on the jQuery site. Although it's jQuery, the idea is framework independent.

If JavaScript isn't available then you may need to fall back to CAPTCHA type approach.

Answer 17:

The easy way I found to do this is to put in a field with a value and ask the user to remove the text in this field, since bots only fill fields in. If the field is not empty, it means the user is not human and the form won't be posted. It serves the same purpose as a captcha code.

Answer 18:

It's just an idea; I used it in my application and it works well.

You can create a cookie on mouse movement with JavaScript or jQuery and check on the server side whether the cookie exists. Because only humans have a mouse, the cookie can only be created by them. The cookie can be a timestamp or a token that can be validated.

Answer 19:

Use 1) form with tokens 2) Check form to form delay with IP address 3) Block IP (optional)
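
Step 2, the per-IP delay between submissions, might look something like this. It is a sketch with a hypothetical in-memory store and an arbitrary 30-second default:

```javascript
// Last accepted submission time per IP; hypothetical in-memory store.
const lastSubmit = new Map();

// Returns true if this IP may submit now, and records accepted submissions.
function allowSubmission(ip, minDelayMs = 30000, now = Date.now()) {
  const previous = lastSubmit.get(ip);
  if (previous !== undefined && now - previous < minDelayMs) {
    return false;
  }
  lastSubmit.set(ip, now);
  return true;
}
```

Keep in mind that several real users can share one IP address, so the delay should stay short.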

Answer 20:

In my experience, if the form is just a "contact" form, you don't need special measures. Spam gets decently filtered by webmail services (you can track web form requests via server scripts to see what actually reaches your email; of course, I assume you have a good webmail service :D).

Btw I'm trying not to rely on sessions for this (like, counting how many times a button is clicked to prevent overloads).

I don't think that's good. Indeed, what I want to achieve is receiving emails from users who perform some particular action, because those are the users I'm interested in (for example, users who looked at the "CV" page and used the proper contact form). So if the user does something I want, I start tracking their session and set a cookie (I always set a session cookie, but when I don't start a session it is just a fake cookie made to make the user believe they have a session). If the user does something unwanted, I don't bother keeping a session for them, so no overload etc.

Also, it would be nice if advertising services offered some kind of API (maybe that already exists) to see whether the user "looked at the ad". It is likely that users looking at ads are real users, and if they are not, well, at least you get 1 view anyway, so nothing is lost. (And trust me, ad controls are more sophisticated than anything you can do alone.)

Answer 21:

Actually the trap with display: none works like a charm. It helps to move the CSS declaration to a file containing any global style sheets, which would force spam bots to load those as well (a direct style="display:none;" declaration could likely be interpreted by a spam bot, as could a local style declaration within the document itself).

This combined with other countermeasures should make it moot for any spam bots to unload their junk (I have a guest book secured with a variety of measures, and so far they have fallen for my primary traps - however, should any bot bypass those, there are others ready to trigger).

What I'm using is a combination of fake form fields (also described as invalid fields in case a browser is used that doesn't handle CSS in general or display: none in particular), sanity checks (i. e. is the format of the input valid?), time stamping (both too fast and too slow submissions), MySQL (for implementing blacklists based on e-mail and IP addresses as well as flood filters), DNSBLs (e. g. the SBL+XBL from Spamhaus), text analysis (e. g. words that are a strong indication for spam) and verification e-mails (to determine whether or not the e-mail address provided is valid).
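
As one example from that list, the text-analysis step could start out as a simple score. The word list, weights and threshold below are placeholders, not values from the answer:

```javascript
// Hypothetical word list and weights; tune them against your own spam.
const SPAM_WORDS = ["viagra", "casino", "payday loan"];

function spamScore(message) {
  const text = String(message).toLowerCase();
  let score = 0;
  for (const word of SPAM_WORDS) {
    if (text.includes(word)) score += 2;
  }
  // Count links; many links in a short message is a common spam trait.
  const links = text.match(/https?:\/\//g) || [];
  score += links.length;
  return score;
}

function isLikelySpam(message, threshold = 3) {
  return spamScore(message) >= threshold;
}
```

A score like this works best as one signal among the others listed, not as the only filter.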

One note on verification mails: This step is entirely optional, but when one chooses to implement it, this process must be as easy-to-use as possible (that is, it should boil down to clicking a link contained in the e-mail) and cause the e-mail address in question to be whitelisted for a certain period of time so that subsequent verifications are avoided in case that user wants to make additional posts.

Answer 22:

I use a method with a hidden text box. Since bots parse the website, they will probably fill it in. Then I check whether it is empty; if it is not, the site sends the visitor back.
Add email verification: the user receives an email and needs to click a link. Otherwise, discard the post after some time.

Answer 23:

I've added a time check to my forms. The forms will not be submitted if filled in in less than 3 seconds, and this has been working great for me, especially for long forms. Here's the form check function that I call on the submit button:

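var timeStart = null; // set on the first interaction with any input

// Record when the user first clicks or types in a field.
// This must be bound at page load, not inside formCheck().
$("input").on('click keyup', function () {
    if (timeStart === null) {
        timeStart = new Date().getTime();
    }
});

function formCheck() {
    // If no interaction was recorded, treat the submission as too fast
    var timediff = timeStart === null
        ? 0
        : (new Date().getTime() - timeStart) / 1000;

    if (timediff < 3) {
        //throw a warning or don't submit the form
    }
    else submit(); // some submit function
}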

Answer 24:

You can try to fool spam robots by adding the correct action attribute only after JavaScript validation, so a robot that blocks JavaScript never submits the form correctly.

HTML

<form id="form01" action="false-action.php">
    <!-- your inputs -->
    <button>SUBMIT</button>
</form>

JAVASCRIPT

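$('#form01 button').click(function(event){
    event.preventDefault(); // stop the submission to the false action

    //your Validations and if everything is ok:

    // .attr() takes effect immediately, so the form can be submitted right away
    $('#form01').attr('action', 'correct-action.php');
    document.getElementById('form01').submit();
})

Note that .attr() is synchronous, so no callback is needed before calling submit(); a form element does not fire a "load" event.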

Answer 25:

With increasingly sophisticated spam bots and techniques like automated browsers, it will become harder to determine the source of spam. But whether posted by software, a human, or both, spam is spam because of its content. I think the best solution is to run the posted content through an anti-spam API like Cleantalk or Akismet. It's relatively cheap and effective and doesn't hassle the user. You can check form submission times and the other traditional checks for less sophisticated bots before hitting the API.

Answer 26:

Many robots cannot execute JavaScript, so you can do something like injecting some kind of hidden element into the page with JavaScript and then detecting its presence prior to form submission. Beware, though, because some of your users will also have JavaScript disabled.

Otherwise I think you will be forced to use a form of client proof of "humanness"

Answer 27:

Just my five cents' worth. If the goal is to stop 99% of robots, which sounds pretty good, and if 99% of robots can't run JavaScript, the best solution of all is simply not to use a form whose action submits to a POST URL.

If the form is controlled via JavaScript, and the JavaScript collects the form data and then sends it via an HTTP request, a robot that cannot run JavaScript cannot submit the form, since the submit button relies on JavaScript to run the code that sends the form.
