How do you determine if a user is not a bot accessing your site?
There are 4 things we are looking for:
- User agent string. This is very easy to spoof, but scanners often use their own unique user agent string.
- Speed of access to pages. If they request more than one page every second and a half or so, that is usually a good indicator.
- Whether they request only the HTML or the whole page (images, CSS, scripts). Some crawlers ask only for the HTML. This is usually a good hint.
- Incoming URL.
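To make the four checks concrete, here is a rough sketch of how they could be combined into a simple score. It is only an illustration: the token list, the 1.5 second threshold, the window size, and all function names are my own assumptions, and I am reading "incoming URL" as the Referer header, which may not be what was meant.

```python
import time
from collections import defaultdict, deque

# Hypothetical substrings; real scanners vary, and any client can spoof this header.
SUSPICIOUS_UA_TOKENS = ("bot", "crawler", "spider", "scanner")

MIN_SECONDS_PER_PAGE = 1.5  # assumed threshold for "too fast"
WINDOW = 5                  # assumed number of recent page hits to average over

# Timestamps of the most recent page hits, per client (e.g. per IP or session).
_recent_hits = defaultdict(lambda: deque(maxlen=WINDOW))


def suspicious_user_agent(user_agent):
    """Return True if the User-Agent is missing or contains a listed token."""
    ua = (user_agent or "").lower()
    return not ua or any(token in ua for token in SUSPICIOUS_UA_TOKENS)


def too_fast(client_id, now=None):
    """Record a page hit; return True if the average gap is under the threshold."""
    now = time.monotonic() if now is None else now
    hits = _recent_hits[client_id]
    hits.append(now)
    if len(hits) < WINDOW:
        return False  # not enough data yet to judge
    average_gap = (hits[-1] - hits[0]) / (len(hits) - 1)
    return average_gap < MIN_SECONDS_PER_PAGE


def bot_score(client_id, user_agent, fetched_assets, referrer, now=None):
    """Count how many of the four signals look bot-like (0 to 4)."""
    signals = [
        suspicious_user_agent(user_agent),
        too_fast(client_id, now),
        not fetched_assets,  # only the HTML was requested, no images/CSS/JS
        not referrer,        # no incoming URL was sent
    ]
    return sum(signals)
```

No single signal is reliable on its own (legitimate search engine crawlers also match the user agent tokens, and real users often send no referrer), so a score like this is better used to flag traffic for a closer look than to block it outright.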
A sort of reverse CAPTCHA can also help: create a text input field with display: none; in its style attribute (or in your stylesheet). Real users never see it, so if it comes back filled in, you are most likely dealing with a bot.
Edit: This was actually something that came up in my RSS reader; if I can find the source I'll post a good example.
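In the meantime, a minimal sketch of the hidden-field idea might look like the following. The field name "website" and the helper name are made up for illustration, and the form data is assumed to arrive as a plain dict of strings.

```python
# Hypothetical field name; pick something a bot is likely to fill in.
HONEYPOT_FIELD = "website"

# The form as it would be served. The honeypot input is hidden from real
# users with display: none, so they leave it empty.
FORM_HTML = """
<form method="post" action="/comment">
  <textarea name="comment"></textarea>
  <input type="text" name="website" style="display: none;" autocomplete="off">
  <button type="submit">Send</button>
</form>
"""


def looks_like_bot(form_data):
    """Return True if the hidden honeypot field was submitted with a value."""
    value = form_data.get(HONEYPOT_FIELD, "")
    return bool(value.strip())
```

For example, `looks_like_bot({"comment": "hi", "website": ""})` is false, while a submission that fills in `website` is flagged. Bots that parse CSS or skip unknown fields will get past this, so it works best alongside the other checks.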
Take a look at Bad Behavior, a library that uses a wide variety of bot detection methods.