Distinguish between humans and machines

Robert edited this page Nov 17, 2016 · 5 revisions

This seems to be a difficult task. Some information I found:

"Good" bots like search engine crawlers shouldn't be blocked.

This can be done with robots.txt in combination with a list of known bots (e.g. http://www.robotstxt.org/db.html, http://www.useragentstring.com/pages/useragentstring.php), by comparing the User-Agent string in the request header with the list.

Problem: User-Agent strings can be faked. Comparing them to IP addresses can help detect some fakes, but this is still not reliable.
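A minimal sketch of the check described above, assuming a hand-maintained table that maps a bot's User-Agent token to the networks it is expected to come from. The names `KNOWN_BOTS` and `classify` are hypothetical, and the IP ranges are documentation-only placeholders, not real crawler addresses:

```python
import ipaddress

# Hypothetical data: substrings of crawler user-agents mapped to the networks
# each crawler is assumed to use. The ranges below are documentation-only
# placeholders (RFC 5737), not real crawler addresses.
KNOWN_BOTS = {
    "examplebot": ["192.0.2.0/24"],
    "samplecrawler": ["198.51.100.0/24"],
}


def classify(user_agent: str, remote_ip: str) -> str:
    """Return 'verified-bot', 'fake-bot', or 'unknown' for a request."""
    ua = user_agent.lower()
    ip = ipaddress.ip_address(remote_ip)
    for token, networks in KNOWN_BOTS.items():
        if token in ua:
            # The user-agent claims to be a known bot; check the source IP.
            if any(ip in ipaddress.ip_network(net) for net in networks):
                return "verified-bot"
            return "fake-bot"
    # No known bot token: could be a human or an unlisted bot.
    return "unknown"
```

A request that claims to be a known bot but arrives from an unexpected address is reported as `fake-bot`. Everything else stays `unknown`, which is why this check alone cannot separate humans from unlisted bots.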

Companies offering bot protection (not really "in-depth information"):

https://resources.distilnetworks.com/all-blog-posts/building-a-better-mouse-trap-how-we-detect-and-block-bot-traffic

https://www.incapsula.com/blog/banishing-bad-bots.html

Summary:

Look at Header Data

IP and ASN Verification

Behaviour Monitoring: rate of requests, irregular browsing patterns, and abnormal interaction between clients and servers

Client Technology Fingerprinting: JavaScript footprint and cookie/protocol support
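Of the points in this summary, the request-rate part of behaviour monitoring is the easiest to sketch. The following is only an illustration, assuming a sliding time window per client. The class name `RequestRateMonitor` and the default limits are made up for this example and would need tuning against real traffic:

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flags clients that exceed max_requests within window_seconds."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 10.0):
        # Both limits are assumed values, not taken from any vendor.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> recent request times

    def record(self, client_id: str, timestamp: float) -> bool:
        """Record one request; return True if the client looks suspicious."""
        hits = self._hits[client_id]
        hits.append(timestamp)
        # Drop requests that have fallen out of the sliding window.
        while hits and timestamp - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

The client id could be an IP address or a session cookie. A high rate alone does not prove a bot (think of shared proxies), so the result is better treated as one signal among several.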

### Further Ideas:

Mouse movement tracking

Time differences between inputs
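The time-difference idea could look roughly like this: a sketch that assumes scripted input arrives at nearly constant intervals while human input is irregular. The function name `looks_automated` and both thresholds are assumptions for illustration, not measured values:

```python
from statistics import mean, pstdev


def looks_automated(event_times, min_events=5, cv_threshold=0.1):
    """Guess whether input events were generated by a script.

    event_times: timestamps (seconds) of key presses or clicks, in order.
    Returns True when the gaps between events are nearly constant.
    """
    if len(event_times) < min_events:
        return False  # too little data to judge
    gaps = [b - a for a, b in zip(event_times, event_times[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return True  # all events at the same instant is not human-like
    # Coefficient of variation: spread of the gaps relative to their mean.
    return pstdev(gaps) / avg < cv_threshold
```

A bot that adds random delays would pass this check, so it can only catch naive automation.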