
Distinguish between humans and machines


This seems to be a difficult task. Some information I found:

"Good" bots like search engine crawlers shouldn't be blocked. Can be done with robot.txt in combination with a list of known bots, (e.g.http://www.robotstxt.org/db.html, http://www.useragentstring.com/pages/useragentstring.php) by comparing the useragent-string in the header with the list. Problem: User-agent strings can be faked, comparing them to ip adresses can help detect some fakes. Still not reliable

Companies offering bot protection (not really "in-depth information"): https://resources.distilnetworks.com/all-blog-posts/building-a-better-mouse-trap-how-we-detect-and-block-bot-traffic

https://www.incapsula.com/blog/banishing-bad-bots.html Summary: look at

- Header data
- IP and ASN verification
- Behaviour monitoring: rate of requests, irregular browsing patterns, and abnormal interaction between clients and servers (a rough rate-check sketch follows this list)
- Client technology fingerprinting: JavaScript footprint and cookie/protocol support
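
As an illustration of the behaviour-monitoring point, here is a minimal sketch of a per-client request-rate check. The window size, threshold, and the `looks_like_bot` helper are made-up example values and names, not anything taken from the linked posts:

```python
# Sketch: flag clients whose request rate inside a sliding window is
# implausibly high for a human. Threshold and window are example values.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 10
MAX_REQUESTS_PER_WINDOW = 30

_history: dict[str, deque] = defaultdict(deque)

def looks_like_bot(client_id: str, now: float | None = None) -> bool:
    """Record a request for client_id and return True if its request
    rate within the sliding window exceeds the threshold."""
    now = time.monotonic() if now is None else now
    window = _history[client_id]
    window.append(now)
    # Drop timestamps that have fallen out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_REQUESTS_PER_WINDOW
```

In practice a rate check like this would be combined with the other signals (irregular browsing patterns, missing JavaScript/cookie support) rather than used on its own.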
