Commit 0ae4c59

Add missing assets
Signed-off-by: macdonst <[email protected]>
1 parent 847349b commit 0ae4c59

5 files changed

+81
-0
lines changed

@@ -0,0 +1,81 @@
---
title: "Block the Bots in Enhance Projects"
image: '/_public/blog/post-assets/stop-sign.jpg'
image_alt: "An all ways stop sign."
photographer: "John Matychuk"
photographer_url: "https://unsplash.com/@john_matychuk?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash"
category: ai, enhance
description: "Introducing a new plugin for Architect and Enhance projects to block AI crawler bots."
author: 'Simon MacDonald'
avatar: 'simon.png'
mastodon: "@[email protected]"
published: "July 31, 2024"
---

The backlash against Artificial Intelligence bots scraping the web seems to be growing. Web luminaries like [Ethan Marcotte](https://follow.ethanmarcotte.com/@beep) have written about [how and why](https://ethanmarcotte.com/wrote/blockin-bots/) they are opting out of their _work being hoovered up to train “AI” data models_. Sites like [Read The Docs](https://about.readthedocs.com/) are stating that [AI crawlers need to be more respectful](https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/) after seeing their bandwidth usage drop by 75% once they blocked AI bots. Cloud providers like Cloudflare have made it much [easier to block bots](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click).

With Enhance applications, you’ve always been able to block AI crawlers by providing your own `robots.txt` file, but today we are introducing a new plugin called `@enhance/arc-plugin-block-bots`.

## Functionality

The plugin adds a new route to your application at `/robots.txt`. This route tells web crawlers and bots which parts of your website they are allowed to access. By default, the response generated by the plugin looks like this:

```
User-agent: Amazonbot
User-agent: anthropic-ai
User-agent: Applebot-Extended
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: cohere-ai
User-agent: Diffbot
User-agent: FacebookBot
User-agent: FriendlyCrawler
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
User-agent: GPTBot
User-agent: ImagesiftBot
User-agent: img2dataset
User-agent: Meta-ExternalAgent
User-agent: OAI-SearchBot
User-agent: omgili
User-agent: omgilibot
User-agent: PerplexityBot
User-agent: YouBot
Disallow: /
```

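To make the shape of that response concrete, here is a minimal sketch of how a list of user agents could be turned into a `robots.txt` body and served as plain text. The names `buildRobotsTxt` and `robotsHandler` are hypothetical and are not taken from the plugin's source; the response object simply follows the usual Architect HTTP function shape (`statusCode`, `headers`, `body`).

```javascript
// Hypothetical helper (not the plugin's actual code): builds a robots.txt
// body that groups every listed agent under a single "Disallow: /" rule.
function buildRobotsTxt(userAgents) {
  const agentLines = userAgents.map((agent) => `User-agent: ${agent}`)
  return [...agentLines, 'Disallow: /'].join('\n') + '\n'
}

// Hypothetical handler: returns the generated body as a plain-text response.
async function robotsHandler() {
  return {
    statusCode: 200,
    headers: { 'content-type': 'text/plain; charset=utf8' },
    body: buildRobotsTxt(['GPTBot', 'ClaudeBot']),
  }
}

console.log(buildRobotsTxt(['GPTBot', 'ClaudeBot']))
```
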
Once a day, the plugin will check the well-maintained [ai.robots.txt](https://github.com/ai-robots-txt/ai.robots.txt) list for new user agents to block. If the list has been updated, your site’s `robots.txt` file will be updated accordingly. This way you don’t need to keep the file up to date by hand, as the plugin takes care of that chore for you.

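
The daily refresh can be pictured as two steps: parse the upstream list, then compare it with the agents already being blocked. The sketch below assumes the upstream file uses plain `User-agent:` lines like the default response above; `parseUserAgents` and `findNewAgents` are hypothetical names, and fetching the file and scheduling the check are left out because the plugin's actual mechanism is not described here.

```javascript
// Hypothetical helper: extracts agent names from robots.txt-style text.
function parseUserAgents(robotsTxt) {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => line.match(/^\s*User-agent:\s*(.+?)\s*$/i))
    .filter(Boolean)
    .map((match) => match[1])
}

// Hypothetical helper: returns upstream agents that are not blocked yet.
// The comparison is case-insensitive to avoid duplicate entries.
function findNewAgents(currentAgents, upstreamAgents) {
  const known = new Set(currentAgents.map((agent) => agent.toLowerCase()))
  return upstreamAgents.filter((agent) => !known.has(agent.toLowerCase()))
}

// Sample upstream text, made up for illustration.
const upstream = 'User-agent: GPTBot\nUser-agent: ExampleNewBot\nDisallow: /\n'
console.log(findNewAgents(['GPTBot'], parseUserAgents(upstream)))
```

If `findNewAgents` returns an empty array, nothing needs to change; otherwise the served file would be regenerated with the longer list.
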
## Setup

To add `@enhance/arc-plugin-block-bots` to your Enhance application, first install the package:

```bash
npm i @enhance/arc-plugin-block-bots
```

Then edit your `.arc` file to add the plugin.

```arc
@plugins
enhance/arc-plugin-block-bots
```

Then all you need to do is deploy your application and the `/robots.txt` route will be available.

## Future Plans

This is just the first release of our bot-blocking plugin. We’ve noticed that not all bots are well-behaved citizens of the interwebs, as some will ignore your `robots.txt` directives. We are looking at ways to protect each and every route of your application from bots, either with Enhance middleware or by automatically configuring [AWS WAF Bot Control](https://aws.amazon.com/waf/features/bot-control/).

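
As a rough illustration of the middleware idea, the sketch below rejects requests whose `User-Agent` header contains a blocked name. It is not a planned or existing API: `BLOCKED_AGENTS`, `isBlockedAgent`, and `blockBots` are hypothetical, the agent list is a shortened sample, and the lowercase `user-agent` header key is an assumption about how the request arrives.

```javascript
// Shortened sample list; a real list would mirror the robots.txt entries.
const BLOCKED_AGENTS = ['GPTBot', 'ClaudeBot', 'CCBot']

// Case-insensitive substring match against the User-Agent header value.
function isBlockedAgent(userAgent = '') {
  const ua = userAgent.toLowerCase()
  return BLOCKED_AGENTS.some((agent) => ua.includes(agent.toLowerCase()))
}

// Hypothetical middleware: responds with 403 for blocked agents and
// returns nothing otherwise, so a later handler can produce the response.
async function blockBots(req) {
  const headers = req.headers || {}
  const userAgent = headers['user-agent'] || headers['User-Agent'] || ''
  if (isBlockedAgent(userAgent)) {
    return {
      statusCode: 403,
      headers: { 'content-type': 'text/plain; charset=utf8' },
      body: 'Forbidden',
    }
  }
}

blockBots({ headers: { 'user-agent': 'Mozilla/5.0 (compatible; GPTBot/1.0)' } })
  .then((res) => console.log(res ? res.statusCode : 'allowed'))
```

In an Enhance API route, a function like this could sit first in an array of handlers so it runs before the route's own logic. User-agent strings are easy to spoof, so this only stops bots that identify themselves honestly.
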
## Next Steps

* Try out the [plugin](https://github.com/enhance-dev/arc-plugin-block-bots) in your project, and let us know if you have any issues.
* Let us know what features you want to see next in the plugin. Better yet, send us a PR!
* [Follow](https://fosstodon.org/@enhance_dev) Axol, the Enhance mascot, on Mastodon.
* Join the [Enhance Discord](https://enhance.dev/discord) and share what you’ve built, or ask for help.
Binary file not shown.
Binary file not shown.

public/blog/post-assets/stop-sign.jpg

1.86 MB