
Commit 33f3efe

Merge pull request #5 from ScrapeGraphAI/crewai
feat: add crewai integration
2 parents de31fd1 + 3656b00

File tree

2 files changed: +160 −0 lines


integrations/crewai.mdx

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
---
title: '👥 CrewAI'
description: 'Use CrewAI with Scrapegraph'
---

## Overview

[CrewAI](https://github.com/joaomdmoura/crewAI) is a framework for orchestrating role-playing AI agents. With the Scrapegraph CrewAI integration, you can easily incorporate web scraping capabilities into your agent workflows.

<Card
  title="Try it in Google Colab"
  icon="google"
  href="https://colab.research.google.com/drive/14f1nEmo_kPRhc6zTXcm3a9MFcF1aP_-l?authuser=2"
>
  Interactive example notebook to get started with CrewAI and Scrapegraph
</Card>

## Installation

Install the required packages:

```bash
pip install crewai scrapegraph-tools python-dotenv
```

## Available Tools

### ScrapegraphScrapeTool

The `ScrapegraphScrapeTool` provides web scraping capabilities to your CrewAI agents:

```python
from crewai import Agent
from crewai_tools import ScrapegraphScrapeTool
from dotenv import load_dotenv

# Load SCRAPEGRAPH_API_KEY from your .env file
load_dotenv()

# Initialize the tool
tool = ScrapegraphScrapeTool()

# Create an agent with the tool
agent = Agent(
    role="Web Researcher",
    goal="Research and extract accurate information from websites",
    backstory="You are an expert web researcher with experience in extracting and analyzing information from various websites.",
    tools=[tool],
)
```

<Accordion title="Complete Example" icon="code">
```python
from crewai import Agent, Crew, Task
from crewai_tools import ScrapegraphScrapeTool
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Initialize the Scrapegraph tool
tool = ScrapegraphScrapeTool()

# Create an agent with the Scrapegraph tool
agent = Agent(
    role="Web Researcher",
    goal="Research and extract accurate information from websites",
    backstory="You are an expert web researcher with experience in extracting and analyzing information from various websites.",
    tools=[tool],
)

# Define a task for the agent
task = Task(
    name="scraping task",
    description="Visit the website https://scrapegraphai.com and extract detailed information about the founders, including their names, roles, and any relevant background information.",
    expected_output="A file with the information extracted from the website.",
    agent=agent,
)

# Create a crew with the agent and task
crew = Crew(
    agents=[agent],
    tasks=[task],
)

# Execute the task
result = crew.kickoff()
```
</Accordion>

## Configuration

Set your Scrapegraph API key in your environment:

```bash
export SCRAPEGRAPH_API_KEY="your-api-key-here"
```

Or using a `.env` file:

```bash
SCRAPEGRAPH_API_KEY=your_api_key_here
```

<Note>
Get your API key from the [dashboard](https://dashboard.scrapegraphai.com)
</Note>

## Use Cases

<CardGroup cols={2}>
  <Card title="Content Research" icon="magnifying-glass">
    Gather information from multiple websites for market research or competitive analysis
  </Card>
  <Card title="Data Collection" icon="database">
    Extract structured data from websites for analysis or database population
  </Card>
  <Card title="Automated Monitoring" icon="chart-line">
    Keep track of changes on specific web pages
  </Card>
  <Card title="Information Extraction" icon="filter">
    Extract specific data points using natural language
  </Card>
</CardGroup>

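Structured extraction is only useful if the output actually matches the shape you expect. As a hedged illustration (the dict schema and field names here are hypothetical, not part of the Scrapegraph API), a post-hoc check on extracted records might look like:

```python
def validate_founder_record(record: dict) -> list[str]:
    """Return a list of problems with an extracted record (empty list if valid)."""
    problems = []
    # Required string fields for this hypothetical schema
    for field in ("name", "role"):
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems
```

Running such a check before writing results to a database catches partial extractions early instead of propagating them downstream.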
## Best Practices

<CardGroup cols={2}>
  <Card title="Rate Limiting" icon="gauge">
    Be mindful of website rate limits and implement appropriate delays
  </Card>
  <Card title="Error Handling" icon="bug">
    Implement proper error handling for failed requests
  </Card>
  <Card title="Data Validation" icon="check">
    Verify extracted data meets requirements
  </Card>
  <Card title="Ethical Scraping" icon="shield">
    Respect robots.txt and website terms of service
  </Card>
</CardGroup>

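The tool and the agent framework handle the HTTP details, but the rate-limiting and error-handling practices above can be applied in plain Python around any call you make. A minimal sketch of retry-with-backoff (the wrapper name and parameters are illustrative, not part of crewai or scrapegraph-tools):

```python
import time

def call_with_retries(fn, *, max_attempts=3, base_delay=1.0):
    """Call fn(), retrying on failure with exponential backoff between attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Re-raise once the retry budget is exhausted
            if attempt == max_attempts:
                raise
            # Back off before the next attempt: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

For example, `call_with_retries(crew.kickoff)` would retry a transient failure a few times while spacing out requests; narrowing the caught exception type to the errors you actually expect is usually better than a bare `Exception`.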
## Support

Need help with the integration?

<CardGroup cols={2}>
  <Card
    title="GitHub Repository"
    icon="github"
    href="https://github.com/scrapegraph/scrapegraph-tools"
  >
    Report bugs and request features
  </Card>
  <Card
    title="Discord Community"
    icon="discord"
    href="https://discord.gg/scrapegraph"
  >
    Get help from our community
  </Card>
</CardGroup>

mint.json

Lines changed: 1 addition & 0 deletions
@@ -105,6 +105,7 @@
     "pages": [
       "integrations/langchain",
       "integrations/llamaindex",
+      "integrations/crewai",
       "integrations/phidata"
     ]
   },

0 commit comments
