Commit 319af3c

Merge pull request #1 from ScrapeGraphAI/async-examples-services
feat: add async examples directly to each service
2 parents ec4a8e0 + 615ebdd commit 319af3c

10 files changed: +329 −19 lines

README.md (+83 −17)

@@ -1,32 +1,98 @@
-# Mintlify Starter Kit
+# 📚 ScrapeGraphAPI Documentation
 
-Click on `Use this template` to copy the Mintlify starter kit. The starter kit contains examples including
+<p align="center">
+  <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/api-banner.png" alt="ScrapeGraphAPI Logo"/>
+</p>
 
-- Guide pages
-- Navigation
-- Customizations
-- API Reference pages
-- Use of popular components
+<p align="center">
+  <a href="https://docs.scrapegraphai.com">Documentation</a> |
+  <a href="https://scrapegraphai.com">Website</a> |
+  <a href="https://discord.gg/uJN7TYcpNa">Discord</a>
+</p>
 
-### Development
+This repository contains the official documentation for ScrapeGraphAPI, a suite of AI-powered web scraping tools. Our documentation is built using [Mintlify](https://mintlify.com), providing a modern and interactive documentation experience.
 
-Install the [Mintlify CLI](https://www.npmjs.com/package/mintlify) to preview the documentation changes locally. To install, use the following command
+## 🚀 Quick Start
 
-```
+### Prerequisites
+
+Install the [Mintlify CLI](https://www.npmjs.com/package/mintlify) to preview the documentation locally:
+
+```bash
 npm i -g mintlify
 ```
 
-Run the following command at the root of your documentation (where mint.json is)
+### Local Development
 
-```
+1. Clone this repository
+2. Navigate to the documentation directory
+3. Run the development server:
+
+```bash
 mintlify dev
 ```
 
-### Publishing Changes
+The documentation will be available at `http://localhost:3000`.
+
+## 📚 Documentation Structure
+
+```
+.
+├── introduction.mdx        # Main introduction page
+├── services/               # Core services documentation
+│   ├── smartscraper.mdx    # SmartScraper service
+│   ├── localscraper.mdx    # LocalScraper service
+│   ├── markdownify.mdx     # Markdownify service
+│   └── extensions/         # Browser extensions
+│       └── firefox.mdx     # Firefox extension
+├── sdks/                   # SDK documentation
+│   ├── python.mdx          # Python SDK
+│   └── javascript.mdx      # JavaScript SDK
+└── api-reference/          # API reference documentation
+```
+
+## 🤝 Contributing
+
+We welcome contributions to improve our documentation! Here's how you can help:
+
+1. Fork this repository
+2. Create a new branch for your changes
+3. Make your changes to the documentation
+4. Test locally using `mintlify dev`
+5. Submit a pull request
+
+### Contribution Guidelines
+
+- Follow the existing documentation structure
+- Use Markdown and MDX for content
+- Include clear commit messages
+- Test all changes locally before submitting
+- Update the navigation in `mint.json` if adding new pages
+
+## 🛠️ Development Tools
+
+- **Mintlify**: Documentation framework
+- **MDX**: Extended Markdown syntax
+- **React**: For interactive components
+- **Node.js**: Required for local development
+
+## 📝 License
+
+This documentation is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+## 🔗 Useful Links
+
+- [ScrapeGraphAI Website](https://scrapegraphai.com)
+- [API Documentation](https://docs.scrapegraphai.com/api-reference/introduction)
+- [Discord Community](https://discord.gg/uJN7TYcpNa)
+- [GitHub Organization](https://github.com/ScrapeGraphAI)
+
+## 💡 Need Help?
 
-Install our Github App to auto propagate changes from your repo to your deployment. Changes will be deployed to production automatically after pushing to the default branch. Find the link to install on your dashboard.
+- Join our [Discord community](https://discord.gg/uJN7TYcpNa)
+- Email us at [[email protected]](mailto:[email protected])
+- Check our [status page](https://scrapegraphapi.openstatus.dev)
 
-#### Troubleshooting
+## ⭐ Show Your Support
 
-- Mintlify dev isn't running - Run `mintlify install` it'll re-install dependencies.
-- Page loads as a 404 - Make sure you are running in a folder with `mint.json`
+If you find our documentation helpful, please consider giving us a star on GitHub!

integrations/llamaindex.mdx (+1 −1)

@@ -10,7 +10,7 @@ This tool integrates ScrapeGraph with LlamaIndex, providing intelligent web scra
 <Card
   title="Official LlamaHub Documentation"
   icon="book"
-  href="https://llamahub.ai/l/tools/llama-index-tools-scrapegraph"
+  href="https://docs.llamaindex.ai/en/stable/api_reference/tools/scrapegraph/"
 >
   View the integration on LlamaHub
 </Card>

mint.json (+7 −1)

@@ -84,7 +84,13 @@
       "pages": [
         "services/smartscraper",
         "services/localscraper",
-        "services/markdownify"
+        "services/markdownify",
+        {
+          "group": "Extensions",
+          "pages": [
+            "services/extensions/firefox"
+          ]
+        }
       ]
     },
     {

services/extensions/firefox.mdx (new file, +167)

---
title: 'Firefox'
description: 'Extract data from any webpage directly in your browser'
icon: 'firefox'
---

<Frame>
  <img src="/services/images/firefox-banner.png" alt="ScrapeGraphAI Firefox Extension" />
</Frame>

## Overview

The ScrapeGraphAI Firefox Extension brings the power of our AI-driven data extraction directly to your browser. With a simple prompt, you can extract structured data from any webpage you're visiting, making web research and data collection effortless.

<Note>
  Install the extension from the [Firefox Add-ons Store](https://addons.mozilla.org/en-US/firefox/addon/scrapegraphai-extension/)
</Note>

## Key Features

<CardGroup cols={2}>
  <Card title="One-Click Access" icon="mouse-pointer">
    Extract data from any webpage with a single click
  </Card>
  <Card title="Natural Language Prompts" icon="comments">
    Describe what you want to extract in plain English
  </Card>
  <Card title="Structured Output" icon="table">
    Get clean, structured JSON data
  </Card>
  <Card title="Browser Integration" icon="browser">
    Seamless integration with your browsing experience
  </Card>
</CardGroup>

## Getting Started

### API Key Setup

First, you'll need to get your ScrapeGraphAI API key:

<Frame>
  <img src="/services/images/firefox-api-key.png" alt="Firefox Extension API Key Setup" />
</Frame>

1. Get your API key from the [dashboard](https://dashboard.scrapegraphai.com)
2. Click the ScrapeGraphAI icon in your toolbar
3. Enter your API key in the settings
4. Click Save to store your API key securely

### Usage

Once your API key is set up, you can start extracting data:

<Frame>
  <img src="/services/images/firefox-main-view.png" alt="Firefox Extension Main View" />
</Frame>

1. Navigate to any webpage you want to extract data from
2. Click the ScrapeGraphAI icon in your toolbar
3. Enter a prompt describing what information you want to extract
4. Click Extract to receive structured JSON data based on your prompt

### Example: Reddit Posts Extraction

Here's an example of extracting posts from Reddit:

```text
"List me all the posts in the page, extracting title, post url, image url, upvotes and comments"
```

The extension will return structured JSON data like this:

```json
{
  "posts": [
    {
      "title": "Example Reddit Post Title",
      "post_url": "https://reddit.com/r/example/post1",
      "image_url": "https://i.redd.it/example1.jpg",
      "upvotes": 1234,
      "comments": 56
    },
    {
      "title": "Another Reddit Post",
      "post_url": "https://reddit.com/r/example/post2",
      "image_url": "https://i.redd.it/example2.jpg",
      "upvotes": 789,
      "comments": 23
    }
    // ... more posts
  ]
}
```

## Example Use Cases

### Research
- Extract company information
- Gather product details
- Collect article data
- Compile contact information

### Data Collection
- Market research
- Competitive analysis
- Content aggregation
- Price monitoring

### Personal Use
- Save article summaries
- Extract recipe ingredients
- Compile reading lists
- Save product comparisons

## Example Prompts

Here are some example prompts you can use with the extension:

```text
"Extract the main article content, author, and publication date"
"Get all product information including price, features, and specifications"
"Find all contact information including email, phone, and social media links"
"Extract company information including location, size, and industry"
```

## Best Practices

### Writing Effective Prompts
1. Be specific about what you want to extract
2. Include all relevant fields in your prompt
3. Use clear, natural language
4. Consider the structure of your desired output

### Tips for Better Results
- Wait for the page to load completely before extraction
- Ensure you have access to all content you want to extract
- Use specific terminology when requesting technical information
- Review the extracted data for accuracy

## Privacy & Security

- The extension only accesses webpage content when explicitly triggered
- Data is processed securely through ScrapeGraphAI's servers
- No personal browsing data is stored or tracked
- Complies with Mozilla's privacy requirements

## Support & Resources

<CardGroup cols={2}>
  <Card title="Documentation" icon="book" href="/introduction">
    Comprehensive guides and tutorials
  </Card>
  <Card title="Community" icon="discord" href="https://discord.gg/uJN7TYcpNa">
    Join our Discord community
  </Card>
  <Card title="GitHub" icon="github" href="https://github.com/ScrapeGraphAI">
    Check out our open-source projects
  </Card>
  <Card title="Support" icon="headset" href="mailto:[email protected]">
    Get help with the extension
  </Card>
</CardGroup>

<Card title="Ready to Start?" icon="rocket" href="https://addons.mozilla.org/en-US/firefox/addon/scrapegraphai-extension/">
  Install the Firefox extension now and start extracting data with ease!
</Card>
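Once the extension returns JSON like the Reddit sample above, it is easy to post-process with a few lines of ordinary Python. The sketch below has no ScrapeGraphAI dependency; it simply parses the example response shown above and finds the most-upvoted post:

```python
import json

# The example response shape returned by the extension (copied from above)
raw = """
{
  "posts": [
    {"title": "Example Reddit Post Title",
     "post_url": "https://reddit.com/r/example/post1",
     "image_url": "https://i.redd.it/example1.jpg",
     "upvotes": 1234, "comments": 56},
    {"title": "Another Reddit Post",
     "post_url": "https://reddit.com/r/example/post2",
     "image_url": "https://i.redd.it/example2.jpg",
     "upvotes": 789, "comments": 23}
  ]
}
"""

data = json.loads(raw)
# Pick the post with the highest upvote count
top = max(data["posts"], key=lambda post: post["upvotes"])
print(top["title"])  # Example Reddit Post Title
```

Because the extension returns plain JSON, the same pattern works for any prompt: parse once, then filter, sort, or export the records as needed.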

services/images/firefox-api-key.png (22.7 KB)

services/images/firefox-banner.png (662 KB)

services/images/firefox-main-view.png (21.1 KB)

services/localscraper.mdx (+32 −0)

@@ -161,6 +161,38 @@ const response = await localScraper(
 
 </CodeGroup>
 
+### Async Support
+
+For applications requiring asynchronous execution, LocalScraper provides async support through the `AsyncClient`:
+
+```python
+from scrapegraph_py import AsyncClient
+import asyncio
+
+async def main():
+    html_content = """
+    <html>
+        <body>
+            <h1>Product: Gaming Laptop</h1>
+            <div class="price">$999.99</div>
+            <div class="description">
+                High-performance gaming laptop with RTX 3080.
+            </div>
+        </body>
+    </html>
+    """
+
+    async with AsyncClient(api_key="your-api-key") as client:
+        response = await client.localscraper(
+            website_html=html_content,
+            user_prompt="Extract the product information"
+        )
+        print(response)
+
+# Run the async function
+asyncio.run(main())
+```
+
 ## Integration Options
 
 ### Official SDKs
services/markdownify.mdx (+19 −0)

@@ -89,6 +89,25 @@ The response includes:
 - `error`: Error message (if any occurred during conversion)
 </Accordion>
 
+### Async Support
+
+For applications requiring asynchronous execution, Markdownify provides async support through the `AsyncClient`:
+
+```python
+from scrapegraph_py import AsyncClient
+import asyncio
+
+async def main():
+    async with AsyncClient(api_key="your-api-key") as client:
+        response = await client.markdownify(
+            website_url="https://example.com/article"
+        )
+        print(response)
+
+# Run the async function
+asyncio.run(main())
+```
+
 ## Integration Options
 
 ### Official SDKs
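When converting many URLs with the async client added above, it is common to cap how many requests run at once. The sketch below demonstrates that pattern with `asyncio.Semaphore`; the `markdownify` coroutine here is a hypothetical stand-in for `client.markdownify` (so the example runs without the SDK), and the concurrency limit of 3 is an arbitrary illustrative choice:

```python
import asyncio

# Hypothetical stand-in for client.markdownify; swap in the real
# AsyncClient call in practice.
async def markdownify(url: str) -> str:
    await asyncio.sleep(0)  # simulate network I/O
    return f"# Markdown for {url}"

async def convert(sem: asyncio.Semaphore, url: str) -> str:
    async with sem:  # at most `Semaphore(n)` conversions run at once
        return await markdownify(url)

async def main() -> list:
    urls = [f"https://example.com/article-{i}" for i in range(10)]
    sem = asyncio.Semaphore(3)  # cap concurrent requests at 3
    return await asyncio.gather(*(convert(sem, url) for url in urls))

docs = asyncio.run(main())
print(len(docs))  # 10
```

The semaphore keeps the client well-behaved against rate limits while still overlapping network waits, which is the main benefit of the async API.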

0 commit comments