Commit e7c532f

Merge pull request #7 from ScrapeGraphAI/feat-add-searchscraper
Feat add searchscraper
2 parents: 33f3efe + ef01ef4

24 files changed (+1450 −908 lines)

README.md

+1 −1

```diff
@@ -41,7 +41,7 @@ The documentation will be available at `http://localhost:3000`.
 ├── introduction.mdx        # Main introduction page
 ├── services/               # Core services documentation
 │   ├── smartscraper.mdx    # SmartScraper service
-│   ├── localscraper.mdx    # LocalScraper service
+│   ├── searchscraper.mdx   # SearchScraper service
 │   ├── markdownify.mdx     # Markdownify service
 │   └── extensions/         # Browser extensions
 │       └── firefox.mdx     # Firefox extension
```

api-reference/endpoint/localscraper/get-status.mdx

−13 (this file was deleted)

api-reference/endpoint/localscraper/start.mdx

−48 (this file was deleted)
New file (+93 lines):

````diff
@@ -0,0 +1,93 @@
+---
+title: 'Get SearchScraper Status'
+api: 'GET /v1/searchscraper/{request_id}'
+description: 'Get the status and results of a previous search request'
+---
+
+## Path Parameters
+
+<ParamField path="request_id" type="string" required>
+  The unique identifier of the search request to retrieve.
+
+  Example: "123e4567-e89b-12d3-a456-426614174000"
+</ParamField>
+
+## Response
+
+<ResponseField name="request_id" type="string">
+  The unique identifier of the search request.
+</ResponseField>
+
+<ResponseField name="status" type="string">
+  Status of the request. One of: "queued", "processing", "completed", "failed"
+</ResponseField>
+
+<ResponseField name="user_prompt" type="string">
+  The original search query that was submitted.
+</ResponseField>
+
+<ResponseField name="result" type="object">
+  The search results. If an output_schema was provided in the original request, this will be structured according to that schema.
+</ResponseField>
+
+<ResponseField name="reference_urls" type="array">
+  List of URLs that were used as references for the answer.
+</ResponseField>
+
+<ResponseField name="error" type="string">
+  Error message if the request failed. Empty string if successful.
+</ResponseField>
+
+## Example Request
+
+```bash
+curl 'https://api.scrapegraphai.com/v1/searchscraper/123e4567-e89b-12d3-a456-426614174000' \
+  -H 'SGAI-APIKEY: YOUR_API_KEY'
+```
+
+## Example Response
+
+```json
+{
+  "request_id": "123e4567-e89b-12d3-a456-426614174000",
+  "status": "completed",
+  "user_prompt": "What is the latest version of Python and what are its main features?",
+  "result": {
+    "version": "3.12",
+    "release_date": "October 2, 2023",
+    "major_features": [
+      "Improved error messages",
+      "Per-interpreter GIL",
+      "Support for the Linux perf profiler",
+      "Faster startup time"
+    ]
+  },
+  "reference_urls": [
+    "https://www.python.org/downloads/",
+    "https://docs.python.org/3.12/whatsnew/3.12.html"
+  ],
+  "error": ""
+}
+```
+
+## Error Responses
+
+<ResponseField name="400" type="object">
+  Returned when the request_id is not a valid UUID.
+
+  ```json
+  {
+    "error": "request_id must be a valid UUID"
+  }
+  ```
+</ResponseField>
+
+<ResponseField name="404" type="object">
+  Returned when the request_id is not found.
+
+  ```json
+  {
+    "error": "Request not found"
+  }
+  ```
+</ResponseField>
````
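Since `GET /v1/searchscraper/{request_id}` is the polling half of an asynchronous flow, a client usually validates the ID up front (the API answers 400 for a malformed UUID), builds the URL, and stops polling once the status is terminal. A minimal Python sketch under those assumptions — the helper names are hypothetical, only the endpoint shape and header come from the page above:

```python
import json
import urllib.request
import uuid

API_BASE = "https://api.scrapegraphai.com/v1"  # base URL from the curl examples
TERMINAL_STATUSES = {"completed", "failed"}    # no point polling past these

def build_status_url(request_id: str) -> str:
    """Validate request_id client-side (the API returns 400 for a bad UUID)
    and build the GET /v1/searchscraper/{request_id} URL."""
    uuid.UUID(request_id)  # raises ValueError for a malformed id
    return f"{API_BASE}/searchscraper/{request_id}"

def is_done(status: str) -> bool:
    """True once the request has reached a terminal status."""
    return status in TERMINAL_STATUSES

def get_status(request_id: str, api_key: str) -> dict:
    """Single status poll (network call, shown for shape only)."""
    req = urllib.request.Request(
        build_status_url(request_id),
        headers={"SGAI-APIKEY": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Checking the status locally before each network round-trip avoids burning requests on IDs the server would reject anyway.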
New file (+111 lines):

````diff
@@ -0,0 +1,111 @@
+---
+title: 'Start SearchScraper'
+api: 'POST /v1/searchscraper'
+description: 'Start a new AI-powered web search request'
+---
+
+## Request Body
+
+<ParamField body="user_prompt" type="string" required>
+  The search query or question you want to ask. This should be a clear and specific prompt that will guide the AI in finding and extracting relevant information.
+
+  Example: "What is the latest version of Python and what are its main features?"
+</ParamField>
+
+<ParamField body="headers" type="object">
+  Optional headers to customize the search behavior. This can include user agent, cookies, or other HTTP headers.
+
+  Example:
+  ```json
+  {
+    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
+    "Cookie": "cookie1=value1; cookie2=value2"
+  }
+  ```
+</ParamField>
+
+<ParamField body="output_schema" type="object">
+  Optional schema to structure the output. If provided, the AI will attempt to format the results according to this schema.
+
+  Example:
+  ```json
+  {
+    "properties": {
+      "version": {"type": "string"},
+      "release_date": {"type": "string"},
+      "major_features": {"type": "array", "items": {"type": "string"}}
+    },
+    "required": ["version", "release_date", "major_features"]
+  }
+  ```
+</ParamField>
+
+## Response
+
+<ResponseField name="request_id" type="string">
+  Unique identifier for the search request. Use this ID to check the status and retrieve results.
+</ResponseField>
+
+<ResponseField name="status" type="string">
+  Status of the request. One of: "queued", "processing", "completed", "failed"
+</ResponseField>
+
+<ResponseField name="user_prompt" type="string">
+  The original search query that was submitted.
+</ResponseField>
+
+<ResponseField name="result" type="object">
+  The search results. If an output_schema was provided, this will be structured according to that schema.
+</ResponseField>
+
+<ResponseField name="reference_urls" type="array">
+  List of URLs that were used as references for the answer.
+</ResponseField>
+
+<ResponseField name="error" type="string">
+  Error message if the request failed. Empty string if successful.
+</ResponseField>
+
+## Example Request
+
+```bash
+curl -X POST 'https://api.scrapegraphai.com/v1/searchscraper' \
+  -H 'SGAI-APIKEY: YOUR_API_KEY' \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "user_prompt": "What is the latest version of Python and what are its main features?",
+    "output_schema": {
+      "properties": {
+        "version": {"type": "string"},
+        "release_date": {"type": "string"},
+        "major_features": {"type": "array", "items": {"type": "string"}}
+      },
+      "required": ["version", "release_date", "major_features"]
+    }
+  }'
+```
+
+## Example Response
+
+```json
+{
+  "request_id": "123e4567-e89b-12d3-a456-426614174000",
+  "status": "completed",
+  "user_prompt": "What is the latest version of Python and what are its main features?",
+  "result": {
+    "version": "3.12",
+    "release_date": "October 2, 2023",
+    "major_features": [
+      "Improved error messages",
+      "Per-interpreter GIL",
+      "Support for the Linux perf profiler",
+      "Faster startup time"
+    ]
+  },
+  "reference_urls": [
+    "https://www.python.org/downloads/",
+    "https://docs.python.org/3.12/whatsnew/3.12.html"
+  ],
+  "error": ""
+}
```
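The request body above has one required field (`user_prompt`) and two optional ones (`output_schema`, `headers`). A short Python sketch of assembling it — the helper name is made up, but the field names come from the docs above:

```python
import json

def build_search_payload(user_prompt, output_schema=None, headers=None):
    """Assemble the POST /v1/searchscraper body; only user_prompt is required."""
    payload = {"user_prompt": user_prompt}
    if output_schema is not None:
        payload["output_schema"] = output_schema
    if headers is not None:
        payload["headers"] = headers
    return payload

# The schema from the example request above.
schema = {
    "properties": {
        "version": {"type": "string"},
        "release_date": {"type": "string"},
        "major_features": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["version", "release_date", "major_features"],
}

body = json.dumps(build_search_payload(
    "What is the latest version of Python and what are its main features?",
    output_schema=schema,
))
```

Omitting the optional keys entirely (rather than sending them as `null`) keeps the payload identical to the curl example.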

api-reference/endpoint/user/get-credits.mdx

+3 −3

```diff
@@ -5,9 +5,9 @@ description: 'Get the remaining credits and total credits used for your account.
 ---
 
 This endpoint allows you to check your account's credit balance and usage. Each API request consumes a different number of credits:
-- Markdownify: 2 credits per webpage
-- SmartScraper: 5 credits per webpage
-- LocalScraper: 10 credits per webpage
+- Markdownify: 2 credits per request
+- SmartScraper: 10 credits per request
+- SearchScraper: 30 credits per request
 
 The response shows:
 - `remaining_credits`: Number of credits available for use
```
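With the per-request costs above, budgeting a batch of calls against `remaining_credits` is simple arithmetic. A hedged Python sketch — the helper name is hypothetical, the prices come from the diff:

```python
# Per-request prices from the updated get-credits page.
CREDIT_COST = {"markdownify": 2, "smartscraper": 10, "searchscraper": 30}

def estimate_credits(requests_by_service):
    """Total credits a planned batch of requests would consume."""
    return sum(CREDIT_COST[service] * count
               for service, count in requests_by_service.items())
```

For example, 5 Markdownify requests plus 2 SearchScraper requests would cost 5 × 2 + 2 × 30 = 70 credits.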

api-reference/endpoint/user/submit-feedback.mdx

+1 −1

```diff
@@ -4,7 +4,7 @@ openapi: 'POST /v1/feedback'
 description: 'Submit feedback for a specific request with rating and optional comments.'
 ---
 
-This endpoint allows you to submit feedback for any request you've made using our services (SmartScraper, LocalScraper, or Markdownify). Your feedback helps us improve our services.
+This endpoint allows you to submit feedback for any request you've made using our services (SmartScraper, SearchScraper, or Markdownify). Your feedback helps us improve our services.
 
 ### Rating System
 - Rating scale: 0-5 stars
```

api-reference/errors.mdx

+1 −1

````diff
@@ -43,7 +43,7 @@ Indicates that the request was malformed or invalid.
     "error": "Invalid HTML content"
   }
   ```
-  Applies to LocalScraper when the provided HTML is invalid.
+  Applies to SmartScraper when the provided HTML is invalid.
 </Accordion>
 </AccordionGroup>
 
````
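Putting the documented error cases together, a client might branch on the HTTP status code before retrying or surfacing a message. A hypothetical helper; the 400 and 404 bodies come from the SearchScraper pages added in this commit:

```python
# Hypothetical helper, not part of any official client: maps the error
# responses documented in this commit (400 invalid input, 404 unknown
# request_id) to human-readable messages.
def classify_error(status_code, body):
    if status_code == 400:
        return "bad request: " + body.get("error", "invalid request")
    if status_code == 404:
        return "unknown request_id: " + body.get("error", "Request not found")
    return f"unexpected HTTP {status_code}"
```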

api-reference/introduction.mdx

+3 −3

```diff
@@ -5,7 +5,7 @@ description: 'Complete reference for the ScrapeGraphAI REST API'
 
 ## Overview
 
-The ScrapeGraphAI API provides powerful endpoints for AI-powered web scraping and content extraction. Our RESTful API allows you to extract structured data from any website, process local HTML content, and convert web pages to clean markdown.
+The ScrapeGraphAI API provides powerful endpoints for AI-powered web scraping and content extraction. Our RESTful API allows you to extract structured data from any website, perform AI-powered web searches, and convert web pages to clean markdown.
 
 ## Authentication
 
@@ -31,8 +31,8 @@ https://api.scrapegraphai.com/v1
   <Card title="SmartScraper" icon="robot" href="/api-reference/endpoint/smartscraper/start">
     Extract structured data from any website using AI
   </Card>
-  <Card title="LocalScraper" icon="file-code" href="/api-reference/endpoint/localscraper/start">
-    Process local HTML content with AI extraction
+  <Card title="SearchScraper" icon="magnifying-glass" href="/api-reference/endpoint/searchscraper/start">
+    Perform AI-powered web searches with structured results
   </Card>
   <Card title="Markdownify" icon="markdown" href="/api-reference/endpoint/markdownify/start">
     Convert web content to clean markdown
```
