feat: add new doc

VinciGit00 · VinciGit00 · commit 76e8656f2dca · 2025-03-10T11:46:08.000+01:00
diff --git a/services/markdownify.mdx b/services/markdownify.mdx
@@ -130,6 +130,151 @@ Want to learn more about our AI-powered scraping technology? Visit our [main web
 
 ## Advanced Usage
 
+### Request Helper Function
+
+The `getMarkdownifyRequest` function helps create properly formatted request objects for the Markdownify service:
+
+```javascript
+import { getMarkdownifyRequest } from 'scrapegraph-js';
+
+const request = getMarkdownifyRequest({
+  websiteUrl: "https://example.com/article",
+  headers: {
+    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
+  }
+});
+```
+
+#### Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| websiteUrl | string | Yes | The URL of the webpage to convert to markdown. |
+| headers | object | No | Custom headers for the request (e.g., User-Agent, cookies). |
+
+#### Return Value
+
+Returns an object with the following structure:
+
+```typescript
+{
+  request_id: string;
+  status: "queued" | "processing" | "completed" | "failed";
+  website_url: string;
+  result?: string | null;
+  error: string;
+}
+```
+
+#### Error Handling
+
+The function throws errors that carry a `code` property, so you can handle common failures separately:
+
+```javascript
+try {
+  const request = getMarkdownifyRequest({
+    websiteUrl: "https://example.com/article"
+  });
+} catch (error) {
+  if (error.code === 'INVALID_URL') {
+    console.error('The provided URL is not valid');
+  } else if (error.code === 'MISSING_REQUIRED') {
+    console.error('Required parameters are missing');
+  } else {
+    console.error('An unexpected error occurred:', error);
+  }
+}
+```
+
+#### Advanced Examples
+
+##### Using Custom Headers
+
+```javascript
+const request = getMarkdownifyRequest({
+  websiteUrl: "https://example.com/article",
+  headers: {
+    "User-Agent": "Custom User Agent",
+    "Accept-Language": "en-US,en;q=0.9",
+    "Cookie": "session=abc123; user=john",
+    "Authorization": "Bearer your-auth-token"
+  }
+});
+```
+
+##### Handling Dynamic Content
+
+For pages that load content dynamically, you may need to send extra headers or session cookies with the request:
+
+```javascript
+const request = getMarkdownifyRequest({
+  websiteUrl: "https://example.com/dynamic-content",
+  headers: {
+    // Mark the request as an AJAX call, which some dynamic pages expect
+    "X-Requested-With": "XMLHttpRequest",
+    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9",
+    // Add any required session cookies
+    "Cookie": "dynamicContent=enabled; sessionId=xyz789"
+  }
+});
+```
+
+#### Best Practices
+
+1. **URL Validation**
+   - Always validate URLs before making requests
+   - Ensure URLs use HTTPS when possible
+   - Handle URL encoding properly
+
+```javascript
+import { isValidUrl } from 'scrapegraph-js/utils';
+
+const url = "https://example.com/article with spaces";
+const encodedUrl = encodeURI(url); // Percent-encodes the spaces as %20
+
+if (isValidUrl(encodedUrl)) {
+  const request = getMarkdownifyRequest({ websiteUrl: encodedUrl });
+}
+```
+
+2. **Header Management**
+   - Use appropriate User-Agent strings
+   - Include necessary cookies for authenticated content
+   - Set proper Accept headers
+
+3. **Error Recovery**
+   - Implement retry logic for transient failures
+   - Cache successful responses when appropriate
+   - Log errors for debugging
+
+```javascript
+import { getMarkdownifyRequest, retry } from 'scrapegraph-js';
+
+const makeRequest = retry(async () => {
+  const request = await getMarkdownifyRequest({
+    websiteUrl: "https://example.com/article"
+  });
+  return request;
+}, {
+  retries: 3,
+  backoff: true
+});
+```
+
+4. **Performance Optimization**
+   - Batch requests when possible
+   - Use caching strategies
+   - Monitor API usage
+
+```javascript
+import { cache } from 'scrapegraph-js/utils';
+
+const cachedRequest = cache(getMarkdownifyRequest, {
+  ttl: 3600, // Cache for 1 hour
+  maxSize: 100 // Cache up to 100 requests
+});
+```
+
 ### Async Support
 
 For applications requiring asynchronous execution, Markdownify provides async support through the `AsyncClient`:
diff --git a/services/searchscraper.mdx b/services/searchscraper.mdx
@@ -195,6 +195,210 @@ Want to learn more about our AI-powered search technology? Visit our [main websi
 
 ## Advanced Usage
 
+### Request Helper Function
+
+The `getSearchScraperRequest` function helps create properly formatted request objects for the SearchScraper service:
+
+```javascript
+import { getSearchScraperRequest } from 'scrapegraph-js';
+
+const request = getSearchScraperRequest({
+  userPrompt: "What is the latest version of Python and its main features?",
+  outputSchema: {
+    version: { type: "string" },
+    release_date: { type: "string" },
+    major_features: { type: "array", items: { type: "string" } }
+  }
+});
+```
+
+#### Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| userPrompt | string | Yes | The search query or question to answer. |
+| headers | object | No | Custom headers for the request. |
+| outputSchema | object | No | Schema defining the structure of the search results. |
+
+#### Return Value
+
+Returns an object with the following structure:
+
+```typescript
+{
+  request_id: string;
+  status: "queued" | "processing" | "completed" | "failed";
+  user_prompt: string;
+  result?: object | null;
+  reference_urls: string[];
+  error: string;
+}
+```
+
+#### Error Handling
+
+The function throws errors that carry a `code` property (plus `details` for schema failures), so you can handle each case explicitly:
+
+```javascript
+try {
+  const request = getSearchScraperRequest({
+    userPrompt: "What are the latest AI chip developments?",
+    outputSchema: {
+      manufacturers: { type: "array" },
+      technologies: { type: "object" }
+    }
+  });
+} catch (error) {
+  if (error.code === 'INVALID_PROMPT') {
+    console.error('The search prompt is invalid or empty');
+  } else if (error.code === 'SCHEMA_VALIDATION') {
+    console.error('The output schema is invalid:', error.details);
+  } else if (error.code === 'MISSING_REQUIRED') {
+    console.error('Required parameters are missing');
+  } else {
+    console.error('An unexpected error occurred:', error);
+  }
+}
+```
+
+#### Advanced Examples
+
+##### Complex Search Queries
+
+```javascript
+const request = getSearchScraperRequest({
+  userPrompt: "Compare the top 3 cloud providers (AWS, Azure, GCP) focusing on ML services pricing and features",
+  outputSchema: {
+    providers: {
+      type: "array",
+      items: {
+        type: "object",
+        properties: {
+          name: { type: "string" },
+          ml_services: {
+            type: "array",
+            items: {
+              type: "object",
+              properties: {
+                name: { type: "string" },
+                pricing: { type: "string" },
+                features: { type: "array", items: { type: "string" } }
+              }
+            }
+          }
+        }
+      }
+    },
+    comparison_matrix: { type: "object" },
+    recommendation: { type: "string" }
+  }
+});
+```
+
+##### Time-Sensitive Searches
+
+```javascript
+const request = getSearchScraperRequest({
+  userPrompt: "Latest cryptocurrency market trends in the past 24 hours",
+  headers: {
+    // Ask intermediate caches to revalidate instead of serving stale data
+    "Cache-Control": "no-cache",
+    "Pragma": "no-cache"
+  },
+  outputSchema: {
+    timestamp: { type: "string" },
+    trends: { type: "array" },
+    market_summary: { type: "object" }
+  }
+});
+```
+
+#### Best Practices
+
+1. **Query Optimization**
+   - Be specific and clear in your prompts
+   - Include relevant context
+   - Use appropriate keywords
+
+```javascript
+// Good prompt example
+const request = getSearchScraperRequest({
+  userPrompt: "Compare iPhone 15 Pro Max and Samsung S24 Ultra specifications, focusing on camera capabilities, battery life, and performance benchmarks"
+});
+
+// Less effective prompt
+const badRequest = getSearchScraperRequest({
+  userPrompt: "Compare phones" // Too vague
+});
+```
+
+2. **Schema Design**
+   - Start with essential fields
+   - Use appropriate data types
+   - Include field descriptions
+   - Handle nested data properly
+
+```javascript
+const schema = {
+  comparison: {
+    type: "object",
+    properties: {
+      date: { type: "string", description: "Comparison date" },
+      devices: {
+        type: "array",
+        items: {
+          type: "object",
+          properties: {
+            name: { type: "string" },
+            specs: { type: "object" },
+            pros: { type: "array" },
+            cons: { type: "array" }
+          }
+        }
+      }
+    }
+  }
+};
+```
+
+3. **Error Recovery**
+   - Implement retry logic
+   - Handle rate limits
+   - Cache results when appropriate
+
+```javascript
+import { getSearchScraperRequest, retry } from 'scrapegraph-js';
+
+const searchWithRetry = retry(async (prompt) => {
+  const request = await getSearchScraperRequest({
+    userPrompt: prompt
+  });
+  return request;
+}, {
+  retries: 3,
+  backoff: {
+    initial: 1000,
+    multiplier: 2,
+    maxDelay: 10000
+  }
+});
+```
+
+4. **Performance Optimization**
+   - Use caching for repeated searches
+   - Batch related queries
+   - Monitor API usage
+
+```javascript
+import { cache } from 'scrapegraph-js/utils';
+
+const cachedSearch = cache(getSearchScraperRequest, {
+  ttl: 1800, // Cache for 30 minutes
+  maxSize: 50, // Cache up to 50 requests
+  keyGenerator: (params) => params.userPrompt // Cache key based on prompt
+});
+```
+
 ### Custom Schema Example
 
 Define exactly what data you want to extract using Pydantic or Zod:
diff --git a/services/smartscraper.mdx b/services/smartscraper.mdx