laxman-patel/r-squared

$r^2$ (r-squared)

A Record & Replay browser automation engine inspired by 100X Bot

$r^2$ is a browser automation framework designed to address the fragility of traditional RPA selectors. It implements a Record → Synthesize → Replay architecture, using Gemini 3 Flash to convert raw user interactions into resilient, self-healing selector strategies. Unlike autonomous agents that rely on continuous LLM reasoning, $r^2$ compiles workflows into deterministic JSON execution plans, so replay is low-latency and needs no LLM inference at run time.


Architecture

The system is composed of three distinct phases:

1. Ingestion (Extension Context)

  • Core Library: rrweb (record and replay the web)
  • Mechanism: Captures the DOM mutation stream and user interactions (clicks, inputs, scrolls).
  • Context Injection: On critical interaction events (e.g., mousedown), the recorder captures a localized, sanitized snapshot of the target element's DOM tree (Parent, Siblings, Attributes) to serve as ground truth for the synthesis engine.
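
The context-injection step can be sketched as follows. This is a minimal illustration, not the recorder's actual code: it runs against a hypothetical `MiniNode` tree instead of the real DOM so that it stays self-contained, and the names `captureContext`, `ContextSnapshot`, and `NodeSummary` are assumptions. Attribute sanitization is left out to keep the sketch short.

```typescript
// Hypothetical, simplified stand-in for a DOM element.
interface MiniNode {
  tag: string;
  attrs: Record<string, string>;
  text: string;
  parent?: MiniNode;
  children: MiniNode[];
}

// Flat summary of one node; child subtrees are deliberately not included.
interface NodeSummary {
  tag: string;
  attrs: Record<string, string>;
  text: string;
}

// Assumed snapshot shape: the target plus its immediate neighborhood.
interface ContextSnapshot {
  target: NodeSummary;
  parent: NodeSummary | null;
  siblings: NodeSummary[];
}

function summarize(node: MiniNode): NodeSummary {
  // Trim and cap the text so a large subtree cannot bloat the snapshot.
  return { tag: node.tag, attrs: { ...node.attrs }, text: node.text.trim().slice(0, 80) };
}

// Called on a critical interaction (e.g. mousedown) with the event target.
function captureContext(target: MiniNode): ContextSnapshot {
  const parent = target.parent ?? null;
  const siblings = parent
    ? parent.children.filter((child) => child !== target).map(summarize)
    : [];
  return {
    target: summarize(target),
    parent: parent ? summarize(parent) : null,
    siblings,
  };
}

// Usage: a form containing a label and the clicked button.
const form: MiniNode = { tag: "form", attrs: { id: "invite" }, text: "", children: [] };
const label: MiniNode = { tag: "label", attrs: {}, text: "Invite", parent: form, children: [] };
const button: MiniNode = {
  tag: "button",
  attrs: { "aria-label": "Connect" },
  text: "  Connect  ",
  parent: form,
  children: [],
};
form.children.push(label, button);

const snapshot = captureContext(button);
console.log(JSON.stringify(snapshot, null, 2));
```

Keeping the snapshot local to the parent and siblings bounds its size regardless of how large the page is.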

2. Synthesis Pipeline (Server Context)

  • Runtime: Bun + Hono
  • Orchestration: Inngest
  • Inference: Gemini 3 Flash
  • Process:
  1. Trace Upload: The client uploads the raw rrweb trace and DOM snapshots to the Bun backend.
  2. Selector Generation: Inngest triggers a background job where Gemini 3 Flash analyzes the target element's properties against the DOM snapshot.
  3. Strategy Compilation: The model generates a ranked list of selector strategies (Reliability Hierarchy: id > data-testid > aria-label > innerText > XPath).
  4. Artifact Creation: A portable JSON ReplayManifest is stored, containing the optimized selector logic for each step.
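
The reliability hierarchy from step 3 can be written as a sort key. In the pipeline described above the ranking comes from Gemini; a deterministic pass like the one below is only a sketch of how that ordering might be validated or normalized afterwards. The `Candidate` shape and the name `rankByReliability` are assumptions, not part of the project.

```typescript
// The five selector kinds named in the reliability hierarchy, most reliable first.
const RELIABILITY_ORDER = ["id", "data-testid", "aria-label", "innerText", "xpath"] as const;
type SelectorKind = (typeof RELIABILITY_ORDER)[number];

// Hypothetical candidate shape for illustration.
interface Candidate {
  kind: SelectorKind;
  value: string;
}

function rankByReliability(candidates: Candidate[]): Candidate[] {
  const rank = (c: Candidate): number => RELIABILITY_ORDER.indexOf(c.kind);
  // Copy before sorting so the caller's array is not mutated.
  return [...candidates].sort((a, b) => rank(a) - rank(b));
}

// Usage: candidates arrive in arbitrary order and leave ordered by reliability.
const ranked = rankByReliability([
  { kind: "xpath", value: '//button[contains(@class, "artdeco-button")]' },
  { kind: "aria-label", value: "Connect" },
  { kind: "id", value: "ember123" },
]);
console.log(ranked.map((c) => c.kind).join(" > ")); // id > aria-label > xpath
```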

3. Execution Runtime (Client Context)

  • Mechanism: A lightweight JS injector (player.js) runs directly in the target tab.
  • Logic:
  • Fetches the ReplayManifest from the backend.
  • Iterates through the selector strategies for the current step.
  • Heuristic Validation: If the primary selector fails (e.g., dynamic class change), the runtime attempts fallback strategies in order.
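  • Execution: Dispatches the step's events to the resolved node. Events created in page script and sent through dispatchEvent() report isTrusted: false; isTrusted: true events require browser-generated input, for example through the chrome.debugger API.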
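
The fallback logic above can be sketched as a loop over the ranked strategies. This is an illustration under assumptions: `resolveWithFallback`, `Resolver`, and `Resolution` are hypothetical names, and the resolver is injected so the example runs without a browser. In a real tab the resolver would wrap calls such as `document.querySelector` or `document.evaluate`.

```typescript
interface Strategy {
  type: string;
  value: string;
}

// A resolver returns the matched node, or null when the selector finds nothing.
type Resolver<T> = (strategy: Strategy) => T | null;

interface Resolution<T> {
  node: T;
  strategy: Strategy; // the strategy that finally matched
  attempts: number;   // how many strategies were tried, including the match
}

function resolveWithFallback<T>(
  strategies: Strategy[],
  resolve: Resolver<T>,
): Resolution<T> | null {
  let attempts = 0;
  for (const strategy of strategies) {
    attempts += 1;
    let node: T | null = null;
    try {
      node = resolve(strategy);
    } catch {
      // A malformed selector (e.g. an invalid XPath) is treated as a miss.
      node = null;
    }
    if (node !== null) return { node, strategy, attempts };
  }
  return null; // every strategy failed; the caller decides how to report it
}

// Usage with a fake page: the id has changed, but the aria-label still matches.
const fakePage = new Map<string, string>([["aria-label:Connect", "<button>"]]);
const result = resolveWithFallback(
  [
    { type: "id", value: "ember123" },
    { type: "aria-label", value: "Connect" },
  ],
  (s) => fakePage.get(`${s.type}:${s.value}`) ?? null,
);
console.log(result?.strategy.type, result?.attempts); // aria-label 2
```

Returning the matching strategy and the attempt count makes it possible to log which selectors have gone stale.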

Technical Stack

  • Runtime Environment: Bun (v1.1+) - Selected for high-throughput HTTP performance and native TypeScript support.
  • LLM Provider: Gemini 3 Flash - Utilized for its extensive context window (handling large DOM dumps) and low latency.
  • Async Orchestration: Inngest - Manages the non-blocking synthesis pipeline and retries.
  • Frontend/Extension: React + Vite (CRXJS) - Chrome Extension (Manifest V3).
  • Database: Supabase (PostgreSQL) - Stores ReplayManifests and user configurations.

Data Schema: ReplayManifest

The core artifact produced by the system is the ReplayManifest. It decouples the execution logic from the specific DOM state of the recording.

// One selector strategy; the runtime tries strategies in ranked order.
type SelectorStrategy =
  | { type: 'id'; value: string }                             // e.g. 'ember123'
  | { type: 'attribute'; key: string; value: string }         // e.g. key 'aria-label', value 'Connect'
  | { type: 'text_approx'; value: string; threshold: number } // e.g. 'Connect', 0.8
  | { type: 'xpath'; value: string };                         // e.g. '//button[contains(@class, "artdeco-button")]'

interface ReplayStep {
  id: string;
  action: 'click' | 'input' | 'scroll';
  timestamp: number;
  target: {
    // Primary stable identifier
    primarySelector: string;
    // Fallback strategies generated by Gemini, in priority order
    strategies: SelectorStrategy[];
  };
  // Pre-condition check (optional)
  constraints?: {
    requiredText?: string;
    urlPattern?: string;
  };
}
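
The `text_approx` strategy with its `threshold` implies a fuzzy text comparison, but the schema does not say which metric is used. The sketch below assumes normalized Levenshtein similarity, compared case-insensitively after trimming; `levenshtein`, `similarity`, and `matchesTextApprox` are hypothetical names chosen for illustration.

```typescript
// Levenshtein edit distance, computed with two rolling rows.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j]! + 1,        // deletion
          curr[j - 1]! + 1,    // insertion
          prev[j - 1]! + cost, // substitution (or match)
        ),
      );
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// Similarity in [0, 1]: 1 means identical after normalization.
function similarity(a: string, b: string): number {
  const x = a.trim().toLowerCase();
  const y = b.trim().toLowerCase();
  const longest = Math.max(x.length, y.length);
  return longest === 0 ? 1 : 1 - levenshtein(x, y) / longest;
}

function matchesTextApprox(candidate: string, expected: string, threshold: number): boolean {
  return similarity(candidate, expected) >= threshold;
}

// Usage with the threshold from the schema example.
console.log(matchesTextApprox(" connect ", "Connect", 0.8)); // true
console.log(matchesTextApprox("Connects", "Connect", 0.8));  // true  (1 - 1/8 = 0.875)
console.log(matchesTextApprox("Message", "Connect", 0.8));   // false
```

Under this metric a short label tolerates very few edits at 0.8, so "Connected" (similarity 1 - 2/9 ≈ 0.78) would not match "Connect".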

Installation & Local Development

Prerequisites

  • Bun v1.1+
  • Google AI Studio API Key (Gemini 3 Flash enabled)
  • Supabase Project (or local Postgres)

Setup

  1. Clone and Install

git clone https://github.com/laxman-0/r-squared.git
cd r-squared
bun install

  2. Environment Configuration: create a .env file in the root:

GEMINI_API_KEY=your_key_here
INNGEST_EVENT_KEY=local_dev
DATABASE_URL=postgres://user:pass@localhost:5432/rsquared

  3. Start Services

Backend (Bun + Inngest):

bun run dev:server

Extension Watch Mode:

bun run watch:ext

Inngest Dev Server:

npx inngest-cli@latest dev

  4. Load Extension
  • Navigate to chrome://extensions.
  • Enable Developer Mode.
  • Click Load Unpacked and select the dist/ directory.

API Reference

POST /api/v1/trace

Uploads a raw recording session.

Payload:

{
  "sessionId": "uuid",
  "url": "https://linkedin.com/in/...",
  "events": [ ...rrweb_events ]
}

GET /api/v1/manifest/:id

Retrieves the compiled ReplayManifest for execution.

Response:

{
  "status": "ready",
  "manifest": { ...ReplayManifest }
}
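
A client for the two endpoints might build its requests as sketched below. The paths and field names come from the examples above; everything else is an assumption made for illustration: the helper names (`buildTraceUpload`, `buildManifestFetch`, `readManifest`), the `HttpRequest` shape, the base URL `http://localhost:3000`, and the non-ready status value `"processing"`. The helpers only describe requests, so the result can be passed to `fetch` or any other HTTP client.

```typescript
// Payload shape from the POST /api/v1/trace example; rrweb events are left opaque.
interface TracePayload {
  sessionId: string;
  url: string;
  events: unknown[];
}

// Hypothetical transport-agnostic request description.
interface HttpRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Join base URL and path without producing a double slash.
function joinUrl(baseUrl: string, path: string): string {
  return `${baseUrl.replace(/\/+$/, "")}${path}`;
}

function buildTraceUpload(baseUrl: string, payload: TracePayload): HttpRequest {
  return {
    url: joinUrl(baseUrl, "/api/v1/trace"),
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

function buildManifestFetch(baseUrl: string, id: string): HttpRequest {
  return {
    url: joinUrl(baseUrl, `/api/v1/manifest/${encodeURIComponent(id)}`),
    method: "GET",
    headers: { Accept: "application/json" },
  };
}

// Returns the manifest only when the response reports status "ready".
function readManifest(response: unknown): Record<string, unknown> | null {
  if (typeof response !== "object" || response === null) return null;
  const { status, manifest } = response as { status?: unknown; manifest?: unknown };
  if (status !== "ready" || typeof manifest !== "object" || manifest === null) return null;
  return manifest as Record<string, unknown>;
}

// Usage (base URL is an assumed local address).
const upload = buildTraceUpload("http://localhost:3000/", {
  sessionId: "abc",
  url: "https://example.com/",
  events: [],
});
console.log(upload.method, upload.url); // POST http://localhost:3000/api/v1/trace
```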

Implementation Notes

  • DOM Sanitization: To reduce token usage, the synthesis engine strips presentational HTML attributes (e.g., style, width, height) that do not contribute to semantic identity before sending the DOM tree to Gemini.
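
The sanitization pass could look roughly like this. It is a sketch over an assumed `SerializedNode` tree, not the engine's actual data model, and the attribute list contains only the three attributes named above; `sanitize` and `NON_SEMANTIC_ATTRS` are hypothetical names.

```typescript
// Assumed serialized form of a DOM subtree.
interface SerializedNode {
  tag: string;
  attrs: Record<string, string>;
  children: SerializedNode[];
}

// Attributes named in the note above; a real list would likely be longer.
const NON_SEMANTIC_ATTRS = new Set(["style", "width", "height"]);

// Returns a new tree without the non-semantic attributes; the input is not mutated.
function sanitize(node: SerializedNode): SerializedNode {
  const attrs: Record<string, string> = {};
  for (const [name, value] of Object.entries(node.attrs)) {
    if (!NON_SEMANTIC_ATTRS.has(name.toLowerCase())) attrs[name] = value;
  }
  return { tag: node.tag, attrs, children: node.children.map(sanitize) };
}

// Usage: styling is dropped at every depth, identifying attributes survive.
const raw: SerializedNode = {
  tag: "div",
  attrs: { id: "card", style: "margin: 4px" },
  children: [
    {
      tag: "img",
      attrs: { alt: "Avatar", width: "48", height: "48" },
      children: [],
    },
  ],
};
const clean = sanitize(raw);
console.log(JSON.stringify(clean));
```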
  • Latency Optimization: The synthesis phase is decoupled from recording. The user can stop recording and immediately receive a "Processing..." status. Once the Inngest job completes, the extension receives a WebSocket push notification (or polls) to enable the "Replay" button.
