Merged
4 changes: 2 additions & 2 deletions .github/workflows/verify-pr.yml
@@ -22,7 +22,7 @@ jobs:
enable-cache: true

- name: Install dependencies
run: uv sync --package grogbot-search-core --extra test
run: uv sync --package grogbot-search --extra test

- name: Run tests
run: uv run --package grogbot-search-core --extra test pytest packages/search-core/tests
run: uv run --package grogbot-search --extra test pytest packages/search/tests
60 changes: 40 additions & 20 deletions README.md
@@ -1,12 +1,12 @@
# Grogbot

Grogbot is a uv-based Python monorepo for multiple systems. The first system, **search**, provides local storage and rank-fused search over markdown documents using FTS, vector, and link authority signals, exposed through both a CLI and a FastAPI service.
Grogbot is a uv-based Python monorepo for multiple systems. The first system, **search**, provides local storage and rank-fused search over markdown documents using FTS, vector, and link authority signals, exposed through a CLI and a server-rendered FastAPI app.

## Packages

- **`grogbot-search-core`** (`packages/search-core`): Core models, SQLite persistence, ingestion, chunking, embeddings, document-link graph storage, and three-signal rank-fused search.
- **`grogbot-search`** (`packages/search`): Core models, SQLite persistence, ingestion, chunking, embeddings, document-link graph storage, and three-signal rank-fused search.
- **`grogbot-cli`** (`packages/cli`): Typer-powered CLI (`grogbot`) that surfaces search functionality.
- **`grogbot-api`** (`packages/api`): FastAPI app exposing the search system over HTTP.
- **`grogbot-app`** (`packages/app`): FastAPI + Jinja browser app for the search system, with a landing page and server-rendered search UI.

## Configuration

@@ -27,22 +27,34 @@ grogbot search ingest-url https://example.com/article
grogbot search query "hello world" --limit 5
```

## API Usage
## App Usage

The FastAPI app lives in `grogbot_api.app:app` and exposes `/search` routes.
The browser app lives in `grogbot_app.app:app`. It renders HTML directly from the same search database used by the CLI, so make sure you have already ingested content into your configured `db_path`.

Examples:
Run it locally with uvicorn:

```bash
GET /search/sources
POST /search/sources
GET /search/documents/{document_id}
POST /search/ingest/url
POST /search/documents/embed
POST /search/documents/embed/sync
uv run --package grogbot-app uvicorn grogbot_app.app:app --reload
```

Then open `http://127.0.0.1:8000`.

Useful routes:

```bash
GET /
GET /search
GET /search/query?q=hello+world
```

Notes:

- `/` is a simple Grogbot landing page.
- `/search` shows the search form.
- `/search/query` renders up to 25 server-side search results for the `q` parameter.
- Static assets are served from `/assets`.
- There is no standalone JSON HTTP API in the active workspace.
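
As a small illustration of the `/search/query` route listed above, the following sketch builds a query URL for a locally running app. It assumes the default uvicorn address shown earlier (`http://127.0.0.1:8000`); the helper name `search_url` is hypothetical and not part of the repository.

```python
from urllib.parse import urlencode

# Assumption: the app is running locally on uvicorn's default host and port.
BASE_URL = "http://127.0.0.1:8000"


def search_url(query: str, base_url: str = BASE_URL) -> str:
    """Build the URL of the server-rendered results page for `query`."""
    # urlencode form-encodes the value, so spaces become '+'.
    return f"{base_url}/search/query?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(search_url("hello world"))
```

The response is HTML rather than JSON, so this is mainly useful for opening the page in a browser or for smoke tests.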

## Document storage and embedding workflow

- `content_markdown` is accepted on upsert/ingest inputs, but it is **not persisted** in the `documents` table.
@@ -56,27 +68,35 @@ GET /search/query?q=hello+world
- CLI: `grogbot search document embed <document_id>`
- CLI (bulk): `grogbot search document embed-sync --maximum 100`
- Shows a live progress bar with elapsed time and ETA on stderr while preserving the final JSON result on stdout.
- API: `POST /search/documents/embed`
- API (bulk): `POST /search/documents/embed/sync`
- SearchService embedding API uses canonical methods only:
- `embed_document_chunks(document_id)`
- `synchronize_document_embeddings(maximum=...)`
- Accepts an optional per-document progress callback so interactive callers can observe bulk embedding progress without moving CLI presentation logic into `search-core`.
- Accepts an optional per-document progress callback so interactive callers can observe bulk embedding progress without moving CLI presentation logic into `grogbot-search`.
- Legacy aliases `chunk_document` and `synchronize_document_chunks` have been removed.

## Development

Install test dependencies for the search core and run pytest:
Install workspace dependencies:

```bash
uv sync --extra test
uv run pytest packages/search-core/tests
```

Run coverage checks with `pytest-cov`:
Run package tests:

```bash
uv run --package grogbot-search-core --extra test \
pytest packages/search-core/tests \
uv run --package grogbot-search --extra test pytest packages/search/tests
uv run --package grogbot-app --extra test pytest packages/app/tests
```

Run coverage checks for the search package with `pytest-cov`:

```bash
uv run --package grogbot-search --extra test \
pytest packages/search/tests \
--cov=grogbot_search --cov-report=term-missing
```

## Historical note

Archived OpenSpec artifacts may still reference the former `packages/search-core`, `packages/web`, and `packages/api` names. Those references describe the repository at the time those changes were authored and are not the current canonical package layout.
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-03-07
@@ -0,0 +1,87 @@
## Context

The repository currently presents four workspace packages: `search-core`, `cli`, `api`, and `web`. Their naming is inconsistent across directory names, distribution names, and Python module names: `search-core` already imports as `grogbot_search`, while the web frontend uses `grogbot_web` even though its role is the main browser-facing application. The standalone `api` package is also a thin FastAPI wrapper over `SearchService`, while the newer web frontend already talks directly to the same core service.

This change touches workspace metadata, package metadata, Python import paths, developer commands, tests, and runtime HTTP surfaces. It is also intentionally breaking: package identities change and the JSON HTTP surface disappears.

## Goals / Non-Goals

**Goals:**
- Establish a single canonical naming pattern of `packages/<name>` → `grogbot-<name>` → `grogbot_<name>` where appropriate.
- Rename `search-core` to `search` while preserving the existing `grogbot_search` Python module.
- Rename `web` to `app` and `grogbot_web` to `grogbot_app` so the browser-facing surface is described consistently.
- Remove the standalone `api` package and all JSON HTTP endpoints.
- Update current workspace metadata, tests, and documentation so the new names are the only active names moving forward.
- Explicitly document that archived OpenSpec artifacts may still mention the historical package names.

**Non-Goals:**
- Rebranding the project name from Grogbot to a shorter form.
- Changing the `grogbot_search` module name.
- Adding replacement JSON API routes inside `grogbot_app`.
- Rewriting archived OpenSpec artifacts to retroactively match the new repository structure.
- Changing search behavior, storage, or HTML page behavior beyond import/package identity updates.

## Decisions

### 1. Adopt an exact rename map for active packages

The active workspace packages will become:

- `packages/search` → distribution `grogbot-search` → module `grogbot_search`
- `packages/cli` → distribution `grogbot-cli` → module `grogbot_cli`
- `packages/app` → distribution `grogbot-app` → module `grogbot_app`

`grogbot_search` remains unchanged because it already matches the desired import naming pattern and is used broadly by the CLI and app.

**Rationale:** This yields a predictable rule for contributors: directory, distribution, and module names line up as closely as Python packaging allows.

**Alternatives considered:**
- Keep `grogbot-search-core` while only renaming the directory: rejected because it preserves the current mismatch.
- Rename the core Python module to `grogbot_search_core`: rejected because it adds churn without improving clarity.
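
The naming rule in this decision can be sketched as a small helper. This is an illustration only: `PackageIdentity` and `package_identity` are hypothetical names, not code from the repository, and the sketch assumes every active package follows the rule without exceptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageIdentity:
    directory: str      # path inside the workspace
    distribution: str   # name used with `uv run --package ...`
    module: str         # importable Python module


def package_identity(name: str) -> PackageIdentity:
    """Derive the three names for a workspace package from its short name."""
    return PackageIdentity(
        directory=f"packages/{name}",
        distribution=f"grogbot-{name}",
        # Hyphens are not valid in Python module names, so map them to underscores.
        module=f"grogbot_{name.replace('-', '_')}",
    )


# The active package set named in this design.
ACTIVE_PACKAGES = ("search", "cli", "app")

if __name__ == "__main__":
    for short_name in ACTIVE_PACKAGES:
        print(package_identity(short_name))
```

Under this rule the old `search-core` package would have mapped to `grogbot_search_core`, which is the mismatch the rename removes.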

### 2. Remove the standalone API package instead of folding its JSON routes into the app

The repository will no longer ship a separate `packages/api` package or its `grogbot-api` distribution. The app remains the only HTTP package and continues to serve the browser-facing HTML/static routes.

**Rationale:** The product direction is now CLI + server-rendered app. Keeping a separate JSON HTTP surface adds maintenance and naming overhead without serving the intended product shape.

**Alternatives considered:**
- Keep the API package for future use: rejected because it keeps unused package structure alive.
- Move the JSON routes into `grogbot_app`: rejected because the desired end state is zero JSON HTTP endpoints, not a relocated API.

### 3. Treat archived OpenSpec references as historical, not normative

Archived change artifacts may continue to reference `packages/search-core`, `packages/web`, and `packages/api`. The new proposal/spec/design set will explicitly state that those references describe the repository at the time of those archived changes and should not be read as the current canonical layout.

**Rationale:** Archive integrity is more valuable than retroactive textual consistency, and changing archived artifacts would blur historical context.

**Alternatives considered:**
- Rewrite archived artifacts to match current names: rejected because it distorts the historical record.

### 4. Update workspace metadata and verification paths in one coordinated sweep

The implementation should update the root workspace configuration, package `pyproject.toml` files, Python import paths, test imports, README commands, and lockfile/package references together.

**Rationale:** Partial renames are brittle in a uv workspace; coordinated updates minimize broken editable installs, stale lockfile entries, and mismatched developer commands.

**Alternatives considered:**
- Rename directories first and defer metadata/docs changes: rejected because it leaves the workspace in an inconsistent and confusing state.

## Risks / Trade-offs

- **[Risk] Existing consumers or scripts may still use `grogbot-search-core`, `grogbot-web`, or `grogbot-api` names** → **Mitigation:** mark the change as breaking and update current documentation/commands everywhere in-repo.
- **[Risk] Import-path churn around `grogbot_web` → `grogbot_app` may leave stale test or runtime references** → **Mitigation:** update all current imports, package metadata, and tests together and verify the app package imports cleanly.
- **[Risk] Removing the API package may surprise future readers because archived docs still mention API work** → **Mitigation:** add an explicit historical-note requirement in this change’s specs and current documentation.
- **[Trade-off] A future programmatic HTTP API would need to be reintroduced intentionally rather than already existing in dormant form** → **Mitigation:** accept this simplification now; if requirements change later, propose a new capability explicitly.

## Migration Plan

1. Rename workspace directories from `search-core` to `search` and `web` to `app`.
2. Update root workspace membership and source definitions to use `grogbot-search` and `grogbot-app`, and remove API package references.
3. Rename the web module package from `grogbot_web` to `grogbot_app` and update all imports/tests/docs accordingly.
4. Delete `packages/api` and remove all in-repo references to its package name, startup commands, and JSON endpoints from current documentation.
5. Regenerate the lockfile and validate package-oriented commands against the renamed workspace packages.
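
One way to check the result of the steps above is to scan current (non-archived) files for the retired names. The following is a minimal sketch under that assumption; `find_stale_references` is a hypothetical helper, and archived OpenSpec artifacts would need to be excluded from any real scan because they may legitimately keep the historical names.

```python
import re

# Retired names taken from this change; active names such as
# `grogbot-search` and `grogbot_app` are deliberately absent.
RETIRED_NAMES = (
    "packages/search-core", "packages/web", "packages/api",
    "grogbot-search-core", "grogbot-web", "grogbot-api",
    "grogbot_web", "grogbot_api",
)

# The lookarounds keep a retired name from matching inside a longer identifier.
_PATTERN = re.compile(
    r"(?<![\w-])(" + "|".join(re.escape(n) for n in RETIRED_NAMES) + r")(?![\w-])"
)


def find_stale_references(text: str) -> list[tuple[int, str]]:
    """Return (line number, retired name) pairs found in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = "uv run --package grogbot-search-core pytest\nimport grogbot_web\n"
    for lineno, name in find_stale_references(sample):
        print(f"line {lineno}: {name}")
```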

## Open Questions

- None currently. The target package names, HTTP-surface decision, and archive treatment are all decided.
@@ -0,0 +1,26 @@
## Why

The workspace package layout and distribution names are inconsistent today: `packages/search-core` publishes `grogbot-search-core` while the importable module is already `grogbot_search`, and the web frontend uses `web`/`grogbot-web`/`grogbot_web` instead of the clearer `app` naming used elsewhere. The repository also still carries a standalone JSON API package even though the intended product surfaces are now the CLI and the server-rendered app.

## What Changes

- Rename `packages/search-core` to `packages/search` while keeping the Python module name `grogbot_search` and renaming the published workspace package to `grogbot-search`.
- Rename `packages/web` to `packages/app`, rename the published workspace package from `grogbot-web` to `grogbot-app`, and rename the Python module from `grogbot_web` to `grogbot_app`.
- **BREAKING** Remove the standalone `packages/api` package, its `grogbot-api` distribution, and all JSON HTTP endpoints it exposes.
- Update workspace metadata, developer commands, tests, and current documentation to use the renamed packages as the canonical repository structure.
- Document that archived OpenSpec artifacts may continue to mention the historical `search-core`, `web`, and `api` names and should be read as historical context rather than the current package layout.

## Capabilities

### New Capabilities
- `workspace-package-structure`: Defines the canonical workspace directories, distribution names, module names, and how current docs should describe them.
- `app-http-surface`: Defines the remaining browser-facing HTTP surface after the standalone JSON API is removed.

### Modified Capabilities
- None.

## Impact

- Affected code: root `pyproject.toml`, `uv.lock`, `README.md`, package `pyproject.toml` files, app module imports/tests, and removal of `packages/api`.
- Affected developer workflows: `uv sync`, `uv run --package ...`, local app startup commands, and package-oriented test commands.
- Affected runtime/API surface: the standalone JSON FastAPI service is removed; only the server-rendered app HTTP surface remains.
@@ -0,0 +1,18 @@
## ADDED Requirements

### Requirement: The app is the only remaining HTTP package
The repository SHALL provide `packages/app` as the only active HTTP-serving package, and the standalone `packages/api` package SHALL be absent from the active workspace.

#### Scenario: A contributor inspects the active HTTP surfaces
- **WHEN** a contributor inspects the active workspace packages and package metadata
- **THEN** they find `packages/app` as the browser-facing HTTP package
- **AND** they do not find an active `packages/api` workspace package or `grogbot-api` distribution

### Requirement: The active HTTP surface is browser-facing rather than JSON API based
The `grogbot_app` package SHALL expose the server-rendered browser routes and static assets needed by the existing app experience, and the active repository SHALL NOT expose standalone JSON HTTP endpoints for search CRUD, ingestion, embedding, statistics, or query operations.

#### Scenario: A contributor runs the app locally
- **WHEN** a contributor starts the active HTTP package locally
- **THEN** the app serves the browser-facing routes for the landing page and search experience
- **AND** the app serves its static assets
- **AND** the active repository does not provide standalone JSON HTTP endpoints for source CRUD, document CRUD, ingestion, embedding, statistics, or query responses
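
The scenario above could be checked with a helper like the following sketch. It assumes the browser routes documented in the README (`/`, `/search`, `/search/query`) and the `/assets` static prefix; `unexpected_routes` and the sample asset path are hypothetical.

```python
# Assumption: these are the only browser-facing routes, per the README.
BROWSER_ROUTES = {"/", "/search", "/search/query"}
STATIC_PREFIX = "/assets"


def unexpected_routes(paths: list[str]) -> list[str]:
    """Return route paths that are neither browser pages nor static assets."""
    def is_static(path: str) -> bool:
        return path == STATIC_PREFIX or path.startswith(STATIC_PREFIX + "/")

    return sorted(p for p in paths if p not in BROWSER_ROUTES and not is_static(p))


if __name__ == "__main__":
    # `/search/sources` is one of the retired JSON API routes.
    observed = ["/", "/search", "/search/query", "/assets/site.css", "/search/sources"]
    print(unexpected_routes(observed))
```

In a real test the paths could come from the app's registered routes. Note that FastAPI adds `/docs`, `/redoc`, and `/openapi.json` by default, so those would be reported unless the app disables them or the allowed set is extended.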
@@ -0,0 +1,26 @@
## ADDED Requirements

### Requirement: Canonical workspace package identities
The repository SHALL define the active Grogbot workspace packages using a consistent naming pattern between directory names, distribution names, and Python module names. The active package set SHALL be `packages/search` / `grogbot-search` / `grogbot_search`, `packages/cli` / `grogbot-cli` / `grogbot_cli`, and `packages/app` / `grogbot-app` / `grogbot_app`.

#### Scenario: Active package layout is inspected
- **WHEN** a contributor inspects the current workspace configuration and package metadata
- **THEN** they find `search`, `cli`, and `app` as the active package directories
- **AND** the corresponding active workspace distribution names are `grogbot-search`, `grogbot-cli`, and `grogbot-app`
- **AND** the Python modules are `grogbot_search`, `grogbot_cli`, and `grogbot_app`

### Requirement: Current repository documentation uses canonical package names
Current repository documentation and package-oriented developer commands SHALL use the active package names and paths rather than the retired `search-core`, `web`, and `api` names.

#### Scenario: A contributor follows current setup or run documentation
- **WHEN** a contributor reads current repository documentation for syncing dependencies, running tests, or starting the browser app
- **THEN** the documented package names and paths use `packages/search`, `packages/cli`, and `packages/app`
- **AND** the documented `uv run --package ...` commands use `grogbot-search`, `grogbot-cli`, and `grogbot-app`

### Requirement: Archived naming references remain historical
Archived OpenSpec artifacts SHALL be treated as historical records and MAY retain references to the retired `search-core`, `web`, and `api` names, while current change artifacts and current repository documentation SHALL identify those references as historical rather than canonical.

#### Scenario: A contributor encounters old package names in archived artifacts
- **WHEN** a contributor reads archived OpenSpec artifacts that mention `packages/search-core`, `packages/web`, or `packages/api`
- **THEN** the current change artifacts describe those names as historical references
- **AND** the contributor can determine that the canonical current package layout is `search`, `cli`, and `app`
@@ -0,0 +1,16 @@
## 1. Rename active workspace packages

- [x] 1.1 Rename `packages/search-core` to `packages/search` and update its package metadata so the distribution name becomes `grogbot-search` while the module remains `grogbot_search`
- [x] 1.2 Rename `packages/web` to `packages/app` and update its package metadata so the distribution name becomes `grogbot-app` and the module becomes `grogbot_app`
- [x] 1.3 Update the root workspace configuration and source declarations to reference `packages/search`, `packages/cli`, and `packages/app` only

## 2. Remove the standalone API surface

- [x] 2.1 Delete the `packages/api` package and remove all active workspace/package references to `grogbot-api`
- [x] 2.2 Remove or update current in-repo references to the retired JSON API endpoints so the active repository no longer documents or imports that surface

## 3. Update verification and documentation

- [x] 3.1 Update app imports/tests and any current package-oriented commands to use `grogbot_app`, `grogbot-app`, and the renamed package paths
- [x] 3.2 Update current documentation to describe the new canonical package structure and explicitly note that archived OpenSpec artifacts may still reference historical names
- [x] 3.3 Regenerate workspace lock/package state as needed and run the relevant test suites or validation commands against the renamed active packages
17 changes: 0 additions & 17 deletions packages/api/pyproject.toml

This file was deleted.

1 change: 0 additions & 1 deletion packages/api/src/grogbot_api/__init__.py

This file was deleted.
