CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project

An online repository platform for ShExMaps — mappings between RDF shapes defined by ShEx (Shape Expressions). See REQUIREMENTS.md for full requirements.

Tech Stack

| Layer | Technology |
|---|---|
| Backend | Node.js with Fastify (api/) |
| Triplestore / SPARQL | QLever (docker/qlever/) |
| Frontend | React SPA with Vite (frontend/) |
| Auth | Optional — OAuth2/OIDC + API keys (AUTH_ENABLED env var) |
| Deployment | Docker + Docker Compose |
| ShEx processing | @shexjs/parser, @shexjs/core |
| Visualization | ReactFlow (mapping graphs), Recharts (coverage heatmaps) |

Commands

# Start everything
cp .env.example .env
docker compose up --build

# Development (hot reload — starts api + qlever + nginx)
docker compose up

# Run API in dev mode (outside Docker, requires QLever running separately)
cd api && npm install && npm run dev

# Run frontend dev server
cd frontend && npm install && npm run dev

# Type-check API
cd api && npm run typecheck

# Run tests
cd api && npm test
cd frontend && npm test

# Force full QLever index rebuild (wipes volume, rebuilds from sparql/seed/ + ontology)
./scripts/rebuild-index.sh

# Backup the live triplestore to a Turtle file
./scripts/backup-db.sh                        # → sparql/backup/YYYY-MM-DDTHH-MM-SS.ttl
./scripts/backup-db.sh path/to/output.ttl     # custom output path

# Restore the triplestore from a Turtle backup (destructive — prompts for confirmation)
./scripts/restore-db.sh sparql/backup/YYYY-MM-DDTHH-MM-SS.ttl

# Validate a ShExMap file
npx tsx scripts/validate-shexmap.ts path/to/map.shexmap

Architecture

Services are orchestrated with Docker Compose and communicate over a private shexmap-net bridge network. Only nginx is exposed to the host on port 80.

Browser → nginx:80
           ├── /api/v1/*  → api:3000   (Fastify REST API)
           ├── /sparql    → api:3000   (proxied to QLever with optional auth)
           └── /*         → static     (React SPA)

api:3000 → qlever:7001  (direct SPARQL queries, not through nginx)
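
The routing table above can be read as a small decision function. The sketch below is only a model of that diagram, not the project's nginx configuration: the names resolveUpstream and matchesPrefix are hypothetical, and the exact prefix-matching rules nginx applies in this repository are an assumption.

```typescript
// Hypothetical model of the routing diagram; not taken from the nginx config.
type Upstream = "api" | "static";

interface RouteDecision {
  upstream: Upstream;
  description: string;
}

// True when `path` equals `prefix` or continues it with "/" or "?".
// Assumption: the real location blocks match in a comparable way.
function matchesPrefix(path: string, prefix: string): boolean {
  return (
    path === prefix ||
    path.startsWith(prefix + "/") ||
    path.startsWith(prefix + "?")
  );
}

function resolveUpstream(path: string): RouteDecision {
  if (matchesPrefix(path, "/api/v1")) {
    return { upstream: "api", description: "Fastify REST API on api:3000" };
  }
  if (matchesPrefix(path, "/sparql")) {
    return { upstream: "api", description: "api:3000, which proxies to QLever" };
  }
  // Everything else falls through to the React SPA.
  return { upstream: "static", description: "React SPA static files" };
}
```

For example, resolveUpstream("/pairings/create") falls through to the static branch, while resolveUpstream("/api/v1/validate") is sent to the API.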

Key Directories

Create Pairing Page (/pairings/create)

CreatePairingPage.tsx is the main authoring UI. Key behaviours:

Side panels (source & target)

  • Each panel has a ShExMap selector, a versioned Monaco ShEx editor, a Sample Turtle Data editor, and a Focus IRI input.
  • Turtle data and focus IRI are persisted to localStorage keyed by mapId (shexmap-turtle-data and shexmap-focus-iri keys) and restored automatically when a map is selected.
  • When a pairing is loaded (?id=), the stored sourceFocusIri and targetFocusIri are also restored from the SPARQL pairing record.
  • Each panel has its own Validate button (in the Focus IRI row) that sends just that side's ShEx, Turtle, and focus node to POST /api/v1/validate and shows a compact binding summary inline. The button is enabled only when all three inputs are present.
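
A minimal sketch of the per-map persistence and the Validate enablement rule described in the list above. Only the two storage key names come from this document. The storage layout (one JSON object per key, mapping mapId to a value) is an assumption, and KeyValueStore, savePanelState, loadPanelState, and canValidate are hypothetical names, not the component's actual helpers.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const TURTLE_KEY = "shexmap-turtle-data";
const FOCUS_KEY = "shexmap-focus-iri";

// Assumption: each key holds a JSON object of the form { [mapId]: value }.
function readTable(store: KeyValueStore, key: string): Record<string, string> {
  const raw = store.getItem(key);
  if (raw === null) return {};
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, string>;
    }
  } catch {
    // Corrupt JSON is treated as "nothing stored".
  }
  return {};
}

function savePanelState(
  store: KeyValueStore,
  mapId: string,
  turtle: string,
  focusIri: string
): void {
  const turtleTable = readTable(store, TURTLE_KEY);
  turtleTable[mapId] = turtle;
  store.setItem(TURTLE_KEY, JSON.stringify(turtleTable));

  const focusTable = readTable(store, FOCUS_KEY);
  focusTable[mapId] = focusIri;
  store.setItem(FOCUS_KEY, JSON.stringify(focusTable));
}

function loadPanelState(
  store: KeyValueStore,
  mapId: string
): { turtle: string; focusIri: string } {
  return {
    turtle: readTable(store, TURTLE_KEY)[mapId] ?? "",
    focusIri: readTable(store, FOCUS_KEY)[mapId] ?? "",
  };
}

// The Validate button is enabled only when ShEx, Turtle, and focus IRI are all non-empty.
function canValidate(shex: string, turtle: string, focusIri: string): boolean {
  return [shex, turtle, focusIri].every((value) => value.trim() !== "");
}
```

In a browser, window.localStorage satisfies KeyValueStore and can be passed directly.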

Shared variable highlighting

  • buildVarColorMap computes which %Map:{ variable %} names appear in both ShExMaps; matched variables are colour-coded, unmatched are greyed.
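
One way the shared-variable computation could work is sketched below. This is not the project's buildVarColorMap: the function names, the regular expression, the palette, and the grey value are all assumptions chosen for illustration.

```typescript
interface VarColor {
  shared: boolean;
  color: string;
}

// Hypothetical palette and grey; the real colours are not given in this document.
const PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"];
const GREY = "#9ca3af";

// Collect the variable names used in %Map:{ variable %} annotations.
function extractMapVariables(shex: string): Set<string> {
  const names = new Set<string>();
  const pattern = /%Map:\{\s*([^%]*?)\s*%\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(shex)) !== null) {
    const name = match[1];
    if (name) names.add(name);
  }
  return names;
}

// Variables present on both sides get a palette colour; the rest are greyed.
function sketchVarColorMap(sourceShex: string, targetShex: string): Map<string, VarColor> {
  const source = extractMapVariables(sourceShex);
  const target = extractMapVariables(targetShex);
  const all = Array.from(new Set([...Array.from(source), ...Array.from(target)])).sort();

  const result = new Map<string, VarColor>();
  let nextColor = 0;
  for (const name of all) {
    if (source.has(name) && target.has(name)) {
      result.set(name, { shared: true, color: PALETTE[nextColor % PALETTE.length] ?? GREY });
      nextColor += 1;
    } else {
      result.set(name, { shared: false, color: GREY });
    }
  }
  return result;
}
```

Sorting the names before assigning colours keeps the mapping stable across re-renders, which is a design choice of this sketch rather than a documented behaviour.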

Paired validation (section 3)

  • Direction toggle: Source→Target or Target→Source.
  • Validate extracts bindings from the active source side.
  • Validate & Materialise additionally generates target RDF using the target ShEx.
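
The direction toggle above amounts to choosing which panel supplies bindings and which panel's ShEx drives materialisation. The sketch below illustrates that choice only. PanelState, ValidationPlan, and planPairedValidation are hypothetical names, and the actual validation and materialisation calls are not shown.

```typescript
type Direction = "source-to-target" | "target-to-source";

interface PanelState {
  shex: string;
  turtle: string;
  focusIri: string;
}

interface ValidationPlan {
  // Panel whose ShEx, Turtle, and focus IRI are validated to extract bindings.
  extractFrom: PanelState;
  // ShEx used to generate RDF when "Validate & Materialise" is chosen.
  materialiseWithShex: string;
}

function planPairedValidation(
  direction: Direction,
  source: PanelState,
  target: PanelState
): ValidationPlan {
  if (direction === "source-to-target") {
    return { extractFrom: source, materialiseWithShex: target.shex };
  }
  // Target→Source: the target panel becomes the active source side.
  return { extractFrom: target, materialiseWithShex: source.shex };
}
```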

Save / version

  • "Save Pairing" (new) or "Update Pairing" (edit) saves pairing metadata to QLever. On update, it also creates a ShExMapPairingVersion snapshot atomically. An optional change-note input appears next to the button when editing.
  • Saving also stores sourceFocusIri and targetFocusIri in the pairing record in QLever.
  • A separate ↓ Download button exports the full pairing (metadata + both ShEx contents + focus IRIs) as a JSON file. It is enabled only after the pairing has been saved at least once.
  • Version history is shown via a History (n) button that appears once snapshots exist.

Pairing data model additions

  • shexmap:sourceFocusIri and shexmap:targetFocusIri datatype properties were added to ShExMapPairing in the ontology, model, service (GET/create/update), and frontend types. The ontology change takes effect only after a QLever index rebuild (./scripts/rebuild-index.sh).

Data Model (RDF)

All ShExMap data is stored as RDF in QLever. The ontology is at sparql/ontology/shexmap.ttl.

Core IRI patterns:

  • ShExMap: https://w3id.org/shexmap/resource/{uuid}
  • User: https://w3id.org/shexmap/resource/user/{id}
  • Schema: https://w3id.org/shexmap/resource/schema/{id}
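
The IRI patterns above can be captured in a few helpers. The function names are hypothetical, and percent-encoding the identifier is an assumption of this sketch, not something the document specifies.

```typescript
const RESOURCE_BASE = "https://w3id.org/shexmap/resource/";

// ShExMap: https://w3id.org/shexmap/resource/{uuid}
function shexMapIri(uuid: string): string {
  return `${RESOURCE_BASE}${encodeURIComponent(uuid)}`;
}

// User: https://w3id.org/shexmap/resource/user/{id}
function userIri(id: string): string {
  return `${RESOURCE_BASE}user/${encodeURIComponent(id)}`;
}

// Schema: https://w3id.org/shexmap/resource/schema/{id}
function schemaIri(id: string): string {
  return `${RESOURCE_BASE}schema/${encodeURIComponent(id)}`;
}
```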

Authentication

Auth is gated entirely behind the AUTH_ENABLED environment variable (default: false). When it is disabled, requireAuth preHandlers are no-ops and the platform is fully public for both reads and writes. When it is enabled, the API accepts JWTs (Bearer token) and API keys (X-API-Key header). OAuth providers: GitHub, ORCID, Google (wired via @fastify/oauth2).
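
The gating logic described above can be sketched as a pure decision function that a preHandler could call. This is an illustration, not the project's requireAuth: decideAuth, parseAuthEnabled, and the Verifiers interface are hypothetical, the token and key checks are left to the caller, and treating only the string "true" as enabled is an assumption.

```typescript
// Header names are lowercased, as Node's HTTP layer delivers them.
type RequestHeaders = Record<string, string | undefined>;

type AuthDecision =
  | { allowed: true; via: "auth-disabled" | "jwt" | "api-key" }
  | { allowed: false; status: 401 };

// Hypothetical hooks; real JWT and API-key verification is not shown.
interface Verifiers {
  verifyJwt(token: string): boolean;
  verifyApiKey(key: string): boolean;
}

// Assumption: only the literal string "true" enables auth; unset means disabled.
function parseAuthEnabled(raw: string | undefined): boolean {
  return raw === "true";
}

function decideAuth(
  authEnabled: boolean,
  headers: RequestHeaders,
  verifiers: Verifiers
): AuthDecision {
  // With auth disabled the check is a no-op: every request passes.
  if (!authEnabled) return { allowed: true, via: "auth-disabled" };

  const bearer = /^Bearer\s+(\S+)$/i.exec(headers["authorization"] ?? "");
  if (bearer && bearer[1] !== undefined && verifiers.verifyJwt(bearer[1])) {
    return { allowed: true, via: "jwt" };
  }

  const apiKey = headers["x-api-key"];
  if (apiKey !== undefined && verifiers.verifyApiKey(apiKey)) {
    return { allowed: true, via: "api-key" };
  }

  return { allowed: false, status: 401 };
}
```

Keeping the decision separate from the framework hook makes it testable without starting a server. A preHandler would translate a denied decision into a 401 reply.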

QLever Notes

QLever builds an on-disk index at startup from Turtle files — it is not a live-append store like Fuseki. Updates go through SPARQL UPDATE via config.qlever.updateUrl. If QLever's UPDATE endpoint is unavailable, the index must be rebuilt via ./scripts/rebuild-index.sh.
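
A rough sketch of what a write through the update endpoint might look like, following the SPARQL 1.1 Protocol convention of POSTing the update text with the application/sparql-update content type. The helper names are hypothetical and the predicate is passed in as a parameter, not taken from the ontology. Depending on the server configuration, QLever may also require an access token for updates; that is not shown here.

```typescript
interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Escape a string for use inside a double-quoted SPARQL literal.
function escapeLiteral(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// Reject characters that are not allowed inside <...> in SPARQL.
function assertSafeIri(iri: string): string {
  if (/[<>"{}|^`\\\s]/.test(iri)) {
    throw new Error(`Unsafe IRI: ${iri}`);
  }
  return iri;
}

function insertLiteralUpdate(subject: string, predicate: string, value: string): string {
  return `INSERT DATA { <${assertSafeIri(subject)}> <${assertSafeIri(predicate)}> "${escapeLiteral(value)}" . }`;
}

// `updateUrl` stands in for config.qlever.updateUrl.
function buildUpdateRequest(updateUrl: string, update: string): HttpRequestSpec {
  return {
    url: updateUrl,
    method: "POST",
    headers: { "Content-Type": "application/sparql-update" },
    body: update,
  };
}
```

The resulting object can be handed to fetch(spec.url, spec) or an equivalent HTTP client.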

The index build runs in the qlever-init init-container and gates all other services via depends_on: condition: service_completed_successfully.

Index builder: init-index.sh calls /qlever/qlever-index directly (not the qlever CLI wrapper, which requires a Qleverfile and fails in headless mode).

Rebuild script (scripts/rebuild-index.sh): bypasses the qlever-perms/qlever-init compose dependency chain entirely. It uses a plain docker run as root to clear the volume and rebuild, which avoids a persistent Docker volume permission issue in which the chmod 777 applied by qlever-perms does not take effect for the subsequent qlever-init container mount. Seed files from sparql/seed/ and the ontology from sparql/ontology/ are merged and indexed. Running it wipes all runtime data.

Backup script (scripts/backup-db.sh): issues a CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } query against the live QLever SPARQL endpoint and saves the result as Turtle to sparql/backup/. Requires QLever to be running.
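
The backup script itself is a shell script. The sketch below restates the same idea in TypeScript for illustration only. The CONSTRUCT query and the sparql/backup/ directory come from this document, while the function names, the use of UTC for the timestamp, and the POST-with-Accept request shape (per the SPARQL 1.1 Protocol) are assumptions.

```typescript
const DUMP_QUERY = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }";

// Default output path in the YYYY-MM-DDTHH-MM-SS.ttl form.
// Assumption: UTC is used; the real script may use local time.
function backupFileName(now: Date): string {
  const stamp = now.toISOString().slice(0, 19).replace(/:/g, "-");
  return `sparql/backup/${stamp}.ttl`;
}

// Describe the HTTP request that asks the endpoint for a full dump as Turtle.
function buildDumpRequest(endpoint: string): {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
} {
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/sparql-query",
      Accept: "text/turtle",
    },
    body: DUMP_QUERY,
  };
}
```

A caller would send the request, check for a successful status, and write the response body to the path returned by backupFileName.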

Restore script (scripts/restore-db.sh): stops QLever, rebuilds the index from a Turtle backup file, and restarts QLever. Prompts for confirmation before proceeding. Use this to recover from a rebuild that wiped needed data.

No sample data by default: sparql/seed/ directories are empty. The QLever index starts with only the ontology triples. Add .ttl files under sparql/seed/shexmaps/ or sparql/seed/pairings/ to pre-populate on fresh index builds.

ShEx version content: ShExMap version content is stored directly in SPARQL as the shexmap:versionContent literal on each ShExMapVersion node — there is no filesystem file store.