
Dev #91

Merged
kasin-it merged 4 commits into main from dev on Jun 9, 2026
@kasin-it kasin-it commented Jun 9, 2026

Collaborator

Summary by CodeRabbit

  • New Features

    • Dashboard now displays real cost metrics, evaluation data, and prompt information from live sources.
    • Added historical prompt version tracking with version comparison capabilities.
    • Implemented loading skeletons for improved perceived performance.
  • Improvements

    • Enhanced error handling with graceful degradation when data sources are unavailable.
    • Real-time cost aggregation by workflow and time period.
    • Live evaluation metrics showing pass rates and grading progress.
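
The pass rate and grading progress mentioned in the last item could be derived roughly as follows. This is a minimal sketch, not the PR's code: the counter names (`passed`, `failed`, `total`) and the choice to divide by graded traces rather than all traces are assumptions.

```typescript
// Hypothetical counters; the PR summary only says that passed, failed,
// and total continuous-eval traces are counted.
interface EvalCounts {
  passed: number;
  failed: number;
  total: number;
}

interface EvalMetrics {
  passRate: number | null; // null when nothing has been graded yet
  gradedFraction: number;  // share of traces that already have a verdict
}

function summarizeEvals({ passed, failed, total }: EvalCounts): EvalMetrics {
  const graded = passed + failed;
  return {
    // Assumption: pass rate is measured over graded traces only,
    // so the zero-graded case has no meaningful score.
    passRate: graded === 0 ? null : passed / graded,
    gradedFraction: total === 0 ? 0 : graded / total,
  };
}
```

Returning `null` for the zero-graded case lets the UI show "no data yet" instead of a misleading 0%.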

@chatgpt-codex-connector


Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository-wide code reviews.

@vercel

vercel Bot commented Jun 9, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ai-workflow-app (ai-workflow-demo) | Ready | Preview, Comment | Jun 9, 2026 11:32am |
| ai-workflow-app-dashboard | Ready | Preview, Comment | Jun 9, 2026 11:32am |


@coderabbitai

coderabbitai Bot commented Jun 9, 2026


Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cfecf400-de28-44e9-aa0f-eb7e97263ff1

📥 Commits

Reviewing files that changed from the base of the PR and between fc7a574 and 58538a1.

📒 Files selected for processing (38)
  • .claude/learnings.md
  • apps/dashboard/app/(cockpit)/cost/page.tsx
  • apps/dashboard/app/(cockpit)/evals/page.tsx
  • apps/dashboard/app/(cockpit)/prompts/page.tsx
  • apps/dashboard/app/api/prompts/[name]/versions/[version]/route.ts
  • apps/dashboard/app/cost-data.tsx
  • apps/dashboard/app/cost-skeleton.tsx
  • apps/dashboard/app/evals-data.tsx
  • apps/dashboard/app/evals-skeleton.tsx
  • apps/dashboard/app/prompts-data.tsx
  • apps/dashboard/app/prompts-skeleton.tsx
  • apps/dashboard/app/skeleton-block.tsx
  • apps/dashboard/components/cockpit/screens/cost.tsx
  • apps/dashboard/components/cockpit/screens/evals.tsx
  • apps/dashboard/components/cockpit/screens/prompts.tsx
  • apps/dashboard/lib/api/fallbacks.ts
  • apps/shared/contracts/api.ts
  • apps/shared/contracts/domain.ts
  • apps/worker/src/lib/overview/collect-cost.test.ts
  • apps/worker/src/lib/overview/collect-cost.ts
  • apps/worker/src/lib/overview/collect-evals.test.ts
  • apps/worker/src/lib/overview/collect-evals.ts
  • apps/worker/src/lib/overview/collect-prompts.test.ts
  • apps/worker/src/lib/overview/collect-prompts.ts
  • apps/worker/src/routes/api/v1/cost.get.ts
  • apps/worker/src/routes/api/v1/evals.get.ts
  • apps/worker/src/routes/api/v1/prompts.get.ts
  • apps/worker/src/routes/api/v1/prompts/[name]/versions/[version].get.ts
  • apps/worker/src/sandbox/arthur-client.test.ts
  • apps/worker/src/sandbox/arthur-client.ts
  • apps/worker/src/workflows/prompts-step.test.ts
  • apps/worker/src/workflows/prompts-step.ts
  • docs/superpowers/plans/2026-06-08-cost-real-data.md
  • docs/superpowers/plans/2026-06-08-evals-real-data.md
  • docs/superpowers/plans/2026-06-08-prompts-real-data.md
  • docs/superpowers/specs/2026-06-08-cost-real-data-design.md
  • docs/superpowers/specs/2026-06-08-evals-real-data-design.md
  • docs/superpowers/specs/2026-06-08-prompts-real-data-design.md

📝 Walkthrough


This PR converts three core dashboard pages—/cost, /evals, and /prompts—from hardcoded mock data to live, Arthur-backed data. It adds shared data contracts, extends the Arthur client with trace and prompt version reads, implements worker-side aggregation, wires new API routes, and refactors all dashboard components to fetch and consume typed responses within Suspense boundaries.
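
To illustrate what "typed responses" with availability discrimination can look like, here is a minimal sketch. Only the `available` and `reason` fields are taken from this summary; `totalUsd`, the fallback reason string, and the `headline` helper are hypothetical and stand in for whatever the real `CostResponse` contract carries.

```typescript
// Hypothetical stand-in for a shared contract type. Only `available`
// and `reason` are named in the PR summary; `totalUsd` is illustrative.
type CostResponse =
  | { available: true; totalUsd: number }
  | { available: false; reason: string };

// A typed "unavailable" value, in the spirit of the costFallback helper
// listed in the summary (its real signature is not shown there).
function costFallback(reason = "worker unreachable"): CostResponse {
  return { available: false, reason };
}

// Consumers narrow on `available` before touching any data fields,
// so the compiler rejects reads of `totalUsd` on the unavailable branch.
function headline(res: CostResponse): string {
  if (!res.available) {
    return `Cost data unavailable (${res.reason})`;
  }
  return `Month to date: $${res.totalUsd.toFixed(2)}`;
}
```

Because both branches share one union type, a failed fetch and a healthy response go through the same rendering path, which is what makes graceful degradation cheap on the dashboard side.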

Changes

Dashboard migration to real Arthur-backed data

| Layer | File(s) | Summary |
| --- | --- | --- |
| Shared data contracts | apps/shared/contracts/domain.ts, apps/shared/contracts/api.ts | New exported types define cost/evals/prompts response shapes: EvalsResponse (union of available/unavailable with fleet metrics), CostResponse (totals, per-workflow, daily timeseries), PromptsResponse (prompt list with Arthur enabled flag), and supporting domain types PromptVersion/PromptDef. |
| Arthur client trace and prompt read methods | apps/worker/src/sandbox/arthur-client.ts, *.test.ts | Client gains listAllTasks, listTraces, countTraces, listPromptVersions, getPromptVersionBody with new types (TraceRow, ArthurPromptVersion) and pagination/batching helpers. Tests verify pagination, deduping, and 404 fallback behavior across all new methods. |
| Cost aggregation | apps/worker/src/lib/overview/collect-cost.ts, *.test.ts | New collectCost computes month-to-date totals, per-workflow breakdown, and daily buckets from Arthur traces. Handles null costs, empty traces, and exports CostArthurClient/CollectCostOptions interfaces. Tests cover happy path, null handling, and empty-trace edge cases. |
| Evals aggregation | apps/worker/src/lib/overview/collect-evals.ts, *.test.ts | New collectEvals counts passed/failed/total continuous-eval traces and derives pass-rate score. Exports EvalsAggregate, EvalsArthurClient, CollectEvalsOptions. Tests verify score calculation, zero-graded edge case, and expected trace-counting calls. |
| Prompts resolution and version history | apps/worker/src/lib/overview/collect-prompts.ts, *.test.ts, apps/worker/src/workflows/prompts-step.ts | New resolvePrompts helper selects production bodies from Arthur or fallbacks, optionally fetches version history with production-body attachment. Supports Arthur-disabled graceful fallback. prompts-step.ts refactored to delegate to this helper. Tests cover Arthur enabled/disabled, null body, errors, and model selection (codex vs claude). |
| Worker API routes | apps/worker/src/routes/api/v1/{cost,evals,prompts}.get.ts, apps/worker/src/routes/api/v1/prompts/[name]/versions/[version].get.ts | Four new routes expose cost/evals/prompts data with cache headers and availability discrimination. Each checks Arthur configuration, calls the collector, and returns { available: true, ...data } or { available: false, reason } on unconfigured/error paths. |
| Dashboard pages and loading patterns | apps/dashboard/app/(cockpit)/{cost,evals,prompts}/page.tsx | All three pages updated to use Suspense boundaries: render async data components with skeleton fallbacks instead of direct screen components, enabling streaming and progressive rendering. |
| Dashboard skeleton and data components | apps/dashboard/app/{cost,evals,prompts}-{skeleton,data}.tsx, apps/dashboard/app/skeleton-block.tsx | Reusable Block skeleton component. Each feature gets a *Skeleton layout placeholder and an async *Data component that fetches from /api/v1/{feature}, applies fallback on error, and renders the screen with typed props. |
| Dashboard fallback responses | apps/dashboard/lib/api/fallbacks.ts | New evalsFallback, costFallback, promptsFallback functions provide typed empty/unavailable responses when worker APIs fail or are unconfigured. |
| Dashboard screen refactors | apps/dashboard/components/cockpit/screens/{cost,evals,prompts}.tsx | All three screens changed from mock-data rendering to data-driven: accept data prop, discriminate on available field, compute KPIs and render from real response fields. Cost adds shortDate() helper and renders KPIs/chart/table from totals and daily buckets. Evals collapses multi-group UI into single Quality card. Prompts adds arthur status color, refactors PromptList to derive tag filters from real data, and rebuilds PromptDetail with lazy version-body fetching and in-file LCS diff viewer. |
| Prompts version body proxy | apps/dashboard/app/api/prompts/[name]/versions/[version]/route.ts | New Next.js API route proxies historical prompt version bodies from the worker, returning deterministic fallback when upstream fails. |
| Implementation plans, design specs, and learnings | docs/superpowers/plans/*.md, docs/superpowers/specs/*.md, .claude/learnings.md | Comprehensive documentation of the three-feature conversion including task breakdowns, design rationale, Arthur API assumptions (with corrections), and self-review checklists. |
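
As a rough illustration of the cost aggregation layer described above (month-to-date totals, a per-workflow breakdown, and daily buckets, with null costs tolerated), the sketch below shows one way such a collector could work. It is not the PR's `collectCost`: the row shape, the field names, and the use of UTC month and day boundaries are all assumptions.

```typescript
// Hypothetical trace row; the real TraceRow type in arthur-client.ts may differ.
interface TraceCostRow {
  workflow: string;
  startedAt: string;      // ISO 8601 timestamp
  costUsd: number | null; // null when no cost was recorded for the trace
}

interface CostSummary {
  totalUsd: number;
  byWorkflow: Record<string, number>;
  daily: { date: string; usd: number }[];
}

function aggregateCost(rows: TraceCostRow[], now: Date): CostSummary {
  // Assumption: "month to date" runs from the first instant of the
  // current UTC month up to `now`.
  const monthStart = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1);
  const byWorkflow: Record<string, number> = {};
  const byDay = new Map<string, number>();
  let totalUsd = 0;

  for (const row of rows) {
    const ts = Date.parse(row.startedAt);
    if (Number.isNaN(ts) || ts < monthStart || ts > now.getTime()) continue;
    const usd = row.costUsd ?? 0; // a missing cost counts as zero
    totalUsd += usd;
    byWorkflow[row.workflow] = (byWorkflow[row.workflow] ?? 0) + usd;
    const day = new Date(ts).toISOString().slice(0, 10); // YYYY-MM-DD (UTC)
    byDay.set(day, (byDay.get(day) ?? 0) + usd);
  }

  // Emit daily buckets in chronological order for charting.
  const daily = Array.from(byDay.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([date, usd]) => ({ date, usd }));

  return { totalUsd, byWorkflow, daily };
}
```

An empty trace list yields a zero total and no buckets, so the screen can render an empty state without special-casing.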

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

This PR introduces substantial, interconnected changes across worker logic, shared contracts, API routes, and dashboard UI. The complexity has three sources: three parallel but distinct data flows (cost, evals, prompts), each of which must be understood along its full chain (contracts → collection → API → dashboard); refactored screen components with new data consumption patterns and UI simplifications; and an extended Arthur client with pagination and batching logic. Reviewers also need to verify contract consistency across layers, trace each feature's data flow end to end, and validate the refactored screens against their new real-data semantics.
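
For reviewers tracing the pagination logic mentioned above, the following sketch shows the general pattern of walking pages and deduplicating rows by id. It is an assumption-laden illustration, not the Arthur client's code: the `Page` shape, the `hasMore` flag, the zero-based page index, and the `maxPages` cap are all hypothetical.

```typescript
// Hypothetical page shape; the real Arthur pagination parameters
// are not shown in this PR summary.
interface Page<T> {
  items: T[];
  hasMore: boolean;
}

type FetchPage<T> = (page: number, pageSize: number) => Promise<Page<T>>;

async function listAll<T extends { id: string }>(
  fetchPage: FetchPage<T>,
  pageSize = 50,
  maxPages = 100, // safety cap so a misbehaving upstream cannot loop forever
): Promise<T[]> {
  const seen = new Set<string>();
  const out: T[] = [];

  for (let page = 0; page < maxPages; page++) {
    const { items, hasMore } = await fetchPage(page, pageSize);
    for (const item of items) {
      if (seen.has(item.id)) continue; // drop rows repeated across pages
      seen.add(item.id);
      out.push(item);
    }
    // Stop on the last page, or if the upstream returns an empty page.
    if (!hasMore || items.length === 0) break;
  }
  return out;
}
```

Injecting `fetchPage` keeps the loop testable without network access, which matches the kind of pagination and deduping tests the summary describes.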

Possibly related PRs

  • Blazity/ai-workflow#88: Earlier routing and page structure changes in the same /cost, /evals, /prompts pages that this PR wraps with Suspense and data fetching.
  • Blazity/ai-workflow#87: Initial implementation of the /cost, /evals, /prompts cockpit screens in mock form, directly replaced by real-data versions in this PR.
  • Blazity/ai-workflow#58: Core Arthur client and prompt infrastructure (task enumeration, trace access, prompt-step wiring) that this PR builds upon for the dashboard real-data feature.

Poem

🐰 A rabbit's ode to real-time dashboards

Once mock data filled the dashboard's frame,
But Arthur's traces whispered a better game.
Now costs, evals, and prompts flow free and true,
Server components fetch with a Suspense debut.
No more pretense—just real numbers for you! 🚀


@kasin-it kasin-it merged commit a315824 into main Jun 9, 2026
4 of 5 checks passed