Skip to content

feat(eval): token budget tracking and cost reports #1050

@yonatangross

Description

@yonatangross

Summary

REVISED: Descoped from token tracking to duration tracking.

CC Max = flat rate. Token costs irrelevant. Track eval duration instead.

EVAL DURATION REPORT
Skill          Trigger    Quality    Total
commit         45s        120s       2m 45s
assess         38s        95s        2m 13s
api-design     52s        180s       3m 52s

Helps decide whether to run --all or just changed skills.

Depends on #1042, #1043

Metadata

Metadata

Assignees

No one assigned

    Labels

    evalSkill evaluation pipeline

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions