
add evaluation dashboard via github.io page publication #8

Open
keighrim opened this issue May 11, 2023 · 1 comment
@keighrim (Member)

Epic issue to track progress and issues related to web presence of evaluation results, and possibly evaluation invokers.

@clams-bot clams-bot added this to infra May 11, 2023
@github-project-automation github-project-automation bot moved this to Todo in infra May 11, 2023
@keighrim keighrim added the 📝D Improvements or additions to documentation label May 11, 2023
@keighrim keighrim modified the milestones: docs-v1, eval-v1 Jun 26, 2023
@keighrim keighrim modified the milestones: eval-v1, docs-v1 Jan 29, 2024
@MrSqually MrSqually self-assigned this Jun 13, 2024
@MrSqually (Contributor)

Considering the use cases of an evaluation dashboard, the following features seem particularly salient:

Evaluation Dashboard Functions

  • quick lookup and enumeration of evaluated / "production ready" apps within a given problem space
  • comparison of the various CLAMS-ready tools for a given task, across all available metrics
  • comparison of prior evaluations to more recent app updates.

As such, I've come up with the following general structure for the evaluation dashboard:

Proposed Evaluation Dashboard Features

  • "task oriented" grouping of evaluations, similar to the directory structure
  • comparison table between similar app evaluations within a given task + eval dataset
  • link or reference to the annotations used in the evaluation (for cross-reference, validation, etc.)

Example Layout

Optical Character Recognition [task heading]

Dataset: aapb-collaboration-batch-xx (symlink to annotations?)

| Metric | Tesseract | Parseq | docTR |
|--------|-----------|--------|-------|
| CER    | 75        | 60     | 30    |
| WER    | 60        | 60     | 20    |

etc.
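
To avoid maintaining tables like this by hand, the page could be generated from evaluation output. Below is only a sketch of what that might look like: the `results/<task>/<dataset>/<app-name>.json` layout and the flat `{metric: score}` file format are assumptions for illustration, not anything we've settled on.

```python
# Sketch only: assumes each evaluation run writes a small JSON file like
#   results/<task>/<dataset>/<app-name>.json  ->  {"CER": 0.30, "WER": 0.20}
# Neither the directory layout nor the file format is settled yet.
import json
from pathlib import Path


def build_table(dataset_dir: Path) -> str:
    """Render one markdown comparison table for all apps evaluated on one dataset."""
    scores = {}   # app name -> {metric: value}
    metrics = []  # first-seen order of metrics, for stable row ordering
    for result_file in sorted(dataset_dir.glob("*.json")):
        app_scores = json.loads(result_file.read_text())
        scores[result_file.stem] = app_scores
        for metric in app_scores:
            if metric not in metrics:
                metrics.append(metric)
    apps = sorted(scores)
    header = "| Metric | " + " | ".join(apps) + " |"
    divider = "|---" * (len(apps) + 1) + "|"
    rows = [
        "| " + metric + " | "
        + " | ".join(str(scores[app].get(metric, "-")) for app in apps)
        + " |"
        for metric in metrics
    ]
    return "\n".join([header, divider, *rows])


if __name__ == "__main__":
    # Walk the assumed results/ tree and print one markdown section per task.
    for task_dir in sorted(Path("results").iterdir()):
        print(f"## {task_dir.name}")
        for dataset_dir in sorted(task_dir.iterdir()):
            print(f"### Dataset: {dataset_dir.name}")
            print(build_table(dataset_dir))
```

The output of something like this could be committed to the github.io branch as part of each evaluation run, so the dashboard never drifts from the actual results.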

Let me know if there's anything missing here that would be important for this page. I'm also considering adding external links to documentation for the metrics themselves (e.g., a link to a Wikipedia explanation of what CER actually is), but I'm not sure there's a programmatic way to generate that kind of documentation, and doing it entirely by hand seems to defeat the purpose.
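
One pragmatic middle ground for the metric-documentation question might be a small, hand-maintained lookup that the page generator consults: only metrics we actually report need an entry, and anything unknown is just rendered as plain text. A rough sketch (the `METRIC_DOCS` mapping and `metric_cell` helper are made-up names, and the URLs are just the obvious candidates):

```python
# Sketch: a tiny, manually curated map from metric names to reference pages.
# Metrics without an entry fall back to plain text, so manual upkeep stays small.
METRIC_DOCS = {
    # CER is typically explained alongside WER; pointing both at the same page for now.
    "CER": "https://en.wikipedia.org/wiki/Word_error_rate",
    "WER": "https://en.wikipedia.org/wiki/Word_error_rate",
}


def metric_cell(name: str) -> str:
    """Return a markdown link for the metric if we have a reference page for it."""
    url = METRIC_DOCS.get(name)
    return f"[{name}]({url})" if url else name
```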
