Discover Evaluation Benchmarks for Your AI Models. EvalHub is a platform designed to help researchers and developers find and review evaluation benchmarks for their machine learning models. Easily browse, filter, and access detailed information, GitHub links, and research papers for a variety of metrics.

ryantzr1/EvalHub

Latest commit: fba5b6f · Jul 28, 2024

EvalHub 🚀

Discover and Review Evaluation Metrics for Your AI Models!

Description

EvalHub is a platform designed to help you discover and review evaluation metrics for your machine learning models. Explore a variety of metrics here and then head over to the lm-evaluation-harness repository by EleutherAI to evaluate your models.
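Once you have found a benchmark here, a typical way to run it is via the harness's CLI. The sketch below is illustrative only: the model (`EleutherAI/pythia-160m`) and task (`lambada_openai`) are example values, not EvalHub recommendations; substitute your own model and the benchmark you found.

```shell
# Install EleutherAI's evaluation harness
pip install lm-eval

# Evaluate a Hugging Face model on a chosen task.
# Model and task names below are placeholders for illustration.
lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m \
    --tasks lambada_openai \
    --batch_size 8
```

See the lm-evaluation-harness documentation for the full set of supported model backends and tasks.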

Features

  • Metric Discovery: Search and filter through a comprehensive list of evaluation metrics.
  • Detailed Information: View detailed descriptions, GitHub links, and paper links for each metric.
  • Categorization: Metrics are organized into categories for easy navigation.
  • User Reviews (coming soon): see reviews and ratings from other researchers.
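To illustrate the discovery and categorization features above, here is a minimal sketch of filtering a metric catalog by category or keyword. It is purely hypothetical — the `Metric` fields and `find_metrics` helper are invented for this example and are not EvalHub's actual data model; the two catalog entries are real, well-known benchmarks used as sample data.

```python
from dataclasses import dataclass

@dataclass
class Metric:
    """One entry in a hypothetical metric catalog."""
    name: str
    category: str
    description: str
    github_url: str
    paper_url: str

# Sample catalog entries (real benchmarks, illustrative schema).
CATALOG = [
    Metric("HellaSwag", "commonsense reasoning",
           "Sentence-completion benchmark for commonsense inference.",
           "https://github.com/rowanz/hellaswag",
           "https://arxiv.org/abs/1905.07830"),
    Metric("TruthfulQA", "truthfulness",
           "Measures whether a model reproduces common falsehoods.",
           "https://github.com/sylinrl/TruthfulQA",
           "https://arxiv.org/abs/2109.07958"),
]

def find_metrics(catalog, category=None, keyword=None):
    """Filter the catalog by exact category and/or a keyword in the description."""
    results = catalog
    if category is not None:
        results = [m for m in results if m.category == category]
    if keyword is not None:
        results = [m for m in results if keyword.lower() in m.description.lower()]
    return results

print([m.name for m in find_metrics(CATALOG, category="truthfulness")])
```

Each result carries the detailed description, GitHub link, and paper link mentioned in the feature list, so a UI (or script) can render them directly.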

Roadmap

  • Review System: Implement a review system for users to rate and comment on metrics.
  • Detailed Analysis: Provide in-depth analysis and comparisons of different metrics.
  • Additional Features: More features to be added based on user feedback and needs.

Contributing

  • Disclaimer: This platform is a work in progress. Contributions are welcome!
  • Feature Requests: If you have an idea for a new feature, please open an issue to discuss it before starting development.

Acknowledgements

  • Special thanks to EleutherAI for their lm-evaluation-harness repository, which inspired the creation of this platform.
  • We appreciate the contributions of all open-source developers and researchers whose work is referenced and utilized within this platform.
  • Thank you to our users and contributors for their valuable feedback and support in improving EvalHub.
