Pinned
- open-compass/opencompass (Public): OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
- PremiLab-Math/MathCheck (Public): [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Python 31
- NLP2CT/kNN-TL (Public): [ACL 2023] kNN-TL: k-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation
- NLP2CT/UaIT (Public): [EMNLP 2024] Can LLMs Learn Uncertainty on Their Own? Expressing Uncertainty Effectively in A Self-Training Manner
Python