How is the SmolInstruct overall score calculated for Intern-S1-Pro?

I noticed that the SmolInstruct overall score for Intern-S1-Pro on the leaderboard is 74.8.
After running the evaluation with the configuration file at https://github.com/open-compass/opencompass/blob/main/examples/eval_intern_s1_pro.py, the SmolInstruct output contains scores only for the individual subsets, with no overall score. Could you clarify how the overall score shown on the leaderboard is computed?

| dataset | version | metric | mode | qwen3 |
|----- | ----- | ----- | ----- | -----|
| SmolInstruct | - | - | - | - |
| NC-I2F-0shot-instruct | d2fb04 | score | gen | 86.33 |
| NC-I2S-0shot-instruct | ead200 | score | gen | 1.17 |
| NC-S2F-0shot-instruct | 989e6e | score | gen | 71.40 |
| NC-S2I-0shot-instruct | fb7430 | score | gen | 5.68 |
| PP-ESOL-0shot-instruct | 9cf92f | score | gen | 0.76 |
| PP-Lipo-0shot-instruct | 6af51f | score | gen | 0.81 |
| PP-BBBP-0shot-instruct | 5376b4 | accuracy | gen | 83.76 |
| PP-ClinTox-0shot-instruct | b94c41 | accuracy | gen | 40.97 |
| PP-HIV-0shot-instruct | fcbfe8 | accuracy | gen | 55.86 |
| PP-SIDER-0shot-instruct | b5e48e | accuracy | gen | 65.80 |
| MC-0shot-instruct | 35dbf4 | score | gen | 0.21 |
| MG-0shot-instruct | 30c630 | score | gen | 58.93 |
| FS-0shot-instruct | fe206a | score | gen | 72.48 |
| RS-0shot-instruct | bafb38 | score | gen | 44.76 |
|  | - | - | - | - |
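
As a point of comparison, here is a minimal sketch of one possible aggregation: a plain unweighted mean over the 14 subset scores in the table above. This is only an assumption, not the leaderboard's documented method. The subsets report different metrics on different scales, so a simple mean may not be meaningful. The dictionary name `subset_scores` is hypothetical; the values are copied from the table.

```python
from statistics import mean

# Subset scores copied from the summary table above.
# Assumption: every subset is weighted equally and no metric is rescaled.
subset_scores = {
    "NC-I2F-0shot-instruct": 86.33,
    "NC-I2S-0shot-instruct": 1.17,
    "NC-S2F-0shot-instruct": 71.40,
    "NC-S2I-0shot-instruct": 5.68,
    "PP-ESOL-0shot-instruct": 0.76,
    "PP-Lipo-0shot-instruct": 0.81,
    "PP-BBBP-0shot-instruct": 83.76,
    "PP-ClinTox-0shot-instruct": 40.97,
    "PP-HIV-0shot-instruct": 55.86,
    "PP-SIDER-0shot-instruct": 65.80,
    "MC-0shot-instruct": 0.21,
    "MG-0shot-instruct": 58.93,
    "FS-0shot-instruct": 72.48,
    "RS-0shot-instruct": 44.76,
}

# Unweighted arithmetic mean across all subsets.
naive_mean = mean(subset_scores.values())
print(f"Naive mean over {len(subset_scores)} subsets: {naive_mean:.2f}")
```

For the table above this gives about 42.07, well below the 74.8 on the leaderboard, so the leaderboard presumably uses a different aggregation or normalization.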
