[New Task] Add ASSIN2 (Brazilian Portuguese) — RTE and STS subtasks #3765

Description

@lucasrabay

Summary

This proposes adding ASSIN2 to the harness — a standard Brazilian Portuguese benchmark covering two subtasks:

  • ASSIN2-RTE: Recognizing Textual Entailment (6,500 train / 500 validation / 2,448 test sentence pairs, F1-macro metric)
  • ASSIN2-STS: Semantic Textual Similarity (same splits, Pearson correlation metric)
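
For reference, the two metrics named above can be sketched in plain Python. This is only an illustration of what F1-macro and Pearson correlation compute, not the harness's own implementation; the function names `macro_f1` and `pearson` are hypothetical, and returning 0.0 for a constant input in `pearson` is an assumption made here to avoid a division by zero.

```python
from math import sqrt


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: treat the undefined case (constant input) as 0.0
        return 0.0
    return cov / sqrt(var_x * var_y)
```

In practice a task would more likely rely on an existing library implementation; the sketch only makes explicit that F1-macro weights both entailment classes equally regardless of their frequency.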

Dataset available at nilc-nlp/assin2 on Hugging Face.

Motivation

ASSIN2 is the successor to ASSIN v1 (already in portuguese_bench) and has been one of the most widely used Brazilian Portuguese NLP benchmarks since 2020. Adding it would extend the existing portuguese_bench task group with a more recent evaluation suite and complement the recently proposed ifeval_pt (#3622).

Implementation plan

  • Two task configs: assin2_rte and assin2_sts under portuguese_bench
  • Dataset: nilc-nlp/assin2
  • RTE: few-shot multiple choice, F1-macro
  • STS: generation, Pearson correlation
  • Happy to implement if maintainers are interested
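
As a rough sketch of the helper functions the two tasks might need, the snippet below formats an RTE prompt and parses a generated STS score. Everything here is an assumption for illustration: the function names are hypothetical, the prompt wording is a placeholder, the field names `premise` and `hypothesis` are assumed to match the dataset's columns, and the 1 to 5 clipping range is assumed from the ASSIN2 similarity scale.

```python
import re


def rte_doc_to_text(doc):
    """Build a prompt for one RTE example (placeholder wording)."""
    # Assumption: the dataset exposes 'premise' and 'hypothesis' columns
    return (
        f"Premissa: {doc['premise']}\n"
        f"Hipótese: {doc['hypothesis']}\n"
        "A premissa implica a hipótese? Resposta:"
    )


def parse_sts_score(generation, low=1.0, high=5.0):
    """Extract the first number from a model generation and clip it to [low, high].

    Returns None when no number is found, so the caller can decide how to
    score unparseable outputs.
    """
    match = re.search(r"\d+(?:[.,]\d+)?", generation)
    if match is None:
        return None
    # Accept a decimal comma, which is common in Portuguese text
    value = float(match.group(0).replace(",", "."))
    return min(max(value, low), high)
```

For example, `parse_sts_score("Nota: 3,5")` yields `3.5`, and an out-of-range answer such as `"7"` is clipped to `5.0`. How unparseable generations should count toward the Pearson score is a design choice left open here.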

References

  • Dataset: huggingface.co/datasets/nilc-nlp/assin2
  • Paper: Real et al., 2020, "The ASSIN 2 Shared Task: A Quick Overview"
