Organizing Existing Customized Functionality #388

Open
@pfliu-nlp

Below is an overview of the existing customization features in the ExplainaBoard SDK.
Some points to discuss:

  • Should we disable 2? (The downside is that users with customized datasets could no longer define customized features.)
  • Regarding 4, what would be a good way to introduce customized functions?

1. Customized Dataset

  • Users can analyze their system outputs on their own customized datasets by specifying the file formats (e.g., tsv, json).
  • Example
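        loader = get_loader_class(TaskType.text_classification)(
            load_file_as_str(self.tsv_dataset),  # dataset contents
            load_file_as_str(self.txt_output),   # system output contents
            Source.in_memory,                    # dataset source
            Source.in_memory,                    # system output source
            FileType.tsv,                        # dataset file type
            FileType.text,                       # system output file type
        )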

2. Customized features through system_output file

  • Users can define customized features and provide their values by adding the relevant information to their system output files.
  • This works for all data loaders.
  • Example: a system output file with customized features
{
  "metadata": {
    "custom_features": {
      "rel_type": {
        "dtype": "string",
        "description": "symmetric or asymmetric",
        "num_buckets": 2
      }
    }
  },
  "examples": [
    {
      "gold_head": "/m/08966",
      "gold_predicate": "/travel/travel_destination/climate./travel/travel_destination_monthly_climate/month",
      "gold_tail": "/m/05lf_",
      "predict": "tail",
      "predictions": [
        "/m/05lf_",
        "/m/02x_y",
        "/m/01nv4h",
        "/m/02l6h",
        "/m/0kz1h"
      ],
      "rel_type": "asymmetric",
      "example_id": "1"
    },
  ...
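
To make the file layout above concrete, here is a minimal sketch of a consistency check: every feature declared under `metadata.custom_features` should appear in every entry of `examples`. This is not ExplainaBoard's own loader logic; the function name `check_custom_features` is hypothetical, and the second example record (with no `rel_type`) is invented purely to show a failure.

```python
import json
from typing import List


def check_custom_features(system_output: dict) -> List[str]:
    """Return a list of problems; an empty list means the file is consistent.

    Hypothetical helper, not part of the ExplainaBoard SDK.
    """
    declared = system_output.get("metadata", {}).get("custom_features", {})
    problems = []
    for i, example in enumerate(system_output.get("examples", [])):
        for name in declared:
            if name not in example:
                problems.append(f"example {i}: missing custom feature '{name}'")
    return problems


# Abbreviated version of the file above; the second record is made up
# for illustration and deliberately omits "rel_type".
raw = """
{
  "metadata": {
    "custom_features": {
      "rel_type": {
        "dtype": "string",
        "description": "symmetric or asymmetric",
        "num_buckets": 2
      }
    }
  },
  "examples": [
    {"gold_head": "/m/08966", "predict": "tail", "rel_type": "asymmetric", "example_id": "1"},
    {"gold_head": "/m/08966", "predict": "tail", "example_id": "2"}
  ]
}
"""

problems = check_custom_features(json.loads(raw))
print(problems)
```

A check like this could run before analysis so that a missing feature value is reported up front rather than surfacing later during bucketing.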

3. Customized features through an additional config file

  • Users can introduce customized features by specifying the feature name in an external global config file.
  • Note that this is only valid when the DataLab loader is used.
  • Example
{
  "sst2": {
    "custom_features": {
      "example": {
        "label": {
          "dtype": "string",
          "description": "the true label"
        }
      }
    },
...
}
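
As a rough illustration of how such a config could be consumed, the sketch below looks up the customized features declared for one dataset. It assumes only the nesting shown in the example (dataset name, then `custom_features`, then a level such as `example`); the helper name `custom_features_for` is hypothetical and not an SDK function.

```python
import json


def custom_features_for(config: dict, dataset_name: str, level: str = "example") -> dict:
    """Return the custom feature declarations for a dataset, or {} if none.

    Hypothetical helper; assumes the nesting used in the example config.
    """
    return config.get(dataset_name, {}).get("custom_features", {}).get(level, {})


# Only the part of the config that is visible in the example above.
config = json.loads("""
{
  "sst2": {
    "custom_features": {
      "example": {
        "label": {"dtype": "string", "description": "the true label"}
      }
    }
  }
}
""")

features = custom_features_for(config, "sst2")
for name, spec in features.items():
    print(name, spec["dtype"], spec["description"])
```

Datasets that are absent from the config simply yield an empty dictionary, so no customized features are added for them.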

4. Customized feature function through an additional config file

  • Users can introduce customized feature functions by specifying a function as a string in an external global config file.
  • Note that this is only valid when the DataLab loader is used.
  • Example
{
  "sst2": {
    "label": {
      "dtype": "string",
      "description": "the true label",
      "num_buckets": 2
    },
    "text_len": {
      "dtype": "float",
      "description": "text length",
      "num_buckets": 4,
      "func": "lambda x:len(x['text'].split())"
    },
    "long_text": {
      "dtype": "string",
      "description": "whether a text is long",
      "num_buckets": 2,
      "func": "lambda x:'Long Text' if len(x['text'].split()) > 20 else 'Short Text'"
    }
...
}
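
For the discussion of point 4, it may help to see what evaluating these string-style functions involves. The sketch below is one possible approach, not the SDK's actual implementation: it turns each `func` string into a callable with `eval` and applies it to a sample. The helper name `compile_feature_funcs` and the sample text are assumptions made for illustration.

```python
def compile_feature_funcs(dataset_config: dict) -> dict:
    """Map feature name -> callable for every feature that declares "func".

    Hypothetical helper. eval() executes arbitrary code, so this is only
    acceptable for config files from a trusted source; limiting the
    builtins as done here is not a real sandbox.
    """
    funcs = {}
    for name, spec in dataset_config.items():
        source = spec.get("func")
        if source is None:
            continue  # no function declared: the value comes from the data
        func = eval(source, {"__builtins__": {"len": len}})
        if not callable(func):
            raise TypeError(f"'func' for feature '{name}' is not callable")
        funcs[name] = func
    return funcs


# The "sst2" entry from the example config above.
sst2_config = {
    "label": {"dtype": "string", "description": "the true label", "num_buckets": 2},
    "text_len": {
        "dtype": "float",
        "description": "text length",
        "num_buckets": 4,
        "func": "lambda x:len(x['text'].split())",
    },
    "long_text": {
        "dtype": "string",
        "description": "whether a text is long",
        "num_buckets": 2,
        "func": "lambda x:'Long Text' if len(x['text'].split()) > 20 else 'Short Text'",
    },
}

funcs = compile_feature_funcs(sst2_config)
sample = {"text": "a short and simple review"}  # made-up sample
values = {name: func(sample) for name, func in funcs.items()}
print(values)
```

Because the functions arrive as strings, any mechanism along these lines has to execute code taken from the config file, which is a relevant trade-off when deciding how customized functions should be introduced.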
