|
15 | 15 | "source": [
|
16 | 16 | "This example shows how to repeatedly improve consensus labels established from data labeled by multiple annotators by iterating the following steps: (1) train a model on the current consensus labels, (2) leverage the model's predictions to obtain superior consensus labels that can be used to subsequently train a better model in the next round. In each round, consensus labels are established using the [CROWDLAB algorithm](https://cleanlab.github.io/multiannotator-benchmarks/paper.pdf), for which a quickstart tutorial is available in the [cleanlab documentation](https://docs.cleanlab.ai/stable/tutorials/multiannotator.html). \n",
|
17 | 17 | "\n",
|
18 |
| - "Here we demonstrate this functionality using a subset of the [CIFAR-10H](https://github.com/jcpeterson/cifar-10h) dataset from Peterson et al. (2019), in which multiple human annotators were asked to suggest labels for images from the famous [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) image classification dataset.\n", |
| 18 | + "Here we demonstrate this functionality using a variant of the [CIFAR-10](https://github.com/jcpeterson/cifar-10h) dataset from Peterson et al. (2019), in which multiple human annotators were asked to suggest labels for images from this famous image classification dataset.\n", |
19 | 19 | "Because this notebook utilizes AutoML for model training and cleanlab is compatible with any model/dataset, you should be able to run the below code with any image classification dataset where each image has been labeled by multiple annotators."
|
20 | 20 | ]
|
21 | 21 | },
|
|
76 | 76 | "# Here is an alternative command to download the data from the source:\n",
|
77 | 77 | "# cifar2png cifar10 ./data/cifar10_test --name-with-batch-index\n",
|
78 | 78 | "\n",
|
79 |
| - "# Import CIFAR-10h labels and image paths\n", |
| 79 | + "# Import CIFAR-10 multi-annotator labels and image paths\n", |
80 | 80 | "!cd $experiment_path && wget -nc 'https://cleanlab-public.s3.amazonaws.com/Multiannotator/cifar-10h/cifar-10h-worst25-coin20/c10h_labels_worst25_coin20.npy'\n",
|
81 | 81 | "!cd $experiment_path && wget -nc 'https://cleanlab-public.s3.amazonaws.com/Multiannotator/cifar-10h/cifar-10h-worst25-coin20/c10h_image_paths.npy'\n",
|
82 | 82 | "!cd $experiment_path && wget -nc 'https://cleanlab-public.s3.amazonaws.com/Multiannotator/cifar-10h/cifar-10h-worst25-coin20/c10h_test_labels.npy'"
|
|
88 | 88 | "metadata": {},
|
89 | 89 | "source": [
|
90 | 90 | "## Load labels selected by each annotator\n",
|
91 |
| - "Here `multiannotator_labels` is a subset of `CIFAR-10H`, where each image has been labeled by one or more annotators, but not every annotator has labeled every image." |
| 91 | + "Here `multiannotator_labels` contains labels from multiple annotators for each CIFAR-10 image. Each image has been labeled by one or more annotators, but not every annotator has labeled every image." |
92 | 92 | ]
|
93 | 93 | },
|
94 | 94 | {
|
|