diff --git a/docs/autolabel/guide/overview/getting-started.md b/docs/autolabel/guide/overview/getting-started.md
index b50aa6c..a100798 100644
--- a/docs/autolabel/guide/overview/getting-started.md
+++ b/docs/autolabel/guide/overview/getting-started.md
@@ -28,7 +28,7 @@ Let's say we wanted to run sentiment analysis on a dataset of movie reviews. We
 
 Now, we could label a few hundred examples by hand which would take us a few hours. Instead, let's use Autolabel to get a clean, labeled dataset in a few minutes.
 
-A dataset[^1] containing 200 unlabeled movie reviews is available [here](https://github.com/refuel-ai/autolabel/blob/main/docs/assets/movie_reviews_preview.csv), and a couple of examples (with labels) are shown below:
+A dataset[^1] containing 200 unlabeled movie reviews is available [here](https://github.com/refuel-ai/autolabel/blob/main/docs/assets/movie_reviews.csv), and a couple of examples (with labels) are shown below:
 
 {{ read_csv('docs/assets/movie_reviews_preview.csv') }}
 
@@ -89,7 +89,7 @@ config = {
 }
 ```
 
-*To create a custom configuration, you can use the [CLI](https://docs.refuel.ai/guide/resources/CLI) or [write your own](https://docs.refuel.ai/guide/resources/configs/).*
+*To create a custom configuration, you can use the [CLI](../resources/CLI.md) or [write your own](../resources/configs.md).*
 
 ### Preview the labeling against your dataset
 
diff --git a/docs/autolabel/guide/overview/tutorial-classification.md b/docs/autolabel/guide/overview/tutorial-classification.md
index 8a1a0dd..855bd73 100644
--- a/docs/autolabel/guide/overview/tutorial-classification.md
+++ b/docs/autolabel/guide/overview/tutorial-classification.md
@@ -23,9 +23,9 @@ get_data('civil_comments')
 The output is:
 
 ```
-Downloading seed example dataset to "seed.csv"...
+Downloading seed example dataset to "data/civil-comments/seed.csv"...
 100% [..............................................................................] 65757 / 65757
-Downloading test dataset to "test.csv"...
+Downloading test dataset to "data/civil-comments/test.csv"...
 100% [............................................................................] 610663 / 610663
 ```
 
@@ -73,14 +73,14 @@ config = {
     }
 }
 ```
-*To create a custom configuration, you can use the [CLI](https://docs.refuel.ai/guide/resources/CLI) or [write your own](https://docs.refuel.ai/guide/resources/configs).*
+*To create a custom configuration, you can use the [CLI](../resources/CLI.md) or [write your own](../resources/configs.md).*
 
 Now, we do the dry-run with `agent.plan`:
 
 ```python
 from autolabel import LabelingAgent, AutolabelDataset
 agent = LabelingAgent(config)
-ds = AutolabelDataset('test.csv', config = config)
+ds = AutolabelDataset('data/civil-comments/test.csv', config = config)
 agent.plan(ds)
 ```
 
diff --git a/docs/autolabel/guide/resources/refuel_datasets.md b/docs/autolabel/guide/resources/refuel_datasets.md
index 10ddb7a..379f6ed 100644
--- a/docs/autolabel/guide/resources/refuel_datasets.md
+++ b/docs/autolabel/guide/resources/refuel_datasets.md
@@ -5,6 +5,7 @@ Autolabel provides datasets out-of-the-box so you can easily get started with LL
 | banking | Classification |
 | civil_comments | Classification |
 | ledgar | Classification |
+| movie_reviews | Classification |
 | walmart_amazon | Entity Matching |
 | company | Entity Matching |
 | squad_v2 | Question Answering |
@@ -14,14 +15,14 @@ Autolabel provides datasets out-of-the-box so you can easily get started with LL
 
 ## Downloading any dataset
 
-To download a specific dataset, such as `squad_v2`, run:
+To download a specific dataset, such as `civil_comments`, run:
 
 ```python
 from autolabel import get_data
 
 get_data('civil_comments')
-> Downloading seed example dataset to "seed.csv"...
+> Downloading seed example dataset to "data/civil_comments/seed.csv"...
 > 100% [..............................................................................] 65757 / 65757
-> Downloading test dataset to "test.csv"...
+> Downloading test dataset to "data/civil_comments/test.csv"...
 > 100% [............................................................................] 610663 / 610663
 ```
\ No newline at end of file