Releases · veselink1/sponsorml-ytcaptions

[
  {
    "video_id": "---jcia5ufM",
    "captions": [
      ["so you've decided to get a new dog", 2.149, 4.22],
      ["congratulations that's a huge decision", 4.23, 6.59],
      /* ... */
    ],
    "sponsor_times": [
      [41.05, 52.56]
    ]
  },
  /* ... */
]

Use the data_loader.py from the tagged commit 44e2cdc.
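For readers who only want to inspect the records, the layout shown above can also be read with the standard library. The following is a minimal sketch, not the repository's data_loader.py: the helper name `label_captions` is hypothetical, and it assumes that each caption is a `[text, start, end]` triple, that `sponsor_times` holds `[start, end]` pairs in the same time unit, and that the `/* ... */` markers in the sample are elisions that do not appear in the actual files.

```python
import json


def label_captions(record):
    """Return (text, is_sponsor) pairs for one record.

    Hypothetical helper: a caption counts as sponsored when its time
    interval overlaps any interval listed in "sponsor_times".
    """
    labelled = []
    for text, start, end in record["captions"]:
        is_sponsor = any(
            start < s_end and end > s_start
            for s_start, s_end in record["sponsor_times"]
        )
        labelled.append((text, is_sponsor))
    return labelled


# The sample record from the release notes, with the elisions removed.
sample = json.loads("""
[
  {
    "video_id": "---jcia5ufM",
    "captions": [
      ["so you've decided to get a new dog", 2.149, 4.22],
      ["congratulations that's a huge decision", 4.23, 6.59]
    ],
    "sponsor_times": [[41.05, 52.56]]
  }
]
""")

for record in sample:
    print(record["video_id"], label_captions(record))
```

Both sample captions end before the sponsor range starts at 41.05, so neither is flagged.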

Assets 18

02 May 07:59

veselink1

preprocess-v2

717cd74

Preprocessed data (v2)

Dataset created with 11dc381.

[
  {
    "videoID": "---jcia5ufM",
    "captions": [
      {
        "end": 2.139,
        "start": 0.03,
        "text": ""
      },
      {
        "end": 4.22,
        "start": 2.149,
        "text": "so you've decided to get a new dog\n"
      },
      {
        "end": 6.59,
        "start": 4.23,
        "text": "congratulations that's a huge decision\n"
      },
      /* ... */
    ],
    "sponsor_ranges": [
      [
        41.05,
        52.56
      ]
    ]
  },
  /* ... */
]
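The v2 records use different key names and caption objects compared with the compact list layout shown at the top of the page. The sketch below maps one layout onto the other. It is an illustration only: the function name `to_compact` is hypothetical, and dropping empty captions and stripping the trailing newline are assumptions about what a conversion might do, not a description of the author's preprocessing.

```python
def to_compact(record):
    """Convert a v2-style record to the compact [text, start, end] layout.

    Assumptions (not confirmed by the release notes): captions with
    empty text are dropped and trailing newlines are stripped.
    """
    captions = [
        [caption["text"].strip(), caption["start"], caption["end"]]
        for caption in record["captions"]
        if caption["text"].strip()
    ]
    return {
        "video_id": record["videoID"],
        "captions": captions,
        "sponsor_times": [list(r) for r in record["sponsor_ranges"]],
    }


# The v2 sample from the release notes, with the elisions removed.
v2_record = {
    "videoID": "---jcia5ufM",
    "captions": [
        {"end": 2.139, "start": 0.03, "text": ""},
        {"end": 4.22, "start": 2.149,
         "text": "so you've decided to get a new dog\n"},
        {"end": 6.59, "start": 4.23,
         "text": "congratulations that's a huge decision\n"},
    ],
    "sponsor_ranges": [[41.05, 52.56]],
}

compact = to_compact(v2_record)
print(compact)
```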

Assets 34

30 Apr 09:18

veselink1

preprocess-v1

1c4d77b

Preprocessed data (Pre-release)

videoID	transcript	sponsorText	sponsorTokenRange
---jcia5ufM	welcome back to ...	sponsor for today's video ...	(406, 589)
...	...	...	...

Each file contains 10,000 records.
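A file in this layout can be read with Python's `csv` module. The sketch below assumes the files are tab-separated with the header row shown above and that `sponsorTokenRange` is stored as a literal such as `(406, 589)`; the function name `read_records` is hypothetical, and what the two token indices refer to is not stated in the release notes, so the code only parses them.

```python
import ast
import csv
import io


def read_records(handle):
    """Yield one dict per row, with sponsorTokenRange parsed to a tuple.

    Assumes tab-separated input with a header row. QUOTE_NONE keeps
    quote characters in transcripts from being treated as field quoting.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        start, end = ast.literal_eval(row["sponsorTokenRange"])
        row["sponsorTokenRange"] = (start, end)
        yield row


# In-memory stand-in for one file, built from the sample row above.
sample = io.StringIO(
    "videoID\ttranscript\tsponsorText\tsponsorTokenRange\n"
    "---jcia5ufM\twelcome back to ...\tsponsor for today's video ...\t(406, 589)\n"
)

rows = list(read_records(sample))
print(rows[0]["videoID"], rows[0]["sponsorTokenRange"])
```

Very long transcripts may exceed the `csv` module's default field size limit; if that happens, `csv.field_size_limit` can raise it.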

Assets 34

Releases:

distilbert-span-extraction-uncased

distilbert-classification-uncased + dataset

Preprocessed data (v3)

Preprocessed data (v2)

Preprocessed data