Almost no data from real-life conversations exists on the internet, so training data for speech AI is drastically scarcer than text data, something we have verified empirically. oto is a project that pairs a wearable voice-capture device with a smartphone app to turn daily conversations around the world into structured data. For speakers of major languages, oto unlocks personalized services such as automatic task management, meeting notes, and health insights. Speakers of under-represented languages and heavy accents can monetize their uploads by licensing data to AI firms. These incentives let us map the global flow of conversation, creating a speech-based Google Trends or Maps.

## The problem oto solves

There is a global shortage of voice data for AI training.

- Of the approximately 7,000 languages worldwide, voice AI supports only around 150, meaning roughly 98% of languages remain unsupported.
- Even in major languages like English, speech models still perform poorly with accents and dialects.
- Voice AI systems are still unable to engage in human-level natural conversation.

All of these limitations stem from a fundamental lack of high-quality, diverse training data.

One notable initiative is Mozilla Common Voice, which treats voice as a public good. However, it still falls short in terms of dataset volume and diversity.

We aim to address this problem by building on the public-good model and introducing DePIN-style token incentives to accelerate the creation and sharing of diverse, real-world voice data at scale.