Merge pull request #19 from Heterod0x/vapi_hackathon_frontend

mashharuki · web-flow · commit 8545815c64ca · 2025-06-14T21:12:19.000+09:00
Vapi hackathon frontend
diff --git a/README.md b/README.md
@@ -1,24 +1,30 @@
 # Oto
+
 ![Oto Thumbnail](./docs/images/thumbnail.png)
 
 [![Netlify Status](https://api.netlify.com/api/v1/badges/d3c17ad7-6bee-48d3-abe2-cb239051aa5a/deploy-status)](https://app.netlify.com/projects/oto-evm/deploys)
 
 ## Vision
+
 Turning the world's conversations into data
 
 ## Overview
+
 There is almost no data of real-life conversations on the internet. This means speech-AI training data is drastically scarcer than text—something we have verified empirically. oto is a project that pairs a wearable voice-capture device with a smartphone app to turn daily conversations around the world into structured data. For speakers of major languages, oto unlocks personalized services—automatic task management, meeting notes, health insights. For under-represented languages and heavy accents, users can monetize their uploads by licensing data to AI firms. These incentives let us map global conversation flow, creating a speech-based Google Trends or Maps.
 
 ## The problem oto solves
+
 There is a global shortage of voice data for AI training.
+
 - Out of approximately 7,000 languages worldwide, voice AI supports only around 150—meaning 98% of languages remain unsupported.
 - Even in major languages like English, speech models still perform poorly with accents and dialects.
 - Voice AI systems are still unable to engage in human-level natural conversation.
-All of these limitations stem from a fundamental lack of high-quality, diverse training data.
-One notable initiative is Mozilla Common Voice, which treats voice as a public good. However, it still falls short in terms of dataset volume and diversity.
-We aim to address this problem by building on the public-good model and introducing DePIN-style token incentives to accelerate the creation and sharing of diverse, real-world voice data at scale.
+  All of these limitations stem from a fundamental lack of high-quality, diverse training data.
+  One notable initiative is Mozilla Common Voice, which treats voice as a public good. However, it still falls short in terms of dataset volume and diversity.
+  We aim to address this problem by building on the public-good model and introducing DePIN-style token incentives to accelerate the creation and sharing of diverse, real-world voice data at scale.
 
 ## Pitch Slide
+
 https://www.figma.com/slides/zENm8UTvypmVpUscp14Imc/oto---Pitch-Deck?node-id=5-45&t=3RG8vMWEwdsLl8zv-0
 
 ## Product Page (Colosseum)
@@ -37,6 +43,10 @@ https://www.figma.com/slides/zENm8UTvypmVpUscp14Imc/oto---Pitch-Deck?node-id=5-4
 
 [https://oto-evm.netlify.app/](https://oto-evm.netlify.app/)
 
+## Live Demo for VAPI Hackathon
+
+[https://heterod0x.github.io/oto/](https://heterod0x.github.io/oto/)
+
 ## Deployed Contract
 
 [Solscan - otoUzj3eLyJXSkB4DmfGR63eHBMQ9tqPHJaGX8ySSsY](https://solscan.io/account/otoUzj3eLyJXSkB4DmfGR63eHBMQ9tqPHJaGX8ySSsY?cluster=devnets)
diff --git a/frontend_vapi/lib/oto-api.ts b/frontend_vapi/lib/oto-api.ts
@@ -812,7 +812,7 @@ export async function getConversationDetail(
   try {
     const cleanApiKey = apiKey.replace(/^Bearer\s+/i, "");
 
-    const response = await fetch(`${apiEndpoint}/conversation/${conversationId}`, {
+    const response = await fetch(`${apiEndpoint}/conversation/${conversationId}/transcript`, {
       method: "GET",
       headers: {
         Authorization: `Bearer ${cleanApiKey}`,
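
Review note: the hunk in `oto-api.ts` points `getConversationDetail` at the `/transcript` sub-resource while keeping the API key normalization. A minimal sketch of that request construction as a pure function can make the change easier to unit-test. The helper name `buildTranscriptRequest`, the trailing-slash trimming, and the `encodeURIComponent` call are assumptions for illustration and are not part of this diff.

```typescript
// Hypothetical helper (not in the PR): builds the URL and fetch options
// that the updated getConversationDetail appears to use.
interface TranscriptRequest {
  url: string;
  init: { method: "GET"; headers: Record<string, string> };
}

function buildTranscriptRequest(
  apiEndpoint: string,
  apiKey: string,
  conversationId: string
): TranscriptRequest {
  // Same normalization as the diff: drop a leading "Bearer " if the caller included it.
  const cleanApiKey = apiKey.replace(/^Bearer\s+/i, "");
  // Assumption: tolerate a trailing slash on the endpoint to avoid "//conversation".
  const base = apiEndpoint.replace(/\/+$/, "");
  return {
    // Assumption: encode the id so unusual characters cannot alter the path.
    url: `${base}/conversation/${encodeURIComponent(conversationId)}/transcript`,
    init: {
      method: "GET",
      headers: { Authorization: `Bearer ${cleanApiKey}` },
    },
  };
}
```

With a helper like this, the call site would reduce to `fetch(req.url, req.init)`, and the path change in this PR could be covered by a test that never touches the network.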
diff --git a/frontend_vapi/pages/record.tsx b/frontend_vapi/pages/record.tsx
@@ -707,8 +707,8 @@ export default function RecordPage() {
             if (wsState === WebSocket.OPEN) {
               try {
                 console.log(`🎤 Sending audio chunk (${event.data.size} bytes) - WebSocket state: ${wsState}`);
-                // Try binary mode first (raw audio data)
-                sendRealtimeAudioData(websocketRef.current, event.data, true);
+                // Send audio data in JSON format (not binary) for server compatibility
+                sendRealtimeAudioData(websocketRef.current, event.data, false);
               } catch (error) {
                 console.error("❌ Failed to send audio data:", error);
               }
@@ -802,17 +802,14 @@ export default function RecordPage() {
     console.log(`🎤 Streaming status changed: ${isStreaming}`);
   }, [isStreaming]);
 
-  // Cleanup
+  // Cleanup - only run on component unmount
   useEffect(() => {
     return () => {
       if (isStreaming) {
         stopAudioStreaming();
       }
-      if (websocketRef.current) {
-        websocketRef.current.close();
-      }
     };
-  }, [isStreaming, stopAudioStreaming]);
+  }, [stopAudioStreaming]); // Removed isStreaming from dependencies to prevent cleanup on state change
 
   if (!authenticated) {
     return null;
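
Review note: with `isStreaming` removed from the dependency array, the cleanup in `record.tsx` closes over whatever value `isStreaming` had when the effect last ran, so it may read a stale `false` and skip `stopAudioStreaming()` at unmount. One common way around this is to mirror the state into a ref and read the ref inside the cleanup. The sketch below models that idea without React so it can run on its own; `Ref`, `makeClosureCleanup`, and `makeRefCleanup` are hypothetical names, and in the component the ref would presumably come from `useRef` and be updated whenever `isStreaming` changes.

```typescript
// Stand-in for the object returned by React's useRef (assumption for this sketch).
type Ref<T> = { current: T };

// Models the cleanup as written in the diff: the flag is captured by value
// when the effect is set up, so later changes are not seen.
function makeClosureCleanup(isStreaming: boolean, stop: () => void): () => void {
  return () => {
    if (isStreaming) {
      stop();
    }
  };
}

// Models the ref-based alternative: the flag is read at cleanup time,
// so the latest value decides whether streaming is stopped.
function makeRefCleanup(isStreamingRef: Ref<boolean>, stop: () => void): () => void {
  return () => {
    if (isStreamingRef.current) {
      stop();
    }
  };
}
```

If the ref approach were adopted, the effect could use an empty dependency array (with `stopAudioStreaming` also held in a ref), which would match the "only run on component unmount" intent stated in the comment.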