feat: adds streamingV2 sample for speech #10079

Draft: wants to merge 7 commits into base: main
Binary file added speech/resources/brooklyn_bridge.wav
Binary file not shown.
112 changes: 112 additions & 0 deletions speech/src/main/java/com/example/speech/TranscribeStreamingV2.java
@@ -0,0 +1,112 @@
/*
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.example.speech;

// [START speech_to_text_transcribe_streaming_v2]

import com.google.api.gax.rpc.BidiStream;
import com.google.cloud.speech.v2.*;
Contributor
issue: import each type individually -- don't use star imports.

import com.google.protobuf.ByteString;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class TranscribeStreamingV2 {
private static final String PROJECT_ID = System.getenv("GOOGLE_CLOUD_PROJECT");
Contributor
issue: pass project ID into the sample function. Look at other samples for how to do this.


// Transcribes audio from an audio file stream using Google Cloud Speech-to-Text API.
Contributor
issue: fix the formatting on these comments.

// Args:
// streamFile (String): Path to the local audio file to be transcribed.
// Example: "resources/audio.wav"
// Returns:
// List<StreamingRecognizeResponse>: A list of objects.
// Each response includes the transcription results for the corresponding audio segment.
//
Comment on lines +36 to +43

medium

The Javadoc comments are incomplete. Please provide a comprehensive description of the method, its arguments, and return values. The Args and Returns sections are commented out. A complete Javadoc is important for understanding and maintaining the code.

    /**
     * Transcribes audio from an audio file stream using Google Cloud Speech-to-Text API.
     *
     * @param streamFile (String): Path to the local audio file to be transcribed.
     *                   Example: "resources/audio.wav"
     * @return List<StreamingRecognizeResponse>: A list of objects.
     *         Each response includes the transcription results for the corresponding audio segment.
     * @throws IOException if an I/O error occurs reading the file.
     */

Contributor
+1.

The comments as written follow Python docstring conventions (Args/Returns); Javadoc uses a /** ... */ block with @param, @return, and @throws tags instead.

public static List<StreamingRecognizeResponse> transcribeStreamingV2(String streamFile) throws IOException {
Contributor
issue: provide a main() method that invokes the sample method.

List<StreamingRecognizeResponse> responses = new ArrayList<>();
// Instantiates a client
try (SpeechClient client = SpeechClient.create()) {

Path path = Paths.get(streamFile);
byte[] audioContent = Files.readAllBytes(path);

high

Consider adding error handling for the Files.readAllBytes method. If the file does not exist or cannot be read, an IOException will be thrown, and the program will crash. Wrapping this in a try-catch block will make the code more robust.

What happens if the file does not exist?

            try {
                audioContent = Files.readAllBytes(path);
            } catch (IOException e) {
                System.err.println("Failed to read audio file: " + e.getMessage());
                return null; // Or throw the exception, depending on desired behavior
            }
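An alternative to returning null, sketched under the assumption that callers would rather see a failure than a null list: let the IOException propagate, but attach the path so the cause is obvious. `AudioFileReader` and `readAudio` are hypothetical names used only for illustration; they are not part of the PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical helper, not part of the PR.
final class AudioFileReader {
  // Reads the whole file, or throws an IOException that names the missing path.
  static byte[] readAudio(Path path) throws IOException {
    try {
      return Files.readAllBytes(path);
    } catch (NoSuchFileException e) {
      // Keep the original exception as the cause so the stack trace is not lost.
      throw new IOException("Audio file not found: " + path, e);
    }
  }
}
```

Because the sample method already declares `throws IOException`, this keeps its signature unchanged and avoids a null return that every caller would have to check.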


// In practice, stream should be a generator yielding chunks of audio data
Contributor
issue: fix the formatting on this comment.

int chunkLength = audioContent.length / 5;

medium

The value 5 is a magic number. Define a constant with a descriptive name to represent the number of chunks. This improves readability and maintainability.

            int numChunks = 5; // Define the number of chunks
            int chunkLength = audioContent.length / numChunks;

Contributor
+1.

question: why use 5? Provide a comment explaining why the number must be 5 and make this value a constant.
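A rough sketch of the chunking with a named constant, assuming the count exists only to simulate a stream from a file. `AudioChunker`, `NUM_CHUNKS`, and `split` are hypothetical names. The sketch uses ceiling division, which is a deliberate change from the PR's `audioContent.length / 5`: with plain integer division, an input shorter than five bytes gives a chunk length of 0 and the loop never advances, and a length that is not a multiple of five produces a sixth, short chunk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the PR.
final class AudioChunker {
  // Assumed to be arbitrary: the file is split only to imitate streamed audio.
  static final int NUM_CHUNKS = 5;

  // Splits audioContent into at most numChunks contiguous pieces.
  static List<byte[]> split(byte[] audioContent, int numChunks) {
    if (numChunks <= 0) {
      throw new IllegalArgumentException("numChunks must be positive");
    }
    List<byte[]> chunks = new ArrayList<>();
    if (audioContent.length == 0) {
      return chunks;
    }
    // Ceiling division keeps chunkLength at 1 or more for non-empty input.
    int chunkLength = (audioContent.length + numChunks - 1) / numChunks;
    for (int start = 0; start < audioContent.length; start += chunkLength) {
      int end = Math.min(start + chunkLength, audioContent.length);
      chunks.add(Arrays.copyOfRange(audioContent, start, end));
    }
    return chunks;
  }
}
```

`Arrays.copyOfRange` replaces the manual `new byte[...]` plus `System.arraycopy` pair without changing what each chunk contains.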

List<byte[]> stream = new ArrayList<>();
Contributor
issue: rename this variable to something like "chunks".

for (int i = 0; i < audioContent.length; i += chunkLength) {
int end = Math.min(i + chunkLength, audioContent.length);
byte[] chunk = new byte[end - i];
System.arraycopy(audioContent, i, chunk, 0, end - i);
stream.add(chunk);
}


List<StreamingRecognizeRequest> audioRequests = new ArrayList<>();
for (byte[] audio : stream) {
audioRequests.add(StreamingRecognizeRequest.newBuilder().setAudio(ByteString.copyFrom(audio)).build());
Contributor
issue: break the builder call out onto its own line. Pass the built object into the call to audioRequests.add().

}

RecognitionConfig recognitionConfig = RecognitionConfig.newBuilder()
.setAutoDecodingConfig(AutoDetectDecodingConfig.getDefaultInstance())
.addLanguageCodes("en-US")
.setModel("long")
.build();

StreamingRecognitionConfig streamingConfig = StreamingRecognitionConfig.newBuilder()
.setConfig(recognitionConfig)
.build();

StreamingRecognizeRequest configRequest = StreamingRecognizeRequest.newBuilder()
.setRecognizer(String.format("projects/%s/locations/global/recognizers/_", PROJECT_ID))

medium

The recognizer is set to _. Should this be a specific recognizer? If so, please specify it, otherwise it should be removed.

                    .setRecognizer(String.format("projects/%s/locations/global/recognizers/{YOUR_RECOGNIZER}", PROJECT_ID))

Contributor
+1

If the user needs to provide a specific recognizer by ID, pass the recognizer ID into the sample function.
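A minimal sketch of threading both IDs through as parameters instead of reading a static field. The resource-name format is the one already used in the diff; `RecognizerNames` and `recognizerName` are hypothetical names, and the `global` location is kept only because the PR hard-codes it.

```java
// Hypothetical helper, not part of the PR.
final class RecognizerNames {
  // Builds the recognizer resource name from caller-supplied IDs.
  static String recognizerName(String projectId, String recognizerId) {
    return String.format(
        "projects/%s/locations/global/recognizers/%s", projectId, recognizerId);
  }
}
```

Calling `recognizerName(projectId, "_")` reproduces the value the PR currently builds, so the sample could keep `_` as the argument passed from `main()` or from the test.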

.setStreamingConfig(streamingConfig)
.build();


List<StreamingRecognizeRequest> requests = new ArrayList<>();
requests.add(configRequest);
requests.addAll(audioRequests);

BidiStream<StreamingRecognizeRequest, StreamingRecognizeResponse> stream1 = client.streamingRecognizeCallable().call();
Contributor
issue: use a different name than stream1. I would consider renaming the other variable, stream, to something like chunks.

for (StreamingRecognizeRequest request : requests) {
stream1.send(request);
}
stream1.closeSend();

Iterator<StreamingRecognizeResponse> responseIterator = stream1.iterator();
while (responseIterator.hasNext()) {
StreamingRecognizeResponse response = responseIterator.next();
System.out.println(response);
// Process the response and extract the transcript
System.out.println("Transcript: " + response.getResultsList().get(0).getAlternativesList().get(0).getTranscript());
Contributor
issue: put individual method calls on separate lines for better readability

responses.add(response);
}

Contributor
issue: remove excess blank lines.


}
return responses;
}

public static void main(String[] args) throws IOException {
Contributor
issue: move main to top of file.

List<StreamingRecognizeResponse> responses = transcribeStreamingV2("./resources/brooklyn_bridge.wav");
Contributor
issue: use a variable and pass the resource path from the test.

}
}
// [END speech_to_text_transcribe_streaming_v2]
@@ -0,0 +1,54 @@
/*
* Copyright 2025 Google Inc.
Contributor
issue: 'Inc' should be 'LLC'.

*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.example.speech;
import com.google.cloud.speech.v2.StreamingRecognizeResponse;
import com.google.common.truth.Truth;
import org.junit.Test;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;

@RunWith(JUnit4.class)
public class TranscribeStreamingV2IT {

@Test
public void testTranscribeStreamingV2_Success() throws IOException {
// Create a dummy audio file for testing
String testFilePath = "./resources/brooklyn_bridge.wav";

// Call the method to test
List<StreamingRecognizeResponse> responses = TranscribeStreamingV2.transcribeStreamingV2(testFilePath);

high

Consider adding error handling for the TranscribeStreamingV2.transcribeStreamingV2 method call. If an IOException is thrown, the test will fail without a clear error message. Wrapping this in a try-catch block will make the test more robust.

        try {
            responses = TranscribeStreamingV2.transcribeStreamingV2(testFilePath);
        } catch (IOException e) {
            System.err.println("Failed to transcribe audio: " + e.getMessage());
            throw e; // Re-throw the exception to fail the test
        }


// Assert the transcript
String transcript = "";
for (StreamingRecognizeResponse response : responses) {
if (response.getResultsCount() > 0) {
transcript += response.getResults(0).getAlternatives(0).getTranscript();
Contributor
issue: put individual method calls on separate lines.

}
}
// Use a regex to match the expected transcript
Pattern pattern = Pattern.compile("how old is the Brooklyn Bridge", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Contributor
issue: if the test checks output from the model, assert only on "Transcript:". The model does not deterministically produce "how old is the Brooklyn Bridge", but the sample DOES deterministically print "Transcript:".
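One way to act on this, assuming the test asserts on what the sample prints rather than on the returned responses, is to capture standard output while the sample runs and check for the fixed prefix. `StdoutCapture` and `capture` are hypothetical names, and the lambda in the usage line stands in for the call to the sample method.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical test helper, not part of the PR.
final class StdoutCapture {
  // Runs the action with System.out redirected and returns everything it printed.
  static String capture(Runnable action) {
    PrintStream original = System.out;
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (PrintStream replacement =
        new PrintStream(buffer, true, StandardCharsets.UTF_8.name())) {
      System.setOut(replacement);
      action.run();
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is always available, so this branch is not expected to run.
      throw new IllegalStateException(e);
    } finally {
      // Always restore the real stream, even if the action throws.
      System.setOut(original);
    }
    return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
  }
}
```

A test could then assert on `StdoutCapture.capture(() -> System.out.println("Transcript: ...")).contains("Transcript:")`. Since the sample method throws a checked IOException, a real test would need to wrap that call or use a functional interface that permits checked exceptions.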

Truth.assertThat(pattern.matcher(transcript).find()).isTrue();

}
}