feat: adds streamingV2 sample for speech #10079

amirx-cloud · 2025-04-23T17:11:10Z

Description

Fixes #

Note: Before submitting a pull request, please open an issue for discussion if you are not associated with Google.

Checklist

snippet-bot · 2025-04-23T17:11:16Z

Here is the summary of changes.

You are about to add 1 region tag.

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java:19, tag speech_to_text_transcribe_streaming_v2

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

Refresh this comment

gemini-code-assist

Hello @amirx-cloud, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request adds a new sample, TranscribeStreamingV2.java, demonstrating how to transcribe audio from a file stream using the Google Cloud Speech-to-Text API V2. It includes a corresponding integration test, TranscribeStreamingV2IT.java, to verify the functionality. The sample reads an audio file, splits it into chunks, sends these chunks to the Speech-to-Text API, and then prints the transcribed text.

Highlights

- New Sample: Streaming Speech-to-Text V2: Introduces a new sample demonstrating streaming transcription using the Speech-to-Text V2 API.
- Audio Chunking: The sample code splits the audio file into smaller chunks for streaming to the API.
- Integration Test: An integration test is included to verify the accuracy of the transcription.

Changelog

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java
- Added a new class TranscribeStreamingV2 to demonstrate streaming transcription.
- Implements the transcribeStreamingV2 method to read audio from a file, chunk it, and send it to the Speech-to-Text API.
- Includes a main method to run the transcription on a sample audio file (brooklyn_bridge.wav).
speech/src/test/java/com/example/speech/TranscribeStreamingV2IT.java
- Added a new integration test TranscribeStreamingV2IT to verify the streaming transcription functionality.
- The test reads the brooklyn_bridge.wav audio file and asserts that the transcribed text contains the expected phrase 'how old is the Brooklyn Bridge'.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

A stream of sound,
To text it is bound,
Words from the cloud.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request adds a new sample for streaming speech transcription using the V2 API. The code appears to be well-structured and includes a test case. However, there are a few areas that could be improved for clarity and robustness.

Summary of Findings

- Missing Error Handling: The code lacks proper error handling for file operations and API calls. This could lead to unexpected behavior or crashes if errors occur during these operations. Specifically, the Files.readAllBytes method in TranscribeStreamingV2 and the audio file loading in the test case should include error handling.
- Magic Number for Chunk Length: The chunk length calculation audioContent.length / 5 uses a magic number (5). It would be better to define this as a constant with a descriptive name to improve readability and maintainability.
- Incomplete Javadoc: The Javadoc comments for the transcribeStreamingV2 method are incomplete. The Args and Returns sections are commented out, and the description is not very detailed. Providing comprehensive Javadoc is crucial for understanding and maintaining the code.

Merge Readiness

The pull request introduces a valuable new sample. However, the identified issues related to error handling and code clarity should be addressed before merging. I am unable to approve this pull request, and recommend that others review and approve this code before merging. Addressing the high severity issues is crucial for the stability and reliability of the sample.

gemini-code-assist · 2025-04-23T17:11:36Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+        try (SpeechClient client = SpeechClient.create()) {
+
+            Path path = Paths.get(streamFile);
+            byte[] audioContent = Files.readAllBytes(path);


Consider adding error handling for the Files.readAllBytes method. If the file does not exist or cannot be read, an IOException will be thrown, and the program will crash. Wrapping this in a try-catch block will make the code more robust.

What happens if the file does not exist?

try {
    audioContent = Files.readAllBytes(path);
} catch (IOException e) {
    System.err.println("Failed to read audio file: " + e.getMessage());
    return null; // Or throw the exception, depending on desired behavior
}
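The error-handling suggestion above could be sketched as follows. This is a minimal, standard-library-only illustration, not the PR's code: the class name `AudioFileReader` and method `readAudio` are hypothetical. It takes the "throw the exception" branch of the suggestion rather than returning null, so a caller cannot silently continue with missing audio.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the PR: reads an audio file and reports
// a missing file with a message that names the path.
class AudioFileReader {

  static byte[] readAudio(String streamFile) throws IOException {
    Path path = Paths.get(streamFile);
    try {
      return Files.readAllBytes(path);
    } catch (NoSuchFileException e) {
      // Rethrow instead of returning null so the failure cannot be ignored.
      throw new IOException("Audio file not found: " + path, e);
    }
  }

  public static void main(String[] args) {
    // The default path is an assumption taken from the file name used in this PR.
    String file = args.length > 0 ? args[0] : "resources/brooklyn_bridge.wav";
    try {
      byte[] audio = readAudio(file);
      System.out.println("Read " + audio.length + " bytes");
    } catch (IOException e) {
      System.err.println("Failed to read audio file: " + e.getMessage());
    }
  }
}
```

Whether a sample should catch the exception at all or simply declare `throws IOException` is a style decision for the repository maintainers; the sketch only shows where the check would sit.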

gemini-code-assist · 2025-04-23T17:11:36Z

speech/src/test/java/com/example/speech/TranscribeStreamingV2IT.java

+        String testFilePath = "./resources/brooklyn_bridge.wav";
+
+        // Call the method to test
+        List<StreamingRecognizeResponse> responses = TranscribeStreamingV2.transcribeStreamingV2(testFilePath);


Consider adding error handling for the TranscribeStreamingV2.transcribeStreamingV2 method call. If an IOException is thrown, the test will fail without a clear error message. Wrapping this in a try-catch block will make the test more robust.

try {
    responses = TranscribeStreamingV2.transcribeStreamingV2(testFilePath);
} catch (IOException e) {
    System.err.println("Failed to transcribe audio: " + e.getMessage());
    throw e; // Re-throw the exception to fail the test
}

gemini-code-assist · 2025-04-23T17:11:36Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+    //    Transcribes audio from an audio file stream using Google Cloud Speech-to-Text API.
+//    Args:
+//        streamFile (String): Path to the local audio file to be transcribed.
+//            Example: "resources/audio.wav"
+//    Returns:
+//        List<StreamingRecognizeResponse>: A list of objects.
+//            Each response includes the transcription results for the corresponding audio segment.
+//


The Javadoc comments are incomplete. Please provide a comprehensive description of the method, its arguments, and return values. The Args and Returns sections are commented out. A complete Javadoc is important for understanding and maintaining the code.

/**
 * Transcribes audio from an audio file stream using Google Cloud Speech-to-Text API.
 *
 * @param streamFile (String): Path to the local audio file to be transcribed.
 *     Example: "resources/audio.wav"
 * @return List<StreamingRecognizeResponse>: A list of objects.
 *     Each response includes the transcription results for the corresponding audio segment.
 * @throws IOException if an I/O error occurs reading the file.
 */

+1.

The Javadoc comments as written look correct for Python; however, Javadoc style is different.

gemini-code-assist · 2025-04-23T17:11:37Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+            byte[] audioContent = Files.readAllBytes(path);
+
+//          In practice, stream should be a generator yielding chunks of audio data
+            int chunkLength = audioContent.length / 5;


The value 5 is a magic number. Define a constant with a descriptive name to represent the number of chunks. This improves readability and maintainability.

int numChunks = 5; // Define the number of chunks
int chunkLength = audioContent.length / numChunks;

+1.

question: why use 5? Provide a comment explaining why the number must be 5 and make this value a constant.
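The request for a named constant (and the related request to call the list "chunks") could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `AudioChunker`, `NUM_CHUNKS`, and `split` are hypothetical names, and the value 5 is kept only because the PR uses it; nothing in this thread says the API requires that count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the PR: splits audio bytes into chunks.
class AudioChunker {

  // Assumption: 5 mirrors the PR. It is an arbitrary choice for the sample.
  static final int NUM_CHUNKS = 5;

  // Returns at most numChunks chunks that together cover every input byte.
  static List<byte[]> split(byte[] audioContent, int numChunks) {
    if (numChunks <= 0) {
      throw new IllegalArgumentException("numChunks must be positive");
    }
    List<byte[]> chunks = new ArrayList<>();
    // Round up so short inputs never produce a zero chunk length.
    int chunkLength = (audioContent.length + numChunks - 1) / numChunks;
    for (int start = 0; start < audioContent.length; start += chunkLength) {
      int end = Math.min(start + chunkLength, audioContent.length);
      chunks.add(Arrays.copyOfRange(audioContent, start, end));
    }
    return chunks;
  }
}
```

A caller would write `List<byte[]> chunks = AudioChunker.split(audioContent, AudioChunker.NUM_CHUNKS);` and send each element as one streaming request. Because the chunk length is rounded up, the last chunk may be shorter than the others and the result may contain fewer than `numChunks` entries.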

gemini-code-assist · 2025-04-23T17:11:37Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+                    .build();
+
+            StreamingRecognizeRequest configRequest = StreamingRecognizeRequest.newBuilder()
+                    .setRecognizer(String.format("projects/%s/locations/global/recognizers/_", PROJECT_ID))


The recognizer is set to _. Should this be a specific recognizer? If so, please specify it, otherwise it should be removed.

.setRecognizer(String.format("projects/%s/locations/global/recognizers/{YOUR_RECOGNIZER}", PROJECT_ID))

+1

If the user needs to provide a specific recognizer by ID, pass in the recognizer ID into the sample function.
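Passing the recognizer ID into the sample could be sketched as below. The helper `RecognizerNames.recognizerName` is hypothetical and uses only string formatting. The fallback to `_` assumes, as I understand the Speech-to-Text V2 documentation, that `_` selects the API's implicit default recognizer; treat that as an assumption to verify rather than something this thread confirms.

```java
// Hypothetical helper, not part of the PR: builds the recognizer resource name.
class RecognizerNames {

  // Assumption: "_" refers to the implicit default recognizer in the V2 API.
  static final String DEFAULT_RECOGNIZER_ID = "_";

  static String recognizerName(String projectId, String recognizerId) {
    if (projectId == null || projectId.isEmpty()) {
      throw new IllegalArgumentException("projectId must be set");
    }
    // Fall back to the default when the caller does not name a recognizer.
    String id = (recognizerId == null || recognizerId.isEmpty())
        ? DEFAULT_RECOGNIZER_ID
        : recognizerId;
    // The "global" location matches the resource path used in the PR.
    return String.format("projects/%s/locations/global/recognizers/%s", projectId, id);
  }
}
```

The result would be passed to `setRecognizer(...)` in place of the inline `String.format` call shown in the diff.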

telpirion

Hi @amirx-cloud -- thank you for submitting this PR!

I changed the status of this PR to "draft" because it needs some work. I've left comments to help you improve the quality of the PR. Please reopen the PR once you've addressed the comments.

Related -- the CI/CD reports show that there is an NPE somewhere in the code sample. Be sure to address that problem before re-opening.

telpirion · 2025-04-28T23:01:29Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+import java.util.List;
+
+public class TranscribeStreamingV2 {
+    private static final String PROJECT_ID = System.getenv("GOOGLE_CLOUD_PROJECT");


issue: pass project ID into the sample function. Look at other samples for how to do this.
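A rough shape for passing the project ID into the sample function, instead of reading it from a static field, is sketched below. The class name `TranscribeSampleSkeleton`, the placeholder values, and the stubbed method body are all hypothetical; the real sample would create the client and stream audio where the stub only prints.

```java
// Hypothetical skeleton, not the PR's code: shows only how parameters flow
// from main() into the sample method.
class TranscribeSampleSkeleton {

  public static void main(String[] args) {
    // Placeholder values: replace before running.
    String projectId = "your-project-id";
    String audioFilePath = "resources/brooklyn_bridge.wav";
    transcribe(projectId, audioFilePath);
  }

  // The project ID arrives as an argument, so the method does not depend on
  // an environment variable being set when the class loads.
  static String transcribe(String projectId, String audioFilePath) {
    String recognizer =
        String.format("projects/%s/locations/global/recognizers/_", projectId);
    // Stub: a real sample would open the stream and send audio here.
    System.out.println("Would transcribe " + audioFilePath + " using " + recognizer);
    return recognizer;
  }
}
```

With this layout a test can call `transcribe` directly with its own project ID.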

telpirion · 2025-04-28T23:02:03Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+// [START speech_to_text_transcribe_streaming_v2]
+
+import com.google.api.gax.rpc.BidiStream;
+import com.google.cloud.speech.v2.*;


issue: import each type individually -- don't use star imports.

telpirion · 2025-04-28T23:02:40Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+    //    Transcribes audio from an audio file stream using Google Cloud Speech-to-Text API.
+//    Args:
+//        streamFile (String): Path to the local audio file to be transcribed.
+//            Example: "resources/audio.wav"
+//    Returns:
+//        List<StreamingRecognizeResponse>: A list of objects.
+//            Each response includes the transcription results for the corresponding audio segment.
+//


+1.

The Javadoc comments as written look correct for Python; however, Javadoc style is different.

telpirion · 2025-04-28T23:03:22Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+//        List<StreamingRecognizeResponse>: A list of objects.
+//            Each response includes the transcription results for the corresponding audio segment.
+//
+    public static List<StreamingRecognizeResponse> transcribeStreamingV2(String streamFile) throws IOException {


issue: provide a main() method that invokes the sample method.

telpirion · 2025-04-28T23:04:15Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+            Path path = Paths.get(streamFile);
+            byte[] audioContent = Files.readAllBytes(path);
+
+//          In practice, stream should be a generator yielding chunks of audio data


issue: fix the formatting on this comment.

telpirion · 2025-04-28T23:13:47Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+                StreamingRecognizeResponse response = responseIterator.next();
+                System.out.println(response);
+                // Process the response and extract the transcript
+                System.out.println("Transcript: " + response.getResultsList().get(0).getAlternativesList().get(0).getTranscript());


issue: put individual method calls on separate lines for better readability

telpirion · 2025-04-28T23:14:02Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+                System.out.println("Transcript: " + response.getResultsList().get(0).getAlternativesList().get(0).getTranscript());
+                responses.add(response);
+            }
+


issue: remove excess blank lines.

telpirion · 2025-04-28T23:14:54Z

speech/src/test/java/com/example/speech/TranscribeStreamingV2IT.java

+        String transcript = "";
+        for (StreamingRecognizeResponse response : responses) {
+            if (response.getResultsCount() > 0) {
+                transcript += response.getResults(0).getAlternatives(0).getTranscript();


issue: put individual method calls on separate lines.

telpirion · 2025-04-28T23:16:07Z

speech/src/test/java/com/example/speech/TranscribeStreamingV2IT.java

+            }
+        }
+        // Use a regex to match the expected transcript
+        Pattern pattern = Pattern.compile("how old is the Brooklyn Bridge", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);


issue: check for "Transcript:" only, if you check for output from model. The model doesn't deterministically produce "how old is the Brooklyn Bridge", but DOES deterministically return "Transcript:".
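One way to check only for the deterministic "Transcript:" prefix is to capture standard output and assert on it, sketched below with the standard library alone. `TranscriptOutputCheck`, `captureStdout`, and `printTranscript` are hypothetical names; `printTranscript` stands in for the sample method, which in the PR prints the line itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical test helper, not part of the PR.
class TranscriptOutputCheck {

  // Stand-in for the sample: prints a transcript line the way the PR does.
  static void printTranscript(String transcript) {
    System.out.println("Transcript: " + transcript);
  }

  // Runs the action with System.out redirected and returns what it printed.
  static String captureStdout(Runnable action) {
    PrintStream original = System.out;
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    System.setOut(new PrintStream(buffer, true));
    try {
      action.run();
    } finally {
      System.out.flush();
      // Always restore the real stream, even if the action throws.
      System.setOut(original);
    }
    return buffer.toString();
  }

  public static void main(String[] args) {
    String output = captureStdout(() -> printTranscript("any model output"));
    // Assert on the fixed prefix only, never on the model's wording.
    if (!output.contains("Transcript:")) {
      throw new AssertionError("missing transcript line");
    }
    System.out.println("Output check passed");
  }
}
```

In the integration test the lambda would call the sample method instead of the stand-in, and the assertion would use the test framework's own matcher rather than a thrown `AssertionError`.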

telpirion · 2025-04-28T23:17:21Z

speech/src/main/java/com/example/speech/TranscribeStreamingV2.java

+
+//          In practice, stream should be a generator yielding chunks of audio data
+            int chunkLength = audioContent.length / 5;
+            List<byte[]> stream = new ArrayList<>();


issue: rename this variable to something like "chunks".

telpirion · 2025-04-28T23:22:12Z

Tip: You should be able to find the linting errors by running:

mvn -P lint --quiet --batch-mode checkstyle:check

amirx-cloud and others added 7 commits April 14, 2025 15:34

added streaming for V2

0ab846e

Revert "added streaming for V2"

f876969

This reverts commit 0ab846e.

Reapply "added streaming for V2"

3036aa1

This reverts commit f876969.

Update speech/src/test/java/com/example/speech/TranscribeStreamingV2I…

5cb5466

…T.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Removed unnecessary parts

cf18f41

Merge branch 'main' of https://github.com/amirx-cloud/java-docs-samples

368073c

Merge branch 'GoogleCloudPlatform:main' into main

a7d5843

amirx-cloud requested review from yoshi-approver and a team as code owners April 23, 2025 17:11

product-auto-label bot added samples Issues that are directly related to samples. api: speech Issues related to the Speech-to-Text API. labels Apr 23, 2025

gemini-code-assist bot reviewed Apr 23, 2025


blunderbuss-gcf bot assigned bourgeoisor Apr 23, 2025

gemini-code-assist bot suggested changes Apr 23, 2025


telpirion changed the title ~~Adding StreamingV2 sample for speech~~ feat: adds streamingV2 sample for speech Apr 28, 2025

telpirion assigned telpirion and unassigned bourgeoisor Apr 28, 2025

telpirion marked this pull request as draft April 28, 2025 22:59

telpirion requested changes Apr 28, 2025

