
Supports continuous speech recognition and barge-in #5426

Merged · 44 commits · Feb 13, 2025 · Changes from 41 commits
20de20a
Add mock SpeechSynthesis
compulim Feb 7, 2025
f352b5a
Clean up
compulim Feb 7, 2025
8019538
Use jest-mock
compulim Feb 7, 2025
34d0ee6
Add expectingInput
compulim Feb 8, 2025
379881e
Complete the test
compulim Feb 8, 2025
1a48a92
Add import map
compulim Feb 8, 2025
f7de58a
Use import map
compulim Feb 8, 2025
3cfc548
Add await to resolveAll()
compulim Feb 8, 2025
14d5511
Complete case
compulim Feb 8, 2025
5c2cb35
No need to wait for send when barge-in
compulim Feb 10, 2025
6fa4857
Add interims
compulim Feb 10, 2025
a3da8a8
Support barge-in
compulim Feb 11, 2025
42cde33
Bump version
compulim Feb 11, 2025
b5d215d
Bump version
compulim Feb 11, 2025
da5fa78
Continue to show "Listening..."
compulim Feb 11, 2025
8ef07af
Bump react-dictate-button
compulim Feb 12, 2025
5ca1ea9
Add more expectations
compulim Feb 12, 2025
c9c9e69
Clean up
compulim Feb 12, 2025
14822c0
Clean up
compulim Feb 12, 2025
1c62085
Add tests
compulim Feb 12, 2025
1815c57
Clean up
compulim Feb 12, 2025
de01d5a
Clean up
compulim Feb 12, 2025
b4edd0a
Add more scenarios
compulim Feb 12, 2025
ce4af27
Ignore html2
compulim Feb 12, 2025
d5fc3c2
Ported test
compulim Feb 12, 2025
8cb2ae5
Added test
compulim Feb 12, 2025
206c6f2
Bump react-dictate-button
compulim Feb 13, 2025
1e0f1a3
Add entry
compulim Feb 13, 2025
282cd99
Update entries
compulim Feb 13, 2025
4fd345f
Bump to [email protected]
compulim Feb 13, 2025
e88f40e
Bump to [email protected]
compulim Feb 13, 2025
07c84b3
Clean up
compulim Feb 13, 2025
9e5abc2
Clean up
compulim Feb 13, 2025
4936171
More comments
compulim Feb 13, 2025
e6f41dc
Add perform card action
compulim Feb 13, 2025
ddefad0
Add perform card action tests
compulim Feb 13, 2025
d8608d1
Add test
compulim Feb 13, 2025
2e2797e
More scenarios
compulim Feb 13, 2025
1a81e19
Merge branch 'main' into feat-speech-barge-in
compulim Feb 13, 2025
6cd4a83
Better comments
compulim Feb 13, 2025
1311d17
Better comment
compulim Feb 13, 2025
50f1628
Add comment
compulim Feb 13, 2025
5be4691
Add speech error telemetry
compulim Feb 13, 2025
0e65ee5
Add types
compulim Feb 13, 2025
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -80,6 +80,8 @@ Notes: web developers are advised to use [`~` (tilde range)](https://github.com/
- When set to `'activity-status'`, feedback buttons appear in the activity status area (default behavior)
- Added support for including activity ID and key into form data indicated by `data-webchat-include-activity-id` and `data-webchat-include-activity-key` attributes, in PR [#5418](https://github.com/microsoft/BotFramework-WebChat/pull/5418), by [@OEvgeny](https://github.com/OEvgeny)
- Added dedicated loading animation for messages in preparing state for Fluent theme, in PR [#5423](https://github.com/microsoft/BotFramework-WebChat/pull/5423), by [@OEvgeny](https://github.com/OEvgeny)
- Resolved [#2661](https://github.com/microsoft/BotFramework-WebChat/issues/2661) and [#5352](https://github.com/microsoft/BotFramework-WebChat/issues/5352). Added speech recognition continuous mode with barge-in support, in PR [#5426](https://github.com/microsoft/BotFramework-WebChat/pull/5426), by [@RushikeshGavali](https://github.com/RushikeshGavali) and [@compulim](https://github.com/compulim)
- Set `styleOptions.speechRecognitionContinuous` to `true` with a Web Speech API provider with continuous mode support
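As a minimal sketch of the changelog entry above, enabling continuous mode might look like the following (the `renderWebChat` wiring shown in comments is illustrative, not taken from this PR):

```javascript
// Minimal sketch: enable continuous speech recognition with barge-in.
// Assumes a Web Speech API provider with continuous mode support,
// e.g. web-speech-cognitive-services (bumped to 8.1.1 in this PR).
const styleOptions = {
  speechRecognitionContinuous: true // requires provider support; ignored otherwise
};

// Illustrative wiring (browser-only; directLine and webSpeechPonyfillFactory
// are assumed to be set up elsewhere):
// window.WebChat.renderWebChat(
//   { directLine, styleOptions, webSpeechPonyfillFactory },
//   document.getElementById('webchat')
// );
```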

### Changed

@@ -101,9 +103,10 @@ Notes: web developers are advised to use [`~` (tilde range)](https://github.com/
- Switched math block syntax from `$$` to Tex-style `\[ \]` and `\( \)` delimiters with improved rendering and error handling, in PR [#5353](https://github.com/microsoft/BotFramework-WebChat/pull/5353), by [@OEvgeny](https://github.com/OEvgeny)
- Improved avatar display and grouping behavior by fixing rendering issues and activity sender identification, in PR [#5346](https://github.com/microsoft/BotFramework-WebChat/pull/5346), by [@OEvgeny](https://github.com/OEvgeny)
- Activity "copy" button will use `outerHTML` and `textContent` for clipboard content, in PR [#5378](https://github.com/microsoft/BotFramework-WebChat/pull/5378), by [@compulim](https://github.com/compulim)
- Bumped dependencies to the latest versions, by [@compulim](https://github.com/compulim) in PR [#5385](https://github.com/microsoft/BotFramework-WebChat/pull/5385) and [#5400](https://github.com/microsoft/BotFramework-WebChat/pull/5400)
+ Bumped dependencies to the latest versions, by [@compulim](https://github.com/compulim) in PR [#5385](https://github.com/microsoft/BotFramework-WebChat/pull/5385), [#5400](https://github.com/microsoft/BotFramework-WebChat/pull/5400), and [#5426](https://github.com/microsoft/BotFramework-WebChat/pull/5426)
- Production dependencies
- [`[email protected]`](https://npmjs.com/package/web-speech-cognitive-services)
- [`[email protected]`](https://npmjs.com/package/react-dictate-button)
- Enabled icon customization in Fluent theme through CSS variables, in PR [#5413](https://github.com/microsoft/BotFramework-WebChat/pull/5413), by [@OEvgeny](https://github.com/OEvgeny)

### Fixed
39 changes: 0 additions & 39 deletions __tests__/hooks/useDictateState.js

This file was deleted.

77 changes: 77 additions & 0 deletions __tests__/html2/hooks/private/renderHook.js
@@ -0,0 +1,77 @@
// Adapted from https://github.com/testing-library/react-testing-library/blob/main/src/pure.js#L292C1-L329C2
// Note: `React` and `ReactDOM` are expected as globals, loaded via <script> tags in the html2 test harness.

/*!
* The MIT License (MIT)
* Copyright (c) 2017-Present Kent C. Dodds
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in all
* copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
* SOFTWARE.
*/

function wrapUiIfNeeded(innerElement, wrapperComponent) {
return wrapperComponent ? React.createElement(wrapperComponent, null, innerElement) : innerElement;
}

export default function renderHook(
/** @type {(props: RenderCallbackProps) => any} */ renderCallback,
/** @type {{}} */ options = {}
) {
const { initialProps, ...renderOptions } = options;

if (renderOptions.legacyRoot && typeof ReactDOM.render !== 'function') {
const error = new Error(
'`legacyRoot: true` is not supported in this version of React. ' +
'If your app runs React 19 or later, you should remove this flag. ' +
'If your app runs React 18 or earlier, visit https://react.dev/blog/2022/03/08/react-18-upgrade-guide for upgrade instructions.'
);
Error.captureStackTrace(error, renderHook);
throw error;
}

const result = React.createRef();

function TestComponent({ renderCallbackProps }) {
const pendingResult = renderCallback(renderCallbackProps);

React.useEffect(() => {
result.current = pendingResult;
});

return null;
}

// A stripped down version of render() from `@testing-library/react`.
const render = ({ renderCallbackProps }) => {
const element = document.querySelector('main');

ReactDOM.render(wrapUiIfNeeded(React.createElement(TestComponent, renderCallbackProps), renderOptions.wrapper), element);

return { rerender: render, unmount: () => ReactDOM.unmountComponentAtNode(element) };
};

const { rerender: baseRerender, unmount } = render(
React.createElement(TestComponent, { renderCallbackProps: initialProps }),
renderOptions
);

function rerender(rerenderCallbackProps) {
return baseRerender(React.createElement(TestComponent, { renderCallbackProps: rerenderCallbackProps }));
}

return { result, rerender, unmount };
}
221 changes: 221 additions & 0 deletions __tests__/html2/hooks/useDictateState.html
@@ -0,0 +1,221 @@
<!doctype html>
<html lang="en-US">
<head>
<link href="/assets/index.css" rel="stylesheet" type="text/css" />
<script crossorigin="anonymous" src="https://unpkg.com/[email protected]/umd/react.development.js"></script>
<script crossorigin="anonymous" src="https://unpkg.com/[email protected]/umd/react-dom.development.js"></script>
<script crossorigin="anonymous" src="/test-harness.js"></script>
<script crossorigin="anonymous" src="/test-page-object.js"></script>
<script crossorigin="anonymous" src="/__dist__/webchat-es5.js"></script>
</head>
<body>
<main id="webchat"></main>
<script type="importmap">
{
"imports": {
"@testduet/wait-for": "https://unpkg.com/@testduet/wait-for@main/dist/wait-for.mjs",
"jest-mock": "https://esm.sh/jest-mock",
"react-dictate-button/internal": "https://unpkg.com/react-dictate-button@main/dist/react-dictate-button.internal.mjs"
}
}
</script>
<script type="module">
import { waitFor } from '@testduet/wait-for';
import { fn, spyOn } from 'jest-mock';
import {
SpeechGrammarList,
SpeechRecognition,
SpeechRecognitionAlternative,
SpeechRecognitionErrorEvent,
SpeechRecognitionEvent,
SpeechRecognitionResult,
SpeechRecognitionResultList
} from 'react-dictate-button/internal';
import { SpeechSynthesis, SpeechSynthesisEvent, SpeechSynthesisUtterance } from '../speech/js/index.js';
import renderHook from './private/renderHook.js';

const {
React: { createElement },
ReactDOM: { render },
testHelpers: { createDirectLineEmulator },
WebChat: {
Components: { BasicWebChat, Composer },
hooks: { useDictateState },
renderWebChat,
testIds
}
} = window;

run(async function () {
const speechSynthesis = new SpeechSynthesis();
const ponyfill = {
SpeechGrammarList,
SpeechRecognition: fn().mockImplementation(() => {
const speechRecognition = new SpeechRecognition();

spyOn(speechRecognition, 'abort');
spyOn(speechRecognition, 'start');

return speechRecognition;
}),
speechSynthesis,
SpeechSynthesisUtterance
};

spyOn(speechSynthesis, 'speak');

const { directLine, store } = createDirectLineEmulator();
const WebChatWrapper = ({ children }) =>
createElement(
Composer,
{ directLine, store, webSpeechPonyfillFactory: () => ponyfill },
createElement(BasicWebChat),
children
);

// WHEN: Render initially.
const renderResult = renderHook(() => useDictateState()[0], {
legacyRoot: true,
wrapper: WebChatWrapper
});

await pageConditions.uiConnected();

// THEN: `useDictateState` should return IDLE.
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 0)); // IDLE

// WHEN: Microphone button is clicked and priming user gesture is done.
await pageObjects.clickMicrophoneButton();

await waitFor(() => expect(speechSynthesis.speak).toHaveBeenCalledTimes(1));
speechSynthesis.speak.mock.calls[0][0].dispatchEvent(
// The utterance is the first argument of the speak() call, hence [0][0].
new SpeechSynthesisEvent('end', { utterance: speechSynthesis.speak.mock.calls[0][0] })
);

// THEN: `useDictateState` should return STARTING.
renderResult.rerender();
// Dictate state "1" is for "automatic turning on microphone after current synthesis completed".
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 2));

// THEN: Should construct SpeechRecognition().
expect(ponyfill.SpeechRecognition).toHaveBeenCalledTimes(1);

const { value: speechRecognition1 } = ponyfill.SpeechRecognition.mock.results[0];

// THEN: Should call SpeechRecognition.start().
expect(speechRecognition1.start).toHaveBeenCalledTimes(1);

// WHEN: Recognition started and interim result is dispatched.
speechRecognition1.dispatchEvent(new Event('start'));
speechRecognition1.dispatchEvent(new Event('audiostart'));
speechRecognition1.dispatchEvent(new Event('soundstart'));
speechRecognition1.dispatchEvent(new Event('speechstart'));

// WHEN: Recognized interim result of "Hello".
speechRecognition1.dispatchEvent(
new SpeechRecognitionEvent('result', {
results: new SpeechRecognitionResultList(
new SpeechRecognitionResult(new SpeechRecognitionAlternative(0, 'Hello'))
)
})
);

// THEN: `useDictateState` should return DICTATING.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 3));

// WHEN: Recognized finalized result of "Hello, World!" and ended recognition.
await (
await directLine.actPostActivity(() =>
speechRecognition1.dispatchEvent(
new SpeechRecognitionEvent('result', {
results: new SpeechRecognitionResultList(
SpeechRecognitionResult.fromFinalized(new SpeechRecognitionAlternative(0.9, 'Hello, World!'))
)
})
)
)
).resolveAll();

speechRecognition1.dispatchEvent(new Event('speechend'));
speechRecognition1.dispatchEvent(new Event('soundend'));
speechRecognition1.dispatchEvent(new Event('audioend'));
speechRecognition1.dispatchEvent(new Event('end'));

// THEN: `useDictateState` should return IDLE.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 0));

// WHEN: Bot replied.
await directLine.emulateIncomingActivity({
inputHint: 'expectingInput', // "expectingInput" should turn the microphone back on after synthesis completed.
text: 'Aloha!',
type: 'message'
});
await pageConditions.numActivitiesShown(2);

// THEN: Should call SpeechSynthesis.speak() again.
await waitFor(() => expect(speechSynthesis.speak).toHaveBeenCalledTimes(2));

// THEN: Should start synthesize "Aloha!".
expect(speechSynthesis.speak).toHaveBeenLastCalledWith(expect.any(SpeechSynthesisUtterance));
expect(speechSynthesis.speak).toHaveBeenLastCalledWith(expect.objectContaining({ text: 'Aloha!' }));

// THEN: `useDictateState` should return WILL_START.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 1));

// WHEN: Synthesis completed.
speechSynthesis.speak.mock.calls[1][0].dispatchEvent(
// The utterance is the first argument of the speak() call, hence [1][0].
new SpeechSynthesisEvent('end', { utterance: speechSynthesis.speak.mock.calls[1][0] })
);

// THEN: `useDictateState` should return STARTING.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 2));

// THEN: Should construct a second SpeechRecognition().
const { value: speechRecognition2 } = ponyfill.SpeechRecognition.mock.results[1];

// THEN: Should call SpeechRecognition.start().
expect(speechRecognition2.start).toHaveBeenCalledTimes(1);

// WHEN: Recognition started and interim result is dispatched.
speechRecognition2.dispatchEvent(new Event('start'));
speechRecognition2.dispatchEvent(new Event('audiostart'));
speechRecognition2.dispatchEvent(new Event('soundstart'));
speechRecognition2.dispatchEvent(new Event('speechstart'));

// WHEN: Recognized interim result of "Good".
speechRecognition2.dispatchEvent(
new SpeechRecognitionEvent('result', {
results: new SpeechRecognitionResultList(
new SpeechRecognitionResult(new SpeechRecognitionAlternative(0, 'Good'))
)
})
);

// THEN: `useDictateState` should return DICTATING.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 3));

// WHEN: Click on microphone button.
await pageObjects.clickMicrophoneButton();

// THEN: `useDictateState` should return STOPPING.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 4));

// WHEN: Recognition ended.
speechRecognition2.dispatchEvent(new Event('speechend'));
speechRecognition2.dispatchEvent(new Event('soundend'));
speechRecognition2.dispatchEvent(new Event('audioend'));
speechRecognition2.dispatchEvent(new Event('end'));

// THEN: `useDictateState` should return IDLE.
renderResult.rerender();
await waitFor(() => expect(renderResult).toHaveProperty('result.current', 0));
});
</script>
</body>
</html>
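The numeric states asserted in the test above (0 through 4) can be summarized as follows. This mapping is reconstructed from the test's comments and assertions; the names are assumed, not taken from this PR's source:

```javascript
// Assumed dictate-state mapping, inferred from the assertions in the test above.
const DictateState = {
  IDLE: 0,       // microphone off
  WILL_START: 1, // will turn on once the current synthesis completes
  STARTING: 2,   // SpeechRecognition.start() called, waiting for audio
  DICTATING: 3,  // interim results arriving
  STOPPING: 4    // stop requested, waiting for the 'end' event
};
```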