Basic STT->Language Translator->TTS example

Ever wondered what it might take to have what you're speaking be translated on the fly to another language? This project does exactly that in a very simple way. It uses the IBM Watson Speech to Text, Language Translator, and Text to Speech services in a simple flow: spoken audio is processed by Speech to Text to create a transcript, the transcript is translated by Language Translator, and the translated text is finally sent to Text to Speech.

The system works on a sentence-by-sentence basis. To support this, the speaker needs to pause briefly at the end of each spoken sentence, allowing the Speech to Text service to finalize the transcription. When running in a terminal, the transcribed sentence is displayed to confirm that it has been processed and sent on for translation. After an initial delay to transcribe and translate the first sentence, subsequent sentences are delivered to listeners sequentially, as long as the translated sentences are approximately the same length as the originals.

A more sophisticated orchestration system would extend this approach by processing the timing metadata returned by Text to Speech so that each translated sentence is queued only after the time needed to play back the current one. Generally, though, the pauses between spoken sentences are sufficient to prevent overlap or interruptions.
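
As a rough sketch of that idea, the queue below holds each translated sentence back until the previous one has had time to play. This is illustrative only and not part of the project; the durationMs field is a hypothetical value that would have to be derived from Text to Speech timing metadata.

    // Minimal sequential playback queue (illustrative sketch, not part of this project).
    // Each queued item is assumed to carry synthesized audio plus a hypothetical
    // durationMs value derived from Text to Speech timing metadata.
    class PlaybackQueue {
      constructor(playFn) {
        this.playFn = playFn;   // callback that actually plays one audio buffer
        this.queue = [];
        this.playing = false;
      }

      enqueue(item) {           // item: { audio, durationMs }
        this.queue.push(item);
        this.drain();
      }

      drain() {
        if (this.playing || this.queue.length === 0) return;
        const item = this.queue.shift();
        this.playing = true;
        this.playFn(item.audio);
        // Release the queue only after the current sentence has had time to play back.
        setTimeout(() => {
          this.playing = false;
          this.drain();
        }, item.durationMs);
      }
    }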

Setup

  1. On IBM Cloud, deploy the Node-RED starter and bind instances of Language Translator and Speech to Text (preferably the Standard plan, since the Lite plan only supports 10k characters). Restage the application after binding.

  2. Create an instance of Text to Speech and make a note of the API key.

  3. Copy the .env.example file to .env, add the Text to Speech API key, and set the WSTARGET value to match the deployed Node-RED starter, appending /ws/text to the URL, for example: https://translator-low-code-2019.mybluemix.net/ws/text (a sketch of the resulting file appears after this list).

  4. Import the translate-and-speak.json flow into Node-RED.

  5. Use npm to install and start the local application that listens to the microphone (tested on Mac OS X):

       npm install
       npm start
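
For reference, the resulting .env might look roughly like the following. The variable names shown here are placeholders for illustration; use whatever names .env.example actually defines.

    # Hypothetical .env contents -- check .env.example for the real variable names
    TEXT_TO_SPEECH_APIKEY=<your Text to Speech API key>
    WSTARGET=https://translator-low-code-2019.mybluemix.net/ws/text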

Once the application starts, speak each sentence clearly and at an even pace. Pause briefly at the end of the sentence and you will see the most recent sentence transcription appear in the console. When it appears, the local application sends the text over a WebSocket to Node-RED, where it is processed on IBM Cloud to produce the audio output.
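
The core of that hand-off is just a WebSocket send. Below is a minimal sketch of the idea using the ws and dotenv packages; the function name and the scheme rewrite are illustrative assumptions, not the project's actual code.

    // Illustrative only: push a finalized transcript to the Node-RED flow over a WebSocket.
    require('dotenv').config();
    const WebSocket = require('ws');

    // The ws package expects a ws:// or wss:// URL, so swap the scheme on the https value.
    const target = (process.env.WSTARGET || '').replace(/^http/, 'ws');
    const ws = new WebSocket(target);

    ws.on('open', () => {
      // In the real application this would run each time Speech to Text
      // finalizes a sentence; here a fixed example sentence is sent.
      sendTranscript('This sentence will be translated and spoken.');
    });

    function sendTranscript(sentence) {
      console.log(`transcript: ${sentence}`);
      ws.send(sentence);
    }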

To listen to the audio output, open the Node-RED starter's URL in a browser with /audio appended to the hostname. Multiple users can connect to the websocket output at the same time.
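
A listener page could be as simple as the browser-side sketch below, which plays each audio buffer as it arrives. This is only an illustration of the idea, not the page the flow actually serves, and it assumes the flow pushes complete audio clips (for example WAV) over the WebSocket.

    // Browser-side sketch: play audio buffers pushed over the /audio WebSocket.
    // Most browsers require a user gesture (e.g. a click) before the AudioContext can start.
    const ctx = new AudioContext();
    const ws = new WebSocket('wss://translator-low-code-2019.mybluemix.net/audio');
    ws.binaryType = 'arraybuffer';

    ws.onmessage = async (evt) => {
      const buffer = await ctx.decodeAudioData(evt.data);
      const source = ctx.createBufferSource();
      source.buffer = buffer;
      source.connect(ctx.destination);
      source.start();
    };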

The example flow is configured to translate from English to Spanish and to speak with a female Text to Speech voice.
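
For context, the per-sentence work the flow performs is roughly equivalent to the calls below, written against the ibm-watson Node.js SDK. Parameter names can differ between SDK versions, the environment variable names are assumptions, and the specific voice name should be checked against the voices your Text to Speech instance actually offers; en-es is the stock English-to-Spanish translation model.

    // Rough sketch of the translate-then-synthesize step (not the Node-RED flow itself).
    const LanguageTranslatorV3 = require('ibm-watson/language-translator/v3');
    const TextToSpeechV1 = require('ibm-watson/text-to-speech/v1');
    const { IamAuthenticator } = require('ibm-watson/auth');

    const translator = new LanguageTranslatorV3({
      version: '2018-05-01',
      authenticator: new IamAuthenticator({ apikey: process.env.LANGUAGE_TRANSLATOR_APIKEY }),
    });

    const tts = new TextToSpeechV1({
      authenticator: new IamAuthenticator({ apikey: process.env.TEXT_TO_SPEECH_APIKEY }),
    });

    async function translateAndSpeak(sentence) {
      const translation = await translator.translate({ text: [sentence], modelId: 'en-es' });
      const spanish = translation.result.translations[0].translation;

      // Voice name is an assumption -- pick any Spanish female voice your instance offers.
      const audio = await tts.synthesize({ text: spanish, voice: 'es-LA_SofiaV3Voice', accept: 'audio/wav' });
      return audio.result; // a readable stream of WAV audio
    }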
