This is day 20 of my #javascript30 journey, Wes Bos's free course that lets you brush up on your JavaScript skills by building 30 projects.
Yesterday we explored accessing & playing around with the webcam. You can keep track of all the projects we’re building here.
Today we’re exploring speech detection.
Day 20 - Native Speech Recognition
So… Speech Recognition is available directly in the browser with no need for libraries. That is amazing.
Browser support for `SpeechRecognition` is minimal; we will only be able to run this app in Chrome or Firefox.
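If you want to guard against unsupported browsers, a quick feature check at the top of the script works. This is just a sketch rather than part of the original exercise:

// Warn early if neither the standard nor the webkit-prefixed API exists
if (!('SpeechRecognition' in window) && !('webkitSpeechRecognition' in window)) {
  console.warn('SpeechRecognition is not supported in this browser')
}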
Even though there are some limitations this is still a really cool feature that I’m looking forward to implementing.
Accessing SpeechRecognition
We need to access `SpeechRecognition` on the window. Chrome requires a `webkit` prefix, so we point both names at the same constructor at the beginning of the script:
window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition
Next up we will create a new instance of speech recognition. We will also set `interimResults` to true. This allows us to view the text as we are speaking (as opposed to waiting until we are finished speaking to print the text).
const recognition = new SpeechRecognition()
recognition.interimResults = true
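Optionally you can also tell the instance which language to listen for. The locale below is just an example and the exercise itself doesn't set it:

// Optional: hint the expected language (example locale, adjust as needed)
recognition.lang = 'en-US'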
Printing the Text in the Browser
Once the browser has a stream of input coming in, we need to print it out to the screen. To do this we will create a paragraph element, and for each pause in speech we will create a new one. We will only ever be editing the final element:
let p = document.createElement('p')
const words = document.querySelector('.words')
words.appendChild(p)
Now we need to listen for the result event and convert the results into an array:
recognition.addEventListener('result', e => {
  // e.results is a list of results; each result holds one or more
  // alternatives, and each alternative carries the transcribed text
  const transcript = Array.from(e.results)
    .map(result => result[0])
    .map(result => result.transcript)
    .join('')
})
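Each entry in e.results is a list of alternatives, and each alternative exposes the transcribed text along with a confidence score. To check that speech is actually coming through, you could log the joined transcript; the line below is just a debugging aid, not part of the final code:

// Inside the 'result' listener, after building the transcript
console.log(transcript, e.results[0][0].confidence)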
Recognition only runs until the user pauses. We then need to add an event listener for when the recognition ends and restart it so it keeps listening:
recognition.addEventListener('end', recognition.start)
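None of the snippets so far actually kick recognition off, so nothing happens until start() is called once. A single call at the end of the script does the job:

// Start listening (the 'end' listener above keeps it going after pauses)
recognition.start()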
Now that we have the transcription stream coming through, we need to output it to the `<p>` elements that we create. We need to create a new `<p>` for each pause in speech:
recognition.addEventListener('result', e => {
  const transcript = Array.from(e.results)
    .map(result => result[0])
    .map(result => result.transcript)
    .join('')

  // Show the text as it is being spoken
  p.textContent = transcript

  // Once the phrase is final, start a fresh paragraph for the next one
  if (e.results[0].isFinal) {
    p = document.createElement('p')
    words.appendChild(p)
  }
})
Now we have a working transcript!
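For reference, here is the whole script pieced together, assuming the page has an element with a class of words to hold the paragraphs:

window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition

const recognition = new SpeechRecognition()
recognition.interimResults = true

const words = document.querySelector('.words')
let p = document.createElement('p')
words.appendChild(p)

recognition.addEventListener('result', e => {
  // Join the best alternative of every result into one string
  const transcript = Array.from(e.results)
    .map(result => result[0])
    .map(result => result.transcript)
    .join('')

  p.textContent = transcript

  // When the phrase is final, start a fresh paragraph for the next one
  if (e.results[0].isFinal) {
    p = document.createElement('p')
    words.appendChild(p)
  }
})

// Keep listening after each pause
recognition.addEventListener('end', recognition.start)

recognition.start()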
You can play around with the speech detection & transcript here.
You can keep track of all the projects in this JavaScript30 challenge here.