In this article, we will learn to convert speech into text using HTML and JavaScript.
Approach: We add a “div” with the contenteditable attribute, which makes the content of that HTML element editable by the user.
HTML
<div class="words" contenteditable>
    <p id="p"></p>
</div>
We use the SpeechRecognition object to convert the speech into text and then display the text on the screen.
We also fall back to the prefixed webkitSpeechRecognition constructor so that speech recognition works in Google Chrome and Apple Safari.
Javascript
window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
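Some browsers expose neither constructor, so it can help to check for support before creating a recognition object. The following is a minimal sketch of such a check; the helper name getSpeechRecognition and the choice to pass the global object in as a parameter are assumptions made for illustration and are not part of the original code.

```javascript
// Hypothetical helper: returns the recognition constructor, or null if unsupported.
// The global object is passed in so the function can also be tried outside a browser.
function getSpeechRecognition(globalObject) {
  return globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition || null;
}

// Usage in a browser page (skipped in environments where `window` does not exist).
if (typeof window !== 'undefined') {
  const Recognition = getSpeechRecognition(window);
  if (Recognition === null) {
    console.log('Speech recognition is not supported in this browser.');
  }
}
```

Returning null instead of throwing lets the page decide how to react, for example by showing a message to the user.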
The interimResults property controls whether interim (not yet final) results are returned. Its default value is false, so set interimResults to true.
Javascript
const recognition = new SpeechRecognition();
recognition.interimResults = true;
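With interimResults enabled, each result delivered by the recognition object's result event carries an isFinal flag that tells whether the recognizer may still revise it. The sketch below shows one possible way to separate finalized text from interim text. The helper name splitTranscript and the mock data are assumptions for illustration; the mock only imitates the shape of the real results list.

```javascript
// Hypothetical helper: separates finalized text from still-changing interim text.
// `results` is expected to look like event.results: an array-like list whose items
// have an `isFinal` flag and a best alternative at index 0 with a `transcript`.
function splitTranscript(results) {
  let finalText = '';
  let interimText = '';
  for (const result of Array.from(results)) {
    if (result.isFinal) {
      finalText += result[0].transcript;
    } else {
      interimText += result[0].transcript;
    }
  }
  return { finalText, interimText };
}

// Mock data standing in for event.results (assumed shape, for illustration only).
const sampleResults = [
  Object.assign([{ transcript: 'Hello ' }], { isFinal: true }),
  Object.assign([{ transcript: 'wor' }], { isFinal: false }),
];
console.log(splitTranscript(sampleResults));
```

A page could, for instance, show the interim text in a lighter color until it becomes final.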
Use the appendChild() method to append a node as the last child of another node.
Javascript
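const words = document.querySelector('.words');
// Look up the paragraph explicitly instead of relying on the implicit global created by id="p"
const p = document.getElementById('p');
words.appendChild(p);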
Add an event listener for the 'result' event. Inside the listener, Array.from() converts the results list into an array, and the map() method creates a new array with the results of calling a function on every array element.
Note: This method does not change the original array.
Use the join() method to combine the array elements into a single string.
Javascript
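recognition.addEventListener('result', e => {
    // Take the best alternative of every result and join the pieces into one string
    const transcript = Array.from(e.results)
        .map(result => result[0])
        .map(result => result.transcript)
        .join('');
    // The transcript is plain text, so textContent is used instead of innerHTML
    document.getElementById('p').textContent = transcript;
    console.log(transcript);
});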
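To see what the map() and join() chain does without a microphone, the same steps can be run on mock data. This is only a sketch: the function name buildTranscript and the mockResults array are hypothetical and merely imitate the shape of e.results, where each result holds its best alternative at index 0.

```javascript
// Hypothetical helper that mirrors the chain used in the listener above.
function buildTranscript(results) {
  return Array.from(results)
    .map(result => result[0])                    // best alternative of each result
    .map(alternative => alternative.transcript)  // its recognized text
    .join('');                                   // one string, no separator
}

// Mock data standing in for e.results (assumed shape, for illustration only).
const mockResults = [
  [{ transcript: 'Hello' }],
  [{ transcript: ' World' }],
];

const text = buildTranscript(mockResults);
console.log(text);               // the joined transcript
console.log(mockResults.length); // still 2: map() did not change the original array
```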
Final Code:
HTML
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Speech to Text</title>
</head>
<body>
    <div class="words" contenteditable>
        <p id="p"></p>
    </div>
    <script>
        const speech = true;
        window.SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
        const recognition = new SpeechRecognition();
        recognition.interimResults = true;

        const words = document.querySelector('.words');
        // Look up the paragraph explicitly instead of relying on the implicit global created by id="p"
        const p = document.getElementById('p');
        words.appendChild(p);

        recognition.addEventListener('result', e => {
            // Take the best alternative of every result and join the pieces into one string
            const transcript = Array.from(e.results)
                .map(result => result[0])
                .map(result => result.transcript)
                .join('');
            // The transcript is plain text, so textContent is used instead of innerHTML
            p.textContent = transcript;
            console.log(transcript);
        });

        if (speech === true) {
            recognition.start();
            // Restart recognition whenever it ends so the page keeps listening
            recognition.addEventListener('end', () => recognition.start());
        }
    </script>
</body>
</html>
Output:
If the user says “Hello World” after running the file, the following is shown on the screen.
Hello World