Text to Speech with JavaScript

Modern browsers have text to speech built in. This means that you don't need external plugins or APIs to convert text to speech!


SpeechSynthesis API

The SpeechSynthesis API is the part of the Web Speech API that handles speech output. The global window.speechSynthesis object implements the SpeechSynthesis API.

var synth = window.speechSynthesis;

The most important methods defined on it are :

  • getVoices() : Returns the list of voices available for speaking. The voices cover different languages, so you can pick one in the language you prefer. Each voice has a few properties, such as name and lang.
  • speak() : Adds a speech (an utterance) to a queue called the utterance queue. It is spoken after every utterance queued before it has been spoken.
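
As a minimal sketch of the two methods above, the helpers below list the voices and queue several texts in order. The names describeVoices and queueTexts are hypothetical, made up for this illustration. The synth and Utterance parameters are assumed to be window.speechSynthesis and the SpeechSynthesisUtterance constructor.

```javascript
// Format each voice as "name (lang)" so the list is easy to read.
// `voices` is the array returned by speechSynthesis.getVoices().
function describeVoices(voices) {
  return voices.map(function (voice) {
    return voice.name + ' (' + voice.lang + ')';
  });
}

// Queue several texts. Each speak() call appends to the utterance queue,
// so the texts are spoken in the order they appear in the array.
// `synth` and `Utterance` are passed in (hypothetical parameters) so the
// function does not depend on browser globals.
function queueTexts(synth, Utterance, texts) {
  texts.forEach(function (text) {
    synth.speak(new Utterance(text));
  });
}

// Browser-only part: skipped where window.speechSynthesis does not exist.
if (typeof window !== 'undefined' && window.speechSynthesis) {
  console.log(describeVoices(window.speechSynthesis.getVoices()));
  // From a click handler you could then call:
  // queueTexts(window.speechSynthesis, window.SpeechSynthesisUtterance,
  //            ['First sentence.', 'Second sentence.']);
}
```

The voice list printed here may be empty if the browser has not finished loading its voices yet.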

Here are the complete APIs for the SpeechSynthesis object.

SpeechSynthesisUtterance API

Whenever you want something to be spoken, you need to create a SpeechSynthesisUtterance object.

var utter = new SpeechSynthesisUtterance();

This object contains properties that affect various factors defining a speech :

  • lang : Language of the speech
  • pitch : Pitch of the speech
  • rate : Speed at which speech will be spoken
  • text : Text of the speech
  • voice : Voice of speech. This will be one of the voices returned by window.speechSynthesis.getVoices() method
  • volume : Volume of the speech
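
The properties above could be set through a small helper like the following sketch. The function configureUtterance and its options object are hypothetical, not part of the API. The clamping ranges (pitch 0 to 2, rate 0.1 to 10, volume 0 to 1) follow the Web Speech API specification.

```javascript
// Keep a number inside the inclusive range [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Copy the given options onto an utterance, clamping the numeric ones to
// the ranges allowed by the Web Speech API specification.
// `configureUtterance` and `options` are illustrative names only.
function configureUtterance(utter, options) {
  utter.text = options.text;
  if (options.lang !== undefined) utter.lang = options.lang;
  if (options.voice) utter.voice = options.voice;
  if (options.pitch !== undefined) utter.pitch = clamp(options.pitch, 0, 2);
  if (options.rate !== undefined) utter.rate = clamp(options.rate, 0.1, 10);
  if (options.volume !== undefined) utter.volume = clamp(options.volume, 0, 1);
  return utter;
}

// Browser-only part: skipped where SpeechSynthesisUtterance does not exist.
if (typeof window !== 'undefined' && window.SpeechSynthesisUtterance) {
  var utter = configureUtterance(new SpeechSynthesisUtterance(), {
    text: 'Hello World',
    lang: 'en-US',
    pitch: 1.2,
    rate: 0.9,
    volume: 0.8
  });
  console.log(utter.text, utter.pitch, utter.rate, utter.volume);
}
```

Options that are left out keep the browser defaults, which the specification sets to 1 for pitch, rate and volume.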

In addition, several events are fired over the course of a speech. Some of them are :

  • onstart : Fired when speech has begun to be spoken
  • onend : Fired when speech has finished
  • onboundary : Fired when speech reaches a word or sentence boundary
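
A minimal sketch of how these three events might be used to follow a speech from start to finish. The helper trackProgress and its log callback are hypothetical names. The name and charIndex fields read inside onboundary are properties of the event object that the browser passes in.

```javascript
// Attach handlers for the three events listed above and report progress
// through `log`, any function that accepts a string (for example console.log).
// `trackProgress` is an illustrative helper, not part of the API.
function trackProgress(utter, log) {
  utter.onstart = function () {
    log('started');
  };
  utter.onboundary = function (event) {
    // event.name is 'word' or 'sentence'; event.charIndex is the position
    // in utter.text at which the boundary was reached.
    log(event.name + ' boundary at character ' + event.charIndex);
  };
  utter.onend = function () {
    log('finished');
  };
  return utter;
}

// Browser-only part: skipped where SpeechSynthesisUtterance does not exist.
if (typeof window !== 'undefined' && window.SpeechSynthesisUtterance) {
  var tracked = trackProgress(new SpeechSynthesisUtterance('Hello World'), console.log);
  // Pass `tracked` to window.speechSynthesis.speak() from a click handler.
}
```

Not every voice reports boundaries, so the onboundary handler may never run for some voices.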

Here are the complete APIs for the SpeechSynthesisUtterance object.

Sample JavaScript Code

// get all voices that the browser offers
// (the list may be empty until the browser has finished loading its voices)
var available_voices = window.speechSynthesis.getVoices();

// this will hold an English voice
var english_voice = null;

// find a voice by language locale "en-US"
// if none is found, select the first voice
for(var i=0; i<available_voices.length; i++) {
	if(available_voices[i].lang === 'en-US') {
		english_voice = available_voices[i];
		break;
	}
}
if(english_voice === null && available_voices.length > 0)
	english_voice = available_voices[0];

// new SpeechSynthesisUtterance object
var utter = new SpeechSynthesisUtterance();
utter.rate = 1;
utter.pitch = 0.5;
utter.text = 'Hello World';
// leave the voice unset (browser default) if no voice was found
if(english_voice !== null)
	utter.voice = english_voice;

// event after text has been spoken
utter.onend = function() {
	alert('Speech has finished');
};

// speak
window.speechSynthesis.speak(utter);
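
In some browsers getVoices() returns an empty list until the voices have loaded, at which point a voiceschanged event fires on window.speechSynthesis. One possible way to wait for the list is sketched below. The helper loadVoices is a hypothetical name, and wrapping the event in a Promise is a design choice of this example, not something the API requires.

```javascript
// Resolve with the list of voices, waiting for the voiceschanged event
// when the list is still empty. `synth` is expected to be
// window.speechSynthesis; `loadVoices` is an illustrative helper.
function loadVoices(synth) {
  return new Promise(function (resolve) {
    var voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices);
      return;
    }
    // Assumption: the browser fires voiceschanged once voices are ready.
    synth.onvoiceschanged = function () {
      resolve(synth.getVoices());
    };
  });
}

// Browser-only part: skipped where window.speechSynthesis does not exist.
if (typeof window !== 'undefined' && window.speechSynthesis) {
  loadVoices(window.speechSynthesis).then(function (voices) {
    console.log(voices.length + ' voices available');
  });
}
```

If a browser has no voices at all and never fires the event, the promise in this sketch stays pending, so real code may want a timeout.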

Browser Compatibility

The SpeechSynthesis API is available in all current versions of Firefox, Chrome, Edge and Safari.

Don't Autoplay a Speech

Some sites start speaking as soon as the page has loaded. To prevent such autoplay behaviour, some user interaction is now required before the speech synthesis API will work. Without it, the browser refuses to speak and reports an error.
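
A minimal sketch of starting speech from a user interaction, here a button click. The helper speakOnClick and the element id speak-button are hypothetical and assume the page contains a matching button.

```javascript
// Speak only in response to a click, so the request originates from a
// user interaction. `createUtterance` is a function that returns a new
// utterance for every click (an illustrative parameter).
function speakOnClick(button, synth, createUtterance) {
  button.addEventListener('click', function () {
    synth.cancel(); // drop anything still waiting in the utterance queue
    synth.speak(createUtterance());
  });
}

// Browser-only part: skipped outside a page with speech synthesis support.
if (typeof document !== 'undefined' && typeof window !== 'undefined' && window.speechSynthesis) {
  // 'speak-button' is an assumed id, e.g. <button id="speak-button">Speak</button>
  var button = document.getElementById('speak-button');
  if (button) {
    speakOnClick(button, window.speechSynthesis, function () {
      return new SpeechSynthesisUtterance('Hello World');
    });
  }
}
```

Calling cancel() first is optional. It is used here so that repeated clicks restart the speech instead of queueing it several times.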

Find more about autoplay policies on the web.