
Twilio’s communications platform has added over 50 Text-To-Speech (TTS) voices thanks to Amazon's deep learning-based Polly service.
Amazon Polly offers lifelike synthesized speech across various languages.
“You can select the ideal voice and build speech-enabled applications that work in many different countries,” says Amazon on Polly’s page.
Twilio is a cloud-based platform that aims to make it easy for developers to build globally scalable applications with SMS, voice, and messaging capabilities.
It’s easy to see why the two are natural partners.
Until today, Twilio supported only three voices and a limited set of languages. Amazon Polly vastly expands that selection while enabling deeper levels of customisation.
Developers can now manually control the volume, pitch, rate, and pronunciation of the voices that interact with their users. These settings are applied using the Speech Synthesis Markup Language (SSML) standard.
As an example, within the <Say> element, the SSML tag <prosody> can be used to control the speed of the speech. A full list of the SSML tags can be found here.
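As a rough illustration of the <prosody> usage described above, the Python sketch below builds a TwiML response using only the standard library. The function name `say_with_prosody` is hypothetical, and the `rate="slow"` value and the "Polly.Emma" default are assumptions chosen for the example rather than details confirmed by the announcement.

```python
import xml.etree.ElementTree as ET


def say_with_prosody(text, rate="slow", voice="Polly.Emma"):
    """Return a TwiML string whose <Say> wraps the text in an SSML <prosody> tag.

    Hypothetical helper for illustration only; the rate value is an assumption.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"voice": voice})
    # <prosody> sits inside <Say> and carries the speech-rate setting.
    prosody = ET.SubElement(say, "prosody", {"rate": rate})
    prosody.text = text
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(response, encoding="unicode")


twiml = say_with_prosody("Thanks for calling!")
print(twiml)
```

The resulting string would be returned from whatever web endpoint Twilio requests when a call comes in.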
For existing Twilio developers, the default TTS provider can be switched from 'Basic' to 'Amazon Polly' in Twilio Console. Unless specified otherwise in your TwiML code, the default voice and locale selected here will apply.
To override those defaults, set the voice and language attributes on the <Say> element. For example, to use Amazon Polly’s “Emma” voice in UK English, you would use the following TwiML: <Say voice="Polly.Emma" language="en-GB">Thanks for calling!</Say>
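The relationship between the Console defaults and per-call overrides can be sketched in Python with the standard library alone. The helper name `build_say` is hypothetical. It emits the voice and language attributes only when they are supplied, on the assumption that a bare <Say> falls back to the defaults chosen in Twilio Console.

```python
import xml.etree.ElementTree as ET


def build_say(text, voice=None, language=None):
    """Return TwiML for a single <Say>, adding attributes only when given.

    Hypothetical helper: omitting voice/language leaves the choice to the
    defaults configured in Twilio Console.
    """
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    if voice is not None:
        say.set("voice", voice)
    if language is not None:
        say.set("language", language)
    say.text = text
    return ET.tostring(response, encoding="unicode")


# Uses the Console defaults.
print(build_say("Thanks for calling!"))
# Overrides the defaults with Polly's "Emma" voice in UK English.
print(build_say("Thanks for calling!", voice="Polly.Emma", language="en-GB"))
```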
Overall, the integration of Amazon Polly should offer a more natural TTS experience for users of apps which make use of Twilio’s popular communications platform.
Full documentation on TwiML Voice <Say> is available here.
What are your thoughts on the addition of Amazon Polly to Twilio? Let us know in the comments.