Wednesday, February 28

Amazon is Working on Deepfake Voices for Alexa

Amazon’s smart assistant Alexa could get a feature that lets it imitate a person’s voice based on a minute of audio.


Amazon says it is working on a feature that lets Alexa imitate someone’s voice based on a minute of audio. The company will announce this at its re:MARS conference in Las Vegas in the United States. The technology is presented as a way to remember departed loved ones by having texts read aloud in their voices.

Quite apart from the psychological ramifications of that idea and the ethical questions about the rights to a person’s voice, technology that can quickly and easily “deepfake” someone’s voice has other potential consequences. For example, it could be used for fraud such as voice phishing (“vishing”). Imagine a phone call from the “CFO” asking for an immediate payment to a specific account, which in fact belongs to the fraudsters.

To be clear, this is a demo of possible future technology. Computer systems can already imitate voices, but they currently require much more input than a minute of audio.

Amazon showed the feature in a video in which a child asks to hear a story “read” by grandma. It is not clear how far development of the feature has progressed. The demo appears to be based on advances in Amazon’s text-to-speech technology, as explained in this white paper.

