SCIENCE: AI Listens to Voices, Then Generates Speakers’ Faces

Have you ever constructed a mental image of a person you’ve never seen, based solely on their voice? Artificial intelligence (AI) can now do that, generating a digital image of a person’s face using only a brief audio clip for reference.

Named Speech2Face, the neural network (a computer program that “thinks” in a manner loosely similar to the human brain) was trained by scientists on millions of educational videos from the internet that showed over 100,000 different people talking.

From this dataset, Speech2Face learned associations between vocal cues and certain physical features in a human face, researchers wrote in a new study. The AI then used an audio clip to model a photorealistic face matching the voice.

The findings were published online May 23 on the preprint server arXiv and have not been peer-reviewed.

Thankfully, AI doesn’t (yet) know exactly what a specific individual looks like based on their voice alone. The neural network recognized certain markers in speech that pointed to gender, age and ethnicity, features that are shared by many people, the study authors reported.

Full Story From Live Science
