Chris Buchanan: Singing Synthesis

When: Wednesday 2nd October 2019 @ 5:10 PM

Where: The Atrium (G.10), Alison House, 12 Nicholson Sq, University of Edinburgh

Title: Singing Synthesis

Speaker: Chris Buchanan (CereProc, Edinburgh, UK)

Abstract

In the last two years, speech synthesis technology has changed beyond recognition. Being able to create seamless copies of voices is a reality, and the manipulation of voice quality using synthesis techniques can now produce dynamic audio content that is impossible to differentiate from natural spoken output. Speech synthesis is also vocal editing software. It will allow us to create artificial singing that is better than many human singers, graft expressive techniques from one singer to another, and, using analysis-by-synthesis, categorise and evaluate singing far beyond simple pitch estimation. How we apply and interface with this technology in the musical domain is at a cusp. It is the audio engineering community that will, in the end, dictate how these new techniques are incorporated into the music technology of the future. In this talk we approach this expanding field in a modern context, give some examples, delve into the multi-faceted nature of singing user interfaces and the obstacles still to overcome, illustrate a novel avenue of singing modification, and discuss the future trajectory of this powerful technology from Text-to-Speech to music and audio engineering platforms.

Speaker Bio

Chris Buchanan is Audio Development Engineer for CereProc. He graduated from the Acoustics & Music Technology MSc here at the University of Edinburgh in 2016, after 3 years as a signal processing geophysicist at the French seismic imaging company CGG. He also holds a BSc degree in Mathematics from the same university. As a freelancer, Chris has been involved with the core DSP driving technology offered by the likes of Goodhertz and Signum Audio (in their professional dynamics monitoring suite), and has published research on 3D audio in collaboration with the Acoustics & Audio Group here. His research interests focus mainly on structural modelling/synthesis of the human voice and real-time 3D audio synthesis via structural modelling of the Head-Related Transfer Function. More recently he’s taken on the challenge of singing synthesis, helping produce one of the world’s first Text-to-Singing (TTS) systems and thus enabling any voice to sing.