Author Archives: admin_mike

Romain Michon: Embedded Real-Time Audio DSP With the Faust Programming Language

When: Wednesday 6th November 2019 @ 5:10 PM

Where: The Atrium (G.10), Alison House, 12 Nicolson Sq, University of Edinburgh

Title: Embedded Real-Time Audio DSP With the Faust Programming Language

Speaker: Dr Romain Michon (CCRMA, Stanford + GRAME-CNCM, Lyon, France)


Faust is a Domain-Specific programming Language (DSL) for real-time audio Digital Signal Processing (DSP). The Faust compiler can generate code in various lower-level programming languages (e.g., C, C++, LLVM, Rust, WebAssembly) from high-level DSP specifications. Generated code can be embedded in wrappers that add specific features (e.g., MIDI, polyphony, OSC) and turn it into ready-to-use objects (e.g., audio plugins, standalones, mobile apps, web apps). More recently, Faust has been widely used for low-level embedded audio systems programming on targets such as microcontrollers, bare-metal Raspberry Pi, and FPGAs. Optimizations are made for specific processor architectures (e.g., use of intrinsics) but are hidden from the user to keep the programming experience as smooth and easy as possible. After giving a quick introduction to Faust, we’ll present an overview of the work done by the Faust team on embedded systems for audio. We’ll then present ongoing and future projects on this topic.

Speaker Bio

Romain Michon is a full-time researcher at GRAME-CNCM (Lyon, France) and a researcher and lecturer at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University (USA). He has been involved in the development of the Faust programming language since 2008 and is now part of the core Faust development team at GRAME. Besides that, Romain’s research interests include embedded systems for real-time audio processing, Human-Computer Interaction (HCI), New Interfaces for Musical Expression (NIME), and physical modeling of musical instruments.

Li Su: AI and recent developments in music information retrieval

When: Wednesday 9th October 2019 @ 5:10 PM

Where: The Atrium (G.10), Alison House, 12 Nicolson Sq, University of Edinburgh

Title: AI and recent developments in music information retrieval

Speakers: Dr Li Su (Music and Culture Technology Laboratory, Institute of Information Science, Academia Sinica, Taiwan), Dr Yu-Fen Huang (University of Edinburgh), and Tsung-Ping Chen (Music and Culture Technology Laboratory, Institute of Information Science, Academia Sinica, Taiwan)


In this talk, we will discuss how to apply deep learning approaches to several challenging tasks in music information retrieval (MIR), including automatic music transcription, musical body movement analysis, and automatic chord recognition. Automatic music transcription (AMT) refers to the process of converting music recordings to symbolic representations. Since music transcription is by no means easy even for humans, AMT has been one of the core challenges in MIR. Thanks to recent advances in computing power and deep learning, more and more AMT solutions are becoming applicable in real-world cases. In this talk, we will first discuss the issues involved in solving the AMT problem from both signal processing and machine learning perspectives. Then, we will introduce the proposed solutions for transcribing piano music, singing voice, and non-Western music. Possible applications, challenges, and future research directions will also be discussed.

Computational musicology is an appealing scenario in which to apply MIR techniques. In this talk, the potential to perform music analysis using computational and deep learning approaches is discussed. Our recent work analyses musical movement and identifies salient features in orchestral conducting movement using a Recurrent Neural Network (RNN). Our work applying deep learning approaches to model chords and harmony in tonal music will also be introduced.

Speaker Bio

Dr Li Su received the B.S. degree in electronic engineering and mathematics from National Taiwan University in 2008, and the Ph.D. degree in communication engineering from National Taiwan University in 2012. He served as a postdoctoral research fellow in the Center of Information and Technology Innovation, Academia Sinica, from 2012 to 2016. Since 2017, he has served as an Assistant Research Fellow in the Institute of Information Science, Academia Sinica. His research focuses on signal processing and machine learning for music information retrieval. His ongoing projects include automatic music transcription, style transfer, and AI-based music-visual animation. He has been a technical committee member in the International Society for Music Information Retrieval (ISMIR) since 2014.

Dr Yu-Fen Huang is a post-doctoral research fellow at the Music and Culture Technology Laboratory, Institute of Information Science, Academia Sinica, Taiwan. Her Ph.D. research at the University of Edinburgh (UK), carried out with supervisors in Music and in Sport Science, applied biomechanics and motion capture technology to the analysis of orchestral conducting movement. She has an M.A. in Musicology (National Taiwan University) and a B.A. in Music (National Taiwan Normal University). Her current research applies Recurrent Neural Networks to explore the stylistic features of different orchestral conductors’ conducting.

Tsung-Ping Chen is a research assistant at the Music and Culture Technology Laboratory, Institute of Information Science, Academia Sinica, Taiwan. His research applies deep learning approaches to model chords and harmony in tonal music, e.g. automatic chord recognition and chord generation. He has an M.A. in Musicology (National Taiwan University), where his project studied the correlation between music and human physiology. He also has a B.S. in Engineering Science.

Chris Buchanan: Singing Synthesis

When: Wednesday 2nd October 2019 @ 5:10 PM

Where: The Atrium (G.10), Alison House, 12 Nicolson Sq, University of Edinburgh

Title: Singing Synthesis

Speaker: Chris Buchanan (CereProc, Edinburgh, UK)


In the last two years, speech synthesis technology has changed beyond recognition. Being able to create seamless copies of voices is a reality, and the manipulation of voice quality using synthesis techniques can now produce dynamic audio content that is impossible to differentiate from natural spoken output. Speech synthesis is also vocal editing software. It will allow us to create artificial singing that is better than many human singers, graft expressive techniques from one singer to another, and, using analysis-by-synthesis, categorise and evaluate singing far beyond simple pitch estimation. How we apply and interface with this technology in the musical domain is at a cusp. It is the audio engineering community that will, in the end, dictate how these new techniques are incorporated into the music technology of the future. In this talk we approach this expanding field in a modern context, give some examples, delve into the multi-faceted nature of singing user interfaces and the obstacles still to overcome, illustrate a novel avenue of singing modification, and discuss the future trajectory of this powerful technology from Text-to-Speech to music and audio engineering platforms.

Speaker Bio

Chris Buchanan is Audio Development Engineer for CereProc. He graduated in the Acoustics & Music Technology MSc here at the University of Edinburgh in 2016, after 3 years as a signal processing geophysicist at the French seismic imaging company CGG. He also holds a BSc degree in Mathematics from the same university. As a freelancer, Chris has been involved with the core DSP driving technology offered by the likes of Goodhertz, Signum Audio in their professional dynamics monitoring suite, and has published research on 3D audio in collaboration with the Acoustics & Audio Group here. His research interests focus mainly on structural modelling/synthesis of the human voice and real-time 3D audio synthesis via structural modelling of the Head-Related Transfer Function. More recently he’s taken on the challenge of singing synthesis, helping produce one of the world’s first Text-to-Singing (TTS) systems and thus enabling any voice to sing.

Dario D’Orazio: Measuring Room Impulse Responses in Noisy Environments

When: Wednesday 10th April 2019 @ 5:10 PM

Where: The Atrium (G.10), Alison House, 12 Nicolson Sq, University of Edinburgh

Title: Measuring Room Impulse Responses in Noisy Environments

Speaker: Dr Dario D’Orazio (Acoustics Research Group, University of Bologna, Italy)


Measurement techniques allow for the identification of acoustic impulse responses (IRs), ideally free of noise owing to the statistical properties of the excitation signals (MLS, ESS, etc.). In ‘real’ cases the measured IRs may be affected by background noise (hums, impulses, speech…). This lecture will present some practical cases, pointing out strategies to improve the measurement of IRs in noisy environments. These techniques concern the hardware setup of the measurement chain and the post-processing extraction of room criteria. Case studies will be presented briefly, showing how to improve measurements according to ISO 3382-1 (large halls), ISO 3382-2 (classrooms), and ISO 3382-3 (open-plan offices).

Speaker Bio

Dario D’Orazio obtained his M.Sc. degree in Electronic Engineering and his PhD in Applied Acoustics at the University of Bologna, Italy, in 2007 and 2011 respectively. He is currently a senior postdoctoral fellow at the Department of Industrial Engineering at the same university. His research involves room acoustics, material properties, classrooms and open-plan offices. He also works as a part-time acoustic consultant for opera houses (Galli Theatre of Rimini, Duse Theatre of Bologna), auditoria (Le Torri dell’acqua), classrooms (Former faculty of Letters and Philosophy at Bologna University), cinemas (Fulgor in Fellini’s house), and worship spaces (Varignano Churches in Viareggio).

Archontis Politis: Reproducing recorded spatial sound scenes – Parametric and non-parametric approaches

When: Monday 1st April 2019 @ 5:30 PM

Where: Room 4.31/4.33, Informatics Forum, 10 Crichton St, University of Edinburgh

Title: Reproducing recorded spatial sound scenes – Parametric and non-parametric approaches

Speaker: Dr Archontis Politis (Tampere University, Finland)


Spatial sound reproduction methods for recorded sound scenes are an active field of research, in parallel with evolving vision-related or multi-modal technologies that aim to deliver a new generation of immersive multimedia content to the user. In contrast to previous channel-based surround approaches, modern spatial audio requirements demand methods that can handle fully immersive content and are flexible in terms of rendering to various playback systems. This presentation gives an overview of such methods, with a distinction between parametric and non-parametric methods. Non-parametric methods make no assumptions about the sound scene itself, and distribute the recordings to the playback channels based on specifications of the recording setup and the playback system only. Parametric methods additionally assume a model of the sound scene content and aim to adaptively estimate its parameters from the recordings. Some representative approaches from both categories are presented, with emphasis on some of the methods co-developed by the presenter at the Acoustics Lab of Aalto University, Finland.

Speaker Bio

Archontis Politis obtained his M.Eng. degree in Civil Engineering at Aristotle University of Thessaloniki, Greece, and his M.Sc. degree in Sound & Vibration Studies at ISVR, University of Southampton, UK, in 2006 and 2008 respectively. From 2008 to 2010 he worked as a graduate acoustic consultant at Arup Acoustics, Glasgow, UK, and as a researcher in a joint collaboration between Arup Acoustics and the Glasgow School of Art, on interactive auralization of architectural spaces using 3D sound techniques. In 2016 he obtained his doctoral degree on the topic of parametric spatial sound recording and reproduction from Aalto University, Finland. He also completed an internship at the Audio and Acoustics Research Group of Microsoft Research during the summer of 2015. He is currently a post-doctoral researcher at Tampere University, Finland. His research interests include spatial audio technologies, virtual acoustics, array signal processing and acoustic scene analysis.

Kurt Werner: “Boom Like an 808” – Secrets of the TR-808 Bass Drum’s Circuit

When: Wednesday 27th March 2019 @ 5:10 PM

Where: Room 4.31/4.33, Informatics Forum, 10 Crichton St, University of Edinburgh

Title: “Boom Like an 808” – Secrets of the TR-808 Bass Drum’s Circuit

Speaker: Dr Kurt Werner (Queen’s University Belfast, UK)


The Roland TR-808 kick drum is among the most iconic sounds in all of popular music. Analogue drum machines like the TR-808 produce simulacra of percussive sounds using electrical “voice circuits,” whose schematics I treat as a primary text to be read alongside their reception history. I argue that these voice circuits and their schematics are the key to recovering a holistic history of analog drum synthesis. In this seminar, I’ll present a close reading of the TR-808 kick drum’s voice circuit and a study of its conceptual antecedents, highlighting the contributions of hobbyists and hackers, circuit theorists, and commercial instrument designers. This analysis reveals that while some aspects of the TR-808’s voice circuits are unremarkable, other aspects related to time-varying pitch shifts are unique and betray a deep understanding of traditional instrument acoustics. This investigation offers one answer to the question: “Why does the 808 sound so good?!”

Speaker Bio

Dr. Kurt James Werner is a Lecturer in Audio at the Sonic Arts Research Centre (SARC) of Queen’s University Belfast, where he joined the faculty of Arts, English, and Languages in early 2017. As a researcher, he studies theoretical aspects of Wave Digital Filters and other virtual analog topics, computer modeling of circuit-bent instruments, and the history of music technology. As part of his Ph.D. in Computer-Based Music Theory and Acoustics from Stanford University’s Center for Computer Research in Music and Acoustics (CCRMA), he wrote a doctoral dissertation entitled “Virtual Analog Modeling of Audio Circuitry Using Wave Digital Filters.” This proposed a number of new techniques for modeling audio circuitry, greatly expanding the class of circuits that can be modeled using the Wave Digital Filter approach to include circuits with complicated topologies and multiple nonlinear electrical elements. As a composer of electro-acoustic/acousmatic music, his music references elements of chiptunes, musique concrète, circuit bending, algorithmic/generative composition, and breakbeat.

Braxton Boren: Acoustic Simulation of Soundscapes from History

When: Monday 11th March 2019 @ 5:10 PM

Where: Atrium (G.10), Alison House, Nicolson Square

Title: Acoustic Simulation of Soundscapes from History

Speaker: Dr Braxton Boren (American University, USA)


Computational acoustic simulation has been used as a tool to realize unbuilt spaces in the case of architectural design, or purely virtual spaces in the case of video game audio. However, another important application of this technology is the capacity to recreate sounds and soundscapes that no longer exist for historical and musicological research. Most of our historical knowledge is visually oriented – the size, color, or texture of places and people could be recorded in some form for millennia. Sound, by contrast, is transient and quickly decaying, and could not generally be recorded until the 19th century. Because of this, our conception of history is more like a photo album than a movie – the sounds of performance spaces or charismatic speakers are mostly left to our imagination. However, in the past decade, computational acoustic simulation has offered a lens into sounds from the past, allowing us to predict with high accuracy the role of sound in different spaces and historical contexts. This talk will give examples of using acoustic modeling to simulate the influence of changing church acoustics on Western music, focusing especially on the examples of Renaissance Venice and Baroque Leipzig. The talk will also examine the role of acoustics and speech intelligibility in oratory and speeches to large crowds before electronic amplification was available, focusing on the examples of George Whitefield in 18th century London and Julius Caesar during the Roman Civil War.

Speaker Bio

Braxton Boren is Assistant Professor of Audio Technology at American University, where he joined the faculty in Fall 2017. He received a BA in Music Technology from Northwestern University, where he was the valedictorian of the Bienen School of Music in 2008. He was awarded a Gates Cambridge Scholarship to attend the University of Cambridge to research computational acoustic simulation, where he earned his MPhil in Physics in 2010. He completed his Ph.D. in Music Technology at MARL, the Music and Audio Research Laboratory at New York University, in 2014. He worked as a postdoctoral researcher in spatial audio over headphones at Princeton University’s 3D Audio and Applied Acoustics Laboratory from 2014 to 2016. He taught high school Geometry from 2016 to 2017 in Bedford-Stuyvesant, NY.

Jens Ahrens: Current Trends in Binaural Auralization of Microphone Array Recordings

When: Wednesday 9th January 2019 @ 5:10 PM

Where: Room 4.31/4.33, Informatics Forum, 10 Crichton Street

Title: Current Trends in Binaural Auralization of Microphone Array Recordings

Speaker: Dr Jens Ahrens (Chalmers University of Technology, Sweden)


Many approaches for the capture and auralization of real acoustic spaces have been proposed over the past century. Limited spatial resolution on the capture side has typically been the factor that forced compromises in the achievable authenticity of the auralization. Recent advancements in the field of microphone arrays provide new perspectives, particularly for headphone-based auralization. It has been shown that head-tracked binaural auralization of the data captured by a bowling-ball-sized spherical array of around 90 microphones allows for creating signals at the ears of the listener that are perceptually almost indistinguishable from the ear signals that arise in the original space. Promising results have also been obtained with smaller arrays with fewer microphones. In the present talk, we provide an overview of the current activities in the research community and demonstrate the latest advancements and remaining challenges.

Speaker Bio

Jens Ahrens has been an Associate Professor and head of the Audio Technology Group within the Division of Applied Acoustics at Chalmers since 2016. He has also been a Visiting Professor at the Applied Psychoacoustics Lab at the University of Huddersfield, UK, since 2018. Jens received his Diploma (equivalent to an M.Sc.) in Electrical Engineering/Sound Engineering jointly from Graz University of Technology and the University of Music and Dramatic Arts, Graz, Austria, in 2005. He completed his Doctoral Degree (Dr.-Ing.) at the Technische Universität Berlin, Germany, in 2010. From 2006 to 2011, he was a member of the Audio Technology Group at Deutsche Telekom Laboratories / TU Berlin, where he worked on the topic of sound field synthesis. From 2011 to 2013, Jens was a Postdoctoral Researcher at Microsoft Research in Redmond, Washington, USA. Thereafter, he re-joined the Quality and Usability Lab at the Technische Universität Berlin. In the fall and winter terms of 2015/16, he was a Visiting Scholar at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University, California, USA.

Augusto Sarti: Plenacoustic Capturing and Rendering – A New Paradigm for Immersive Audio

When: Thursday 29th November 2018 @ 5:10 PM

Where: Atrium, Alison House, 12 Nicolson Square

Title: Plenacoustic Capturing and Rendering – A New Paradigm for Immersive Audio

Speaker: Augusto Sarti (Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano, Italy)


Acoustic signal processing traditionally relies on divide-and-conquer strategies (e.g. Plane-Wave Decomposition) applied to descriptors such as the acoustic pressure, which are scalar functions of space and time, hence the name “Space-Time Audio Processing”. A plethora of applications have been developed, which use arrays or spatial distributions of microphones and/or loudspeakers, or clusters thereof. The typical applications are localizing/tracking, characterizing and extracting acoustic sources, as well as estimating, processing and rendering acoustic fields. Such applications, however, are often hindered by the inherent limits of geometrical acoustics, by the far-field hypothesis of Fourier acoustics, and by the adverse acoustics of everyday environments.

In this talk I will discuss viable alternatives to traditional approaches to space-time processing of acoustic fields, based on alternate formulations and representations of soundfields. I will first discuss how the geometric and acoustic properties of the environment’s reflectors can be estimated and even used for boosting space-time audio processing algorithms. I will then introduce a soundfield representation that uses descriptors defined in the so-called “ray space” and show how this can lead to applications such as interactive soundfield modeling, processing and rendering. I will finally discuss how to rethink our signal decomposition strategy by introducing a novel wave-field decomposition methodology based on Gabor frames, which is more suitable for local (in the space-time domain) representations. Based on this new framework for computational acoustics, I will introduce the ray-space transform and show how it can be used for efficiently and effectively approaching a far wider range of problems than source separation and extraction, pushing the boundaries of environment inference, object-based manipulation of acoustic wavefields, interactive acoustics, and more.

Speaker Bio

Ph.D. in information engineering (1993) from the University of Padua, Italy, with a joint program with UC Berkeley, focusing on nonlinear system theory. With the Politecnico di Milano since 1993, where he is now a full professor. Double affiliation with UC Davis from 2013 to 2017, as a full professor. Research interests in the area of digital signal processing, with focus on space-time audio processing, sound analysis, synthesis and processing, image analysis, and 3D vision. Main contributions in the area of sound synthesis (nonlinear wave digital filters); of space-time audio processing (plenacoustic processing, visibility-based interactive acoustic modeling, geometry-based acoustic scene reconstruction; soundfield rendering, etc.); nonlinear system theory (Volterra system inversion, Wave Digital circuit simulation); computer vision; image analysis and processing. Senior Member of IEEE, Senior Area Editor of IEEE Signal Processing Letters, Associate Editor of IEEE Tr. on Audio Speech and Language Processing. Member elect of the IEEE Audio and Acoustic Signal Processing TC, founding member of the European Acoustics Association (EAA) TC on Audio Signal Processing. Chairman of the EURASIP Special Area Team of Acoustic, Sound and Music Signal Processing. Member elect of the Board of Directors of EURASIP.

Nick Collins: Musical Machine Listening in the Web Browser

When: Wednesday 14th November 2018 @ 5:10 PM

Where: Atrium, Alison House, 12 Nicolson Square

Title: Musical Machine Listening in the Web Browser

Speaker: Prof Nick Collins (Durham University)


In this seminar, experiments so far with Web Audio API-based live audio analysis will be demoed and discussed. The new Musical Machine Listening Library (MMLL) for JavaScript will be introduced, as well as the current MIMIC (Musically Intelligent Machines Interacting Creatively) joint AHRC-funded project between Goldsmiths, Sussex and Durham universities. A minimal code starting point for machine listening work in the browser will be explained, and I will demonstrate some more involved experiments in browser-based auditory modelling, onset detection, beat-tracking-based audio cutting and the like.

Speaker Bio

Nick Collins is a Professor in the Durham University Music Department with strong interests in artificial intelligence techniques applied within music, the computer and programming languages as musical instrument, and the history and practice of electronic music. He is a frequent international performer as composer-programmer-pianist or codiscian, from algoraves to electronic chamber music. Many research papers and much code and music are available from