Score following for expressive piano performance

What is the problem?

Score following is the problem of designing a computational system that can follow a musical performance in real time. It has many applications, including automatic page turning, automatic accompaniment, and lyrics display. One challenge in score following for piano music is the sustained effect, i.e., the waveform of a note lasts longer than its notated length in the score. This can be caused by expressive performing styles, such as legato articulation and the use of the sustain and sostenuto pedals, and also by reverberation in the recording environment. The effect creates non-notated overlaps between sustained notes and later notes in the audio. It decreases the audio-score alignment accuracy and robustness of score following systems and makes them prone to delay errors, i.e., aligning the audio to a score position that is earlier than the correct position.

What is our approach?

In this project, we propose to modify the feature representation of the audio to attenuate the sustained effect. We first apply onset detection and treat all frames within a region immediately after an onset as frames that potentially contain the sustained effect. We then analyze the spectra of the signal in these frames and detect spectral components that are considered extensions of previous notes. We reduce the energy of these components in the audio representation to attenuate the sustained effect. We show that this idea can be applied to both the chromagram and the spectral-peak representations, which are commonly used in score following systems.
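The steps above can be illustrated with a minimal Python sketch. It is not the implementation described in our publications: the function name `attenuate_sustained`, the parameters `region`, `presence`, and `gain`, and the rule used to decide that a component is an extension of a previous note (it was active in the frame just before the onset and its energy has not risen since) are all assumptions made for illustration. Onset frames are assumed to come from a separate onset detector.

```python
import numpy as np

def attenuate_sustained(features, onset_frames, region=4, presence=0.2, gain=0.1):
    """Attenuate components that appear to be carried over from earlier notes.

    features     : (n_bins, n_frames) non-negative array, e.g. a chromagram
    onset_frames : frame indices of detected onsets (from any onset detector)
    region       : number of frames after each onset to inspect (assumed value)
    presence     : a bin counts as active before the onset if its energy is at
                   least this fraction of the pre-onset maximum (assumed value)
    gain         : scaling applied to bins judged to be sustained (assumed value)
    """
    features = np.asarray(features, dtype=float)
    out = features.copy()
    n_frames = features.shape[1]

    for onset in onset_frames:
        if onset < 1 or onset >= n_frames:
            continue  # no earlier frame to compare against
        prev = features[:, onset - 1]
        if prev.max() <= 0:
            continue  # silence before the onset, so nothing can be sustained
        active_before = prev >= presence * prev.max()

        for t in range(onset, min(onset + region, n_frames)):
            # A re-struck note gains energy; a sustained one does not.
            not_restruck = features[:, t] <= prev
            mask = active_before & not_restruck
            # Assign from the original values so that overlapping regions
            # never attenuate the same bin twice.
            out[mask, t] = gain * features[mask, t]
    return out

# Toy example: bin 0 is held and decays, bin 1 is a new note at frame 3.
toy = np.array([
    [1.0, 0.9, 0.8, 0.7, 0.6, 0.5],
    [0.0, 0.0, 0.0, 1.0, 0.9, 0.8],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
])
cleaned = attenuate_sustained(toy, onset_frames=[3], region=2)
```

In the toy example, the held bin is scaled down in the two frames after the onset, while the newly struck bin and all frames outside the region are left unchanged. The same function can be applied to a spectral-peak representation as long as it is arranged as a bins-by-frames array.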

Our Results

Experiments on the MAPS dataset show that the proposed method significantly improves the alignment accuracy and robustness of state-of-the-art score following systems for piano performances, in both anechoic and highly reverberant environments. More details are presented in the publications listed below.

Android Implementation for a Page Turner

We are developing an Android page-turner app based on the score following approaches described above. The app will follow a live piano performance in real time and display the corresponding sections of the score, regardless of changes in the performer's tempo and dynamics. Check back later to download the system.

Fig. 1. Statistics of two causes of the sustained effect. (a) Distribution of the 60 acoustic pieces in the MAPS dataset according to the degree of pedal usage. (b) Reverberation time (RT60) of three famous concert halls.

Fig. 2. Illustration of the sustained effect and the delay error. The grey notes extend in the audio waveform beyond their notated length in the score.


Bochen Li, Zhiyao Duan, An Approach to Score Following for Piano Performances with the Sustained Effect, IEEE Transactions on Audio, Speech, and Language Processing, accepted.

Bochen Li, Zhiyao Duan, Score following for piano performances with sustain-pedal effects, in Proc. International Society for Music Information Retrieval Conference (ISMIR), 2015.