Abstract
In recent years, much of contemporary Music Information Research (MIR) on symbolic music modelling has shown a growing interest in large-scale datasets and high-capacity deep-learning architectures. However, many real-world use cases built around sheet music encoding would still benefit from novel methods of scrutinising small, specialised, domain-specific datasets.
For example, pedagogical exercises that help instrumental learners to develop sight-reading skills often result from meticulous, musician-led curatorial processes. These curated collections of short-form sheet music examples intrinsically embed a progressive complexity that is challenging to formalise in conventional computational terms. They encompass intricate yet interconnected layers of rhythmic, harmonic, musicality, and bodily coordination features. It is therefore timely to study them through novel MIR methods, leveraging a localised small-data approach to help understand how sight-reading and playing difficulty is encoded through notation and through limited-size, score-based specimens.
This submission introduces a compact encoder–decoder pipeline designed to interface directly with MEI/MusicXML data and to provide a structured event representation suitable for machine learning on small datasets. The pipeline couples a hybrid encoding strategy with a lightweight Long Short-Term Memory (LSTM) architecture, enabling versatile feature extraction, analysis, modelling, and generation of piano sight-reading examples. Our goal is to demonstrate how carefully designed encodings can help understand sight-reading difficulty not only as a metadata label, but as a learned framework benchmarked against strict technical and musicality criteria.
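As a rough illustration of what a "structured event representation" derived from MusicXML might look like, the sketch below flattens a score into per-note event tuples using only the Python standard library. It is not the encoding proposed in this submission: the function name `musicxml_to_events`, the tuple layout `(measure, staff, onset, duration, midi_pitch)`, and the `SAMPLE` fragment are all assumptions made for the example. It handles uncompressed, partwise MusicXML only (not MEI) and assumes integer `<duration>` values.

```python
import xml.etree.ElementTree as ET

# Semitone offsets of the natural pitch classes above C.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Hypothetical one-measure, two-staff piano fragment (illustrative only).
SAMPLE = """
<score-partwise><part id="P1"><measure number="1">
  <note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration><staff>1</staff></note>
  <note><chord/><pitch><step>E</step><octave>4</octave></pitch><duration>1</duration><staff>1</staff></note>
  <note><rest/><duration>1</duration><staff>1</staff></note>
  <backup><duration>2</duration></backup>
  <note><pitch><step>F</step><alter>1</alter><octave>2</octave></pitch><duration>2</duration><staff>2</staff></note>
</measure></part></score-partwise>
"""


def musicxml_to_events(xml_text):
    """Return a list of (measure, staff, onset, duration, midi_pitch) tuples.

    Onsets and durations are in MusicXML divisions, relative to the start of
    the measure. Rests (and any note without a <pitch>) get midi_pitch None.
    """
    root = ET.fromstring(xml_text)
    events = []
    for part in root.iter("part"):
        for measure in part.findall("measure"):
            number = measure.get("number")
            cursor = 0      # current time position within the measure
            last_onset = 0  # onset of the most recent non-chord note
            for el in measure:
                dur = int(el.findtext("duration", default="0"))
                if el.tag == "backup":
                    cursor -= dur  # rewind, e.g. to write the other staff
                elif el.tag == "forward":
                    cursor += dur
                elif el.tag == "note":
                    # A <chord/> child means "sounds with the previous note".
                    is_chord = el.find("chord") is not None
                    onset = last_onset if is_chord else cursor
                    pitch = el.find("pitch")
                    if pitch is None:
                        midi = None
                    else:
                        step = pitch.findtext("step")
                        alter = int(float(pitch.findtext("alter", default="0")))
                        octave = int(pitch.findtext("octave"))
                        midi = 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter
                    staff = int(el.findtext("staff", default="1"))
                    events.append((number, staff, onset, dur, midi))
                    if not is_chord:
                        last_onset = onset
                        cursor += dur
    return events


events = musicxml_to_events(SAMPLE)
```

A flat sequence like this could then be tokenised and fed to a sequence model such as an LSTM; how the submission's hybrid encoding actually structures its events is not specified in the abstract.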
| Original language | English |
|---|---|
| Title of host publication | Music Encoding Conference (MEC) 2026 |
| Publication status | E-pub ahead of print - May 2026 |
| Event | Music Encoding Conference 2026, Tokyo University of Science, Tokyo, Japan. Duration: 26 May 2026 → 29 May 2026. https://music-encoding.org/conference/2026/ |
Conference
| Conference | Music Encoding Conference 2026 |
|---|---|
| Abbreviated title | MEC |
| Country/Territory | Japan |
| City | Tokyo |
| Period | 26/05/26 → 29/05/26 |
| Internet address | https://music-encoding.org/conference/2026/ |
AI in/through Creative Practice and Art-Science Collaboration
Ma, B. (Speaker / presenter)
22 May 2026. Activity: Talk, presentation, and live performance › Invited talk
Understanding Piano Sight-Reading Exercises: A Score-based Encoder-Decoder LSTM Framework
Ma, B. (Speaker / presenter), Fan, H. (Collaborator), Howard, E. (Director), De Roure, D. (Director) & Thormählen, W. (Director)
May 2026. Activity: Talk, presentation, and live performance › Oral presentation