In recent years, we have witnessed the creation of large digital music collections, accessible, for example, via streaming services. Efficient retrieval from such collections, which goes beyond simple text searches, requires automated music analysis methods. Creating such methods is a central part of the research area Music Information Retrieval (MIR). In this thesis, we propose, explore, and analyze novel data-driven approaches for the two MIR analysis tasks of tempo and key estimation for music recordings. Tempo estimation is often defined as determining the number of times a person would “tap” per time interval when listening to music. Key estimation labels a music recording with a chord name describing its tonal center, e.g., C major. Both tasks are well established in MIR research. To improve tempo estimation, we focus mainly on shortcomings of existing approaches, particularly estimates on the wrong metrical level, known as octave errors. We first propose novel methods using digital signal processing and traditional feature engineering. We then re-formulate the signal-processing pipeline as a deep computational graph with trainable weights. This allows us to take a purely data-driven approach using supervised machine learning (ML) with convolutional neural networks (CNNs).

We describe a computer program which is able to estimate the tempo and the times of musical beats in expressively performed music. The input data may be either digital audio or a symbolic representation of music such as MIDI. The data is processed off-line to detect the salient rhythmic events, and the timing of these events is analysed to generate hypotheses of the tempo at various metrical levels. Based on these tempo hypotheses, a multiple-hypothesis search finds the sequence of beat times which has the best fit to the rhythmic events. We show that estimating the perceptual salience of rhythmic events significantly improves the results. No prior knowledge of the tempo, meter, or musical style is assumed; all required information is derived from the data. Results are presented for a range of different musical styles, including classical, jazz, and popular works with a variety of tempi and meters. The system calculates the tempo correctly in most cases, the most common error being a doubling or halving of the tempo. The calculation of beat times is also robust. When errors are made concerning the phase of the beat, the system recovers quickly and resumes correct beat tracking, despite the fact that there is no high-level musical knowledge encoded in the system.
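To make the octave errors mentioned in the first abstract concrete: a tempo estimate of 60 BPM for a 120 BPM track is “wrong” only in metrical level. Tempo estimation research commonly evaluates with two tolerance-based metrics, often called Accuracy1 and Accuracy2, the second of which forgives such errors. The following is a minimal sketch of those metrics, not code from the thesis; the 4% tolerance is the conventional value, and the function names are illustrative.

```python
# Minimal sketch of the tolerance-based accuracy metrics commonly used in
# tempo estimation evaluation. Accuracy1 counts an estimate as correct if it
# lies within 4% of the reference tempo; Accuracy2 additionally accepts
# estimates on the "wrong" metrical level, i.e. off by a factor of 2, 3,
# 1/2, or 1/3 (octave errors).

def accuracy1(estimate_bpm: float, reference_bpm: float, tol: float = 0.04) -> bool:
    """True if the estimate is within `tol` (default 4%) of the reference."""
    return abs(estimate_bpm - reference_bpm) <= tol * reference_bpm

def accuracy2(estimate_bpm: float, reference_bpm: float, tol: float = 0.04) -> bool:
    """Like accuracy1, but also forgives common octave errors."""
    factors = (1.0, 2.0, 3.0, 1.0 / 2.0, 1.0 / 3.0)
    return any(accuracy1(estimate_bpm, reference_bpm * f, tol) for f in factors)

# Example: a 60 BPM estimate for a 120 BPM track is an octave error.
assert not accuracy1(60.0, 120.0)  # rejected by the strict metric
assert accuracy2(60.0, 120.0)      # accepted once octave errors are forgiven
```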
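The second abstract outlines a pipeline of onset detection, inter-onset-interval (IOI) analysis, and multiple-hypothesis beat search. The sketch below illustrates only the middle stage in a simplified form: IOIs between detected events are clustered, and each sufficiently populated cluster yields a tempo hypothesis on some metrical level. This is an assumption-laden illustration, not the described program; the onset times, cluster width, and interval bounds are hypothetical example values.

```python
# Simplified illustration of tempo-hypothesis generation by clustering
# inter-onset intervals (IOIs). Each cluster mean is read as a candidate
# inter-beat interval, i.e. a tempo hypothesis on some metrical level.

def tempo_hypotheses(onsets: list[float], width: float = 0.025) -> list[float]:
    """Cluster IOIs between onset times (s) and return tempo hypotheses (BPM)."""
    # Collect IOIs between all pairs of onsets, not only adjacent ones, so
    # that longer metrical levels (e.g. the half note or bar) also form clusters.
    iois = sorted(
        onsets[j] - onsets[i]
        for i in range(len(onsets))
        for j in range(i + 1, len(onsets))
        if 0.1 <= onsets[j] - onsets[i] <= 2.5  # plausible beat periods (s)
    )
    clusters: list[list[float]] = []
    for ioi in iois:
        # Greedy agglomeration over the sorted IOIs: extend the current
        # cluster while successive intervals stay within `width` seconds.
        if clusters and ioi - clusters[-1][-1] <= width:
            clusters[-1].append(ioi)
        else:
            clusters.append([ioi])
    # Keep clusters with enough support; convert mean period to BPM.
    return [60.0 / (sum(c) / len(c)) for c in clusters if len(c) >= 3]

# Hypothetical onset times (s) of a steady 120 BPM pulse with slight jitter:
onsets = [0.00, 0.51, 1.00, 1.49, 2.01, 2.50, 3.00]
print(tempo_hypotheses(onsets))  # hypotheses near 120 BPM and lower metrical levels
```

Run on the example onsets, this yields hypotheses near 120, 60, 40, and 30 BPM, which is exactly why a subsequent search over multiple hypotheses, as the abstract describes, is needed to pick the perceptually correct level.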