BaDumTss: Multi-task Learning for Beatbox Transcription

Book chapter, peer-reviewed

2022; Springer Science+Business Media; Language: English

10.1007/978-3-031-05981-0_14

ISSN

1611-3349

Authors

Priya Mehta, Meet Maheshwari, Brihi Joshi, Tanmoy Chakraborty

Topic(s)

Music Technology and Sound Studies

Abstract

Transcribing audio into symbolic notation is a well-known problem in music information retrieval. In this work, we explore a novel task: automatic music transcription for Beatbox sounds, also known as vocal percussion. Because Beatbox sounds cannot be created synthetically, they inherently vary both within the same speaker and across different speakers. To address this, we propose BaDumTss, which applies a pretraining strategy over a novel sequence traversal method, making it robust and efficient on new Beatbox sequences. Furthermore, BaDumTss is agnostic to time-based stretches and warps, as well as to amplitude changes in the Beatbox sequence. It predicts both onsets and frame-set in a multi-task manner, achieving relative improvements of 56% and 326% in frame-set and onset-level F1 scores, respectively, over the best-performing baseline. We also release an annotated dataset of monophonic Beatbox sequences with their corresponding MIDI labels, the first of its kind, comprising Beatbox samples with variations such as time-stretches, pitch shifts, and added noise.
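For readers unfamiliar with the metric, the following is a minimal sketch of how a relative improvement in F1 score is usually computed. It is not taken from the chapter: the function names `f1_score` and `relative_improvement` are hypothetical, and the counts are made-up illustrative values, not the reported results.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def relative_improvement(new: float, baseline: float) -> float:
    """Gain of `new` over `baseline`, as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (new - baseline) / baseline


# Hypothetical counts for illustration only (NOT results from the chapter).
baseline_f1 = f1_score(tp=10, fp=45, fn=45)  # weak baseline
model_f1 = f1_score(tp=40, fp=10, fn=10)     # stronger model
gain = relative_improvement(model_f1, baseline_f1)
print(f"baseline={baseline_f1:.3f} model={model_f1:.3f} gain={gain:.0f}%")
```

Under this definition, a relative improvement above 100% means the score more than doubled. A large relative gain can therefore reflect a low baseline as much as a high absolute score.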
