Next Friday, the 5th, at 10 a.m., in room 2077 of the Instituto de Ciências Exatas at UFMG, the Programa de Pós-Graduação em Ciência da Computação (PPGCC) at UFMG will host, as part of the “Seminários da Pós” series, a talk titled “Ambiguity and Invariance in Machine Listening” by David Perera, a postdoctoral researcher in the Departamento de Ciência da Computação (DCC) supervised by DCC professor Haniel Moreira Barbosa. The talk is open to the interested public and does not require prior registration.
Abstract: Machine listening is a growing field with several important industrial applications, such as security (e.g., audio surveillance), manufacturing (e.g., predictive maintenance) and bioacoustics (e.g., ecosystem evolution tracking). It covers a wide array of tasks of practical interest, including audio captioning, sound source localization and speech separation. Many tasks in this field are ambiguous (i.e., they are supervised tasks where the relation between input and target is non-deterministic). Ambiguous tasks challenge the use of single-prediction neural networks, which are customary in the supervised setting.
To mitigate ambiguity, we use Multiple Choice Learning (MCL), a framework that trains a multi-head neural network to provide a small set of plausible and diverse predictions, using a competitive scheme that promotes the specialization of the predictions in distinct regions of the prediction space. We focus on two extensions of this method, which illustrate its flexibility and demonstrate its effectiveness for machine listening, from both the empirical and the theoretical points of view. First, we show how simulated annealing, an optimization procedure inspired by the cooling of materials, can be leveraged to guide the trained network closer to an optimal solution of the MCL objective. Second, we show how to extend the discrete predictions of MCL to build a non-sparse estimator of the target probability distribution that has strong asymptotic convergence guarantees. Through these examples, we demonstrate that MCL is a powerful and versatile training scheme, which offers a wide array of applications and features interesting connections with the literature.
Keywords: multiple choice learning, machine listening.
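For readers unfamiliar with the competitive scheme mentioned in the abstract, it is commonly formulated as a winner-takes-all loss: for each sample, only the head whose prediction is closest to the target contributes to the loss. The sketch below is a minimal illustration under stated assumptions, not the speaker's implementation. The function name `wta_loss`, the use of squared error as the distance, and the array shapes are all hypothetical choices made for the example.

```python
import numpy as np


def wta_loss(predictions, targets):
    """Winner-takes-all loss for a multi-head predictor (illustrative sketch).

    predictions: array of shape (batch, heads, dim), one prediction per head.
    targets:     array of shape (batch, dim), one target per sample.
    Returns the mean loss of the winning heads and the winner indices.
    """
    # Squared error of every head against its sample's target: (batch, heads).
    # Squared error is an assumption; any per-head loss could be used here.
    errors = np.sum((predictions - targets[:, None, :]) ** 2, axis=-1)

    # The "winner" for each sample is the head with the smallest error.
    winners = np.argmin(errors, axis=1)

    # Only the winning head's error counts, so in gradient-based training
    # only that head would be updated for that sample.
    loss = float(np.mean(errors[np.arange(len(targets)), winners]))
    return loss, winners


if __name__ == "__main__":
    preds = np.array([[[0.0, 0.0], [1.0, 1.0]],
                      [[2.0, 2.0], [5.0, 5.0]]])
    tgts = np.array([[1.0, 1.0], [2.0, 3.0]])
    print(wta_loss(preds, tgts))
```

Because each sample penalizes only its closest head, different heads tend to drift toward different regions of the prediction space, which is the specialization effect the abstract describes.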