Acoustic Sound Recognition (ASR) Techniques for Pervasive Applications

Pervasive Technologies
Thesis Code: 16020

Thesis Type: 6-month Master's thesis (Laurea Magistrale) for students of Computer Engineering, Communications and ICT Engineering, Mathematical Engineering, or equivalent.

Requirements:
- Good programming skills in at least one of the following languages: C/C++, Java, Python
- Interest in working on pattern recognition, machine learning, and speech and language processing

Description:
Motivation:
Acoustic Sound Recognition (ASR) has made significant progress in both research and application domains. The state of the art in ASR shows broad adoption of Hidden Markov Models combined with Gaussian Mixture Models (HMM-GMM). At the same time, Deep Learning (DL) has achieved high accuracy on large-vocabulary tasks, which indicates a new trend in ASR. ASR can play an important role in Internet of Things applications, as it eases interaction between humans and machines. Many applications and services provide speech recognition, such as Google Now, Apple Siri, Microsoft Cortana, and Amazon Echo. However, these applications are mostly designed for environments where an Internet connection is available at all times. To build a personal ASR system that works offline or in a mixed mode, it is necessary to understand the performance of off-the-shelf algorithms and to implement the appropriate ones, taking the constrained runtime environment into account.

Objective:
The goal of this thesis is to analyze the available well-established and emerging ASR methodologies. Based on this initial analysis, the student will take part in selecting and extending specific ASR techniques suitable for challenging environments (noisy environments, multiple audio sources, etc.) and applications (e.g. recognition of speaker location, integration with intelligent environments). Priority will be given to solutions that support disconnected operation and run on constrained, low-cost devices, possibly using open-source processing libraries and tools.

Contacts: send a resume specifying the thesis code and title to pert@ismb.it.