Semi-Autonomous Data Enrichment and Optimisation for Intelligent Speech Analysis

by Zixing Zhang

EUR 72.00

Description

Intelligent Speech Analysis (ISA) plays an essential role in smart conversational agent systems that aim to enable natural, intuitive, and friendly human-computer interaction. It includes not only the long-established field of Automatic Speech Recognition (ASR), but also the young field of Computational Paralinguistics, which has attracted increasing attention in recent years. In real-world applications, however, several challenging issues surrounding data quantity and quality arise. For example, the predefined databases for most paralinguistic tasks are small and few in number, and are therefore insufficient for building a robust model. A distributed structure could be useful for data collection, but the original feature sets are typically too large to meet physical transmission constraints such as limited bandwidth. Furthermore, in a hands-free application scenario, reverberation severely distorts speech signals, which degrades recogniser performance. To address these issues, this thesis proposes and analyses semi-autonomous data enrichment and optimisation approaches. More precisely, for the representative paralinguistic task of speech emotion recognition, both labelled and unlabelled data from heterogeneous resources are exploited by means of data pooling, data selection, confidence-based semi-supervised learning, active learning, and cooperative learning. As a result, the manual work required for data annotation is greatly reduced. Building on advances in networking and information technologies, this thesis extends the traditional ISA system into a modern distributed paradigm, in which Split Vector Quantisation is employed for feature compression. Moreover, for distant-talk ASR, Long Short-Term Memory (LSTM) recurrent neural networks, which are known to be well suited to context-sensitive pattern recognition, are evaluated for mitigating reverberation. The experimental results demonstrate that the proposed LSTM-based feature enhancement frameworks outperform the current state-of-the-art methods.
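
For readers unfamiliar with the feature-compression step named in the description, the general idea of split vector quantisation can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the sub-vector sizes, the codebook size of 16, the k-means training, and the random stand-in features are all assumptions chosen for the example. Each feature vector is cut into sub-vectors, each sub-vector is replaced by the index of its nearest codeword, and only the indices would need to be transmitted.

```python
import math
import random


def nearest(codebook, v):
    """Index of the codeword closest to v in Euclidean distance."""
    return min(range(len(codebook)), key=lambda i: math.dist(codebook[i], v))


def train_codebook(vectors, size, iterations=10, seed=0):
    """Fit a small codebook with plain k-means (an assumed training method)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        clusters = [[] for _ in range(size)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def split(vector, sizes):
    """Cut a feature vector into consecutive sub-vectors of the given sizes."""
    parts, start = [], 0
    for n in sizes:
        parts.append(vector[start:start + n])
        start += n
    return parts


def encode(vector, codebooks, sizes):
    """Replace each sub-vector by the index of its nearest codeword."""
    return [nearest(cb, part) for cb, part in zip(codebooks, split(vector, sizes))]


def decode(indices, codebooks):
    """Rebuild an approximate feature vector from the codeword indices."""
    restored = []
    for cb, i in zip(codebooks, indices):
        restored.extend(cb[i])
    return restored


# Hypothetical data: 200 random 6-dimensional "feature vectors".
rng = random.Random(1)
features = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
sizes = [2, 2, 2]  # assumed split into three 2-dimensional sub-vectors

# One independent codebook per sub-vector position.
codebooks = [
    train_codebook([split(f, sizes)[k] for f in features], size=16)
    for k in range(len(sizes))
]

indices = encode(features[0], codebooks, sizes)  # three small integers
restored = decode(indices, codebooks)            # lossy reconstruction
```

With these assumed settings, each 6-dimensional vector is represented by three indices of 4 bits each, at the cost of some reconstruction error. Training one small codebook per sub-vector keeps the codebooks far smaller than a single codebook over the full vector would need to be.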

Author

Zixing Zhang

Topics in "Semi-Autonomous Data Enrichment and Optimisation for Intelligent Speech Analysis"

Data Enrichment, Intelligent Speech Analysis, Optimisation

Details

ISBN: 9783843921480
Publisher: Dr. Hut
Published: 8 July 2015
