This work focuses on the pattern recognition aspects of computer-assisted
pronunciation training (CAPT) for second language learning.
An overview of commercial systems shows that the growing field of
computer-assisted language learning addresses pronunciation training
only to a small extent, although the state-of-the-art section presents
a number of existing approaches to automatic assessment.
In the present thesis, different approaches are extended and combined.
In particular, a large set of nearly 200 pronunciation and prosodic features
is developed. With this approach, pronunciation scoring is treated as a
classification task in a high-dimensional feature space.
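
To illustrate what treating pronunciation scoring as classification in a feature space can look like, the following is a minimal sketch, not the classifier used in the thesis. It assumes a nearest-centroid rule, three hypothetical features per utterance instead of nearly 200, and the hypothetical labels "correct" and "mispronounced"; all feature values are invented toy data.

```python
from math import dist


def train_centroids(samples):
    """Compute one mean feature vector (centroid) per label.

    samples: list of (feature_vector, label) pairs; all vectors
    must have the same length.
    """
    sums, counts = {}, {}
    for vec, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in sums[label]]
        for label in sums
    }


def classify(centroids, vec):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Hypothetical toy data: each vector stands in for the pronunciation
# and prosodic features extracted from one utterance.
training = [
    ([0.9, 0.8, 0.7], "correct"),
    ([0.8, 0.9, 0.9], "correct"),
    ([0.2, 0.1, 0.3], "mispronounced"),
    ([0.1, 0.3, 0.2], "mispronounced"),
]

centroids = train_centroids(training)
prediction = classify(centroids, [0.85, 0.75, 0.8])
```

With real data, each utterance would be represented by the full feature vector, and any classifier suited to high-dimensional input could replace the nearest-centroid rule.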
Automatic speech recognition is the basis of most pronunciation scoring
algorithms. This thesis presents a system that supports second
language learning at school, i.e. the target users are children. For this
reason, a state-of-the-art speech recognition engine is adapted to children's
speech, since young speakers are recognised poorly by automatic systems.
Phonetically motivated rules for typical mispronunciations are integrated
into the system to make it suitable for pronunciation scoring.
Evaluating an algorithm for pronunciation assessment is more difficult
than simply counting the correctly recognised mistakes, since no objective
ground truth exists. This is shown by evaluating the annotations of
14 teachers. Nevertheless, several different measures verify that the
accuracy of the system (in comparison with the teachers) fully reaches the
level of agreement among the teachers themselves. The evaluation is conducted
with native German speakers learning English.
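
The comparison described above, system-to-teacher agreement measured against teacher-to-teacher agreement, can be sketched as follows. This is only an illustration under stated assumptions: Cohen's kappa is chosen here as one possible agreement measure (the abstract does not name the measures used), the labels are binary (1 = mispronounced, 0 = acceptable), and the three teachers and eight words are invented toy data rather than the 14 annotators of the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Chance-corrected agreement between two label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(raters):
    """Average kappa over all pairs of human raters."""
    pairs = list(combinations(raters, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def mean_kappa_against(system, raters):
    """Average kappa between the system and each human rater."""
    return sum(cohen_kappa(system, r) for r in raters) / len(raters)


# Hypothetical annotations: one list per teacher, one entry per word.
teachers = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 1, 0],
]
system = [1, 0, 1, 1, 0, 0, 1, 1]

inter_teacher = mean_pairwise_kappa(teachers)
system_vs_teachers = mean_kappa_against(system, teachers)
```

Because no objective ground truth exists, the inter-teacher value serves as the reference: a system whose average agreement with the teachers is comparable to `inter_teacher` behaves, by this measure, like one more human rater.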
Christian Hacker
pronunciation scoring, English as a foreign language, children's speech, classification, automatic speech recognition