Hidden Markov Model Techniques 2

Home
Full List of Titles
1: ICSLP'98 Proceedings
Keynote Speeches
Text-To-Speech Synthesis 1
Spoken Language Models and Dialog 1
Prosody and Emotion 1
Hidden Markov Model Techniques 1
Speaker and Language Recognition 1
Multimodal Spoken Language Processing 1
Isolated Word Recognition
Robust Speech Processing in Adverse Environments 1
Spoken Language Models and Dialog 2
Articulatory Modelling 1
Talking to Infants, Pets and Lovers
Robust Speech Processing in Adverse Environments 2
Spoken Language Models and Dialog 3
Speech Coding 1
Articulatory Modelling 2
Prosody and Emotion 2
Neural Networks, Fuzzy and Evolutionary Methods 1
Utterance Verification and Word Spotting 1 / Speaker Adaptation 1
Text-To-Speech Synthesis 2
Spoken Language Models and Dialog 4
Human Speech Perception 1
Robust Speech Processing in Adverse Environments 3
Speech and Hearing Disorders 1
Prosody and Emotion 3
Spoken Language Understanding Systems 1
Signal Processing and Speech Analysis 1
Spoken Language Generation and Translation 1
Spoken Language Models and Dialog 5
Segmentation, Labelling and Speech Corpora 1
Multimodal Spoken Language Processing 2
Prosody and Emotion 4
Neural Networks, Fuzzy and Evolutionary Methods 2
Large Vocabulary Continuous Speech Recognition 1
Speaker and Language Recognition 2
Signal Processing and Speech Analysis 2
Prosody and Emotion 5
Robust Speech Processing in Adverse Environments 4
Segmentation, Labelling and Speech Corpora 2
Speech Technology Applications and Human-Machine Interface 1
Large Vocabulary Continuous Speech Recognition 2
Text-To-Speech Synthesis 3
Language Acquisition 1
Acoustic Phonetics 1
Speaker Adaptation 2
Speech Coding 2
Hidden Markov Model Techniques 2
Multilingual Perception and Recognition 1
Large Vocabulary Continuous Speech Recognition 3
Articulatory Modelling 3
Language Acquisition 2
Speaker and Language Recognition 3
Text-To-Speech Synthesis 4
Spoken Language Understanding Systems 4
Human Speech Perception 2
Large Vocabulary Continuous Speech Recognition 4
Spoken Language Understanding Systems 2
Signal Processing and Speech Analysis 3
Human Speech Perception 3
Speaker Adaptation 3
Spoken Language Understanding Systems 3
Multimodal Spoken Language Processing 3
Acoustic Phonetics 2
Large Vocabulary Continuous Speech Recognition 5
Speech Coding 3
Language Acquisition 3 / Multilingual Perception and Recognition 2
Segmentation, Labelling and Speech Corpora 3
Text-To-Speech Synthesis 5
Spoken Language Generation and Translation 2
Human Speech Perception 4
Robust Speech Processing in Adverse Environments 5
Text-To-Speech Synthesis 6
Speech Technology Applications and Human-Machine Interface 2
Prosody and Emotion 6
Hidden Markov Model Techniques 3
Speech and Hearing Disorders 2 / Speech Processing for the Speech and Hearing Impaired 1
Human Speech Production
Segmentation, Labelling and Speech Corpora 4
Speaker and Language Recognition 4
Speech Technology Applications and Human-Machine Interface 3
Utterance Verification and Word Spotting 2
Large Vocabulary Continuous Speech Recognition 6
Neural Networks, Fuzzy and Evolutionary Methods 3
Speech Processing for the Speech-Impaired and Hearing-Impaired 2
Prosody and Emotion 7
2: SST Student Day
SST Student Day - Poster Session 1
SST Student Day - Poster Session 2

Author Index
A B C D E F G H I
J K L M N O P Q R
S T U V W X Y Z

Multimedia Files

Real-Time Probabilistic Segmentation for Segment-Based Speech Recognition

Authors:

Steven C. Lee, MIT Lab for Computer Science (USA)
James R. Glass, MIT Lab for Computer Science (USA)

Page (NA) Paper number 594

Abstract:

In this work, we investigate modifications to a probabilistic segmentation algorithm to achieve a real-time, and pipelined capability for a segment-based speech recognizer. The existing algorithm used a Viterbi and backwards A* search to hypothesize phonetic segments. We were able to reduce the computational requirements of this algorithm by reducing the effective search space to acoustic landmarks, and were able to achieve pipelined capability by executing the A* search in blocks defined by reliably detected phonetic boundaries. The new algorithm produces 30% fewer segments, and improves TIMIT phonetic recognition performance by 2.4% over an acoustic segmentation baseline. We were also able to produce 30% fewer segments on a word recognition task in a weather information domain.

SL980594.PDF (From Author) SL980594.PDF (Rasterized)

TOP


Toward Markov Random Field Modeling of Speech

Authors:

Guillaume Gravier, ENST/TSi CNRS-URA 820 (France)
Marc Sigelle, ENST/TSI CBRS-URA 820 (France)
Gérard Chollet, ENST/TSI CNRS-URA 820 (France)

Page (NA) Paper number 560

Abstract:

In this paper, we present a new technique for statistical modeling of speech segments based on Markov random fields. Classical and multi-stream HMMs are particular cases of this more general family of models. However, the Random Field Model (RFM) proposed here can be seen as an extension of the multi-band HMM in which interactions between the frequency bands have been added. In a first experiment, samples are drawn from different models and compared to real observations. This experiment shows that the RFM is able to produce realistic samples but a single HMM still performs better. Isolated word recognition experiments stress the fact that more work must be done on the RFM in order to reach the performances of classical hidden Markov modeling techniques. For the moment, the RFM parameters are estimated using a heuristic. We believe that a real maximum likelihood parameter estimation algorithm should improve the results. The main advantage of this new model is that it can easily be extended since a model is defined by some local interactions and the Gibbs potential functions associated to those interactions.

SL980560.PDF (From Author) SL980560.PDF (Rasterized)

TOP


Hidden Markov Models for Trajectory Modeling

Authors:

Rukmini Iyer, GTE/BBN Technologies (USA)
Herbert Gish, GTE/BBN Technologies (USA)
Man-Hung Siu, GTE/BBN Technologies (USA)
George Zavaliagkos, GTE/BBN Technologies (USA)
Spyros Matsoukas, GTE/BBN Technologies (USA)

Page (NA) Paper number 891

Abstract:

Current state-of-the-art statistical speech recognition systems use hidden Markov models (HMM) for modeling the speech signal. However, it is well known that HMM's do not exploit the time-dependence in the speech process, since they are limited by the assumption of conditional independence of observations given the state sequence. Alternative techniques, such as segment modeling approaches, can effectively exploit time-dependencies in the acoustic signal by discarding the observation independence assumption. However, losing the basic HMM structure is often a high computational price to pay for improved acoustic models. In this paper, we introduce the parallel path HMM that exploits the time-dependence in speech via parametric trajectory models while maintaining the HMM framework. We present preliminary results on Switchboard, a large vocabulary conversational speech recognition task, demonstrating both improved modeling and potential for improved recognition performance.

SL980891.PDF (From Author) SL980891.PDF (Rasterized)

TOP