Recognizing Broadcast News

Transcription of broadcast news - system robustness issues and adaptation techniques

Authors:

Raimo Bakis, IBM T.J. Watson Research Center (U.S.A.)
Scott Schen, IBM T.J. Watson Research Center (U.S.A.)
Ponani Gopalakrishnan, IBM T.J. Watson Research Center (U.S.A.)
Ramesh Gopinath, IBM T.J. Watson Research Center (U.S.A.)
Stéphane Maes, IBM T.J. Watson Research Center (U.S.A.)
Lazaros Polymenakos, IBM T.J. Watson Research Center (U.S.A.)

Volume 2, Page 711

Abstract:

This paper describes some of the main problems specific to the transcription of broadcast news, together with the methods for solving them that have been incorporated into the IBM Large Vocabulary Continuous Speech Recognition System.

ic970711.pdf

Transcribing Broadcast News Shows

Authors:

Jean-Luc Gauvain, LIMSI (France)
Gilles Adda, LIMSI (France)
Lori Lamel, LIMSI (France)
Martine Adda-Decker, LIMSI (France)

Volume 2, Page 715

Abstract:

While significant improvements have been made over the last five years in large vocabulary continuous speech recognition of large read-speech corpora, such as the ARPA Wall Street Journal-based CSR corpus (WSJ) for American English and the BREF corpus for French, these tasks remain relatively artificial. In this paper we report on our development work in moving from laboratory read-speech data to real-world speech data in order to build a system for the new ARPA broadcast news transcription task. The LIMSI Nov96 speech recognizer makes use of continuous density HMMs with Gaussian mixtures for acoustic modeling and n-gram statistics estimated on newspaper texts. The acoustic models are trained on the WSJ0/WSJ1 corpora and adapted using MAP estimation with task-specific training data. The overall word error rate on the Nov96 partitioned evaluation test was 27.1%.
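The MAP adaptation step mentioned in the abstract can be illustrated with the textbook MAP update for a single Gaussian mean, which interpolates between the prior (here, WSJ-trained) mean and the occupancy-weighted average of the adaptation frames. This is only a minimal sketch under stated assumptions: the function name `map_adapt_mean`, the prior weight `tau`, and the toy data are hypothetical, and the abstract does not say which parameters the LIMSI system adapts or how its prior weight is set.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative only).

    prior_mean : mean estimated on the original training data
    frames     : adaptation feature vectors (lists of floats)
    posteriors : occupation probability of this Gaussian for each frame
    tau        : assumed prior weight; larger values trust the prior more
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)          # soft count of adaptation frames
    weighted_sum = [0.0] * dim           # sum_t gamma_t * x_t
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]
    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


if __name__ == "__main__":
    prior = [0.0, 0.0]
    data = [[2.0, 4.0]] * 10             # ten identical toy frames
    print(map_adapt_mean(prior, data, [1.0] * 10, tau=10.0))
```

With little adaptation data the estimate stays close to the prior mean, and it moves toward the task-specific data as the occupancy grows, which is why this kind of update suits porting read-speech models to a new task.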

ic970715.pdf

Broadcast News Transcription Using HTK

Authors:

Philip C. Woodland, University of Cambridge (U.K.)
Mark J.F. Gales, University of Cambridge (U.K.)
David Pye, University of Cambridge (U.K.)
Steve J. Young, University of Cambridge (U.K.)

Volume 2, Page 719

Abstract:

This paper examines the issues in extending a large vocabulary speech recognition system designed for clean and noisy read speech tasks to handle broadcast news transcription. Results using the 1995 DARPA H4 evaluation data set are presented for different front-end analyses and for unsupervised model adaptation using maximum likelihood linear regression (MLLR). The HTK system for the 1996 H4 evaluation is then described. It includes a number of new features over previous HTK large vocabulary systems, including decoder-guided segmentation, segment clustering, cache-based language modelling, and combined MAP and MLLR adaptation. The system runs in multiple passes through the data, and the detailed results of each pass are given.
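The idea behind cache-based language modelling can be sketched as a linear interpolation between a static model and the relative frequency of each word in the recent history. The sketch below is a generic illustration, not the HTK implementation: the class name `CacheUnigramLM`, the unigram base model, the cache size, and the interpolation weight are all assumptions, since the abstract gives no such details.

```python
from collections import Counter, deque


class CacheUnigramLM:
    """Static unigram model interpolated with a recent-word cache (toy sketch)."""

    def __init__(self, base_probs, cache_size=200, cache_weight=0.1):
        self.base_probs = base_probs            # static P(word), assumed given
        self.cache_weight = cache_weight        # assumed interpolation weight
        self.history = deque(maxlen=cache_size)
        self.counts = Counter()

    def observe(self, word):
        """Add a recognised word to the cache, evicting the oldest if full."""
        if len(self.history) == self.history.maxlen:
            self.counts[self.history[0]] -= 1   # this word is about to drop out
        self.history.append(word)
        self.counts[word] += 1

    def prob(self, word):
        """Interpolated probability of `word` given the current cache."""
        base = self.base_probs.get(word, 0.0)
        if not self.history:
            return base                          # empty cache: static model only
        cache = self.counts[word] / len(self.history)
        return self.cache_weight * cache + (1.0 - self.cache_weight) * base


if __name__ == "__main__":
    lm = CacheUnigramLM({"a": 0.5, "b": 0.5}, cache_size=2, cache_weight=0.5)
    lm.observe("a")
    lm.observe("a")
    print(lm.prob("a"), lm.prob("b"))
```

Words seen recently in a news story receive a higher probability than the static model alone would assign, which is the usual motivation for a cache component in a multi-pass system.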

ic970719.pdf

Transcription of Broadcast Television and Radio News: The 1996 Abbot System

Authors:

C.D. Cook, Cambridge University (U.K.)
D.J. Kershaw, Cambridge University (U.K.)
J.D.M. Christie, Cambridge University (U.K.)
C.W. Seymour, Cambridge University (U.K.)
S.R. Waterhouse, Cambridge University (U.K.)

Volume 2, Page 723

Abstract:

This paper describes the development of the CU-CON system which participated in the 1996 ARPA Hub 4 Evaluations. The system is based on ABBOT, a hybrid connectionist-HMM large vocabulary continuous speech recognition system developed at the Cambridge University Engineering Department. The Hub 4 Evaluation task involves the transcription of broadcast television and radio news programmes. This is an extremely demanding task for state-of-the-art speech recognition systems. Typical programmes include a wide variety of speaking styles and acoustic conditions. These range from read speech recorded in the studio to extemporaneous speech recorded over telephone channels.

ic970723.pdf

Improved Topic Discrimination of Broadcast News Using a Model of Multiple Simultaneous Topics

Authors:

Toru Imai, NHK (Japan)
Richard Schwartz, BBN (U.S.A.)
Francis Kubala, BBN (U.S.A.)
Long Nguyen, BBN (U.S.A.)

Volume 2, Page 727

Abstract:

This paper presents a new method of topic spotting that attempts to retrieve detailed multiple simultaneous topics from broadcast news stories, each of which has about four topics out of several thousand possible topics. The new topic model uses a simple HMM in which each state represents one topic and emits topic-dependent keywords probabilistically. The model allows (unobserved) transitions among topics, word by word. These characteristics improve the discrimination between keywords and general words in a topic model and reduce the probabilistic overlap among the topic models more than conventional topic models (such as a simple multinomial probability model) do. In addition, the model is not confused by words from multiple topics within one story. We applied the new method to topic spotting from manually transcribed texts of news shows. The new method achieved better precision and recall than the conventional method.
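The word-by-word topic transitions described in the abstract can be illustrated with the standard forward algorithm over an HMM whose states are topics. The sketch below is a toy under stated assumptions, not the authors' model: the function name `story_log_likelihood`, the emission, transition, and initial probabilities, and the probability floor for unseen words are all hypothetical.

```python
import math


def story_log_likelihood(words, topics, emit, trans, init, floor=1e-6):
    """Log-likelihood of a story under an HMM whose states are topics.

    emit[k]     : dict of keyword probabilities for topic k
    trans[j][k] : probability of moving from topic j to topic k between words
    init[k]     : probability of starting in topic k
    floor       : assumed probability for words a topic does not list
    """
    if not words:
        return 0.0
    # alpha[k] = P(words so far, current topic = k), rescaled at each step.
    alpha = {k: init[k] * emit[k].get(words[0], floor) for k in topics}
    log_like = 0.0
    for word in words[1:]:
        norm = sum(alpha.values())
        log_like += math.log(norm)               # accumulate the scale factor
        alpha = {k: v / norm for k, v in alpha.items()}
        alpha = {
            k: sum(alpha[j] * trans[j][k] for j in topics)
            * emit[k].get(word, floor)
            for k in topics
        }
    return log_like + math.log(sum(alpha.values()))


if __name__ == "__main__":
    emit = {
        "sports": {"goal": 0.5, "match": 0.5},
        "finance": {"stock": 0.5, "bank": 0.5},
    }
    both = ["sports", "finance"]
    uniform = {j: {k: 0.5 for k in both} for j in both}
    story = ["goal", "stock"]
    two_topic = story_log_likelihood(story, both, emit, uniform,
                                     {"sports": 0.5, "finance": 0.5})
    one_topic = story_log_likelihood(story, ["sports"], emit,
                                     {"sports": {"sports": 1.0}},
                                     {"sports": 1.0})
    print(two_topic > one_topic)
```

In this toy setting a story that mixes sports and finance words scores far better under the two-topic HMM than under a single-topic model, which mirrors the abstract's point that the model is not confused by words from multiple topics within one story.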