Session W2A Spoken Language Understanding

Chairperson: Ioannis Dologlou, ESAT-MI2, France



AUTOMATIC ACQUISITION OF SALIENT GRAMMAR FRAGMENTS FOR CALL-TYPE CLASSIFICATION

Authors: J. H. Wright, A. L. Gorin and G. Riccardi

AT&T Labs – Research, 180 Park Avenue, Florham Park, NJ 07932, USA {jwright,algor,dsp3}@research.att.com

Volume 3 pages 1419 - 1422

ABSTRACT

We present an algorithm for the automatic acquisition of salient grammar fragments in the form of finite state machines (FSMs). Salient phrase fragments are selected using a significance test, then clustered using a combination of string and semantic distortion measures. Each cluster is then compactly represented as an FSM. Flexibility is enhanced by permitting approximate matches to paths through each FSM. Multiple fragment detections are exploited by means of a neural network. The methodology is applied to the "How may I help you?" (HMIHY) call-type classification task.
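As a rough illustration of the fragment-selection step, the sketch below scores candidate phrase fragments by the peak of the posterior call-type distribution and keeps only fragments seen often enough; the corpus, call types, and count threshold are invented, and the paper's actual significance test, clustering, and FSM construction are not reproduced here.

```python
from collections import Counter, defaultdict

# Toy labelled corpus: (utterance, call type). Invented data, not HMIHY.
corpus = [
    ("i want to make a collect call", "COLLECT"),
    ("collect call please", "COLLECT"),
    ("what is my account balance", "BILLING"),
    ("a question about my bill", "BILLING"),
    ("make a collect call to boston", "COLLECT"),
]

def fragment_counts(corpus, n=2):
    """Count n-gram phrase fragments per call type."""
    counts = defaultdict(Counter)
    for text, call_type in corpus:
        words = text.split()
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])][call_type] += 1
    return counts

def salience(counts):
    """Peak posterior P(call type | fragment) as a crude salience score."""
    call_type, top = counts.most_common(1)[0]
    return call_type, top / sum(counts.values())

# A minimum-count cutoff stands in for the paper's significance test.
salient = {frag: salience(c)
           for frag, c in fragment_counts(corpus).items()
           if sum(c.values()) >= 2}
```

In the paper the surviving fragments are then clustered and compiled into FSMs; here they remain a flat dictionary.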

A0107.pdf



Stochastically-Based Natural Language Understanding Across Tasks and Languages

Authors: Wolfgang Minker

Spoken Language Processing Group, LIMSI-CNRS, 91403 Orsay cedex, FRANCE; email: minker@limsi.fr; http://www.limsi.fr/TLP

Volume 3 pages 1423 - 1426

ABSTRACT

A stochastically-based method for natural language understanding has been ported from the American ATIS (Air Travel Information Services) to the French MASK (Multimodal-Multimedia Automated Service Kiosk) task. The porting was carried out by designing and annotating a corpus of semantic representations via a semi-automatic iterative labeling. The study shows that domain and language porting is rather flexible, since it is sufficient to train the system on data sets specific to the application and language. A limiting factor of the current implementation is the quality of the semantic representation and the use of query preprocessing strategies, which strongly suffer from human influence. The performance of the stochastically-based method and of a rule-based method is compared on both tasks.
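The semi-automatic iterative labeling can be pictured as a bootstrap loop: train a labeler on a small hand-annotated seed, let it propose labels for new data, and hand-correct the proposals before retraining. The sketch below shows one iteration of that loop, with invented data and a deliberately naive word-to-concept voting model standing in for the stochastic component.

```python
from collections import Counter, defaultdict

# Tiny hand-labeled seed: (word, semantic concept). Invented data.
seed = [("paris", "CITY"), ("lyon", "CITY"), ("monday", "DAY")]

def train(pairs):
    """Majority-vote word-to-concept model from labeled pairs."""
    votes = defaultdict(Counter)
    for word, concept in pairs:
        votes[word][concept] += 1
    return {w: c.most_common(1)[0][0] for w, c in votes.items()}

def auto_label(model, words):
    """Propose labels; unknown words get a placeholder for the annotator."""
    return [(w, model.get(w, "?")) for w in words]

model = train(seed)
proposals = auto_label(model, ["paris", "tuesday"])
```

In the real workflow the annotator would resolve the "?" entries and the corrected data would be added to the training set for the next iteration.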

A0154.pdf



Transducer Composition for Context-Dependent Network Expansion

Authors: Michael Riley, Fernando Pereira, Mehryar Mohri

riley@research.att.com pereira@research.att.com mohri@research.att.com AT&T Labs – Research, 180 Park Avenue, Florham Park, NJ 07932-0971, USA

Volume 3 pages 1427 - 1430

ABSTRACT

Context-dependent models for language units are essential in high-accuracy speech recognition. However, standard speech recognition frameworks are based on the substitution of lower-level models for higher-level units. Since substitution cannot express context-dependency constraints, actual recognizers use restrictive model-structure assumptions and specialized code for context-dependent models, leading to decreased flexibility and lost opportunities for automatic model optimization. Instead, we propose a recognition framework that builds in the possibility of context dependency from the start by using weighted finite-state transduction rather than substitution. The framework is implemented with a general demand-driven transducer composition algorithm that allows great flexibility in model structure, form of context dependency and network expansion method, while achieving competitive recognition performance.
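The core operation, weighted transducer composition, can be sketched on toy machines: a pair state (s1, s2) is expanded only when it is reached (demand-driven), arcs are matched on the shared middle tape, and weights add along a path (tropical semiring). Epsilon handling and the composition filter are omitted, and the two machines below are invented, not the paper's recognition networks.

```python
# A transducer: {state: [(input_label, output_label, weight, next_state)]},
# plus a set of final states. Weights chosen to sum exactly in floats.
T1 = {0: [("a", "x", 1.0, 1)], 1: [("b", "y", 0.5, 2)]}
T1_FINAL = {2}
T2 = {0: [("x", "U", 0.25, 1)], 1: [("y", "V", 0.25, 2)]}
T2_FINAL = {2}

def compose(t1, f1, t2, f2, start=(0, 0)):
    """Demand-driven composition: build only reachable pair states."""
    arcs, finals = {}, set()
    stack, seen = [start], {start}
    while stack:
        s1, s2 = stack.pop()
        out = []
        for i1, o1, w1, n1 in t1.get(s1, []):
            for i2, o2, w2, n2 in t2.get(s2, []):
                if o1 == i2:                     # match on the middle tape
                    out.append((i1, o2, w1 + w2, (n1, n2)))
                    if (n1, n2) not in seen:
                        seen.add((n1, n2))
                        stack.append((n1, n2))
        arcs[(s1, s2)] = out
        if s1 in f1 and s2 in f2:
            finals.add((s1, s2))
    return arcs, finals

arcs, finals = compose(T1, T1_FINAL, T2, T2_FINAL)
```

Because expansion is lazy, states of the product machine that are never reached are never built, which is what makes full context-dependent network expansion tractable.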

A0175.pdf



GIVING PROSODY A MEANING

Authors: Christian Lieske (4), Johan Bos (1), Martin Emele (2), Björn Gambäck (3), CJ Rupp (1)

(1) Computational Linguistics, University of Saarland; Postfach 151150; D-66041 Saarbrücken; Tel: +49 681 302 4679, Fax: +49 681 302 4351; {bos,cj}@coli.uni-sb.de
(2) Institute of Computational Linguistics, University of Stuttgart; Azenbergstrasse 12; D-70174 Stuttgart; Tel: +49 711 121 1372, Fax: +49 711 121 1366; emele@ims.uni-stuttgart.de
(3) Centre for Speech Technology, Royal Institute of Technology; S-100 40 Stockholm; Tel: +46 8 790 8884, Fax: +46 8 790 7854; gamback@speech.kth.se
(4) Computer Science Department, Swiss Federal Institute of Technology; CH-1015 Lausanne; Tel: +41 21 693 2589, Fax: +41 21 693 5278; lieske@di.epfl.ch

Volume 3 pages 1431 - 1434

ABSTRACT

Systems for spoken-language understanding can use prosodic information on the speech recognition side as well as the linguistic processing side. In the former case, prosody improves recognition accuracy and speed. In the latter case, it contributes to the computation of meaning. Interfacing prosodic processing to language analysis has so far been mainly concerned with speeding up the parsing process. The actual integration of prosodic information into the semantic part of a language understanding system, or into the transfer part of a translation system, has mostly been left aside. We describe how prosody has been used in the syntactic-semantic and transfer modules of the Verbmobil spoken dialogue translation system. On the syntactic-semantic side, prosody is currently used for the solution of three different problems: insertion of clause boundaries, selection of sentence mood (declarative, question, etc.), and assignment of semantic focus. On the transfer side, the prosodic information is allowed to influence the lexical choice of the system.
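The first of the three uses, clause-boundary insertion, can be sketched as thresholding a per-word boundary probability from the prosody module and inserting a boundary symbol into the word sequence before parsing. The words, probabilities, threshold, and symbol below are invented; Verbmobil's actual boundary classifier is statistical and considerably richer.

```python
def insert_boundaries(words, boundary_probs, threshold=0.5):
    """Insert a clause-boundary symbol after words whose prosodic
    boundary probability exceeds the threshold."""
    out = []
    for word, prob in zip(words, boundary_probs):
        out.append(word)
        if prob >= threshold:
            out.append("<B>")   # prosodic clause-boundary symbol
    return out

# Invented example: "ja" is likely a clause of its own.
words = ["ja", "das", "passt", "mir", "gut"]
probs = [0.9, 0.1, 0.05, 0.1, 0.8]
segmented = insert_boundaries(words, probs)
```

The parser can then treat `<B>` like a punctuation token, which is one way prosody constrains the syntactic-semantic analysis.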

A0179.pdf



FEATURE-BASED LANGUAGE UNDERSTANDING

Authors: K. A. Papineni, S. Roukos, R. T. Ward

IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA

Volume 3 pages 1435 - 1438

ABSTRACT

We consider translating natural language sentences into a formal language using a system that is data-driven and built automatically from training data. We use features that capture correlations between automatically determined key phrases in both languages. The features and their associated weights are selected using a training corpus of matched pairs of source and target language sentences to maximize the entropy of the resulting conditional probability model. Given a source-language sentence, we select as the translation a target-language candidate to which the model assigns maximum probability. We report results in the Air Travel Information System (ATIS) domain.
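The decision rule of such a log-linear model can be sketched in a few lines: features fire on co-occurring source/target phrase pairs, the score is a weighted feature sum, and since the softmax normalizer is shared across candidates, picking the highest-scoring candidate picks the most probable one. The weights and the ATIS-style formal-language strings below are invented, not the trained IBM model.

```python
# Hand-set feature weights for illustration; in the paper these are
# learned by maximum-entropy training on matched sentence pairs.
weights = {
    ("cheapest", "MIN(fare)"): 2.0,
    ("flights", "LIST(flight)"): 1.5,
    ("cheapest", "LIST(flight)"): -0.5,
}

def features(source, target):
    """Fire a feature for each known (source word, target token) pair."""
    return [(w, t) for w in source.split() for t in target.split()
            if (w, t) in weights]

def score(source, target):
    return sum(weights[f] for f in features(source, target))

def translate(source, candidates):
    """Choose the candidate the conditional model makes most probable;
    softmax is monotone in the score, so argmax of the score suffices."""
    return max(candidates, key=lambda t: score(source, t))

best = translate("show cheapest flights",
                 ["LIST(flight)", "MIN(fare) LIST(flight)"])
```

In the full system the candidate set comes from the decoder rather than being enumerated by hand as it is here.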

A0616.pdf



SPEECH TRANSLATION BASED ON AUTOMATICALLY TRAINABLE FINITE-STATE MODELS

Authors: J. C. Amengual (1), J. M. Benedí (2), K. Beulen (3), F. Casacuberta (2), A. Castaño (1), A. Castellanos (1), V. M. Jiménez (2), D. Llorens (2), A. Marzal (1), H. Ney (3), F. Prat (1), E. Vidal (2), J. M. Vilar (1)

(1) Unidad Predepartamental de Informática, Campus Penyeta Roja, Universitat Jaume I, E-12071 Castelló (Spain)
(2) Departamento de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, E-46071 Valencia (Spain)
(3) Lehrstuhl für Informatik VI, RWTH Aachen, University of Technology, D-52056 Aachen (Germany)
e-mail: evidal@iti.upv.es

Volume 3 pages 1439 - 1442

ABSTRACT

This paper extends previous work exploring the use of Subsequential Transducers to perform speech-input translation in limited-domain tasks. This is done following an integrated approach in which a Subsequential Transducer replaces the input-language model of a conventional speech recognition system, and is used both as language and translation model. This way, the search for the recognised sentence also produces the corresponding translation. A corpus-based approach is adopted in order to build the required models from training data. Experimental results are presented for the translation task considered in the EUTRANS project: a hotel-domain task with more than 500 words per language and language perplexities close to 10.
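A subsequential transducer is deterministic on the input tape, emits an output string on each transition, and may append a final output string in an accepting state, which lets it delay emission until the input disambiguates. The toy Spanish-to-English hotel fragment below (invented vocabulary, written without accents for simplicity) illustrates that delayed emission; it is a sketch of the device, not the EUTRANS models learned from data.

```python
# (state, input word) -> (next state, emitted output string)
arcs = {
    (0, "una"): (1, "a"),
    (1, "habitacion"): (2, ""),          # emission delayed: adjective follows
    (2, "doble"): (3, "double room"),
    (2, "individual"): (3, "single room"),
}
finals = {3: ""}  # final output strings for accepting states

def translate(words):
    """Run the transducer deterministically and collect its output."""
    state, out = 0, []
    for word in words:
        state, emitted = arcs[(state, word)]
        if emitted:
            out.append(emitted)
    if state not in finals:
        raise ValueError("input not accepted")
    if finals[state]:
        out.append(finals[state])
    return " ".join(out)
```

In the integrated approach of the paper, a machine of this form replaces the recognizer's input-language model, so decoding an utterance and translating it become a single search.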

A0881.pdf
