ICASSP '98 Main Page
Abstract - SPEC-SP
SPEC-SP.1
|
Machine Learning and Automatic Linguistic Analysis: The Next Step
E. Brill (Johns Hopkins University, USA)
To continue building systems with progressively more complex natural language capabilities, it is crucial that great strides be made toward solving the core linguistic analysis problems for complex and possibly unrestricted domains. A great deal of progress has been made by applying machine learning techniques to automatically train systems from manually annotated corpora to provide detailed linguistic analyses of sentences. This paper examines a number of issues within this paradigm of automatic linguistic knowledge acquisition and how they relate to advancing the field of natural language processing over the next decade.
|
SPEC-SP.2
|
Accessible Technology for Interactive Systems: A New Approach to Spoken Language Research
R. Cole,
S. Sutton,
Y. Yan,
P. Vermeulen,
M. Fanty (Oregon Graduate Institute of Science and Technology, USA)
In this paper, we argue for a paradigm shift in spoken language technology, from transcription tasks to interactive systems. The current paradigm evaluates speech recognition accuracy on large vocabulary transcription tasks, such as telephone conversations or media broadcasts. Systems are evaluated in international competitions, with strict rules for participation and well-defined evaluation metrics. Participation in these competitions is limited to a few elite laboratories that have the resources to develop and field systems. We propose a new, more productive and more accessible paradigm for spoken language research, in which research advances are evaluated in the context of interactive systems that allow people to perform useful tasks, such as accessing information from the World Wide Web while driving a car. These systems are made available for daily use by ordinary citizens through telephone networks or through easily accessible kiosks in public institutions. It is argued [1,2,3] that this new paradigm, which focuses on the goal of universal access to information for all people, better serves the needs of the research community, as well as the welfare of our citizens. We discuss the challenges and rewards of an interactive system approach to spoken language research, and describe our initial attempts to stimulate a paradigm shift and engage a large community of researchers through free distribution of the CSLU Toolkit.
|
SPEC-SP.3
|
Recognition in a New Key - Towards a Science of Spoken Language
S. Greenberg (International Computer Science Institute, USA)
Automatic speech recognition in the twenty-first century will strive to emulate many properties of human speech understanding that lie beyond the capability of present-day systems. Such future-generation recognition will require massive amounts of empirical data in order to derive the organizational principles underlying the generation and decoding of spoken language. Such data can be efficiently collected through systematic computational experimentation designed to identify the important building blocks of speech and delineate the nature of the structural interactions among linguistic tiers associated with the extraction of semantic information.
|
SPEC-SP.4
|
The Challenge of Domain-Independent Speech Understanding
R. Moore (SRI International, USA)
To achieve widespread acceptance, speech understanding technology needs to be domain independent. Deep understanding, however, appears to require knowledge that is domain specific. Speech understanding technology, therefore, must be partitioned into domain-independent and domain-specific components. Development of domain-independent components could be promoted by creation of semantically annotated corpora. Any such corpus, however, would be difficult to produce and would necessarily be controversial because of lack of widespread agreement on principles of semantic analysis. The use of such a corpus for performance evaluation should therefore be left largely up to the research community rather than being imposed by funding agencies.
|
SPEC-SP.5
|
Understanding Speech Understanding
R. Moore (DERA Speech Research Unit, UK)
Despite the significant theoretical and practical advances that have been made in automatic speech recognition in recent years, relatively little effort has been devoted to the evaluation of speech in an interactive multi-modal application interface. This paper introduces a general methodology for assessing speech-based systems and concludes with a proposal for a test scenario which focuses on the understanding component of a spoken language system.
|
SPEC-SP.6
|
Evaluating Dialog Systems Used in the Real World
H. Aust (Philips Speech Processing, Germany);
H. Ney (RWTH Aachen, University of Technology, Germany)
An important aspect of creating high performance natural language dialog systems is the question of how they are evaluated. While a universally accepted evaluation method exists for pure speech recognition, no such method is established for speech understanding or dialog systems. We describe the methods we typically use for our systems and argue that it is not sufficient to evaluate their constituents separately. Instead, a measure for a system in its entirety is needed.
|
SPEC-SP.7
|
Next Major Application Systems and Key Techniques in Speech Recognition Technology
K. Tanaka (Electrotechnical Laboratory, Japan)
In this paper, we discuss several major speech recognition applications that are expected to contribute to human activities within a decade. First, recent Japanese speech-related national projects directed toward future intelligent systems are briefly reviewed. Then we discuss three systems as the next major speech applications: substantially robust systems, multimodal interaction systems and multilingual dialogue systems. Evaluation of the performance of these systems is discussed separately from the viewpoints of total systems and of specific technologies. We suggest that the degree of difficulty of some kinds of specific tasks can be measured even more precisely, while total system performance evaluation will become more difficult in future complex systems. Last, we take up phrase spotting, distance calculation for phonetic symbol sequences, adaptation/learning, and software modularization/multi-agents as the key techniques in constructing the above applications.
|
|