Authors:
Niels Ole Bernsen, Odense University (Denmark)
Laila Dybkjær, Odense University (Denmark)
Paper number 62
Abstract:
Use of speech input to, and speech output from, computer systems is
spreading at a growing pace. This means that, increasingly, developers
of systems and interfaces are faced with the question of whether or
not to use speech input and/or speech output for the applications they
are about to build. This paper presents results from a pilot test of
a theory-based approach to speech functionality. The test uses a corpus
of claims about speech functionality derived from recent literature
on speech and multimodality.
Authors:
Juan Ignacio Godino Llorente, ETSI Telecomunicacion (UPM) (Spain)
Santiago Aguilera Navarro, ETSI Telecomunicacion (UPM) (Spain)
Sira Palazuelos Cagigas, ETSI Telecomunicacion (UPM) (Spain)
Alberto Nieto Altuzarra, Universidad de Alcala de Henares (Spain)
Pedro Gómez Vilda, ETSI Telecomunicacion (UPM) (Spain)
Paper number 558
Abstract:
We present a diagnosis tool for the voice clinic that runs on a personal
computer. The application records several signals in real time. The
signals captured and stored are:
* videoendoscopic images recorded with a fibroscope or telelaryngoscope;
* the electroglottographic signal during phonation;
* the voice signal;
* the airflow signal.
All of these signals are recorded with dedicated transducers and standard
digitization boards, using the microphone and line inputs simultaneously.
Several systems that assist diagnosis have been developed previously,
but none of them captures the four signals mentioned simultaneously.
All four are highly relevant from the clinical point of view and help
the expert reach a decision. The main advantage over other systems is
that, by examining the videoendoscopic record, clinicians are able to
label voice recordings with the associated pathologies.
Authors:
Kaare Sjölander, KTH (Sweden)
Jonas Beskow, KTH (Sweden)
Joakim Gustafson, KTH (Sweden)
Erland Lewin, KTH (Sweden)
Rolf Carlson, KTH (Sweden)
Björn Granström, KTH (Sweden)
Paper number 361
Abstract:
This paper describes the efforts at KTH in creating educational tools
for speech technology. The demand for such tools is increasing with
the advent of speech as a medium for man-machine communication. The
World Wide Web was chosen as our platform in order to increase the
usability and accessibility of our computer exercises. The aim was
to provide dedicated educational software instead of exercises based
on complex research tools. Currently, the set of exercises comprises
basic speech analysis, multi-modal speech synthesis and spoken dialogue
systems. Students access web pages in which the exercises have been
embedded as applets. This makes it possible to use them in a classroom
setting, as well as from the students' home computers.
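As a minimal illustration of the kind of computation such a basic speech-analysis exercise involves (a generic sketch, not the KTH applets' actual code), a short-time energy contour over a signal can be written as:

```python
import math

def short_time_energy(samples, frame_len=256, hop=128):
    """Frame-wise energy: sum of squared samples in each analysis frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame))
    return energies

# Synthetic test signal: one second of silence followed by one second
# of a 440 Hz tone, at an 8 kHz sampling rate.
rate = 8000
signal = [0.0] * rate + [math.sin(2 * math.pi * 440 * n / rate)
                         for n in range(rate)]

energy = short_time_energy(signal)
# The contour stays near zero over the silent half and rises over the tone.
```

Plotting such a contour against the waveform is the sort of exercise a dedicated applet can present interactively in the browser.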
Authors:
Stephen Sutton, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Ronald A. Cole, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Jacques de Villiers, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Johan Schalkwyk, Fluent Speech Technologies (USA)
Pieter Vermeulen, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Michael W. Macon, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Yonghong Yan, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Edward Kaiser, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Brian Rundle, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Khaldoun Shobaki, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Paul Hosom, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Alex Kain, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Johan Wouters, Center for Spoken Language Understanding, Oregon Graduate Institute (USA)
Dominic W. Massaro, University of California, Santa Cruz (USA)
Michael Cohen, University of California, Santa Cruz (USA)
Paper number 649
Abstract:
A set of freely available, universal speech tools is needed to accelerate
progress in speech technology. The CSLU Toolkit represents an effort
to make core technology and fundamental infrastructure accessible,
affordable and easy to use. The CSLU Toolkit has been under development
for five years. This paper describes recent improvements, additions
and uses of the CSLU Toolkit.
Authors:
Ben Serridge, Universidad de las Americas-Puebla (Mexico)
Alejandro Barbosa, Universidad de las Americas-Puebla (Mexico)
Ronald A. Cole, Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology (USA)
Nora Munive, Universidad de las Americas-Puebla (Mexico)
Alcira Vargas, Universidad de las Americas-Puebla (Mexico)
Paper number 923
Abstract:
The CSLU Toolkit is designed to facilitate the rapid development of
spoken dialogue systems for a wide variety of applications, as well
as to provide a framework for conducting research in the underlying
speech technologies. This paper describes the creation of a Mexican
Spanish version of the CSLU Toolkit (both synthesis and recognition)
undertaken at the Universidad de las Américas in Puebla, México.
Based on the Festival Speech Synthesis System of the University of
Edinburgh, we have developed a complete concatenative text-to-speech
system for Mexican Spanish, which is currently incorporated into the
toolkit and includes both a male and female voice. In the area of recognition,
we have created a set of task-specific Spanish recognizers for continuous
digits, spelled words, and yes/no phrases, as well as a "general-purpose"
phonetic recognizer suitable for arbitrary sub-domains. Using the Rapid
Application Developer (RAD) component of the CSLU Toolkit, it is now
possible to quickly prototype spoken dialogue systems in Spanish. The
Spanish components of the CSLU Toolkit are freely available for non-commercial
use from the following web page: http://info.pue.udlap.mx/~sistemas/tlatoa.
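The concatenative synthesis mentioned above joins prerecorded acoustic units into an output waveform. As a generic sketch of that idea only (this is not the Festival system's actual algorithm; the unit data here are toy values), adjacent units can be joined with a short linear crossfade to reduce audible discontinuities:

```python
def crossfade_concat(units, overlap=64):
    """Concatenate waveform units, linearly crossfading `overlap`
    samples at each join to smooth the transition."""
    out = list(units[0])
    for unit in units[1:]:
        tail, head = out[-overlap:], unit[:overlap]
        for i in range(overlap):
            w = i / overlap  # fade-in weight, 0 -> 1 across the join
            out[-overlap + i] = (1 - w) * tail[i] + w * head[i]
        out.extend(unit[overlap:])
    return out

# Two toy "units": a constant 1.0 segment and a constant -1.0 segment.
a = [1.0] * 200
b = [-1.0] * 200
wave = crossfade_concat([a, b])
# Result length: 200 + 200 - 64 shared crossfade samples = 336.
```

Real concatenative systems additionally select units by phonetic context and adjust pitch and duration, which this sketch omits.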
Authors:
Carmen García-Mateo, E.T.S.I. de Telecomunicación, University of Vigo (Spain)
Qiru Zhou, Dialogue Systems Research Department, Bell Laboratories (USA)
Chin-Hui Lee, Dialogue Systems Research Department, Bell Laboratories (USA)
Andrew Pargellis, Dialogue Systems Research Department, Bell Laboratories (USA)
Paper number 884
Abstract:
We present a Mexican Spanish voice user interface demonstration system.
It was built on a speech research platform developed at Bell Labs,
which provides the major speech technology and interface components,
including automatic speech recognition, text-to-speech synthesis, audio
input/output functions, and a telephone interface. The application is
written in the Perl scripting language with an embedded Voice Interface
Language (VIL) that connects the speech and interface modules to Perl.
Given the platform's multilingual speech processing capabilities and
the VIL, we were able to quickly develop a Mexican Spanish system in
Perl with speech-enabled messaging and information-access functionality
similar to that of our English voice user interface demonstration system.