Chair: Joern Ostermann, AT&T Labs - Research (U.S.A.)
Yong Rui, Beckman Institute, University of Illinois at Urbana-Champaign (U.S.A.)
Thomas S. Huang, Beckman Institute, University of Illinois at Urbana-Champaign (U.S.A.)
Shih-Fu Chang, Columbia University (U.S.A.)
Much research activity and interest has emerged recently in two closely related areas: Digital Image/Video Libraries (DIVL) and MPEG-7. In this paper, we review the critical research issues in DIVL from a signal processing viewpoint, the objectives and scope of MPEG-7, and the relationship between the two.
Chung-Sheng Li, IBM T.J. Watson Research Center (U.S.A.)
Rakesh Mohan, IBM T.J. Watson Research Center (U.S.A.)
John R. Smith, IBM T.J. Watson Research Center (U.S.A.)
There is a growing need for a content description language for multimedia that improves the searching, indexing, and management of multimedia content. The MPEG group recently established the MPEG-7 effort to standardize a multimedia content description interface. The proposed interface will bridge the gap between various types of content metadata, such as content features, annotations, and relationships, and the search engines that use them. In this paper, we develop a method of handling multimedia content description within a new multi-abstraction, multi-modal content representation framework called the InfoPyramid. The InfoPyramid facilitates the search, retrieval, manipulation, and transmission of multimedia data by providing a hierarchy of content descriptors. We illustrate the suitability of the InfoPyramid multimedia content description for MPEG-7 by examining four multimedia retrieval applications: a Web image search engine, a satellite image retrieval system, an Internet content delivery system, and a TV news storage and retrieval system.
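To make the hierarchy concrete, the following is a minimal sketch, not the authors' implementation, of a multi-abstraction, multi-modal descriptor store in Python; the class names, modalities, and abstraction levels are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Descriptor:
    """One content descriptor, e.g. a color histogram or a text annotation."""
    name: str
    value: object

@dataclass
class InfoPyramidCell:
    """A (modality, abstraction-level) cell holding descriptors for one item.

    Hypothetical structure: modalities such as 'image', 'audio', 'text';
    abstraction levels from 0 (raw features) up to semantic annotations.
    """
    modality: str
    level: int
    descriptors: List[Descriptor] = field(default_factory=list)

class InfoPyramid:
    """Illustrative container indexing descriptors by modality and level."""

    def __init__(self) -> None:
        self._cells: Dict[Tuple[str, int], InfoPyramidCell] = {}

    def add(self, modality: str, level: int, descriptor: Descriptor) -> None:
        cell = self._cells.setdefault(
            (modality, level), InfoPyramidCell(modality, level))
        cell.descriptors.append(descriptor)

    def query(self, modality: str, max_level: int) -> List[Descriptor]:
        """Return all descriptors of one modality up to a given abstraction
        level, so a search engine can match at whatever level it understands."""
        return [d for (m, lvl), cell in self._cells.items()
                if m == modality and lvl <= max_level
                for d in cell.descriptors]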
Thomas R. Gardos, Intel Corporation (U.S.A.)
H.263+ is a revision of the 1996 version of ITU-T Recommendation H.263 that brings incremental improvements in compression performance, better support for packet-based networks, expanded support for video formats, and other new functionality. All of the new capabilities can be negotiated individually or disabled for backward compatibility with H.263. In this paper, we review all the major new features of H.263+.
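As an illustration only, the per-mode negotiation can be modeled as independent flags that both endpoints must support before a mode is enabled; the annex letters below are for orientation and this is not the normative capability-exchange mechanism.

from enum import Flag, auto

class H263PlusMode(Flag):
    """Illustrative flags for a few H.263+ optional modes (not a normative encoding)."""
    NONE = 0
    ADVANCED_INTRA_CODING = auto()       # Annex I
    DEBLOCKING_FILTER = auto()           # Annex J
    SLICE_STRUCTURED = auto()            # Annex K
    REFERENCE_PICTURE_SELECTION = auto() # Annex N
    MODIFIED_QUANTIZATION = auto()       # Annex T

def negotiate(local: H263PlusMode, remote: H263PlusMode) -> H263PlusMode:
    """A mode is used only if both sides support it; NONE falls back to baseline H.263."""
    return local & remote

# Example: the remote decoder lacks Annex N, so only the deblocking filter survives.
session = negotiate(
    H263PlusMode.DEBLOCKING_FILTER | H263PlusMode.REFERENCE_PICTURE_SELECTION,
    H263PlusMode.DEBLOCKING_FILTER | H263PlusMode.MODIFIED_QUANTIZATION)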
Schuyler R. Quackenbush, AT&T Labs (U.S.A.)
MPEG-4 standardizes natural audio coding at bitrates ranging from 2 kbit/s, suitable for intelligible speech coding, to 64 kbit/s per channel, suitable for high-quality audio coding. Within this range, three categories of coding are defined: parametric coding, Code Excited Linear Predictive coding (CELP) and time/frequency (T/F) coding. The unique contribution of MPEG-4 audio is that not only does it scale across a wide range of bitrates, but it also scales across a broad set of other parameters, such as sampling rate, bandwidth, voice pitch and complexity. This paper presents an overview of the MPEG-4 natural audio coding framework and each of its component coding techniques.
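As a rough illustration of how the three coder categories partition the bitrate range described above; the crossover points below are assumptions for the sketch, not values taken from the standard.

def choose_coder_category(bitrate_bps: int) -> str:
    """Map a target bitrate to one of the three MPEG-4 natural audio coding
    categories named in the abstract. Boundary values are illustrative only."""
    if bitrate_bps < 2_000:
        raise ValueError("below the range covered by MPEG-4 natural audio")
    if bitrate_bps < 6_000:
        return "parametric coding (very-low-rate speech)"
    if bitrate_bps < 24_000:
        return "CELP coding (telephone/wideband speech)"
    return "time/frequency (T/F) coding (general high-quality audio)"

for rate in (2_000, 8_000, 16_000, 64_000):
    print(rate, "->", choose_coder_category(rate))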
Eric D. Scheirer, MIT Media Lab (U.S.A.)
The MPEG-4 standard defines numerous tools that represent the state-of-the-art in representation, transmission, and decoding of multimedia data. Among these is a new type of audio standard, termed "Structured Audio". The MPEG-4 standard for structured audio allows for the efficient, flexible description of synthetic sound in synchronization with natural sound in interactive multimedia scenes. A discussion of the capabilities, technological underpinnings, and application of MPEG-4 Structured Audio is presented.
Joern Ostermann, AT&T Labs - Research (U.S.A.)
Atul Puri, AT&T Labs - Research (U.S.A.)
The ISO MPEG committee, after successful completion of the MPEG-1 and MPEG-2 standards, has recently completed the Committee Draft of MPEG-4, its third standard. MPEG-4 is designed to be an object-based standard for multimedia coding. The visual part of the standard specifies coding of both natural and synthetic video. The MPEG-4 visual standard supports coding of natural video not only in a conventional manner (using frames) but also as a collection of arbitrary-shape objects (using video object planes). Further, it supports functionalities such as spatial and temporal scalability, both for conventional video and for arbitrary-shape objects. It also supports error-resilient coding for delivery of coded video over error-prone channels. The MPEG-4 visual standard also supports coding of synthetic video, which includes still texture maps used in 3D graphics models, mesh geometry for object animation, and parameters for facial animation.
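To illustrate the object-based organization described above, here is a sketch with assumed field names (not the bitstream syntax): a scene holds a set of video objects, each decoded from its own sequence of video object planes carrying shape as well as texture.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VideoObjectPlane:
    """One time instant of a video object; field names are illustrative."""
    time_ms: int
    texture: bytes                # coded luminance/chrominance data
    shape_mask: Optional[bytes]   # alpha/shape data; None models conventional rectangular (frame) coding

@dataclass
class VideoObject:
    object_id: int
    vops: List[VideoObjectPlane] = field(default_factory=list)

@dataclass
class Scene:
    """A composition of arbitrary-shape objects, mirroring the 'collection of
    arbitrary-shape objects' coded with video object planes in the abstract."""
    objects: List[VideoObject] = field(default_factory=list)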
M. Reha Civanlar, AT&T Labs - Research (U.S.A.)
The explosive growth of the Internet and of intranets has attracted a great deal of attention to the implementation and performance of networked multimedia services, which involve the transport of real-time multimedia data streams over non-guaranteed quality-of-service (QoS) networks based on the Internet Protocol (IP). In this paper, I present an overview of the existing architectural elements supporting real-time data transmission over the Internet. Effective implementations of such systems require a thorough understanding of both the network protocols and the coding systems used for compressing the signals to be transmitted in real-time. This paper includes a section discussing the issues to be considered in designing signal compression applications suitable for network use.
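A minimal sketch of the kind of transport discussed above: sending timestamped media units over best-effort UDP with a hand-built RTP-like header. The header layout is simplified and the parameter values (payload type, clock rate, pacing) are assumptions for illustration, not a conformant RTP implementation.

import socket
import struct
import time

def make_rtp_like_header(seq: int, timestamp: int, ssrc: int,
                         payload_type: int = 96) -> bytes:
    """Pack a simplified 12-byte RTP-style header: version 2, no padding,
    no extension, no CSRCs, marker bit clear."""
    byte0 = 0x80                 # version=2, P=0, X=0, CC=0
    byte1 = payload_type & 0x7F  # M=0, PT
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def send_stream(packets, dest=("127.0.0.1", 5004), clock_rate=90_000):
    """Send pre-packetized media units over best-effort UDP; any loss,
    delay, or reordering must be tolerated by the coder/decoder design."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    ssrc = 0x1234ABCD            # arbitrary source identifier
    start = time.time()
    for seq, payload in enumerate(packets):
        ts = int((time.time() - start) * clock_rate)
        sock.sendto(make_rtp_like_header(seq, ts, ssrc) + payload, dest)
        time.sleep(0.02)         # crude pacing at roughly 50 packets/s
    sock.close()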
Alan Gauton, University of Strathclyde, Scotland (U.K.)
Tariq S. Durrani, University of Strathclyde, Scotland (U.K.)
MHEG is a new multimedia and hypermedia standard proposed by ISO/IEC. This paper presents a new software authoring environment based around MHEG-5 that offers users a vehicle for creating multimedia applications that can interact with external programs performing computationally intensive tasks. MEMIS provides a linkage between a multimedia front-end and externally available computational processes. The paper provides background on the development of the environment by identifying the facilities offered by MHEG, discusses the relative merits of MHEG versus Java for multimedia development, and then covers the development process, including specific exemplars of the environment for managing multimedia applications: (a) real-time signal processing embedded within a LabVIEW kernel, (b) ATM, (c) a set-top box, and (d) a kiosk for Internet commerce that utilises MATLAB-type calls. The results include a system-level architecture for multimedia implementation and the timing requirements for such applications.