ICASSP '98 Abstracts - Session AE2


 
AE2.1
Amplitude Modulated Sinusoidal Modeling Using Least-square Infinite Series Approximation with Applications to Timbre Analysis
W. Goh, K. Chan  (Nanyang Technological University, Singapore)
A least-square infinite series approximation (L-SISA) technique is proposed for modeling amplitude modulated (AM) sinusoidal components of naturally occurring signals, such as those produced by traditional musical instruments. Each AM sinusoid is iteratively extracted using an analysis-by-synthesis technique, and the parameter estimation problem is linearised for least-square approximation through a systematic search in the frequency vector space. Some timbre analysis results obtained using the AM sinusoidal model are presented.
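For orientation, here is a minimal numpy sketch of the linearisation idea (not the authors' L-SISA formulation): with the frequency fixed to a grid candidate, an amplitude envelope expanded in a short series times the two quadratures makes the fit linear in the coefficients, and the candidate with the smallest residual is kept. The polynomial envelope, grid, and model order are illustrative assumptions.

```python
# Sketch: fit an AM sinusoid a(t)*cos(2*pi*f*t + phi) by linear least squares.
# For a fixed candidate frequency f, writing the component as
# sum_k t**k * (c_k*cos(2*pi*f*t) + s_k*sin(2*pi*f*t)) is linear in the
# series coefficients; f itself is found by a grid search.
import numpy as np

def fit_am_sinusoid(x, fs, f_grid, order=3):
    t = np.arange(len(x)) / fs
    best = None
    for f in f_grid:
        c, s = np.cos(2 * np.pi * f * t), np.sin(2 * np.pi * f * t)
        # Design matrix: polynomial-envelope terms times the two quadratures.
        A = np.column_stack([t**k * q for k in range(order + 1) for q in (c, s)])
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        err = np.sum((x - A @ coef) ** 2)
        if best is None or err < best[0]:
            best = (err, f, coef, A @ coef)
    return best  # (residual energy, frequency, coefficients, synthesized component)

fs = 8000
t = np.arange(4000) / fs
x = (1 + 0.5 * t) * np.cos(2 * np.pi * 440 * t) + 0.01 * np.random.randn(len(t))
err, f_hat, coef, comp = fit_am_sinusoid(x, fs, np.arange(430.0, 450.0, 0.5))
print(f_hat)  # ~440 Hz; subtract `comp` and repeat to extract the next component
```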
 
AE2.2
Multi-Pitch Estimation for Polyphonic Signals
P. Fernandez-Cid, F. Casajus-Quiros  (GAPS-SSR-ETSIT-UPM, Spain)
The goal of automatic score transcription is to obtain a score-like representation (note pitches through time) from musical signals. Reliable pitch extraction methods exist for monophonic signals, but polyphonic signals are much more difficult, and often ambiguous, to analyze. We propose a computationally efficient technique for automatic recognition of notes in a polyphonic signal. It looks for correctly shaped peaks (in both magnitude and phase) in a multiscale decomposition of the signal that is oversampled in time and frequency. Peaks (partial candidates) are accepted or discarded according to their match to the window spectrum shape and to continuity-across-scale constraints. The final list of partials yields a resharpened and equalized spectrum. Note candidates are found by searching for harmonic patterns. Perceptual and source-based rejection criteria help discard false notes frame by frame. Slightly non-causal postprocessing uses continuity (over a <150 ms observation time) to remove notes that are too short, fill in gaps, and correct (sub)octave jumps.
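A rough sketch of the harmonic-pattern search stage, under simplifying assumptions (a single frame, a plain FFT in place of the oversampled multiscale decomposition, and a naive peak-height test in place of the shape and across-scale checks):

```python
# Score each candidate fundamental by how many of its harmonics land on a
# detected spectral peak. All thresholds here are illustrative.
import numpy as np
from scipy.signal import find_peaks

def note_candidates(x, fs, f0_grid, n_harm=5, tol_hz=15.0):
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)), 4 * len(x)))
    freqs = np.fft.rfftfreq(4 * len(x), 1 / fs)
    idx, _ = find_peaks(spec, height=spec.max() * 0.05)
    peak_f = freqs[idx]
    scores = []
    for f0 in f0_grid:
        harm = f0 * np.arange(1, n_harm + 1)
        hits = sum(np.min(np.abs(peak_f - h)) < tol_hz for h in harm)
        scores.append(hits / n_harm)
    return np.asarray(scores)

# Two simultaneous notes, five 1/k-weighted harmonics each.
fs = 16000
t = np.arange(2048) / fs
x = sum(np.sin(2 * np.pi * f0 * k * t) / k for f0 in (220.0, 330.0) for k in range(1, 6))
f0_grid = np.arange(100.0, 500.0, 5.0)
s = note_candidates(x, fs, f0_grid)
print(f0_grid[s > 0.9])  # -> values at 220 and 330 Hz
```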
 
AE2.3
Suppression of Transients in Time-Varying Recursive Filters for Audio Signals
V. Valimaki, T. Laakso  (Helsinki University of Technology, Finland)
A new method for suppressing transients in time-varying recursive filters is proposed. The technique is based on modifying the state variables when the filter coefficients are changed, so that the filter enters its new state smoothly, without transient attacks, as originally proposed by Zetterberg and Zhang. In this contribution we modify the Zetterberg-Zhang algorithm to render it feasible for efficient implementation. We explain how to determine an optimal transient suppressor that cancels transients down to a desired level at minimum implementation complexity. The application of the method to time-varying all-pole and direct-form II filter structures is studied; the algorithm may be generalized to any recursive filter structure. The transient suppression technique finds applications in audio signal processing where the characteristics of a recursive filter need to be changed in real time, such as in music synthesis, auralization, and equalization.
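The sketch below demonstrates only the problem being solved, not the paper's method: a recursive filter whose coefficients change mid-signal produces a transient, and even the choice between resetting and carrying the direct-form state across the change alters the attack. The Zetterberg-Zhang family of methods goes further and explicitly modifies the state. The filter designs and test signal are arbitrary choices.

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 8000
x = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)
b1, a1 = butter(2, 1000 / (fs / 2))  # low-pass, 1 kHz cutoff
b2, a2 = butter(2, 300 / (fs / 2))   # switched-in filter: 300 Hz cutoff

# Reset the state at the switch: the new filter starts from rest.
y_reset = np.concatenate([lfilter(b1, a1, x[:fs // 2]),
                          lfilter(b2, a2, x[fs // 2:])])

# Carry the state across the switch instead.
y1, zf = lfilter(b1, a1, x[:fs // 2], zi=np.zeros(2))
y2, _ = lfilter(b2, a2, x[fs // 2:], zi=zf)
y_carry = np.concatenate([y1, y2])

d = np.abs(y_reset - y_carry)
# Identical before the switch; afterwards they differ by a decaying
# transient, whose handling is exactly what the paper optimizes.
print(d[:fs // 2].max(), d[fs // 2:fs // 2 + 200].max())
```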
 
AE2.4
An Analysis/Synthesis Tool for Transient Signals That Allows a Flexible Sines+Transients+Noise Model for Audio
T. Verma, T. Meng  (Stanford University, USA)
We present a flexible analysis/synthesis tool for transient signals that extends current sinusoidal and sines+noise models for audio to sines+transients+noise. The explicit handling of transients provides a more realistic and robust signal model. Because the transient model presented is the frequency-domain dual of sinusoidal modeling, it has similar flexibility and allows a wide range of transformations on the parameterized signal. Moreover, due to this duality, a major portion of the transient model is sinusoidal modeling performed in a frequency domain. To make the transient and sinusoidal models work together more effectively, we formulate sinusoidal modeling (and therefore transient modeling) in terms of matching pursuits and overlap-add synthesis. This formulation tightly couples the sines+transients+noise model because it admits a simple tonality-based heuristic for deciding when an audio signal should be modeled as sines, transients, noise, or a combination of the three.
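As a hedged illustration of the matching-pursuit view of sinusoidal modeling (the time-domain side only; the paper applies the same pursuit to a frequency-domain dual to capture transients), consider this greedy pick-and-subtract loop over a Fourier dictionary. Window choice and atom count are assumptions.

```python
import numpy as np

def sine_pursuit(frame, fs, n_atoms=10):
    w = np.hanning(len(frame))
    residual = frame.copy()
    params = []
    for _ in range(n_atoms):
        spec = np.fft.rfft(residual * w)
        k = np.argmax(np.abs(spec[1:-1])) + 1         # dominant bin (skip DC/Nyquist)
        freq = k * fs / len(frame)
        amp = 2 * np.abs(spec[k]) / np.sum(w)         # window-gain corrected
        phase = np.angle(spec[k])
        atom = amp * np.cos(2 * np.pi * freq * np.arange(len(frame)) / fs + phase)
        residual = residual - atom                    # greedy subtraction
        params.append((amp, freq, phase))
    return params, residual

fs = 8000
n = np.arange(1024)
x = 0.8 * np.cos(2 * np.pi * 500 * n / fs) + 0.3 * np.cos(2 * np.pi * 1500 * n / fs + 1.0)
params, _ = sine_pursuit(x, fs, n_atoms=2)
print(params)  # ~ (0.8, 500 Hz, 0.0) and (0.3, 1500 Hz, 1.0)
```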
 
AE2.5
A New Frequency Domain Approach to Time-Scale Expansion of Audio Signals
A. Ferreira  (FEUP/INESC, Portugal)
We present a new algorithm for time-scale expansion of audio signals that comprises time interpolation, frequency-scale expansion, and modification of a spectral representation of the signal. The algorithm relies on an accurate model of signal analysis and synthesis, and is constrained to a non-iterative modification of the magnitudes and wrapped phases of the relevant sinusoidal components of the signal. The structure of the algorithm is described and its performance is illustrated. A few examples of time-expanded wideband speech can be found on the Internet.
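Since the abstract only summarizes the algorithm, the sketch below shows a generic phase-vocoder time-scale expansion for reference: it likewise modifies STFT magnitudes and phases, but it is not the author's non-iterative, component-selective method. Window, hop, and the unnormalized overlap gain are illustrative choices.

```python
import numpy as np

def time_stretch(x, stretch, n_fft=1024, hop_a=256):
    """Generic phase-vocoder time stretch; stretch > 1 expands the signal."""
    hop_s = int(round(hop_a * stretch))
    w = np.hanning(n_fft)
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) * hop_a / n_fft  # expected advance
    starts = range(0, len(x) - n_fft, hop_a)
    out = np.zeros(len(starts) * hop_s + n_fft)
    prev_spec, phase = None, None
    for i, p in enumerate(starts):
        spec = np.fft.rfft(w * x[p:p + n_fft])
        if prev_spec is None:
            phase = np.angle(spec)
        else:
            dphi = np.angle(spec) - np.angle(prev_spec) - omega
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))  # wrap to [-pi, pi]
            phase = phase + (omega + dphi) * (hop_s / hop_a)  # rescaled phase advance
        prev_spec = spec
        grain = np.fft.irfft(np.abs(spec) * np.exp(1j * phase), n_fft)
        out[i * hop_s:i * hop_s + n_fft] += w * grain  # overlap gain not normalized
    return out
```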
 
AE2.6
Robust Exponential Modeling of Audio Signals
J. Nieuwenhuijse, R. Heusdens, E. Deprettere  (Delft University of Technology, The Netherlands)
In this paper we present a numerically robust method for modeling audio signals which is based on an exponential data model. This model is a generalization of the classical sinusoidal model in the sense that it allows the amplitudes of the sinusoids to evolve exponentially. We show that, using this model, so-called attacks can be represented very efficiently, and we propose an algorithm for finding the exponentials in a robust way. Moreover, we show that a proper segmentation of the input data into variable-length segments drastically improves the signal-to-noise ratio compared with a fixed-length analysis.
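To fix notation, here is the classical (numerically fragile) Prony-style fit of the exponential model x[n] ≈ Σ_k c_k z_k^n that the abstract generalizes; the paper's actual contribution, robust estimation with variable-length segmentation, is not reproduced here.

```python
import numpy as np

def fit_exponentials(x, order):
    N = len(x)
    # 1) Linear prediction: x[n] + a1*x[n-1] + ... + ap*x[n-p] = 0 (noiseless case).
    A = np.column_stack([x[order - 1 - i:N - 1 - i] for i in range(order)])
    a, *_ = np.linalg.lstsq(A, -x[order:], rcond=None)
    # 2) Roots of the prediction polynomial are the poles z_k = exp(damping + j*freq).
    z = np.roots(np.concatenate(([1.0], a)))
    # 3) Complex amplitudes by least squares on the Vandermonde system.
    V = z[np.newaxis, :] ** np.arange(N)[:, np.newaxis]
    c, *_ = np.linalg.lstsq(V, x.astype(complex), rcond=None)
    return z, c  # x[n] ~ sum_k c_k * z_k**n

# One exponentially decaying sinusoid, order-2 real model.
n = np.arange(400)
x = np.exp(-0.005 * n) * np.cos(0.3 * n + 0.7)
z, c = fit_exponentials(x, 2)
print(np.log(np.abs(z)), np.angle(z))  # ~ -0.005 damping, +/-0.3 rad frequency
```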
 
AE2.7
Multiresolution Sinusoidal Modeling for Wideband Audio with Modifications
S. Levine  (Stanford / CCRMA, USA);   T. Verma  (Stanford / Center for Integrated Systems, USA);   J. Smith III  (Stanford / CCRMA, USA)
In this paper, we describe a computationally efficient method of generating more accurate sinusoidal parameters {amplitude, frequency, phase} from a wideband polyphonic audio source in a multiresolution, non-aliased fashion. This significantly improves upon previous sinusoidal modeling work, which assumes a single-pitched monophonic source such as speech or an individual musical instrument, while using approximately the same number of sinusoids. In addition to a more general analysis, we can now perform high-quality modifications such as time-stretching and pitch-shifting on polyphonic audio with ease.
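A toy version of the multiresolution idea, with made-up band edges and window lengths (the paper's actual filter-bank design is not given in the abstract): long windows give fine frequency resolution at low frequencies, short windows give fine time resolution at high frequencies.

```python
import numpy as np
from scipy.signal import find_peaks

def multires_peaks(x, fs, bands=((0, 500, 4096), (500, 2000, 1024), (2000, 8000, 256))):
    """bands: (low Hz, high Hz, window length) triples; assumes len(x) >= 4096."""
    out = []
    for lo, hi, n in bands:
        frame = x[:n] * np.hanning(n)       # one frame per resolution, for brevity
        spec = np.abs(np.fft.rfft(frame))
        freqs = np.fft.rfftfreq(n, 1 / fs)
        idx, _ = find_peaks(spec, height=spec.max() * 0.05)
        out += [(freqs[i], spec[i]) for i in idx if lo <= freqs[i] < hi]
    return out  # (frequency, magnitude) pairs, each from the matching resolution
```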
 
AE2.8
Efficient Analysis/Synthesis of Percussion Musical Instrument Sounds Using an All-Pole Model
M. Macon  (Oregon Graduate Institute, USA);   A. McCree, W. Lai, V. Viswanathan  (Texas Instruments, USA)
It is well known that an impulse-excited all-pole filter can represent many physical phenomena, including the oscillatory modes of percussion musical instruments like woodblocks, xylophones, or chimes. In contrast to the more common application of all-pole models to speech, however, practical problems arise in music synthesis due to the location of poles very close to the unit circle. The objective of this work was to develop algorithms to find excitation and filter parameters for synthesis of percussion instrument sounds using only an inexpensive all-pole filter chip (TI TSP50C1x). The paper describes analysis methods for dealing with pole locations near the unit circle, as well as a general method for modeling the transient attack characteristics of a particular sound while independently controlling the amplitudes of each oscillatory mode.
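A minimal sketch of impulse-excited all-pole analysis/synthesis using the standard autocorrelation (Yule-Walker) method; the synthetic two-mode signal and model order are assumptions, and none of the paper's special handling of poles near the unit circle is attempted.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc(x, order):
    r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + order]
    a = solve_toeplitz(r[:order], r[1:order + 1])  # Yule-Walker normal equations
    return np.concatenate(([1.0], -a))             # A(z) = 1 - sum_i a_i z^-i

# Synthetic two-mode "woodblock": two damped sinusoids.
fs = 16000
n = np.arange(fs // 4)
x = (np.exp(-60 * n / fs) * np.sin(2 * np.pi * 800 * n / fs)
     + 0.5 * np.exp(-40 * n / fs) * np.sin(2 * np.pi * 2150 * n / fs))
a = lpc(x, order=4)                                # 2 poles per oscillatory mode
excitation = np.zeros(len(n)); excitation[0] = 1.0
y = lfilter([1.0], a, excitation)                  # resynthesis (gain unnormalized)
```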
 
AE2.9
Music Recognition Using Note Transition Context
K. Kashino, H. Murase  (NTT Basic Research Laboratories, Japan)
As a typical example of sound-mixture recognition, the recognition of ensemble music is addressed. Here, music recognition is defined as recognizing the pitch and the instrument name for each musical note in monaural or stereo recordings of real music performances. The first key part of the proposed method is adaptive template matching, which copes with variability in musical sounds and is employed in the hypothesis-generation stage. The second key part is musical context integration based on probabilistic networks, employed in the hypothesis-verification stage. The evaluation results clearly show the advantages of these two processes.
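A guess at the shape of the hypothesis-generation step, reduced to its simplest form: stored note templates matched to an observed magnitude spectrum, each with a least-squares gain adaptation. The paper's adaptive templates and probabilistic-network verification are considerably richer than this.

```python
import numpy as np

def match_templates(spectrum, templates):
    """templates: dict of (note, instrument) name -> magnitude-spectrum template."""
    scores = {}
    for name, t in templates.items():
        g = np.dot(t, spectrum) / np.dot(t, t)  # best least-squares gain for this template
        residual = spectrum - max(g, 0.0) * t
        scores[name] = 1.0 - np.sum(residual**2) / np.sum(spectrum**2)
    return scores  # higher = better explained by that note hypothesis
```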
 
AE2.10
A System for Machine Recognition of Music Patterns
E. Coyle  (Purdue University, USA);   I. Shmulevich  (University of Nijmegen, The Netherlands)
We introduce a system for machine recognition of music patterns. The problem is put into a pattern recognition framework in the sense that an error between a target pattern and a scanned pattern is minimized. The error takes into account pitch and rhythm information. The pitch error measure consists of an absolute error and a perceptual error. The latter depends on an algorithm for establishing the tonal context, which is based on Krumhansl's key-finding algorithm. The sequence of maximum correlations that it outputs is smoothed with a cubic spline and is used to determine weights for the perceptual and absolute pitch errors. The maximum correlations are also used to create the assigned key sequence, which is then filtered by a recursive median filter to improve the structure of the key-finding algorithm's output. A procedure for choosing the weights given to pitch and rhythm errors is discussed.
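The key-finding ingredient can be made concrete: the Krumhansl-style approach correlates a pitch-class duration profile against the 24 rotated major/minor key profiles and keeps the maximum correlation, which is the quantity the paper smooths and median-filters. The profile values below are the published Krumhansl-Kessler ratings; the surrounding interface is assumed.

```python
import numpy as np

# Krumhansl-Kessler probe-tone profiles, tonic first (C when unrotated).
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NAMES = "C C# D D# E F F# G G# A A# B".split()

def find_key(pc_durations):
    """pc_durations: total note duration per pitch class C..B (length 12)."""
    best = (-2.0, "")
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            r = np.corrcoef(np.roll(profile, tonic), pc_durations)[0, 1]
            best = max(best, (r, f"{NAMES[tonic]} {mode}"))
    return best  # (maximum correlation, key name)

# A C-major scale with equal durations should pick C major.
print(find_key(np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1.0])))
```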
 
