Computer Music

Chair: J. Laroche, E-mu, USA



Amplitude Modulated Sinusoidal Modeling Using Least-square Infinite Series Approximation with Applications to Timbre Analysis

Authors:

Wooi-Boon Goh, Nanyang Technological University (Singapore)
Kai-Yun Chan, Nanyang Technological University (Singapore)

Volume 6, Page 3561, Paper number 1287

Abstract:

A least-square infinite series approximation (L-SISA) technique is proposed for modeling amplitude modulated (AM) sinusoidal components of naturally occurring signals, such as those produced by traditional musical instruments. Each AM sinusoid is iteratively extracted using an analysis-by-synthesis technique and the problem of parameter estimation is linearised for least-square approximation through a systematic search in the frequency vector space. Some timbre analysis results obtained using the AM sinusoidal model are presented.
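
The extraction loop below is a minimal sketch of this analysis-by-synthesis idea, not the paper's L-SISA algorithm: the amplitude envelope is represented by a low-order polynomial basis (a stand-in for the infinite-series expansion), each candidate frequency on a coarse grid is fitted by linear least squares, and the best-fitting AM sinusoid is subtracted before the next iteration. The grid spacing, basis order, and component count are illustrative assumptions.

```python
# Sketch only: polynomial-envelope AM sinusoids extracted by analysis-by-synthesis.
import numpy as np

def fit_am_sinusoid(x, fs, freqs, n_basis=4):
    """For each candidate frequency, fit a(t)*cos + b(t)*sin with polynomial
    envelopes a(t), b(t) by least squares; return the best fit."""
    t = np.arange(len(x)) / fs
    best = (None, None, np.inf)
    for f in freqs:
        c, s = np.cos(2 * np.pi * f * t), np.sin(2 * np.pi * f * t)
        # Design matrix: polynomial envelope terms modulating the quadrature pair.
        P = np.vander(t, n_basis, increasing=True)
        A = np.hstack([P * c[:, None], P * s[:, None]])
        coef, *_ = np.linalg.lstsq(A, x, rcond=None)
        err = np.sum((x - A @ coef) ** 2)
        if err < best[2]:
            best = (f, A @ coef, err)
    return best  # (frequency, synthesized AM sinusoid, residual energy)

def extract_components(x, fs, n_components=3):
    residual, parts = x.astype(float), []
    freqs = np.linspace(50.0, fs / 2 - 50.0, 400)    # coarse frequency search grid
    for _ in range(n_components):
        f, synth, _ = fit_am_sinusoid(residual, fs, freqs)
        parts.append((f, synth))
        residual = residual - synth                  # analysis-by-synthesis step
    return parts, residual
```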

ic981287.pdf (From Postscript)




Multi-Pitch Estimation for Polyphonic Musical Signals

Authors:

Pablo Fernandez-Cid, GAPS-SSR-ETSIT-UPM (Spain)
Francisco Javier Casajus-Quiros, GAPS-SSR-ETSIT-UPM (Spain)

Volume 6, Page 3565, Paper number 1402

Abstract:

Automatic score transcription aims to obtain a score-like representation (note pitches through time) from musical signals. Reliable pitch extraction methods exist for monophonic signals, but polyphonic signals are much more difficult, and often ambiguous, to analyze. We propose a computationally efficient technique for automatic recognition of notes in a polyphonic signal. It looks for correctly shaped peaks (in both magnitude and phase) in a time- and frequency-oversampled multiscale decomposition of the signal. Peaks (partial candidates) are accepted or discarded according to their match to the window spectrum shape and to continuity-across-scale constraints. The final partial list builds a resharpened and equalized spectrum. Note candidates are found by searching for harmonic patterns. Perceptual and source-based rejection criteria help discard false notes frame by frame. Slightly non-causal postprocessing uses continuity (across an observation time of less than 150 ms) to remove notes that are too short, fill in gaps, and correct (sub)octave jumps.
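
A minimal sketch of the harmonic-pattern-search stage only, assuming the partial list (peak frequencies and magnitudes) has already been produced by the multiscale peak picker; the tolerance and support thresholds are illustrative, and the perceptual/source rejection criteria and postprocessing are not shown.

```python
import numpy as np

def note_candidates(peak_freqs, peak_mags, f0_grid, n_harm=8,
                    tol_cents=30.0, min_support=4):
    """Score candidate fundamentals by the peaks that line up with their harmonics."""
    freqs = np.asarray(peak_freqs, dtype=float)
    mags = np.asarray(peak_mags, dtype=float)
    cands = []
    for f0 in f0_grid:
        score, supported = 0.0, 0
        for k in range(1, n_harm + 1):
            # Distance (in cents) from every detected peak to the k-th harmonic.
            dist = np.abs(1200.0 * np.log2(freqs / (k * f0)))
            i = int(np.argmin(dist))
            if dist[i] < tol_cents:
                score += mags[i]
                supported += 1
        if supported >= min_support:
            cands.append((f0, score))
    return sorted(cands, key=lambda c: -c[1])   # strongest candidates first
```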

ic981402.pdf (From Postscript)




Suppression of Transients in Time-Varying Recursive Filters for Audio Signals

Authors:

Vesa Valimaki, Helsinki University of Technology (Finland)
Timo I. Laakso, Helsinki University of Technology (Finland)

Volume 6, Page 3569, Paper number 1497

Abstract:

A new method for suppressing transients in time-varying recursive filters is proposed. The technique is based on modifying the state variables when the filter coefficients are changed so that the filter enters its new state smoothly, without transient attacks, as originally proposed by Zetterberg and Zhang. In this contribution we modify the Zetterberg-Zhang algorithm to render it feasible for efficient implementation. We explain how to determine an optimal transient suppressor that cancels the transients down to a desired level at minimum implementation complexity. The application of the method to time-varying all-pole and direct-form II filter structures is studied. The algorithm may be generalized to any recursive filter structure. The transient suppression technique finds applications in audio signal processing where the characteristics of a recursive filter need to be changed in real time, such as in music synthesis, auralization, and equalization.
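
The snippet below is not the Zetterberg-Zhang update or the paper's efficient suppressor; it is a brute-force baseline with the same goal, shown for orientation: when the coefficients change, the new filter's state is "warm-started" by re-filtering a short history of the input, so the output continues without a switching transient. The `history_len` value and the coefficient-schedule format are assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def filter_with_warm_start(x, coeff_schedule, history_len=256):
    """coeff_schedule: list of (start_index, b, a), ascending start_index, first at 0."""
    y = np.zeros(len(x))
    b0, a0 = coeff_schedule[0][1], coeff_schedule[0][2]
    zi = np.zeros(max(len(b0), len(a0)) - 1)
    for i, (start, b, a) in enumerate(coeff_schedule):
        stop = coeff_schedule[i + 1][0] if i + 1 < len(coeff_schedule) else len(x)
        if i > 0:
            # Warm start: run the *new* filter over recent input so its internal
            # state is consistent with the new coefficients (suppresses the transient).
            hist = x[max(0, start - history_len):start]
            _, zi = lfilter(b, a, hist, zi=np.zeros(max(len(b), len(a)) - 1))
        y[start:stop], zi = lfilter(b, a, x[start:stop], zi=zi)
    return y
```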

ic981497.pdf (From Postscript)




An Analysis/Synthesis Tool for Transient Signals That Allows a Flexible Sines+Transients+Noise Model for Audio

Authors:

Tony S. Verma, Stanford University (U.S.A.)
Teresa H.Y. Meng, Stanford University (U.S.A.)

Volume 6, Page 3573, Paper number 1631

Abstract:

We present a flexible analysis/synthesis tool for transient signals that extends current sinusoidal and sines+noise models for audio to sines+transients+noise. The explicit handling of transients provides a more realistic and robust signal model. Because the transient model presented is the frequency-domain dual of sinusoidal modeling, it has similar flexibility and allows for a wide range of transformations on the parameterized signal. In addition, due to this duality, a major portion of the transient model is sinusoidal modeling performed in the frequency domain. In order to make the transient and sinusoidal models work more effectively together, we present a formulation of sinusoidal modeling (and therefore transient modeling) in terms of matching pursuits and overlap-add synthesis. This formulation provides a tight coupling within the sines+transients+noise model because it allows a simple heuristic, based on tonality, to decide when an audio signal should be modeled as sines and/or transients and/or noise.
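
A minimal sketch of frame-wise matching pursuit over windowed-sinusoid atoms, the formulation the abstract uses to couple the sine and transient models. Transient modeling (the same procedure applied to a frequency-domain frame), the tonality heuristic, and overlap-add across frames are omitted; the frequency grid and atom count are illustrative.

```python
import numpy as np

def sinusoidal_mp(frame, n_atoms=10, oversample=4):
    """Greedy matching pursuit: repeatedly project the residual onto the best
    windowed cos/sin pair and subtract it."""
    N = len(frame)
    w = np.hanning(N)
    n = np.arange(N)
    freqs = np.arange(1, oversample * N // 2) / (oversample * N)   # normalized grid
    residual = frame.astype(float)
    recon, params = np.zeros(N), []
    for _ in range(n_atoms):
        best = (None, None, -1.0)
        for f in freqs:
            c = w * np.cos(2 * np.pi * f * n)
            s = w * np.sin(2 * np.pi * f * n)
            A = np.column_stack([c, s])
            coef, *_ = np.linalg.lstsq(A, residual, rcond=None)
            comp = A @ coef
            e = comp @ comp                     # energy captured by this atom pair
            if e > best[2]:
                best = (f, comp, e)
        f, comp, _ = best
        residual = residual - comp              # subtract the selected component
        recon = recon + comp
        params.append((f, np.linalg.norm(comp)))
    return params, recon, residual
```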

ic981631.pdf (From Postscript)




A New Frequency Domain Approach to Time-Scale Expansion of Audio Signals

Authors:

Anibal J.S. Ferreira, The University of Porto (Portugal)

Volume 6, Page 3577, Paper number 1671

Abstract:

We present a new algorithm for time-scale expansion of audio signals that comprises time interpolation, frequency-scale expansion, and modification of a spectral representation of the signal. The algorithm relies on an accurate model of signal analysis and synthesis, and was constrained to a non-iterative modification of the magnitudes and wrapped phases of the relevant sinusoidal components of the signal. The structure of the algorithm is described and its performance is illustrated. A few examples of time-expanded wideband speech can be found on the Internet.
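
For orientation only: the sketch below is the conventional phase-vocoder baseline for time-scale expansion, not the paper's non-iterative frequency-domain algorithm. It illustrates the shared underlying step of rescaling sinusoidal phase increments while keeping magnitudes; the frame size, hop, and the lack of output-level normalization are simplifications.

```python
import numpy as np

def time_stretch(x, stretch, n_fft=1024, hop=256):
    """Expand by `stretch` (>1 slows down): analysis frames are hop/stretch apart,
    synthesis frames a fixed `hop` apart. Output level is not normalized."""
    w = np.hanning(n_fft)
    k = np.arange(n_fft // 2 + 1)
    ana_pos = np.arange(0, len(x) - n_fft, hop / stretch).astype(int)
    y = np.zeros(len(ana_pos) * hop + n_fft)
    prev = np.fft.rfft(w * x[ana_pos[0]:ana_pos[0] + n_fft])
    phase = np.angle(prev)
    for i, p in enumerate(ana_pos):
        spec = np.fft.rfft(w * x[p:p + n_fft])
        if i > 0:
            delta = max(p - ana_pos[i - 1], 1)                # actual analysis hop
            expected = 2 * np.pi * k * delta / n_fft
            dphi = np.angle(spec) - np.angle(prev) - expected
            dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))  # wrapped phase deviation
            phase += (expected + dphi) * (hop / delta)        # rescale to synthesis hop
        out = np.fft.irfft(np.abs(spec) * np.exp(1j * phase), n_fft)
        y[i * hop:i * hop + n_fft] += w * out                 # weighted overlap-add
        prev = spec
    return y
```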

ic981671.pdf (From Postscript)




Robust Exponential Modeling of Audio Signals

Authors:

Joost Nieuwenhuijse, Delft University of Technology (The Netherlands)
Richard Heusdens, Delft University of Technology (The Netherlands)
Ed F. Deprettere, Delft University of Technology (The Netherlands)

Volume 6, Page 3581, Paper number 1997

Abstract:

In this paper we present a numerically robust method for modeling audio signals which is based on an exponential data model. This model is a generalization of the classical sinusoidal model in the sense that it allows the amplitude of the sinusoids to evolve exponentially. We show that, using this model, so-called attacks can be represented very efficiently and we propose an algorithm for finding the exponentials in a robust way. Moreover, we show that by using a proper segmentation of the input data into variable length segments the signal-to-noise ratio can be drastically improved as compared to a fixed-length analysis.
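
A minimal sketch of the synthesis side of such an exponential model, assuming pole estimates (a frequency and damping per component) are already available: the complex amplitudes then follow from a linear least-squares fit. The paper's robust pole-estimation procedure and variable-length segmentation are not reproduced here.

```python
import numpy as np

def fit_exponential_model(x, fs, freqs_hz, dampings):
    """Poles z_k = exp((-d_k + j*2*pi*f_k)/fs); returns complex amplitudes and
    the reconstructed (real) signal x[n] ~ Re{ sum_k a_k z_k^n } + conjugates."""
    n = np.arange(len(x))
    poles = np.exp((-np.asarray(dampings) + 2j * np.pi * np.asarray(freqs_hz)) / fs)
    # Vandermonde basis of the poles and their conjugates (real signal model).
    V = np.hstack([poles[None, :] ** n[:, None],
                   np.conj(poles)[None, :] ** n[:, None]])
    amps, *_ = np.linalg.lstsq(V, x.astype(complex), rcond=None)
    return amps, (V @ amps).real
```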

ic981997.pdf (From Postscript)




Multiresolution Sinusoidal Modeling for Wideband Audio with Modifications

Authors:

Scott N Levine, Stanford University (U.S.A.)
Tony S. Verma, Stanford University (U.S.A.)
Julius O Smith III, Stanford University (U.S.A.)

Volume 6, Page 3585, Paper number 2104

Abstract:

In this paper, we describe a computationally efficient method of generating more accurate sinusoidal parameters {amplitude, frequency, phase} from a wideband polyphonic audio source in a multiresolution, non-aliased fashion. This significantly improves upon previous work on sinusoidal modeling, which assumes a single-pitched monophonic source such as speech or an individual musical instrument, while using approximately the same number of sinusoids. In addition to a more general analysis, we can now perform high-quality modifications such as time-stretching and pitch-shifting on polyphonic audio with ease.
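
A minimal sketch of the multiresolution idea only: peaks are picked with a long analysis window in the low band and progressively shorter windows higher up, so each band gets an appropriate time-frequency trade-off. The band edges and window lengths are illustrative, the input frame is assumed to be at least as long as the largest window, and the paper's non-aliasing filter-bank front end is not reproduced.

```python
import numpy as np

def multires_peaks(frame, fs, bands=((0, 500, 4096), (500, 2000, 1024),
                                     (2000, 8000, 256))):
    """bands: (f_low_hz, f_high_hz, window_length). Returns (freq, amp, phase) peaks."""
    peaks = []
    for f_lo, f_hi, N in bands:
        w = np.hanning(N)
        spec = np.fft.rfft(frame[:N] * w)       # window length chosen per band
        freqs = np.arange(len(spec)) * fs / N
        mag = np.abs(spec)
        for k in range(1, len(spec) - 1):
            # Local spectral maxima inside this band only.
            if f_lo <= freqs[k] < f_hi and mag[k] > mag[k - 1] and mag[k] > mag[k + 1]:
                peaks.append((freqs[k], 2 * mag[k] / np.sum(w), np.angle(spec[k])))
    return sorted(peaks)
```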

ic982104.pdf (From Postscript)




Efficient Analysis/Synthesis of Percussion Musical Instrument Sounds Using an All-Pole Model

Authors:

Michael W Macon, Oregon Graduate Institute (U.S.A.)
Alan V. McCree, Texas Instruments (U.S.A.)
Wai-Ming Lai, Texas Instruments (U.S.A.)
Vishu Viswanathan, Texas Instruments (U.S.A.)

Volume 6, Page 3589, Paper number 2207

Abstract:

It is well-known that an impulse-excited, all-pole filter is capable of representing many physical phenomena, including the oscillatory modes of percussion musical instruments like woodblocks, xylophones, or chimes. In contrast to the more common application of all-pole models to speech, however, practical problems arise in music synthesis due to the location of poles very close to the unit circle. The objective of this work was to develop algorithms to find excitation and filter parameters for synthesis of percussion instrument sounds using only an inexpensive all-pole filter chip (TI TSP50C1x). The paper describes analysis methods for dealing with pole locations near the unit circle, as well as a general method for modeling the transient attack characteristics of a particular sound while independently controlling the amplitudes of each oscillatory mode.
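
A minimal sketch of the basic impulse-excited all-pole pipeline the paper builds on: autocorrelation-method LPC analysis followed by synthesis from a single scaled impulse. The paper's contributions (handling poles very close to the unit circle, attack modeling, independent per-mode amplitude control, and the TSP50C1x mapping) are not reproduced; `recorded_hit` in the usage comment is a hypothetical input signal.

```python
import numpy as np
from scipy.signal import lfilter

def lpc(x, order):
    """All-pole coefficients from the autocorrelation method: A(z) = 1 - sum a_k z^-k."""
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))

def synthesize(a, gain, length):
    """Excite the all-pole filter 1/A(z) with a single scaled impulse."""
    excitation = np.zeros(length)
    excitation[0] = gain
    return lfilter([1.0], a, excitation)

# Example (hypothetical input): a = lpc(recorded_hit, order=40); y = synthesize(a, 1.0, 44100)
```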

ic982207.pdf (From Postscript)




Music Recognition Using Note Transition Context

Authors:

Kunio Kashino, NTT Basic Research Laboratories (Japan)
Hiroshi Murase, NTT Basic Research Laboratories (Japan)

Volume 6, Page 3593, Paper number 2234

Abstract:

As a typical example of sound-mixture recognition, the recognition of ensemble music is addressed. Here music recognition is defined as recognizing the pitch and the name of the instrument for each musical note in monaural or stereo recordings of real music performances. The first key part of the proposed method is adaptive template matching that can cope with variability in musical sounds; this is employed in the hypothesis-generation stage. The second key part is musical context integration based on probabilistic networks; this is employed in the hypothesis-verification stage. The evaluation results clearly show the advantages of these two processes.
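
A minimal sketch of the hypothesis-generation step only, assuming a set of per-(instrument, pitch) magnitude-spectrum templates is available: each template is gain-adapted to the observed spectrum by least squares and ranked by residual error. The adaptive aspects of the paper's template matching and the probabilistic-network verification stage are not shown.

```python
import numpy as np

def generate_hypotheses(spectrum, templates, top_n=5):
    """templates: dict mapping (instrument, pitch) -> magnitude-spectrum template."""
    scores = []
    for label, tmpl in templates.items():
        g = float(tmpl @ spectrum) / float(tmpl @ tmpl + 1e-12)   # LS gain adaptation
        err = float(np.sum((spectrum - g * tmpl) ** 2))           # residual error
        scores.append((err, label, g))
    return sorted(scores, key=lambda s: s[0])[:top_n]             # best hypotheses first
```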

ic982234.pdf (From Postscript)




A System for Machine Recognition of Music Patterns

Authors:

Edward J. Coyle, Purdue University (U.S.A.)
Ilya Shmulevich, University of Nijmegen (The Netherlands)

Volume 6, Page 3597, Paper number 2541

Abstract:

We introduce a system for machine recognition of music patterns. The problem is cast in a pattern recognition framework in the sense that an error between a target pattern and a scanned pattern is minimized. The error takes into account pitch and rhythm information. The pitch error measure consists of an absolute error and a perceptual error. The latter depends on an algorithm for establishing the tonal context, which is based on Krumhansl's key-finding algorithm. The sequence of maximum correlations that it outputs is smoothed with a cubic spline and is used to determine weights for the perceptual and absolute pitch errors. The maximum correlations are used to create the assigned key sequence, which is then filtered by a recursive median filter to improve the structure of the output of the key-finding algorithm. A procedure for choosing the weights given to pitch and rhythm errors is discussed.
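
A minimal sketch of the key-finding correlation that the perceptual pitch error relies on: the pitch-class duration profile of a passage is correlated against rotated major and minor key profiles, and the winning key together with the maximum correlation is returned. The profile values are the commonly cited Krumhansl-Kessler probe-tone ratings, quoted here approximately; the spline smoothing and recursive median filtering of the key sequence are not shown.

```python
import numpy as np

# Approximate Krumhansl-Kessler probe-tone profiles (relative to the tonic).
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])

def find_key(pitch_class_durations):
    """pitch_class_durations: length-12 vector of total note durations per pitch class."""
    x = np.asarray(pitch_class_durations, dtype=float)
    best = (None, -2.0)
    for mode, profile in (("major", MAJOR), ("minor", MINOR)):
        for tonic in range(12):
            r = np.corrcoef(x, np.roll(profile, tonic))[0, 1]   # Pearson correlation
            if r > best[1]:
                best = ((tonic, mode), r)
    return best   # ((tonic_pitch_class, mode), maximum correlation)
```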

ic982541.pdf (From Postscript)
