
Timbre is an acoustic feature that is defined conceptually. In general, timbre refers to the "content" of a frame of audio signals, which ideally is not affected much by pitch or intensity. Theoretically, for quasi-periodic audio signals, we can use the waveform within one fundamental period as the timbre of the frame. However, it is difficult to analyze the waveform within a fundamental period directly. Instead, we usually use the fast Fourier transform (FFT) to transform the time-domain waveform into a frequency-domain spectrum for further analysis. The amplitude spectrum in the frequency domain simply represents the intensity of the waveform in each frequency band.
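As a rough illustration of the amplitude spectrum described above, the following Python sketch computes the magnitude of each frequency bin for one frame. It is not the MATLAB code used by this page: the function name `amplitude_spectrum`, the 8000 Hz sampling rate, the 64-sample frame, and the 1000 Hz test tone are all assumptions chosen for the example. A direct discrete Fourier transform is used for readability; an FFT routine computes the same values more efficiently.

```python
import cmath
import math

def amplitude_spectrum(frame):
    """Return the one-sided amplitude spectrum of a real-valued frame.

    Uses a direct O(N^2) DFT for clarity; an FFT gives the same values faster.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):  # bins 0 .. N/2 (non-negative frequencies)
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum

# Hypothetical test frame: a 1000 Hz sine sampled at 8000 Hz, 64 samples long.
fs = 8000
n = 64
frame = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(n)]

mag = amplitude_spectrum(frame)
peak_bin = max(range(len(mag)), key=lambda k: mag[k])
peak_freq = peak_bin * fs / n  # bin index converted to Hz
print(peak_bin, peak_freq)  # prints: 8 1000.0
```

Because the test tone completes a whole number of cycles within the frame, all of its energy falls into a single bin. Real speech frames rarely line up this neatly, so their energy spreads over neighboring bins.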

To try a real-time FFT demo, type the following command in the MATLAB command window:

The opened Simulink block system looks like this:

When you start running the system and speak into the microphone, you will be able to see the time-varying spectrum:

If we use different colors to represent the magnitude of the spectrum, we obtain the spectrogram, as shown next:
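A spectrogram of this kind can be computed by slicing the signal into overlapping frames and stacking the amplitude spectrum of each frame. The following is a minimal Python sketch under stated assumptions, not the Simulink model from this page: the helper names `amplitude_spectrum` and `spectrogram`, the Hamming window, the frame size of 64, the hop of 32, and the two-tone test signal are all hypothetical choices made for illustration.

```python
import cmath
import math

def amplitude_spectrum(frame):
    """One-sided amplitude spectrum via a direct DFT (slow but dependency-free)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, frame_size, hop):
    """Return one amplitude spectrum per frame (rows = time, columns = frequency)."""
    rows = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        # Hamming window (an assumed choice) to reduce leakage at frame edges.
        windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_size - 1)))
                    for i, x in enumerate(frame)]
        rows.append(amplitude_spectrum(windowed))
    return rows

# Hypothetical signal: 500 Hz for the first half, 2000 Hz for the second half.
fs = 8000
half = 256
signal = ([math.sin(2 * math.pi * 500 * t / fs) for t in range(half)] +
          [math.sin(2 * math.pi * 2000 * t / fs) for t in range(half)])

spec = spectrogram(signal, frame_size=64, hop=32)

# The strongest bin in each frame traces how the dominant frequency changes over time.
peaks = [max(range(len(row)), key=lambda k: row[k]) for row in spec]
print(len(spec), len(spec[0]))
print(peaks[0] * fs / 64, peaks[-1] * fs / 64)
```

Mapping each value in `spec` to a color, with time on one axis and frequency on the other, gives the image form of the spectrogram.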

A spectrogram represents the time-varying spectrum displayed as an image map. The same utterance corresponds to a similar spectrogram pattern. Some experienced people can understand the contents of speech by viewing the spectrogram alone. This is called "spectrogram reading", and related contests and information can be found on the Internet. For instance:


Audio Signal Processing and Recognition (音訊處理與辨識)