20-1 Chapter 20 Textbook Exercises

(*)Obtain info from a mono audio file: Write a MATLAB script that can read the wave file "welcome.wav" and display the following information within this script.

Number of sample points.
Sampling rate.
Bit resolution.
Number of channels.
Duration of the recording (in seconds).

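Write a MATLAB program wavInfo01.m that reads the file welcome.wav (on the CD that accompanies this book) and prints the following information:

Number of audio sample points
Sampling rate
Bit resolution (the number of bits used to represent each sample)
Number of channels
Duration of the recording (in seconds)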

(*)Obtain info from a stereo audio file: Repeat the previous exercise with a MATLAB program to obtain the same information from the wave file "flanger.wav".
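Repeat the previous exercise (name the program wavInfo02.m), but use the file flanger.wav instead (also on the CD that accompanies this book). Ordinary CD music is recorded in stereo, with a sampling rate of 44.1 kHz and a resolution of 16 bits/sample. Under these conditions, roughly how much disk space does a four-minute song occupy when it is written as a wav file? (This exercise should show you how valuable compression formats such as MP3, WMA, and AAC are!)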
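The disk-space estimate follows directly from the recording parameters. The lines below are a minimal sketch of that arithmetic, assuming uncompressed PCM data and ignoring the small wav header; the variable names are chosen for illustration only.

```matlab
fs = 44100;             % sampling rate (samples per second per channel)
nChannels = 2;          % stereo
bytesPerSample = 16/8;  % 16 bits/sample
duration = 4*60;        % four minutes, in seconds

bytesPerSecond = fs*nChannels*bytesPerSample;   % 176400 bytes per second
totalBytes = bytesPerSecond*duration;           % 42336000 bytes
fprintf('About %.1f MB (%d bytes)\n', totalBytes/2^20, totalBytes);
```

Under these assumptions the song takes roughly 40 MB, which is about ten times the size of a typical compressed file of the same song.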
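Answer the following questions based on the code below.

What does 'sync' mean?
What difference does it make if the first 'sync' is changed to 'async'?
Ans:

'sync' means that the audio data is sent to the speaker for playback sequentially.
We would hear the two sounds played at the same time.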

After reading the audio data with [y, fs]=wavread('welcome.wav'), which of the following statements produces a "reverse playback" effect?

wavplay(y, 0.5*fs, 'async');
wavplay(-y, fs, 'sync');
wavplay(-y, -fs, 'sync');
wavplay(flipud(y), fs, 'sync');

Given an audio signal y and its sampling rate fs, which of the following playback commands produces the loudest and highest-pitched sound?

sound(y, fs)
sound(y, fs*2)
sound(y*2, fs)
sound(y*2, fs*2)

After MATLAB reads a wav file with wavread, in which data type is the audio data stored, and over which range?

int, [-10,10]
double, [-1,1]
int, [-32768,32767]
double, [-10,10]

Which two data types are most commonly used to store audio data inside a wav file, and what are their value ranges?
Ans:

uint8 (unsigned 8-bit integers), [0, 255]
int16 (signed 16-bit integers), [-32768, 32767]

After MATLAB reads a wav file with wavread, how is the raw audio data converted into MATLAB's internal data?
Ans:

uint8 ==> (x-128)/128.
int16 ==> x/32768.
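These two mappings can be checked on a few raw sample values. The sketch below assumes the raw samples are already available as uint8 and int16 arrays; the variable names are made up for illustration. The cast to double comes first because integer arithmetic in MATLAB saturates, so for a uint8 value x-128 would stop at 0 instead of going negative.

```matlab
raw8  = uint8([0 128 255]);         % hypothetical 8-bit samples
raw16 = int16([-32768 0 32767]);    % hypothetical 16-bit samples

y8  = (double(raw8) - 128)/128;     % maps [0, 255] into [-1, 1)
y16 = double(raw16)/32768;          % maps [-32768, 32767] into [-1, 1)

disp(y8);
disp(y16);
```

Note that the largest positive value never reaches exactly 1 in either case: 255 maps to 127/128 and 32767 maps to 32767/32768.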

The numbered musical notation for "Ode to Joy" is as follows: ３３４５│５４３２│１１２３│３．２２ ﹣│ Write the simplest possible MATLAB program to play this melody.

Ans:
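One possible sketch (not necessarily the shortest program, and not the book's own answer) synthesizes each note as a sine wave. It assumes the key of C with 1 = C4 in equal temperament, a sample rate of 16000 Hz, and half a second per beat; all of these choices are assumptions made for illustration.

```matlab
fs = 16000;                          % assumed sample rate
beat = 0.5;                          % assumed seconds per beat
% Frequencies for scale degrees 1..5 (C4 D4 E4 F4 G4), relative to A4 = 440 Hz
freq = 440*2.^(([0 2 4 5 7] - 9)/12);
% Melody: 3345 | 5432 | 1123 | 3. 2 2-
note = [3 3 4 5  5 4 3 2  1 1 2 3  3   2   2];
len  = [1 1 1 1  1 1 1 1  1 1 1 1  1.5 0.5 2];   % note lengths in beats
y = [];
for i = 1:numel(note)
    t = (0:round(len(i)*beat*fs)-1)/fs;          % time axis of this note
    y = [y, sin(2*pi*freq(note(i))*t)];          % append this note
end
% sound(y, fs);                      % uncomment to listen
```

Because each note starts and stops abruptly, faint clicks may be audible at the note boundaries; a short fade at each end of a note would reduce them.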
(*)Wave recording: Write a MATLAB script to record 10 seconds of your own speech, for example "My name is Roger Jang and I am a senior student at the CS department of National Taiwan University". Save your recording as myVoice.wav. The other recording parameters are: sample rate = 16 kHz, bit resolution = 16 bits. Use the script to print answers to the following questions in the MATLAB window.

How much space is taken by the audio data in the MATLAB workspace?
What is the data type of the audio data?
How do you compute the amount of required memory from the recording parameters?
What is the size of myVoice.wav?
How many bytes in myVoice.wav are used for overhead other than the audio data itself?

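Write a MATLAB program recordMyVoice01.m that records yourself saying "I am xxx, I am xx years old, and I am a year-x student in the xx department at xx university", and save the recording to the file myvoice.wav. The recording conditions are: a duration of 10 seconds, a sampling rate of 11025 Hz, and a resolution of 8 bits/sample. Answer the following questions:

How much memory does the audio variable occupy in the MATLAB workspace?
What is the data type of this variable?
How can the memory occupied by this variable be computed from the recording conditions?
What is the size of the file myvoice.wav?
How many bytes does the file use to store information other than the audio data itself?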
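The expected sizes can be estimated from the recording conditions before checking them with whos and dir. The sketch below assumes that the recording is mono, that the workspace variable is stored as double (8 bytes per sample), and that the file holds uncompressed PCM data; the header overhead is left to be measured from the actual file rather than assumed.

```matlab
duration = 10;       % seconds
fs = 11025;          % sampling rate in Hz
nbits = 8;           % bits per sample in the wav file
nChannels = 1;       % mono (assumption)

nSamples = duration*fs*nChannels;        % 110250 samples
workspaceBytes = nSamples*8;             % double: 8 bytes per sample
fileDataBytes = nSamples*nbits/8;        % raw PCM payload in the file
fprintf('Workspace: %d bytes, file data: %d bytes\n', workspaceBytes, fileDataBytes);
% After recording, compare with the measured values, for example:
%   info = whos('y'); info.bytes
%   fileInfo = dir('myvoice.wav'); fileInfo.bytes - fileDataBytes   % overhead
```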

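The file recorded in the previous exercise is mono. Write a MATLAB program one2twoChannel01.m that reads myvoice.wav, converts its audio data to stereo so that playback creates the effect of a sound source moving between the two speakers, and then saves the processed stereo audio to myvoice2.wav. (This exercise requires some imagination; you may refer to the waveforms of the left and right channels of flanger.wav.)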
(**)Create the illusion of a moving sound source: Record your own voice saying "my name is xxx and I am a student at xxx university", and save the mono recording to myVoice.wav. Write a MATLAB script that reads the audio data from myVoice.wav, duplicates the audio data to create a stereo signal, and then modifies the volume of each channel so that playback creates the illusion of a sound source moving between your two speakers. (Hint: You can observe the waveforms of the two channels in flanger.wav.)
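One way to approach this is to scale the two copies of the signal with complementary gains. The sketch below uses a synthetic tone as a stand-in for the recording and a simple linear fade from left to right; both are assumptions for illustration, and a slowly oscillating gain would make the source move back and forth instead.

```matlab
fs = 16000;                           % assumed sample rate
t = (0:2*fs-1)'/fs;                   % 2 seconds, column vector
y = 0.5*sin(2*pi*440*t);              % stand-in for the mono recording

n = numel(y);
gainRight = linspace(0, 1, n)';       % right channel fades in
gainLeft  = 1 - gainRight;            % left channel fades out
stereo = [y.*gainLeft, y.*gainRight]; % n-by-2 matrix: [left, right]
% sound(stereo, fs);                  % uncomment to listen
```

Each column of the n-by-2 matrix is treated as one channel on playback, so the sound appears to drift from the left speaker to the right one.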
(*)Reverse playback: Write a MATLAB script to accomplish the following tasks:

Record your utterance of "we" and play it backwards. Does it sound like "you"? (Please save the result to a wave file and demo its playback to the TA.)
Record your utterance of "you" and play it backwards. Does it sound like "we"? (Please save the result to a wave file and demo its playback to the TA.)
Record your utterance of "上海自來水來自海上" (for Chinese students) or "We are you" (for international students) and play it backwards. What does it sound like? (Please save the result to a wave file and demo its playback to the TA.)
Record your utterance of "一二三四五六七八九十" and play it backwards. Which digits can you recognize? Why?
Can you think of any other utterances that sound meaningful when played backwards?

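Write a MATLAB program checkReverse01.m that first records yourself saying "we", reverses the waveform in time, and plays it back. Does the playback sound like "you"? Repeat these steps, this time recording yourself saying "you". After the waveform is reversed and played back, does it sound like "we"? Why?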
(*)Audio signal manipulation: Write a MATLAB script to record your utterance of "today is my birthday". Try to explain the playback effect you observe after you try the following operations on the audio signals.

Multiply the audio signals by -1.
Reverse the audio signals in time axis.
Multiply the audio signals by 10.
Replace each sample by its square root.
Replace each sample by its square.
Clip the waveform such that sample data out of the range [-0.5, 0.5] are set to zero.
Modify the waveform such that samples in the range [-0.5, 0.5] are set to zero; samples out of the range [-0.5, 0.5] are moved toward zero by the amount 0.5.

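(*)Basic recording: Write a short MATLAB program that records three seconds of audio of you saying "台灣大學資訊系", with a sampling rate of 16 kHz and a resolution of 8 bits, and save the audio to the file myVoice.wav.

What is the file size?
What is the effect of playing the audio flipped left to right (reversed in time)?
What is the effect of playing the audio flipped upside down (negated)?
What is the playback effect of multiplying the audio by 10?
What is the playback effect of taking the square root of the magnitude of the audio?
What is the playback effect of squaring the magnitude of the audio?
What is the playback effect of setting the portions of the audio outside [-0.5, 0.5] to zero?
What is the playback effect of removing the portions of the audio within [-0.5, 0.5]?
(Hint: useful commands include wavrecord, wavwrite, flipud, sign, and sound.)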
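The operations listed above can be tried on a handful of sample values before applying them to a real recording. The sketch below uses made-up samples and assumes that "square root" and "square" act on the magnitude while keeping the sign (calling sqrt directly on negative samples would return complex numbers).

```matlab
y = [-0.9; -0.25; 0; 0.25; 0.64];        % made-up sample values in [-1, 1]

yNeg    = -y;                            % flipped upside down
yRev    = flipud(y);                     % reversed in time
yLoud   = 10*y;                          % may exceed [-1, 1] and clip on playback
ySqrt   = sign(y).*sqrt(abs(y));         % square root of the magnitude
ySquare = sign(y).*abs(y).^2;            % square of the magnitude
yZeroed = y.*(abs(y) <= 0.5);            % samples outside [-0.5, 0.5] set to zero
yCenter = sign(y).*max(abs(y) - 0.5, 0); % inside set to zero, outside moved toward zero by 0.5

disp([y ySqrt ySquare yZeroed yCenter]);
```

Comparing the columns shows that the square root boosts small samples relative to large ones, while squaring does the opposite.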
(*)Audio signal grafting: Write a MATLAB script to accomplish the following tasks. (You need to find the boundaries by trial and error, and put the related boundary indices into your MATLAB program to create the required audio segments.)

For Mandarin-speaking students: Record your utterance of "清華大學資訊系" and save it to a file first.

If you connect the consonant part of "大" to the vowel part of "系", can you get the sound of "地"? (Please save the result to a wave file and demo its playback to the TA.)
If you connect "系" to the vowel part of "大", can you get the sound of "下"? (Please save the result to a wave file and demo its playback to the TA.)

For international students: Record your utterance of "keep it simple" and save it to a file.

Can you get the sound of "pimple" by connecting some portions of "Keep" and "simple"? (Please save the result to a wave file and demo its playback to the TA.)
Can you get the sound of "simplest" by connecting some portions of your recording? (Please save the result to a wave file and demo its playback to the TA.)

(**)Experiments on the sample rate: Write a MATLAB script to record your utterance of "my name is ***" with a sample rate of 32 kHz and 8-bit resolution. Try to resample the audio signals at decreasing sample rates of 16 kHz, 8 kHz, 4 kHz, 2 kHz, 1 kHz, and so on. At which sample rate do you start to have difficulty understanding the contents of the utterance?
(**) Experiments on adding noise: Write a MATLAB script to record your utterance of "my name is ***" with a sample rate of 8 kHz and 8-bit resolution. We can add noise to the audio signals by using the following program snippet: Increase the value of k by 0.1 each time and answer the following questions.

At what value of k do you start to have difficulty understanding the content of the playback?
Plot the waveforms at different values of k. At what value of k do you start to have difficulty identifying the fundamental period of the waveform?
(**)Voice signal encryption: Write a function that can take a wave file, encrypt it, and save it as another wave file. The I/O format is
myEncrypt(inputFileName, outputFileName);
where "inputFileName" is a string specifying the input wave file, and "outputFileName" is a string specifying the output wave file. The encryption process is like this (assuming y is the original signal and z is the encrypted signal):

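z=y;                    % start from a copy of the original signal
z(y>0)=1-y(y>0);        % positive samples: z(i)=1-y(i)
z(y<0)=-1-y(y<0);       % negative samples: z(i)=-1-y(i)
z=flipud(z);            % reverse in time (y is a column vector)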
Note that:

The encrypted file can be converted to the original file using the same function.
Be aware that this is a naive encryption; better methods exist.
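The claim that the same function also decrypts can be checked on a small column vector. The sketch below covers only the core transform, written as an anonymous function that stands in for the file-based myEncrypt; the sample values are made up and chosen to be exactly representable in floating point.

```matlab
% Core transform only; reading and writing the wave files is left out.
encrypt = @(y) flipud((y > 0).*(1 - y) + (y < 0).*(-1 - y));

y = [0.25; -0.5; 0; 0.75];      % made-up samples (column vector)
z = encrypt(y);                 % encrypted signal
yBack = encrypt(z);             % applying the same transform again
disp([y z yBack]);
```

Under these rules a sample equal to exactly 1 or -1 is mapped to 0 and stays 0 on the second pass, so such samples would not be recovered.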
(*)Create Sine waves:

Create a sine wave of 4-second duration and 440-Hz frequency, with a sample rate of 16000.
Create a sine wave of 4-second duration, 16000-Hz sample rate, with a frequency linearly varying from 0 to 800 Hz.
(**)Generate sine wave with time-varying frequencies: Write a function to generate a sine wave with time-varying frequencies. The I/O format is
outputSignal=mySine(duration, freq);
where freq is a two-element vector [f1, f2], indicating the frequency of the sine wave should change linearly from f1 (for the first sample point) to f2 (for the last sample point). Note that

The sample rate is 16 kHz.
The first sample is zero, starting from time 0. (In other words, the time vector is (0:duration*fs-1)/fs, and the function to invoke is "sin".)
Note that the instantaneous frequency of $y=\sin(2\pi\phi(t))$ is $\phi'(t)$. Therefore you can use the following conditions to find $\phi(t)$: $$ \left\{ \begin{matrix} \phi'(t)=at+b,\\ \phi'(0)=f_1,\\ \phi'(\text{duration}-1/\text{fs})=f_2,\\ \sin(2\pi\phi(0))=0.\\ \end{matrix} \right. $$
Hint: Here is a snippet for generating a 3-second sine wave of 440 Hz:
duration=3; f=440; fs=16000;
time=(0:duration*fs-1)/fs;
y=sin(2*pi*f*time);
plot(time, y); sound(y, fs);
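For the time-varying case, integrating $\phi'(t)=at+b$ gives $\phi(t)=at^2/2+bt+c$. The conditions above then give $b=f_1$ and $a=(f_2-f_1)/(\text{duration}-1/\text{fs})$, and $c=0$ is one choice that satisfies the last condition. The sketch below applies this as a plain script rather than the requested function, with example values (4 seconds, 0 to 800 Hz) assumed for illustration.

```matlab
duration = 4; f1 = 0; f2 = 800;      % example values
fs = 16000;                          % sample rate given in the exercise
time = (0:duration*fs-1)/fs;         % first sample at time 0
a = (f2 - f1)/(duration - 1/fs);     % slope of the instantaneous frequency
phi = a*time.^2/2 + f1*time;         % phase with phi(0) = 0
y = sin(2*pi*phi);
instFreqEnd = a*time(end) + f1;      % equals f2 up to rounding error
% plot(time, y); sound(y, fs);       % uncomment to inspect and listen
```

A common mistake is to write sin(2*pi*f.*time) with f varying linearly from f1 to f2; the instantaneous frequency of that signal ends near 2*f2-f1 rather than f2.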
(*)Create sirens: Use a combination of sine waves to generate the sound of a siren (such as the one you usually hear from an ambulance). Here is an example.
(**)Create beats:

Create a sine wave y1 of 3-second duration and 440-Hz frequency, with a sample rate of 16000.
Create a sine wave y2 of 3-second duration and 444-Hz frequency, with a sample rate of 16000.
Try "sound(y1+y2, fs)". What do you hear? Please use a mathematical derivation to explain this phenomenon of "acoustic beats". What is the frequency of the beats in this case?
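A sketch of the derivation, using the sum-to-product identity with $f_1=440$ and $f_2=444$:

$$ \sin(2\pi f_1 t)+\sin(2\pi f_2 t)=2\cos\left(2\pi\frac{f_2-f_1}{2}t\right)\sin\left(2\pi\frac{f_1+f_2}{2}t\right)=2\cos(2\pi\cdot 2t)\,\sin(2\pi\cdot 442t) $$

The sum is therefore a 442-Hz tone whose amplitude follows the slowly varying envelope $|2\cos(2\pi\cdot 2t)|$. Because the loudness peaks twice per period of the 2-Hz cosine, the beat frequency is $|f_2-f_1|=4$ Hz.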

(**)Create time-varying beats:

Create y1 as in the previous exercise.
Create y2 with a time-varying frequency that changes linearly with time from 440 Hz to 500 Hz.
Try to play y1+y2 and explain what you hear.

(**)Resample audio signals: Write a MATLAB script to resample the audio signals in "sunday.wav" so that the new waveform has a sample rate of 11025 Hz. Plot the two waveforms in subplot(2, 1, 1), and plot their absolute difference in subplot(2, 1, 2).
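(*)Reading integer audio data: Write a function wavRead2.m whose usage is the same as that of MATLAB's built-in function wavread.m:
[y, fs, nbits] = wavread2('file.wav');
The only difference is that the returned audio variable y holds integer values: if nbits is 8, y must lie between -128 and 127; if nbits is 16, y lies between -32768 and 32767. (Hint: you first need to understand how wavread() is used.)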
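(**)How to create frames: Write a function buffer2.m with the following usage:
framedY = buffer2(y, frameSize, overlap);
Here y is the audio signal, frameSize is the number of samples per frame, overlap is the number of samples shared by adjacent frames, and framedY is a matrix whose number of rows equals the frame size and whose number of columns equals the number of frames. (If the last few samples are not enough to fill a frame, discard them.) A usage example follows:

Also, how does this function buffer2.m differ from the buffer function in the Signal Processing Toolbox?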
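As an illustration only (this is not the book's usage example), the framing rule described above can be sketched as a short script on a made-up signal. The variable names step, frameCount, and startIndex are introduced here for illustration.

```matlab
y = (1:10)';                 % made-up signal (column vector)
frameSize = 4;               % samples per frame
overlap = 2;                 % samples shared by adjacent frames

step = frameSize - overlap;                        % hop between frame starts
frameCount = floor((numel(y) - overlap)/step);     % complete frames only
framedY = zeros(frameSize, frameCount);
for i = 1:frameCount
    startIndex = (i-1)*step + 1;
    framedY(:, i) = y(startIndex:startIndex+frameSize-1);
end
disp(framedY);
```

With these values the frames start at samples 1, 3, 5, and 7, and every sample is used; with an 11-sample signal the last sample would be discarded.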
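(*)Effect of the sampling rate: Write a short MATLAB program that records two seconds of audio of you saying "I am so-and-so", with a sampling rate of 32 kHz and a resolution of 8 bits, and save the audio to the file myVoice2.wav. Then resample the signal so that the sampling rate becomes 16 kHz, 8 kHz, 4 kHz, 2 kHz, and so on. How low does the sampling rate have to drop before you can no longer recognize the original sound?
(*)Effect of noise: Write a short MATLAB program that records two seconds of audio of you saying "I am so-and-so", with a sampling rate of 8 kHz and a resolution of 8 bits. Assuming the audio signal is stored in a column vector y, we can add noise as follows:

As the value of k increases gradually from 0.1 to 0.2, 0.3, and so on, your voice becomes more and more blurred.

At what value of k can you no longer make out what was originally said?
At what value of k can you no longer see that y2 contains a speech signal? (In other words, after zooming in on the plot of y2, you can no longer see a fundamental period.)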

(**)Resampling: Use the interp1 command to resample the audio in myVoice2.wav so that the sampling rate becomes 11025 Hz, and save the result as myVoice3.wav.
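The resampling step can be sketched with interp1 on a synthetic signal. The tone below stands in for the recorded audio, and linear interpolation is only one of the methods interp1 offers; both are assumptions for illustration. No low-pass filter is applied first, so any content above half the new sampling rate would alias.

```matlab
fs = 32000;                          % original sample rate
t = (0:2*fs-1)'/fs;                  % 2 seconds of sample times
y = sin(2*pi*440*t);                 % stand-in for the recorded audio

fs2 = 11025;                         % target sample rate
n2 = floor(t(end)*fs2) + 1;          % new samples that fit in the original span
t2 = (0:n2-1)'/fs2;                  % new sample times
y2 = interp1(t, y, t2, 'linear');    % linear interpolation at the new times
fprintf('%d samples at %d Hz -> %d samples at %d Hz\n', numel(y), fs, numel(y2), fs2);
```

Keeping the new sample times inside the span of the original ones avoids the NaN values that interp1 returns for points outside the data range.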
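(*)Time-reversed playback: For the recording and playback below, choose your own recording parameters; a recording time of 3 seconds is enough.

Record yourself saying "上海自來水來自海上". Play it forward once, then flip the time axis and play it backward once, and listen for any difference. Can you predict the original forward sound from the backward playback?
Record yourself saying "we are you", keeping your intonation as flat as possible. Play it forward once, then flip the time axis and play it backward once, and listen for any difference. Why do the backward and forward playbacks sound similar? Is there a similar example in Chinese?
(Reminder: if the recorded audio vector is y, you can use flipud(y) to flip it up-down or fliplr(y) to flip it left-right.)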
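Which of the two functions actually reverses the audio depends on the orientation of the vector. A small sketch with made-up values:

```matlab
y = [0.1; 0.2; 0.3];        % made-up column vector of samples
yUD = flipud(y);            % reverses the sample order of a column vector
yLR = fliplr(y);            % only one column, so nothing changes
disp(isequal(yLR, y));      % displays 1
yRow = y';                  % for a row vector the roles are swapped
disp(fliplr(yRow));         % the reversed row vector
```

For a column vector, which is the usual orientation of mono audio data, flipud is therefore the call that reverses the time axis.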
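(*) Audio splicing: Use MATLAB to complete the following tasks:

First make a recording with a sampling rate of 16 kHz, a resolution of 16 bits, and a duration of three seconds, saying "清華大學資訊系". What is the file size? How can this size be computed directly from the recording settings?
Inspect the waveform and set all unvoiced sounds to silence (sample values of zero). Save the file and play it back. Can you still make out the original content?
Inspect the waveform and set all vowels to silence (sample values of zero). Save the file and play it back. Can you still make out the original content?
If you join the first half of "大" (ㄉ) to the second half of "系" (ㄧ), do you get the sound of "低"? Save the spliced result as a wav file and play it back.
If you join the first half of "系" (ㄒ) to the second half of "清" (ㄧㄥ), do you get the sound of "星"? Save the spliced result as a wav file and play it back.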

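(*) Audio splicing 2: Use CoolEdit instead to complete the previous exercise.
(***)Mono to stereo: This exercise processes a mono audio file into a stereo audio file whose sound moves between left and right.

Plot the waveforms of the left and right channels of the file flanger.wav. Because the volumes of the two channels change gradually in a complementary way, playback creates the effect of a sound source moving between the left and right channels.
Use load handel.mat to load an audio signal y and its sampling rate Fs. Following the approach of flanger.wav, generate a stereo audio file handel.wav whose sound-movement characteristics are similar to those of flanger.wav.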
MATLAB程式設計：入門篇