- (***) Extension to lcs.m: Write an m-file function myLcs.m which can return multiple LCS strings of the same length. The usage is
[lcsLength, lcsPath, lcsStr, lcsTable] = lcs(str1, str2, showOpt); where "lcsPath" and "lcsStr" are cell arrays for returning multiple LCS results. The program should be able to
- Return the right results.
- Draw the right backtracking paths (including local and global ones) when showPlot is 1. (You need to modify the plotting function for lcs.m.)
Hint: lcs.m is available in the Machine Learning Toolbox.- (***) MATLAB function for type-2 DTW: Write an m-file function dtw2m.m that is equivalent to its mex version (dtw2mex.mex*) in the Machine Learning toolbox. The usage is
[minDistance, dtwPath, dtwTable] = dtw2m(vec1, vec2); You can use the following example as a script to test your program:
For a more extensive test with timing measurement, try the following example:
For simplicity, you should only consider the following situations:The behavior of dtw2mex.mex* should be exactly the same as those described in the text. If you find any discrepancies, feel free to let me know.
- You only need to consider the case of "anchored beginning and anchored end".
- You need to add the global constraints where the feasible region is a parallelogram.
- You can assume the inputs are row vectors.
- (***) Implementation of a speaker-dependent voice-command system using DTW: In this exercise, you are going to use MATLAB to implement a speaker-dependent voice-command system. Basically, the user can record a 3-second utterance and the system will identify the most likely command via DTW. To create such a system, we have two stage for registration and recognition, respectively. During the registration stage, we need to prepare a database where the user can register their utterances for recognition. This involves the following steps:
During the recognition stage, the user can hit any key to start 3-second recording, and the system will demonstrate the top-10 results on the screen according to the DTW distance. The recognition stage involved the following steps:
- Prepare a text file "command.txt" manually. The file should contain at least 10 commands. An English example follows:
one two three five ... You can also use the following Chinese example:牛肉麵 珍珠奶茶 貢丸湯 韭菜盒子 肉粽 ... (In fact, you can use any language you like to prepare the text file.)- Write a script for the users to record these commands twice. To increase the variability, you should record each command once in a run, for two runs. (You can use "textread" command to read the text file.)
- After each recording, you should perform endpoint detection and feature extraction, and store all the information in a structure array recording of size 2*n, where n is the number of voice commands in "command.txt". The structure array recording contains the following fields:
- recording(i).text: Text for the i-th recording
- recording(i).mfcc: MFCC for the i-th recording
- Save the variable recording to recording.mat.
Some hints follow:
- Load recording.mat.
- Obtain the user's 3-sec utterance.
- Perform endpoint detection and feature extraction.
- Compare the MFCC with that of each recording in the array recording using DTW.
- Rank the distance.
- List the top-10 result (by showing the texts) according to the distance.
- Go back to step ii unless ctrl-c is hit.
When you demo your system to TA, make sure you can have at least 3 utterances to have the correct answer in top-1 results.
- All recordings are of the format: 16KHz, 16Bits, mono.
- The system should allow the user to do repeated recordings until ctrl-C is hit.
- You can use epdByVolHod.m for endpoint detection. After each recording, the result of endpoint detection should be displayed on the screen immediately to facilitate debugging. (It is in the SAP Toolbox.)
- You can use wave2mfcc.m for feature extraction. (It is in the SAP Toolbox.)
- You can use either dtw.m (in the Machine Learning toolbox) for DTW.
- (***) Programming contest: Use DTW for speaker-dependent speech recognition: Please follow this link to have more descriptions.
- (***) Programming contest: Dice sum game: In this exercise, you need to design a strategy for playing the dice sum game. The game is very simple. You need to toss a dice 8 times and place the dice values into 4 2-digit numbers. If the sum of these 4 numbers are less than or equal to 150, your score is the sum. Otherwise, your score is zero. Note that you need to place the dice value right after each toss. The sample code is available here. Please run goGameEval.m to see how the TA perform the evaluation of your strategy. You need to submit the following files:
TA will run your program and post the results regularly to the leaderboard.
- myStrategy.m (If you need to load some parameters, please also submit the mat files.)
- method.txt: A one-page summary stating how you tackle the problem.
- (***) Plateau passing via DP: A plateau (高原) is represented by a m-by-n matrix $P$. A path through the plateau can be represented by a function $f(i), i=1,2,\dots,n$, which indicates the row index at column $i$. More precisely, the path can be expressed by a set of $n$ index-pairs into matrix $P$: $$ \{\left(f(i), i\right)|i=1,2,\dots,n, 1 \leq f(i)\leq m\}. $$ Given the above path, the objective function to be maximized can be defined as: $$ obj(\alpha)=\sum_{i=1}^n P(f(i), i) - \sum_{i=1}^{n-1}\alpha |f(i)-f(i+1)| $$ where $\alpha$ is a positive penalty coefficient. In other words, the objective function involves two terms:
The above two criteria are mixed by the penalty coefficient $\alpha$:
- The first term indicates we need to have a path as high as possible.
- The second term indicates we need to have a path as straight as possible.
You mission is to write a function that can maximize the objective function by finding the maximizing path for given $P$ and $\alpha$, based on the principal of DP. The I/O format of the function is shown next:
- If $\alpha=0$, then the path should be located at the max of each column of $P$.
- If $\alpha=\infty$, then the path should be a horizontal straight line obtained from the maximum of row sum of $P$.
[optimValue, optimPath]=myPlateauPass(P, alpha); whereFile to download in order to run the following examples:
- optimValue: The max. value of the objective function
- optimPath: A row vector indicating the maximizing path of $f(i), i=1,2,\dots,n$.
Here are some test examples.
- myPlateauPass.p: P-file of the solution.
- pfMat.mat: Data file to be used in the last example.
- This example shows the case when $\alpha$ is 0, with a small-size $P$.
- This example shows the case when $\alpha$ is $\infty$, with a small-size $P$.
- This example shows the case when $\alpha$ is 0, with a median-size $P$.
- This example shows the case when $\alpha$ is $\infty$, with a median-size $P$.
- This example shows a real case when $P$ is $320 \times 243$. It takes about 0.2 seconds on my notebook. Non-DP approaches will take much longer.
Data Clustering and Pattern Recognition (資料分群與樣式辨認)