Chapter 8: Exercises

  1. (***) Extension to lcs.m: Write an m-file function myLcs.m which can return multiple LCS strings of the same length. The usage is
    [lcsLength, lcsPath, lcsStr, lcsTable] = lcs(str1, str2, showOpt);
    where "lcsPath" and "lcsStr" are cell arrays for returning multiple LCS results. The program should be able to
    1. Return the right results.
    2. Draw the right backtracking paths (including local and global ones) when showPlot is 1. (You need to modify the plotting function for lcs.m.)

    Hint: lcs.m is available in the Machine Learning Toolbox.

  2. (***) MATLAB function for type-2 DTW: Write an m-file function dtw2m.m that is equivalent to its mex version (dtw2mex.mex*) in the Machine Learning toolbox. The usage is
    [minDistance, dtwPath, dtwTable] = dtw2m(vec1, vec2);
    You can use the following example as a script to test your program:

    Example 1: dtw2test01.mcurrDir=pwd; cd([mltRoot, '/private']); vec1=[0 1 2 3 3 2 1 0 0 1 2 1 0]; vec2=[0 0 1 2 3 2 1 0 0 0 1 3 2 0]; [minDist1, dtwPath1, dtwTable1] = dtw2m(vec1, vec2); [minDist2, dtwPath2, dtwTable2] = dtw2mex(vec1, vec2); fprintf('minDist1=%g, minDist2=%g, error=%g\n', minDist1, minDist2, abs(minDist1-minDist2)); dtwPathPlot(vec1, vec2, {dtwPath1, dtwPath2}); cd(currDir);minDist1=3, minDist2=3, error=0

    For a more extensive test with timing measurement, try the following example:

    Example 2: dtw2test02.m% Compare the speed/distance difference between various versions of type-1 DTW currDir=pwd; cd([mltRoot, '/private']); % ====== Output test beginCorner=1; endCorner=1; testNum=100; fprintf('%d runs of output tests:\n', testNum); for i=1:testNum vec1=randn(1, 25); vec2=randn(1, 30); [minDist1, dtwPath1, dtwTable1] = dtw2m(vec1, vec2, beginCorner, endCorner); [minDist2, dtwPath2, dtwTable2] = dtw2mex(vec1, vec2, beginCorner, endCorner); if ~isequal(minDist1, minDist2) | ~isequal(dtwPath1, dtwPath2) | ~isequal(dtwTable1, dtwTable2) figure('name', 'dtw2m()'); dtwPathPlot(vec1, vec2, dtwPath1); figure('name', 'dtw1mex()'); dtwPathPlot(vec1, vec2, dtwPath2); fprintf('Press any key to continue...\n'); pause end end fprintf('\tDone!\n'); % ====== Speed test testNum=200; mat1=randn(testNum, 25); mat2=randn(testNum, 30); fprintf('%d runs of timing tests:\n', testNum); tic for i=1:testNum, dtw2m(mat1(i,:), mat2(i,:)); end time1=toc; tic for i=1:testNum, dtw2mex(mat1(i,:), mat2(i,:)); end time2=toc; fprintf('\ttime1=%g, time2=%g, ratio=time1/time2=%g\n', time1, time2, time1/time2); cd(currDir);100 runs of output tests: Done! 200 runs of timing tests: time1=0.0138575, time2=0.00274573, ratio=time1/time2=5.04693

    For simplicity, you should only consider the following situations:
    • You only need to consider the case of "anchored beginning and anchored end".
    • You need to add the global constraints where the feasible region is a parallelogram.
    • You can assume the inputs are row vectors.
    The behavior of dtw2mex.mex* should be exactly the same as those described in the text. If you find any discrepancies, feel free to let me know.
  3. (***) Implementation of a speaker-dependent voice-command system using DTW: In this exercise, you are going to use MATLAB to implement a speaker-dependent voice-command system. Basically, the user can record a 3-second utterance and the system will identify the most likely command via DTW. To create such a system, we have two stage for registration and recognition, respectively. During the registration stage, we need to prepare a database where the user can register their utterances for recognition. This involves the following steps:
    1. Prepare a text file "command.txt" manually. The file should contain at least 10 commands. An English example follows: one two three five ... You can also use the following Chinese example: 牛肉麵 珍珠奶茶 貢丸湯 韭菜盒子 肉粽 ... (In fact, you can use any language you like to prepare the text file.)
    2. Write a script for the users to record these commands twice. To increase the variability, you should record each command once in a run, for two runs. (You can use "textread" command to read the text file.)
    3. After each recording, you should perform endpoint detection and feature extraction, and store all the information in a structure array recording of size 2*n, where n is the number of voice commands in "command.txt". The structure array recording contains the following fields:
      • recording(i).text: Text for the i-th recording
      • recording(i).mfcc: MFCC for the i-th recording
    4. Save the variable recording to recording.mat.
    During the recognition stage, the user can hit any key to start 3-second recording, and the system will demonstrate the top-10 results on the screen according to the DTW distance. The recognition stage involved the following steps:
    1. Load recording.mat.
    2. Obtain the user's 3-sec utterance.
    3. Perform endpoint detection and feature extraction.
    4. Compare the MFCC with that of each recording in the array recording using DTW.
    5. Rank the distance.
    6. List the top-10 result (by showing the texts) according to the distance.
    7. Go back to step ii unless ctrl-c is hit.
    Some hints follow:
    • All recordings are of the format: 16KHz, 16Bits, mono.
    • The system should allow the user to do repeated recordings until ctrl-C is hit.
    • You can use epdByVolHod.m for endpoint detection. After each recording, the result of endpoint detection should be displayed on the screen immediately to facilitate debugging. (It is in the SAP Toolbox.)
    • You can use wave2mfcc.m for feature extraction. (It is in the SAP Toolbox.)
    • You can use either dtw.m (in the Machine Learning toolbox) for DTW.
    When you demo your system to TA, make sure you can have at least 3 utterances to have the correct answer in top-1 results.

  4. (***) Programming contest: Use DTW for speaker-dependent speech recognition: Please follow this link to have more descriptions.

  5. (***) Programming contest: Dice sum game: In this exercise, you need to design a strategy for playing the dice sum game. The game is very simple. You need to toss a dice 8 times and place the dice values into 4 2-digit numbers. If the sum of these 4 numbers are less than or equal to 150, your score is the sum. Otherwise, your score is zero. Note that you need to place the dice value right after each toss. The sample code is available here. Please run goGameEval.m to see how the TA perform the evaluation of your strategy. You need to submit the following files:
    • myStrategy.m (If you need to load some parameters, please also submit the mat files.)
    • method.txt: A one-page summary stating how you tackle the problem.
    TA will run your program and post the results regularly to the leaderboard.

    It is possible to use DP to solve this problem, but it is rather complicated. You can come up with your own strategy to optimize the overall performance.

Data Clustering and Pattern Recognition (資料分群與樣式辨認)