Programming Contest: Endpoint Detection

Roger Jang (張智星)

The goal of this programming exercise is to familiarize you with the methods and parameters of endpoint detection (EPD). You are required to tune the EPD parameters, or even create your own EPD method, in order to increase EPD accuracy.

What to download
- Utility and SAP Toolboxes from Roger's toolbox homepage.
- Baseline program: exampleProgram.rar
- Dataset: A small dataset with 4 speakers comes with the above example program. A bigger one with 22 speakers (from the 2012 class) is available here. The TA will provide a link to download 50% of the recordings made by students in this year's class.
How to run the example program
- Modify the main program as follows:
  - Add necessary toolboxes to the search path. (See "addpath" in goTest.m.)
  - Assign the corpus path to the variable "auDir". (See "auDir" in goTest.m. You can leave it unchanged for the first run, in which case a small default dataset is used instead.)
- Run "goTest" under MATLAB to show the overall recognition rate as well as the recognition rate for each person.
- After you have obtained the baseline recognition rate, you can perform error analysis.
  - Run "epdFileCheck(auSet)" to inspect the files on which EPD performed poorly.
  - Run "epdFileCheck(auSet, '921588_Leon')" to inspect the poorly performing files of a specific speaker, '921588_Leon'.
How to get better accuracy
- For myEpd.m, the most important parameter is epdOpt.volumeRatio, which is set to 0.1 by default. You can check out goPrmTune.m, which uses a simple exhaustive search to find the optimum value of the volume threshold. (Be aware that it might take a long time to run. You may want to reduce the number of wave files before invoking the program.)
- The current main program for EPD is myEpd.m, which is a simplified version of endPointDetect.m in the SAP Toolbox. You should also try the full "endPointDetect.m" from the SAP Toolbox (by calling it from myEpd.m). Try all 3 of its methods to find the best one, and then search for the best parameters for that method.
- Besides exhaustive search, you may want to use some simple heuristic search (for instance, fminsearch) for parameter tuning.
- Examine the wave files with the largest errors to see if you can find other EPD parameters that might give better performance.
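The exhaustive search mentioned above can be illustrated with a minimal, self-contained sketch. It is not the code of myEpd.m or goPrmTune.m. The synthetic signal, the frame size, the threshold formula (a fraction of the volume range above the minimum), the search grid, and the error measure (absolute endpoint deviation in samples) are all assumptions made for illustration. The toolbox may define the volume ratio and the accuracy differently.

```matlab
% Sketch only: volume-threshold EPD on a synthetic signal (not the toolbox's myEpd.m).
fs = 16000;                         % sampling rate in Hz (assumed)
frameSize = 256;                    % non-overlapping frame length in samples (assumed)
t = (0:fs-1)'/fs;                   % one second of audio
wave = 0.01*sin(2*pi*50*t);         % weak hum standing in for background noise
trueStart = 4801; trueEnd = 11200;  % ground-truth endpoints in samples (0.3 s to 0.7 s)
idx = trueStart:trueEnd;
wave(idx) = wave(idx) + 0.5*sin(2*pi*440*t(idx));   % the "speech" segment

% Frame-wise volume: sum of absolute sample values in each frame.
frameNum = floor(length(wave)/frameSize);
frames = reshape(wave(1:frameNum*frameSize), frameSize, frameNum);
volume = sum(abs(frames), 1);

% Exhaustive search over the volume ratio (hypothetical grid).
ratios = 0.05:0.05:0.95;
errs = zeros(size(ratios));
for i = 1:length(ratios)
    % Assumed threshold definition: a fraction of the volume range above the minimum.
    volTh = min(volume) + ratios(i)*(max(volume) - min(volume));
    voiced = find(volume > volTh);               % frames above the threshold
    estStart = (voiced(1) - 1)*frameSize + 1;    % first sample of the first such frame
    estEnd = voiced(end)*frameSize;              % last sample of the last such frame
    errs(i) = abs(estStart - trueStart) + abs(estEnd - trueEnd);
end
[bestErr, bestIdx] = min(errs);                  % min returns the first minimum
bestRatio = ratios(bestIdx);
fprintf('best ratio = %.2f, endpoint error = %d samples\n', bestRatio, bestErr);
```

With an error measure of this kind, the error changes only when the threshold crosses a frame's volume, so it is piecewise constant in the ratio. A simplex search such as fminsearch can therefore stall on a flat region, and a coarse grid search is a reasonable way to pick its starting point.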
You should only upload the following files for performance evaluation:
- myEpd.m: Your main function for EPD
- myEpdOptSet.m: The options (parameters) for myEpd.m.
- myMethod.txt: A brief description of your method, including the recognition rates you obtained and the lessons you learned.
- Any other files that myEpd.m depends on.
The TA will use your program to evaluate the performance of your EPD and post the results on the web.
Be aware that
- The given dataset can be used as the training set, which consists of about 50% of the recordings of all students. The final ranking will be based on the hidden test dataset recorded this year. (So do not overfit the training set if you are using machine-learning-based approaches for this programming contest.)
- If performance tuning is time-consuming, you can use a subset of the dataset to obtain a rough result quickly.
- Some reference results:
  - My results using endPointDetect.m with method='vol', 'volZcr', and 'volHod'
  - Students' results of 2007 (Open this file in IE.)
- The H1 help of the example code
  exampleProgram
  File convention:
  - epd*.m: Functions that you are not allowed to modify or upload.
  - go*.m: Main programs that you can execute directly.
  - my*.*: Files you need to modify and upload.
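One way to act on the two cautions under "Be aware that" (avoid overfitting, and use a partial dataset for quick tuning) is to tune on some speakers and check the result on the others. The sketch below is only an illustration. The speaker IDs, file names, and the every-third-speaker rule are hypothetical, and it does not use the layout of the actual corpus or any toolbox function.

```matlab
% Sketch only: split files by speaker so that tuning and checking use different people.
speakers = {'spk01', 'spk02', 'spk03', 'spk04', 'spk05', 'spk06'};   % hypothetical IDs
% Hypothetical file list: fileSpeaker{k} is the speaker of fileName{k}.
fileSpeaker = {'spk01', 'spk01', 'spk02', 'spk03', 'spk03', 'spk04', 'spk05', 'spk06', 'spk06'};
fileName = {'a1.wav', 'a2.wav', 'b1.wav', 'c1.wav', 'c2.wav', 'd1.wav', 'e1.wav', 'f1.wav', 'f2.wav'};

% Hold out every third speaker for checking; tune parameters on the rest.
checkSpeakers = speakers(1:3:end);
isCheck = ismember(fileSpeaker, checkSpeakers);
tuneFiles = fileName(~isCheck);      % use these for parameter search
checkFiles = fileName(isCheck);      % evaluate the tuned parameters on these only

fprintf('%d files for tuning, %d files for checking\n', length(tuneFiles), length(checkFiles));
```

Splitting by speaker rather than by file keeps all recordings of one person on the same side, which is closer to the situation of a hidden test set recorded by other people.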