%% Tutorial on Anomaly Sound Detection (by )
% This tutorial covers the basics of using HMM (Hidden Markov Models)
% for anomaly sound detection (ASD) for a pounding machine.
% The anomalous sounds were recorded while the machine was pounding with no object under it.
% The following videos show the conditions under which the audio data was collected.
%
% * <../dataSet_extra/正常.mp4 Normal operation>
% * <../dataSet_extra/異常空打.mp4 Abnormal operation (pounding with no object)>
%
%% Preprocessing
% Before we start, let's add the necessary toolboxes to the search path of MATLAB:
addpath d:/users/jang/matlab/toolbox/utility
addpath d:/users/jang/matlab/toolbox/sap
addpath d:/users/jang/matlab/toolbox/machineLearning
%%
% All the above toolboxes can be downloaded from the author's .
% Make sure you are using the latest toolboxes to work with this script.
%%
% For compatibility, here we list the platform and MATLAB version that we used to run this script:
fprintf('Platform: %s\n', computer);
fprintf('MATLAB version: %s\n', version);
fprintf('Date & time: %s\n', char(datetime));
scriptStartTime=tic;	% Timing for the whole script
%%
% Most of the modifiable options for ASD are set in asdOptSet.m:
type asdOptSet
%%
% To run this script, change asdOpt.audioDir so that it points to a
% folder of audio files containing normal or abnormal pounding sounds.
% The dataset used in the script can be downloaded from
% .
%% Dataset collection and feature extraction
% First of all, we can collect all the sound files once they are downloaded.
% We can use the command "mmDataCollect" to collect all the file information:
asdOpt=asdOptSet;
opt=mmDataCollect('defaultOpt');
opt.extName='wav';
auSet=mmDataCollect(asdOpt.audioDir, opt, 1);
%%
% For each audio file, we assume its pounding sounds have been labeled as
% cues by using Cooledit. Then we can perform frame-based feature
% extraction and obtain the ground truth for each frame.
% This is achieved by the function "asdFeaExtractFromFile.m", as shown by
% the following example:
auFile='dataSet/abnormal-cue.wav';
au=myAudioRead(auFile);
au.signal=au.signal(:,1);	% Use a single channel for feature extraction
asdOpt=asdOptSet;
asdOpt.feaType='mfcc';
fea=asdFeaExtractFromFile(au, asdOpt, 1);
%%
% Now we can read the audio contents of all audio files. We need to do so
% since we want to use channel 1 as the training set and channel 2 as the
% test set.
for i=1:length(auSet)
	au=myAudioRead(auSet(i).path);
%	auSet(i)=structCopy(auSet(i), au);
	auSet(i).signal=au.signal;
	auSet(i).fs=au.fs;
	auSet(i).nbits=au.nbits;
	auSet(i).file=au.file;
end
%%
% Create the training set from channel 1:
auSetTrain=auSet;
for i=1:length(auSetTrain)
	auSetTrain(i).signal=auSetTrain(i).signal(:,1);
end
%%
% Create the test set from channel 2:
auSetTest=auSet;
for i=1:length(auSetTest)
	auSetTest(i).signal=auSetTest(i).signal(:,2);
end
%%
% Now we can perform feature extraction and attach the features to
% auSetTrain and auSetTest:
myTic=tic;
fprintf('Feature extraction from auSetTrain:\n');
auSetTrain=auSetFeaExtract(auSetTrain, asdOpt, 1);
fprintf('Feature extraction from auSetTest:\n');
auSetTest=auSetFeaExtract(auSetTest, asdOpt, 1);
fprintf('Saving auSetTrain & auSetTest to asdAuSet.mat...\n');
save asdAuSet auSetTrain auSetTest
fprintf('time=%g sec\n', toc(myTic));
%%
% We can also create a variable ds for all kinds of data visualization and static classification:
feature=[auSetTrain.feature];
output=[auSetTrain.tOutput];
ds.input=feature;
ds.output=output;
%ds.inputName=asdOpt.featureName;
ds.outputName=asdOpt.outputName;
ds2=ds; ds2.input=inputNormalize(ds2.input);	% input normalization
fprintf('ds and ds2 created.\n');
%% Data analysis and visualization
% We can display the data count for each class:
figure; [classSize, classLabel]=dsClassSize(ds, 1);
%%
% We can plot the feature distribution among different classes:
figure; dsBoxPlot(ds);
%%
% We can also plot all features in each class with the same scale:
figure; dsFeaVecPlot(ds);
%% Model training and test using HMM
% Using the training set auSetTrain, we can start HMM training for ASD:
fprintf('Start HMM training...\n');
figure; myTic=tic; asdHmmModel=hmmTrain4audio(auSetTrain, asdOpt, 1); fprintf('time=%g sec\n', toc(myTic));
fprintf('Saving asdHmmModel...\n');
save asdHmmModel asdHmmModel	% Obtain asdHmmModel
%%
% Now we can perform the inside test using the training set:
fprintf('HMM inside test using channel 1:\n');
for i=1:length(auSetTrain)
	fprintf('%d/%d: file=%s\n', i, length(auSetTrain), auSetTrain(i).file);
	myTic=tic;
	figure; au=hmmEval4audio(auSetTrain(i), asdOpt, asdHmmModel, 1);
	fprintf('\tTime=%g sec, accuracy=%g%%\n', toc(myTic), au.rr*100);
	snapnow;
end
%%
% And the outside test using the test set:
fprintf('HMM outside test using channel 2:\n');
for i=1:length(auSetTest)
	fprintf('%d/%d: file=%s\n', i, length(auSetTest), auSetTest(i).file);
	myTic=tic;
	figure; au=hmmEval4audio(auSetTest(i), asdOpt, asdHmmModel, 1);
	fprintf('\tTime=%g sec, accuracy=%g%%\n', toc(myTic), au.rr*100);
	snapnow;
end
%% Summary
% This is a brief tutorial on using HMM for ASD.
% As shown in the above examples, HMM can achieve a decent result on the training dataset.
% In particular, most of the errors occur at boundaries where the human
% labeling is possibly ambiguous, so the true accuracy is likely higher
% than the reported figures.
% However, since the test set is only the second channel of the same
% recordings and we have no independent test data, the capability of HMM
% cannot be evaluated objectively.
%
% There are several directions for further improvement:
%
% * Acquire more audio files for objective evaluation of all classifiers.
% * Investigate new features for ASD.
% * Change the configuration of the GMM used in HMM.
% * Use other classifiers for ASD.
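%% Boundary-tolerant accuracy (illustrative sketch)
% The remark in the summary that most errors fall near label boundaries
% can be checked by scoring the frames twice: once over all frames, and
% once after excluding frames close to a ground-truth boundary. The
% following is a minimal sketch that uses only base MATLAB. The label
% vectors are toy values, and the names "target", "predicted", and "tol"
% are assumptions made for illustration; they are not fields returned by
% hmmEval4audio.
target   =[1 1 1 1 2 2 2 2 1 1 1 1];	% Toy ground-truth class of each frame
predicted=[1 1 1 2 2 2 2 2 2 1 1 1];	% Toy predicted class of each frame
tol=1;	% Frames within tol frames of a label change are ignored
boundary=find(diff(target)~=0);	% Index of the last frame before each label change
nearBoundary=false(size(target));
for k=boundary
	first=max(1, k-tol+1);
	last=min(length(target), k+tol);
	nearBoundary(first:last)=true;	% Mark frames on both sides of the change
end
rrAll=mean(predicted==target);	% Accuracy over all frames
rrInterior=mean(predicted(~nearBoundary)==target(~nearBoundary));	% Accuracy away from boundaries
fprintf('Accuracy over all frames=%g%%\n', rrAll*100);
fprintf('Accuracy excluding boundary frames=%g%%\n', rrInterior*100);
%%
% If the second figure is much higher than the first, the errors are
% concentrated around the labeled boundaries. The choice of tol is a
% judgment call that depends on the frame rate and on how precisely the
% cues were labeled.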
%
%% Appendix
% List of functions, scripts, and datasets used in this script:
%
% * <../dataset/index4listing.asp Audio files>
% * <../list.asp List of files in this folder>
%
%%
% Date and time when finishing this script:
fprintf('Date & time: %s\n', char(datetime));
%%
% Overall elapsed time:
toc(scriptStartTime)
%%
% , created on datetime
%%
% If you are interested in the original MATLAB code for this page, you can
% type "grabcode(URL)" under MATLAB, where URL is the web address of this
% page.