syl2css

PURPOSE

syl2css: Find the confusing syllable set (CSS) of a given Mandarin syllable written in Hanyu Pinyin.

SYNOPSIS

function output=syl2css(syl, dict)

DESCRIPTION

 syl2css: Find the confusing syllable set (CSS) of a given Mandarin syllable written in Hanyu Pinyin.
    The CSS rules were derived by analyzing Mandarin utterances of native Japanese speakers.

    Usage: output=syl2css(syl)
           output=syl2css(syl, dict)

    The output is a cell array of strings: syl itself, followed by its confusable variants.
    The second usage removes from the output any syllable that is not in dict.

    For example:

        syl='ren';
        css1=syl2css(syl)
        dictFile='hanyu.dic';
        dict=dictRead(dictFile);
        css2=syl2css(syl, dict)
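
    The dictionary filtering in the second usage can be sketched without hanyu.dic. This is only an illustration: the struct array below is a hypothetical stand-in for what dictRead returns (only the word field, which the source listing reads, is assumed), and the candidate list is what the substitution rules in the source yield for 'ren'.

```matlab
% Hypothetical in-memory dictionary: a struct array with a 'word' field.
% Real dictionaries come from dictRead; only the 'word' field is assumed here.
dict = struct('word', {'ren', 'leng', 'ma'});

% Candidates that the substitution rules produce for 'ren'
candidates = {'ren', 'len', 'reng', 'leng'};

% Keep only the candidates that appear in the dictionary
keep = ismember(candidates, {dict.word});
css2 = candidates(keep)
```

    With this toy dictionary, only 'ren' and 'leng' survive the filter, since 'len' and 'reng' are not listed in it.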

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SUBFUNCTIONS

function selfdemo

SOURCE CODE

function output=syl2css(syl, dict)
% syl2css: Find the confusing syllable set (CSS) of a given Mandarin syllable written in Hanyu Pinyin.
%    The CSS rules were derived by analyzing Mandarin utterances of native Japanese speakers.
%
%    Usage: output=syl2css(syl)
%           output=syl2css(syl, dict)
%
%    The second usage removes from the output any syllable that is not in dict.
%
%    For example:
%
%        syl='ren';
%        css1=syl2css(syl)
%        dictFile='hanyu.dic';
%        dict=dictRead(dictFile);
%        css2=syl2css(syl, dict)

%    Roger Jang, 20070214

if nargin<1; selfdemo; return; end
if nargin<2; dict=[]; end

output={syl};
% ====== Confusing phones at beginning (first matching rule wins)
beginPhone1={'ch', 'r', 'c', 'q'};
beginPhone2={'zh', 'l', 'z', 'j'};
for i=1:length(beginPhone1)
    pat=['^', beginPhone1{i}];
    newSyl=regexprep(syl, pat, beginPhone2{i});
    if ~strcmp(syl, newSyl)
        output={output{:}, newSyl};
        break;
    end
end
% ====== Confusing phones at end (applied to every syllable collected so far)
endPhone1={'an', 'eng', 'en', 'iu'};
endPhone2={'ang', 'en', 'eng', 'iu2'};
base=output;
for j=1:length(base)
    syl=base{j};
    for i=1:length(endPhone1)
        pat=[endPhone1{i}, '$'];
        newSyl=regexprep(syl, pat, endPhone2{i});
        if ~strcmp(syl, newSyl)
            output={output{:}, newSyl};
            break;
        end
    end
end
% ====== Confusing syllables (whole-syllable matches)
endPhone1={'bo', 'fo', 'mo', 'po'};
endPhone2={'bo2', 'fo2', 'mo2', 'po2'};
base=output;
for j=1:length(base)
    syl=base{j};
    for i=1:length(endPhone1)
        pat=['^', endPhone1{i}, '$'];
        newSyl=regexprep(syl, pat, endPhone2{i});
        if ~strcmp(syl, newSyl)
            output={output{:}, newSyl};
            break;
        end
    end
end

if ~isempty(dict)
    % Eliminate syllables not in the dict
    allSyls={dict.word};
    index=[];
    for i=1:length(output)
        if ~any(strcmp(output{i}, allSyls))
            index=[index, i];
        end
    end
    output(index)=[];
end

% ====== Self demo
function selfdemo
syl='ren'
css1=syl2css(syl)
dictFile='hanyu.dic';
dict=dictRead(dictFile);
css2=syl2css(syl, dict)
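
The initial-phone substitution in the listing can be tried on its own. The following is a minimal sketch that reuses the beginPhone1 and beginPhone2 tables from the source; the sample syllables and the variable names syls, mapped, and cand are illustrative assumptions, not part of the original function.

```matlab
% Rule tables copied from the source listing
beginPhone1 = {'ch', 'r', 'c', 'q'};
beginPhone2 = {'zh', 'l', 'z', 'j'};

% Illustrative input syllables (not from the original)
syls = {'chi', 'ren', 'cai', 'qu', 'ma'};
mapped = cell(size(syls));

for k = 1:numel(syls)
    mapped{k} = syls{k};    % default: no rule applies
    for i = 1:numel(beginPhone1)
        % Anchor the pattern at the start of the syllable
        cand = regexprep(syls{k}, ['^', beginPhone1{i}], beginPhone2{i});
        if ~strcmp(cand, syls{k})
            mapped{k} = cand;   % first matching rule wins
            break;
        end
    end
    fprintf('%s -> %s\n', syls{k}, mapped{k});
end
```

A syllable whose initial matches none of the four rules, such as 'ma', is left unchanged, which is why syl2css adds an initial-phone variant only when the substitution actually alters the string.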

Generated on Tue 01-Jun-2010 09:50:19 by m2html © 2003