Automatic Pronunciation Assessment



Published in



Automatic Text-Independent Pronunciation Scoring of Foreign Language Student Speech L. Neumeyer
H. Franco
M. Weintraub
P. Price
ICSLP 1996 [PDF] Proposes four basic scoring methods: HMM log-likelihood scores, segment classification scores, segment duration scores, and timing scores.
Automatic Pronunciation Scoring for Language Instruction H. Franco
L. Neumeyer
Y. Kim
O. Ronen
ICASSP 1997 [PDF] Proposes an HMM-based log-posterior probability score that outperforms the earlier segment duration and log-likelihood scores. The sentence-level score is the average of the per-phone scores. Also proposes combining scores via linear regression, nonlinear regression (neural network), and model estimation.
Automatic Detection of Mispronunciation for Language Instruction O. Ronen
L. Neumeyer
H. Franco
Eurospeech 1997 [PDF] Uses a mispronunciation network.
A CALL System Using Speech Recognition to Train the Pronunciation of Japanese Long Vowels, the Mora Nasal and Mora Obstruents G. Kawai
K. Hirose
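Eurospeech 1997 [PDF] Automatically recognizes, and gives feedback on, three sound contrasts that learners of Japanese often confuse: long/short vowels, mora/non-mora nasals, and mora/non-mora obstruents. Recognition relies on duration alone; the duration distribution of each sound is first estimated from a native corpus.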
Automatic Pronunciation Scoring of Specific Phone Segments for Language Instruction Y. Kim
H. Franco
L. Neumeyer
Eurospeech 1997 [PDF]
Proposes three scoring methods, based on likelihood, posterior probability, and duration.
Using Likelihood Ratios to Perform Utterance Verification in Automatic Pronunciation Assessment F. de Wet
C. Cucchiarini
H. Strik
L. Boves
Eurospeech 1999 [PDF]
Automatic Scoring of Pronunciation Quality L. Neumeyer
H. Franco
V. Digalakis
M. Weintraub
Speech Communication, 30:83–93, 2000. [PDF] Follows up on the Eurospeech 1997 paper by Kim et al.; presents five scoring methods, though they look largely similar.
Combination of Machine Scores for Automatic Grading of Pronunciation Quality H. Franco
L. Neumeyer
V. Digalakis
O. Ronen
Speech Communication 2000 [PDF] Proposes several score combination methods, including linear regression, neural networks, distribution estimation, and regression trees. The neural network performs best but is also the most troublesome to tune.
Phone-Level Pronunciation Scoring and Assessment for Interactive Language Learning S. Witt
S. Young
Speech Communication, 30(2/3):95-108, 2000 [PDF]
Proposes the GoP (Goodness of Pronunciation) method for mispronunciation detection, and also discusses evaluation methodology in detail.
English Speech Database Read by Japanese Learners for CALL System Development N. Minematsu
Y. Tomiyama
K. Yoshimoto
K. Shimizu
S. Nakagawa
M. Dantsuji
S. Makino
International Conference on Language Resources and Evaluation 2002 [PDF] Introduction to the ERJ corpus, #1
Development of English Speech Database Read by Japanese to Support CALL Research N. Minematsu
Y. Tomiyama
K. Yoshimoto
K. Shimizu
S. Nakagawa
M. Dantsuji
S. Makino
International Congress on Acoustics 2004 [PDF] Introduction to the ERJ corpus, #2 (cite this paper when using the corpus)
Automatic Pronunciation Assessment for Mandarin Chinese J. C. Chen
J. S. R. Jang
J. Y. Li
M. C. Wu
ICME 2004 [PDF]
Segmental errors in Dutch as a second language how to establish priorities for CAPT A. Neri
C. Cucchiarini
H. Strik
Computer Assisted Spoken English Learning for Chinese in Taiwan J. C. Chen
J. L. Lo
J. S. R. Jang
Pronunciation Assessment Based upon the Phonological Distortions Observed in Language Learners' Utterances N. Minematsu ICSLP 2004 [PDF]
The Multi-Channel Wall Street Journal Audio Visual Corpus (MC-WSJ-AV): Specification and Initial Experiment M. Lincoln
I. McCowan
J. Vepa
H. K. Maganti
Automatic Speech Recognition and Understanding, 2005 IEEE Workshop, pages 357-362, 2005 [PDF]
ASR-Based Corrective Feedback on Pronunciation Does It Really Work A. Neri
C. Cucchiarini
H. Strik
ICSLP 2006 [PDF]
Non-Native Speech Databases M. Raab
R. Gruhn
E. Noeth
Proc. ASRU 2007 [PDF] A detailed survey of non-native speech databases, though many of its links have long since gone dead.
Mandarin Vowel Pronunciation Quality Evaluation by a Novel Formant Classification Method and its Combination with Traditional Algorithms F. Pan
Q. Zhao
Y. Yan
The Goodness of Pronunciation Algorithm a Detailed Performance Study S. Kanters
C. Cucchiarini
H. Strik
SLaTE 2009 [PDF] (Best Paper Award)
Implementation of an Extended Recognition Network for Mispronunciation Detection and Diagnosis in Computer-Assisted Pronunciation Training A. M. Harrison
W. Lo
X. Qian
H. Meng
SLaTE 2009 [PDF]
Computer Assisted Language Learning system based on dynamic question generation and error prediction for automatic speech recognition H. Wang
C. J. Waple
T. Kawahara
Speech Communication 2009 [PDF]
Automatic Pronunciation Scoring of Words and Sentences Independent from the Non-Native's First Language T. Cincarek
R. Gruhn
C. Hacker
E. Noth
S. Nakamura
Computer Speech and Language 2009 [PDF] Uses a variety of scoring methods and score combination methods.
Spoken English Assessment System for Non-Native Speakers Using Acoustic and Prosodic Features Q. Shi
K. Li
S. L. Zhang
S. M. Chu
J. Xiao
Z. J. Ou
Integration of Multilayer Regression Analysis with Structure-based Pronunciation Assessment M. Suzuki
Y. Qiao
N. Minematsu
K. Hirose
Interspeech 2010 [PDF]
Automatic Evaluation of English Pronunciation by Japanese Speakers Using Various Acoustic Features and Pattern Recognition Techniques K. Hirabayashi
S. Nakagawa
Interspeech 2010 [PDF] A paper reporting very high correlations. It uses 15 scoring methods (some of them rather odd), with the best correlation exceeding 0.9. The non-native corpora include TED (Translanguage English Database) and ERJ (English Read by Japanese).
Landmark-based Automated Pronunciation Error Detection S. Y. Yoon
M. Hasegawa-Johnson
R. Sproat
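Interspeech 2010 [PDF] An awkward paper: it computes scores with GoP and an SVM, then combines the scores with another SVM. The method is poor, many of the choices are strange, and the four steps surprisingly use four different corpora.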
Discriminative Acoustic Model for Improving Mispronunciation Detection and Diagnosis in Computer-Aided Pronunciation Training (CAPT) X. Qian
F. Soong
H. Meng
Interspeech 2010 [PDF]
A New Approach for Automatic Tone Error Detection in Strong Accented Mandarin Based on Dominant Set T. Zhu
D. Ke
Z. Chen
B. Xu
Interspeech 2010 [PDF]
Automatic Derivation of Phonological Rules for Mispronunciation Detection in a Computer-Assisted Pronunciation Training System W. K. Lo
S. Zhang
H. Meng
Interspeech 2010 [PDF]
Adapting a Duration Synthesis Model to Rate Children's Oral Reading Prosody M. Duong
J. Mostow
Interspeech 2010 [PDF]
Predicting Word Accuracy for the Automatic Speech Recognition of Non-Native Speech S. Y. Yoon
L. Chen
K. Zechner
Interspeech 2010 [PDF]
Regularized-MLLR Speaker Adaptation for Computer-Assisted Language Learning System D. Luo
Y. Qiao
N. Minematsu
Y. Yamauchi
K. Hirose
Interspeech 2010 [PDF]
Using Non-Native Error Patterns to Improve Pronunciation Verification J. van Doremalen
C. Cucchiarini
H. Strik
Interspeech 2010 [PDF]
Decision Tree Based Tone Modeling with Corrective Feedbacks for Automatic Mandarin Tone Assessment H. C. Liao
J. C. Chen
S. C. Chang
Y. H. Guan
C. H. Lee
Interspeech 2010 [PDF]
CASTLE a Computer-Assisted Stress Teaching and Learning Environment for Learners of English as a Second Language J. Lu
R. Wang
L. C De Silva
Y. Gao
J. Liu
Interspeech 2010 [PDF]
Exploring goodness of prosody by diverse matching templates S. Huang
H. Li
S. Wang
J. Liang
B. Xu
Interspeech 2010 [PDF]
Automatic reference independent evaluation of prosody quality using multiple knowledge fusions S. Huang
H. Li
S. Wang
J. Liang
B. Xu
Interspeech 2010 [PDF]
Developing A Chinese L2 Speech Database of Japanese Learners With Narrow-Phonetic Labels For Computer Assisted Pronunciation Training W. Cao
D. Wang
J. Zhang
Z. Xiong
Interspeech 2010 [PDF]
Influence of musical training on perception of L2 speech M. Sadakata
L. van der Zanden
K. Sekiyama
Interspeech 2010 [PDF]
Learning words and speech units through natural interactions J. Hornstein
J. Santos-Victor
Interspeech 2010 [PDF]
Quantitative, Notional, and Comprehensive Evaluations of Spontaneous Engaged Speech G. Molholt
M. J. Cabrera
V. K. Kumar
P. Thompsen
The Computer Assisted Language Instruction Consortium (CALICO) 2011 [PDF] A paper on assessing speaking proficiency; not read yet.
New feature parameters for pronunciation evaluation in English presentations at international conferences H. Kibishi
S. Nakagawa
Interspeech 2011 [PDF] A continuation of their Interspeech 2010 paper; I have not read it yet.
Automatically assessing the ABCs Verification of children's spoken letter-names and letter-sounds M. P. Black
A. Kazemzadeh
J. Tepperman
S. S. Narayanan
TSLP 2011 [PDF]
FLORA Fluent oral reading assessment of children's speech D. Bolanos
R. A. Cole
W. Ward
E. Borts
E. Svirsky
TSLP 2011 [PDF]
Two methods for assessing oral reading prosody M. Duong
J. Mostow
S. Sitaram
TSLP 2011 [PDF]
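
The log-posterior scoring summarized in the Franco et al. (ICASSP 1997) note above, where each phone gets a score and the sentence score is their average, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes frame-level log-likelihoods for every phone class and a forced alignment are already available, and it assumes uniform phone priors. The function names and the input layout are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def phone_log_posterior_scores(frame_loglik, segments):
    """Average frame-level log-posterior of the aligned phone per segment.

    frame_loglik: list of frames, each a list of log p(y_t | phone k)
                  (assumed input; uniform phone priors are assumed).
    segments:     list of (phone_index, start_frame, end_frame), end exclusive,
                  e.g. taken from a forced alignment (assumed input).
    """
    scores = []
    for phone, start, end in segments:
        total = 0.0
        for t in range(start, end):
            # log P(phone | y_t) = log p(y_t | phone) - log sum_k p(y_t | k)
            total += frame_loglik[t][phone] - log_sum_exp(frame_loglik[t])
        scores.append(total / (end - start))
    return scores


def utterance_score(phone_scores):
    """Sentence-level score: plain average of the per-phone scores."""
    return sum(phone_scores) / len(phone_scores)
```

Scores are at most 0; values closer to 0 mean the aligned phone dominates the competing phone classes on its frames.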

Learning to Rank



Published in



Pranking with Ranking K. Crammer
Y. Singer
Advances in Neural Information Processing Systems (NIPS) 2001 [PDF]
Uses a perceptron for learning to rank; a pointwise method and a classic.
Learning to Rank using Gradient Descent C. Burges
T. Shaked
E. Renshaw
A. Lazier
M. Deeds
N. Hamilton
G. Hullender
ICML 2005 [PDF]
RankNet: combines the pairwise idea with a neural network for learning to rank; also a classic paper.
Learning to Rank: From Pairwise Approach to Listwise Approach Z. Cao
T. Qin
T. Y. Liu
M. F. Tsai
H. Li
ICML 2007 [PDF]
An excellently written paper on listwise learning to rank, with very clear mathematical derivations; a very strong paper.
Learning to Rank with Ties K. Zhou
G. R. Xue
H. Zha
Y. Yu
SIGIR 2008 [PDF]
Adds tie data to the pairwise approach, making more efficient use of limited data; also a very strong paper.
Ranking Projection Z. S. Chen 20100203 Lab Meeting [PPT] An idea that 致生 (a senior labmate) came up with before going abroad; a more complete journal version appeared later.
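
As an illustration of the perceptron-style ranking noted for "Pranking with Ranking" above, here is a minimal sketch of a PRank-like online update: one weight vector plus an ordered set of thresholds that split the score axis into ranks. The class name, method names, and toy data are hypothetical and chosen only for this example; see the paper for the exact formulation and its analysis.

```python
class PRank:
    """Perceptron-style ordinal ranker with ranks 1..n_ranks."""

    def __init__(self, n_features, n_ranks):
        self.w = [0.0] * n_features
        # n_ranks - 1 finite thresholds; the last one is implicitly +infinity.
        self.b = [0.0] * (n_ranks - 1)
        self.n_ranks = n_ranks

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def predict(self, x):
        # Predicted rank is the smallest r whose threshold exceeds the score.
        s = self.score(x)
        for r, b in enumerate(self.b, start=1):
            if s - b < 0:
                return r
        return self.n_ranks

    def update(self, x, y):
        """One online step on example x with true rank y (1-based)."""
        if self.predict(x) == y:
            return
        s = self.score(x)
        taus = []
        for r, b in enumerate(self.b, start=1):
            # Threshold r should sit above the score if y <= r, else below it.
            y_r = -1 if y <= r else 1
            taus.append(y_r if (s - b) * y_r <= 0 else 0)
        total = sum(taus)
        # Move the weights and each violated threshold toward each other.
        self.w = [wi + total * xi for wi, xi in zip(self.w, x)]
        self.b = [b - tau for b, tau in zip(self.b, taus)]


# Toy usage: three one-dimensional examples with ranks 1, 2, 3.
data = [([1.0], 1), ([2.0], 2), ([3.0], 3)]
model = PRank(n_features=1, n_ranks=3)
for _ in range(20):
    for x, y in data:
        model.update(x, y)
```

Because every example is scored on its own against the thresholds, this sits on the pointwise side of the pointwise/pairwise/listwise split used in the notes above.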

Stress Detection



Published in



Detection of Accents, Phrase Boundaries, and Sentence Modality in German with Prosodic Features V. Strom Eurospeech 1995 [PDF]
Acoustic Correlates of Linguistic Stress and Accent in Dutch and American English A. Sluijter
V. van Heuven
ICSLP 1996 [PDF]
Prosodic Prominence Detection in Speech F. Tamburini International Symposium on Signal Processing and its Applications, 2003 [PDF] [Summary]
This paper shows that, for prominence detection, using the whole syllable versus only the syllable nucleus makes little difference in performance. It also uses pitch accent plus stress for prominence detection.
Detecting Stress in Spoken English using Decision Trees and Support Vector Machines H. Xie
P. Andreae
M. Zhang
P. Warren
Australasian Information Security, Data Mining and Web Intelligence, and Software Internationalisation, 2004 [PDF]
Practical Use of English Pronunciation System for Japanese Students in the CALL Classroom Y. Tsubota
T. Kawahara
M. Dantsuji
Speech Rate Estimation via Temporal Correlation and Selected Sub-Band Correlation S. Narayanan
D. Wang
Automatic Syllable Stress Detection using Prosodic Features for Pronunciation Evaluation of Language Learners J. Tepperman
S. Narayanan
Clearly explains the differences among prominence, pitch accent, and stress, and why pitch-related features are still useful for stress detection.
Loudness Predicts Prominence Fundamental Frequency Lends Little G. Kochanski
E. Grabe
J. Coleman
B. Rosner
Journal of the Acoustical Society of America, 2005 [PDF]
An Automatic System for Detecting Prosodic Prominence in American English Continuous Speech F. Tamburini
C. Caini
International Journal of Speech Technology, 2005 [PDF]
An Acoustic Measure for Word Prominence in Spontaneous Speech D. Wang
S. Narayanan
TASLP 2007 [PDF]
Word stress assessment for computer aided language learning J. P. Arias
N. B. Yoma
H. Vivanco
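Interspeech 2009 [PDF] [Summary] This method compares the pitch and energy of two utterances to decide whether the stress positions of the two utterances (words) are the same; it requires a teacher's recording to perform the assessment.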
Automatic Prediction of Children's Reading Ability for High-level Literacy Assessment M. P. Black
J. Tepperman
S. Narayanan
TASLP 2011 [PDF]
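
The reference-based idea in the Arias et al. note above (compare a learner's pitch and energy with a teacher's to check whether stress falls in the same place) can be illustrated with a deliberately simplified sketch. This is not the algorithm from that paper: it assumes per-syllable pitch and energy values have already been extracted for both recordings, and it uses a hypothetical prominence measure (the sum of z-normalized pitch and energy). All function names are made up for the example.

```python
def z_normalize(values):
    """Zero-mean, unit-variance scaling; constant input maps to all zeros."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5
    if std == 0:
        return [0.0] * len(values)
    return [(v - mean) / std for v in values]


def stressed_syllable(pitch, energy):
    """Index of the most prominent syllable under the assumed measure."""
    p = z_normalize(pitch)
    e = z_normalize(energy)
    prominence = [a + b for a, b in zip(p, e)]
    return max(range(len(prominence)), key=prominence.__getitem__)


def same_stress_position(teacher, learner):
    """teacher / learner: (pitch_per_syllable, energy_per_syllable) tuples."""
    return stressed_syllable(*teacher) == stressed_syllable(*learner)


# Toy usage with made-up values (Hz for pitch, dB for energy).
teacher = ([120.0, 180.0, 110.0], [60.0, 72.0, 58.0])
learner = ([200.0, 210.0, 260.0], [55.0, 56.0, 70.0])
match = same_stress_position(teacher, learner)
```

Normalizing within each speaker keeps differences in overall pitch range or recording level from affecting the comparison, which matters when the teacher and learner voices differ a lot.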

last updated: 2013/01/28