SLT 2012:
Miami, FL, USA
2012 IEEE Spoken Language Technology Workshop (SLT), Miami, FL, USA, December 2-5, 2012.
IEEE 2012, ISBN 978-1-4673-5125-6
- Teruhisa Misu, Hideki Kashioka:
Simultaneous feature selection and parameter optimization for training of dialog policy by reinforcement learning.
1-6

- Filip Jurcícek:
Reinforcement learning for spoken dialogue systems using off-policy natural gradient method.
7-12

- Zhuoran Wang, Oliver Lemon:
A nonparametric Bayesian approach to learning multimodal interaction management.
13-18

- Sajad Shirali-Shahreza, Gerald Penn:
Realistic answer verification: An analysis of user errors in a sentence-repetition task.
19-24

- Svetlana Stoyanchev, Philipp Salletmayr, Jingbo Yang, Julia Hirschberg:
Localized detection of speech recognition errors.
25-30

- Milica Gasic, Matthew Henderson, Blaise Thomson, Pirros Tsiakoulis, Steve Young:
Policy optimisation of POMDP-based dialogue systems without state space compression.
31-36

- Blaise Thomson, Milica Gasic, Matthew Henderson, Pirros Tsiakoulis, Steve Young:
N-best error simulation for training spoken dialogue systems.
37-42

- Manolis Perakakis, Alexandros Potamianos:
Affective evaluation of a mobile multimodal dialogue system using brain signals.
43-48

- Fabrizio Morbini, Kartik Audhkhasi, Ron Artstein, Maarten Van Segbroeck, Kenji Sagae, Panayiotis S. Georgiou, David R. Traum, Shrikanth S. Narayanan:
A reranking approach for recognition and classification of speech input in conversational dialogue systems.
49-54

- Jason D. Williams:
A critical analysis of two statistical spoken dialog systems in public use.
55-60

- Sungjin Lee, Maxine Eskenazi:
POMDP-based Let's Go system for spoken dialog challenge.
61-66

- Gina-Anne Levow, Siwei Wang:
Employing boosting to compare cues to verbal feedback in multi-lingual dialog.
67-72

- William Yang Wang, Dan Bohus, Ece Kamar, Eric Horvitz:
Crowdsourcing the acquisition of natural language corpora: Methods and observations.
73-78

- Kornel Laskowski:
Exploiting loudness dynamics in stochastic models of turn-taking.
79-84

- Felix Stahlberg, Tim Schlippe, Stephan Vogel, Tanja Schultz:
Word segmentation through cross-lingual word-to-phoneme alignment.
85-90

- Arseniy Gorin, Denis Jouvet:
Class-based speech recognition using a maximum dissimilarity criterion and a tolerance classification margin.
91-96

- Nicolas Obin, Marco Liuni:
On the generalization of Shannon entropy for speech recognition.
97-102

- Shuji Komeiji, Takayuki Arakawa, Takafumi Koshinaka:
A noise-robust speech recognition method composed of weak noise suppression and weak Vector Taylor Series Adaptation.
103-106

- Fabian Triefenbach, Kris Demuynck, Jean-Pierre Martens:
Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling.
107-112

- Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura:
Recognition rate estimation based on word alignment network and discriminative error type classification.
113-118

- Taehwan Kim, Karen Livescu, Gregory Shakhnarovich:
American sign language fingerspelling recognition with phonological feature-based tandem models.
119-124

- Satoshi Kobashikawa, Takaaki Hori, Yoshikazu Yamaguchi, Taichi Asami, Hirokazu Masataki, Satoshi Takahashi:
Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation.
125-130

- Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong:
Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM.
131-136

- Cong-Thanh Do, Mohammad J. Taghizadeh, Philip N. Garner:
Combining cepstral normalization and cochlear implant-like speech processing for microphone array-based speech recognition.
137-142

- Gang Li, Huifeng Zhu, Gong Cheng, Kit Thambiratnam, Behrooz Chitsaz, Dong Yu, Frank Seide:
Context-dependent Deep Neural Networks for audio indexing of real-life data.
143-148

- Yosuke Kashiwagi, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose:
Audio-visual feature integration based on piecewise linear transformation for noise robust automatic speech recognition.
149-152

- Gopala Krishna Anumanchipalli, Luís C. Oliveira, Alan W. Black:
Intent transfer in speech-to-speech machine translation.
153-158

- Alex Marin, Tom Kwiatkowski, Mari Ostendorf, Luke S. Zettlemoyer:
Using syntactic and confusion network structure for out-of-vocabulary word detection.
159-164

- Md. Akmal Haidar, Douglas D. O'Shaughnessy:
Topic n-gram count language model adaptation for speech recognition.
165-169

- Naoyuki Kanda, Ryu Takeda, Yasunari Obuchi:
Using rhythmic features for Japanese spoken term detection.
170-175

- Matthew Henderson, Milica Gasic, Blaise Thomson, Pirros Tsiakoulis, Kai Yu, Steve Young:
Discriminative spoken language understanding using word confusion networks.
176-181

- Hung-yi Lee, Tsung-Hsien Wen, Lin-Shan Lee:
Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph.
182-187

- Tsung-Hsien Wen, Hung-yi Lee, Tai-Yuan Chen, Lin-Shan Lee:
Personalized language modeling by crowd sourcing with social network data for voice access of cloud applications.
188-193

- Fernando García, Lluís F. Hurtado, Encarna Segarra, Emilio Sanchis, Giuseppe Riccardi:
Combining multiple translation systems for Spoken Language Understanding portability.
194-198

- Ali Orkan Bayer, Giuseppe Riccardi:
Joint language models for automatic speech recognition and understanding.
199-203

- Teppei Ohno, Tomoyosi Akiba:
Incorporating syllable duration into line-detection-based spoken term detection.
204-209

- Li Deng, Gökhan Tür, Xiaodong He, Dilek Z. Hakkani-Tür:
Use of kernel deep convex networks and end-to-end learning for spoken language understanding.
210-215

- Asli Çelikyilmaz, Dilek Z. Hakkani-Tür, Gökhan Tür:
Statistical semantic interpretation modeling for spoken language understanding with enriched semantic features.
216-221

- Timothy J. Hazen, Fred Richardson:
Modeling multiword phrases with constrained phrase trees for improved topic modeling of conversational speech.
222-227

- Larry P. Heck, Dilek Hakkani-Tür:
Exploiting the Semantic Web for unsupervised spoken language understanding.
228-233

- Tomas Mikolov, Geoffrey Zweig:
Context dependent recurrent neural network language model.
234-239

- Florian Hinterleitner, Christoph Norrenbrock, Sebastian Möller, Ulrich Heute:
What makes this voice sound so bad? A multidimensional analysis of state-of-the-art text-to-speech systems.
240-245

- Pawel Swietojanski, Arnab Ghoshal, Steve Renals:
Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR.
246-251

- Maria Astrinaki, Nicolas D'Alessandro, Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Reactive and continuous control of HMM-based speech synthesis.
252-257

- Oliver Jokisch, Yitagessu Birhanu, Rüdiger Hoffmann:
Syllable-based prosodic analysis of Amharic read speech.
258-262

- David Imseng, Hervé Bourlard, Holger Caesar, Philip N. Garner, Gwénolé Lecorve, Alexandre Nanchen:
MediaParl: Bilingual mixed language accented speech database.
263-268

- Jianbo Jiang, Zhiyong Wu, Mingxing Xu, Jia Jia, Lianhong Cai:
Comparison of adaptation methods for GMM-SVM based speech emotion recognition.
269-273

- Mireia Díez, Amparo Varona, Mikel Peñagarikano, Luis Javier Rodríguez-Fuentes, Germán Bordel:
On the use of phone log-likelihood ratios as features in spoken language recognition.
274-279

- Marc Ferras, Hervé Bourlard:
Speaker diarization and linking of large corpora.
280-285

- Adriana Stan, Peter Bell, Simon King:
A grapheme-based method for automatic alignment of speech and text data.
286-290

- Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Statistical methods for varying the degree of articulation in new HMM-based voices.
291-296

- Éva Székely, Tamás Gábor Csapó, Bálint Tóth, Péter Mihajlik, Julie Carson-Berndsen:
Synthesizing expressive speech from amateur audiobook recordings.
297-302

- Kyu J. Han, Jason W. Pelecanos:
Frame-based phonotactic Language Identification.
303-306

- Sriram Ganapathy, Mohamed Kamal Omar, Jason W. Pelecanos:
Noisy channel adaptation in language identification.
307-312

- Ryoichi Takashima, Tetsuya Takiguchi, Yasuo Ariki:
Exemplar-based voice conversion in noisy environment.
313-317

- L. Paola García-Perera, Juan Arturo Nolazco-Flores, Bhiksha Raj, Richard M. Stern:
Optimization of the DET curve in speaker verification.
318-323

- P. J. Bell, M. J. F. Gales, P. Lanchantin, Xunying Liu, Y. Long, S. Renals, P. Swietojanski, Philip C. Woodland:
Transcription of multi-genre media archives using out-of-domain data.
324-329

- Mohamed Bouallegue, Emmanuel Ferreira, Driss Matrouf, Georges Linares, Maria Goudi, Pascal Nocera:
Acoustic modeling for under-resourced languages based on vectorial HMM-states representation using Subspace Gaussian Mixture Models.
330-335

- Karel Veselý, Martin Karafiát, Frantisek Grézl, Milos Janda, Ekaterina Egorova:
The language-independent bottleneck features.
336-341

- Stefan Ziegler, Bogdan Ludusan, Guillaume Gravier:
Towards a new speech event detection approach for landmark-based speech recognition.
342-347

- João Miranda, João Paulo Neto, Alan W. Black:
Recovery of acronyms, out-of-lattice words and pronunciations from parallel multilingual speech.
348-353

- Daniel Bolanos:
The Bavieca open-source speech recognition toolkit.
354-359

- Udhyakumar Nallasamy, Florian Metze, Tanja Schultz:
Active learning for accent adaptation in Automatic Speech Recognition.
360-365

- Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong:
Adaptation of context-dependent deep neural networks for automatic speech recognition.
366-369

- Leonardo Badino, Claudia Canevari, Luciano Fadiga, Giorgio Metta:
Deep-level acoustic-to-articulatory mapping for DBN-HMM based phone recognition.
370-375

- Andrew Rosenberg:
Modeling intensity contours and the interaction of pitch and intensity to improve automatic prosodic event detection and classification.
376-381

- Ann Lee, James R. Glass:
A comparison-based approach to mispronunciation detection.
382-387

- Mostafa Ali Shahin, Beena Ahmed, Kirrie J. Ballard:
Automatic classification of unequal lexical stress patterns using machine learning algorithms.
388-391

- K. Riedhammer, M. Gropp, E. Noth:
The FAU Video Lecture Browser system.
392-397

- Ghada AlHarbi, Thomas Hain:
Automatic transcription of academic lectures from diverse disciplines.
398-403

- Heather Friedberg, Diane J. Litman, Susannah B. F. Paletz:
Lexical entrainment and success in student engineering groups.
404-409

- Sandrine Brognaux, Thomas Drugman, Richard Beaufort:
Automatic detection and correction of syntax-based prosody annotation errors.
410-415

- Sandrine Brognaux, Sophie Roekhaut, Thomas Drugman, Richard Beaufort:
Train&align: A new online tool for automatic phonetic alignment.
416-421

- Luiza Orosanu, Denis Jouvet, Dominique Fohr, Irina Illina, Anne Bonneau:
Combining criteria for the detection of incorrect entries of non-native speech in the context of foreign language learning.
422-427

- Yi Luan, Masayuki Suzuki, Yutaka Yamauchi, Nobuaki Minematsu, Shuhei Kato, Keikichi Hirose:
Performance improvement of automatic pronunciation assessment in a noisy classroom.
428-431

- Sechun Kang, Gary Geunbae Lee, Ho-Young Lee, Byeongchang Kim:
An automatic pitch accent feedback system for English learners with adaptation of an English corpus spoken by Koreans.
432-437

- Meysam Asgari, Izhak Shafran, Alireza Bayestehtashk:
Robust detection of voiced segments in samples of everyday conversations using unsupervised HMMs.
438-442

- Kyusong Lee, Soo-Ok Kweon, Hongsuck Seo, Gary Geunbae Lee:
Generating grammar questions using corpus data in L2 learning.
443-448

- Ian Kaplan, Andrew Rosenberg:
Analysis of speech transcripts to predict winners of U.S. Presidential and Vice-Presidential debates.
449-454

- N. Yang, R. Muraleedharan, J. Kohl, Ilker Demirkol, Wendi Rabiner Heinzelman, Melissa Sturge-Apple:
Speech-based emotion classification using multiclass SVM with hybrid kernel and thresholding fusion.
455-460

- Yun-Nung Chen, Florian Metze:
Two-layer mutually reinforced random walk for improved multi-party meeting summarization.
461-466

- Anthony McCallum, Gerald Penn, Cosmin Munteanu, Xiaodan Zhu:
Ecological validity and the evaluation of speech summarization quality.
467-472

- Tongmu Zhao, Akemi Hoshino, Masayuki Suzuki, Nobuaki Minematsu, Keikichi Hirose:
Automatic Chinese pronunciation error detection using SVM trained with structural features.
473-478

- Deana Pennell, Yang Liu:
Evaluating the effect of normalizing informal text on TTS output.
479-483

Last update Thu May 23 17:59:11 2013 CET by the DBLP Team.
Data released under the ODC-BY 1.0 license.