ASRU 2009:
Merano/Meran, Italy
2009 IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2009, Merano/Meran, Italy, December 13-17, 2009.
IEEE 2009
- Sadaoki Furui:
Generalization problem in ASR acoustic model training and adaptation.
1-10

- Daniel Jurafsky:
It's not you, it's me: Automatically extracting social meaning from speed dates.
11

- Dekai Wu:
Toward machine translation with statistics and syntax and semantics.
12-21

- Gerasimos Potamianos:
Audio-visual automatic speech recognition and related bimodal speech technologies: A review of the state-of-the-art and open problems.
22

- Holger Schwenk:
Trends and challenges in language modeling for speech recognition and machine translation.
23

- Jont B. Allen, Feipeng Li:
Manipulation of consonants in natural speech.
24

- Jason D. Williams:
Spoken dialogue systems: Challenges, and opportunities for research.
25

- Lin-Shan Lee, Yi-Cheng Pan:
Voice-based information retrieval - how far are we from the text-based information retrieval?
26-43

- Mark J. F. Gales:
Acoustic modelling for speech recognition: Hidden Markov models and beyond?
44

- Nicolò Cesa-Bianchi:
Online discriminative learning: theory and applications.
45

- Tatsuya Kawahara:
New perspectives on spoken language understanding: Does machine need to fully understand speech?
46-50

- Tanja Schultz:
Rapid language adaptation tools for multilingual speech processing.
51

- Simon Wiesler, Markus Nußbaum-Thom, Georg Heigold, Ralf Schlüter, Hermann Ney:
Investigations on features for log-linear acoustic models in continuous speech recognition.
52-57

- Abhijeet Sangwan, John H. L. Hansen:
Leveraging speech production knowledge for improved speech recognition.
58-63

- Etienne Marcheret, Vaibhava Goel, Peder A. Olsen:
Optimal quantization and bit allocation for compressing large discriminative feature space transforms.
64-69

- Barbara Schuppler, Joost van Doremalen, Odette Scharenborg, Bert Cranen, Lou Boves:
Using temporal information for improving articulatory-acoustic feature classification.
70-75

- Muhammad Ali Tahir, Georg Heigold, Christian Plahl, Ralf Schlüter, Hermann Ney:
Log-linear framework for linear feature transformations in speech recognition.
76-81

- Karen Livescu, Mark Stoehr:
Multi-view learning of acoustic features for speaker recognition.
82-86

- Chih-Chieh Cheng, Fei Sha, Lawrence K. Saul:
Large-margin feature adaptation for automatic speech recognition.
87-92

- Upendra V. Chaudhari, Michael Picheny:
Articulatory feature detection with Support Vector Machines for integration into ASR and phone recognition.
93-98

- Spiros Dimopoulos, Eric Fosler-Lussier, Chin-Hui Lee, Alexandros Potamianos:
Transition features for CRF-based speech recognition and boundary detection.
99-102

- Pirros Tsiakoulis, Alexandros Potamianos, Dimitrios Dimitriadis:
Short-time instantaneous frequency and bandwidth features for speech recognition.
103-106

- Yun-Hsuan Sung, Daniel Jurafsky:
Hidden Conditional Random Fields for phone recognition.
107-112

- Peter Bell, Simon King:
Diagonal priors for full covariance speech recognition.
113-117

- Yu Qiao, Masayuki Suzuki, Nobuaki Minematsu:
A study on Hidden Structural Model and its application to labeling sequences.
118-123

- Daniel Vásquez, Guillermo Aradilla, Rainer Gruhn, Wolfgang Minker:
A hierarchical structure for modeling inter and intra phonetic information for phoneme recognition.
124-129

- Keith Vertanen, Per Ola Kristensson:
Automatic selection of recognition errors by respeaking the intended text.
130-135

- Xiaodong Cui, Jian Xue, Bowen Zhou:
Improving online incremental speaker adaptation with eigen feature space MLLR.
136-140

- Jui-Ting Huang, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang:
Kernel metric learning for phonetic classification.
141-145

- Kshitij Gupta, John D. Owens:
Three-layer optimizations for fast GMM computations on GPU-like parallel processors.
146-151

- Geoffrey Zweig, Patrick Nguyen:
A segmental CRF approach to large vocabulary continuous speech recognition.
152-157

- Hung-Shin Lee, Berlin Chen:
Generalized likelihood ratio discriminant analysis.
158-163

- Sriram Ganapathy, Samuel Thomas, Hynek Hermansky:
Temporal envelope subtraction for robust speech recognition using modulation spectrum.
164-169

- Federico Flego, Mark J. F. Gales:
Discriminative adaptive training with VTS and JUD.
170-175

- Steven J. Rennie, John R. Hershey, Peder A. Olsen:
Hierarchical variational loopy belief propagation for multi-talker speech recognition.
176-181

- Philip N. Garner:
SNR features for automatic speech recognition.
182-187

- Chanwoo Kim, Richard M. Stern:
Power function-based power distribution normalization algorithm for robust speech recognition.
188-193

- Wooil Kim, John H. L. Hansen:
Mask estimation employing Posterior-based Representative Mean for missing-feature speech recognition with time-varying background noise.
194-198

- Kaustubh Kalgaonkar, Michael L. Seltzer, Alex Acero:
Noise robust model adaptation using linear spline interpolation.
199-204

- Mark J. F. Gales, Anton Ragni, H. AlDamarki, C. Gautier:
Support vector machines for noise robust ASR.
205-210

- Florian Müller, Eugene Belilovsky, Alfred Mertins:
Generalized cyclic transformations in speaker-independent speech recognition.
211-215

- Yoo Rhee Oh, Hong Kook Kim:
MLLR/MAP adaptation using pronunciation variation for non-native speech recognition.
216-221

- Haitian Xu, Mark J. F. Gales, K. K. Chin:
Improving joint uncertainty decoding performance by predictive methods for noise robust speech recognition.
222-227

- Jing Huang, Karthik Visweswariah:
Improved decision trees for multi-stream HMM-based audio-visual continuous speech recognition.
228-231

- Takayuki Arakawa, Haitham Al-Hassanieh, Masanori Tsujikawa, Ryosuke Isotani:
Extended Minimum Classification Error Training in Voice Activity Detection.
232-236

- Hadi Veisi, Hossein Sameti:
An improved parallel model combination method for noisy speech recognition.
237-242

- Chanwoo Kim, Kshitiz Kumar, Richard M. Stern:
Robust speech recognition using a Small Power Boosting algorithm.
243-248

- Negar Ghourchian, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy:
Robust distributed speech recognition using two-stage Filtered Minima Controlled Recursive Averaging.
249-254

- Xiong Xiao, Jinyu Li, Engsiong Chng, Haizhou Li, Chin-Hui Lee:
A study on hidden Markov model's generalization capability for speech recognition.
255-260

- Wen-Hsiang Tu, Sheng-Yuan Huang, Jeih-Weih Hung:
Sub-band modulation spectrum compensation for robust speech recognition.
261-265

- Md. Jahangir Alam, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy:
An improved perceptual speech enhancement technique employing a psychoacoustically motivated weighting factor.
266-270

- Yu Tsao, Shigeki Matsuda, Satoshi Nakamura, Chin-Hui Lee:
MAP estimation of online mapping parameters in ensemble speaker and speaking environment modeling.
271-275

- Hagen Soltau, George Saon:
Dynamic network decoding revisited.
276-281

- Anoop Deoras, Frederick Jelinek:
Iterative decoding: A novel re-scoring framework for confusion networks.
282-286

- Tara N. Sainath:
Island-driven search using broad phonetic classes.
287-292

- Emilian Stoimenov, Tanja Schultz:
A multiplatform speech recognition decoder based on weighted finite-state transducers.
293-298

- Stanley F. Chen, Lidia Mangu, Bhuvana Ramabhadran, Ruhi Sarikaya, Abhinav Sethy:
Scaling shrinkage-based language models.
299-304

- Vladimir Magdin, Hui Jiang:
Discriminative training of n-gram language models for speech recognition via linear programming.
305-310

- Ariya Rastrow, Abhinav Sethy, Bhuvana Ramabhadran:
Constrained discriminative training of N-gram language models.
311-316

- Puyang Xu, Damianos Karakos, Sanjeev Khudanpur:
Self-supervised discriminative training of statistical language models.
317-322

- Nigel G. Ward, Alejandro Vega:
Towards the use of inferred cognitive states in language modeling.
323-326

- Hong-Kwang Jeff Kuo, Lidia Mangu, Ahmad Emami, Imed Zitouni, Young-Suk Lee:
Syntactic features for Arabic speech recognition.
327-332

- Ngoc Thang Vu, Tanja Schultz:
Vietnamese large vocabulary continuous speech recognition.
333-338

- Kris Demuynck, Antti Puurula, Dirk Van Compernolle, Patrick Wambacq:
The ESAT 2008 system for N-Best Dutch speech recognition benchmark.
339-344

- Daniel Vásquez, Guillermo Aradilla, Rainer Gruhn, Wolfgang Minker:
On speeding phoneme recognition in a hierarchical MLP structure.
345-348

- Martin Wöllmer, Florian Eyben, Björn Schuller, Gerhard Rigoll:
Robust vocabulary independent keyword spotting with graphical models.
349-353

- Hasim Sak, Murat Saraclar, Tunga Güngör:
Integrating morphology into automatic speech recognition.
354-358

- Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny:
An exploration of large vocabulary tools for small vocabulary phonetic recognition.
359-364

- Joel Pinto, Mathew Magimai-Doss, Hervé Bourlard:
MLP based hierarchical system for task adaptation in ASR.
365-370

- Frederik Stouten, Dominique Fohr, Irina Illina:
Detection of OOV words by combining acoustic confidence measures with linguistic features.
371-375

- Florian Eyben, Martin Wöllmer, Björn Schuller, Alex Graves:
From speech to letters - using a novel neural network architecture for grapheme based ASR.
376-380

- Hui Lin, Jeff A. Bilmes, Shasha Xie:
Graph-based submodular selection for extractive summarization.
381-386

- Shasha Xie, Dilek Hakkani-Tür, Benoît Favre, Yang Liu:
Integrating prosodic features in extractive meeting summarization.
387-391

- Justin Jian Zhang, Ricky Ho Yin Chan, Pascale Fung:
Extractive speech summarization by active learning.
392-397

- Yaodong Zhang, James R. Glass:
Unsupervised spoken keyword spotting via segmental DTW on Gaussian posteriorgrams.
398-403

- Carolina Parada, Abhinav Sethy, Bhuvana Ramabhadran:
Query-by-example Spoken Term Detection For OOV terms.
404-409

- Hung-yi Lee, Yueh-Lien Tang, Hao Tang, Lin-Shan Lee:
Spoken term detection from bilingual spontaneous speech using code-switched lattice-based structures for words and subword units.
410-415

- Upendra V. Chaudhari, Michael Picheny:
Improved vocabulary independent search with approximate match based on Conditional Random Fields.
416-420

- Timothy J. Hazen, Wade Shen, Christopher M. White:
Query-by-example spoken term detection using phonetic posteriorgram templates.
421-426

- Doris Baum:
Topic-based speaker recognition for German parliamentary speeches.
427-431

- David Imseng, Gerald Friedland:
Robust Speaker Diarization for short speech recordings.
432-437

- Silvia Quarteroni, Marco Dinarelli, Giuseppe Riccardi:
Ontology-based grounding of Spoken Language Understanding.
438-443

- Pierre Gotab, Frédéric Béchet, Géraldine Damnati:
Active learning for rule-based and corpus-based Spoken Language Understanding models.
444-449

- Peter A. Heeman:
Representing the Reinforcement Learning state in a negotiation dialogue.
450-455

- Milica Gasic, Fabrice Lefèvre, Filip Jurcícek, Simon Keizer, François Mairesse, Blaise Thomson, Kai Yu, Steve Young:
Back-off action selection in summary space-based POMDP dialogue systems.
456-461

- Ryota Nishimura, Seiichi Nakagawa:
Response timing generation and response type selection for a spontaneous spoken dialog system.
462-467

- Michael Levit, Shuangyu Chang, Bruce Buntschuh:
Garbage modeling with decoys for a sequential recognition scenario.
468-473

- Cheongjae Lee, Sungjin Lee, Sangkeun Jung, Kyungduk Kim, Donghyeon Lee, Gary Geunbae Lee:
Correlation-based query relaxation for example-based dialog modeling.
474-478

- Sebastian Varges, Giuseppe Riccardi, Silvia Quarteroni, Alexei V. Ivanov:
The exploration/exploitation trade-off in Reinforcement Learning for dialogue management.
479-484

- Kofi Boakye, Benoît Favre, Dilek Hakkani-Tür:
Any questions? Automatic question detection in meetings.
485-489

- Chiori Hori, Kiyonori Ohtake, Teruhisa Misu, Hideki Kashioka, Satoshi Nakamura:
Weighted finite state transducer based statistical dialog management.
490-495

- Matthias Paulik, Alex Waibel:
Automatic translation from parallel speech: Simultaneous interpretation as MT training data.
496-501

- Jia Cui, Yonggang Deng, Bowen Zhou:
Reinforcing language model for speech translation with auxiliary data.
502-506

- Sakriani Sakti, Noriyuki Kimura, Michael Paul, Chiori Hori, Eiichiro Sumita, Satoshi Nakamura, Jun Park, Chai Wutiwiwatchai, Bo Xu, Hammam Riza, Karunesh Arora, Chi Mai Luong, Haizhou Li:
The Asian network-based speech-to-speech translation system.
507-512

- Bing Xiang, Bowen Zhou, Martin Cmejrek:
Towards integrated machine translation using structural alignment from syntax-augmented synchronous parsing.
513-518

- Daniele Falavigna, Matteo Gerosa, Roberto Gretter, Diego Giuliani:
Phone-to-word decoding through statistical machine translation and complementary system combination.
519-524

- Hassan Al-Haj, Roger Hsiao, Ian R. Lane, Alan W. Black, Alex Waibel:
Pronunciation modeling for dialectal Arabic speech recognition.
525-528

- Qin Jin, Arthur R. Toth, Tanja Schultz, Alan W. Black:
Speaker de-identification via voice transformation.
529-533

- Michael Feld, Etienne Barnard, Charl Johannes van Heerden, Christian A. Müller:
Multilingual speaker age recognition: Regression analyses on the Lwazi corpus.
534-539

- Fernando Batista, Isabel Trancoso, Nuno J. Mamede:
Comparing automatic rich transcription for Portuguese, Spanish and English Broadcast News.
540-545

- Khe Chai Sim:
Discriminative Product-of-Expert acoustic mapping for cross-lingual phone recognition.
546-551

- Björn Schuller, Bogdan Vlasenko, Florian Eyben, Gerhard Rigoll, Andreas Wendemuth:
Acoustic emotion recognition: A benchmark comparison of performances.
552-557

- Richard Dufour, Yannick Estève, Paul Deléglise, Frédéric Béchet:
Local and global models for spontaneous speech segment detection and characterization.
558-561

- Timo Mertens, Daniel Schneider, Arild Brandrud Næss, Torbjørn Svendsen:
Lexicon adaptation for subword speech recognition.
562-567

- Kartik Audhkhasi, Panayiotis G. Georgiou, Shrikanth S. Narayanan:
Lattice-based lexical cues for word fragment detection in conversational speech.
568-573

- Masayuki Suzuki, Nobuaki Minematsu, Dean Luo, Keikichi Hirose:
Sub-structure-based estimation of pronunciation proficiency and classification of learners.
574-579

- Joost van Doremalen, Catia Cucchiarini, Helmer Strik:
Automatic detection of vowel pronunciation errors using multiple information sources.
580-585

- Wenzhu Shen, Roger Peng Yu, Frank Seide, Ji Wu:
Automatic punctuation generation for speech.
586-589
