INTERSPEECH 2005:
Lisbon, Portugal
INTERSPEECH 2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, Lisbon, Portugal, September 4-8, 2005.
ISCA 2005
Keynote Papers
- Graeme M. Clark:
The multiple-channel cochlear implant: interfacing electronic technology to human consciousness.
1-4

Speech Recognition - Language Modelling I-III
Prosody in Language Performance I, II
Spoken Language Extraction / Retrieval I, II
- Olivier Siohan, Michiel Bacchiani:
Fast vocabulary-independent audio search using path-based graph indexing.
53-56

- John Makhoul, Alex Baron, Ivan Bulyko, Long Nguyen, Lance A. Ramshaw, David Stallard, Richard M. Schwartz, Bing Xiang:
The effects of speech recognition and punctuation on information extraction performance.
57-60

- Ciprian Chelba, Alex Acero:
Indexing uncertainty for spoken document search.
61-64

- Tomoyosi Akiba, Hiroyuki Abe:
Exploiting passage retrieval for n-best rescoring of spoken questions.
65-68

- BalaKrishna Kolluru, Heidi Christensen, Yoshihiko Gotoh:
Multi-stage compaction approach to broadcast news summarisation.
69-72

- Chien-Lin Huang, Chia-Hsin Hsieh, Chung-Hsien Wu:
Audio-video summarization of TV news using speech recognition and shot change detection.
73-76

The Blizzard Challenge 2005
- Alan W. Black, Keiichi Tokuda:
The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasets.
77-80

- Shinsuke Sakai, Han Shu:
A probabilistic approach to unit selection for corpus-based speech synthesis.
81-84

- John Kominek, Christina L. Bennett, Brian Langner, Arthur R. Toth:
The Blizzard Challenge 2005 CMU entry - a method for improving speech synthesis systems.
85-88

- H. Timothy Bunnell, Christopher A. Pennington, Debra Yarrington, John Gray:
Automatic personal synthetic voice construction.
89-92

- Heiga Zen, Tomoki Toda:
An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005.
93-96

- Wael Hamza, Raimo Bakis, Zhiwei Shuang, Heiga Zen:
On building a concatenative speech synthesis system from the Blizzard Challenge speech databases.
97-100

- Robert A. J. Clark, Korin Richmond, Simon King:
Multisyn voices from ARCTIC data for the Blizzard Challenge.
101-104

- Christina L. Bennett:
Large scale evaluation of corpus-based synthesizers: results and lessons from the Blizzard Challenge 2005.
105-108

New Applications
- Berlin Chen, Yi-Ting Chen, Chih-Hao Chang, Hung-Bin Chen:
Speech retrieval of Mandarin broadcast news via mobile devices.
109-112

- Michiaki Katoh, Kiyoshi Yamamoto, Jun Ogata, Takashi Yoshimura, Futoshi Asano, Hideki Asoh, Nobuhiko Kitawaki:
State estimation of meetings by information fusion using Bayesian network.
113-116

- Roger K. Moore:
Results from a survey of attendees at ASRU 1997 and 2003.
117-120

- Reinhold Haeb-Umbach, Basilis Kladis, Joerg Schmalenstroeer:
Speech processing in the networked home environment - a view on the Amigo project.
121-124

- Masahide Sugiyama:
Fixed distortion segmentation in efficient sound segment searching.
125-128

- Tin Lay Nwe, Haizhou Li:
Identifying singers of popular songs.
129-132

- Jun Ogata, Masataka Goto:
Speech repair: quick error correction just by using selection operation for speech input interfaces.
133-136

- Dirk Olszewski, Fransiskus Prasetyo, Klaus Linhard:
Steerable highly directional audio beam loudspeaker.
137-140

- Hassan Ezzaidi, Jean Rouat:
Automatic music genre classification using second-order statistical measures for the prescriptive approach.
141-144

- Alberto Abad, Dusan Macho, Carlos Segura, Javier Hernando, Climent Nadeu:
Effect of head orientation on the speaker localization performance in smart-room environment.
145-148

- Corinne Fredouille, Gilles Pouchoulin, Jean-François Bonastre, M. Azzarello, Antoine Giovanni, Alain Ghio:
Application of automatic speaker recognition techniques to pathological voice assessment (dysphonia).
149-152

- Upendra V. Chaudhari, Ganesh N. Ramaswamy, Edward A. Epstein, Sasha Caskey, Mohamed Kamal Omar:
Adaptive speech analytics: system, infrastructure, and behavior.
153-156

E-learning and Spoken Language Processing
- Katherine Forbes-Riley, Diane J. Litman:
Correlating student acoustic-prosodic profiles with student learning in spoken tutoring dialogues.
157-160

- Diane J. Litman, Katherine Forbes-Riley:
Speech recognition performance and learning in spoken dialogue tutoring.
161-164

- Satoshi Asakawa, Nobuaki Minematsu, Toshiko Isei-Jaakkola, Keikichi Hirose:
Structural representation of the non-native pronunciations.
165-168

- Fu-Chiang Chou:
Ya-ya language box - a portable device for English pronunciation training with speech recognition technologies.
169-172

- Akinori Ito, Yen-Ling Lim, Motoyuki Suzuki, Shozo Makino:
Pronunciation error detection method based on error rule clustering using a decision tree.
173-176

- Abhinav Sethy, Shrikanth Narayanan, Nicolaus Mote, W. Lewis Johnson:
Modeling and automating detection of errors in Arabic language learner speech.
177-180

- Felicia Zhang, Michael Wagner:
Effects of F0 feedback on the learning of Chinese tones by native speakers of English.
181-184

E-inclusion and Spoken Language Processing I, II
- Tom Brøndsted, Erik Aaskoven:
Voice-controlled internet browsing for motor-handicapped users. design and implementation issues.
185-188

- Briony Williams, Delyth Prys, Ailbhe Ní Chasaide:
Creating an ongoing research capability in speech technology for two minority languages: experiences from the WISPR project.
189-192

- Anestis Vovos, Basilis Kladis, Nikolaos D. Fakotakis:
Speech operated smart-home control system for users with special needs.
193-196

- Takatoshi Jitsuhiro, Shigeki Matsuda, Yutaka Ashikari, Satoshi Nakamura, Ikuko Eguchi Yairi, Seiji Igi:
Spoken dialog system and its evaluation of geographic information system for elderly persons' mobility support.
197-200

- Daniele Falavigna, Toni Giorgino, Roberto Gretter:
A frame based spoken dialog system for home care.
201-204

Acoustic Processing for ASR I-III
- Matthias Wölfel:
Frame based model order selection of spectral envelopes.
205-208

- Vivek Tyagi, Christian Wellekens, Hervé Bourlard:
On variable-scale piecewise stationary spectral analysis of speech signals for ASR.
209-212

- Arlo Faria, David Gelbart:
Efficient pitch-based estimation of VTLN warp factors.
213-216

- Yanli Zheng, Richard Sproat, Liang Gu, Izhak Shafran, Haolang Zhou, Yi Su, Daniel Jurafsky, Rebecca Starr, Su-Youn Yoon:
Accent detection and speech recognition for Shanghai-accented Mandarin.
217-220

- Loïc Barrault, Renato de Mori, Roberto Gemello, Franco Mana, Driss Matrouf:
Variability of automatic speech recognition systems using different features.
221-224

- Slavomír Lihan, Jozef Juhar, Anton Cizmar:
Crosslingual and bilingual speech recognition with Slovak and Czech SpeechDat-E databases.
225-228

- Carmen Peláez-Moreno, Qifeng Zhu, Barry Y. Chen, Nelson Morgan:
Automatic data selection for MLP-based feature extraction for ASR.
229-232

- Thilo Köhler, Christian Fügen, Sebastian Stüker, Alex Waibel:
Rapid porting of ASR-systems to mobile devices.
233-236

- Hugo Meinedo, João Paulo Neto:
A stream-based audio segmentation, classification and clustering pre-processing system for broadcast news using ANN models.
237-240

- Etienne Marcheret, Karthik Visweswariah, Gerasimos Potamianos:
Speech activity detection fusing acoustic phonetic and energy features.
241-244

- Zoltán Tüske, Péter Mihajlik, Zoltán Tobler, Tibor Fegyó:
Robust voice activity detection based on the entropy of noise-suppressed spectrum.
245-248

- Masamitsu Murase, Shun'ichi Yamamoto, Jean-Marc Valin, Kazuhiro Nakadai, Kentaro Yamada, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno:
Multiple moving speaker tracking by microphone array on mobile robot.
249-252

Speech Recognition - Adaptation I, II
- Yaxin Zhang, Bian Wu, Xiaolin Ren, Xin He:
A speaker biased SI recognizer for embedded mobile applications.
253-256

- Bart Bakker, Carsten Meyer, Xavier L. Aubert:
Fast unsupervised speaker adaptation through a discriminative eigen-MLLR algorithm.
257-260

- Rusheng Hu, Jian Xue, Yunxin Zhao:
Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applications.
261-264

- Giulia Garau, Steve Renals, Thomas Hain:
Applying vocal tract length normalization to meeting recordings.
265-268

- S. Umesh, András Zolnay, Hermann Ney:
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC.
269-272

- Xiaodong Cui, Abeer Alwan:
MLLR-like speaker adaptation based on linearization of VTLN with MFCC features.
273-276

- Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama:
Model adaptation by state splitting of HMM for long reverberation.
277-280

- Daben Liu, Daniel Kiecza, Amit Srivastava, Francis Kubala:
Online speaker adaptation and tracking for real-time speech recognition.
281-284

- Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Automatic speech recognition based on adaptation and clustering using temporal-difference learning.
285-288

- Hui Ye, Steve Young:
Improving the speech recognition performance of beginners in spoken conversational interaction for language learning.
289-292

- Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano:
Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments.
293-296

- Dong-jin Choi, Yung-Hwan Oh:
Rapid speaker adaptation for continuous speech recognition using merging eigenvoices.
297-300

Signal Analysis, Processing and Feature Estimation I-III
- Jian Liu, Thomas Fang Zheng, Jing Deng, Wenhu Wu:
Real-time pitch tracking based on combined SMDSF.
301-304

- András Bánhalmi, Kornél Kovács, András Kocsor, László Tóth:
Fundamental frequency estimation by least-squares harmonic model fitting.
305-308

- Siu Wa Lee, Frank K. Soong, Pak-Chung Ching:
Harmonic filtering for joint estimation of pitch and voiced source with single-microphone input.
309-312

- Marián Képesi, Luis Weruaga:
High-resolution noise-robust spectral-based pitch estimation.
313-316

- John-Paul Hosom:
F0 estimation for adult and children's speech.
317-320

- Ben Milner, Xu Shao, Jonathan Darch:
Fundamental frequency and voicing prediction from MFCCs for speech reconstruction from unconstrained speech.
321-324

- Nelly Barbot, Olivier Boëffard, Damien Lolive:
F0 stylisation with a free-knot B-spline model and simulated-annealing optimization.
325-328

- Friedhelm R. Drepper:
Voiced excitation as entrained primary response of a reconstructed glottal master oscillator.
329-332

- Damien Vincent, Olivier Rosec, Thierry Chonavel:
Estimation of LF glottal source parameters based on an ARX model.
333-336

- Leigh D. Alsteris, Kuldip K. Paliwal:
Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra.
337-340

- R. Muralishankar, Abhijeet Sangwan, Douglas D. O'Shaughnessy:
Statistical properties of the warped discrete cosine transform cepstrum compared with MFCC.
341-344

- Aníbal J. S. Ferreira:
New signal features for robust identification of isolated vowels.
345-348

- Jonathan Pincas, Philip J. B. Jackson:
Amplitude modulation of frication noise by voicing saturates.
349-352

- Ron M. Hecht, Naftali Tishby:
Extraction of relevant speech features using the information bottleneck method.
353-356

- Mohammad Firouzmand, Laurent Girin, Sylvain Marchand:
Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech.
357-360

- Hynek Hermansky, Petr Fousek:
Multi-resolution RASTA filtering for TANDEM-based ASR.
361-364

- Woojay Jeon, Biing-Hwang Juang:
A category-dependent feature selection method for speech signals.
365-368

- Trausti T. Kristjansson, Sabine Deligne, Peder A. Olsen:
Voicing features for robust speech detection.
369-372

Robust Speech Recognition I-IV
- Svein Gunnar Pettersen, Magne Hallstein Johnsen, Tor André Myrvoll:
Joint Bayesian predictive classification and parallel model combination for robust speech recognition.
373-376

- Glauco F. G. Yared, Fábio Violaro, Lívio C. Sousa:
Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systems.
377-380

- Luis Buera, Eduardo Lleida, Antonio Miguel, Alfonso Ortega:
Robust speech recognition in cars using phoneme dependent multi-environment linear normalization.
381-384

- Yi Chen, Lin-Shan Lee:
Energy-based frame selection for reliable feature normalization and transformation in robust speech recognition.
385-388

- Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell:
Remodeling of the sensor for non-audible murmur (NAM).
389-392

- Amarnag Subramanya, Jeff Bilmes, Chia-Ping Chen:
Focused word segmentation for ASR.
393-396

Speech Perception I, II
Spoken Language Understanding I, II
- Ian R. Lane, Tatsuya Kawahara:
Utterance verification incorporating in-domain confidence and discourse coherence measures.
421-424

- Constantinos Boulis, Mari Ostendorf:
Using symbolic prominence to help design feature subsets for topic classification and clustering of natural human-human conversations.
425-428

- Katsuhito Sudoh, Hajime Tsukada:
Tightly integrated spoken language understanding using word-to-concept translation.
429-432

- Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Vaibhava Goel, Yuqing Gao:
Exploiting unlabeled data using multiple classifiers for improved natural language call-routing.
433-436

- Hong-Kwang Jeff Kuo, Vaibhava Goel:
Active learning with minimum expected error for spoken language understanding.
437-440

- Matthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske:
Lexical out-of-vocabulary models for one-stage speech interpretation.
441-444

E-inclusion and Spoken Language Processing I, II
Paralinguistic and Nonlinguistic Information in Speech
- Nick Campbell, Hideki Kashioka, Ryo Ohara:
No laughing matter.
465-468

- Christophe Blouin, Valérie Maffiolo:
A study on the automatic detection and characterization of emotion in a voice service context.
469-472

- Raul Fernandez, Rosalind W. Picard:
Classical and novel discriminant features for affect recognition from speech.
473-476

- Jaroslaw Cichosz, Krzysztof Slot:
Low-dimensional feature space derivation for emotion recognition.
477-480

- Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita:
Proposal of acoustic measures for automatic detection of vocal fry.
481-484

- Khiet P. Truong, David A. van Leeuwen:
Automatic detection of laughter.
485-488

- Anton Batliner, Stefan Steidl, Christian Hacker, Elmar Nöth, Heinrich Niemann:
Tales of tuning - prototyping for automatic classification of emotional user states.
489-492

- Iker Luengo, Eva Navas, Inmaculada Hernáez, Jon Sánchez:
Automatic emotion recognition using prosodic parameters.
493-496

- Sungbok Lee, Serdar Yildirim, Abe Kazemzadeh, Shrikanth Narayanan:
An articulatory study of emotional speech production.
497-500

- Gregor Hofer, Korin Richmond, Robert A. J. Clark:
Informed blending of databases for emotional speech synthesis.
501-504

- Fabio Tesser, Piero Cosi, Carlo Drioli, Graziano Tisato:
Emotional FESTIVAL-MBROLA TTS synthesis.
505-508

- Felix Burkhardt:
Emofilt: the simulation of emotional speech by prosody-transformation.
509-512

- Andrew Rosenberg, Julia Hirschberg:
Acoustic/prosodic and lexical correlates of charismatic speech.
513-516

- Yoko Greenberg, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka:
Communicative speech synthesis using constituent word attributes.
517-520

- Angelika Braun, Matthias Katerbow:
Emotions in dubbed speech: an intercultural approach with respect to F0.
521-524

- Nicolas Audibert, Véronique Aubergé, Albert Rilliard:
The prosodic dimensions of emotion in speech: the relative weights of parameters.
525-528

- Susanne Schötz:
Stimulus duration and type in perception of female and male speaker age.
529-532

- Cecilia Ovesdotter Alm, Richard Sproat:
Perceptions of emotions in expressive storytelling.
533-536

- Hideki Kawahara, Alain de Cheveigné, Hideki Banno, Toru Takahashi, Toshio Irino:
Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT.
537-540

- Tomoko Yonezawa, Noriko Suzuki, Kenji Mase, Kiyoshi Kogure:
Gradually changing expression of singing voice based on morphing.
541-544

Issues in Large Vocabulary Decoding
- I. Lee Hetherington:
A multi-pass, dynamic-vocabulary approach to real-time, large-vocabulary speech recognition.
545-548

- George Saon, Daniel Povey, Geoffrey Zweig:
Anatomy of an extremely fast LVCSR decoder.
549-552

- Dong Yu, Li Deng, Alex Acero:
Evaluation of a long-contextual-span hidden trajectory model and phonetic recognizer using A* lattice search.
553-556

- Takaaki Hori, Atsushi Nakamura:
Generalized fast on-the-fly composition algorithm for WFST-based speech recognition.
557-560

- Hiroaki Nanjo, Teruhisa Misu, Tatsuya Kawahara:
Minimum Bayes-risk decoding considering word significance for information retrieval system.
561-564

- Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky:
On improvements to CI-based GMM selection.
565-568

- Dominique Massonié, Pascal Nocera, Georges Linares:
Scalable language model look-ahead for LVCSR.
569-572

- Miroslav Novak:
Memory efficient approximative lattice generation for grammar based decoding.
573-576

- Dong-Hoon Ahn, Su-Byeong Oh, Minhwa Chung:
Improved semi-dynamic network decoding using WFSTs.
577-580

- Janne Pylkkönen:
New pruning criteria for efficient decoding.
581-584

- Tibor Fábián, Robert Lieb, Günther Ruske, Matthias Thomae:
A confidence-guided dynamic pruning approach - utilization of confidence measurement in speech recognition.
585-588

Spoken Language Extraction / Retrieval I, II
- Toru Taniguchi, Akishige Adachi, Shigeki Okawa, Masaaki Honda, Katsuhiko Shirai:
Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals.
589-592

- Gabriel Murray, Steve Renals, Jean Carletta:
Extractive summarization of meeting recordings.
593-596

- Arjan van Hessen, Jaap Hinke:
IR-based classification of customer-agent phone calls.
597-600

- Benoît Favre, Frédéric Béchet, Pascal Nocera:
Mining broadcast news data: robust information extraction from word lattices.
601-604

- Mikko Kurimo, Ville T. Turunen:
To recover from speech recognition errors in spoken document retrieval.
605-608

- Edgar González, Jordi Turmo:
Unsupervised clustering of spontaneous speech documents.
609-612

- Masahide Yamaguchi, Masaru Yamashita, Shoichi Matsunaga:
Spectral cross-correlation features for audio indexing of broadcast news and meetings.
613-616

- Chiori Hori, Alex Waibel:
Spontaneous speech consolidation for spoken language applications.
617-620

- Sameer Maskey, Julia Hirschberg:
Comparing lexical, acoustic/prosodic, structural and discourse features for speech summarization.
621-624

- Te-Hsuan Li, Ming-Han Lee, Berlin Chen, Lin-Shan Lee:
Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications.
625-628

- Janez Zibert, France Mihelic, Jean-Pierre Martens, Hugo Meinedo, João Paulo Neto, Laura Docío Fernández, Carmen García-Mateo, Petr David, Jindrich Zdánský, Matús Pleva, Anton Cizmar, Andrej Zgank, Zdravko Kacic, Csaba Teleki, Klára Vicsi:
The COST278 broadcast news segmentation and speaker clustering evaluation - overview, methodology, systems, results.
629-632

- Igor Szöke, Petr Schwarz, Pavel Matejka, Lukas Burget, Martin Karafiát, Michal Fapso, Jan Cernocký:
Comparison of keyword spotting approaches for informal continuous speech.
633-636

- Teruhisa Misu, Tatsuya Kawahara:
Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
637-640

- Nicolas Moreau, Shan Jin, Thomas Sikora:
Comparison of different phone-based spoken document retrieval methods with text and spoken queries.
641-644

Signal Analysis, Processing and Feature Estimation I-III
- Pedro Gómez Vilda, Francisco Díaz, Agustín Álvarez Marquina, Rafael Martínez, Victoria Rodellar, Roberto Fernández-Baíllo, Alberto Nieto, Francisco J. Fernandez:
PCA of perturbation parameters in voice pathology detection.
645-648

- Anindya Sarkar, T. V. Sreenivas:
Dynamic programming based segmentation approach to LSF matrix reconstruction.
649-652

- T. Nagarajan, Douglas D. O'Shaughnessy:
Explicit segmentation of speech based on frequency-domain AR modeling.
653-656

- Petr Motlícek, Lukás Burget, Jan Cernocký:
Non-parametric speaker turn segmentation of meeting data.
657-660

- Petri Korhonen, Unto K. Laine:
Unsupervised segmentation of continuous speech using vector autoregressive time-frequency modeling errors.
661-664

- P. Vijayalakshmi, M. Ramasubba Reddy:
The analysis on band-limited hypernasal speech using group delay based formant extraction technique.
665-668

- Jindrich Zdánský, Jan Nouza:
Detection of acoustic change-points in audio records via global BIC maximization and dynamic programming.
669-672

- Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu:
Multi-band approach of audio source discrimination with empirical mode decomposition.
673-676

- Minoru Tsuzaki, Satomi Tanaka, Hiroaki Kato, Yoshinori Sagisaka:
Application of auditory image model for speech event detection.
677-680

- José Anibal Arias:
Unsupervised identification of speech segments using kernel methods for clustering.
681-684

- Georgios Evangelopoulos, Petros Maragos:
Speech event detection using multiband modulation energy.
685-688

- John Kominek, Alan W. Black:
Measuring unsupervised acoustic clustering through phoneme pair merge-and-split tests.
689-692

- Fabio Valente, Christian Wellekens:
Variational Bayesian speaker change detection.
693-696

- Sarah Borys, Mark Hasegawa-Johnson:
Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speech.
697-700

- P. Vijayalakshmi, M. Ramasubba Reddy:
Detection of hypernasality using statistical pattern classifiers.
701-704

- Luis Weruaga, Marián Képesi:
Self-organizing chirp-sensitive artificial auditory cortical model.
705-708

- Sotiris Karabetsos, Pirros Tsiakoulis, Stavroula-Evita Fotinea, Ioannis Dologlou:
On the use of a decimative spectral estimation method based on eigenanalysis and SVD for formant and bandwidth tracking of speech signals.
709-712

- Alexei V. Ivanov, Marek Parfieniuk, Alexander A. Petrovsky:
Frequency-domain auditory suppression modelling (FASM) - a WDFT-based anthropomorphic noise-robust feature extraction algorithm for speech recognition.
713-716

Keynote Papers
Speech Recognition - Language Modelling I-III
Spoken Language Acquisition, Development and Learning I, II
Multi-modal / Multi-media Processing I, II
- Nick Campbell:
Non-verbal speech processing for a communicative agent.
769-772

- Stuart N. Wrigley, Guy J. Brown:
Physiologically motivated audio-visual localisation and tracking.
773-776

- Jing Huang, Daniel Povey:
Discriminatively trained features using fMPE for multi-stream audio-visual speech recognition.
777-780

- Graziano Tisato, Piero Cosi, Carlo Drioli, Fabio Tesser:
INTERFACE: a new tool for building emotive/expressive talking heads.
781-784

- Pascual Ejarque, Javier Hernando:
Variance reduction by using separate genuine-impostor statistics in multimodal biometrics.
785-788

- Volker Schubert, Stefan W. Hamerich:
The dialog application metalanguage GDialogXML.
789-792

- Jonas Beskow, Mikael Nordenberg:
Data-driven synthesis of expressive visual speech using an MPEG-4 talking head.
793-796

- Oytun Türk, Marc Schröder, Baris Bozkurt, Levent M. Arslan:
Voice quality interpolation for emotional text-to-speech synthesis.
797-800

- Murtaza Bulut, Carlos Busso, Serdar Yildirim, Abe Kazemzadeh, Chul Min Lee, Sungbok Lee, Shrikanth Narayanan:
Investigating the role of phoneme-level modifications in emotional speech resynthesis.
801-804

- Björn Schuller, Ronald Müller, Manfred K. Lang, Gerhard Rigoll:
Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles.
805-808

- Jonghwa Kim, Elisabeth André, Matthias Rehm, Thurid Vogt, Johannes Wagner:
Integrating information from speech and physiological signals to achieve emotional sensitivity.
809-812

- Ellen Douglas-Cowie, Laurence Devillers, Jean-Claude Martin, Roddy Cowie, Suzie Savvidou, Sarkis Abrilian, Cate Cox:
Multimodal databases of everyday emotion: facing up to complexity.
813-816

Spoken / Multi-modal Dialogue Systems I, II
- Francisco Torres, Emilio Sanchis, Encarna Segarra:
Learning of stochastic dialog models through a dialog simulation technique.
817-820

- Lesley-Ann Black, Michael F. McTear, Norman D. Black, Roy Harper, Michelle Lemon:
Evaluating the DI@l-log system on a cohort of elderly, diabetic patients: results from a preliminary study.
821-824

- Pavel Král, Christophe Cerisara, Jana Klecková:
Combination of classifiers for automatic recognition of dialog acts.
825-828

- Xiaojun Wu, Thomas Fang Zheng, Michael Brasser, Zhanjiang Song:
Rapidly developing spoken Chinese dialogue systems with the d-ear SDS SDK.
829-832

- Daniela Oria, Akos Vetek:
Robust algorithms and interaction strategies for voice spelling.
833-836

- Ioannis Toptsis, Axel Haasch, Sonja Hüwel, Jannik Fritsch, Gernot A. Fink:
Modality integration and dialog management for a robotic assistant.
837-840

- Norbert Reithinger, Daniel Sonntag:
An integration framework for a mobile multimodal dialogue system accessing the semantic web.
841-844

- Ryuichi Nisimura, Akinobu Lee, Masashi Yamada, Kiyohiro Shikano:
Operating a public spoken guidance system in real environment.
845-848

- Esa-Pekka Salonen, Markku Turunen, Jaakko Hakulinen, Leena Helin, Perttu Prusi, Anssi Kainulainen:
Distributed dialogue management for smart terminal devices.
849-852

- Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen:
Visualization of spoken dialogue systems for demonstration, debugging and tutoring.
853-856

- César González Ferreras, Valentín Cardeñoso-Payo:
Development and evaluation of a spoken dialog system to access a newspaper web site.
857-860

- Olivier Pietquin, Richard Beaufort:
Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning.
861-864

- Shiu-Wah Chu, Ian M. O'Neill, Philip Hanna, Michael F. McTear:
An approach to multi-strategy dialogue management.
865-868

- Anna Hjalmarsson:
Towards user modelling in conversational dialogue systems: a qualitative study of the dynamics of dialogue parameters.
869-872

- Kouichi Katsurada, Kazumine Aoki, Hirobumi Yamada, Tsuneo Nitta:
Reducing the description amount in authoring MMI applications.
873-876

- Kazunori Komatani, Naoyuki Kanda, Tetsuya Ogata, Hiroshi G. Okuno:
Contextual constraints based on dialogue models in database search task for spoken dialogue systems.
877-880

- Mihai Rotaru, Diane J. Litman:
Using word-level pitch features to better predict student emotions during spoken tutoring dialogues.
881-884

- Antoine Raux, Brian Langner, Dan Bohus, Alan W. Black, Maxine Eskenazi:
Let's go public! taking a spoken dialog system to the real world.
885-888

- Shinya Fujie, Kenta Fukushima, Tetsunori Kobayashi:
Back-channel feedback generation using linguistic and nonlinguistic information and its application to spoken dialogue system.
889-892

- Kallirroi Georgila, James Henderson, Oliver Lemon:
Learning user simulations for information state update dialogue systems.
893-896

- Darío Martín-Iglesias, Yago Pereiro-Estevan, Ana I. García-Moral, Ascensión Gallardo-Antolín, Fernando Díaz-de-María:
Design of a voice-enabled interface for real-time access to stock exchange from a PDA through GPRS.
897-900

- William Schuler, Tim Miller:
Integrating denotational meaning into a DBN language model.
901-904

- Louis ten Bosch:
Improving out-of-coverage language modelling in a multimodal dialogue system using small training sets.
905-908

- Olivier Galibert, Gabriel Illouz, Sophie Rosset:
Ritel: an open-domain, human-computer dialog system.
909-912

Robust Speech Recognition I-IV
- Reinhold Haeb-Umbach, Joerg Schmalenstroeer:
A comparison of particle filtering variants for speech feature enhancement.
913-916

- Ilyas Potamitis, Nikolaos D. Fakotakis:
Enhancement of mel log-power spectrum of speech using particle filtering.
917-920

- Makoto Shozakai, Goshu Nagino:
Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualization.
921-924

- Woohyung Lim, Bong Kyoung Kim, Nam Soo Kim:
Feature compensation based on switching linear dynamic model and soft decision.
925-928

- Shilei Huang, Xiang Xie, Jingming Kuang:
Using output probability distribution for improving speech recognition in adverse environment.
929-932

- Eric H. C. Choi:
A generalized framework for compensation of mel-filterbank outputs in feature extraction for robust ASR.
933-936

- Hesham Tolba, Zili Li, Douglas D. O'Shaughnessy:
Robust automatic speech recognition using a perceptually-based optimal spectral amplitude estimator speech enhancement algorithm in various low-SNR environments.
937-940

- Stephen So, Kuldip K. Paliwal:
Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies.
941-944

- Babak Nasersharif, Ahmad Akbari:
Sub-band weighted projection measure for robust sub-band speech recognition.
945-948

- Jianping Deng, Martin Bouchard, Tet Hin Yeap:
Noise compensation using interacting multiple Kalman filters.
949-952

- Veronique Stouten, Hugo Van Hamme, Patrick Wambacq:
Kalman and unscented Kalman filter feature enhancement for noise robust ASR.
953-956

- Chia-Yu Wan, Lin-Shan Lee:
Histogram-based quantization (HQ) for robust and scalable distributed speech recognition.
957-960

- Yong-Joo Chung:
A data-driven approach for the model parameter compensation in noisy speech recognition.
961-964

- Satoshi Kobashikawa, Satoshi Takahashi, Yoshikazu Yamaguchi, Atsunori Ogawa:
Rapid response and robust speech recognition by preliminary model adaptation for additive and convolutional noise.
965-968

- Saurabh Prasad, Stephen A. Zahorian:
Nonlinear and linear transformations of speech features to compensate for channel and noise effects.
969-972

- Motoyuki Suzuki, Yusuke Kato, Akinori Ito, Shozo Makino:
Construction method of acoustic models dealing with various background noises based on combination of HMMs.
973-976

- Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg:
Robust speech recognition based on noise and SNR classification - a multiple-model framework.
977-980

- Hwa Jeon Song, Hyung Soon Kim:
Eigen-environment based noise compensation method for robust speech recognition.
981-984

- Martin Graciarena, Horacio Franco, Gregory K. Myers, Victor Abrash:
Robust feature compensation in nonstationary and multiple noise environments.
985-988

- Jasha Droppo, Alex Acero:
Maximum mutual information SPLICE transform for seen and unseen conditions.
989-992

- Sven E. Krüger, Martin Schafföner, Marcel Katz, Edin Andelic, Andreas Wendemuth:
Speech recognition with support vector machines in a hybrid system.
993-996

- Vincent Barreaud, Douglas D. O'Shaughnessy, Jean-Guy Dahan:
Experiments on speaker profile portability.
997-1000

- Daniele Colibro, Luciano Fissore, Claudio Vair, Emanuele Dalmasso, Pietro Laface:
A confidence measure invariant to language and grammar.
1001-1004

- Ken Schutte, James R. Glass:
Robust detection of sonorant landmarks.
1005-1008

Speech Production I
- Amélie Rochet-Capellan, Jean-Luc Schwartz:
The labial-coronal effect and CVCV stability during reiterant speech production: an acoustic analysis.
1009-1012

- Amélie Rochet-Capellan, Jean-Luc Schwartz:
The labial-coronal effect and CVCV stability during reiterant speech production: an articulatory analysis.
1013-1016

- Mitsuhiro Nakamura:
Articulatory constraints and coronal stops: an EPG study.
1017-1020

- Vincent Robert, Brigitte Wrobel-Dautcourt, Yves Laprie, Anne Bonneau:
Strategies of labial coarticulation.
1021-1024

- Jianwu Dang, Jianguo Wei, Takeharu Suzuki, Pascal Perrier:
Investigation and modeling of coarticulation during speech.
1025-1028

- Fang Hu:
Tongue kinematics in diphthong production in Ningbo Chinese.
1029-1032

- Takayuki Arai:
Comparing tongue positions of vowels in oral and nasal contexts.
1033-1036

- Slim Ouni:
Can we retrieve vocal tract dynamics that produced speech? toward a speaker articulatory strategy model.
1037-1040

- Pascal Perrier, Liang Ma, Yohan Payan:
Modeling the production of VCV sequences via the inversion of a biomechanical model of the tongue.
1041-1044

- Xiaochuan Niu, Alexander Kain, Jan P. H. van Santen:
Estimation of the acoustic properties of the nasal tract during the production of nasalized vowels.
1045-1048

- Kohichi Ogata:
A web-based articulatory speech synthesis system for distance education.
1049-1052

- Paavo Alku, Matti Airas, Tomas Bäckström, Hannu Pulakka:
Group delay function as a means to assess quality of glottal inverse filtering.
1053-1056

- Eva Björkner, Johan Sundberg, Paavo Alku:
Subglottal pressure and NAQ variation in voice production of classically trained baritone singers.
1057-1060

- Gunnar Fant, Anita Kruckenberg:
Covariation of subglottal pressure, F0 and intensity.
1061-1064

- Javier Pérez, Antonio Bonafonte:
Automatic voice-source parameterization of natural speech.
1065-1068

- Chakir Zeroual, John H. Esling, Lise Crevier-Buchman:
Physiological study of whispered speech in Moroccan Arabic.
1069-1072

- C. P. Moura, D. Andrade, L. M. Cunha, M. J. Cunha, H. Vilarinho, H. Barros, Diamantino Freitas, M. Pais-Clemente:
Voice quality in Down syndrome children treated with rapid maxillary expansion.
1073-1076

- Julien Hanquinet, Francis Grenez, Jean Schoentgen:
Synthesis of disordered speech.
1077-1080

- Julie Fontecave, Frédéric Berthommier:
Quasi-automatic extraction of tongue movement from a large existing speech cineradiographic database.
1081-1084

- Shimon Sapir, Ravit Cohen Mimran:
The working memory token test (WMTT): preliminary findings in young adults with and without dyslexia.
1085-1088

- Sérgio Paulo, Luís C. Oliveira:
Reducing the corpus-based TTS signal degradation due to speaker's word pronunciations.
1089-1092

- Wai-Sum Lee:
A phonetic study of the "er-hua" rimes in Beijing Mandarin.
1093-1096

Acoustic Processing for ASR I-III
- Li Deng, Dong Yu, Alex Acero:
Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reduction.
1097-1100

- Daniil Kocharov, András Zolnay, Ralf Schlüter, Hermann Ney:
Articulatory motivated acoustic features for speech recognition.
1101-1104

- Shinji Watanabe, Atsushi Nakamura:
Effects of Bayesian predictive classification using variational Bayesian posteriors for sparse training data in speech recognition.
1105-1108

- Yu Tsao, Jinyu Li, Chin-Hui Lee:
A study on separation between acoustic models and its applications.
1109-1112

- Mohamed Afify:
Extended Baum-Welch reestimation of Gaussian mixture models based on reverse Jensen inequality.
1113-1116

- Asela Gunawardana, Milind Mahajan, Alex Acero, John C. Platt:
Hidden conditional random fields for phone classification.
1117-1120

Signal Analysis, Processing and Feature Estimation I-III
- Francesco Gianfelici, Giorgio Biagetti, Paolo Crippa, Claudio Turchetti:
Asymptotically exact AM-FM decomposition based on iterated Hilbert transform.
1121-1124

- Athanassios Katsamanis, Petros Maragos:
Advances in statistical estimation and tracking of AM-FM speech components.
1125-1128

- Jonathan Darch, Ben P. Milner, Saeed Vaseghi:
Formant frequency prediction from MFCC vectors in noisy environments.
1129-1132

- S. R. Mahadeva Prasanna, B. Yegnanarayana:
Detection of vowel onset point events using excitation information.
1133-1136

- João P. Cabral, Luís C. Oliveira:
Pitch-synchronous time-scaling for prosodic and voice quality transformations.
1137-1140

- Yasunori Ohishi, Masataka Goto, Katunobu Itou, Kazuya Takeda:
Discrimination between singing and speaking voices.
1141-1144

Spoken Language Resources and Technology Evaluation I, II
- Douglas A. Jones, Wade Shen, Elizabeth Shriberg, Andreas Stolcke, Teresa M. Kamm, Douglas A. Reynolds:
Two experiments comparing reading with listening for human processing of conversational telephone speech.
1145-1148

- Sylvain Galliano, Edouard Geoffrois, Djamel Mostefa, Khalid Choukri, Jean-François Bonastre, Guillaume Gravier:
The ESTER phase II evaluation campaign for the rich transcription of French broadcast news.
1149-1152

- Takashi Saito:
A method of multi-layered speech segmentation tailored for speech synthesis.
1153-1156

- Sérgio Paulo, Luís C. Oliveira:
Generation of word alternative pronunciations using weighted finite state transducers.
1157-1160

- Helmer Strik, Diana Binnenpoorte, Catia Cucchiarini:
Multiword expressions in spontaneous speech: do we really speak like that?
1161-1164

- Jáchym Kolár, Jan Svec, Stephanie Strassel, Christopher Walker, Dagmar Kozlíková, Josef Psutka:
Czech spontaneous speech corpus with structural metadata.
1165-1168

Early Language Acquisition
Multi-modal / Multi-media Processing I, II
- Raghunandan S. Kumaran, Karthik Narayanan, John N. Gowdy:
Myoelectric signals for multimodal speech recognition.
1189-1192

- Philippe Daubias:
Is color information really useful for lip-reading? (or what is lost when color is not used).
1193-1196

- Islam Shdaifat, Rolf-Rainer Grigat:
A system for audio-visual speech recognition.
1197-1200

- Norihide Kitaoka, Hironori Oshikawa, Seiichi Nakagawa:
Multimodal interface for organization name input based on combination of isolated word recognition and continuous base-word recognition.
1201-1204

- Yosuke Matsusaka:
Recognition of (3) party conversation using prosody and gaze.
1205-1208

- Dongdong Li, Yingchun Yang, Zhaohui Wu:
Combining voiceprint and face biometrics for speaker identification using SDWS.
1209-1212

- Neil Cooke, Martin Russell:
Using the focus of visual attention to improve spontaneous speech recognition.
1213-1216

- Sabri Gurbuz:
Real-time outer lip contour tracking for HCI applications.
1217-1220

- Jing Huang, Karthik Visweswariah:
Improving lip-reading with feature space transforms for multi-stream audio-visual speech recognition.
1221-1224

- Hansjörg Mixdorff, Denis Burnham, Guillaume Vignali, Patavee Charnvivit:
Are there facial correlates of Thai syllabic tones?
1225-1228

- Rowan Seymour, Ji Ming, Darryl Stewart:
A new posterior based audio-visual integration method for robust speech recognition.
1229-1232

Bridging the Gap ASR-HSR
- Sorin Dusan, Lawrence R. Rabiner:
On integrating insights from human speech perception into automatic speech recognition.
1233-1236

- Odette Scharenborg:
Parallels between HSR and ASR: how ASR can contribute to HSR.
1237-1240

- Louis ten Bosch, Odette Scharenborg:
ASR decoding in a computational model of human word recognition.
1241-1244

- Viktoria Maier, Roger K. Moore:
An investigation into a simulation of episodic memory for automatic speech recognition.
1245-1248

- Eric Fosler-Lussier, C. Anton Rytting, Soundararajan Srinivasan:
Phonetic ignorance is bliss: investigating the effects of phonetic information reduction on ASR performance.
1249-1252

- Marcus Holmberg, David Gelbart, Ulrich Ramacher, Werner Hemmert:
Automatic speech recognition with neural spike trains.
1253-1256

- Michael J. Carey, Tuan P. Quang:
A speech similarity distance weighting for robust recognition.
1257-1260

- Takao Murakami, Kazutaka Maruyama, Nobuaki Minematsu, Keikichi Hirose:
Japanese vowel recognition based on structural representation of speech.
1261-1264

- Soundararajan Srinivasan, DeLiang Wang:
Modeling the perception of multitalker speech.
1265-1268

- Sue Harding, Jon P. Barker, Guy J. Brown:
Binaural feature selection for missing data speech recognition.
1269-1272

- Thorsten Wesker, Bernd T. Meyer, Kirsten Wagener, Jörn Anemüller, Alfred Mertins, Birger Kollmeier:
Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines.
1273-1276

Speech Recognition - Language Modelling I-III
- Jen-Wei Kuo, Berlin Chen:
Minimum word error based discriminative training of language models.
1277-1280

- A. Ghaoui, François Yvon, Chafic Mokbel, Gérard Chollet:
On the use of morphological constraints in n-gram statistical language model.
1281-1284

- Elvira I. Sicilia-Garcia, Ji Ming, F. Jack Smith:
A posteriori multiple word-domain language model.
1285-1288

- Javier Dieguez-Tirado, Carmen García-Mateo, Antonio Cardenal López:
Effective topic-tree based language model adaptation.
1289-1292

- Abhinav Sethy, Panayiotis G. Georgiou, Shrikanth Narayanan:
Building topic specific language models from webdata using competitive models.
1293-1296

- Carlos Troncoso, Tatsuya Kawahara:
Trigger-based language model adaptation for automatic meeting transcription.
1297-1300

- Jacques Duchateau, Dong Hoon Van Uytsel, Hugo Van Hamme, Patrick Wambacq:
Statistical language models for large vocabulary spontaneous speech recognition in Dutch.
1301-1304

- Alexandre Allauzen, Jean-Luc Gauvain:
Diachronic vocabulary adaptation for broadcast news transcription.
1305-1308

- Vesa Siivola, Bryan L. Pellom:
Growing an n-gram language model.
1309-1312

- Harald Hüning, Manuel Kirschner, Fritz Class, André Berton, Udo Haiber:
Embedding grammars into statistical language models.
1313-1316

- Simo Broman, Mikko Kurimo:
Methods for combining language models in speech recognition.
1317-1320

- Airenas Vaiciunas, Gailius Raskinis:
Review of statistical modeling of highly inflected Lithuanian using very large vocabulary.
1321-1324

- Genevieve Gorrell, Brandyn Webb:
Generalized Hebbian algorithm for incremental latent semantic analysis.
1325-1328

- Arnar Thor Jensson, Edward W. D. Whittaker, Koji Iwano, Sadaoki Furui:
Language model adaptation for resource deficient languages using translated data.
1329-1332

- Petra Witschel, Sergey Astrov, Gabriele Bakenecker, Josef G. Bauer, Harald Höge:
POS-based language models for large vocabulary speech recognition on embedded systems.
1333-1336

Speech Recognition - Pronunciation Modelling
- Je Hun Jeon, Minhwa Chung:
Automatic generation of domain-dependent pronunciation lexicon with data-driven rules and rule adaptation.
1337-1340

- Michael Tjalve, Mark Huckvale:
Pronunciation variation modelling using accent features.
1341-1344

- Khiet P. Truong, Ambra Neri, Febe de Wet, Catia Cucchiarini, Helmer Strik:
Automatic detection of frequent pronunciation errors made by L2-learners.
1345-1348

- Josef Psutka, Pavel Ircing, Josef V. Psutka, Jan Hajic, William J. Byrne, Jirí Mírovský:
Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project.
1349-1352

- Stéphane Dupont, Christophe Ris, Laurent Couvreur, Jean-Marc Boite:
A study of implicit and explicit modeling of coarticulation and pronunciation variation.
1353-1356

- Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta:
Detection of coughs from user utterances using imitated phoneme model.
1357-1360

- V. Ramasubramanian, P. Srinivas, T. V. Sreenivas:
Stochastic pronunciation modeling by ergodic-HMM of acoustic sub-word units.
1361-1364

- Chen Liu, Lynette Melnar:
An automated linguistic knowledge-based cross-language transfer method for building acoustic models for a language without native training data.
1365-1368

- Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean Paul Haton:
Fully automated non-native speech recognition using confusion-based acoustic model integration.
1369-1372

Prosodic Structure
- Véronique Aubergé, Albert Rilliard:
The focus prosody: more than a simple binary function.
1373-1376

- Martha Dalton, Ailbhe Ní Chasaide:
Peak timing in two dialects of Connaught Irish.
1377-1380

- Janet Fletcher:
Compound rises and "uptalk" in spoken English.
1381-1384

- Li-chiung Yang:
Duration and the temporal structure of Mandarin discourse.
1385-1388

- Bei Wang:
Prosodic realization of split noun phrases in Mandarin Chinese compared in topic and focus contexts.
1389-1392

- Ziyu Xiong:
Downstep effect on disyllabic words of citation forms in standard Chinese.
1393-1396

- Jinfu Ni, Hisashi Kawai, Keikichi Hirose:
Estimation of intonation variation with constrained tone transformations.
1397-1400

- Ho-hsien Pan:
Voice quality of falling tones in Taiwan Min.
1401-1404

- Chiu-yu Tseng, Bau-Ling Fu:
Duration, intensity and pause predictions in relation to prosody organization.
1405-1408

- Jiahong Yuan, Jason M. Brenier, Daniel Jurafsky:
Pitch accent prediction: effects of genre and speaker.
1409-1412

- Hiroya Fujisaki, Sumio Ohno:
Analysis and modeling of fundamental frequency contours of Hindi utterances.
1413-1416

- Natasha Govender, Etienne Barnard, Marelie H. Davel:
Fundamental frequency and tone in isiZulu: initial experiments.
1417-1420

- Judith Bishop, Marc Peake, Dmitry Sityaev:
Intonational sequences in Tuscan Italian.
1421-1424

- Caterina Petrone:
Effects of raddoppiamento sintattico on tonal alignment in Italian.
1425-1428

- Tomás Dubeda, Jan Votrubec:
Acoustic analysis of Czech stress: intonation, duration and intensity revisited.
1429-1432

- Mohamed Yeou:
Variability of F0 peak alignment in Moroccan Arabic accentual focus.
1433-1436

- Anne Lacheret, Ch. Lyche, Michel Morel:
Phonological analysis of schwa and liaison within the PFC project (phonologie du français contemporain): how determinant are the prosodic factors?
1437-1440

- Plínio A. Barbosa, Pablo Arantes, Alexsandro R. Meireles, Jussara M. Vieira:
Abstractness in speech-metronome synchronisation: P-centres as cyclic attractors.
1441-1444

Applications of Confidence Related Measures to ASR
- Makoto Yamada, Tsuneo Kato, Masaki Naito, Hisashi Kawai:
Improvement of rejection performance of keyword spotting using anti-keywords derived from large vocabulary considering acoustical similarity to keywords.
1445-1448

- Ralf Schlüter, T. Scharrenbach, Volker Steinbiss, Hermann Ney:
Bayes risk minimization using metric loss functions.
1449-1452

- Akio Kobayashi, Kazuo Onoe, Shoei Sato, Toru Imai:
Word error rate minimization using an integrated confidence measure.
1453-1456

- Bin Dong, Qingwei Zhao, Yonghong Yan:
Fast confidence measure algorithm for continuous speech recognition.
1457-1460

- Hamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard:
Developing and enhancing posterior based speech recognition systems.
1461-1464

- Peng Liu, Ye Tian, Jian-Lai Zhou, Frank K. Soong:
Background model based posterior probability for measuring confidence.
1465-1468

Multilingual TTS
- Laura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo:
Foreign accents in synthetic speech: development and evaluation.
1469-1472

- Raul Fernandez, Wei Zhang, Ellen Eide, Raimo Bakis, Wael Hamza, Yi Liu, Michael Picheny, John F. Pitrelli, Yong Qing, Zhiwei Shuang, Li Qin Shen:
Toward multiple-language TTS: experiments in English and Mandarin.
1473-1476

- Javier Latorre, Koji Iwano, Sadaoki Furui:
Cross-language synthesis with a polyglot synthesizer.
1477-1480

- Mucemi Gakuru, Frederick K. Iraki, Roger C. F. Tucker, Ksenia Shalonova, Kamanda Ngugi:
Development of a Kiswahili text to speech system.
1481-1484

- Jaime Botella Ordinas, Volker Fischer, Claire Waast-Richard:
Multilingual models in the IBM bilingual text-to-speech systems.
1485-1488

- Artur Janicki, Piotr Herman:
Reconstruction of Polish diacritics in a text-to-speech system.
1489-1492

Speech Bandwidth Extension
- Hiroyuki Ehara, Toshiyuki Morii, Masahiro Oshikiri, Koji Yoshida, Kouichi Honma:
Design of bandwidth scalable LSF quantization using interframe and intraframe prediction.
1493-1496

- Bernd Geiser, Peter Jax, Peter Vary:
Artificial bandwidth extension of speech supported by watermark-transmitted side information.
1497-1500

- Rongqiang Hu, Venkatesh Krishnan, David V. Anderson:
Speech bandwidth extension by improved codebook mapping towards increased phonetic classification.
1501-1504

- Dhananjay Bansal, Bhiksha Raj, Paris Smaragdis:
Bandwidth expansion of narrowband speech using non-negative matrix factorization.
1505-1508

- Michael L. Seltzer, Alex Acero, Jasha Droppo:
Robust bandwidth extension of noise-corrupted narrowband speech.
1509-1512

- João P. Cabral, Luís C. Oliveira:
Pitch-synchronous time-scaling for high-frequency excitation regeneration.
1513-1516

Spoken Language Resources and Technology Evaluation I, II
- Felix Burkhardt, Astrid Paeschke, M. Rolfes, Walter F. Sendlmeier, Benjamin Weiss:
A database of German emotional speech.
1517-1520

- Philippe Boula de Mareüil, Christophe d'Alessandro, Gérard Bailly, Frédéric Béchet, Marie-Neige Garcia, Michel Morel, Romain Prudon, Jean Véronis:
Evaluating the pronunciation of proper names by four French grapheme-to-phoneme converters.
1521-1524

- Filip Jurcícek, Jirí Zahradil, Libor Jelínek:
A human-human train timetable dialogue corpus.
1525-1528

- Gloria Branco, Luís Almeida, Rui Gomes, Nuno Beires:
A Portuguese spoken and multi-modal dialog corpora.
1529-1532

- Joyce Y. C. Chan, P. C. Ching, Tan Lee:
Development of a Cantonese-English code-mixing speech corpus.
1533-1536

- Andrej Zgank, Darinka Verdonik, Aleksandra Zögling Markus, Zdravko Kacic:
BNSI Slovenian broadcast news database - speech and text corpus.
1537-1540

- Jan Volín, Radek Skarnitzl, Petr Pollák:
Confronting HMM-based phone labelling with human evaluation of speech production.
1541-1544

- Stephanie Strassel, Jáchym Kolár, Zhiyi Song, Leila Barclay, Meghan Lammie Glenn:
Structural metadata annotation: moving beyond English.
1545-1548

- Delphine Charlet, Sacha Krstulovic, Frédéric Bimbot, Olivier Boëffard, Dominique Fohr, Odile Mella, Filip Korkmazsky, Djamel Mostefa, Khalid Choukri, Arnaud Vallée:
Neologos: an optimized database for the development of new speech processing algorithms.
1549-1552

- Cheng-Yuan Lin, Kuan-Ting Chen, Jyh-Shing Roger Jang:
A hybrid approach to automatic segmentation and labeling for Mandarin Chinese speech corpus.
1553-1556

- Yuang-Chin Chiang, Min-Siong Liang, Hong-Yi Lin, Ren-Yuan Lyu:
The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speech.
1557-1560

- Marelie H. Davel, Etienne Barnard:
Bootstrapping pronunciation dictionaries: practical issues.
1561-1564

- Nigel G. Ward, Anais G. Rivera, Karen Ward, David G. Novick:
Root causes of lost time and user stress in a simple dialog system.
1565-1568

- Julie A. Parisi, Douglas Brungart:
Evaluating communication effectiveness in team collaboration.
1569-1572

- David Conejero, Alan Lounds, Carmen García-Mateo, Leandro Rodríguez Liñares, Raquel Mochales, Asunción Moreno:
Bilingual aligned corpora for speech to speech translation for Spanish, English and Catalan.
1573-1576

- Hynek Boril, Petr Pollák:
Design and collection of Czech Lombard speech database.
1577-1580

- Abe Kazemzadeh, Hong You, Markus Iseli, Barbara Jones, Xiaodong Cui, Margaret Heritage, Patti Price, Elaine Andersen, Shrikanth Narayanan, Abeer Alwan:
TBALL data collection: the making of a young children's speech corpus.
1581-1584

- Hitomi Tohyama, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki:
Construction and utilization of bilingual speech corpus for simultaneous machine interpretation research.
1585-1588

- Rebecca A. Bates, Patrick Menning, Elizabeth Willingham, Chad Kuyper:
Meeting acts: a labeling system for group interaction in meetings.
1589-1592

- Marius-Calin Silaghi, Rachna Vargiya:
A new evaluation criteria for keyword spotting techniques and a new algorithm.
1593-1596

- Christoph Draxler, Alexander Steffen:
Phattsessionz: recording 1000 adolescent speakers in schools in Germany.
1597-1600

- Solomon Teferra Abate, Wolfgang Menzel, Bairu Tafila:
An Amharic speech corpus for large vocabulary continuous speech recognition.
1601-1604

- Hans Dolfing, David Reitter, Luís Almeida, Nuno Beires, Michael Cody, Rui Gomes, Kerry Robinson, Roman Zielinski:
The FASiL speech and multimodal corpora.
1605-1608

- Karin Müller:
Revealing phonological similarities between German and Dutch.
1609-1612

Large Vocabulary Speech Recognition Systems
- Dimitra Vergyri, Katrin Kirchhoff, Venkata Ramana Rao Gadde, Andreas Stolcke, Jing Zheng:
Development of a conversational telephone speech recognizer for Levantine Arabic.
1613-1616

- Bhuvana Ramabhadran:
Exploiting large quantities of spontaneous speech for unsupervised training of acoustic models.
1617-1620

- Che-Kuang Lin, Lin-Shan Lee:
Improved spontaneous Mandarin speech recognition by disfluency interruption point (IP) detection using prosodic features.
1621-1624

- Jeff Z. Ma, Spyros Matsoukas:
Improvements to the BBN RT04 Mandarin conversational telephone speech recognition system.
1625-1628

- Sakriani Sakti, Satoshi Nakamura, Konstantin Markov:
Incorporating a Bayesian wide phonetic context model for acoustic rescoring.
1629-1632

- Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain:
Modeling vowels for Arabic BN transcription.
1633-1636

- Mohamed Afify, Long Nguyen, Bing Xiang, Sherif Abdou, John Makhoul:
Recent progress in Arabic broadcast news transcription at BBN.
1637-1640

- Spyros Matsoukas, Rohit Prasad, Srinivas Laxminarayan, Bing Xiang, Long Nguyen, Richard M. Schwartz:
The 2004 BBN 1xRT recognition systems for English broadcast news and conversational telephone speech.
1641-1644

- Rohit Prasad, Spyros Matsoukas, Chia-Lin Kao, Jeff Z. Ma, D.-X. Xu, Thomas Colthurst, Owen Kimball, Richard M. Schwartz, Jean-Luc Gauvain, Lori Lamel, Holger Schwenk, Gilles Adda, Fabrice Lefèvre:
The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system.
1645-1648

- Bing Xiang, Long Nguyen, Xuefeng Guo, Dongxin Xu:
The BBN Mandarin broadcast news transcription system.
1649-1652

- Paul Deléglise, Yannick Estève, Sylvain Meignier, Téva Merlin:
The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news.
1653-1656

- Lori Lamel, Gilles Adda, Eric Bilinski, Jean-Luc Gauvain:
Transcribing lectures and seminars.
1657-1660

- Thomas Hain, John Dines, Giulia Garau, Martin Karafiát, Darren Moore, Vincent Wan, Roeland Ordelman, Steve Renals:
Transcription of conference room meetings: an investigation.
1661-1664

- Jean-Luc Gauvain, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Véronique Gendner, Lori Lamel, Holger Schwenk:
Where are we in transcribing French broadcast news?
1665-1668

- Odette Scharenborg, Stephanie Seneff:
Two-pass strategy for handling OOVs in a large vocabulary recognition task.
1669-1672

- Long Nguyen, Bing Xiang, Mohamed Afify, Sherif Abdou, Spyros Matsoukas, Richard M. Schwartz, John Makhoul:
The BBN RT04 English broadcast news transcription system.
1673-1676

- Rong Zhang, Ziad Al Bawab, Arthur Chan, Ananlada Chotimongkol, David Huggins-Daines, Alexander I. Rudnicky:
Investigations on ensemble based semi-supervised acoustic model training.
1677-1680

- Jan Nouza, Jindrich Zdánský, Petr David, Petr Cerva, Jan Kolorenc, Dana Nejedlová:
Fully automated system for Czech spoken broadcast transcription with very large (300k+) lexicon.
1681-1684

- Mike Schuster, Takaaki Hori, Atsushi Nakamura:
Experiments with probabilistic principal component analysis in LVCSR.
1685-1688

- Thang Tat Vu, Dung Tien Nguyen, Chi Mai Luong, John-Paul Hosom:
Vietnamese large vocabulary continuous speech recognition.
1689-1692

- Takahiro Shinozaki, Mari Ostendorf, Les E. Atlas:
Data sampling for improved speech recognizer training.
1693-1696

Speech Perception I, II
- Do Dat Tran, Eric Castelli, Jean-François Serignat, Van Loan Trinh, Le Xuan Hung:
Influence of F0 on Vietnamese syllable perception.
1697-1700

- Barbara Schwanhäußer, Denis Burnham:
Lexical tone and pitch perception in tone and non-tone language speakers.
1701-1704

- Isabel Falé, Isabel Hub Faria:
Intonational contrasts in EP: a categorical perception approach.
1705-1708

- Bettina Braun, Andrea Weber, Matthew W. Crocker:
Does narrow focus activate alternative referents?
1709-1712

- Kiyoaki Aikawa, Hayato Hashimoto:
Audiovisual interaction on the perception of frequency glide of linear sweep tones.
1713-1716

- Kei Omata, Ken Mogi:
Audiovisual integration in dichotic listening.
1717-1720

- Gunilla Svanfeldt, Dirk Olszewski:
Perception experiment combining a parametric loudspeaker and a synthetic talking head.
1721-1724

- Catherine Mayo, Robert A. J. Clark, Simon King:
Multidimensional scaling of listener responses to synthetic speech.
1725-1728

- Hiroko Terasawa, Malcolm Slaney, Jonathan Berger:
A timbre space for speech.
1729-1732

- Abdellah Kacha, Francis Grenez, Jean Schoentgen:
Voice quality assessment by means of comparative judgments of speech tokens.
1733-1736

- Toshio Irino, Satoru Satou, Shunsuke Nomura, Hideki Banno, Hideki Kawahara:
Speech intelligibility derived from time-frequency and source smearing.
1737-1740

- Nahoko Hayashi, Takayuki Arai, Nao Hodoshima, Yusuke Miyauchi, Kiyohiro Kurisu:
Steady-state pre-processing for improving speech intelligibility in reverberant environments: evaluation in a hall with an electrical reverberator.
1741-1744

- Patrick C. M. Wong, Kiara M. Lee, Todd B. Parrish:
Neural bases of listening to speech in noise.
1745-1748

- P. Jongmans, Frans J. M. Hilgers, Louis C. W. Pols, C. J. van As-Brooks:
The intelligibility of tracheoesophageal speech: first results.
1749-1752

- Guy J. Brown, Kalle J. Palomäki:
A computational model of the speech reception threshold for laterally separated speech and noise.
1753-1756

- Esther Janse:
Lexical inhibition effects in time-compressed speech.
1757-1760

- Caroline Jacquier, Fanny Meunier:
Perception of time-compressed rapid acoustic cues in French CV syllables.
1761-1764

- Claire-Léonie Grataloup, Michel Hoen, François Pellegrino, E. Veuillet, Lionel Collet, Fanny Meunier:
Reversed speech comprehension depends on the auditory efferent system functionality.
1765-1768

- Won Tokuma, Shinichi Tokuma:
Perceptual space of English fricatives for Japanese learners.
1769-1772

- Ioana Vasilescu, Maria Candea, Martine Adda-Decker:
Perceptual salience of language-specific acoustic differences in autonomous fillers across eight languages.
1773-1776

- Marc D. Pell:
Effects of cortical and subcortical brain damage on the processing of emotional prosody.
1777-1780

Keynote Papers
- Elizabeth Shriberg:
Spontaneous speech: how people really talk and why engineers should care.
1781-1784

Speech Recognition - Adaptation I, II
Prosody Modelling and Speech Technology I, II
Detecting and Synthesizing Speaker State
- Julia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, Cynthia Girand, Martin Graciarena, Andreas Kathol, Laura Michaelis, Bryan L. Pellom, Elizabeth Shriberg, Andreas Stolcke:
Distinguishing deceptive from non-deceptive speech.
1833-1836

- Jackson Liscombe, Julia Hirschberg, Jennifer J. Venditti:
Detecting certainness in spoken tutorial dialogues.
1837-1840

- Laurence Vidrascu, Laurence Devillers:
Detection of real-life emotions in call centers.
1841-1844

- Jackson Liscombe, Giuseppe Riccardi, Dilek Z. Hakkani-Tür:
Using context to improve emotion detection in spoken dialog systems.
1845-1848

- Irena Yanushevskaya, Christer Gobl, Ailbhe Ní Chasaide:
Voice quality and f0 cues for affect expression: implications for synthesis.
1849-1852

- Toru Takahashi, Takeshi Fujii, Masashi Nishi, Hideki Banno, Toshio Irino, Hideki Kawahara:
Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech database.
1853-1856

Rapid Development of Spoken Dialogue Systems
- Giuseppe Di Fabbrizio, Gökhan Tür, Dilek Z. Hakkani-Tür:
Automated Wizard-of-Oz for spoken dialogue systems.
1857-1860

- Kouichi Katsurada, Kunitoshi Sato, Hiroaki Adachi, Hirobumi Yamada, Tsuneo Nitta:
A rapid prototyping tool for constructing web-based MMI applications.
1861-1864

- Philip Hanna, Ian M. O'Neill, Xingkun Liu, Michael F. McTear:
Developing extensible and reusable spoken dialogue components: an examination of the Queen's Communicator.
1865-1868

- Ye-Yi Wang, Alex Acero:
SGStudio: rapid semantic grammar development for spoken language understanding.
1869-1872

- Murat Akbacak, Yuqing Gao, Liang Gu, Hong-Kwang Jeff Kuo:
Rapid transition to new spoken dialogue domains: language model training using knowledge from previous domain applications and web text resources.
1873-1876

- Manny Rayner, Pierrette Bouillon, Nikos Chatzichrisafis, Beth Ann Hockey, Marianne Santaholma, Marianne Starlander, Hitoshi Isahara, Kyoko Kanzaki, Yukie Nakao:
A methodology for comparing grammar-based and robust approaches to speech understanding.
1877-1880

Text-to-Speech I, II
- François Mairesse, Marilyn A. Walker:
Learning to personalize spoken generation for dialogue systems.
1881-1884

- S. Revelin, Didier Cadic, Claire Waast-Richard:
Optimization of text-to-speech phonetic transcriptions using a-posteriori signal comparison.
1885-1888

- Özgül Salor, Mübeccel Demirekler:
Voice transformation using principle component analysis based LSF quantization and dynamic programming approach.
1889-1892

- Hai Ping Li, Wei Zhang:
Adapt Mandarin TTS system to Chinese dialect TTS systems.
1893-1896

- Min Zheng, Qin Shi, Wei Zhang, Lianhong Cai:
Grapheme-to-phoneme conversion based on TBL algorithm in Mandarin TTS system.
1897-1900

- Paolo Massimino, Alberto Pacchiotti:
An automaton-based machine learning technique for automatic phonetic transcription.
1901-1904

- Tasanawan Soonklang, Robert I. Damper, Yannick Marchand:
Comparative objective and subjective evaluation of three data-driven techniques for proper name pronunciation.
1905-1908

- Olov Engwall:
Articulatory synthesis using corpus-based estimation of line spectrum pairs.
1909-1912

- Aoju Chen, Els den Os:
Effects of pitch accent type on interpreting information status in synthetic speech.
1913-1916

- Perttu Prusi, Anssi Kainulainen, Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen, Leena Helin:
Towards generic spatial object model and route guidance grammar for speech-based systems.
1917-1920

- Chi-Chun Hsia, Chung-Hsien Wu, Te-Hsien Liu:
Duration-embedded bi-HMM for expressive voice conversion.
1921-1924

- Toshio Hirai, Hisashi Kawai, Minoru Tsuzaki, Nobuyuki Nishizawa:
Analysis of major factors of naturalness degradation in concatenative synthesis.
1925-1928

- Jilei Tian, Jani Nurminen, Imre Kiss:
Duration modeling and memory optimization in a Mandarin TTS system.
1929-1932

- Min-Siong Liang, Ke-Chun Chuang, Rhuei-Cheng Yang, Yuang-Chin Chiang, Ren-Yuan Lyu:
A bi-lingual Mandarin-to-Taiwanese text-to-speech system.
1933-1936

- Uwe D. Reichel, Florian Schiel:
Using morphology and phoneme history to improve grapheme-to-phoneme conversion.
1937-1940

- Olga Goubanova, Simon King:
Predicting consonant duration with Bayesian belief networks.
1941-1944

- Per-Anders Jande:
Inducing decision tree pronunciation variation models from annotated speech data.
1945-1948

- Lijuan Wang, Yong Zhao, Min Chu, Frank K. Soong, Zhigang Cao:
Phonetic transcription verification with generalized posterior probability.
1949-1952

- Hua Cheng, Fuliang Weng, Niti Hantaweepant, Lawrence Cavedon, Stanley Peters:
Training a maximum entropy model for surface realization.
1953-1956

- Tomoki Toda, Kiyohiro Shikano:
NAM-to-speech conversion with Gaussian mixture models.
1957-1960

- Michelina Savino, Mario Refice, Massimo Mitaritonna:
Which Italian do current systems speak? a first step towards pronunciation modelling of Italian varieties.
1961-1964

- Dominika Oliver, Robert A. J. Clark:
Modelling pitch accent types for Polish speech synthesis.
1965-1968

- Chatchawarn Hansakunbuntheung, Ausdang Thangthai, Chai Wutiwiwatchai, Rungkarn Siricharoenchai:
Learning methods and features for corpus-based phrase break prediction on Thai.
1969-1972

- Paul Taylor:
Hidden Markov models for grapheme to phoneme conversion.
1973-1976

Speaker Characterization and Recognition I-IV
- Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa:
Robust distant speaker recognition based on position dependent cepstral mean normalization.
1977-1980

- David A. van Leeuwen:
Speaker adaptation in the NIST speaker recognition evaluation 2004.
1981-1984

- Jacob Goldberger, Hagai Aronowitz:
A distance measure between GMMs based on the unscented transform and its application to speaker recognition.
1985-1988

- Sorin Dusan:
Estimation of speaker's height and vocal tract length from speech signal.
1989-1992

- Doroteo Torre Toledano, Carlos Fombella, Joaquin Gonzalez-Rodriguez, Luis A. Hernández Gómez:
On the relationship between phonetic modeling precision and phonetic speaker recognition accuracy.
1993-1996

- J. Fortuna, P. Sivakumaran, Aladdin M. Ariyaeeinia, Amit S. Malegaonkar:
Open-set speaker identification using adapted Gaussian mixture models.
1997-2000

- James McAuley, Ji Ming, Pat Corr:
Speaker verification in noisy conditions using correlated subband features.
2001-2004

- Mikaël Collet, Yassine Mami, Delphine Charlet, Frédéric Bimbot:
Probabilistic anchor models approach for speaker verification.
2005-2008

- Mijail Arcienega, Anil Alexander, Philipp Zimmermann, Andrzej Drygajlo:
A Bayesian network approach combining pitch and spectral envelope features to reduce channel mismatch in speaker verification and forensic speaker recognition.
2009-2012

- Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung:
Channel robust speaker verification via Bayesian blind stochastic feature transformation.
2013-2016

- Tomoko Matsui, Kunio Tanabe:
dPLRM-based speaker identification with log power spectrum.
2017-2020

- Xianxian Zhang, John H. L. Hansen, Pongtep Angkititrakul, Kazuya Takeda:
Speaker verification using Gaussian mixture models within changing real car environments.
2021-2024

- Kanae Amino, Tsutomu Sugawara, Takayuki Arai:
The correspondences between the perception of the speaker individualities contained in speech sounds and their acoustic properties.
2025-2028

- Samuel Kim, Sung-Wan Yoon, Thomas Eriksson, Hong-Goo Kang, Dae Hee Youn:
A noise-robust pitch synchronous feature extraction algorithm for speaker recognition systems.
2029-2032

- Jing Deng, Thomas Fang Zheng, Zhanjiang Song, Jian Liu:
Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition.
2033-2036

- Xianxian Zhang, John H. L. Hansen:
In-set/out-of-set speaker identification based on discriminative speech frame selection.
2037-2040

- Zhenchun Lei, Yingchun Yang, Zhaohui Wu:
Mixture of support vector machines for text-independent speaker recognition.
2041-2044

- Shilei Zhang, Junmei Bai, Shuwu Zhang, Bo Xu:
Optimal model order selection based on regression tree in speaker identification.
2045-2048

- Marcos Faúndez-Zanuy, Jordi Solé-Casals:
Speaker verification improvement using blind inversion of distortions.
2049-2052

Single-channel Speech Enhancement
- Israel Cohen:
Supergaussian GARCH models for speech signals.
2053-2056

- Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller, Panagiotis Tsakalides:
A spectral conversion approach to feature denoising and speech enhancement.
2057-2060

- Alfonso Ortega, Eduardo Lleida, Enrique Masgrau, Luis Buera, Antonio Miguel:
Acoustic feedback cancellation in speech reinforcement systems for vehicles.
2061-2064

- Julien Bourgeois, Jürgen Freudenberger, Guillaume Lathoud:
Implicit control of noise canceller for speech enhancement.
2065-2068

- T. M. Sunil Kumar, T. V. Sreenivas:
Speech enhancement using Markov model of speech segments.
2069-2072

- Vladimir Braquet, Takao Kobayashi:
A wavelet based noise reduction algorithm for speech signal corrupted by coloured noise.
2073-2076

- Esfandiar Zavarehei, Saeed Vaseghi:
Speech enhancement in temporal DFT trajectories using Kalman filters.
2077-2080

- Qin Yan, Saeed Vaseghi, Esfandiar Zavarehei, Ben P. Milner:
Formant-tracking linear prediction models for speech processing in noisy environments.
2081-2084

- Hui Jiang, Qian-Jie Fu:
Statistical noise compensation for cochlear implant processing.
2085-2088

- Tuan Van Pham, Gernot Kubin:
WPD-based noise suppression using nonlinearly weighted threshold quantile estimation and optimal wavelet shrinking.
2089-2092

- Weifeng Li, Katunobu Itou, Kazuya Takeda, Fumitada Itakura:
Subjective and objective quality assessment of regression-enhanced speech in real car environments.
2093-2096

- Masashi Unoki, Masaaki Kubo, Atsushi Haniu, Masato Akagi:
A model for selective segregation of a target instrument sound from the mixed sound of various instruments.
2097-2100

- Richard C. Hendriks, Richard Heusdens, Jesper Jensen:
Improved decision directed approach for speech enhancement using an adaptive time segmentation.
2101-2104

- Heinrich W. Löllmann, Peter Vary:
Generalized filter-bank equalizer for noise reduction with reduced signal delay.
2105-2108

- Nicoleta Roman, DeLiang Wang:
A pitch-based model for separation of reverberant speech.
2109-2112

- David Y. Zhao, W. Bastiaan Kleijn:
On noise gain estimation for HMM-based speech enhancement.
2113-2116

- Om Deshmukh, Carol Y. Espy-Wilson:
Speech enhancement using auditory phase opponency model.
2117-2120

Acoustic Modelling for LVCSR
- Brian Mak, Jeff Siu-Kei Au-Yeung, Yiu-Pong Lai, Man-Hung Siu:
High-density discrete HMM with the use of scalar quantization indexing.
2121-2124

- Jing Zheng, Andreas Stolcke:
Improved discriminative training using phone lattices.
2125-2128

- Qifeng Zhu, Barry Y. Chen, Frantisek Grézl, Nelson Morgan:
Improved MLP structures for data-driven feature extraction for ASR.
2129-2132

- Wolfgang Macherey, Lars Haferkamp, Ralf Schlüter, Hermann Ney:
Investigations on error minimizing training criteria for discriminative training in automatic speech recognition.
2133-2136

- K. C. Sim, M. J. F. Gales:
Temporally varying model parameters for large vocabulary continuous speech recognition.
2137-2140

- Qifeng Zhu, Andreas Stolcke, Barry Y. Chen, Nelson Morgan:
Using MLP features in SRI's conversational speech recognition system.
2141-2144

Speech Production I
- Matti Airas, Hannu Pulakka, Tomas Bäckström, Paavo Alku:
A toolkit for voice inverse filtering and parametrisation.
2145-2148

- Denisse Sciamarella, Christophe d'Alessandro:
Stylization of glottal-flow spectra produced by a mechanical vocal-fold model.
2149-2152

- Hideyuki Nomura, Tetsuo Funada:
Numerical glottal sound source model as coupled problem between vocal cord vibration and glottal flow.
2153-2156

- Marianne Pouplier, Maureen Stone:
A tagged-cine MRI investigation of German vowels.
2157-2160

- Antoine Serrurier, Pierre Badin:
A three-dimensional linear articulatory model of velum based on MRI data.
2161-2164

- Anne Cros, Didier Demolin, Ana Georgina Flesia, Antonio Galves:
On the relationship between intra-oral pressure and speech sonority.
2165-2168

Speaker Characterization and Recognition I-IV
- Mohamed Kamal Omar, Jiri Navratil, Ganesh N. Ramaswamy:
Maximum conditional mutual information modeling for speaker verification.
2169-2172

- Luciana Ferrer, M. Kemal Sönmez, Sachin S. Kajarekar:
Class-dependent score combination for speaker recognition.
2173-2176

- Hagai Aronowitz, Dror Irony, David Burshtein:
Modeling intra-speaker variability for speaker recognition.
2177-2180

- Girija Chetty, Michael Wagner:
Liveness detection using cross-modal correlations in face-voice person authentication.
2181-2184

- Taichi Asami, Koji Iwano, Sadaoki Furui:
Stream-weight optimization by LDA and AdaBoost for multi-stream speaker verification.
2185-2188

- Yosef A. Solewicz, Moshe Koppel:
Considering speech quality in speaker verification fusion.
2189-2192

Gender and Age Issues in Speech and Language Research I, II
- Matteo Gerosa, Diego Giuliani, Fabio Brugnara:
Speaker adaptive acoustic modeling with mixture of adult and children's speech.
2193-2196

- Shona D'Arcy, Martin J. Russell:
A comparison of human and computer recognition accuracy for children's speech.
2197-2200

- Piero Cosi, Bryan L. Pellom:
Italian children's speech recognition for advanced interactive literacy tutors.
2201-2204

- Martine Adda-Decker, Lori Lamel:
Do speech recognizers prefer female speakers?
2205-2208

- Serdar Yildirim, Chul Min Lee, Sungbok Lee, Alexandros Potamianos, Shrikanth Narayanan:
Detecting politeness and frustration state of a child in a conversational computer game.
2209-2212

- Diana Binnenpoorte, Christophe Van Bael, Els den Os, Lou Boves:
Gender in everyday speech and language: a corpus-based study.
2213-2216

Spoken Language Acquisition, Development and Learning I, II
- Shigeaki Amano:
Developmental change of phoneme duration in a Japanese infant and mother.
2217-2220

- Haiping Jia, Hiroki Mori, Hideki Kasuya:
Mora timing organization in producing contrastive geminate/single consonants and long/short vowels by native and non-native speakers of Japanese: effects of speaking rate.
2221-2224

- Hongyan Wang, Vincent J. van Heuven:
Mutual intelligibility of American, Chinese and Dutch-accented speakers of English.
2225-2228

- Peter Juel Henrichsen:
Deriving a bi-lingual dictionary from raw transcription data.
2229-2232

- Kei Ohta, Seiichi Nakagawa:
A statistical method of evaluating pronunciation proficiency for Japanese words.
2233-2236

Language and Dialect Identification I, II
- Pavel Matejka, Petr Schwarz, Jan Cernocký, Pavel Chytil:
Phonotactic language identification using high quality phoneme recognition.
2237-2240

- Rongqing Huang, John H. L. Hansen:
Advances in word based dialect/accent classification.
2241-2244

- Rym Hamdi, Salem Ghazali, Melissa Barkat-Defradas:
Syllable structure in spoken Arabic: a comparative investigation.
2245-2248

- J. C. Marcadet, Volker Fischer, Claire Waast-Richard:
A transformation-based learning approach to language identification for mixed-lingual text-to-speech synthesis.
2249-2252

- Shuichi Itahashi, Shiwei Zhu, Mikio Yamamoto:
Constructing family trees of multilingual speech using Gaussian mixture models.
2253-2256

- Jean-Luc Rouas:
Modeling long and short-term prosody for language identification.
2257-2260

Spoken Language Translation I, II
- Matthias Paulik, Christian Fügen, Sebastian Stüker, Tanja Schultz, Thomas Schaaf, Alex Waibel:
Document driven machine translation enhanced ASR.
2261-2264

- Shahram Khadivi, András Zolnay, Hermann Ney:
Automatic text dictation in computer-assisted translation.
2265-2268

- Luis Rodríguez, Jorge Civera, Enrique Vidal, Francisco Casacuberta, César Martínez:
On the use of speech recognition in computer assisted translation.
2269-2272

- Andreas Kathol, Kristin Precoda, Dimitra Vergyri, Wen Wang, Susanne Riehemann:
Speech translation for low-resource languages: the case of Pashto.
2273-2276

- David Picó, Jorge González, Francisco Casacuberta, Diamantino Caseiro, Isabel Trancoso:
Finite-state transducer inference for a speech-input Portuguese-to-English machine translation system.
2277-2280

- Kenko Ohta, Keiji Yasuda, Gen-ichiro Kikui, Masuzo Yanagida:
Quantitative evaluation of effects of speech recognition errors on speech translation quality.
2281-2284

Multi-channel Speech Enhancement
- Thomas Lotter, Bastian Sauert, Peter Vary:
A stereo input-output superdirective beamformer for dual channel noise reduction.
2285-2288

- Ulrich Klee, Tobias Gehrig, John W. McDonough:
Kalman filters for time delay of arrival-based source localization.
2289-2292

- Osamu Ichikawa, Masafumi Nishimura:
Simultaneous adaptation of echo cancellation and spectral subtraction for in-car speech recognition.
2293-2296

- Rong Hu, Yunxin Zhao:
Variable step size adaptive decorrelation filtering for competing speech separation.
2297-2300

- Daisuke Saitoh, Atsunobu Kaminuma, Hiroshi Saruwatari, Tsuyoki Nishikawa, Akinobu Lee:
Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptations.
2301-2304

- Rongqiang Hu, Sunil D. Kamath, David V. Anderson:
Speech enhancement using non-acoustic sensors.
2305-2308

- Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:
Improved blind dereverberation performance by using spatial information.
2309-2312

- Junfeng Li, Masato Akagi:
A hybrid microphone array post-filter in a diffuse noise field.
2313-2316

- Venkatesh Krishnan, Phil Spencer Whitehead, David V. Anderson, Mark A. Clements:
A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systems.
2317-2320

- Yuki Denda, Takanobu Nishiura, Yoichi Yamashita:
A study of weighted CSP analysis with average speech spectrum for noise robust talker localization.
2321-2324

- Young-Ik Kim, Sung Jun An, Rhee Man Kil, Hyung-Min Park:
Sound segregation based on binaural zero-crossings.
2325-2328

- Jürgen Freudenberger, Klaus Linhard:
A two-microphone diversity system and its application for hands-free car kits.
2329-2332

- Takahiro Murakami, Kiyoshi Kurihara, Yoshihisa Ishida:
Directionally constrained minimization of power algorithm for speech signals.
2333-2336

- Alessio Brutti, Maurizio Omologo, Piergiorgio Svaizer:
Oriented global coherence field for the estimation of the head orientation in smart rooms equipped with distributed microphone arrays.
2337-2340

- Nilesh Madhu, Rainer Martin:
Robust speaker localization through adaptive weighted pair TDOA (AWEPAT) estimation.
2341-2344

- Guillaume Lathoud, Mathew Magimai-Doss, Bertrand Mesot:
A spectrogram model for enhanced source localization and noise-robust ASR.
2345-2348

- Sriram Srinivasan, Mattias Nilsson, W. Bastiaan Kleijn:
Denoising through source separation and minimum tracking.
2349-2352

- Louisa Busca Grisoni, John H. L. Hansen:
Collaborative voice activity detection for hearing aids.
2353-2356

- Enrique Robledo-Arnuncio, Biing-Hwang Juang:
Using inter-frequency decorrelation to reduce the permutation inconsistency problem in blind source separation.
2357-2360

- Amarnag Subramanya, Zhengyou Zhang, Zicheng Liu, Jasha Droppo, Alex Acero:
A graphical model for multi-sensory speech processing in air-and-bone conductive microphones.
2361-2364

Prosody in Language Performance I, II
- Heejin Kim, Jennifer Cole:
The stress foot as a unit of planned timing: evidence from shortening in the prosodic phrase.
2365-2368

- Pauline Welby, Hélène Loevenbruck:
Segmental "anchorage" and the French late rise.
2369-2372

- Ivan Chow:
Prosodic cues for syntactically-motivated junctures.
2373-2376

- Isabel Falé, Isabel Hub Faria:
A glimpse of the time-course of intonation processing in European Portuguese.
2377-2380

- Petra Wagner:
Great expectations - introspective vs. perceptual prominence ratings and their acoustic correlates.
2381-2384

- Christian Jensen, John Tøndering:
Choosing a scale for measuring perceived prominence.
2385-2388

- Jens Edlund, David House, Gabriel Skantze:
The effects of prosodic features on the interpretation of clarification ellipses.
2389-2392

- Matthias Jilka:
Exploration of different types of intonational deviations in foreign-accented and synthesized speech.
2393-2396

- Jörg Bröggelwirth:
A rhythmic-prosodic model of poetic speech.
2397-2400

- Sonja Biersack, Vera Kempe, Lorna Knapton:
Fine-tuning speech registers: a comparison of the prosodic features of child-directed and foreigner-directed speech.
2401-2404

- Timothy Arbisi-Kelm:
An analysis of the intonational structure of stuttered speech.
2405-2408

- Britta Lintfert, Wolfgang Wokurek:
Voice quality dimensions of pitch accents.
2409-2412

- Marion Dohen, Hélène Loevenbruck:
Audiovisual production and perception of contrastive focus in French: a multispeaker study.
2413-2416

- Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts:
Predicting end of utterance in multimodal and unimodal conditions.
2417-2420

- Saori Tanaka, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Production of prominence in Japanese sign language.
2421-2424

Speaker Characterization and Recognition I-IV
- Andreas Stolcke, Luciana Ferrer, Sachin S. Kajarekar, Elizabeth Shriberg, Anand Venkataraman:
MLLR transforms as features in speaker recognition.
2425-2428

- Brendan Baker, Robbie Vogt, Sridha Sridharan:
Gaussian mixture modelling of broad phonetic and syllabic events for text-independent speaker verification.
2429-2432

- Hagai Aronowitz, David Burshtein:
Efficient speaker identification and retrieval.
2433-2436

- R. Sinha, S. E. Tranter, M. J. F. Gales, Philip C. Woodland:
The Cambridge University March 2005 speaker diarisation system.
2437-2440

- Xuan Zhu, Claude Barras, Sylvain Meignier, Jean-Luc Gauvain:
Combining speaker identification and BIC for speaker diarization.
2441-2444

- Dan Istrate, Nicolas Scheffer, Corinne Fredouille, Jean-François Bonastre:
Broadcast news speaker tracking for ESTER 2005 campaign.
2445-2448

Phonetics and Phonology I, II
- Sorin Dusan:
On the nature of acoustic information in identification of coarticulated vowels.
2449-2452

- Cédric Gendrot, Martine Adda-Decker:
Impact of duration on F1/F2 formant values of oral vowels: an automatic analysis of large broadcast news corpora in French and German.
2453-2456

- Hugo Quené:
Modeling of between-speaker and within-speaker variation in spontaneous speech tempo.
2457-2460

- Masahiko Komatsu, Makiko Aoyagi:
Vowel devoicing vs. mora-timed rhythm in spontaneous Japanese - inspection of phonetic labels of OGI_TS.
2461-2464

- Jalal-Eddin Al-Tamimi, Emmanuel Ferragne:
Does vowel space size depend on language vowel inventories? evidence from two Arabic dialects and French.
2465-2468

- Chilin Shih:
Understanding phonology by phonetic implementation.
2469-2472

Spoken / Multi-modal Dialogue Systems I, II
Human Factors, User Experience and Natural Language Application Design
- Esther Levin, Alex Levin:
Spoken dialog system for real-time data capture.
2497-2500

- Michael Pucher, Peter Fröhlich:
A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis quality.
2501-2504

- Fang Chen, Yael Katzenellenbogen:
User's experience of a commercial speech dialogue system.
2505-2508

- Esther Levin, Amir M. Mané:
Voice user interface design for automated directory assistance.
2509-2512

- Maria Gabriela Alvarez-Ryan, Narendra K. Gupta, Barbara Hollister, Tirso Alonso:
Optimizing user experience through design of the spoken language understanding (SLU) module.
2513-2516

- Jeremy H. Wright, David A. Kapilow, Alicia Abella:
Interactive visualization of human-machine dialogs.
2517-2520

TTS Inventory
- Matthew P. Aylett:
Synthesising hyperarticulation in unit selection TTS.
2521-2524

- Daniel Tihelka:
Symbolic prosody driven unit selection for highly natural synthetic speech.
2525-2528

- Jindrich Matousek, Zdenek Hanzlícek, Daniel Tihelka:
Hybrid syllable/triphone speech synthesis.
2529-2532

- Francisco Campillo Díaz, José Luis Alba, Eduardo Rodríguez Banga:
A neural network approach for the design of the target cost function in unit-selection speech synthesis.
2533-2536

- Christian Weiss:
FSM and k-nearest-neighbor for corpus based video-realistic audio-visual synthesis.
2537-2540

- Gui-Lin Chen, Ke-Song Han, Zhen-Li Yu, Dong-Jian Yue, Yi-Qing Zu:
An embedded and concatenative approach to TTS of multiple languages.
2541-2544

- Tony Ezzat, Ethan Meyers, James R. Glass, Tomaso Poggio:
Morphing spectral envelopes using audio flow.
2545-2548

- Vincent Colotte, Richard Beaufort:
Linguistic features weighting for a text-to-speech system without prosody model.
2549-2552

- Ingunn Amdal, Torbjørn Svendsen:
Unit selection synthesis database development using utterance verification.
2553-2556

- Yong Zhao, Lijuan Wang, Min Chu, Frank K. Soong, Zhigang Cao:
Refining phoneme segmentations using speaker-adaptive context dependent boundary models.
2557-2560

- Yining Chen, Yong Zhao, Min Chu:
Customizing base unit set with speech database in TTS systems.
2561-2564

- Soufiane Rouibia, Olivier Rosec:
Unit selection for speech synthesis based on a new acoustic target cost.
2565-2568

- Dan Chazan, Ron Hoory, Zvi Kons, Ariel Sagi, Slava Shechtman, Alexander Sorin:
Small footprint concatenative text-to-speech synthesis system using complex spectral envelope modeling.
2569-2572

- Francesc Alías, Ignasi Iriondo Sanz, Lluís Formiga, Xavier Gonzalvo, Carlos Monzo, Xavier Sevillano:
High quality Spanish restricted-domain TTS oriented to a weather forecast application.
2573-2576

- Ingmund Bjørkan, Torbjørn Svendsen, Snorre Farner:
Comparing spectral distance measures for join cost optimization in concatenative speech synthesis.
2577-2580

- Maria João Barros, Ranniery Maia, Keiichi Tokuda, Fernando Gil Resende, Diamantino Freitas:
HMM-based European Portuguese TTS system.
2581-2584

- Wael Hamza, John F. Pitrelli:
Combining the flexibility of speech synthesis with the naturalness of pre-recorded audio: a comparison of two approaches to phrase-splicing TTS.
2585-2588

- Guntram Strecha, Oliver Jokisch, Matthias Eichner, Rüdiger Hoffmann:
Codec integrated voice conversion for embedded speech synthesis.
2589-2592

- David Sündermann, Guntram Strecha, Antonio Bonafonte, Harald Höge, Hermann Ney:
Evaluation of VTLN-based voice conversion for embedded speech synthesis.
2593-2596

- Juri Isogai, Junichi Yamagishi, Takao Kobayashi:
Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis.
2597-2600

- Tien Ying Fung, Yuk-Chi Li, Eddie Sio, Icarus Lee, Helen M. Meng, P. C. Ching:
Embedded Cantonese TTS for multi-device access to web content.
2601-2604

- Karl Schnell, Arild Lacroix:
Model based analysis of a diphone database for improved unit concatenation.
2605-2608

Robust Speech Recognition I-IV
- Ning Ma, Phil Green:
Context-dependent word duration modelling for robust speech recognition.
2609-2612

- Julien Epps, Eric H. C. Choi:
An energy search approach to variable frame rate front-end processing for robust ASR.
2613-2616

- Roberto Gemello, Franco Mana, Renato de Mori:
Non-linear estimation of voice activity to improve automatic recognition of noisy speech.
2617-2620

- Yusuke Kida, Tatsuya Kawahara:
Voice activity detection based on optimally weighted combination of multiple features.
2621-2624

- Pei Ding:
Soft decision strategy and adaptive compensation for robust speech recognition against impulsive noise.
2625-2628

- Nicolás Morales, Doroteo Torre Toledano, John H. L. Hansen, José Colás, Javier Garrido:
Statistical class-based MFCC enhancement of filtered and band-limited speech for robust ASR.
2629-2632

- Hemant Misra, Hervé Bourlard:
Spectral entropy feature in full-combination multi-stream for robust ASR.
2633-2636

- Wooil Kim, Richard M. Stern, Hanseok Ko:
Environment-independent mask estimation for missing-feature reconstruction.
2637-2640

- André Coy, Jon Barker:
Soft harmonic masks for recognising speech in the presence of a competing speaker.
2641-2644

- Lech Szymanski, Martin Bouchard:
Comb filter decomposition for robust ASR.
2645-2648

- Panikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano:
Investigating the role of the Lombard reflex in non-audible murmur (NAM) recognition.
2649-2652

- Evan Ruzanski, John H. L. Hansen, Don Finan, James Meyerhoff, William Norris, Terry Wollert:
Improved "TEO" feature-based automatic stress detection using physiological and acoustic speech sensors.
2653-2656

- Takeshi S. Kobayakawa:
Spectral subtraction using elliptic integral for multiplication factor.
2657-2660

- Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa:
Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique.
2661-2664

- H. Tanaka, Hiroshi Fujimura, Chiyomi Miyajima, Takanori Nishino, Katunobu Itou, Kazuya Takeda:
Data collection and evaluation of speech recognition for motorbike riders.
2665-2668

- Agustín Álvarez Marquina, Pedro Gómez Vilda, Victor Nieto Lluis, Rafael Martínez, Victoria Rodellar:
Application of a first-order differential microphone for efficient voice activity detection in a car platform.
2669-2672

- Panji Setiawan, Suhadi Suhadi, Tim Fingscheidt, Sorel Stan:
Robust speech recognition for mobile devices in car noise.
2673-2676

- Péter Mihajlik, Zoltán Tobler, Zoltán Tüske, Géza Gordos:
Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speech.
2677-2680

- Gang Chen, Douglas D. O'Shaughnessy, Hesham Tolba:
A performance investigation of noisy voice recognition over IP telephony networks.
2681-2684

- Akinori Ito, Takashi Kanayama, Motoyuki Suzuki, Shozo Makino:
Internal noise suppression for speech recognition by small robots.
2685-2688

- Florian Kraft, Robert Malkin, Thomas Schaaf, Alex Waibel:
Temporal ICA for classification of acoustic events in a kitchen environment.
2689-2692

- Jan Felix Krebber:
"hello - is anybody at home?" - about the minimum word accuracy of a smart home spoken dialogue system.
2693-2696

- Hans-Günter Hirsch, Harald Finster:
The simulation of realistic acoustic input scenarios for speech recognition systems.
2697-2700

- Michael Walsh, Gregory M. P. O'Hare, Julie Carson-Berndsen:
An agent-based framework for speech investigation.
2701-2704

Speech Coding
- Stephen So, Kuldip K. Paliwal:
Switched split vector quantisation of line spectral frequencies for wideband speech coding.
2705-2708

- Changchun Bao, Jason Lukasiak, Christian Ritz:
A novel voicing cut-off determination for low bit-rate harmonic speech coding.
2709-2712

- Hauke Krüger, Peter Vary:
A partial decorrelation scheme for improved predictive open loop quantization with noise shaping.
2713-2716

- Venkatesh Krishnan, Thomas P. Barnwell III, David V. Anderson:
Using dynamic codebook re-ordering to exploit inter-frame correlation in MELP coders.
2717-2720

- Adriane Swalm Durey, Venkatesh Krishnan, Thomas P. Barnwell III:
Enhanced speech coding based on phonetic class segmentation.
2721-2724

- Ali Erdem Ertan, Thomas P. Barnwell III:
A pitch-synchronous pitch-cycle modification method for designing a hybrid i-MELP/waveform-matching speech coder.
2725-2728

- Joon-Hyuk Chang, Jong Won Shin, Seung Yeol Lee, Nam Soo Kim:
A new structural preprocessor for low-bit rate speech coding.
2729-2732

- Tiago H. Falk, Wai-Yip Chan, Peter Kabal:
An improved GMM-based voice quality predictor.
2733-2736

- Jan S. Erkelens:
High-quality memoryless subband coding of impulse responses at 22 bits per frame.
2737-2740

- Shi-Han Chen, Kuo-Guan Wu, Chih-Chung Kuo:
A study of variable pulse allocation for MPE and CELP coders based on PESQ analysis.
2741-2744

- José L. Pérez-Córdoba, Antonio M. Peinado, Angel M. Gomez, Antonio J. Rubio:
Joint source-channel coding of LSP parameters for bursty channels.
2745-2748

Gender and Age Issues in Speech and Language Research I, II
- Daniel Elenius, Mats Blomberg:
Adaptation and normalization experiments in speech recognition for 4 to 8 year old children.
2749-2752

- Wim Jansen, Hugo Van Hamme:
PROSPECT features and their application to missing data techniques for vocal tract length normalization.
2753-2756

- Andreas Hagen, Bryan L. Pellom:
Data driven subword unit modeling for speech recognition and its application to interactive reading tutors.
2757-2760

- Anton Batliner, Mats Blomberg, Shona D'Arcy, Daniel Elenius, Diego Giuliani, Matteo Gerosa, Christian Hacker, Martin J. Russell, Stefan Steidl, Michael Wong:
The PF_STAR children's speech corpus.
2761-2764

- Linda Bell, Johan Boye, Joakim Gustafson, Mattias Heldner, Anders Lindström, Mats Wirén:
The Swedish NICE corpus - spoken dialogues between children and embodied characters in a computer game scenario.
2765-2768

- Yusuke Miyauchi, Nao Hodoshima, Keiichi Yasu, Nahoko Hayashi, Takayuki Arai, Mitsuko Shindo:
A preprocessing technique for improving speech intelligibility in reverberant environments: the effect of steady-state suppression on elderly people.
2769-2772

Discourse and Dialogue I, II
- Norbert Pfleger, Markus Löckelt:
Synchronizing dialogue contributions of human users and virtual characters in a virtual reality environment.
2773-2776

- Anand Venkataraman, Yang Liu, Elizabeth Shriberg, Andreas Stolcke:
Does active learning help automatic dialog act tagging in meeting data?
2777-2780

- Dan Bohus, Alexander I. Rudnicky:
A principled approach for rejection threshold optimization in spoken dialog systems.
2781-2784

- David Pérez-Piñar López, Carmen García-Mateo:
Application of confidence measures for dialogue systems through the use of parallel speech recognizers.
2785-2788

- Sophie Rosset, Delphine Tribout:
Multi-level information and automatic dialog acts detection in human-human spoken dialogs.
2789-2792

- Rieks op den Akker, Harry Bunt, Simon Keizer, Boris W. van Schooten:
From question answering to spoken dialogue: towards an information search assistant for interactive multimodal information extraction.
2793-2796

Text-to-Speech I, II
- Ulrich Reubold, Alexander Steffen:
Pitch-effects in diphone recording: are logatomes inappropriate?
2797-2800

- Tomoki Toda, Keiichi Tokuda:
Speech parameter generation algorithm considering global variance for HMM-based speech synthesis.
2801-2804

- Makoto Tachibana, Junichi Yamagishi, Takashi Masuko, Takao Kobayashi:
Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis.
2805-2808

- Gabriel Webster, Tina Burrows, Katherine Knill:
A comparison of methods for speaker-dependent pronunciation tuning for text-to-speech synthesis.
2809-2812

- Ann K. Syrdal, Alistair Conkie:
Perceptually-based data-driven join costs: comparing join types.
2813-2816

- Yannis Pantazis, Yannis Stylianou, Esther Klabbers:
Discontinuity detection in concatenated speech synthesis based on nonlinear speech analysis.
2817-2820

Language and Dialect Identification I, II
- Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Qian Yang, Jean-Pierre Martens:
Improving the discrimination between native accents when recorded over different channels.
2821-2824

- Isabel Trancoso, António Joaquim Serralheiro, Céu Viana, Diamantino Caseiro:
Aligning and recognizing spoken books in different varieties of Portuguese.
2825-2828

- Bin Ma, Haizhou Li, Chin-Hui Lee:
An acoustic segment modeling approach to automatic language identification.
2829-2832

- Dong Zhu, Martine Adda-Decker, Fabien Antoine:
Different size multilingual phone inventories and context-dependent acoustic models for language identification.
2833-2836

- Sheng Gao, Bin Ma, Haizhou Li, Chin-Hui Lee:
A text categorization approach to automatic language identification.
2837-2840

- Giampiero Salvi:
Advances in regional accent clustering in Swedish.
2841-2844

Speech Recognition in Ubiquitous Networking and Context-Aware Computing
- David Pearce, Jonathan Engelsma, James C. Ferrans, John Johnson:
An architecture for seamless access to distributed multimodal services.
2845-2848

- Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg, Haitian Xu:
Robust speech recognition in ubiquitous networking and context-aware computing.
2849-2852

- Valentin Ion, Reinhold Haeb-Umbach:
Unified probabilistic approach to error concealment for distributed speech recognition.
2853-2856

- Alastair Bruce James, Ben Milner:
Combining packet loss compensation methods for robust distributed speech recognition.
2857-2860

- Trond Skogstad, Torbjørn Svendsen:
Distributed ASR using speech coder data for efficient feature vector representation.
2861-2864

- Sadaoki Furui, Tomohisa Ichiba, Takahiro Shinozaki, Edward W. D. Whittaker, Koji Iwano:
Cluster-based modeling for ubiquitous speech recognition.
2865-2868

Phonetics and Phonology I, II
- Danny R. Moates, Zinny S. Bond, Russell Fox, Verna Stockmal:
The feature [sonorant] in lexical access.
2869-2872

- Simone Mikuteit:
Voice and aspiration in German and East Bengali stops: a cross-language study.
2873-2876

- Irene Jacobi, Louis C. W. Pols, Jan Stroop:
Polder Dutch: aspects of the /ei/-lowering in Standard Dutch.
2877-2880

- Eric Castelli, René Carré:
Production and perception of Vietnamese vowels.
2881-2884

- Vu Ngoc Tuan, Christophe d'Alessandro, Alexis Michaud:
Using open quotient for the characterisation of Vietnamese glottalised tones.
2885-2888

- John Hajek, Mary Stevens:
On the acoustic characterization of ejective stops in Waima'a.
2889-2892

- Mary Stevens, John Hajek:
Spirantization of /p t k/ in Sienese Italian and so-called semi-fricatives.
2893-2896

- Barbara Gili Fivela, Claudio Zmarich:
Italian geminates under speech rate and focalization changes: kinematic, acoustic, and perception data.
2897-2900

- Sunhee Kim:
Durational characteristics of Korean Lombard speech.
2901-2904

- Toshiko Isei-Jaakkola, Satoshi Asakawa:
A cross-linguistic study of vowel quantity in different word structures: Japanese, Finnish and Czech.
2905-2908

- Laura Mori, Melissa Barkat-Defradas:
Acoustic properties of foreign accent: VOT variations in Moroccan-accented Italian.
2909-2912

- Andréia S. Rauber, Paola Escudero, Ricardo Augusto Hoffmann Bion, Barbara O. Baptista:
The interrelation between the perception and production of English vowels by native speakers of Brazilian Portuguese.
2913-2916

- Julia Hoelterhoff:
Recognition of German obstruents.
2917-2920

- Radek Skarnitzl, Jan Volín:
Czech voiced labiodental continuant discrimination from basic acoustic data.
2921-2924

- Jean-Baptiste Maj, Anne Bonneau, Dominique Fohr, Yves Laprie:
An elitist approach for extracting automatically well-realized speech sounds with high confidence.
2925-2928

- Na'im R. Tyson:
Applying multiple regression models for predic