ICSLP 1996:
Philadelphia, PA, USA
The 4th International Conference on Spoken Language Processing, Philadelphia, PA, USA, October 3-6, 1996.
ISCA 1996
Plenary Lectures
- Anne Cutler:
The comparative study of spoken-language processing.

- James L. Flanagan:
Natural communication with machines - progress and challenge.

Large Vocabulary
- Z. Li, Michel Héon, Douglas D. O'Shaughnessy:
New developments in the INRS continuous speech recognition system.

- Lori Lamel, Gilles Adda:
On designing pronunciation lexicons for large vocabulary, continuous speech recognition.

- Pablo Fetter, Frédéric Dandurand, Peter Regel-Brietzmann:
Word graph rescoring using confidence measures.

- Xavier L. Aubert, Peter Beyerlein, Meinhard Ullrich:
A bottom-up approach for handling unseen triphones in large vocabulary continuous speech recognition.

- V. Valtchev, Philip C. Woodland, Steve J. Young:
Discriminative optimisation of large vocabulary recognition systems.

- Tatsuo Matsuoka, Katsutoshi Ohtsuki, Takeshi Mori, Sadaoki Furui, Katsuhiko Shirai:
Japanese large-vocabulary continuous-speech recognition using a business-newspaper corpus.

- David M. Carter, Jaan Kaja, Leonardo Neumeyer, Manny Rayner, Fuliang Weng, Mats Wirén:
Handling compound nouns in a Swedish speech-understanding system.

- Javier Macías Guarasa, Ascensión Gallardo-Antolín, Javier Ferreiros, José Manuel Pardo, Luis Villarrubia Grande:
Initial evaluation of a preselection module for a flexible large vocabulary speech recognition system in.

Multimodal ASR (Face and Lips)
- Mamoun Alissali, Paul Deléglise, Alexandrina Rogozan:
Asynchronous integration of visual information in an automatic speech recognition system.

- Iain A. Matthews, J. Andrew Bangham, S. J. Cox:
Audiovisual speech recognition using multiscale nonlinear image decomposition.

- Qin Su, Peter L. Silsbee:
Robust audiovisual integration using semicontinuous hidden Markov models.

- Richard P. Schumeyer, Kenneth E. Barner:
The effect of visual information on word initial consonant perception of dysarthric speech.

- Devi Chandramohan, Peter L. Silsbee:
A multiple deformable template approach for visual speech recognition.

- Piero Cosi, Emanuela Magno Caldognetto, Franco Ferrero, M. Dugatto, Kyriaki Vagges:
Speaker independent bimodal phonetic recognition experiments.

- Juergen Luettin, Neil A. Thacker, Steve W. Beet:
Speechreading using shape and intensity information.

- Juergen Luettin, Neil A. Thacker, Steve W. Beet:
Speaker identification by lipreading.

Perception of Words
- David W. Gow Jr., Janis Melvold, Sharon Manuel:
How word onsets drive lexical access and segmentation: evidence from acoustics, phonology and processing.

- David van Kuijk, Peter Wittenburg, Ton Dijkstra:
RAW: a real-speech model for human word recognition.

- Mehdi Meftah, Sami Boudelaa:
How facilitatory can lexical information be during word recognition? evidence from Moroccan Arabic.

- Alette P. Haveman:
Effects of frequency on the auditory perception of open- versus closed-class words.

- Michael S. Vitevitch, Paul A. Luce, Jan Charles-Luce, David Kemmerer:
Phonotactic and metrical influences on adult ratings of spoken nonsense words.

- Edward T. Auer, Lynne E. Bernstein:
Lipreading supplemented by voice fundamental frequency: to what extent does the addition of voicing increase lexical uniqueness for the lipreader?

- Saskia te Riele, Sieb G. Nooteboom, Hugo Quené:
Strategies used in rhyme-monitoring.

- Wilma van Donselaar, Cecile T. L. Kuijpers, Anne Cutler:
How do Dutch listeners process words with epenthetic schwa?

Phonetics, Transcription, and Analysis
- Patrick Juola, Philip Zimmermann:
Whole-word phonetic distances and the PGPfone alphabet.

- Shuping Ran, J. Bruce Millar, Phil Rose:
Automatic vowel quality description using a variable mapping to an eight cardinal vowel reference set.

- Andreas Kipp, Maria-Barbara Wesenick, Florian Schiel:
Automatic detection and segmentation of pronunciation variants in German speech corpora.

- Stephanie Seneff, Raymond Lau, Helen M. Meng:
ANGIE: a new framework for speech analysis based on morpho-phonological modelling.

- Byunggon Yang:
Perceptual contrast in the Korean and English vowel system normalized.

- Yong-Ju Lee, Sook-Hyang Lee:
On phonetic characteristics of pause in the Korean read speech.

- Sami Boudelaa, Mehdi Meftah:
Cross-language effects of lexical stress in word recognition: the case of Arabic English bilinguals.

- Maria-Barbara Wesenick:
Automatic generation of German pronunciation variants.

- Maria-Barbara Wesenick, Andreas Kipp:
Estimating the quality of phonetic transcriptions and segmentations of speech signals.

- Bojan Petek, Rastislav Sustarsic, Smiljana Komar:
An acoustic analysis of contemporary vowels of the standard Slovenian language.

- Sandrine Robbe, Anne Bonneau, Sylvie Coste, Yves Laprie:
Using decision trees to construct optimal acoustic cues.

- Donna Erickson, Osamu Fujimura:
Maximum jaw displacement in contrastive emphasis.

- Rebecca Herman, Mary E. Beckman, Kiyoshi Honda:
Subglottal pressure and final lowering in English.

- Cecile T. L. Kuijpers, Wilma van Donselaar, Anne Cutler:
Phonological variation: epenthesis and deletion of schwa in Dutch.

Spoken Language Processing for Special Populations
- James J. Mahshie:
Feedback considerations for speech training systems.

- Anne-Marie Öster:
Clinical applications of computer-based speech training for children with hearing impairment.

- Valérie Hazan, Andrew Simpson:
Enhancing information-rich regions of natural VCV and sentence materials presented in noise.

- Valérie Hazan, Alan Adlard:
Speech perceptual abilities of children with specific reading difficulty (dyslexia).

- Larry D. Paarmann, Michael K. Wynne:
Bimodal perception of spectrum compressed speech.

- Dragana Barac-Cikoja, Sally Revoile:
Effect of sentential context on syllabic stress perception by hearing-impaired listeners.

- Martin Russell, Catherine Brown, Adrian Skilling, Robert W. Series, Julie L. Wallace, Bill Bohnam, Paul Barker:
Applications of automatic speech recognition to speech and language development in young children.

- D. R. Campbell:
Sub-band adaptive speech enhancement for hearing aids.

- Thomas Portele, Jürgen Krämer:
Adapting a TTS system to a reading machine for the blind.

Dialogue Special Sessions
- Katsuhiko Shirai:
Modeling of spoken dialogue with and without visual information.

- Stephanie Seneff, David Goddeau, Christine Pao, Joseph Polifroni:
Multimodal discourse modelling in a multi-user multi-domain environment.

- Kenji Kita, Yoshikazu Fukui, Masaaki Nagata, Tsuyoshi Morimoto:
Automatic acquisition of probabilistic dialogue models.

- Paul Heisterkamp, Scott McGlashan:
Units of dialogue management: an example.

- Sharon L. Oviatt, Robert VanGent:
Error resolution during multimodal human-computer interaction.

- Ramesh R. Sarukkai, Dana H. Ballard:
Improved spontaneous dialogue recognition using dialogue and utterance triggers by adaptive probability boosting.

- Kai Hübener, Uwe Jost, Henrik Heine:
Speech recognition for spontaneously spoken German dialogues.

- Paul Taylor, Hiroshi Shimodaira, Stephen Isard, Simon King, Jacqueline C. Kowtko:
Using prosodic information to constrain language models for spoken dialogue.

- Peter A. Heeman, Kyung-ho Loken-Kim, James F. Allen:
Combining the detection and correction of speech repairs.

- Yuji Sagawa, Wataru Sugimoto, Noboru Ohnishi:
Generating spontaneous elliptical utterance.

- Gösta Bruce, Marcus Filipsson, Johan Frid, Björn Granström, Kjell Gustafson, Merle Horne, David House, Birgitta Lastow, Paul Touati:
Developing the modelling of Swedish prosody in spontaneous dialogue.

- Shimei Pan, Kathleen McKeown:
Spoken language generation in a multimedia system.

- Keikichi Hirose, Mayumi Sakata, Hiromichi Kawanami:
Synthesizing dialogue speech of Japanese based on the quantitative analysis of prosodic features.

- Shuichi Tanaka, Shu Nakazato, Keiichiro Hoashi, Katsuhiko Shirai:
Spoken dialogue interface in a dual task situation.

- Yasuhisa Niimi, Yutaka Kobayashi:
A dialogue control strategy based on the reliability of speech recognition.

- Alexander I. Rudnicky, Stephen Reed, Eric H. Thayer:
Speechwear: a mobile speech system.

- Helen M. Meng, Senis Busayapongchai, James R. Glass, David Goddeau, I. Lee Hetherington, Edward Hurley, Christine Pao, Joseph Polifroni, Stephanie Seneff, Victor Zue:
WHEELS: a conversational system in the automobile classifieds domain.

- M. David Sadek, A. Ferrieux, A. Cozannet, Philippe Bretier, Franck Panaget, J. Simonin:
Effective human-computer cooperative spoken dialogue: the AGS demonstrator.

- Samir Bennacef, Laurence Devillers, Sophie Rosset, Lori Lamel:
Dialog in the RAILTEL telephone-based system.

- Alon Lavie, Lori S. Levin, Yan Qu, Alex Waibel, Donna Gates, Marsal Gavaldà, Laura Mayfield, Maite Taboada:
Dialogue processing in a conversational speech translation system.

Language Modeling
- Thomas Niesler, Philip C. Woodland:
Combination of word-based and category-based language models.

- Francisco J. Valverde-Albacete, José Manuel Pardo:
A multi-level lexical-semantics based language model design for guided integrated continuous speech recognition.

- Florian Gallwitz, Elmar Nöth, Heinrich Niemann:
A category based approach for recognition of out-of-vocabulary words.

- Kristie Seymore, Ronald Rosenfeld:
Scalable backoff language models.

- Rukmini Iyer, Mari Ostendorf:
Modeling long distance dependence in language: topic mixtures vs. dynamic cache models.

- Marcello Federico:
Bayesian estimation methods for n-gram language model adaptation.

- Man-Hung Siu, Mari Ostendorf:
Modeling disfluencies in conversational speech.

- John Miller, Fil Alleva:
Evaluation of a language model using a clustered model backoff.

- Antonio Bonafonte, José B. Mariño:
Language modeling using x-grams.

- Klaus Ries, Finn Dag Buø, Alex Waibel:
Class phrase models for language modelling.

- Petra Geutner:
Introducing linguistic constraints into statistical language modeling.

- Jianying Hu, William Turin, Michael K. Brown:
Language modeling with stochastic automata.

Feature Extraction for Speech Recognition
- Don X. Sun:
Feature dimension reduction using reduced-rank maximum likelihood estimation for hidden Markov models.

- Kai Hübener:
Using multi-level segmentation coefficients to improve HMM speech recognition.

- Thomas Eisele, Reinhold Haeb-Umbach, Detlev Langmann:
A comparative study of linear feature transformation techniques for automatic speech recognition.

- Ben Milner:
Inclusion of temporal information into features for speech recognition.

- Hubert Wassner, Gérard Chollet:
New cepstral representation using wavelet analysis and spectral transformation for robust speech recognition.

- Christopher John Long, Sekharajit Datta:
Wavelet based feature extraction for phoneme recognition.

- Andrzej Drygajlo:
New fast wavelet packet transform algorithms for frame synchronized speech processing.

- Srinivasan Umesh, Leon Cohen, Nenad Marinovic, Douglas J. Nelson:
Frequency-warping in speech.

- Daisuke Kobayashi, Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Extracting speech features from human speech-like noise.

- Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Subband-crosscorrelation analysis for robust speech recognition.

- Hervé Bourlard, Stéphane Dupont:
A new ASR approach based on independent processing and recombination of partial frequency bands.

- Climent Nadeu, José B. Mariño, Javier Hernando, Albino Nogueiras:
Frequency and time filtering of filter-bank energies for HMM speech recognition.

Speech Production - Measurement and Modeling
- Yves Laprie, Marie-Odile Berger:
Extraction of tongue contours in x-ray images with minimal user interaction.

- Didier Demolin, Thierry Metens, Alain Soquet:
Three-dimensional measurement of the vocal tract by MRI.

- Philip Gleason, Betty Tuller, J. A. Scott Kelso:
Syllable affiliation of final consonant clusters undergoes a phase transition over speaking rates.

- Arthur Lobo, Michael H. O'Malley:
Towards a biomechanical model of the larynx.

- Yann Morlec, Gérard Bailly, Véronique Aubergé:
Generating intonation by superposing gestures.

- Hideki Kawahara, Hiroko Kato, J. C. Williams:
Effects of auditory feedback on F0 trajectory generation.

Speech Coding / HMMs and NNs in ASR
- Ian S. Burnett, John J. Parry:
On the effects of accent and language on low rate speech coders.

- Jeng-Shyang Pan, Fergus R. McInnes, Mervyn A. Jack:
VQ codevector index assignment using genetic algorithms for noisy channels.

- Gavin C. Cawley:
An improved vector quantization algorithm for speech transmission over noisy channels.

- C. Murgia, Gang Feng, Alain Le Guyader, Catherine Quinquis:
Very low delay and high quality coding of 20 Hz-15 kHz speech signals at 64 kbit/s.

- Carlos M. Ribeiro, Isabel Trancoso:
Application of speaker modification techniques to phonetic vocoding.

- Tadashi Yonezaki, Kiyohiro Shikano:
Entropy coded vector quantization with hidden Markov models.

- Minoru Kohata:
An application of recurrent neural networks to low bit rate speech coding.

- Kazuhito Koishida, Keiichi Tokuda, Takao Kobayashi, Satoshi Imai:
CELP coding system based on mel-generalized cepstral analysis.

- Cheung-Fat Chan, Wai-Kwong Hui:
Wideband re-synthesis of narrowband CELP-coded speech using multiband excitation model.

- Takuya Koizumi, Mikio Mori, Shuji Taniguchi, Mitsutoshi Maruya:
Recurrent neural networks for phoneme recognition.

- M. A. Mokhtar, A. Zein-el-Abddin:
A model for the acoustic phonetic structure of Arabic language using a single ergodic hidden Markov model.

- Yifan Gong, Irina Illina, Jean Paul Haton:
Modelling long term variability information in mixture stochastic trajectory framework.

- Thierry Moudenc, Robert Sokol, Guy Mercier:
Segmental phonetic features recognition by means of neural-fuzzy networks and integration in an n-best solutions post-processing.

- Irina Illina, Yifan Gong:
Stochastic trajectory model with state-mixture for continuous speech recognition.

- Hermann Hild, Alex Waibel:
Recognition of spelled names over the telephone.

- Gilles Boulianne, Patrick Kenny:
Optimal tying of HMM mixture densities using decision trees.

- Hwan Jin Choi, Yung-Hwan Oh:
Speech recognition using an enhanced FVQ based on a codeword dependent distribution normalization and codeword weighting by fuzzy objective function.

- Mikko Kurimo, Panu Somervuo:
Using the self-organizing map to speed up the probability density estimation for speech recognition with mixture density HMMs.

Vowels
NNs and Stochastic Modeling
- Geunbae Lee, Jong-Hyeok Lee, Kyubong Park, Byung-Chang Kim:
Integrating connectionist, statistical and symbolic approaches for continuous spoken Korean processing.

- Hynek Hermansky, Sangita Timberwala, Misha Pavel:
Towards ASR on partially corrupted speech.

- Herbert Gish, Kenney Ng:
Parametric trajectory models for speech recognition.

- Kate Knill, M. J. F. Gales, Steve J. Young:
Use of Gaussian selection in large vocabulary continuous speech recognition using HMMs.

- J. Hogberg, Kåre Sjölander:
Cross phone state clustering using lexical stress and context.

- Eduardo Lleida-Solano, Richard C. Rose:
Likelihood ratio decoding and confidence measures for continuous speech recognition.

- Xiaohui Ma, Yifan Gong, Yuqing Fu, Jiren Lu, Jean Paul Haton:
A study on continuous Chinese speech recognition based on stochastic trajectory models.

- Yoshiaki Itoh, Jiro Kiyama, Hiroshi Kojima, Susumu Seki, Ryuichi Oka:
A proposal for a new algorithm of reference interval-free continuous DP for real-time speech or text retrieval.

- Akinori Ito, Masaki Kohda:
Language modeling by string pattern n-gram for Japanese speech recognition.

- Reinhard Kneser:
Statistical language modeling using a variable context length.

- Finn Tore Johansen:
A comparison of hybrid HMM architectures using global discriminative training.

- Wei Wei, Etienne Barnard, Mark A. Fanty:
Improved probability estimation with neural network models.

- Ha-Jin Yu, Yung-Hwan Oh:
A neural network using acoustic sub-word units for continuous speech recognition.

- Louis ten Bosch, Roel Smits:
On the error criteria in neural networks as a tool for human classification modelling.

- Gordon Ramsay:
A non-linear filtering approach to stochastic training of the articulatory-acoustic mapping using the EM algorithm.

- Y. P. Yang, John R. Deller Jr.:
A tool for automated design of language models.

- Felix Freitag, Enric Monte:
Acoustic-phonetic decoding based on Elman predictive neural networks.

- Tan Lee, P. C. Ching:
On improving discrimination capability of an RNN based recognizer.

- Yumi Wakita, Jun Kawai, Hitoshi Iida:
An evaluation of statistical language modeling for speech recognition using a mixed category of both words and parts-of-speech.

Neural Models of Speech Processing
- Boris Aleksandrovsky, James Whitson, Gretchen Andes, Gary Lynch, Richard Granger:
Novel speech processing mechanism derived from auditory neocortical circuit analysis.

- Ping Tang, Jean Rouat:
Modeling neurons in the anteroventral cochlear nucleus for amplitude modulation (AM) processing: application to speech sound.

- Halewijn Vereecken, Jean-Pierre Martens:
Noise suppression and loudness normalization in an auditory model-based acoustic front-end.

- James J. Hant, Brian Strope, Abeer Alwan:
A psychoacoustic model for the noise masking of voiceless plosive bursts.

- Martin Hunke, Thomas Holton:
Training machine classifiers to match the performance of human listeners in a natural vowel classification task.

- Kiyoaki Aikawa, Hideki Kawahara, Minoru Tsuzaki:
A neural matrix model for active tracking of frequency-modulated tones.

Utterance Verification and Word Spotting
- Richard C. Rose, Eduardo Lleida-Solano, G. W. Erhart, R. V. Grubbe:
A user-configurable system for voice label recognition.

- Philippe Gelin, Christian Wellekens:
Keyword spotting enhancement for video soundtrack indexing.

- Rachida El Méliani, Douglas D. O'Shaughnessy:
New efficient fillers for unlimited word recognition and keyword spotting.

- Michelle S. Spina, Victor Zue:
Automatic transcription of general audio data: preliminary analyses.

- Francis Kubala, Tasos Anastasakos, Hubert Jin, Long Nguyen, Richard M. Schwartz:
Transcribing radio news.

- Anand R. Setlur, Rafid A. Sukkar, John Jacob:
Correcting recognition errors via discriminative utterance verification.

Acquisition/Learning Training L2 Learners
Focus, Stress and Accent
Spoken Language Dialogue and Conversation
- Norbert Reithinger, Ralf Engel, Michael Kipp, Martin Klesen:
Predicting dialogue acts for a speech-to-speech translation system.

- Johannes Müller, Holger Stahl, Manfred Lang:
Automatic speech translation based on the semantic structure.

- Lewis M. Norton, Carl Weir, K. W. Scholz, Deborah A. Dahl, Ahmed Bouzid:
A methodology for application development for spoken language systems.

- Stephanie Seneff, Joseph Polifroni:
A new restaurant guide conversational system: issues in rapid prototyping for specialized domains.

- Tadahiko Kumamoto, Akira Ito:
Semantic interpretation of a Japanese complex sentence in an advisory dialogue - focused on the postpositional word "KEDO," which works as a conjunction between clauses.

- Youngkuk Hong, Myoung-Wan Koo, Gijoo Yang:
A Korean morphological analyzer for speech translation system.

- Rolf Carlson, Sheri Hunnicutt:
Generic and domain-specific aspects of the Waxholm NLP and dialog modules.

- Megumi Kameyama, Goh Kawai, Isao Arima:
A real-time system for summarizing human-human spontaneous spoken dialogues.

- Bernd Hildebrandt, Heike Rautenstrauch, Gerhard Sagerer:
Evaluation of spoken language understanding and dialogue systems.

- Kuniko Kakita:
Inter-speaker interaction of F0 in dialogs.

- Hans Brandt-Pook, Gernot A. Fink, Bernd Hildebrandt, Franz Kummert, Gerhard Sagerer:
A robust dialogue system for making an appointment.

- Kazuyuki Takagi, Shuichi Itahashi:
Segmentation of spoken dialogue by interjections, disfluent utterances and pauses.

- David Goddeau, Helen M. Meng, Joseph Polifroni, Stephanie Seneff, Senis Busayapongchai:
A form-based dialogue manager for spoken language applications.

- Steve Whittaker, David Attwater:
The design of complex telephony applications using large vocabulary speech technology.

- Stephen Sutton, David G. Novick, Ronald A. Cole, Pieter J. E. Vermeulen, Jacques de Villiers, Johan Schalkwyk, Mark A. Fanty:
Building 10,000 spoken dialogue systems.

- Yen-Ju Yang, Lee-Feng Chien, Lin-Shan Lee:
Speaker intention modeling for large vocabulary Mandarin spoken dialogues.

- P. E. Kenne, Mary O'Kane:
Hybrid language models and spontaneous legal discourse.

- P. E. Kenne, Mary O'Kane:
Topic change and local perplexity in spoken legal dialogue.

- Jennifer J. Venditti, Marc Swerts:
Intonational cues to discourse structure in Japanese.

- Niels Ole Bernsen, Hans Dybkjær, Laila Dybkjær:
Principles for the design of cooperative spoken human-machine dialogue.

- Karen L. Jenkin, Michael S. Scordilis:
Development and comparison of three syllable stress classifiers.

Speech Disorders
- Donald G. Jamieson, Li Deng, M. Price, Vijay Parsa, J. Till:
Interaction of speech disorders with speech coders: effects on speech intelligibility.

- Maurílio Nunes Vieira, Arnold G. D. Maran, Fergus R. McInnes, Mervyn A. Jack:
Detecting arytenoid cartilage misplacement through acoustic and electroglottographic jitter analysis.

- Maurílio Nunes Vieira, Fergus R. McInnes, Mervyn A. Jack:
Robust F0 and jitter estimation in pathological voices.

- F. Plante, H. Kessler, Barry M. G. Cheetham, J. E. Earis:
Speech monitoring of infective laryngitis.

- Jean Schoentgen, Raoul De Guchteneere:
Searching for nonlinear relations in whitened jitter time series.

- Liliana Gavidia-Ceballos, John H. L. Hansen, James F. Kaiser:
Vocal fold pathology assessment using AM autocorrelation analysis of the Teager energy operator.

- David P. Kuehn:
Continuous positive airway pressure (CPAP) in the treatment of hypernasality.

- Carol Y. Espy-Wilson, Venkatesh R. Chari, Caroline B. Huang:
Enhancement of alaryngeal speech by adaptive filtering.

- Li Deng, Xuemin Shen, Donald G. Jamieson, J. Till:
Simulation of disordered speech using a frequency-domain vocal tract model.

- Yasuo Endo, Hideki Kasuya:
A stochastic model of fundamental period perturbation and its application to perception of pathological voice quality.

- Eric J. Wallen, John H. L. Hansen:
A screening test for speech pathology assessment using objective quality measures.

- Douglas A. Cairns, John H. L. Hansen, James F. Kaiser:
Recent advances in hypernasal speech detection using the nonlinear Teager energy operator.

Vocal Tract Geometry
- Kiyoshi Honda, Shinji Maeda, Michiko Hashi, Jim Dembowski, John R. Westbury:
Human palate and related structures: their articulatory consequences.

- Edward P. Davis, Andrew Douglas, Maureen C. Stone:
A continuum mechanics representation of tongue deformation.

- Philbert Bangayan, Abeer Alwan, Shrikanth Narayanan:
From MRI and acoustic data to articulatory synthesis: a case study of the lateral approximants in American English.

- Shrikanth Narayanan, Abigail Kaun, Dani Byrd, Peter Ladefoged, Abeer Alwan:
Liquids in Tamil.

- Chang-Sheng Yang, Hideki Kasuya:
Speaker individualities of vocal tract shapes of Japanese vowels measured by magnetic resonance images.

- S. El-Masri, Xavier Pelorson, P. Saguet, Pierre Badin:
Vocal tract acoustics using the transmission line matrix (TLM) method.

- Gérard Bailly:
Building sensori-motor prototypes from audiovisual exemplars.

- Mats Båvegård, Gunnar Fant:
Parameterized VT area function inversion.

- Jianwu Dang, Kiyoshi Honda:
An improved vocal tract model of vowel production implementing piriform resonance and transvelar nasal coupling.

- C. S. Blackburn, Steve J. Young:
Pseudo-articulatory speech synthesis for recognition using automatic feature extraction from x-ray data.

Prosody in ASR and Segmentation
- Sharon L. Oviatt, Gina-Anne Levow, Margaret MacEachern, Karen Kuhn:
Modeling hyperarticulate speech during human-computer error resolution.

- Siripong Potisuk, Mary P. Harper, Jackson T. Gandour:
Using stress to disambiguate spoken Thai sentences containing syntactic ambiguity.

- Hung-yun Hsieh, Ren-Yuan Lyu, Lin-Shan Lee:
Use of prosodic information to integrate acoustic and linguistic knowledge in continuous Mandarin speech recognition with very large vocabulary.

- G. V. Ramana Rao, J. Srichand:
Word boundary detection using pitch variations.

- Atsuhiro Sakurai, Keikichi Hirose:
Detection of phrase boundaries in Japanese by low-pass filtering of fundamental frequency contours.

- Vincent Pagel, Noelle Carbonell, Yves Laprie:
A new method for speech delexicalization, and its application to the perception of French prosody.

Acquisition and Learning by Machine
Dialogue Systems
- Jean-Luc Gauvain, J. J. Gangolf, Lori Lamel:
Speech recognition for an information kiosk.

- Helmer Strik, Albert Russel, Henk van den Heuvel, Catia Cucchiarini, Lou Boves:
Localizing an automatic inquiry system for public transport information.

- Stephen M. Marcus, Deborah W. Brown, Randy G. Goldberg, Max S. Schoeffler, William R. Wetzel, Richard R. Rosinski:
Prompt constrained natural language - evolving the next generation of telephony services.

- Tatsuya Kawahara, Chin-Hui Lee, Biing-Hwang Juang:
Key-phrase detection and verification for flexible speech understanding.

- Bernhard Suhm, Brad A. Myers, Alex Waibel:
Interactive recovery from speech recognition errors in speech user interfaces.

- Sunil Issar:
Estimation of language models for new spoken language applications.

Speech Enhancement and Robust Processing
- Xuemin Shen, Li Deng, Anisa Yasmin:
H-infinity filtering for speech enhancement.

- Saeed Vaseghi, Ben P. Milner:
A comparative analysis of channel-robust features and channel equalization methods for speech recognition.

- Jia-Lin Shen, Wen-Liang Hwang, Lin-Shan Lee:
Robust speech recognition features based on temporal trajectory filtering of frequency band spectrum.

- Kevin Power:
Durational modelling for improved connected digit recognition.

- Carlos Avendaño, Hynek Hermansky:
Study on the dereverberation of speech based on temporal envelope filtering.

- Thorsten Brants:
Estimating Markov model structures.

- Eric K. Ringger, James F. Allen:
A fertility channel model for post-correction of continuous speech recognition.

- Hiroshi Yasukawa:
Restoration of wide band signal from telephone speech using linear prediction error processing.

- Hiroshi Matsumoto, Noboru Naitoh:
Smoothed spectral subtraction for a frequency-weighted HMM in noisy speech recognition.

- William S. Woods, Martin Hansen, Thomas Wittkop, Birger Kollmeier:
A simple architecture for using multiple cues in sound separation.

- Bojan Petek, Ove Andersen, Paul Dalsgaard:
On the robust automatic segmentation of spontaneous speech.

- C. G. Miglietta, Chafic Mokbel, Denis Jouvet, Jean Monné:
Bayesian adaptation of speech recognizers to field speech data.

- A. J. Darlington, D. J. Campbell:
Sub-band adaptive filtering applied to speech enhancement.

- J. P. Openshaw, John S. Mason:
Noise robust estimate of speech dynamics for speaker recognition.

- Javier Ortega-Garcia, Joaquin Gonzalez-Rodriguez:
Overview of speech enhancement techniques for automatic speaker recognition.

- Naomi Harte, Saeed Vaseghi, Ben P. Milner:
Dynamic features for segmental speech recognition.

- Takuya Koizumi, Mikio Mori, Shuji Taniguchi:
Speech recognition based on a model of human auditory system.

- Josep M. Salavedra, Enrique Masgrau:
APVQ encoder applied to wideband speech coding.

- Jin Zhou, Yair Shoham, Ali N. Akansu:
Simple fast vector quantization of the line spectral frequencies.

Speaker Adaptation and Normalization I
- Tomoko Matsui, Sadaoki Furui:
N-best-based instantaneous speaker adaptation method for speech recognition.

- Claude Montacié, Marie-José Caraty, Claude Barras:
Mixture splitting technic and temporal control in a HMM-based recognition system.

- Lei Yao, Dong Yu, Taiyi Huang:
A unified spectral transformation adaptation approach for robust speech recognition.

- Qiang Huo, Chin-Hui Lee:
On-line adaptive learning of the correlated continuous density hidden Markov models for speech recognition.

- Nikko Ström:
Speaker adaptation by modeling the speaker variation in a continuous speech recognition system.

- Yasuo Ariki, Shigeaki Tagashira:
An enquiring system of unknown words in TV news by spontaneous repetition (application of speaker normalization by speaker subspace projection).

- Jin-Song Zhang, Beiqian Dai, Changfu Wang, HingKeung Kwan, Keikichi Hirose:
Adaptive recognition method based on posterior use of distribution pattern of output probabilities.

- Philip C. Woodland, D. Pye, M. J. F. Gales:
Iterative unsupervised adaptation using maximum likelihood linear regression.

- Tasos Anastasakos, John W. McDonough, Richard M. Schwartz, John Makhoul:
A compact model for speaker-adaptive training.

- Shigeru Homma, Jun-ichi Takahashi, Shigeki Sagayama:
Iterative unsupervised speaker adaptation for batch dictation.

- Daniel C. Burnett, Mark A. Fanty:
Rapid unsupervised adaptation to children's speech on a connected-digit task.

- Jun Ishii, Masahiro Tonomura, Shoichi Matsunaga:
Speaker adaptation using tree structured shared-state HMMs.

Spoken Language and NLP
- Richard M. Schwartz, Scott Miller, David Stallard, John Makhoul:
Language understanding using hidden understanding models.

- Allen L. Gorin:
Processing of semantic information in fluently spoken language.

- Andreas Stolcke, Elizabeth Shriberg:
Automatic linguistic segmentation of conversational speech.

- Manuela Boros, Wieland Eckert, Florian Gallwitz, Günther Görz, Gerhard Hanrieder, Heinrich Niemann:
Towards understanding spontaneous speech: word accuracy vs. concept accuracy.

- Wolfgang Minker, Samir Bennacef, Jean-Luc Gauvain:
A stochastic case frame approach for natural language understanding.

- Frank Seide, Bernhard Rueber, Andreas Kellner:
Improving speech understanding by incorporating database constraints and dialogue history.

- Finn Dag Buø, Alex Waibel:
Learning to parse spontaneous speech.

- Jean-Yves Antoine:
Spontaneous speech and natural language processing ALPES: a robust semantic-led parser.

- J. Alvarez-Cercadillo, F. Javier Caminero-Gil, C. Crespo-Casas, Daniel Tapias Merino:
The natural language processing module for a voice assisted operator at Telefónica I+D.

- André Berton, Pablo Fetter, Peter Regel-Brietzmann:
Compound words in large-vocabulary German speech recognition systems.

- Anton Batliner, Anke Feldhaus, Stefan Geißler, Tibor Kiss, Ralf Kompe, Elmar Nöth:
Prosody, empty categories and parsing - a success story.

- B. Srinivas:
"almost parsing" technique for language modeling.

Spoken Discourse Analysis/Synthesis
Acoustic Modeling
- Christian-Michael Westendorf, Jens Jelitto:
Learning pronunciation dictionary from speech data.

- C. Rathinavelu, Li Deng:
The trended HMM with discriminative training for phonetic classification.

- Ariane Lazaridès, Yves Normandin, Roland Kuhn:
Improving decision trees for acoustic modeling.

- Gongjun Li, Taiyi Huang:
An improved training algorithm in HMM-based speech recognition.

- Ji Ming, Peter O'Boyle, John G. McMahon, F. Jack Smith:
Speech recognition using a strong correlation assumption for the instantaneous spectra.

- Pau Pachès-Leal, Climent Nadeu:
On parameter filtering in continuous subword-unit-based speech recognition.

- Shigeki Okawa, Katsuhiko Shirai:
Estimation of statistical phoneme center considering phonemic environments.

- Xue Wang, Louis ten Bosch, Louis C. W. Pols:
Integration of context-dependent durational knowledge into HMM-based speech recognition.

- Toshiaki Fukada, Michiel Bacchiani, Kuldip K. Paliwal, Yoshinori Sagisaka:
Speech recognition based on acoustically derived segment units.

- Rivarol Vergin, Azarshid Farhat, Douglas D. O'Shaughnessy:
Robust gender-dependent acoustic-phonetic modelling in continuous speech recognition based on a new automatic male/female classification.

- Tae-Young Yang, Won-Ho Shin, Weon-Goo Kim, Dae Hee Youn:
A codebook adaptation algorithm for SCHMM using formant distribution.

- Jacques Simonin, S. Bodin, Denis Jouvet, Katarina Bartkova:
Parameter tying for flexible speech recognition.

- Tsuneo Nitta, Shin'ichi Tanaka, Yasuyuki Masai, Hiroshi Matsuura:
Word-spotting based on inter-word and intra-word diphone models.

- Antonio Bonafonte, Josep Vidal, Albino Nogueiras:
Duration modeling with expanded HMM applied to speech recognition.

- Ricardo de Córdoba, José Manuel Pardo:
Different strategies for distribution clustering using discrete, semicontinuous and continuous HMMs in CSR.

- Ilija Zeljkovic, Shrikanth Narayanan:
Improved HMM phone and triphone models for realtime ASR telephony applications.

- Yasuhiro Minami, Sadaoki Furui:
Improved extended HMM composition by incorporating power variance.

- Gordon Ramsay, Li Deng:
Optimal filtering and smoothing for speech recognition using a stochastic target model.

- Zhihong Hu, Johan Schalkwyk, Etienne Barnard, Ronald A. Cole:
Speech recognition using syllable-like units.

- Jean-Claude Junqua, Lorenzo Vassallo:
Context modeling and clustering in continuous speech recognition.

- Li Deng, Jim Jian-Xiong Wu:
Hierarchical partition of the articulatory state space for overlapping-feature based speech recognition.

- Olivier Oppizzi, David Fournier, Philippe Gilles, Henri Meloni:
A fuzzy acoustic-phonetic decoder for speech recognition.

- Katrin Kirchhoff:
Syllable-level desynchronisation of phonetic features for speech recognition.

- James R. Glass, Jane W. Chang, Michael K. McCandless:
A probabilistic framework for feature-based speech recognition.

- Jim Jian-Xiong Wu, Li Deng, Jacky Chan:
Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese.

Physics and Simulation of the Vocal Tract
Duration and Rhythm
Acoustic Analysis
- Goangshiuan S. Ying, Leah H. Jamieson, Carl D. Mitchell:
A probabilistic approach to AMDF pitch detection.

- Alain Soquet, Véronique Lecuit, Thierry Metens, Didier Demolin:
From sagittal cut to area function: an RMI investigation.

- Léonard Janer, Juan José Bonet, Eduardo Lleida-Solano:
Pitch detection and voiced/unvoiced decision algorithm based on wavelet transforms.

- Yannis Stylianou:
Decomposition of speech signals into a deterministic and a stochastic part.

- Cheol-Woo Jo, Ho-Gyun Bang, William A. Ainsworth:
Improved glottal closure instant detector based on linear prediction and standard pitch concept.

- Xihong Wang, Stephen A. Zahorian, Stefan Auberg:
Analysis of speech segments using variable spectral/temporal resolution.

- Brian Eberman, William Goldenthal:
Time-based clustering for phonetic segmentation.

- Parham Zolfaghari, Tony Robinson:
Formant analysis using mixtures of Gaussians.

- Hywel B. Richards, John S. Mason, Melvyn J. Hunt, John S. Bridle:
Deriving articulatory representations from speech with various excitation modes.

- Manish Sharma, Richard J. Mammone:
"blind" speech segmentation: automatic segmentation of speech without linguistic knowledge.

- Hiroshi Ohmura, Kazuyo Tanaka:
Speech synthesis using a nonlinear energy damping model for the vocal folds vibration effect.

- Munehiro Namba, Hiroyuki Kamata, Yoshihisa Ishida:
Neural networks learning with L1 criteria and its efficiency in linear prediction of speech signals.

- Anna Esposito, Eugène C. Ezin, M. Ceccarelli:
Preprocessing and neural classification of English stop consonants [b, d, g, p, t, k].

- K. S. Ananthakrishnan:
A comparison of modified k-means (MKM) and NN based real time adaptive clustering algorithms for articulatory space codebook formation.

- Wen Ding, Hideki Kasuya:
A novel approach to the estimation of voice source and vocal tract parameters from speech signals.

- Hartmut R. Pfitzinger, Susanne Burger, Sebastian Heid:
Syllable detection in read and spontaneous speech.

- Kuansan Wang, Chin-Hui Lee, Biing-Hwang Juang:
Maximum likelihood learning of auditory feature maps for stationary vowels.

- Antonio Bonafonte, Albino Nogueiras, Antonio Rodriguez-Garrido:
Explicit segmentation of speech using Gaussian models.

- E. Mousset, William A. Ainsworth, José A. R. Fonollosa:
A comparison of several recent methods of fundamental frequency and voicing decision estimation.

- Toshihiko Abe, Takao Kobayashi, Satoshi Imai:
Robust pitch estimation with harmonics enhancement in noisy environments based on instantaneous frequency.

- Asunción Moreno, Miquel Rutllán:
Integrated polispectrum on speech recognition.

Speech Recognition Using HMMs and NNs
- Joao P. Neto, Ciro Martins, Luís B. Almeida:
An incremental speaker-adaptation technique for hybrid HMM-MLP recognizer.

- Youngjoo Suh, Youngjik Lee:
Phoneme segmentation of continuous speech using multi-layer perceptron.

- Jeff Bilmes, Nelson Morgan, Su-Lin Wu, Hervé Bourlard:
Stochastic perceptual speech models with durational dependence.

- G. D. Cook, Anthony J. Robinson:
Boosting the performance of connectionist large vocabulary speech recognition.

- Nicolas Pican, Dominique Fohr, Jean-François Mari:
HMMs and OWE neural network for continuous speech recognition.

- Steve Waterhouse, Dan J. Kershaw, Tony Robinson:
Smoothed local adaptation of connectionist systems.

Adverse Environments and Multiple Microphones
- Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano:
Robust speech recognition with speaker localization by a microphone array.

- Ea-Ee Jan, James L. Flanagan:
Sound source localization in reverberant environments using an outlier elimination algorithm.

- Dan J. Kershaw, Tony Robinson, Steve Renals:
The 1995 ABBOT LVCSR system for multiple unknown microphones.

- Diego Giuliani, Maurizio Omologo, Piergiorgio Svaizer:
Experiments of speech recognition in a noisy and reverberant environment using a microphone array and HMM.

- Joaquin Gonzalez-Rodriguez, Javier Ortega-Garcia, César Martin, Luis Hernández:
Increasing robustness in GMM speaker recognition systems for noisy and reverberant speech with low complexity microphone arrays.

- Kuan-Chieh Yen, Yunxin Zhao:
Robust automatic speech recognition using a multi-channel signal separation front-end.

Prosodic Synthesis in Dialogue
Speech Synthesis
- Richard Sproat:
Multilingual text analysis for text-to-speech synthesis.

- Yoshifumi Ooyama, Hisako Asano, Koji Matsuoka:
Spoken-style explanation generator for Japanese kanji using a text-to-speech system.

- Ken-ichi Magata, Tomoki Hamagami, Mitsuo Komura:
A method for estimating prosodic symbol from text for Japanese text-to-speech synthesis.

- Eduardo López Gonzalo, Jose M. Rodriguez-Garcia:
Statistical methods in data-driven modeling of Spanish prosody for text to speech.

- Jung-Chul Lee, Youngjik Lee, Sanghun Kim, Minsoo Hahn:
Intonation processing for TTS using stylization and neural network learning method.

- Alan W. Black, Andrew Hunt:
Generating F0 contours from ToBI labels using linear regression.

- Wern-Jun Wang, Shaw-Hwa Hwang, Sin-Horng Chen:
The broad study of homograph disambiguity for Mandarin speech synthesis.

- Thierry Dutoit, Vincent Pagel, Nicolas Pierret, F. Bataille, Olivier van der Vrecken:
The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes.

- Makoto Hashimoto, Norio Higuchi:
Training data selection for voice conversion using speaker selection and vector field smoothing.

- Ki-Seung Lee, Dae Hee Youn, Il-Whan Cha:
A new voice transformation method based on both linear and nonlinear prediction analysis.

- Geneviève Baudoin, Yannis Stylianou:
On the transformation of the speech spectrum for voice conversion.

- Cristina Delogu, Andrea Paoloni, Susanna Ragazzini, Paola Ridolfi:
Spectral analysis of synthetic speech and natural speech with noise over the telephone line.

- Weizhong Zhu, Hideki Kasuya:
A new speech synthesis system based on the ARX speech production model.

- Geraldo Lino de Campos, Evandro B. Gouvêa:
Speech synthesis using the CELP algorithm.

- Shaw-Hwa Hwang, Sin-Horng Chen, Yih-Ru Wang:
A Mandarin text-to-speech system.

- Mike D. Edgington, A. Lowry:
Residual-based speech modification algorithms for text-to-speech synthesis.

- Per Olav Heggtveit:
A generalized LR parser for text-to-speech synthesis.

- M. P. Pollard, Barry M. G. Cheetham, C. C. Goodyear, Mike D. Edgington, A. Lowry:
Enhanced shape-invariant pitch and time-scale modification for concatenative speech synthesis.

- Yasuhiko Arai, Ryo Mochizuki, Hirofumi Nishimura, Takashi Honda:
An excitation synchronous pitch waveform extraction method and its application to the VCV-concatenation synthesis of Japanese spoken words.

- Ren-Hua Wang, Qingfeng Liu, Difei Tang:
A new Chinese text-to-speech system with high naturalness.

- Ansgar Rinscheid:
Voice conversion based on topological feature maps and time-variant filtering.

Instructional Technology for Spoken Language
Multimodal Spoken Language Processing
- Lynne E. Bernstein, Christian Benoît:
For speech perception by humans or machines, three senses are better than one.

- Kaoru Sekiyama, Yoh'ichi Tohkura, Michio Umeda:
A few factors which affect the degree of incorporating lip-read information into speech perception.

- Eric Vatikiotis-Bateson, Kevin G. Munhall, Y. Kasahara, Frederique Garcia, Hani Yehia:
Characterizing audiovisual information during speech.

- Charlotte M. Reed:
The implications of the Tadoma method of speechreading for spoken language processing.

- Ruth Campbell:
Seeing speech in space and time: psychological and neurological findings.

- Kerry P. Green:
Studies of the McGurk effect: implications for theories of speech perception.

- N. M. Brooke:
Using the visual component in automatic speech recognition.

- Robert E. Remez:
Perceptual organization of speech in one and several modalities: common functions, common resources.

- David B. Pisoni, Helena M. Saldaña, Sonya M. Sheffert:
Multi-modal encoding of speech in memory: a first report.

Prosody - Phonological/Phonetic Measures
Phonetics and Perception
Language Acquisition
- Jean E. Andruski, Patricia K. Kuhl:
The acoustic structure of vowels in mothers' speech to infants and adults.

- Chris J. Clement, Florien J. Koopmans-van Beinum, Louis C. W. Pols:
Acoustical characteristics of sound production of deaf and normally hearing infants.

- John Kingston, Christine Bartels, José Benkí, Deanna Moore, Jeremy Rice, Rachel Thorburn, Neil Macmillan:
Learning non-native vowel categories.

- Pierre A. Hallé, Toshisada Deguchi, Yuji Tamekawa, Benedicte de Boysson-Bardies, Shigeru Kiritani:
Word recognition by Japanese infants.

- Peter W. Jusczyk:
Investigations of the word segmentation abilities of infants.

- Akiko Hayashi, Yuji Tamekawa, Toshisada Deguchi, Shigeru Kiritani:
Developmental change in perception of clause boundaries by 6- and 10-month-old Japanese infants.

Production and Prosody Posters
- Paavo Alku, Erkki Vilkman:
A frequency domain method for parametrization of the voice source.

- Krzysztof Marasek:
Glottal correlates of the word stress and the tense/lax opposition in German.

- Suzanne Boyce, Carol Y. Espy-Wilson:
Coarticulatory stability in American English /r/.

- Shinobu Masaki, Reiko Akahane-Yamada, Mark K. Tiede, Yasuhiro Shimada, Ichiro Fujimoto:
An MRI-based analysis of the English /r/ and /l/ articulations.

- David van Kuijk:
Does lexical stress or metrical stress better predict word boundaries in Dutch?

- Alan Wrench, Alan D. McIntosh, William J. Hardcastle:
Optopalatograph (OPG): a new apparatus for speech production analysis.

- René Carré:
Prediction of vowel systems using a deductive approach.

- Sheila J. Mair, Celia Scully, Christine H. Shadle:
Distinctions between [t] and [tch] using electropalatography data.

- Michiko Hashi, Raymond D. Kent, John R. Westbury, Mary J. Lindstrom:
Relating formants and articulation in intelligibility test words.

- Imad Znagui, Mohamed Yeou:
The role of coarticulation in the perception of vowel quality in modern standard Arabic.

- Simon Arnfield, Wilf Jones:
Updating the Reading EPG.

- Goangshiuan S. Ying, Leah H. Jamieson, Ruxin Chen, Carl D. Mitchell:
Lexical stress detection on stress-minimal word pairs.

- Jing Wang:
An acoustic study of the interaction between stressed and unstressed syllables in spoken Mandarin.

- Nobuaki Minematsu, Seiichi Nakagawa:
Automatic detection of accent nuclei at the head of words for speech recognition.

- Fu-Chiang Chou, Chiu-yu Tseng, Lin-Shan Lee:
Automatic generation of prosodic structure for high quality Mandarin speech synthesis.

- Tomoki Hamagami, Ken-ichi Magata, Mitsuo Komura:
A study on Japanese prosodic pattern and its modeling in restricted speech.

- Steve Hoskins:
A phonetic study of focus in intransitive verb sentences.

- Stefan Rapp:
Goethe for prosody.

- K. A. Straub:
Prosodic cues in syntactically ambiguous strings; an interactive speech planning mechanism.

- Jinfu Ni, Ren-Hua Wang, Deyu Xia:
A functional model for generation of the local components of F0 contours in Chinese.

- Marie Fellbaum:
The acquisition of voiceless stops in the interlanguage of second language learners of English and Spanish.

User-Machine Interfaces
- Brian Mellor, Chris Baber, C. Tunley:
Evaluating automatic speech recognition as a component of a multi-input device human-computer interface.

- A. Life, I. Salter, Jean-Noel Temem, F. Bernard, Sophie Rosset, Samir Bennacef, Lori Lamel:
Data collection for the MASK kiosk: WOz vs prototype system.

- M. Karaorman, Ted H. Applebaum, T. Itoh, M. Endo, Y. Ohno, M. Hoshimi, Takahiro Kamai, Kenji Matsui, Kazue Hata, Steve Pearson, Jean-Claude Junqua:
An experimental Japanese/English interpreting video phone system.

- Sara Basson, Stephen Springer, Cynthia Fong, Hong C. Leung, Edward Man, Michele Olson, John F. Pitrelli, Ranvir Singh, Suk Wong:
User participation and compliance in speech automated telecommunications applications.

- Samuel Bayer:
Embedding speech in web interfaces.

- Toshihiro Isobe, Masatoshi Morishima, Fuminori Yoshitani, Nobuo Koizumi, Ken'ya Murakami:
Voice-activated home banking system and its field trial.

TTS Systems and Rules
- Sangho Lee, Yung-Hwan Oh:
A text analyzer for Korean text-to-speech systems.

- Helen E. Karn:
Design and evaluation of a phonological phrase parser for Spanish text-to-speech.

- Ove Andersen, Roland Kuhn, Ariane Lazaridès, Paul Dalsgaard, Jürgen Haas, Elmar Nöth:
Comparison of two tree-structured approaches for grapheme-to-phoneme conversion.

- M. J. Adamson, Robert I. Damper:
A recurrent network that learns to pronounce English text.

- Eleonora Cavalcante Albano, Agnaldo Antonio Moreira:
Archisegment-based letter-to-phone conversion for concatenative speech synthesis in Portuguese.

- Yuki Yoshida, Shin'ya Nakajima, Kazuo Hakoda, Tomohisa Hirokawa:
A new method of generating speech synthesis units based on phonological knowledge and clustering technique.

Prosody and Labeling
- Martine Grice, Matthias Reyelt, Ralf Benzmüller, Jörg Mayer, Anton Batliner:
Consistency in transcription and labelling of German intonation with GToBI.

- Anton Batliner, Ralf Kompe, Andreas Kießling, Heinrich Niemann, Elmar Nöth:
Syntactic-prosodic labeling of large spontaneous speech data-bases.

- Florien J. Koopmans-van Beinum, Monique E. van Donzel:
Relationship between discourse structure and dynamic speech rate.

- Nigel Ward:
Using prosodic clues to decide when to produce back-channel utterances.

- Marion Mast, Ralf Kompe, Stefan Harbeck, Andreas Kießling, Heinrich Niemann, Elmar Nöth, Ernst Günter Schukat-Talamazzini, Volker Warnke:
Dialog act classification with the help of prosody.

- David van Kuijk, Henk van den Heuvel, Lou Boves:
Using lexical stress in continuous speech recognition for Dutch.

Speaker/Language Identification and Verification
- Karsten Kumpf, Robin W. King:
Automatic accent classification of foreign accented Australian English speech.

- Filipp Korkmazskiy, Biing-Hwang Juang:
Discriminative adaptation for speaker verification.

- Verna Stockmal, D. Muljani, Zinny S. Bond:
Perceptual features of unknown foreign languages as revealed by multi-dimensional scaling.

- Kin Yu, John S. Mason:
On-line incremental adaptation for speaker verification using maximum likelihood estimates of CDHMM parameters.

- Dominique Genoud, Frédéric Bimbot, Guillaume Gravier, Gérard Chollet:
Combining methods to improve speaker verification decision.

- Cesar Martín del Alamo, J. Alvarez, C. de la Torre, F. J. Poyatos, Lis Hernández:
Incremental speaker adaptation with minimum error discriminative training for speaker identification.

- Konstantin P. Markov, Seiichi Nakagawa:
Frame level likelihood normalization for text-independent speaker identification using Gaussian mixture models.

- Ann E. Thymé-Gobbel, Sandra E. Hutchins:
On using prosodic cues in automatic language identification.

- Tadashi Kitamura, Shinsai Takei:
Speaker recognition model using two-dimensional mel-cepstrum and predictive neural network.

- HingKeung Kwan, Keikichi Hirose:
Unknown language rejection in language identification system.

- James Hieronymus, Shubha Kadambe:
Spoken language identification using large vocabulary speech recognition.

- Carlos Teixeira, Isabel Trancoso, António Joaquim Serralheiro:
Accent identification.

- Sarel van Vuuren:
Comparison of text-independent speaker recognition methods on telephone speech with acoustic mismatch.

- Xue Yang, J. Bruce Millar, Iain MacLeod:
On the sources of inter- and intra-speaker variability in the acoustic dynamics of speech.

- Kay M. Berkling, Etienne Barnard:
Language identification with inaccurate string matching.

- Michael J. Carey, Eluned S. Parris, Harvey Lloyd-Thomas, S. J. Bennett:
Robust prosodic features for speaker identification.

- Enric Monte, Javier Hernando Pericas, Xavier Miró, A. Adolf:
Text independent speaker identification on noisy environments by means of self organizing maps.

- Paul Dalsgaard, Ove Andersen, Hanne Hesselager, Bojan Petek:
Language identification using language-dependent phonemes and language-independent speech units.

Emotion in Recognition and Synthesis
Stochastic Techniques in Robust Speech Recognition
- Chin-Hui Lee, Biing-Hwang Juang, Wu Chou, J. J. Molina-Perez:
A study on task-independent subword selection and modeling for speech recognition.

- Mazin G. Rahim, Chin-Hui Lee:
Simultaneous ANN feature and HMM recognizer design using string-based minimum classification error (MCE) training.

- Sunil K. Gupta, Frank K. Soong, Raziel Haimi-Cohen:
Quantizing mixture-weights in a tied-mixture HMM.

- M. J. F. Gales, D. Pye, Philip C. Woodland:
Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation.

- Arun C. Surendran, Chin-Hui Lee, Mazin G. Rahim:
Maximum-likelihood stochastic matching approach to non-linear equalization for robust speech recognition.

- Jen-Tzung Chien, Hsiao-Chuan Wang, Lee-Min Lee:
Estimation of channel bias for telephone speech recognition.

Prosodic Synthesis in Text to Speech
Dialogue Events
Databases and Tools
- Peter Roach, Simon Arnfield, William J. Barry, J. Baltova, Marian Boldea, Adrian Fourcin, W. Gonet, Ryszard Gubrynowicz, E. Hallum, Lori Lamel, Krzysztof Marasek, Alain Marchal, Einar Meister, Klára Vicsi:
BABEL: an Eastern European multi-language database.

- Ren-Hua Wang, Deyu Xia, Jinfu Ni, Bicheng Liu:
USTC95 - a Putonghua corpus.

- Edward Hurley, Joseph Polifroni, James R. Glass:
Telephone data collection using the World Wide Web.

- M. Falcone, A. Gallo:
The "SIVA" speech database for speaker verification: description and evaluation.

- Christoph Draxler:
A multi-level description of date expressions in German telephone speech.

- Robert H. Halstead Jr., Ben Serridge, Jean-Manuel Van Thong, William Goldenthal:
Viterbi search visualization using Vista: a generic performance visualization tool.

- Toomas Altosaar, Matti Karjalainen, Martti Vainio:
A multilingual phonetic representation and analysis system for different speech databases.

- Detlev Langmann, Reinhold Haeb-Umbach, Lou Boves, Els den Os:
FRESCO: the French telephone speech data collection - part of the European SpeechDat(M) project.

- Johannes Müller, Holger Stahl, Manfred Lang:
Predicting the out-of-vocabulary rate and the required vocabulary size for speech processing applications.

- Nathalie Parlangeau, Alain Marchal:
AMULET: automatic MUltisensor speech labelling and event tracking: study of the spatio-temporal correlations in voiceless plosive production.

- Minsoo Hahn, Sanghun Kim, Jung-Chul Lee, Yong-Ju Lee:
Constructing multi-level speech database for spontaneous speech processing.

- Marian Boldea, Alin Doroga, Tiberiu Dumitrescu, Maria Pescaru:
Preliminaries to a Romanian speech database.

- Klaus J. Kohler:
Labelled data bank of spoken standard German: the Kiel corpus of read/spontaneous speech.

- I. Lee Hetherington, Michael K. McCandless:
SAPPHIRE: an extensible speech analysis and recognition tool based on Tcl/Tk.

- Jiro Kiyama, Yoshiaki Itoh, Ryuichi Oka:
Automatic detection of topic boundaries and keywords in arbitrary speech using incremental reference interval-free continuous DP.

- Bo-Ren Bai, Lee-Feng Chien, Lin-Shan Lee:
Very-large-vocabulary Mandarin voice message file retrieval using speech queries.

- Håkan Melin:
Gandalf - a Swedish telephone speaker verification database.

- Ellen Gurman Bard, Catherine Sotillo, Anne H. Anderson, M. M. Taylor:
The DCIEM map task corpus: spontaneous dialogue under sleep deprivation and drug treatment.

- Xavier Menéndez-Pidal, James B. Polikoff, Shirley M. Peters, Jennie E. Leonzio, H. Timothy Bunnell:
The Nemours database of dysarthric speech.

- Jean Hennebert, Dijana Petrovska-Delacrétaz:
POST: parallel object-oriented speech toolkit.

Robust Speech Processing
Dialects and Speaking Styles
Production and Perception of Prosody
Topics in ASR and Search
- Joerg P. Ueberla, I. R. Gransden:
Clustered language models with context-equivalent states.

- Yuji Yonezawa, Masato Akagi:
Modeling of contextual effects and its application to word spotting.

- Jochen Junkawitsch, L. Neubauer, Harald Höge, Günther Ruske:
A new keyword spotting algorithm with pre-calculated optimal thresholds.

- Roxane Lacouture, Yves Normandin:
Detection of ambiguous portions of signal corresponding to OOV words or misrecognized portions of input.

- Fabio Brugnara, Marcello Federico:
Techniques for approximating a trigram language model.

- Keizaburo Takagi, Koichi Shinoda, Hiroaki Hattori, Takao Watanabe:
Unsupervised and incremental speaker adaptation under adverse environmental conditions.

- Hugo Van hamme, Filip Van Aelten:
An adaptive-beam pruning technique for continuous speech recognition.

- Carlos Avendaño, Sarel van Vuuren, Hynek Hermansky:
Data based filter design for RASTA-like channel normalization in ASR.

- Stefan Ortmanns, Hermann Ney, Frank Seide, I. Lindam:
A comparison of time conditioned and word conditioned search techniques for large vocabulary speech recognition.

- Stefan Ortmanns, Hermann Ney, A. Eiden:
Language-model look-ahead for large vocabulary speech recognition.

- Jean-Luc Husson, Yves Laprie:
A new search algorithm in segmentation lattices of speech signals.

- Tomokazu Yamada, Shigeki Sagayama:
LR-parser-driven Viterbi search with hypotheses merging mechanism using context-dependent phone models.

- Jan Nouza:
Discrete-utterance recognition with a fast match based on total data reduction.

- F. Javier Caminero-Gil, C. de la Torre, Luis Villarrubia, Cesar Martín del Alamo, Lis Hernández:
On-line garbage modeling with discriminant analysis for utterance verification.

- Paul Placeway, John D. Lafferty:
Cheating with imperfect transcripts.

- Naoto Iwahashi:
Novel training method for classifiers used in speaker adaptation.

- Katsuki Minamino:
Large vocabulary word recognition based on a graph-structured dictionary.

- Bach-Hiep Tran, Frank Seide, Volker Steinbiss:
A word graph based n-best search in continuous speech recognition.

- David M. Goblirsch:
Viterbi beam search with layered bigrams.

- Eric Buhrke, Wu Chou, Qiru Zhou:
A wave decoder for continuous speech recognition.

- Eric Thelen:
Long term on-line speaker adaptation for large vocabulary dictation.

- Gerhard Sagerer, Heike Rautenstrauch, Gernot A. Fink, Bernd Hildebrandt, A. Jusek, Franz Kummert:
Incremental generation of word graphs.

- Irina Illina, Yifan Gong:
Improvement in n-best search for continuous speech recognition.

- Antonio Bonafonte, José B. Mariño, Albino Nogueiras:
Sethos: the UPC speech understanding system.

- Pietro Laface, Luciano Fissore, A. Maro, Franco Ravera:
Segmental search for continuous speech recognition.

Multimodal Dialogue/HCI
- A. P. Breen, E. Bowers, W. Welsh:
An investigation into the generation of mouth shapes for a talking head.

- Bertrand Le Goff, Christian Benoît:
A text-to-audiovisual-speech synthesizer for French.

- Yuri Iwano, Shioya Kageyama, Emi Morikawa, Shu Nakazato, Katsuhiko Shirai:
Analysis of head movements and its role in spoken dialogue.

- Satoru Hayamizu, Osamu Hasegawa, Katunobu Itou, Katsuhiko Sakaue, Kazuyo Tanaka, Shigeki Nagaya, Masayuki Nakazawa, T. Endoh, Fumio Togawa, Kenji Sakamoto, Kazuhiko Yamamoto:
RWC multimodal database for interactions by integration of spoken language and visual information.

- Christian Cavé, Isabelle Guaïtella, Roxane Bertrand, Serge Santi, Françoise Harlay, Robert Espesser:
About the relationship between eyebrow movements and F0 variations.

- Laurel Fais, Kyung-ho Loken-Kim, Tsuyoshi Morimoto:
How many words is a picture really worth?

- A. Lagana, F. Lavagetto, A. Storace:
Visual synthesis of source acoustic speech through Kohonen neural networks.

- Helena M. Saldaña, David B. Pisoni, Jennifer M. Fellowes, Robert E. Remez:
Audio-visual speech perception without speech cues.

Multilingual Speech Processing
- Jim Barnett, Andrés Corrada, G. Gao, Larry Gillick, Yoshiko Ito, Steve Lowe, Linda Manganaro, Barbara Peskin:
Multilingual speech recognition at Dragon Systems.

- Joachim Köhler:
Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds.

- Atsushi Nakamura, Shoichi Matsunaga, Tohru Shimizu, Masahiro Tonomura, Yoshinori Sagisaka:
Japanese speech databases for robust speech recognition.

- Lori Lamel, Martine Adda-Decker, Jean-Luc Gauvain, Gilles Adda:
Spoken language processing in a multilingual context.

- Victor Zue, Stephanie Seneff, Joseph Polifroni, Helen M. Meng, James R. Glass:
Multilingual human-computer interactions: from information access to language learning.

- Ulla Ackermann, Bianca Angelini, Fabio Brugnara, Marcello Federico, Diego Giuliani, Roberto Gretter, Gianni Lazzari, Heinrich Niemann:
Speedata: multilingual spoken data entry.

- Hiyan Alshawi:
Head automata for speech translation.

- Ye-Yi Wang, John D. Lafferty, Alex Waibel:
Word clustering with parallel spoken language corpora.

- Jae-Woo Yang, Youngjik Lee:
Toward translating Korean speech into other languages.

- Thomas Bub, Johannes Schwinn:
VERBMOBIL: the evolution of a complex large speech-to-speech translation system.

- Alon Lavie, Alex Waibel, Lori S. Levin, Donna Gates, Marsal Gavaldà, Torsten Zeppenfeld, Puming Zhan, Oren Glickman:
Translation of conversational speech with JANUS-II.

Acoustics in Synthesis
Pitch and Rate
General ASR Posters
- Puming Zhan, Klaus Ries, Marsal Gavaldà, Donna Gates, Alon Lavie, Alex Waibel:
JANUS-II: towards spontaneous Spanish speech recognition.

- Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle:
Reduced semi-continuous models for large vocabulary continuous speech recognition in Dutch.

- Andrei Constantinescu, Olivier Bornet, Gilles Caloz, Gérard Chollet:
Validating different flexible vocabulary approaches on the Swiss French Polyphone and Polyvar databases.

- Néstor Becerra Yoma, Fergus R. McInnes, Mervyn A. Jack:
Use of a reliability coefficient in noise cancelling by neural net and weighted matching algorithms.

- Kazuhiko Ozeki:
Likelihood normalization using an ergodic HMM for continuous speech recognition.

- Laurence Candille, Henri Meloni:
Dynamic control of a production model.

- Hiroaki Hattori, Eiko Yamada:
Speech recognition using sub-word units dependent on phonetic contexts of both training and recognition vocabularies.

- Bruno Jacob, Christine Sénac:
Hidden Markov models merging acoustic and articulatory information to automatic speech recognition.

- Mats Blomberg, Kjell Elenius:
Creation of unseen triphones from diphones and monophones using a speech production approach.

- Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang:
Speaker-independent dictation of Chinese speech with 32k vocabulary.

- J. J. Humphries, Philip C. Woodland, D. Pearce:
Using accent-specific pronunciation modelling for robust speech recognition.

- Tilo Sloboda, Alex Waibel:
Dictionary learning for spontaneous speech recognition.

- Johan de Veth, Lou Boves:
Comparison of channel normalisation techniques for automatic speech recognition over the phone.

- Manuel A. Leandro, José Manuel Pardo:
Anchor point detection for continuous speech recognition in Spanish: the spotting of phonetic events.

- Bhiksha Raj, Evandro Bacci Gouvêa, Pedro J. Moreno, Richard M. Stern:
Cepstral compensation by polynomial approximation for environment-independent speech recognition.

- B. T. Lilly, Kuldip K. Paliwal:
Effect of speech coders on speech recognition performance.

- Léonard Janer, Josep Martí, Climent Nadeu, Eduardo Lleida-Solano:
Wavelet transforms for non-uniform speech recognition systems.

- Tsuyoshi Usagawa, Markus Bodden, Klaus Rateitschek:
A binaural model as a front-end for isolated word recognition.

- Hiroshi G. Okuno, Tomohiro Nakatani, Takeshi Kawabata:
A new speech enhancement: speech stream segregation.

Data-based Synthesis
- Andrew Slater, John Coleman:
Non-segmental analysis and synthesis based on a speech database.

- Ralf Benzmüller, William J. Barry:
Microsegment synthesis - economic principles in a low-cost solution.

- X. D. Huang, Alex Acero, J. Adcock, Hsiao-Wuen Hon, J. Goldsmith, J. Liu, Mike Plumpe:
Whistler: a trainable text-to-speech system.

- Thomas Portele, Karlheinz Stöber, Horst Meyer, Wolfgang Hess:
Generation of multiple synthesis inventories by a bootstrapping procedure.

- Bernd Möbius, Jan P. H. van Santen:
Modeling segmental duration in German text-to-speech synthesis.

- Nick Campbell:
Autolabelling Japanese ToBI.

Speaker Identification and Verification
- S. Parthasarathy, Aaron E. Rosenberg:
General phrase speaker verification using sub-word background models and likelihood-ratio scoring.

- Jin'ichi Murakami, Masahide Sugiyama, Hideyuki Watanabe:
Unknown-multiple signal source clustering problem using ergodic HMM and applied to speaker classification.

- Jean-Luc Le Floch, Claude Montacié, Marie-José Caraty:
GMM and ARVM cooperation and competition for text-independent speaker recognition on telephone speech.

- Qiguang Lin, Ea-Ee Jan, ChiWei Che, Dong-Suk Yuk, James L. Flanagan:
Selective use of the speech spectrum and a VQGMM method for speaker identification.

- Michael Newman, Larry Gillick, Yoshiko Ito, Don McAllaster, Barbara Peskin:
Speaker verification through large vocabulary continuous speech recognition.

- Andrea Paoloni, Susanna Ragazzini, Giacomo Ravaioli:
Predictive neural networks in text independent speaker verification: an evaluation on the SIVA database.

Acoustic Phonetics
Perception of Vowels and Consonants
- Jialu Zhang:
On the syllable structures of Chinese relating to speech recognition.

- Takashi Otake, Kiyoko Yoneyama:
Can a moraic nasal occur word-initially in Japanese?

- Winifred Strange, Reiko Akahane-Yamada, B. H. Fitzgerald, R. Kubo:
Perceptual assimilation of American English vowels by Japanese listeners.

- Winifred Strange, Ocke-Schwen Bohn, S. A. Trent, M. C. McNair, K. C. Bielec:
Context and speaker effects in the perceptual assimilation of German vowels by American listeners.

- Mohamed Zahid:
Examination of a perceptual non-native speech contrast: pharyngealized/non-pharyngealized discrimination by French-speaking adults.

- Roel Smits:
Context-dependent relevance of burst and transitions for perceived place in stops: it's in production, not perception.

- Ryoji Baba, Kaori Omuro, Hiromitsu Miyazono, Tsuyoshi Usagawa, Masahiko Higuchi:
The perception of morae in long vowels: comparison among Japanese, Korean and English speakers.

- Robin J. Lickley:
Juncture cues to disfluency.

- James R. Sawusch:
Effects of duration and formant movement on vowel perception.

- Neeraj Deshmukh, Richard Duncan, Aravind Ganapathiraju, Joseph Picone:
Benchmarking human performance for continuous speech recognition.

- Takayuki Arai, Misha Pavel, Hynek Hermansky, Carlos Avendaño:
Intelligibility of speech with filtered time trajectories of spectral envelopes.

- Douglas H. Whalen, Sonya M. Sheffert:
Perceptual use of vowel and speaker information in breath sounds.

- Philippe Mousty, Monique Radeau, Ronald Peereman, Paul Bertelson:
The role of neighborhood relative frequency in spoken word recognition.

- James M. McQueen, Mark A. Pitt:
Transitional probability and phoneme monitoring.

- Anne Bonneau:
Identification of vowel features from French stop bursts.

- Zinny S. Bond, Thomas J. Moore, Beverley Gable:
Listening in a second language.

- Denis Burnham, Elizabeth Francis, Di Webster, Sudaporn Luksaneeyanawin, Chayada Attapaiboon, Francisco Lacerda, Peter Keller:
Perception of lexical tone across languages: evidence for a linguistic mode of processing.

- James S. Magnuson, Reiko Akahane-Yamada:
Acoustic correlates to the effects of talker variability on the perception of English /r/ and /l/ by Japanese listeners.
