Volume 20, Number 1, January 2012
- Helen Meng:
Farewell Editorial.
1

- Li Deng:
Inaugural Editorial: Riding the Tidal Wave of Human-Centric Information Processing - Innovate, Outreach, Collaborate, Connect, Expand, and Win.
2-3

- Dong Yu, Geoffrey E. Hinton, Nelson Morgan, Jen-Tzung Chien, Shigeki Sagayama:
Introduction to the Special Section on Deep Learning for Speech and Language Processing.
4-6

- Nelson Morgan:
Deep and Wide: Multiple Layers in Automatic Speech Recognition.
7-13

- Abdel-rahman Mohamed, George E. Dahl, Geoffrey E. Hinton:
Acoustic Modeling Using Deep Belief Networks.
14-22

- Garimella S. V. S. Sivaram, Hynek Hermansky:
Sparse Multilayer Perceptron for Phoneme Recognition.
23-29

- George E. Dahl, Dong Yu, Li Deng, Alex Acero:
Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition.
30-42

- George Saon, Jen-Tzung Chien:
Bayesian Sensing Hidden Markov Models.
43-54

- Jen-Tzung Chien, Chuang-Hua Chueh:
Topic-Based Hierarchical Segmentation.
55-66

- I. Yücel Özbek, Mark Hasegawa-Johnson, Mübeccel Demirekler:
On Improving Dynamic State Space Approaches to Articulatory Inversion With MAP-Based Parameter Estimation.
67-81

- Mark R. P. Thomas, Jon Gudnason, Patrick A. Naylor:
Estimation of Glottal Closing and Opening Instants in Voiced Speech Using the YAGA Algorithm.
82-91

- Jesper Jensen, Richard C. Hendriks:
Spectral Magnitude Minimum Mean-Square Error Estimation Using Binary and Continuous Gain Functions.
92-102

- Hen-Geul Yeh, Carlos Rangel Ruiz:
Fixed-Point Implementation of Cascaded Forward-Backward Adaptive Predictors.
103-107

- Tobias May, Steven van de Par, Armin Kohlrausch:
Noise-Robust Speaker Recognition Combining Missing Data Techniques and Universal Background Modeling.
108-121

- Alberto Carini, Stefania Cecchi, Francesco Piazza, Ivan Omiciuolo, Giovanni L. Sicuranza:
Multiple Position Room Response Equalization in Frequency Domain.
122-135

- Iman S. Mossavat, Petko N. Petkov, W. Bastiaan Kleijn, Oliver Amft:
A Hierarchical Bayesian Approach to Modeling Heterogeneity in Speech Quality Assessment.
136-146

- Thomas Ulrich Christiansen, Steven Greenberg:
Perceptual Confusions Among Consonants, Revisited - Cross-Spectral Integration of Phonetic-Feature Information and Consonant Recognition.
147-161

- Enzo De Sena, Hüseyin Hacihabiboglu, Zoran Cvetkovic:
On the Design and Implementation of Higher Order Differential Microphones.
162-174

- Ted S. Wada, Biing-Hwang Juang:
Enhancement of Residual Echo for Robust Acoustic Echo Cancellation.
175-189

- Adam M. Stark, Mark D. Plumbley:
Performance Following: Real-Time Prediction of Musical Sequences Without a Score.
190-199

- Matthias Mauch, Hiromasa Fujihara, Masataka Goto:
Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment.
200-210

- Berlin Chen, Shih-Hsiang Lin:
A Risk-Aware Modeling Framework for Speech Summarization.
211-222

- Richard C. Hendriks, Timo Gerkmann:
Noise Correlation Matrix Estimation for Multi-Microphone Speech Enhancement.
223-233

- Giovanni L. Sicuranza, Alberto Carini:
On the BIBO Stability Condition of Adaptive Recursive FLANN Filters With Application to Nonlinear Active Noise Control.
234-245

- Francesco Nesta, Maurizio Omologo:
Generalized State Coherence Transform for Multidimensional TDOA Estimation of Multiple Sources.
246-260

- Yasmín Montenegro M., José Carlos M. Bermudez:
Transient Mean-Square Analysis of Prediction Error Method-Based Adaptive Feedback Cancellation in Hearing Aids.
261-275

- Lei Xie, Lilei Zheng, Zihan Liu, Yanning Zhang:
Laplacian Eigenmaps for Automatic Story Segmentation of Broadcast News.
276-289

- Norberto Degara, Enrique Argones-Rúa, Antonio Pena, Soledad Torres-Guijarro, Matthew E. P. Davies, Mark D. Plumbley:
Reliability-Informed Beat Tracking of Musical Signals.
290-301

- Jen-Tzung Chien, Hsin-Lung Hsieh:
Convex Divergence ICA for Blind Source Separation.
302-313

- Han-Gil Moon:
A Low-Complexity Design for an MP3 Multi-Channel Audio Decoding System.
314-321

- Celia Shahnaz, Wei-Ping Zhu, M. Omair Ahmad:
Pitch Estimation Based on a Harmonic Sinusoidal Autocorrelation Model and a Time-Domain Matching Scheme.
322-335

- Claudio Garretón, Néstor Becerra Yoma:
Telephone Channel Compensation in Speaker Verification Using a Polynomial Approximation in the Log-Filter-Bank Energy Domain.
336-341

- Vishweshwara Rao, Pradeep Gaddipati, Preeti Rao:
Signal-Driven Window-Length Adaptation for Sinusoid Detection in Polyphonic Music.
342-348

Volume 20, Number 2, February 2012
- Xavier Anguera Miró, Simon Bozonnet, Nicholas W. D. Evans, Corinne Fredouille, Gerald Friedland, Oriol Vinyals:
Speaker Diarization: A Review of Recent Research.
356-370

- Gerald Friedland, Adam Janin, David Imseng, Xavier Anguera Miró, Luke R. Gottlieb, Marijn Huijbregts, Mary Tai Knox, Oriol Vinyals:
The ICSI RT-09 Speaker Diarization System.
371-381

- Nicholas W. D. Evans, Simon Bozonnet, Dong Wang, Corinne Fredouille, Raphaël Troncy:
A Comparative Study of Bottom-Up and Top-Down Approaches to Speaker Diarization.
382-392

- Marijn Huijbregts, David A. van Leeuwen, Chuck Wooters:
Speaker Diarization Error Analysis Using Oracle Components.
393-403

- Marijn Huijbregts, David A. van Leeuwen:
Large-Scale Speaker Diarization for Long Recordings and Small Collections.
404-413

- Oshry Ben-Harush, Itshak Lapidot, Hugo Guterman:
Initialization of Iterative-Based Speaker Diarization Systems for Telephone Conversations.
414-425

- José Manuel Pardo, Roberto Barra-Chicote, Rubén San Segundo, Ricardo de Córdoba, Beatriz Martínez-González:
Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation.
426-435

- Martin Zelenák, Carlos Segura, Jordi Luque, Javier Hernando:
Simultaneous Speech Detection With Spatial Features for Speaker Diarization.
436-446

- Katsuhiko Ishiguro, Takeshi Yamada, Shoko Araki, Tomohiro Nakatani, Hiroshi Sawada:
Probabilistic Speaker Diarization With Bag-of-Words Representations of Speaker Angle Information.
447-460

- Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li:
Speaker Clustering and Cluster Purification Methods for RT07 and RT09 Evaluation Meeting Data.
461-473

- Fernando Batista, Helena Moniz, Isabel Trancoso, Nuno J. Mamede:
Bilingual Experiments on Automatic Recovery of Capitalization and Punctuation of Automatic Speech Transcripts.
474-485

- Thomas Hain, Lukás Burget, John Dines, Philip N. Garner, Frantisek Grézl, Asmaa El Hannani, Marijn Huijbregts, Martin Karafiát, Mike Lincoln, Vincent Wan:
Transcribing Meetings With the AMIDA Systems.
486-498

- Takaaki Hori, Shoko Araki, Takuya Yoshioka, Masakiyo Fujimoto, Shinji Watanabe, Takanobu Oba, Atsunori Ogawa, Kazuhiro Otsuka, Dan Mikami, Keisuke Kinoshita, Tomohiro Nakatani, Atsushi Nakamura, Junji Yamato:
Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera.
499-513

- Joan Serrà, Holger Kantz, Xavier Serra, Ralph G. Andrzejak:
Predictability of Music Descriptor Time Series and its Application to Cover Song Detection.
514-525

- Marco Dinarelli, Alessandro Moschitti, Giuseppe Riccardi:
Discriminative Reranking for Spoken Language Understanding.
526-539

- Ebru Arisoy, Murat Saraclar, Brian Roark, Izhak Shafran:
Discriminative Language Modeling With Linguistic and Statistically Derived Features.
540-550

- Björn Hoffmeister, Georg Heigold, David Rybach, Ralf Schlüter, Hermann Ney:
WFST Enabled Solutions to ASR Problems: Beyond HMM Decoding.
551-564

- Alberto Sanchís, Alfons Juan, Enrique Vidal:
A Word-Based Naïve Bayes Classifier for Confidence Estimation in Speech Recognition.
565-574

- Wen Zhang, Mengqiu Zhang, Rodney A. Kennedy, Thushara D. Abhayapala:
On High-Resolution Head-Related Transfer Function Measurements: An Efficient Sampling Scheme.
575-584

- Sungrack Yun, Chang D. Yoo:
Loss-Scaled Large-Margin Gaussian Mixture Models for Speech Emotion Classification.
585-598

- Nima Yousefian, Philipos C. Loizou:
A Dual-Microphone Speech Enhancement Algorithm Based on the Coherence Function.
599-609

- Laura E. Boucheron, Phillip L. De Leon, Steven Sandoval:
Low Bit-Rate Speech Coding Through Quantization of Mel-Frequency Cepstral Coefficients.
610-619

- Nam Soo Kim, Tae Gyoon Kang, Shin Jae Kang, Chang Woo Han, Doo Hwa Hong:
Speech Feature Mapping Based on Switching Linear Dynamic System.
620-631

- Yi-Cheng Pan, Hung-yi Lee, Lin-Shan Lee:
Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process.
632-645

- Jake Gunther:
Learning Echo Paths During Continuous Double-Talk Using Semi-Blind Source Separation.
646-660

- Meng Yu, Wenye Ma, Jack Xin, Stanley Osher:
Multi-Channel ℓ1 Regularized Convex Speech Enhancement Model and Fast Computation by the Split Bregman Method.
661-675

- Hüseyin Hacihabiboglu, Zoran Cvetkovic:
Multichannel Dereverberation Theorems and Robustness Issues.
676-689

- Laura Romoli, Stefania Cecchi, Paolo Peretti, Francesco Piazza:
A Mixed Decorrelation Approach for Stereo Acoustic Echo Cancellation Based on the Estimation of the Fundamental Frequency.
690-698

- Jacob Benesty, Mehrez Souden, Yiteng Huang:
A Perspective on Differential Microphone Arrays in the Context of Noise Reduction.
699-704

- Frédéric Mustière, Martin Bouchard, Miodrag Bolic:
All-Pole Modeling of Discrete Spectral Powers: A Unified Approach.
705-708

- Takayuki Arai, Nao Hodoshima, Keiichi Yasu:
Errata to "Using Steady-State Suppression to Improve Speech Intelligibility in Reverberant Environments for Elderly Listeners".
709

Volume 20, Number 3, March 2012
- Kazuyoshi Yoshii, Masataka Goto:
A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation.
717-730

- Siddika Parlak, Murat Saraclar:
Performance Analysis and Improvement of Turkish Broadcast News Retrieval.
731-741

- Haohai Sun, Shefeng Yan, U. Peter Svensson:
Optimal Higher Order Ambisonics Encoding With Predefined Constraints.
742-754

- Mitchell McLaren, David A. van Leeuwen:
Source-Normalized LDA for Robust Speaker Recognition Using i-Vectors From Multiple Speech Sources.
755-766

- Elias K. Kokkinis, Joshua D. Reiss, John Mourjopoulos:
A Wiener Filter Approach to Microphone Leakage Reduction in Close-Microphone Applications.
767-779

- Qiang Fu, Yong Zhao, Biing-Hwang Juang:
Automatic Speech Recognition Based on Non-Uniform Error Criteria.
780-793

- Heiga Zen, Mark J. F. Gales, Yoshihiko Nankaku, Keiichi Tokuda:
Product of Experts for Statistical Parametric Speech Synthesis.
794-805

- Elina Helander, Hanna Silén, Tuomas Virtanen, Moncef Gabbouj:
Voice Conversion Using Dynamic Kernel Partial Least Squares Regression.
806-817

- Ning Ma, Jon Barker, Heidi Christensen, Phil Green:
Combining Speech Fragment Decoding and Adaptive Noise Floor Modeling.
818-827

- Liang-Che Sun, Lin-Shan Lee:
Modulation Spectrum Equalization for Improved Robust Speech Recognition.
828-843

- Matija Marolt:
Automatic Transcription of Bell Chiming Recordings.
844-853

- Emanuël Anco Peter Habets, Jacob Benesty, Patrick A. Naylor:
A Speech Distortion and Interference Rejection Constraint Beamformer.
854-867

- Yousheng Chen, Qin Gong:
A Normalized Beamforming Algorithm for Broadband Speech Using a Continuous Interleaved Sampling Strategy.
868-874

- Sabato Marco Siniscalchi, Dau-Cheng Lyu, Torbjørn Svendsen, Chin-Hui Lee:
Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data.
875-887

- Xiang Lin, Andy W. H. Khong, Patrick A. Naylor:
A Forced Spectral Diversity Algorithm for Speech Dereverberation in the Presence of Near-Common Zeros.
888-899

- Yu-Hsiang Bosco Chiu, Bhiksha Raj, Richard M. Stern:
Learning-Based Auditory Encoding for Robust Speech Recognition.
900-914

- Amir Adler, Valentin Emiya, Maria G. Jafari, Michael Elad, Rémi Gribonval, Mark D. Plumbley:
Audio Inpainting.
922-932

- Ana M. Barbancho, Anssi Klapuri, Lorenzo J. Tardón, Isabel Barbancho:
Automatic Transcription of Guitar Chords and Fingering From Audio.
915-921

- Wei Chu, Abeer Alwan:
SAFE: A Statistical Approach to F0 Estimation Under Clean and Noisy Conditions.
933-944

- Ashish Panda, Thambipillai Srikanthan:
Psychoacoustic Model Compensation for Robust Speaker Verification in Environmental Noise.
945-953

- Emanuel A. P. Habets, Jacob Benesty:
A Perspective on Frequency-Domain Beamformers in Room Acoustics.
947-960

- Thomas Drugman, Thierry Dutoit:
The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications.
968-981

- S. C. Chan, Y. Chu:
Performance Analysis and Design of FxLMS Algorithm in Broadband ANC System With Online Secondary-Path Modeling.
982-993

- Thomas Drugman, Mark R. P. Thomas, Jon Gudnason, Patrick A. Naylor, Thierry Dutoit:
Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review.
994-1006

- Alfonso Perez Carrillo, Jordi Bonada, Esteban Maestre, Enric Guaus, Merlijn Blaauw:
Performance Control Driven Violin Timbre Model Based on Neural Networks.
1007-1021

- Ravi K. Chivukula, Yuriy A. Reznik, Venkat Devarajan, Mythreya Jayendra-Lakshman:
Fast Algorithms for Low-Delay SBR Filterbanks in MPEG-4 AAC-ELD.
1022-1031

- Xianyu Zhao, Yuan Dong:
Variational Bayesian Joint Factor Analysis Models for Speaker Verification.
1032-1042

- Ashutosh Pandey, V. John Mathews:
Adaptive Gain Processing With Offending Frequency Suppression for Digital Hearing Aids.
1043-1055

- Tamar Shoham, David Malah, Slava Shechtman:
Quality Preserving Compression of a Concatenative Text-To-Speech Acoustic Database.
1056-1068

- Vladimir Despotovic, Norbert Goertz, Zoran Peric:
Nonlinear Long-Term Prediction of Speech Based on Truncated Volterra Series.
1069-1073

- Siow Yong Low, Svetha Venkatesh, Sven Nordholm:
A Spectral Slit Approach to Doubletalk Detection.
1074-1080

Volume 20, Number 4, May 2012
- S. Nakagawa, L. Wang, S. Ohtsuka:
Speaker Identification and Verification by Combining MFCC and Phase Information.
1085-1095

- Riccardo Miotto, Gert R. G. Lanckriet:
A Generative Context Model for Semantic Music Annotation and Retrieval.
1096-1108

- C.-C. Lin, R. T.-H. Tsai:
A Generative Data Augmentation Model for Enhancing Chinese Dialect Pronunciation Prediction.
1109-1117

- Alexey Ozerov, Emmanuel Vincent, Frédéric Bimbot:
A General Flexible Framework for the Handling of Prior Information in Audio Source Separation.
1118-1133

- Jia-Min Ren, Jyh-Shing Roger Jang:
Discovering Time-Constrained Sequential Patterns for Music Genre Classification.
1134-1144

- Virginia Estellers, Mihai Gurban, Jean-Philippe Thiran:
On Dynamic Stream Weighting for Audio-Visual Speech Recognition.
1145-1157

- Navin Chatlani, John J. Soraghan:
EMD-Based Filtering (EMDF) of Low-Frequency Noise for Speech Enhancement.
1158-1166

- Haiyan Shu, Haibin Huang, Susanto Rahardja:
Analysis of Bit-Plane Probability for Generalized Gaussian Distribution and its Application in Audio Coding.
1167-1176

- Tobias Rosenkranz, Henning Puder:
Improving Robustness of Codebook-Based Noise Estimation Approaches With Delta Codebooks.
1177-1188

- Ines Hafizovic, Carl-Inge Colombo Nilsen, Sverre Holm:
Transformation Between Uniform Linear and Spherical Microphone Arrays With Symmetric Responses.
1189-1195

- Xiaohong Yang, Yufang Yang:
Prosodic Realization of Rhetorical Structure in Chinese Discourse.
1196-1206

- David T. Yeh:
Automated Physical Modeling of Nonlinear Audio Circuits for Real-Time Audio Effects - Part II: BJT and Vacuum Tube Examples.
1207-1216

- Manish Narwaria, Weisi Lin, Ian Vince McLoughlin, Sabu Emmanuel, Liang-Tien Chia:
Nonintrusive Quality Assessment of Noise Suppressed Speech With Mel-Filtered Energies and Support Vector Regression.
1217-1232

- Wei-Ho Tsai, Hsin-Chieh Lee:
Automatic Evaluation of Karaoke Singing Based on Pitch, Volume, and Rhythm Features.
1233-1243

- Takanobu Oba, Takaaki Hori, Atsushi Nakamura, Akinori Ito:
Round-Robin Duel Discriminative Language Models.
1244-1255

- Yiteng Arden Huang, Jacob Benesty:
A Multi-Frame Approach to the Frequency-Domain Single-Channel Noise Reduction Problem.
1256-1269

- Miroslav Zivanovic, Johan Schoukens:
Single and Piecewise Polynomials for Modeling of Pitched Sounds.
1270-1281

- Yaakov Bucris, Israel Cohen, Miriam A. Doron:
Bayesian Focusing for Coherent Wideband Beamforming.
1282-1296

- Hélène Papadopoulos, Geoffroy Peeters:
Local Key Estimation From an Audio Signal Relying on Harmonic and Metrical Structures.
1297-1312

- Elizabeth Godoy, Olivier Rosec, Thierry Chonavel:
Voice Conversion Using Dynamic Frequency Warping With Amplitude Scaling, for Parallel or Nonparallel Corpora.
1313-1323

- Ruofei Chen, Cheung-Fat Chan, Hing-Cheung So:
Model-Based Speech Enhancement With Improved Spectral Envelope Estimation via Dynamics Tracking.
1324-1336

- Qun Feng Tan, Shrikanth S. Narayanan:
Novel Variations of Group Sparse Regularization Techniques With Applications to Noise Robust Automatic Speech Recognition.
1337-1346

- Rubén Solera-Ureña, Ana I. García-Moral, Carmen Peláez-Moreno, Manel Martínez-Ramón, Fernando Díaz-de-María:
Real-Time Robust Automatic Speech Recognition Using Compact Support Vector Machines.
1347-1361

- Amin Fazel, Shantanu Chakrabartty:
Sparse Auditory Reproducing Kernel (SPARK) Features for Noise-Robust Speech Recognition.
1362-1371

- Jorge I. Marin-Hurtado, Devangi N. Parikh, David V. Anderson:
Perceptually Inspired Noise-Reduction Method for Binaural Hearing Aids.
1372-1382

- Timo Gerkmann, Richard C. Hendriks:
Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay.
1383-1393

- Haiquan Zhao, Xiangping Zeng, Xiaoqiang Zhang, Zhengyou He, Tian-rui Li, Weidong Jin:
Adaptive Extended Pipelined Second-Order Volterra Filter for Nonlinear Active Noise Controller.
1394-1399

- Damián Marelli, Mitsuko Aramaki, Richard Kronland-Martinet, Charles Verron:
An Efficient Time-Frequency Method for Synthesizing Noisy Sounds With Short Transients and Narrow Spectral Components.
1400-1408

- Maurice F. Fallon, Simon J. Godsill:
Acoustic Source Localization and Tracking of a Time-Varying Number of Speakers.
1409-1415

Volume 20, Number 5, July 2012
- Vesa Välimäki, Julian D. Parker, Lauri Savioja, Julius O. Smith, Jonathan S. Abel:
Fifty Years of Artificial Reverberation.
1421-1448

- Flavio P. Ribeiro, Dinei A. F. Florêncio, Demba E. Ba, Cha Zhang:
Geometrically Constrained Room Modeling With Compact Microphone Arrays.
1449-1460

- Wenliang Chen, Jun'ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, Haizhou Li:
Bitext Dependency Parsing With Auto-Generated Bilingual Treebank.
1461-1472

- K. Lakhdhar, R. Lefebvre:
Context-Based Adaptive Arithmetic Encoding of EAVQ Indices.
1473-1481

- Chao-Ling Hsu, DeLiang Wang, Jyh-Shing Roger Jang, Ke Hu:
A Tandem Algorithm for Singing Pitch Extraction and Voice Separation From Music Accompaniment.
1482-1491

- Zhen-Hua Ling, Li-Rong Dai:
Minimum Kullback-Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis.
1492-1502

- John Woodruff, DeLiang Wang:
Binaural Localization of Multiple Sources in Reverberant and Noisy Environments.
1503-1512

- Welly Naptali, Masatoshi Tsuchiya, Seiichi Nakagawa:
Topic-Dependent-Class-Based n-Gram Language Model.
1513-1525

- Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen, Søren Holdt Jensen:
Non-Causal Time-Domain Filters for Single-Channel Noise Reduction.
1526-1541

- Kamil Adiloğlu, Robert Anniés, Elio Wahlen, Hendrik Purwins, Klaus Obermayer:
A Graphical Representation and Dissimilarity Measure for Basic Everyday Sound Events.
1542-1552

- Cees H. Taal, Richard C. Hendriks, Richard Heusdens:
A Low-Complexity Spectro-Temporal Distortion Measure for Audio Processing Applications.
1553-1564

- Huawei Chen, Wee Ser, Jianjiang Zhou:
Robust Nearfield Wideband Beamformer Design Using Worst Case Mean Performance Optimization With Passband Response Variance Constraint.
1565-1572

- D. Rama Sanand, Srinivasan Umesh:
VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC.
1573-1584

- Sandro Cumani, Pietro Laface:
Analysis of Large-Scale SVM Training Algorithms for Language and Speaker Recognition.
1585-1596

- Xiaoyan Cai, Wenjie Li:
Mutually Reinforced Manifold-Ranking Based Relevance Propagation Model for Query-Focused Multi-Document Summarization.
1597-1607

- Xiaojia Zhao, Yang Shao, DeLiang Wang:
CASA-Based Robust Speaker Identification.
1608-1616

- Saeed Mosayyebpour, Hamid Sheikhzadeh, T. Aaron Gulliver, Morteza Esmaeili:
Single-Microphone LP Residual Skewness-Based Inverse Filtering of the Room Impulse Response.
1617-1632

- Upendra V. Chaudhari, Michael Picheny:
Matching Criteria for Vocabulary-Independent Search.
1633-1643

- Daniele Giacobello, Mads Græsbøll Christensen, Manohar N. Murthi, Søren Holdt Jensen, Marc Moonen:
Sparse Linear Prediction and Its Applications to Speech Processing.
1644-1657

- Stefan Bilbao:
Optimized FDTD Schemes for 3-D Acoustic Wave Propagation.
1658-1663

Volume 20, Number 6, August 2012
- Sin-Horng Chen, Jyh-Her Yang, Chen-Yu Chiang, Ming-Chieh Liu, Yih-Ru Wang:
A New Prosody-Assisted Mandarin ASR System.
1669-1684

- Romain Serizel, Marc Moonen, Jan Wouters, Søren Holdt Jensen:
A Zone-of-Quiet Based Approach to Integrated Active Noise Control and Noise Reduction for Speech Enhancement in Hearing Aids.
1685-1697

- Christian D. Sigg, Tomas Dikk, Joachim M. Buhmann:
Speech Enhancement Using Generative Dictionary Learning.
1698-1712

- Heiga Zen, Norbert Braunschweiler, Sabine Buchholz, Mark J. F. Gales, Kate Knill, Sacha Krstulovic, Javier Latorre:
Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization.
1713-1724

- Christian Schüldt, Fredric Lindström, Ingvar Claesson:
A Delay-Based Double-Talk Detector.
1725-1733

- Alastair J. Manders, David M. Simpson, Steven L. Bell:
Objective Prediction of the Sound Quality of Music Processed by an Adaptive Feedback Canceller.
1734-1745

- Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda:
Reproducing Virtual Sound Sources in Front of a Loudspeaker Array Using Inverse Wave Propagator.
1746-1758

- Justin Salamon, Emilia Gómez:
Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics.
1759-1770

- Yizhao Ni, Matt McVicar, Raúl Santos-Rodriguez, Tijl De Bie:
An End-to-End Machine Learning System for Harmonic Analysis of Music.
1771-1783

- Daisuke Saito, Shinji Watanabe, Atsushi Nakamura, Nobuaki Minematsu:
Statistical Voice Conversion Based on Noisy Channel Model.
1784-1794

- Daniel Angus, Andrew E. Smith, Janet Wiles:
Human Communication as Coupled Time Series: Quantifying Multi-Participant Recurrence.
1795-1807

- Claire Masterson, Gavin Kearney, Marcin Gorzel, Francis M. Boland:
HRIR Order Reduction Using Approximate Factorization.
1808-1817

- Jan Vanek, Jan Trmal, Josef V. Psutka, Josef Psutka:
Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors.
1818-1828

- Jan Ole Jungmann, Radoslaw Mazur, Markus Kallinger, Tiemin Mei, Alfred Mertins:
Combined Acoustic MIMO Channel Crosstalk Cancellation and Room Impulse Response Reshaping.
1829-1842

- Tianyu T. Wang, Thomas F. Quatieri:
Two-Dimensional Speech-Signal Modeling.
1843-1856

- Isabel Barbancho, Lorenzo J. Tardón, Simone Sammartino, Ana M. Barbancho:
Inharmonicity-Based Method for the Automatic Generation of Guitar Tablature.
1857-1868

- Amit Das, John H. L. Hansen:
Constrained Iterative Speech Enhancement Using Phonetic Classes.
1869-1883

- Abbas Keshavarz, Saeed Mosayyebpour, Mehrzad Biguesh, T. Aaron Gulliver, Morteza Esmaeili:
Speech-Model Based Accurate Blind Reverberation Time Estimation Using an LPC Filter.
1884-1893

- Anil Kumar Vuppala, Jainath Yadav, Saswat Chakrabarti, K. Sreenivasa Rao:
Vowel Onset Point Detection for Low Bit Rate Coded Speech.
1894-1903

Volume 20, Number 7, September 2012
- Theodoros Giannakopoulos, Sergios Petridis:
Fisher Linear Semi-Discriminant Analysis for Speaker Diarization.
1913-1922

- Xiaodong Cui, Jing Huang, Jen-Tzung Chien:
Multi-View and Multi-Objective Semi-Supervised Learning for HMM-Based Automatic Speech Recognition.
1923-1935

- Jacob L. Newman, Stephen J. Cox:
Language Identification Using Visual Features.
1936-1947

- Jesper Rindom Jensen, Jacob Benesty, Mads Græsbøll Christensen, Søren Holdt Jensen:
Enhancement of Single-Channel Periodic Signals in the Time-Domain.
1948-1963

- Marco Compagnoni, Paolo Bestagini, Fabio Antonacci, Augusto Sarti, Stefano Tubaro:
Localization of Acoustic Sources Through the Fitting of Propagation Cones Using Multiple Independent Arrays.
1964-1975

- Jung-Woo Choi, Yang-Hann Kim:
Integral Approach for Reproduction of Virtual Sound Source Surrounded by Loudspeaker Array.
1976-1989

- Tomi Kinnunen, Rahim Saeidi, Filip Sedlak, Kong-Aik Lee, Johan Sandberg, Maria Hansson-Sandsten, Haizhou Li:
Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification.
1990-2001

- Wen-Lin Zhang, Weiqiang Zhang, Bi-Cheng Li, Dan Qu, Michael T. Johnson:
Bayesian Speaker Adaptation Based on a New Hierarchical Probabilistic Model.
2002-2015

- Tobias May, Steven van de Par, Armin Kohlrausch:
A Binaural Scene Analyzer for Joint Localization and Recognition of Speakers in the Presence of Interfering Noise Sources and Reverberation.
2016-2030

- Armando Muscariello, Guillaume Gravier, Frédéric Bimbot:
Unsupervised Motif Acquisition in Speech via Seeded Discovery and Template Matching Combination.
2031-2044

- César González Ferreras, David Escudero Mancebo, Carlos Vivaracho-Pascual, Valentín Cardeñoso-Payo:
Improving Automatic Classification of Prosodic Events by Pairwise Coupling.
2045-2058

- Maximo Cobos, José J. López:
Maximum a Posteriori Binary Mask Estimation for Underdetermined Source Separation Using Smoothed Posteriors.
2059-2064

- Sarmad Malik, Gerald Enzner:
State-Space Frequency-Domain Adaptive Filtering for Nonlinear Acoustic Echo Cancellation.
2065-2079

- Ryoichi Miyazaki, Hiroshi Saruwatari, Takayuki Inoue, Yu Takahashi, Kiyohiro Shikano, Kazunobu Kondo:
Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction.
2080-2094

- Hung-yi Lee, Chia-Ping Chen, Lin-Shan Lee:
Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection.
2095-2110

- Mohamed I. Alkanhal, Mohammed A. Al-Badrashiny, Mansour M. Alghamdi, Abdulaziz O. Al-Qabbany:
Automatic Stochastic Arabic Spelling Correction With Emphasis on Space Insertions and Deletions.
2111-2122

- Stephen J. Elliott, Jordan Cheer, Jung-Woo Choi, Youngtae Kim:
Robustness and Regularization of Personal Audio Systems.
2123-2133

- Lakshmi Saheer, John Dines, Philip N. Garner:
Vocal Tract Length Normalization for Statistical Parametric Speech Synthesis.
2134-2148

- Yongqiang Wang, M. J. F. Gales:
Speaker and Noise Factorization for Robust Speech Recognition.
2149-2158

Volume 20, Number 8, October 2012
- Leonardo O. Nunes, Flávio R. Avila, Alan Freihof Tygel, Luiz W. P. Biscainho, Bowon Lee, Amir Said, Ronald W. Schafer:
A Parametric Objective Quality Assessment Tool for Speech Signals Degraded by Acoustic Echo.
2181-2190

- Yong Zhao, Biing-Hwang Juang:
Nonlinear Compensation Using the Gauss-Newton Method for Noise-Robust Speech Recognition.
2191-2206

- Brian McFee, Luke Barrington, Gert R. G. Lanckriet:
Learning Content Similarity for Music Recommendation.
2207-2218

- Hannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku:
Bandwidth Extension of Telephone Speech to Low Frequencies Using Sinusoidal Synthesis and a Gaussian Mixture Model.
2219-2231

- Iynkaran Natgunanathan, Yong Xiang, Yue Rong, Wanlei Zhou, Song Guo:
Robust Patchwork-Based Embedding and Decoding Scheme for Digital Audio Watermarking.
2232-2239

- Yotaro Kubo, Shinji Watanabe, Takaaki Hori, Atsushi Nakamura:
Structural Classification Methods Based on Weighted Finite-State Transducers for Automatic Speech Recognition.
2240-2251

- Xiaodong Cui, Jian Xue, Xin Chen, Peder A. Olsen, Pierre L. Dognin, Upendra V. Chaudhari, John R. Hershey, Bowen Zhou:
Hidden Markov Acoustic Modeling With Bootstrap and Restructuring for Low-Resourced Languages.
2252-2264

- Amit Das, John H. L. Hansen:
Phoneme Selective Speech Enhancement Using Parametric Estimators and the Mixture Maximum Model: A Unifying Approach.
2265-2279

- Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Inma Hernáez, Ibon Saratxaga:
Evaluation of Speaker Verification Security and Detection of HMM-Based Synthetic Speech.
2280-2290

- Wei-Ho Tsai, Hsin-Chieh Lee:
Singer Identification Based on Spoken Data in Voice Characterization.
2291-2300

- Daniel Felps, Christian Geng, Ricardo Gutierrez-Osuna:
Foreign Accent Conversion Through Concatenative Synthesis in the Articulatory Domain.
2301-2312

- Gustavo Reis, Francisco Fernández de Vega, Aníbal Ferreira:
Automatic Transcription of Polyphonic Piano Music Using Genetic Algorithms, Adaptive Spectral Envelope Modeling, and Dynamic Noise Level Estimation.
2313-2328

- Soroosh Mariooryad, Carlos Busso:
Generating Human-Like Behaviors Using Joint, Speech-Driven Models for Conversational Agents.
2329-2340

- Hasim Sak, Murat Saraclar, Tunga Gungor:
Morpholexical and Discriminative Language Models for Turkish Automatic Speech Recognition.
2341-2351

- Yongwon Jeong:
Adaptation of Hidden Markov Models Using Model-as-Matrix Representation.
2352-2364

- Seyedmahdad Mirsamadi, Shabnam Ghaffarzadegan, Hamid Sheikhzadeh, Seyed Mohammad Ahadi, Amir Hossein Rezaie:
Efficient Frequency Domain Implementation of Noncausal Multichannel Blind Deconvolution for Convolutive Mixtures of Speech.
2365-2377

- Barry-John Theobald, Iain Matthews:
Relating Objective and Subjective Performance Measures for AAM-Based Visual Speech Synthesis.
2378-2387

- Terence Betlehem, Christopher Withers:
Sound Field Reproduction With Energy Constraint on Loudspeaker Weights.
2388-2392

Volume 20, Number 9, November 2012
- Nikolaos Mitianoudis:
A Generalized Directional Laplacian Distribution: Estimation, Mixture Models and Audio Source Separation.
2397-2408

- Janne Pylkkönen, Mikko Kurimo:
Analysis of Extended Baum-Welch and Constrained Optimization for Discriminative Training of HMMs.
2409-2419

- Alex Southern, Damian T. Murphy, Lauri Savioja:
Spatial Encoding of Finite Difference Time Domain Acoustic Models for Auralization.
2420-2432

- Marco Crocco, Andrea Trucco:
Stochastic and Analytic Optimization of Sparse Aperiodic Arrays and Broadband Beamformers With Robust Superdirective Patterns.
2433-2447

- Masashi Okada, Takao Onoye, Wataru Kobayashi:
A Ray Tracing Simulation of Sound Diffraction Based on the Analytic Secondary Source Model.
2448-2460

- Ryouichi Nishimura:
Audio Watermarking Using Spatial Masking and Ambisonics.
2461-2469

- Flávio R. Avila, Luiz W. P. Biscainho:
Bayesian Restoration of Audio Signals Degraded by Impulsive Noise Modeled as Individual Pulses.
2470-2481

- Woojay Jeon, Changxue Ma, Dusan Macho:
Statistical Utterance Comparison for Speaker Clustering Using Factor Analysis.
2482-2491

- Justin Jian Zhang, Pascale Fung:
Automatic Parliamentary Meeting Minute Generation Using Rhetorical Structure Modeling.
2492-2504

- Tomoki Toda, Mikihiro Nakagiri, Kiyohiro Shikano:
Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement.
2505-2517

- Arun Narayanan, DeLiang Wang:
A CASA-Based System for Long-Term SNR Estimation.
2518-2527

- Ronen Talmon, Israel Cohen, Sharon Gannot, Ronald R. Coifman:
Supervised Graph-Based Processing for Sequential Transient Interference Suppression.
2528-2538

- Andre Holzapfel, Matthew E. P. Davies, José R. Zapata, João Lobato Oliveira, Fabien Gouyon:
Selective Sampling for Beat Tracking Evaluation.
2539-2548

- Meng Guo, Søren Holdt Jensen, Jesper Jensen:
Novel Acoustic Feedback Cancellation Approaches in Hearing Aid Applications Using Probe Noise and Probe Noise Enhancement.
2549-2563

- Jens Ahrens, Sascha Spors:
A Modal Analysis of Spatial Discretization of Spherical Loudspeaker Distributions Used for Sound Field Synthesis.
2564-2574

- V. Tourbabin, Morag Agmon, Boaz Rafaely, Joseph Tabrikian:
Optimal Real-Weighted Beamforming With Application to Linear and Spherical Arrays.
2575-2585

- Pejman Mowlaee, Rahim Saeidi, Mads Græsbøll Christensen, Zheng-Hua Tan, Tomi Kinnunen, Pasi Fränti, Søren Holdt Jensen:
A Joint Approach for Single-Channel Speaker Identification and Speech Separation.
2586-2601

- Berlin Chen, Kuan-Yu Chen, Pei-Ning Chen, Yi-Wen Chen:
Spoken Document Retrieval With Unsupervised Query Modeling Techniques.
2602-2612

- Kruthiventi S. S. Srinivas, Kishore Prahallad:
An FIR Implementation of Zero Frequency Filtering of Speech Signals.
2613-2617

Volume 20, Number 10, December 2012
- Mari Ostendorf:
A Message from the Vice President of Publications on New Developments in Signal Processing Society Publications.
2625

- Sundar Harshavardhan, Chandra Sekhar Seelamantula, Thippur V. Sreenivas:
A Mixture Model Approach for Formant Tracking and the Robustness of Student's-t Distribution.
2626-2636

- Steven Hargreaves, Anssi Klapuri, Mark Sandler:
Structural Segmentation of Multitrack Audio.
2637-2647

- Matthew Gibson, Thomas Hain:
Correctness-Adjusted Unsupervised Discriminative Acoustic Model Adaptation.
2648-2656

- Bruno Defraene, Toon van Waterschoot, Hans Joachim Ferreau, Moritz Diehl, Marc Moonen:
Real-Time Perception-Based Clipping of Audio Signals Using Convex Optimization.
2657-2671

- G. Ananthakrishnan, Olov Engwall, Daniel Neiberg:
Exploring the Predictability of Non-Unique Acoustic-to-Articulatory Mappings.
2672-2682

- Fabio Antonacci, Jason Filos, Mark R. P. Thomas, Emanuël Anco Peter Habets, Augusto Sarti, Patrick A. Naylor, Stefano Tubaro:
Inference of Room Geometry From Acoustic Impulse Responses.
2683-2695

- João Lobato Oliveira, Matthew E. P. Davies, Fabien Gouyon, Luís Paulo Reis:
Beat Tracking for Multiple Applications: A Multi-Agent System Architecture With State Recovery.
2696-2706

- Takuya Yoshioka, Tomohiro Nakatani:
Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening.
2707-2720

Last update Fri May 24 18:05:44 2013 CET by the DBLP Team. Data released under the ODC-BY 1.0 license.