Publications
Thesis Publications
J. Dines. "Model based trainable speech synthesis and its applications",
Ph.D. Thesis. Queensland University of Technology, Brisbane Australia, 2003.
J. Dines. "Active Noise Control for Agricultural Machinery", Honours
Thesis. University of Southern Queensland, Toowoomba, Australia, 1998.
Book Chapters
[7] Weifeng Li, Kenichi Kumatani, John Dines, Mathew Magimai Doss, Herve Bourlard,
"A Neural Network Based Regression Approach for Recognising Simultaneous Speech"
in: Andrei Popescu-Belis, Rainer Stiefelhagen, eds., Machine Learning for Multimodal
Interaction, LNCS 5237, Springer-Verlag, Berlin/Heidelberg, 2008, p110-118.
[6] John Dines, Mathew Magimai Doss, "A Study of Phoneme and Grapheme Based Context
Dependent ASR Systems" in: Andrei Popescu-Belis, Steve Renals, Herve Bourlard, eds.,
Machine Learning for Multimodal Interaction, LNCS 4892, Springer-Verlag,
Berlin/Heidelberg, 2008, p215-226.
[5] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin Karafiat, David van Leeuwen,
Mike Lincoln, Vincent Wan "The 2007 AMI(DA) System for Meeting Transcription"
in: Rainer Stiefelhagen, Rachel Bowers, Jonathan Fiscus, eds., Multimodal Technologies for
Perception of Humans, LNCS 4625, Springer-Verlag, Berlin/Heidelberg, 2008, p414-428.
[4] Darren Moore, John Dines, Mathew Magimai Doss, Jithendra Vepa, Octavian Cheng,
Thomas Hain "Juicer: A Weighted Finite-State Transducer Speech Decoder"
in: Steve Renals, Samy Bengio, Jonathan G. Fiscus, eds., Machine Learning for
Multimodal Interaction, LNCS 4299, Springer-Verlag, Berlin/Heidelberg, 2006, p285-296.
[3] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin Karafiat, Mike Lincoln,
Jithendra Vepa, Vincent Wan "The AMI Meeting Transcription System: Progress and
Performance" in: Steve Renals, Samy Bengio, Jonathan G. Fiscus, eds.,
Machine Learning for Multimodal Interaction, LNCS 4299, Springer-Verlag,
Berlin/Heidelberg, 2006, p419-431.
[2] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin Karafiat, Mike Lincoln,
Iain McCowan, Darren Moore, Vincent Wan, Roeland Ordelman, Steve Renals "The
2005 AMI System for the Transcription of Speech in Meetings"
in: Steve Renals, Samy Bengio, eds., Machine Learning for Multimodal Interaction,
LNCS 3869, Springer-Verlag, Berlin/Heidelberg, 2006, p450-462.
[1] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin Karafiat, Mike Lincoln,
Iain McCowan, Darren Moore, Vincent Wan, Roeland Ordelman, Steve Renals "The
Development of the AMI System for the Transcription of Speech in Meetings"
in: Steve Renals, Samy Bengio, eds., Machine Learning for Multimodal Interaction,
LNCS 3869, Springer-Verlag, Berlin/Heidelberg, 2006, p344-356.
Journal Publications
[2] John Dines, Junichi Yamagishi, Simon King "Measuring the Gap between HMM-based
ASR and TTS" in: IEEE Journal of Selected Topics in Signal Processing
(accepted for publication).
[1] Junichi Yamagishi, Bela Usabaev, Simon King, Oliver Watts, John Dines, Jilei Tian,
Yong Guan, Rile Hu, Keiichiro Oura, Yi-Jian Wu, Keiichi Tokuda, Reima Karhila,
Mikko Kurimo "Thousands of Voices for HMM-based Speech Synthesis - Analysis
and Applications of TTS Systems Built on Various ASR Corpora" in: IEEE Transactions
on Audio, Speech and Language Processing (accepted for publication).
Conference Publications
[32] Lakshmi Saheer, Philip N. Garner, John Dines, Hui Liang, "VTLN adaptation
for statistical speech synthesis" accepted: ICASSP 2010 (Dallas, USA).
[31] Hui Liang, John Dines, Lakshmi Saheer, "A Comparison of Supervised and Unsupervised
Cross-Lingual Speaker Adaptation Approaches for HMM-Based Speech Synthesis" accepted:
ICASSP 2010 (Dallas, USA).
[30] Danil Korchagin, Philip N. Garner, John Dines, "Automatic Temporal Alignment of AV
Data with Confidence Estimation" accepted: ICASSP 2010 (Dallas, USA).
[29] Junichi Yamagishi, Mike Lincoln, Simon King, John Dines, Mathew Gibson, Jilei Tian,
Yong Guan, "Analysis of Unsupervised and Noise-Robust Speaker Adaptive HMM-based
Speech Synthesis Systems toward a Unified ASR and TTS Framework" in: Proceedings of
Blizzard Challenge Workshop, (Edinburgh, U.K.), 2009.
[28] Junichi Yamagishi, Bela Usabaev, Simon King, Oliver Watts, John Dines, Jilei Tian,
Rile Hu, Yong Guan, Keiichiro Oura, Keiichi Tokuda, Reima Karhila, Mikko Kurimo,
"Thousands of Voices for HMM-based Speech Synthesis" in: Proceedings of
Interspeech, (Brighton, U.K.), 2009.
[27] Philip N. Garner, John Dines, Thomas Hain, Asmaa El Hannani, Martin Karafiat, Danil
Korchagin, Mike Lincoln, Vincent Wan and Le Zhang, "Real-Time ASR from Meetings"
in: Proceedings of Interspeech, (Brighton, U.K.), 2009.
[26] John Dines, Lakshmi Saheer and Hui Liang, "Speech recognition with
speech synthesis models by marginalising over decision tree leaves"
in: Proceedings of Interspeech, (Brighton, U.K.), 2009.
[25] John Dines, Junichi Yamagishi and Simon King, "Measuring the gap
between HMM-based ASR and TTS" to appear in: Proceedings of Interspeech,
(Brighton, U.K.), 2009.
[24] Weifeng Li, John Dines, Mathew Magimai.-Doss and Herve Bourlard, "Non-linear
mapping for multi-channel speech separation and robust overlapping speech recognition",
in: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP), 2009.
[23] Vincent Wan, John Dines, Asmaa El Hannani, Thomas Hain, "Bob: A
lexicon and pronunciation dictionary generator", in: Proceedings of
Workshop on Spoken Language Technology (SLT), (Goa, India), 2008.
[22] Sarah Favre, Hugues Salamin, John Dines, Alessandro Vinciarelli,
"Role Recognition in Multiparty Recordings using Social Affiliation Networks
and Discrete Distributions", in Proceedings of Special Session on Social
Signal Processing at ICMI, (Chania, Greece), 2008.
[21] Kenichi Kumatani, John McDonough, Barbara Rauch, Philip
N. Garner, Weifeng Li, and John Dines, "Maximum Kurtosis Beamforming with
the Generalized Sidelobe Canceller", in Proceedings of Interspeech-2008,
(Brisbane, Australia), 2008.
[20] Weifeng Li, M. Magimai.-Doss, J. Dines, and H. Bourlard, "MLP-based
Log Spectral Energy Mapping for Robust Overlapping Speech Recognition",
in Proceedings of EUSIPCO, (Lausanne, Switzerland), 2008.
[19] Weifeng Li, J. Dines, M. Magimai.-Doss, and H. Bourlard, "Neural
network based regression for robust overlapping speech recognition using
microphone arrays", in Proceedings of Interspeech-2008, (Brisbane, Australia),
2008.
[18] Weifeng Li, K. Kumatani, J. Dines, M. Magimai.-Doss, and H. Bourlard,
"A Neural Network based Regression Approach for Recognizing Simultaneous
Speech", in Proceedings of Joint Workshop on Machine Learning and
Multimodal Interaction, (Utrecht, Netherlands), September, 2008.
[17] Thomas Hain, Lukas Burget, Martin Karafiat, John Dines, David van Leeuwen,
Giulia Garau, Mike Lincoln and Vincent Wan, "AMI/DA STT and SASTT",
in Proceedings of RT07 Workshop, (Baltimore, USA), 10 May 2007.
[16] John Dines and Jithendra Vepa, "Direct optimisation of a multilayer
perceptron for the estimation of cepstral mean and variance statistics",
in Proceedings of Interspeech 2007 Eurospeech, (Antwerp, Belgium), 2007.
[15] John Dines and Mathew Magimai Doss, "A study of phoneme and grapheme
based context-dependent ASR systems", in Proceedings of MLMI-07,
(Brno, Czech Republic), 2007.
[14] Octavian Cheng, John Dines and Mathew Magimai Doss, "A Generalized dynamic
composition algorithm of weighted finite state transducers for large vocabulary
speech recognition", in Proceedings of ICASSP, (Honolulu, Hawaii), 2007.
[13] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Vincent Wan,
Martin Karafiat, Jithendra Vepa and Mike Lincoln, "The AMI system for the
transcription of speech in meetings", in Proceedings of ICASSP,
(Honolulu, Hawaii), 2007.
[12] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin Karafiat,
Mike Lincoln, Jithendra Vepa and Vincent Wan, "The AMI meeting transcription
system: Progress and performance", in Proceedings of NIST RT'06 Workshop,
(Washington, D.C.), 2006.
[11] John Dines, Jithendra Vepa, and Thomas Hain. "The segmentation of
multi-channel meeting recordings for automatic speech recognition",
in Interspeech 2006 ICSLP, (Pittsburgh), 2006.
[10] Darren Moore, John Dines, Mathew Magimai Doss, Jithendra Vepa,
Octavian Cheng, Thomas Hain. "Juicer: A Weighted Finite State
Transducer speech decoder", in MLMI-06, (Washington DC), 2006.
[9] Thomas Hain, Lukas Burget, John Dines, Giulia Garau, Martin
Karafiat, Mike Lincoln, Iain McCowan, Darren Moore, Vincent Wan, Roeland
Ordelman and Steve Renals. "The
2005 AMI System for the Transcription of Speech in Meetings", in NIST
Spring 2005 Rich Transcription Workshop, (Edinburgh, Scotland), 2005.
[8] Thomas Hain, Lukas Burget, John Dines, Iain McCowan, Martin
Karafiat, Mike Lincoln, Darren Moore, Giulia Garau, Vincent Wan, Roeland
Ordelman, and Steve Renals. "The Development of the AMI System for
the Transcription of Speech in Meetings", in MLMI, (Edinburgh, UK), 2005.
[7] Thomas Hain, John Dines, Giulia Garau, Martin Karafiat, Darren Moore,
Vincent Wan, Roeland Ordelman and Steve Renals. "Transcription of Conference
Room Meetings: an Investigation", in Eurospeech, (Lisbon, Portugal), 2005.
[6] G. Aradilla, J. Dines and S. Sivadas. "Using RASTA in task independent
TANDEM feature extraction", in ICSLP, (Korea), 2004.
[5] J. Dines, S. Sridharan and M. Moody. "Speech segmentation with HMM",
in Proceedings of the International Australian Speech Science and Technology
Conference (SST-2002), (Melbourne, Australia), 2002.
[4] J. Dines, S. Sridharan and M. Moody. "Application of the trended hidden
Markov model to speech synthesis", in Proceedings of the European
Conference on Speech Communication and Technology (Eurospeech),
(Aalborg, Denmark), 2001.
[3] J. Dines and S. Sridharan. "Trainable speech synthesis with trended hidden
Markov models", in Proceedings of the International Conference on
Acoustics, Speech and Signal Processing (ICASSP), (Salt Lake City, USA), 2001.
[2] J. Dines, S. Sridharan and M. Moody. "Compression of speech for
mass storage using speech recognition and text-to-speech synthesis",
in Proceedings of the International Australian Speech Science and Technology
Conference (SST-2000), (Canberra, Australia), 2000.
[1] J. Dines and S. Sridharan. "A speaker independent phonetic vocoder for
the English language", in Proceedings of the International Symposium on
Signal Processing and Communications Systems (ISPACS), (Honolulu, USA), 2000.
Other
M. Magimai-Doss, J. Dines, H. Bourlard and H. Hermansky. "Phoneme
vs grapheme based automatic speech recognition", IDIAP
research report 04-48, 2004.
I. McCowan, D. Moore, J. Dines, D. Gatica-Perez, M. Flynn, P. Wellner,
and H. Bourlard. "On the Use of Information Retrieval Measures for Speech
Recognition Evaluation", IDIAP research report 04-73, 2004.