Monday 5th September & Tuesday 6th September, 2022
UK Speech 2022 will be held at the University of Edinburgh Central Campus. Registration, posters, coffee and lunches will be on the ground floor of the Informatics Forum, while keynote talks and oral sessions will be in Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School).
You can find abstracts for the presentations in the UK Speech 2022 abstract book.
Atrium/G.07, Informatics Forum
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Welcome and announcements
UK Speech organisers
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Chair: Dr Korin Richmond
Using Ultrasound to Image the Articulators in the Speech Therapy Clinic
Dr Joanne Cleland, University of Strathclyde
Atrium/G.07, Informatics Forum
1. Text-free non-parallel many-to-many voice conversion using normalising flows
Thomas Merritt, Abdelhamid Ezzerg, Piotr Biliński, Magdalena Proszewska, Kamil Pokora, Roberto Barra-Chicote and Daniel Korzekwa
2. Leveraging Explicit Acoustic Features for Controllable TTS
Tian Huey Teh, Devang S Ram Mohan, Vivian Hu, Alexandra Torresquintero, Zack Hodari, Tomás Gómez Ibarrondo, Christopher G. R. Wallis and Simon King
3. Treating the noisy phase issue in speech enhancement using complex ratio masks
Georgiana-Elena Sfeclis
4. Comparing human emotion perception and automatic emotion recognition of user turns in human-machine dialogues
Norbert Braunschweiler, Rama Doddipatla, Simon Keizer and Svetlana Stoyanchev
5. Language Modelling with Recurrent Neural Networks for Code-Switching
Olga Iakovenko and Thomas Hain
6. Speaker Diarization: Importance of the Modulation Spectrum and Incorporating Uncertainty Modelling
Simon McKnight
7. Modelling trajectories of human speech articulators using general Tau theory
Benjamin Elie, David Lee and Alice Turk
8. Multi-sentence TTS with Expressive and Coherent Prosody
Marcel Granero-Moya, Amith Nagaraj, Peter Makarov, Ammar Abbas, Mateusz Lajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman and Penny Karanasou
9. Investigating perception of spoken dialogue acceptability through surprisal
Sarenne Wallbridge, Peter Bell and Catherine Lai
10. Peter 2.0: Building a Cyborg
Matthew Aylett, Ari Shapiro, Sai Prasad, Lama Nachman, Stacy Marsella and Peter Scott-Morgan
11. Monitoring sleep disordered breathing of long-Covid patients at home using acoustic AI technology
Gerardo Roa Dabike, Ning Ma and Guy Brown
12. Incremental Disfluency Detection for Spoken Learner English
Lucy Skidmore and Roger K. Moore
13. Audio-Based Computational Analysis of Podcast Expressivity
Shahar Elisha, Emmanouil Benetos, Jussi Karlgren and Mariano Beguerisse-Diaz
14. Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription
Xianrui Zheng, Chao Zhang and Phil Woodland
15. Comparing Human and Machine Perceptions of Voice Anonymisation
Farida Yusuf, Dan Kumpik, Matt Clifford, Jonathan Erskine and Jennifer Williams
16. ABAIR-ÉIST: recent progress in Irish language low-resource ASR development
Liam Lonergan, Christian Saam, Mengjie Qian, Neasa Ní Chiaráin, Christer Gobl and Ailbhe Ní Chasaide
17. A summary of the GENEA Challenge 2022 on co-speech gesture generation
Youngwoo Yoon, Pieter Wolfert, Taras Kucherenko, Carla Viegas, Teodor Nikolov, Mihail Tsakov and Gustav Eje Henter
18. Neural formant synthesis – a proving ground for speech-synthesis control
Gustavo Teodoro Döhler Beck, Ulme Wennberg, Zofia Malisz and Gustav Eje Henter
19. Empowering neural TTS with HMMs to get the best of both worlds
Shivam Mehta, Harm Lameris, Éva Székely, Jonas Beskow and Gustav Eje Henter
20. Unsupervised data selection for Speech Recognition with contrastive loss ratios
Chanho Park, Rehan Ahmad and Thomas Hain
21. Domain-Informed Probing of wav2vec 2.0 Embeddings for Phonetic Features
Patrick Cormac English, Julie Carson-Berndsen and John Kelleher
22. Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data
Amir Shirian, Krishna Somandepalli and Tanaya Guha
Atrium/G.07, Informatics Forum
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Chair: Prof Simon King
Speech Privacy: Where Are We Going and How to Get There?
Dr Jennifer Williams, University of Southampton
18:30, Drinks Reception at the Scottish National Gallery
20:00, Dinner at the Scottish Cafe & Restaurant at the Scottish National Gallery
Plus an after-dinner Ceilidh at the Scottish Cafe & Restaurant - bring your dancing shoes!
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Chair: Prof Jon Barker
1. Evaluating watchability for video localisation
Zack Hodari, Tian Huey Teh, Vivian Hu, Tomás Gómez Ibarrondo, Devang S Ram Mohan, Alexandra Torresquintero, Chris Wallis, James Leoni and Simon King
2. Transforming adult to child speech for dubbing
Protima Nomo Sudro, Anton Ragni and Thomas Hain
3. Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR
Ondrej Klejch, Electra Wallington and Peter Bell
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Chair: Dr Kate Knill
Multimodal Speech – Embracing the Iceberg!
Prof Naomi Harte, Trinity College Dublin
Atrium/G.07, Informatics Forum
Atrium/G.07, Informatics Forum
1. Speaker identification in courtroom contexts: performance of human listeners compared to a state-of-the-art forensic voice comparison system
Philip Weber, Nabanita Basu, Agnes S. Bali, Claudia Rosas-Aguilar, Gary Edmond, Kristy A. Martire and Geoffrey Stewart Morrison
2. Automatic generation of accented speech using phonetic features
Margot Masson, Anthony Ventresque and Julie Carson-Berndsen
3. Exploring hidden speech representations of self-supervised automatic speech recognition models
Tamara Soloveva, Ramon Sanabria and Peter Bell
4. AVSE Challenge: Audio-visual Speech Enhancement Challenge
Lorena Aldana, Cassia Valentini-Botinhao, Ondrej Klejch, Mandar Gogate, Kia Dashtipour, Amir Hussain and Peter Bell
5. Leveraging linguistic knowledge for accent robustness of end-to-end models
Andrea Carmantini and Peter Bell
6. A Biological Understanding of Dramatic Speech through Synthesis
Emily Lau, Brechtje Post and Kate Knill
7. Modelling Pronunciation Variation in Different Spoken Englishes
Emma O'Neill and Julie Berndsen
8. Using Utterance-Specific Dirichlet Priors to Model Uncertainty in Emotion Class Labels
Wen Wu, Chao Zhang, Xixin Wu and Philip C. Woodland
9. PSE-Net: Real-time Personalized Sound Enhancement
Abhinav Mehrotra, Alberto Gil C. P. Ramos, Nic Lane and Sourav Bhattacharya
10. Conversational Speech vs. Sustained Phonation for Diagnosis of Parkinson’s Disease
Steve Beet, Phill Restall and Ladan Baghai-Ravary
11. Tree-Constrained Pointer Generator for End-to-end Contextual ASR
Guangzhi Sun, Chao Zhang and Phil Woodland
12. Canonical-Correlated Graph Neural Network for Multimodal Energy-Efficient Speech Enhancement
Leandro Aparecido Passos Junior, Ahmed Khubaib, Mohsin Raza, Amir Hussain and Ahsan Adeel
13. CognoSpeak: a Cognitive Health Assessment Tool (CcHAT)
Nathan Pevy, Heidi Christensen and Daniel Blackburn
14. Attention Forcing for Speech Synthesis
Qingyun Dou and Mark Gales
15. Multimodal Emotion Recognition in Conversations
Jiachen Luo, Joshua Reiss and Huy Phan
16. Addressing user concerns about multi-modal hearing technology
Dorothy Hardy, Michael Akeroyd, Adeel Hussain, Peter Bell and Amir Hussain
17. Model for Assessor Bias in Automatic Pronunciation Assessment
Jose Antonio Lopez Saenz and Thomas Hain
18. A siamese RNN architecture to detect deliberate imitation and phonetic convergence in L2-speech
Byron Z. Yuan, Aldo Pastore, Dorina De Jong, Hao Xu, Luciano Fadiga and Alessandro D'Ausilio
19. Using conversational data to improve prosody in Text-to-Speech synthesis
Johannah O'Mahony, Catherine Lai and Simon King
20. Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion
Muhammad Umar Farooq, Darshan Adiga Haniya Narayana and Thomas Hain
21. RoomReader: A Multimodal Corpus of Online Multiparty Conversational Interactions
Justine Reverdy, Sam O'Connor Russell, Louise Duquenne, Diego Garaialde, Benjamin Cowan and Naomi Harte
22. Person-specific automatic speaker recognition: understanding the behaviour of individuals for applications of ASR
Vincent Hughes, Paul Foulkes, Philip Harrison, Jessica Wormald, Chenzi Xu, David van der Vloed and Finnian Kelly
23. Alternative Evaluation Methods of Latent Representations of Speech Audio
Eimear Stanley, Yumnah Mohamied and Peter Bell
Atrium/G.07, Informatics Forum
Atrium/G.07, Informatics Forum
1. Autovocoder: Vocoding Without Spectrograms
Jacob Webber and Simon King
2. Exploring Prosody Transfer in Speech Synthesis
Atli Sigurgeirsson and Simon King
3. Code-switched Text Generation on Parallel Data
Jie Chi and Brian Lu
4. Voice Puppetry for the People: Harnessing Dramatic Performance for Speech Synthesis
Matthew Aylett, Skaiste Butkute and Christopher Pidcock
5. Improving diagnostic procedures for epilepsy through automated recording and analysis of patients’ history
Nathan Pevy, Heidi Christensen, Traci Walker and Markus Reuber
6. Deliberation Based Multi-Pass Speech Synthesis
Qingyun Dou and Mark Gales
7. Exploring Novel Methods for Automatic Speech Recogniser Based Intelligibility Prediction
Zehai Tu, Ning Ma and Jon Barker
8. View-Specific Assessment of L2 Spoken English
Stefano Banno, Bhanu Balasu, Mark Gales, Kate Knill and Konstantinos Kyriakopoulos
9. Why is My Social Robot so Slow? How a Conversational Listener can Revolutionize Turn-Taking
Matthew Aylett, Andrea Carmantini and David Braude
10. Creating New Voices using Normalizing Flows
Piotr Biliński, Thomas Merritt, Abdelhamid Ezzerg, Kamil Pokora, Sebastian Cygert, Kayoko Yanagisawa, Roberto Barra-Chicote and Daniel Korzekwa
11. Phonetic Analysis of Self-supervised Representations of English Speech
Dan Wells, Hao Tang and Korin Richmond
12. Comparison of Audio-Visual Speech Enhancement Models with Hearing Aid Key Performance Indicators
Jasper Kirton-Wingate, Mandar Gogate, Amir Hussain and Tassadaq Hussain
14. Simulation of Teacher-Learner Interaction in English Language Pronunciation Learning
Elaf Islam and Thomas Hain
15. A New Benchmark Multi-modal Speech Corpus With Two Target Speakers
Jasper Kirton-Wingate, Adeel Hussain, Amir Hussain, Kia Dashtipour, Mandar Gogate and Peter Derleth
16. Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment
Vyas Raina and Mark Gales
17. Is there an auditory uncanny valley for synthesised speech?
Alice Ross, Catherine Lai and Martin Corley
18. Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora
Yuanchao Li, Yumnah Mohamied, Peter Bell and Catherine Lai
19. Cross lingual wav2vec finetuning in mutually intelligible language pairs
Jeffrey Josanne Michael, Toby Godwin and Oscar Saz
20. Phonetically Guided Transfer Learning for Low-Resource Accented English
Edward Storey and Naomi Harte
21. Dysarthric Speech Recognition From Raw Waveform with Parametric CNNs
Zhengjun Yue, Erfan Loweimi, Heidi Christensen, Jon Barker and Zoran Cvetkovic
22. Joint Modelling of Automatic Speaker Verification and Spoofing Countermeasure Systems
Poppy Welch and Jennifer Williams
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Chair: Prof Julie Berndsen
1. The 2nd Clarity Enhancement Challenge: A machine learning challenge for hearing aid speech intelligibility enhancement
Will Bailey, Michael Akeroyd, Jon Barker, Trevor Cox, John Culling, Simone Graetzer, Graham Naylor, Zuzanna Podwińska and Zehai Tu
2. Back to the Future: Extending the Blizzard Challenge 2013
Sébastien Le Maguer, Simon King and Naomi Harte
3. Fine Grained Spoken Document Summarization Through Text Segmentation
Samantha Kotey, Rozenn Dahyot and Naomi Harte
Room G.152, Teviot Lecture Theatre (Doorway 5, Old Medical School)
Future plans and farewell!
UK Speech organisers