
Author: Benedetta Cevoli

Acknowledgements: Ana Olssen, Ben Leaman, Emma Davidson, Georgina Robertson, Harish Kumar, John Hughes, Liam Steadman, Markus Hennerbichler, Tom Young