Abstract
Intelligent telecommunications solutions are now ubiquitous, and automatic speaker recognition (ASR) systems are an integral part of many of them. They are widely used in sectors such as banking, telecommunications, and forensics. Because the distinctive characteristics of the human voice can be analyzed automatically and extracted efficiently, such systems can identify, verify, and authorize a speaker. Currently, the vast majority of speaker recognition solutions rely on distinctive features that result from the structure of the speaker's vocal tract (laryngeal sound analysis), known as the physical features of the voice. Although such systems achieve an effectiveness of more than 95%, their further development is very difficult because the potential of distinctive physical features has been exhausted. The effectiveness of ASR systems based on physical features can be increased further by additionally taking into account the behavioral features of the speech signal, which is the subject of this article.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.