Voice Comparison Using Acoustic Analysis and Generative Adversarial Network for Forensics

Authors: Kruthika S.G., Trisiladevi C Nagavi, P. Mahesha, Abhishek Kumar

Journal: International Journal of Image, Graphics and Signal Processing @ijigsp

Published in: Issue 2, Vol. 17, 2025.

Free access

Forensic Voice Comparison (FVC) is a scientific analysis used in digital forensics that examines audio recordings to determine whether they come from the same speaker or from different speakers. In this research work, the experiment proceeds in three stages: pre-processing, feature extraction, and classification. In pre-processing, a stationary noise reduction algorithm removes unwanted background noise and increases the clarity of the speech, which improves the overall audio quality by reducing distractions. Next, Mel Frequency Cepstral Coefficients (MFCC) are extracted from the audio signals as acoustic features that characterize the distinctive vocal patterns of different individuals. A Generative Adversarial Network (GAN) is then used to generate synthetic MFCC features and to augment the data samples. Finally, Logistic Regression (LR), realized using UK framework, classifies each comparison as true or false. On the Australian English datasets, the accuracy achieved is 62% with 3899 samples and 85% with a set of 985 samples.
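The classification stage described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: each recording is summarized as a 13-coefficient MFCC-like vector, a pair of recordings is represented by the absolute per-coefficient difference, and the logistic regression is trained with plain batch gradient descent. The data are randomly generated stand-ins, not the Australian English recordings, and all function names (`pair_features`, `make_synthetic_pairs`, and so on) are hypothetical. It is not the authors' implementation and does not reproduce the reported accuracies.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def pair_features(a, b):
    # Assumption: a pair is described by the absolute difference of its
    # two MFCC-like vectors, coefficient by coefficient.
    return [abs(x - y) for x, y in zip(a, b)]


def train_logistic_regression(X, y, lr=0.1, epochs=300):
    # Batch gradient descent on the mean log-loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # True means "same speaker", False means "different speakers".
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5


def make_synthetic_pairs(n_pairs, n_coeffs=13, seed=0):
    # Synthetic stand-in for real MFCC data: each speaker is a random
    # vector, and each recording is that vector plus small noise.
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_pairs):
        speaker = [rng.gauss(0.0, 1.0) for _ in range(n_coeffs)]
        same = i % 2 == 0
        other = speaker if same else [rng.gauss(0.0, 1.0) for _ in range(n_coeffs)]
        rec_a = [v + rng.gauss(0.0, 0.1) for v in speaker]
        rec_b = [v + rng.gauss(0.0, 0.1) for v in other]
        X.append(pair_features(rec_a, rec_b))
        y.append(1 if same else 0)
    return X, y


X_train, y_train = make_synthetic_pairs(200, seed=0)
X_test, y_test = make_synthetic_pairs(100, seed=1)
w, b = train_logistic_regression(X_train, y_train)
accuracy = sum(predict(w, b, x) == bool(t) for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic pairs: {accuracy:.2f}")
```

Because the synthetic pairs are far cleaner than real forensic recordings, the accuracy printed here says nothing about performance on actual speech. The sketch only shows how a true/false decision can be derived from paired feature vectors.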


Keywords: Generative Adversarial Network (GAN), Acoustic Features, Digital Forensics, Mel Frequency Cepstral Coefficients (MFCC), Logistic Regression (LR), Forensic Voice Comparison (FVC)

Short URL: https://sciup.org/15019719

IDR: 15019719   |   DOI: 10.5815/ijigsp.2025.02.07

Research article