STATNet: Spectral and Temporal features based Multi-Task Network for Audio Spoofing Detection

Date Issued: 2022-01-01
Author(s): Ranjan, Rishabh; Vatsa, Mayank; Singh, Richa
DOI: 10.1109/IJCB54206.2022.10007949
Abstract
With the rise in mobile phone users and VoIP, voice has emerged as an easy and accessible biometric modality for identification and verification tasks. Given the increasing usage of voice biometrics, the security of these systems is also of paramount importance. Researchers have demonstrated that Automatic Speaker Verification (ASV) systems are prone to spoofing attacks such as synthetic or fake speech, which can be used maliciously for tasks such as impersonation, fake news spreading, and opinion formation. This research proposes a deep convolution-based multi-task network that performs both spoof detection and source identification for synthetic speech. The proposed model is evaluated on three datasets: ASVspoof2019 LA, FOR-Norm, and the In-the-Wild Audio Deepfake dataset. The results demonstrate EERs of 2.456%, 0.814%, and 0.199% on the ASVspoof2019 LA, FOR-Norm, and In-the-Wild Audio Deepfake datasets, respectively. In addition, we also present results for cross-dataset evaluation and speech source identification.
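The abstract describes a multi-task architecture: a shared convolutional encoder feeding two output heads, one for spoof detection and one for source identification. The following is a minimal illustrative sketch of that idea in PyTorch, not the authors' STATNet implementation; the layer sizes, number of source classes, loss weighting, and input shape are assumptions made for demonstration only.

```python
# Hypothetical multi-task CNN sketch: shared encoder + two heads
# (binary spoof detection, synthetic-speech source identification).
# All hyperparameters below are illustrative assumptions, not paper values.
import torch
import torch.nn as nn

class MultiTaskSpoofNet(nn.Module):
    def __init__(self, num_sources: int = 6):
        super().__init__()
        # Shared encoder over a (batch, 1, freq, time) spectro-temporal input.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.spoof_head = nn.Linear(64, 2)              # bona fide vs. spoofed
        self.source_head = nn.Linear(64, num_sources)   # which synthesis source

    def forward(self, x):
        shared = self.encoder(x)
        return self.spoof_head(shared), self.source_head(shared)

# Joint training step: weighted sum of the two cross-entropy losses.
model = MultiTaskSpoofNet()
features = torch.randn(8, 1, 80, 400)          # e.g. 80 mel bins x 400 frames
spoof_labels = torch.randint(0, 2, (8,))
source_labels = torch.randint(0, 6, (8,))
spoof_logits, source_logits = model(features)
loss = nn.functional.cross_entropy(spoof_logits, spoof_labels) \
     + 0.5 * nn.functional.cross_entropy(source_logits, source_labels)
loss.backward()
```

In this kind of setup, the shared encoder learns spectral and temporal cues useful to both tasks, while each head specializes; the reported EER figures would then be computed from the spoof-head scores.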