Automatic Speaker Recognition using Voice Biometric

 

About Voice biometric and Speaker Recognition

Biometrics are physiological or behavioral measurements of an individual. They can be physiological, such as fingerprint, face, iris, retina, hand geometry, DNA and ear, or behavioral, such as signature, voice, gait and keystrokes. Voice biometrics is currently an active research area. Voice is particularly well suited to remote authentication, because speech can be captured and transmitted over an ordinary telephone or network connection. Its advantages include i) non-intrusiveness, ii) wide availability and ease of transmission, iii) low cost and small storage requirements, and iv) ease of use on compact electronic devices equipped with a microphone. On the other hand, it has disadvantages: i) low permanence, since the voice changes with aging, coughs and colds, and emotional state, ii) degradation under high background and network noise, and iii) sensitivity to room acoustics and device mismatch. Being a behavioral biometric, the human voice is not as unique as human DNA. Still, with a carefully designed scope and application, it can serve specific authentication requirements in everyday life. These considerations motivate the challenging task of recognizing a person's identity from voice alone, which is known as Automatic Speaker Recognition. Depending on the problem specification, the task is either Automatic Speaker Identification (determining who is speaking) or Automatic Speaker Verification (validating whether the speaker is the person he or she claims to be).
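The difference between identification and verification can be illustrated with a minimal Python sketch. It assumes each speaker is already represented by a short numeric feature vector and uses cosine similarity as the match score. The vectors, the speaker names, the function names and the 0.8 threshold are all hypothetical values chosen for illustration; they are not taken from the C-DAC system.

```python
import math

def cosine_score(a, b):
    """Cosine similarity between two feature vectors (higher means more alike)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify(query, enrolled):
    """Identification: pick the enrolled speaker whose model best matches the query."""
    return max(enrolled, key=lambda name: cosine_score(query, enrolled[name]))

def verify(query, claimed, enrolled, threshold=0.8):
    """Verification: accept the claimed identity only if its score clears a threshold."""
    return cosine_score(query, enrolled[claimed]) >= threshold

# Hypothetical enrolled speaker models and a query utterance (toy 3-dimensional features).
enrolled = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.2, 0.8, 0.5],
}
query = [0.85, 0.15, 0.35]

print(identify(query, enrolled))         # who is speaking?
print(verify(query, "alice", enrolled))  # is the speaker really "alice"?
print(verify(query, "bob", enrolled))    # is the speaker really "bob"?
```

Identification compares the query against every enrolled speaker (a 1:N search), whereas verification makes a single comparison against the claimed speaker's model (a 1:1 decision), which is why the threshold matters only in the second case.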

Available solution

Under the C-DAC Kolkata core research initiative, the Advanced Speech Processing section has developed a prototype Automatic Speaker Recognition System (SRS). It is a standalone, desktop-based person authentication application that takes speech input from a microphone and recognizes the speaker's identity using voice biometrics. The system comprises two separate software applications, one for Identification and one for Verification, aimed at corporate uses such as an employee attendance recorder and secured access to restricted office areas, respectively. The key technologies are Acoustic Signal Processing, which extracts speaker-specific characteristics from spoken utterances, and Pattern Recognition, which builds compact speaker models and matches the query voice pattern against them. In addition, a novel Pitch Based Dynamic Pruning (PBDP) algorithm has been introduced to optimize the search space and improve performance for large speaker populations.
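The general idea of using pitch to shrink the search space before matching can be sketched as follows. This is not the PBDP algorithm itself, whose details are not given here; it is a simplified illustration that assumes each enrolled speaker stores a mean pitch value alongside a feature model, and that candidates whose pitch lies outside a fixed tolerance of the query's pitch can be skipped. All names, data values and the 30 Hz tolerance are hypothetical.

```python
def prune_by_pitch(query_pitch_hz, speakers, tolerance_hz=30.0):
    """Keep only the speakers whose stored mean pitch is close to the query pitch."""
    return [
        name
        for name, info in speakers.items()
        if abs(info["mean_pitch_hz"] - query_pitch_hz) <= tolerance_hz
    ]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify_with_pruning(query_features, query_pitch_hz, speakers, tolerance_hz=30.0):
    """Match the query only against the pitch-pruned candidates."""
    candidates = prune_by_pitch(query_pitch_hz, speakers, tolerance_hz)
    if not candidates:
        # Fall back to the full population if pruning removed everyone.
        candidates = list(speakers)
    best = min(
        candidates,
        key=lambda name: squared_distance(query_features, speakers[name]["model"]),
    )
    return best, len(candidates)

# Hypothetical enrolled population (toy 2-dimensional models).
speakers = {
    "spk1": {"mean_pitch_hz": 110.0, "model": [1.0, 2.0]},
    "spk2": {"mean_pitch_hz": 125.0, "model": [1.5, 1.8]},
    "spk3": {"mean_pitch_hz": 210.0, "model": [1.3, 2.4]},
    "spk4": {"mean_pitch_hz": 230.0, "model": [3.0, 0.5]},
}

best, searched = identify_with_pruning([1.1, 2.0], 120.0, speakers)
print(best, searched)  # best match and number of models actually compared
```

In this toy population only two of the four models are compared against the query. The saving grows with population size, at the risk of rejecting the true speaker when the pitch estimate is unreliable, which is presumably why a dynamic rather than fixed pruning rule is attractive.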

Solution features :

Achievements so far :

Current activities :
