Voice Activated Un-Lock Technology (V.A.U.L.T)
A MATLAB-based Simulation
By Siddharth Advani (B2213401), Anand Gokhale (B2213420), Vishal Jain (B2213426)
Guided by Dr. P.M. Patil
OBJECTIVE
To make a correct decision on a speaker's identity claim, given a speech segment (the password).
MOTIVATION
• Speech contains speaker-specific characteristics
• Voiceprint as a biometric (distinguishing trait)
• Natural & economical way of identification
DEFINITIONS
• Client: a speaker registered on the system
• Impostor: a speaker who claims a false identity
• Mel-filtering: a frequency scaling that reflects how the ear responds roughly linearly to changes in frequency below 1000 Hz and logarithmically to changes in frequency above 1000 Hz
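The mel definition above can be illustrated with a common closed-form approximation, mel = 2595 * log10(1 + f / 700). The slides do not say which mel formula, if any, the project used, so the constants and the function names `hz_to_mel` and `mel_to_hz` are assumptions made for this sketch.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mels using the common 2595/700 approximation (assumed)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# Equal steps in Hz shrink on the mel scale as frequency rises.
for f in (250, 500, 1000, 2000, 4000):
    print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

With these constants 1000 Hz maps to roughly 1000 mel, and each further doubling of frequency adds progressively fewer mels.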
What is Simulation?
A simulation is the imitation of the operation of a real-world process or system over time. Using MATLAB as a tool, VAULT aims at simulating a voice recognition system.
MATLAB Features
• Interpreter, meant for simulation in R&D
• High performance numerical computation
• Signal Processing Toolbox
Visual Basic Features
• Easy to implement
• Very user friendly and interactive
• Compatible with MATLAB and any Windows version
• Less complicated than the GUI of MATLAB
• Any Microsoft application can be embedded in a VB program
PHASE 1 - IDENTIFICATION
[Block diagram with the blocks: WORD, FEATURE EXTRACTION, PATTERN RECOGNITION, USER ID, TRAINING, SYSTEM DATABASE]
PHASE 1 - IDENTIFICATION: PROCESS
[Process diagram: WORD, FEATURE EXTRACTION, VECTOR QUANTIZER, DECISION]
The word is sampled at 11.025 kHz. The word is divided into segments, with 256 samples in each segment.
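The segmentation step can be sketched as follows. The slides give only the sampling rate and the segment length, so this sketch assumes non-overlapping segments with no windowing and drops any incomplete trailing segment; `split_into_frames` is a hypothetical helper name.

```python
SAMPLE_RATE_HZ = 11025   # 11.025 kHz, from the slide
FRAME_LEN = 256          # samples per segment, from the slide

def split_into_frames(samples, frame_len=FRAME_LEN):
    """Cut a sample list into consecutive, non-overlapping frames (assumed)."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

# Duration of one segment in milliseconds.
frame_ms = 1000.0 * FRAME_LEN / SAMPLE_RATE_HZ

# One second of silence stands in for a recorded word.
one_second = [0.0] * SAMPLE_RATE_HZ
frames = split_into_frames(one_second)
print(len(frames), "frames of", round(frame_ms, 1), "ms each")
```

At this rate a 256-sample segment spans about 23 ms, a typical short-time analysis length for speech.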
Feature extraction: 8 cepstrum coefficients are calculated for each segment.
Vector quantizer: vector quantization is used to create the codebook. The cepstrum coefficients are quantized using a codebook of 128 vectors.
Decision: [Figure: the unknown word ("?") is compared with the entries for speakers 1 to 4, with distances of 8, 16, 5 and 12 shown; the figure marks CLIENT 3 as the result.]
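A minimal sketch of the identification decision, assuming the distance is the average squared Euclidean distance between each feature vector and its nearest codevector, and that the speaker with the smallest distance is chosen. The slides do not state the distance measure; the 2-D vectors and codebooks below are made-up toy data, not project data.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codevector."""
    return sum(min(squared_distance(v, c) for c in codebook)
               for v in vectors) / len(vectors)

def identify(vectors, codebooks):
    """Return the speaker tag with the smallest distortion, plus all scores."""
    scores = {tag: average_distortion(vectors, cb) for tag, cb in codebooks.items()}
    return min(scores, key=scores.get), scores

# Hypothetical codebooks for speakers tagged 1 to 4.
codebooks = {
    1: [[0.0, 0.0], [1.0, 1.0]],
    2: [[5.0, 5.0], [6.0, 6.0]],
    3: [[2.0, 2.0], [3.0, 3.0]],
    4: [[8.0, 8.0], [9.0, 9.0]],
}
unknown_word = [[2.1, 2.0], [2.9, 3.1]]
speaker, scores = identify(unknown_word, codebooks)
print("identified speaker:", speaker)
```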
IDENTIFICATION
Every speaker in the database is given a tag.
[Figure: database with speakers tagged 1 to 4; example word 'Zero'; result shown as 4]
PHASE 2 - AUTHENTICATION
[Block diagram with the blocks: PASSWORD, FEATURE EXTRACTION, PATTERN RECOGNITION, ACCEPT / REJECT, TRAINING, SYSTEM DATABASE]
PHASE 2 - AUTHENTICATION: PROCESS
[Process diagram: WORD, FEATURE EXTRACTION, VECTOR QUANTIZER, CODEBOOK]
The speech is sampled and the cepstrum coefficients are calculated the same way as in the identification phase.
This time the quantizer uses a personal codebook trained on the real user's password.
Decision: the distance between the spoken password and the claimed user's codebook is compared with a threshold, and the threshold determines whether the claim is accepted or rejected.
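The accept/reject step can be sketched as below. It assumes the distance is the average squared Euclidean distance to the nearest codevector of the claimed user's personal codebook, and that a distance equal to the threshold is accepted; neither detail is given on the slides. The codebook, attempts and threshold value are illustrative only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def average_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codevector."""
    return sum(min(squared_distance(v, c) for c in codebook)
               for v in vectors) / len(vectors)

def verify_claim(vectors, personal_codebook, threshold):
    """Accept the claim when the distortion does not exceed the threshold."""
    distance = average_distortion(vectors, personal_codebook)
    decision = "ACCEPT" if distance <= threshold else "REJECT"
    return decision, distance

user_codebook = [[1.0, 2.0], [3.0, 1.0]]      # hypothetical personal codebook
genuine_attempt = [[1.1, 2.1], [2.9, 0.8]]    # close to the codebook
impostor_attempt = [[6.0, 7.0], [8.0, 5.0]]   # far from the codebook
THRESHOLD = 5.0                               # illustrative value

print(verify_claim(genuine_attempt, user_codebook, THRESHOLD)[0])
print(verify_claim(impostor_attempt, user_codebook, THRESHOLD)[0])
```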
Main Obstacle
• How to define and extract the unique features of the human voice
CEPSTRUM
cepstrum(frame) = IDFT(log(|DFT(frame)|))
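The cepstrum formula above translates directly into code. This is a sketch using a naive O(N^2) DFT from the standard library rather than an FFT; keeping the first 8 values (including index 0) and flooring the magnitude before taking the log are assumptions, since the slides do not say which 8 coefficients were kept or how zero magnitudes were handled.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def cepstrum(frame, n_coeffs=8, floor=1e-12):
    """Real cepstrum: IDFT(log(|DFT(frame)|)), truncated to n_coeffs values."""
    log_magnitude = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [c.real for c in idft(log_magnitude)[:n_coeffs]]

# One 256-sample frame of a 440 Hz tone sampled at 11.025 kHz (test signal).
frame = [math.sin(2 * math.pi * 440 * t / 11025) for t in range(256)]
coeffs = cepstrum(frame)
print(len(coeffs), "cepstrum coefficients")
```

A quick sanity check: a unit impulse has a flat magnitude spectrum of 1, so its log spectrum and therefore its cepstrum are all zeros.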
PATTERN MATCHING
• Template model: deterministic; the better score is the minimum distance. Examples: Dynamic Time Warping, Vector Quantization, Nearest Neighbour
• Stochastic model: probabilistic; the better score is the maximum probability. Examples: Hidden Markov Model, Gaussian Mixture Model
VECTOR QUANTIZATION
Goal: finding how the data is clustered
• A (feature) vector space is broken into cells
• Speaker model: codebook
• Codebook: set of prototype vectors (codevectors)
• Codevector: vector computed from "similar" single (feature) vectors (e.g. 8 cepstrum coefficients make 1 codevector)
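The clustering idea in the bullets above can be sketched with plain k-means (Lloyd) iterations. The slides do not name the codebook training algorithm, so k-means is an assumption here, and the toy data uses 2 codevectors of 2 dimensions instead of the project's 128 codevectors of 8 cepstrum coefficients.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(vector, codebook):
    """Index of the codevector (cell) closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))

def train_codebook(vectors, size, iterations=20, seed=0):
    """Plain k-means: assign vectors to cells, then move codevectors to cell means."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        cells = [[] for _ in range(size)]
        for v in vectors:
            cells[nearest_index(v, codebook)].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codevector
                dim = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell) for d in range(dim)]
    return codebook

# Two well-separated toy clusters of 25 points each.
cluster_a = [[i * 0.1, j * 0.1] for i in range(5) for j in range(5)]
cluster_b = [[10 + i * 0.1, 10 + j * 0.1] for i in range(5) for j in range(5)]
vectors = cluster_a + cluster_b
codebook = train_codebook(vectors, size=2)
print(sorted(codebook))
```

Each codevector ends up at the mean of the "similar" vectors in its cell, which is the sense in which the codebook models the speaker.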
RESULTS
[Figure: ACCEPT and REJECT outcomes with THRESHOLD = 5]
PERFORMANCE EVALUATION
• False Rejection (FR): a client claiming his or her own identity is rejected
• False Acceptance (FA): an impostor claiming a client's identity is accepted
• Genuine Acceptance (GA): a client claiming his or her own identity is accepted
ACCURACY
• FAR (False Acceptance Rate): probability of false acceptance
  Estimate: FAR = (# false acceptances) / (# false claims)
• FRR (False Rejection Rate): probability of false rejection
  Estimate: FRR = (# false rejections) / (# true claims)
• GAR (Genuine Acceptance Rate): probability of genuine acceptance
  Estimate: GAR = (# true acceptances) / (# true claims)
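The three estimates above are simple ratios of counts. The sketch below uses made-up counts purely for illustration; they are not the project's results. It also relies on the fact that every true claim is either accepted or rejected, so GAR = 1 - FRR.

```python
def error_rates(false_accepts, false_claims, false_rejects, true_claims):
    """Return (FAR, FRR, GAR) estimated from trial counts."""
    far = false_accepts / false_claims
    frr = false_rejects / true_claims
    gar = (true_claims - false_rejects) / true_claims  # true acceptances / true claims
    return far, frr, gar

# Hypothetical trial: 50 impostor claims (3 accepted), 40 client claims (4 rejected).
far, frr, gar = error_rates(false_accepts=3, false_claims=50,
                            false_rejects=4, true_claims=40)
print("FAR:", far, "FRR:", frr, "GAR:", gar)
```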
THRESHOLD
The threshold T can be determined by:
1) choosing T to satisfy a fixed FA or FR criterion
2) varying T to find different FA/FR ratios and choosing T to give the desired FA/FR ratio
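Option 1 above can be sketched as a sweep over candidate thresholds. The sketch assumes a claim is accepted when its distance is at most T, so FAR can only grow as T grows, and it picks the largest T whose FAR stays within a fixed limit. The distance lists are hypothetical scores, not measured data.

```python
def rates_at(threshold, client_distances, impostor_distances):
    """Return (FAR, FRR) when claims with distance <= threshold are accepted."""
    frr = sum(d > threshold for d in client_distances) / len(client_distances)
    far = sum(d <= threshold for d in impostor_distances) / len(impostor_distances)
    return far, frr

def pick_threshold(client_distances, impostor_distances, max_far):
    """Largest observed distance usable as T while keeping FAR <= max_far."""
    best = None
    for t in sorted(set(client_distances) | set(impostor_distances)):
        far, _ = rates_at(t, client_distances, impostor_distances)
        if far <= max_far:
            best = t  # FAR never decreases with t, so later passes give lower FRR
    return best

# Hypothetical distances from true claims and from impostor claims.
client_distances = [2.1, 3.0, 3.8, 4.4, 6.0]
impostor_distances = [4.9, 5.5, 7.2, 8.0, 9.3]

t = pick_threshold(client_distances, impostor_distances, max_far=0.0)
print("T =", t, "(FAR, FRR) =", rates_at(t, client_distances, impostor_distances))
```

Printing `rates_at` for every candidate T gives the FA/FR trade-off needed for option 2.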
SOURCES OF ERROR
CLIENT:
• Bad pronunciation
• Extreme emotional states (e.g. stress)
• Sickness (head colds alter the vocal tract)
• Aging (the vocal tract can drift away from the models with age)
• Channel mismatch (using different microphones for enrollment and verification)
IMPOSTOR:
• Mimicry
AMBIENT NOISE
STRENGTHS & WEAKNESSES
Strengths
• SPEECH IS EASY TO PRODUCE
• LOW COMPUTATION REQUIREMENTS
Weaknesses
• SPEECH IS A BEHAVIORAL SIGNAL
• SPOOFING OF SYSTEMS
APPLICATIONS
• Security systems
• Voice dialing
• Access control to computers / databases
• Remote access to computer networks
• Electronic commerce
• Forensics
• Telephone banking
Hardware Application: Robotics
Aim: To control a robot via voice
Parallel Port Interface
• 25-pin D-type male connector
• Parallel port of the computer has 3 registers:
  • Data register
  • Status register
  • Control register
FM Transmitter-Receiver
• Frequency of operation: 433.92 MHz
• Modulation type: ASK
• Bandwidth: 200 kHz
FEA – The Robot
Features
• Wireless
• Prime mover: DC motors
Relay Driver IC ULN2803
• Array of eight Darlington pairs
• Internal free-wheeling diodes
• Inputs compatible with TTL logic
FEA’s Drivers
IC L293B Motor Driver IC
• Four channel drivers
• Bidirectional motor drive
• High voltage, high current output
PROJECT TIME DISTRIBUTION
• JAN: Participated at IIT Techfest
• FEB: (a) Submitted paper at Techkriti, Kanpur (b) Made FEA for Fervor at COEP (c) MATLAB & Visual Basic training
• MAR: Phase 1 & 2 completed in MATLAB
• APR: MATLAB & Visual Basic interface
• MAY: Evaluation of software: FAR, FRR & GAR
• JUNE: Application board
Future Expansion
• Implementation on a DSP board
• Making the system work in real time
• Speech recognition