Application of polyscale methods for speaker verification

dc.contributor.author: Sedigh, Sina
dc.contributor.examiningcommittee: McLeod, Bob (Electrical and Computer Engineering); Wang, Yang (Computer Science)
dc.contributor.supervisor: Kinsner, Witold (Electrical and Computer Engineering)
dc.date.accessioned: 2018-06-28T13:58:41Z
dc.date.available: 2018-06-28T13:58:41Z
dc.date.issued: 2018-05-22
dc.date.submitted: 2018-06-27T18:22:34Z
dc.date.submitted: 2018-06-27T22:11:19Z
dc.degree.discipline: Electrical and Computer Engineering
dc.degree.level: Master of Science (M.Sc.)
dc.description.abstract: Voice is a characteristic of the human body that is unique to an individual, and it can be used in remote-access applications to verify an individual's identity. However, robust feature extraction is required, and the aim of this research is to establish security via the speaker's voice. All the experiments in this thesis are based on a dataset recorded in an anechoic chamber, available at the Applied Electromagnetic Laboratory at the University of Manitoba. The dataset consists of utterances recorded from 24 volunteers raised in the Province of Manitoba, Canada. To provide a repeatable set of test words covering all of the phonemes, the Edinburgh Machine Readable Phonetic Alphabet [KiGr08], consisting of 44 words, was used. The utterances were recorded at a sampling frequency of 44.1 kilosamples per second (kSps). The recording sessions took place between 10 AM and 3 PM, from March 27, 2017, until September 27, 2017. This thesis presents a study of text-independent speaker verification with the aim of experimentally evaluating features and embedding fractal algorithms in the front-end processing of the speaker verification system. A voice activity detection algorithm based on the variance fractal dimension was used to separate out the non-speech segments of the signal. A fusion of multiple features, namely the linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, Higuchi fractal dimension, variance fractal dimension, zero-crossing rate, and turns count, was used to form the feature vectors. An experimental sensitivity analysis was also conducted to test the effect of each feature on the classification accuracy of a support vector machine. The features were extracted using multiple voice activity detection algorithms. The best across-the-divide recognition accuracy of 91.60% was obtained by fusing all of the features extracted with the voice activity detection algorithm based on the variance fractal dimension. This shows that fusing features and embedding fractal methods in the front-end processing of text-independent speaker verification increases classification accuracy.
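The front end described in the abstract fuses fractal estimators (the Higuchi and variance fractal dimensions) with conventional temporal features before classification by a support vector machine. The following is a minimal illustrative sketch, not the MATLAB code attached to this record: it computes per-frame Higuchi and variance fractal dimensions, zero-crossing rate, and turns count in Python and fuses them into one feature vector per frame for an SVM. The function names, frame handling, lag set, k_max, and the use of NumPy and scikit-learn are assumptions for illustration, and the MFCC and LPCC components of the thesis feature set are omitted.

# Minimal sketch (not the thesis implementation): per-frame fractal and
# temporal features fused into one vector and classified with an SVM.
# Frames are assumed to be at least a few hundred samples long.
import numpy as np

def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension of a 1-D signal frame."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    ks = np.arange(1, k_max + 1)
    curve_lengths = []
    for k in ks:
        lengths_k = []
        for m in range(k):
            idx = np.arange(m, n, k)
            if len(idx) < 2:
                continue
            # Normalized curve length for this offset (Higuchi, 1988)
            lm = np.sum(np.abs(np.diff(x[idx]))) * (n - 1) / ((len(idx) - 1) * k**2)
            lengths_k.append(lm)
        curve_lengths.append(np.mean(lengths_k))
    # Slope of log L(k) versus log(1/k) estimates the fractal dimension
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(curve_lengths), 1)
    return slope

def variance_fd(x, lags=(1, 2, 4, 8, 16)):
    """Variance fractal dimension from the log-variance of increments over dyadic lags."""
    x = np.asarray(x, dtype=float)
    lags = np.asarray(lags)
    log_var = [np.log(np.var(x[s:] - x[:-s]) + 1e-12) for s in lags]
    slope, _ = np.polyfit(np.log(lags), log_var, 1)
    hurst = slope / 2.0          # Var of increments ~ lag^(2H)
    return 2.0 - hurst           # for a 1-D signal, D = 2 - H

def frame_features(frame):
    """Fuse fractal and temporal features for one speech frame."""
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0    # zero-crossing rate
    turns = np.mean(np.diff(np.sign(np.diff(frame))) != 0)  # relative turns count
    # MFCC and LPCC vectors from the thesis would be concatenated here as well.
    return np.array([higuchi_fd(frame), variance_fd(frame), zcr, turns])

# Usage sketch: 'frames' are speech-only frames retained by voice activity
# detection, 'labels' are the corresponding speaker identities.
# from sklearn.svm import SVC
# X = np.vstack([frame_features(f) for f in frames])
# clf = SVC(kernel="rbf").fit(X, labels)

A per-feature sensitivity analysis of the kind reported in the abstract can then be run by retraining the classifier on subsets of these fused features and comparing verification accuracy.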
dc.description.note: October 2018
dc.identifier.uri: http://hdl.handle.net/1993/33078
dc.language.iso: eng
dc.rights: open access
dc.subject: Speaker verification, Polyscale methods, Multifractal methods, Voice activity detection
dc.title: Application of polyscale methods for speaker verification
dc.type: master thesis
local.subject.manitoba: yes
Files
Original bundle
Name: Sedigh_Sina.pdf
Size: 15.49 MB
Format: Adobe Portable Document Format
Description: Thesis PDF file
Name: data.zip
Size: 124.06 MB
Format: ZIP archive
Description: Recorded data
Name: matlab_codes.zip
Size: 18.37 KB
Format: ZIP archive
Description: MATLAB code for the experiments
License bundle
Name: license.txt
Size: 2.2 KB
Description: Item-specific license agreed to upon submission