Application of polyscale methods for speaker verification

Sedigh, Sina

dc.contributor.author	Sedigh, Sina
dc.contributor.examiningcommittee	McLeod, Bob (Electrical and Computer Engineering) Wang, Yang (Computer Science)	en_US
dc.contributor.supervisor	Kinsner, Witold (Electrical and Computer Engineering)	en_US
dc.date.accessioned	2018-06-28T13:58:41Z
dc.date.available	2018-06-28T13:58:41Z
dc.date.issued	2018-05-22	en_US
dc.date.submitted	2018-06-27T18:22:34Z	en
dc.date.submitted	2018-06-27T22:11:19Z	en
dc.degree.discipline	Electrical and Computer Engineering	en_US
dc.degree.level	Master of Science (M.Sc.)	en_US
dc.description.abstract	Voice is a characteristic of the human body that is unique to an individual, so it can be used to verify an individual’s identity in remote access applications. Doing so requires robust feature extraction, and the aim of this research is to establish security through the speaker’s voice. All the experiments in this thesis are based on a dataset recorded in an anechoic chamber, available at the Applied Electromagnetic Laboratory at the University of Manitoba. The dataset consists of utterances recorded from 24 volunteers raised in the Province of Manitoba, Canada. To provide a repeatable set of test words covering all of the phonemes, the Edinburgh Machine Readable Phonetic Alphabet [KiGr08], consisting of 44 words, was used. The utterances were recorded at a sampling frequency of 44.1 kilo-samples per second (kSps). The recording sessions took place between 10 AM and 3 PM, from March 27, 2017, until September 27, 2017. This thesis presents a study of text-independent speaker verification, with the aim of experimentally evaluating features and embedding fractal algorithms in the front-end processing of the speaker verification system. A voice activity detection algorithm based on the variance fractal dimension was used to separate out the non-speech segments of the signal. A fusion of multiple features, namely the linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, Higuchi fractal dimension, variance fractal dimension, zero-crossing rate, and turns count, was used to form the feature vectors. An experimental sensitivity analysis was also conducted to test the effect of each feature on the accuracy of classification using a support vector machine. The features were extracted using multiple voice activity detection algorithms.
The best across-the-divide recognition accuracy of 91.60% was obtained by fusing all the features extracted using the voice activity detection algorithm based on the variance fractal dimension. This shows that fusing features and embedding fractal methods in the front-end processing of text-independent speaker verification will increase the accuracy of the classifications.	en_US
dc.description.note	October 2018	en_US
dc.identifier.uri	http://hdl.handle.net/1993/33078
dc.language.iso	eng	en_US
dc.rights	open access	en_US
dc.subject	Speaker verification, Polyscale methods, Multifractal methods, Voice activity detection	en_US
dc.title	Application of polyscale methods for speaker verification	en_US
dc.type	master thesis	en_US
local.subject.manitoba	yes	en_US
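
The abstract names the variance fractal dimension, zero-crossing rate, and turns count among the fused features. As a rough illustration only, the sketch below computes these three quantities for a single signal frame with NumPy. It is not the thesis's implementation: the function names, the choice of lags, the unthresholded definition of a "turn" (any change in slope sign), and the three-element feature vector are all assumptions made for this example. The variance fractal dimension is estimated here as D = 2 - H, where H is half the slope of log-variance of increments against log-lag.

```python
import numpy as np

def variance_fractal_dimension(x, lags=(1, 2, 4, 8, 16)):
    """Estimate the variance fractal dimension of a 1-D frame (lags are an assumption)."""
    x = np.asarray(x, dtype=float)
    log_lag, log_var = [], []
    for k in lags:
        increments = x[k:] - x[:-k]          # amplitude increments at lag k
        v = np.var(increments)
        if v > 0:                            # skip lags with zero variance (log undefined)
            log_lag.append(np.log(k))
            log_var.append(np.log(v))
    if len(log_lag) < 2:
        return float("nan")                  # e.g. a constant frame
    slope = np.polyfit(log_lag, log_var, 1)[0]
    hurst = 0.5 * slope                      # var(increment) ~ lag^(2H)
    return 2.0 - hurst                       # embedding dimension E = 1 for a time series

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(np.asarray(x, dtype=float))
    return np.count_nonzero(signs[1:] != signs[:-1]) / (len(signs) - 1)

def turns_count(x):
    """Number of slope sign changes (no amplitude threshold; an assumption)."""
    d = np.diff(np.asarray(x, dtype=float))
    d = d[d != 0]                            # ignore flat segments
    return int(np.count_nonzero(np.signbit(d[1:]) != np.signbit(d[:-1])))

def frame_features(frame):
    """Fuse the three per-frame features into one vector (hypothetical layout)."""
    return np.array([
        variance_fractal_dimension(frame),
        zero_crossing_rate(frame),
        float(turns_count(frame)),
    ])
```

In a full pipeline such per-frame vectors would be concatenated with cepstral features (LPCC, MFCC) and the Higuchi fractal dimension before being passed to a classifier such as a support vector machine; those steps are omitted here.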