Distilling knowledge through student-teacher model and BERT for sentiment analysis

dc.contributor.author: Dong, Ximing
dc.contributor.examiningcommittee: Shaowei Wang
dc.contributor.examiningcommittee: Saumen Mandal
dc.contributor.supervisor: Thulasiraman, Parimala
dc.date.accessioned: 2022-12-05T14:52:14Z
dc.date.available: 2022-12-05T14:52:14Z
dc.date.copyright: 2022-12-04
dc.date.issued: 2022-12-04
dc.date.submitted: 2022-12-05T01:42:20Z
dc.degree.discipline: Computer Science
dc.degree.level: Master of Science (M.Sc.)
dc.description.abstract: Bi-directional Encoder Representations from Transformers (BERT) is the state-of-the-art deep learning model for pre-training natural language processing (NLP) tasks such as sentiment analysis. The BERT model dynamically generates word representations according to context and semantics using its bi-directional encoding and attention mechanism. Although the model improves precision on NLP tasks, it is compute-intensive and time-consuming to deploy on mobile or smaller platforms. In this thesis, to address this issue, we use knowledge distillation (KD), a "teacher-student" training technique, to compress the model. We use the BERT model as the "teacher" model to transfer knowledge to student models: "first-generation" convolutional neural networks and long short-term memory with an attention mechanism (LSTM-atten). We conduct various experiments on sentiment analysis benchmark data sets and show that the "student models" trained through knowledge distillation achieve up to a 70% improvement in accuracy, precision, recall, and F1-score compared to the same models trained without KD. We also investigate the convergence rate of the student models and compare the results to existing models in the literature. Finally, we show that compared to the full-size BERT model, our RNN-series models are 50 times smaller in size and retain approximately 96% of its performance on benchmark data sets.
dc.description.note: February 2023
dc.identifier.uri: http://hdl.handle.net/1993/36990
dc.language.iso: eng
dc.rights: open access
dc.subject: Natural Language Processing
dc.title: Distilling knowledge through student-teacher model and BERT for sentiment analysis
dc.type: master thesis
local.subject.manitoba: no
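The abstract describes knowledge distillation as a "teacher-student" training technique in which the student learns from the teacher's output distribution as well as the ground-truth labels. A minimal sketch of the standard distillation loss (temperature-softened cross-entropy against the teacher plus cross-entropy against the hard label, following the common KD formulation) is shown below; this is an illustration, not the thesis's implementation, and the function names and the temperature and weighting defaults are assumptions for the example.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T produces a softer distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, hard_label, T=2.0, alpha=0.5):
    """Weighted sum of a soft loss (cross-entropy between the student's and
    teacher's temperature-softened distributions) and a hard loss
    (cross-entropy against the ground-truth label)."""
    p_teacher = softmax(teacher_logits, T)
    p_student_soft = softmax(student_logits, T)
    soft_loss = -np.sum(p_teacher * np.log(p_student_soft))
    hard_loss = -np.log(softmax(student_logits)[hard_label])
    # The T**2 factor rescales soft-loss gradients so the two terms stay
    # comparable as the temperature changes.
    return alpha * (T ** 2) * soft_loss + (1 - alpha) * hard_loss
```

For example, a student whose logits agree with the teacher and the true label incurs a lower loss than one that contradicts both, so minimizing this objective pulls the small model toward the teacher's behavior.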
Files
Original bundle: XimingDongThesis.pdf, 1.63 MB, Adobe Portable Document Format. Description: Thesis
License bundle: license.txt, 2.2 KB. Item-specific license agreed to upon submission