CNN based bi-directional prediction for complexity reduction of high efficiency video coding

dc.contributor.authorDe Silva, Tharuki Rangana
dc.contributor.examiningcommitteeAshraf, Ahmed (Electrical and Computer Engineering)en_US
dc.contributor.examiningcommitteeThomas, Gabriel (Electrical and Computer Engineering)en_US
dc.contributor.supervisorYahampath, Pradeepa
dc.date.accessioned2022-08-17T14:49:36Z
dc.date.available2022-08-17T14:49:36Z
dc.date.copyright2022-08-16
dc.date.issued2022-08-16
dc.date.submitted2022-08-16T17:22:57Zen_US
dc.degree.disciplineElectrical and Computer Engineeringen_US
dc.degree.levelMaster of Science (M.Sc.)en_US
dc.description.abstractReal-time video streaming has become the largest portion of internet traffic in recent years, so improving the efficiency of video coding remains an important research issue. Beyond the level of compression, two other factors determine the efficiency of a real-time video codec: decoded video quality and the computational complexity of the encoding and decoding processes. Modern video codecs rely on inter-frame prediction for efficient coding, but it is one of their most computationally expensive and time-consuming operations. Convolutional neural networks (CNNs) have been used in recent research for inter-frame prediction tasks, but the CNN architectures in previous work were designed without regard to model complexity or computational efficiency. The objective of this thesis is to develop a CNN-based, low-complexity bi-prediction algorithm for video coding. The contribution of this thesis consists of three parts. In the first part, a simple floating-point CNN architecture is developed to perform the bi-prediction operation in video coding with an accuracy comparable to that of the motion estimation and compensation used in modern video encoders. In the second part, this architecture is quantized to derive an integer-arithmetic-only CNN that further reduces the computational complexity. The encoding time with the integer CNN is shown to be considerably lower than with the floating-point CNN, and the experimental results show that the conversion causes only a minor loss of prediction accuracy. In the final part, it is shown experimentally that the proposed integer-arithmetic CNN bi-prediction algorithm has a lower computational cost and better video quality than conventional motion-estimation-based bi-prediction. Further, it is shown that CNN-based bi-prediction can contribute to a rate-distortion performance improvement in video coding.en_US
dc.description.noteOctober 2022en_US
dc.identifier.urihttp://hdl.handle.net/1993/36695
dc.language.isoengen_US
dc.rightsopen accessen_US
dc.subjectBi-predictionen_US
dc.subjectConvolutional neural network (CNN)en_US
dc.subjectHigh efficiency video coding (HEVC)en_US
dc.subjectComputational complexityen_US
dc.subjectInteger CNNsen_US
dc.titleCNN based bi-directional prediction for complexity reduction of high efficiency video codingen_US
dc.typemaster thesisen_US
local.subject.manitobanoen_US
oaire.awardTitleUniversity of Manitoba Graduate Fellowshipen_US
oaire.awardURIhttps://umanitoba.ca/graduate-studies/funding-awards-and-financial-aid/university-manitoba-graduate-fellowship-umgfen_US
project.funder.identifierhttp://dx.doi.org/10.13039/100010318en_US
project.funder.nameUniversity of Manitobaen_US
Files
Original bundle
Name: DeSilva_Tharuki.pdf
Size: 8.12 MB
Format: Adobe Portable Document Format
Description: Thesis
License bundle
Name: license.txt
Size: 2.2 KB
Format: Item-specific license agreed to upon submission