Molecular representation modeling with graph neural networks for antibiotic discovery

dc.contributor.author: Liu, Chengyou
dc.contributor.examiningcommittee: Cardona, Silvia (Microbiology) [en_US]
dc.contributor.examiningcommittee: Gilmore, Colin (Electrical and Computer Engineering) [en_US]
dc.contributor.supervisor: Hu, Pingzhao
dc.contributor.supervisor: McLeod, Bob
dc.date.accessioned: 2022-05-16T21:07:17Z
dc.date.available: 2022-05-16T21:07:17Z
dc.date.copyright: 2022-04-28
dc.date.issued: 2022-04-28
dc.date.submitted: 2022-04-28T14:42:05Z [en_US]
dc.degree.discipline: Electrical and Computer Engineering [en_US]
dc.degree.level: Master of Science (M.Sc.) [en_US]
dc.description.abstract: Motivation: With the advent of large-scale compound screening enabled by high-throughput technologies, a variety of machine learning methods have been integrated into antibiotic discovery pipelines. Feature engineering, a data mining technique, is often one of the first steps used to mine patterns from big data and optimize predictive models. Clustering analysis, on the other hand, is a critical approach for gaining insight into the underlying biological relationships between gene products in high-dimensional chemical-genetic data. Converting molecules into computer-interpretable features that carry rich molecular information is a core problem for data-driven machine learning applications in chemical and drug-related tasks. As small molecules are naturally represented as graphs rather than grid-structured data, graph convolutional neural networks (GCNs), which can learn and aggregate the local information of molecules, have been used to predict molecular properties with great success. Furthermore, given the merits and many successful applications of transformers across artificial intelligence (AI) domains, it is desirable to integrate the self-attention mechanism into GCNs to construct better molecular representations.
Methods and Results: Chapter 1 reviews different types of molecular representations and the deep learning architectures to which they are applicable, and Chapter 2 describes the research objectives. Chapter 3 first applies statistical and machine learning approaches to evaluate which features are important for predicting bacterial growth inhibitory activity, then applies a directed message passing neural network to a large-scale compound screen against Burkholderia cenocepacia to predict the bacterial growth inhibition of drugs. In Chapter 4, an analytic framework based on a directed message passing neural network is developed to model large-scale chemical-genetic interaction profiles against Mycobacterium tuberculosis and predict drug mechanism of action. Finally, Chapter 5 proposes an atom and bond attention-based message passing neural network, namely ABT-MPNN, as an attempt to improve the molecular representation embedding process for antibiotic discovery.
Conclusion: This thesis provides analytical frameworks for both large-scale compound screening datasets and chemical-genetic interaction profiles, and generates hypotheses about the mechanism of action of novel drugs based on the predicted results. More importantly, by applying message passing neural networks to several tasks and by designing a novel attention-based message passing neural network, this thesis also highlights the importance of graph-based deep neural networks in drug discovery. [en_US]
dc.description.note: October 2022 [en_US]
dc.identifier.uri: http://hdl.handle.net/1993/36498
dc.language.iso: eng [en_US]
dc.rights: open access [en_US]
dc.subject: Graph neural networks [en_US]
dc.subject: Molecular representation [en_US]
dc.subject: Bacterial growth inhibitory activity [en_US]
dc.subject: Mechanism of action [en_US]
dc.title: Molecular representation modeling with graph neural networks for antibiotic discovery [en_US]
dc.type: master thesis [en_US]
project.funder.identifier: https://doi.org/10.13039/100010318 [en_US]
project.funder.name: University of Manitoba [en_US]
Files
Original bundle
Name: Liu_Chengyou.pdf
Size: 3.34 MB
Format: Adobe Portable Document Format
Description: Thesis