Computational-driven understanding of the regulatory mechanisms of the breast cancer transcriptome and its implications for drug treatment
Breast cancer (BC) is the most commonly diagnosed cancer and the major cause of cancer mortality among women worldwide. As a heterogeneous disease, BC can be divided into four major molecular subtypes: luminal A, luminal B, HER2-enriched, and triple-negative breast cancer (TNBC). The TNBC subtype shows the shortest survival time among the four groups and lacks effective targeted therapeutic strategies. Different BC subtypes display different transcriptome profiles. However, the dysregulated pathways and transcriptional regulators underlying the gene expression profiles of different BC subtypes have yet to be fully elucidated. Current research investigates dysregulated genes in BC with the aim of identifying potential gene targets, while overlooking the fact that these genes are often part of a pathway. Moreover, among the multi-omics data, gene expression profiles have been shown to be the most informative data for developing anti-cancer drug response prediction models in silico. However, these models have typically been developed with individual genes rather than pathways. Therefore, this thesis aimed to explore the breast cancer transcriptome with a focus on the TNBC subtype to address three major questions: 1) exploring the regulatory mechanisms driving the unique expression pattern of different BC subtypes; 2) identifying compounds that could affect the expression pattern of the top dysregulated pathways in BC; and 3) developing a drug response prediction model by using BC pathway activity profiles inferred from the transcriptome profiles. To address the first question, we collected the multi-omics data of BC samples from The Cancer Genome Atlas (TCGA) dataset, including gene expression, DNA methylation, copy number variation (CNV) and microRNA (miRNA) profiles, the transcription factor (TF)-binding data from TRRUST v2.0, and the miRNA-binding data from starBase v3.0.
Using these data, a Lasso regression-based integrative analysis identified 25, 20, 15 and 24 key regulators for the luminal A, luminal B, HER2-enriched and TNBC subtypes, respectively. A closer look at the TNBC regulators found that many of them regulate the FOXM1 (i.e., PID_FOXM1_PATHWAY) and PPARA (i.e., BIOCARTA_PPARA_PATHWAY) pathways. To address the second question, we focused on the FOXM1 and PPARA pathways. Using the Connectivity Map (CMAP) database, which provides drug-induced gene expression changes in the MCF7 cell line, we investigated how different compounds change the activity and expression pattern of the two pathways. Nineteen drugs (such as 5109870, MG-132, MG-262, celastrol, resveratrol, and cephaeline) were found to decrease the FOXM1 pathway activity scores and reverse the FOXM1 pathway expression pattern, while 13 drugs (such as cephaeline, pararosaniline, cycloheximide, monensin, wortmannin, and raloxifene) were found to increase the PPARA pathway activity scores and reverse the PPARA pathway expression pattern. It may be of interest to validate these compounds experimentally. To address the third question, we collected the baseline gene expression profiles of 49 BC cell lines along with the IC50 values of these cell lines for 220 drugs from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset. Using these data, we developed a multiple-layer cell line-drug response network (ML-CDN2) by integrating a one-layer cell line similarity network based on the pathway activity profiles and a three-layer drug similarity network based on three types of drug information. ML-CDN2 demonstrated good prediction performance, with a Pearson correlation coefficient of 0.873 between the observed and predicted IC50 values across all cell line-drug pairs. Moreover, the ML-CDN2 model could be used to predict the drug response for new BC cell line samples or new BC patient-derived samples.
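As a rough illustration of the similarity-based idea behind the third question, the following Python sketch builds a cell line similarity matrix from pathway activity profiles, predicts IC50 values for each cell line from its most similar neighbours, and scores the predictions with a Pearson correlation coefficient. It is not the ML-CDN2 model: the drug similarity layers are omitted, the data are randomly generated toy values, and the function names (`cell_line_similarity`, `predict_ic50`, `pearson`) and the neighbour count `k` are assumptions introduced only for this example.

```python
import numpy as np


def cell_line_similarity(activity):
    """Pearson correlation between rows (cell lines) of a pathway activity matrix."""
    return np.corrcoef(activity)


def predict_ic50(similarity, ic50, target, k=3):
    """Predict one cell line's IC50 values as the similarity-weighted mean
    of its k most similar other cell lines (hypothetical, simplified scheme)."""
    sims = similarity[target].copy()
    sims[target] = -np.inf                           # exclude the target itself
    neighbours = np.argsort(sims)[-k:]               # indices of the k largest similarities
    weights = np.clip(sims[neighbours], 1e-6, None)  # keep weights strictly positive
    return weights @ ic50[neighbours] / weights.sum()


def pearson(x, y):
    """Pearson correlation coefficient over all flattened entries."""
    return float(np.corrcoef(np.ravel(x), np.ravel(y))[0, 1])


# Toy data only: 12 cell lines, 30 pathway activity scores, 5 drugs.
rng = np.random.default_rng(0)
n_lines, n_pathways, n_drugs = 12, 30, 5
activity = rng.normal(size=(n_lines, n_pathways))
loadings = rng.normal(size=(n_pathways, n_drugs))
ic50 = activity @ loadings + 0.1 * rng.normal(size=(n_lines, n_drugs))

similarity = cell_line_similarity(activity)
# Leave-one-out style prediction: each cell line is predicted from the others.
predicted = np.vstack([predict_ic50(similarity, ic50, i) for i in range(n_lines)])
r = pearson(ic50, predicted)
print(f"Pearson r between observed and predicted IC50: {r:.3f}")
```

The sketch predicts each cell line only from the other cell lines, which loosely mirrors the use case of predicting drug response for a new sample whose pathway activity profile is known.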
This thesis identified the transcriptional regulators underlying the transcriptome profiles of different BC subtypes. Moreover, it demonstrated the implications of the BC transcriptome for drug treatment by identifying drugs that modulate two dysregulated pathways in BC and by developing an anti-cancer drug response prediction model for BC that incorporates the transcriptome profiles.
Breast cancer, Transcriptomics and genomics, Cancer heterogeneity, Pathway analysis, Precision medicine, Machine learning