Peak detection and statistical analysis of karyotypic variation from flow cytometry data

dc.contributor.authorHenry, Margot J M
dc.contributor.examiningcommitteeTurgeon, Max (Statistics)en_US
dc.contributor.examiningcommitteeHu, Pingzhao (Biochemistry and Medical Genetics)en_US
dc.contributor.examiningcommitteeMandal, Saumen (Statistics)en_US
dc.contributor.guestmembersMuthukumarana, Saman (Statistics)en_US
dc.contributor.supervisorGerstein, Aleeza C of Science (M.Sc.)en_US
dc.description.abstractKaryotypic variation is observed in fungal microbial populations isolated from ecological, clinical, and industrial environments and is also a hallmark of many types of cancer. To characterize and understand the dynamics of karyotype subpopulations, we require an unbiased computational method to identify different subpopulations and quantify the number of cells within them. Flow cytometry is the gold standard method for measuring the genome size of each cell in a population of interest. Cells within a population are typically measured from all phases of the cell cycle (G0/G1 prior to DNA replication, S phase during replication, and G2/M when cells have doubled their DNA but have not yet divided). Mathematical models can be fit to the distribution of genome sizes to determine the base ploidy of the population. These algorithms work only for single-ploidy populations. When there are multiple subpopulations of mixed ploidy, the researcher must manually divide the original population into subpopulations prior to analysis. This manual division is subject to considerable bias and is not feasible when there are multiple subpopulations. We developed an unbiased method to quantify karyotypic variation in populations from flow cytometry data and will release it as an open-source Bioconductor package, ploidyPeaks. The existing flowCore Bioconductor package was used to load flow cytometry data from reference cell populations with known and variable karyotypes into the R programming language. We used reference populations with known ploidy to determine a threshold for single-ploidy populations and flagged the rest as possible mixed-ploidy populations. We implemented a peak detection algorithm to identify G0/G1 and G2/M populations, identified karyotypic subpopulations within mixed populations, and applied nonlinear least squares to test how well data from each population fit the Dean-Jett-Fox cell cycle models, providing a confidence term. Our method improves on existing algorithms by providing a measure of model fit and the ability to quantify populations that contain multiple karyotypic subpopulations in an unbiased manner.en_US
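The peak detection step described in the abstract can be illustrated with a minimal sketch. This is not the ploidyPeaks implementation (the thesis works in R with flowCore); it is a plain Python stand-in run on simulated fluorescence values. All function names, the bin count, the smoothing window, the height threshold, and the ratio tolerance are assumptions chosen for illustration. It covers only the idea of locating the G0/G1 and G2/M peaks and checking that the second sits at roughly twice the first; the Dean-Jett-Fox fit by nonlinear least squares is not shown.

```python
import random


def histogram(values, n_bins, lo, hi):
    """Count values into n_bins equal-width bins on [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def smooth(counts, window=5):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(counts)):
        seg = counts[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def find_peaks(s, min_height, min_distance):
    """Local maxima above min_height, keeping the tallest first and
    dropping any candidate closer than min_distance bins to a kept one."""
    candidates = [i for i in range(1, len(s) - 1)
                  if s[i] >= min_height and s[i] > s[i - 1] and s[i] >= s[i + 1]]
    candidates.sort(key=lambda i: s[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)


def looks_single_ploidy(positions, tolerance=0.2):
    """Hypothetical check: exactly two peaks, the second near 2x the first."""
    if len(positions) != 2:
        return False
    return abs(positions[1] / positions[0] - 2.0) <= tolerance


# Simulated single-ploidy population (assumed values, arbitrary units):
# a G0/G1 peak near 100, S-phase cells in between, a G2/M peak near 200.
rng = random.Random(0)
cells = ([rng.gauss(100, 5) for _ in range(6000)]
         + [rng.uniform(100, 200) for _ in range(1000)]
         + [rng.gauss(200, 8) for _ in range(3000)])

LO, HI, N_BINS = 0.0, 400.0, 100
smoothed = smooth(histogram(cells, N_BINS, LO, HI))
peak_bins = find_peaks(smoothed, min_height=0.1 * max(smoothed), min_distance=10)

# Convert bin indices to bin-center positions on the fluorescence axis.
bin_width = (HI - LO) / N_BINS
positions = [LO + (i + 0.5) * bin_width for i in peak_bins]
single = looks_single_ploidy(positions)
```

In this sketch, a population that yields more than one G0/G1 and G2/M pair, or a pair whose ratio is far from 2, would be flagged as possibly mixed ploidy. The thesis instead derives that threshold from reference populations of known ploidy.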
dc.description.noteOctober 2022en_US
dc.description.sponsorshipNSERC CREATE in Visual and Automated Disease Analyticsen_US
dc.rightsopen accessen_US
dc.subjectPeak Detectionen_US
dc.subjectFlow Cytometryen_US
dc.subjectCell Cycle Modelsen_US
dc.titlePeak detection and statistical analysis of karyotypic variation from flow cytometry dataen_US
dc.typemaster thesisen_US
oaire.awardTitleAlexander Graham Bell Canada Graduate Scholarship-Master’s (CGS M)en_US
project.funder.nameNatural Sciences and Engineering Research Council of Canadaen_US