Reconstruction of ancestral non-coding RNA sequences using sequential and structural information with tree decomposition
dc.contributor.author | Hu, Songdi | |
dc.contributor.examiningcommittee | Durocher, Stephane (Computer Science) | |
dc.contributor.examiningcommittee | Turgeon, Maxime (Statistics) | |
dc.contributor.supervisor | Tremblay-Savard, Olivier | |
dc.date.accessioned | 2025-01-17T22:27:56Z | |
dc.date.available | 2025-01-17T22:27:56Z | |
dc.date.issued | 2024-12-31 | |
dc.date.submitted | 2025-01-01T00:42:52Z | en_US |
dc.date.submitted | 2025-01-17T20:01:11Z | en_US |
dc.degree.discipline | Computer Science | |
dc.degree.level | Master of Science (M.Sc.) | |
dc.description.abstract | Ancestral sequence reconstruction aims to infer the content of certain biological sequences of interest in ancestral species that no longer exist. This is accomplished by comparing the sequences of extant (i.e., living) species and extracting their similarities and differences. Because the search space is large, much research has been devoted to designing efficient and accurate methods for different variations of this problem. However, ancestral sequence reconstruction becomes even more complex when the goal is to reconstruct the ancestors of sequences that are not well conserved in extant species. This is the case with non-coding RNA (ncRNA) sequences, for which the structure (formed by base pairing) is more conserved than the sequences themselves. One recent approach to the ancestral reconstruction of ncRNA sequences considered the sequences of two related ncRNA families simultaneously. Although this helped avoid biases in the reconstruction, some cost calculations had to be simplified for efficiency. In this thesis, the goal was to improve the cost calculation of that approach by using a more advanced structural model and by applying tree decomposition to partition the cost calculation into subproblems. Our results demonstrate a substantial gain in accuracy and a significant reduction in the number of optimal sequences inferred. | |
dc.description.note | February 2025 | |
dc.identifier.uri | http://hdl.handle.net/1993/38836 | |
dc.language.iso | eng | |
dc.subject | ncRNA | |
dc.subject | ancestral inference | |
dc.subject | secondary structure | |
dc.subject | tree decomposition | |
dc.subject | phylogeny | |
dc.title | Reconstruction of ancestral non-coding RNA sequences using sequential and structural information with tree decomposition | |
local.subject.manitoba | no | |
oaire.awardNumber | RGPIN-2016-06051 | |
oaire.awardTitle | Analysis of tRNA gene evolution in bacterial genomes | |
oaire.awardURI | https://www.nserc-crsng.gc.ca/ase-oro/Details-Detailles_eng.asp?id=749933 | |
project.funder.identifier | https://doi.org/10.13039/501100000038 | |
project.funder.name | Natural Sciences and Engineering Research Council of Canada |