Reconstruction of ancestral non-coding RNA sequences using sequential and structural information with tree decomposition
Abstract
Ancestral sequence reconstruction aims to infer the content of biological sequences of interest in ancestral species that no longer exist. This is accomplished by comparing the sequences found in extant (i.e. living) species and extracting their similarities and differences. Because the search space is very large, much research has been devoted to designing efficient and accurate methods for different variations of this problem. Ancestral sequence reconstruction becomes even more complex when the sequences to be reconstructed are not well conserved in extant species. This is the case for non-coding RNA (ncRNA) sequences, whose structure (formed by base pairing) is more conserved than the sequences themselves. One recent approach to the ancestral reconstruction of ncRNA sequences considers the sequences of two related ncRNA families simultaneously. Although this helps avoid biases in the reconstruction, some cost calculations had to be simplified for efficiency. The goal of this thesis was to improve the cost calculation of that approach by using a more advanced structural model, together with tree decomposition to partition the cost calculation into subproblems. Our results demonstrate a substantial gain in accuracy and a significant reduction in the number of optimal sequences inferred.