dc.contributor.author Sarumi, Oluwafemi
dc.contributor.author Leung, Carson
dc.contributor.author Adetunmbi, Adebayo
dc.date.accessioned 2019-01-08T18:19:47Z
dc.date.available 2019-01-08T18:19:47Z
dc.date.issued 2018
dc.date.submitted 2019-01-08T07:32:14Z en
dc.identifier.citation O.A. Sarumi, C.K. Leung, A.O. Adetunmbi. Spark-based data analytics of sequence motifs in large omics data. Procedia Computer Science, 126 (2018), pp. 596-605 en_US
dc.identifier.uri http://hdl.handle.net/1993/33656
dc.description.abstract The data explosion in bioinformatics in recent years has created new challenges for researchers, who must develop novel techniques to discover new knowledge from the avalanche of omics data (e.g., genomics, proteomics, transcriptomics). These data are embedded with a wealth of information, including frequently repeated patterns (i.e., sequence motifs). In genomics, deoxyribonucleic acid (DNA) sequence motifs are short, contiguous, frequently repeated subsequences located in the promoter region. Due to the high volume and varying degrees of veracity of the DNA datasets generated by next-generation sequencing techniques, sequence motif mining from DNA sequences poses a major challenge in bioinformatics. In this article, we present a distributed sequential algorithm for DNA sequence motif mining, which uses the MapReduce programming model on a cluster of homogeneous distributed-memory systems running the Apache Spark computing framework. Experimental results show the effectiveness of our algorithm in Spark-based data analytics of sequence motifs in large omics data. en_US
dc.description.sponsorship Natural Sciences and Engineering Research Council of Canada (NSERC); Tertiary Education Trust Fund (TETFund) of Nigeria; University of Manitoba en_US
dc.language.iso en en_US
dc.publisher Elsevier en_US
dc.rights info:eu-repo/semantics/openAccess
dc.subject bioinformatics en_US
dc.subject Spark en_US
dc.subject MapReduce en_US
dc.subject deoxyribonucleic acid (DNA) en_US
dc.subject genomics en_US
dc.subject sequence motifs en_US
dc.title Spark-based data analytics of sequence motifs in large omics data en_US
dc.type Article en_US
dc.type info:eu-repo/semantics/article
dc.identifier.doi http://dx.doi.org/10.1016/j.procs.2018.07.294
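
Illustrative note (not part of the deposited record): the abstract describes mining frequent contiguous subsequences with a MapReduce model, and the sketch below shows that general idea in plain Python. It is a simplified stand-in, not the authors' algorithm: it only counts fixed-length windows (k-mers) and keeps those that reach a threshold. The function names (`map_kmers`, `reduce_counts`, `frequent_kmers`), the parameters `k` and `min_count`, and the sample reads are all hypothetical. On a Spark cluster, the two phases would plausibly correspond to `flatMap` and `reduceByKey` on an RDD, but that mapping is an assumption about how one might port it, not something the record states.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def map_kmers(sequence: str, k: int) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (k-mer, 1) for every contiguous window of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k], 1


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Counter:
    """Reduce phase: sum the emitted counts per k-mer key."""
    counts: Counter = Counter()
    for kmer, n in pairs:
        counts[kmer] += n
    return counts


def frequent_kmers(sequences: Iterable[str], k: int, min_count: int) -> Dict[str, int]:
    """Return the k-mers whose total occurrence count is at least min_count.

    Hypothetical helper: a real motif miner would also handle variable
    lengths, mismatches, and per-sequence support.
    """
    pairs = (pair for seq in sequences for pair in map_kmers(seq, k))
    counts = reduce_counts(pairs)
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Made-up sample reads, for illustration only.
    reads = ["ACGTACGT", "TTACGTAA", "GGGACGTC"]
    print(frequent_kmers(reads, k=4, min_count=3))
```

Keeping the map and reduce steps as separate functions mirrors the programming model named in the abstract: the map step needs only one sequence at a time, so it can run independently on each partition, while the reduce step merges counts by key.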

