Multifractal analysis of DNA
This thesis presents two techniques for analyzing DNA using a multifractal methodology. The analysis is motivated by the intriguing possibility of identifying biological functionality from information contained within the DNA sequence. In addition, the analysis may give insight into the nature of DNA complexity and provide guidelines for selecting operating parameters, such as the minimum DNA sequence length that can be analyzed. The first technique breaks a DNA sequence into four subsequences, one for each constituent base, and treats each subsequence as a strange attractor from which the multifractal dimension may be estimated. Results show that the generated subsequences exhibit multifractal properties which can be localized to different positions along the sequences. A minimum window size of 256 bases and a scaling range from 64 to 256 bases are needed to estimate the multifractal measures. The second technique estimates the multifractal spectrum of DNA based on n-block entropies. The minimum window size was selected to be 1024 bases, with a scale range of one to three base pair sequence lengths. Experimental results show that DNA has a multifractal characteristic under this measure, and that the multifractality changes with position in a sequence. A phylogeny of organisms based on their multifractality was demonstrated with only two misclassifications, both of which may involve other unresolved issues.
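As a rough illustration of the quantity behind the second technique, the sketch below computes n-block entropies for n = 1 to 3 over 1024-base windows. It assumes plain Shannon entropy over overlapping n-blocks and non-overlapping windows. The function names `block_entropy` and `windowed_block_entropies` are hypothetical, and the thesis's actual multifractal spectrum estimator is not reproduced here.

```python
from collections import Counter
from math import log2


def block_entropy(seq, n):
    """Shannon entropy, in bits, of the overlapping n-blocks of seq.

    Assumption: overlapping blocks and plain Shannon entropy; the thesis
    may define its n-block entropies differently.
    """
    blocks = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def windowed_block_entropies(seq, window=1024, n_values=(1, 2, 3)):
    """Yield (start, {n: H_n}) for consecutive non-overlapping windows.

    Assumption: windows do not overlap; a trailing partial window is skipped.
    """
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        yield start, {n: block_entropy(chunk, n) for n in n_values}


if __name__ == "__main__":
    # Synthetic periodic sequence, used only to exercise the functions.
    demo = "ACGT" * 512
    for start, entropies in windowed_block_entropies(demo):
        print(start, entropies)
```

Comparing how the entropies vary from window to window gives a simple picture of position-dependent behavior along a sequence, though it is not a substitute for a full multifractal spectrum estimate.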