Analysis And Predictions Of DNA Sequence Transformations On Grids

Joshi, Yadnyesh R

dc.contributor.advisor	Vadhiyar, Sathish
dc.contributor.author	Joshi, Yadnyesh R
dc.date.accessioned	2009-05-11T11:21:23Z
dc.date.accessioned	2018-07-31T05:08:45Z
dc.date.available	2009-05-11T11:21:23Z
dc.date.available	2018-07-31T05:08:45Z
dc.date.issued	2009-05-11T11:21:23Z
dc.date.submitted	2007
dc.identifier.uri	https://etd.iisc.ac.in/handle/2005/489
dc.description.abstract	Phylogenetics is the study of the evolution of organisms. Evolution occurs due to mutations of DNA sequences. The reasons behind these seemingly random mutations are largely unknown. Many algorithms build phylogenetic trees from DNA sequences, but the resulting trees carry certain uncertainties. Fine-grained analysis of these phylogenetic trees is both important and interesting for evolutionary biologists. In this thesis, we attempt to model the evolution of DNA sequences using cellular automata and to resolve the uncertainties associated with the phylogenetic trees. In particular, we determine the effect of neighboring DNA base pairs on the mutation of a base pair. A cellular automaton can be viewed as an array of cells that modifies itself in discrete time-steps according to a governing rule. The state of a cell at the next time-step depends on its current state and the states of its neighbors. We have used cellular automata rules for the analysis and prediction of DNA sequence transformations on computational grids. In the first part of the thesis, DNA sequence evolution is modeled as a cellular automaton in which each cell has one of four possible states, corresponding to the four bases. Phylogenetic trees are explored to find the cellular automata rules that may have guided the evolution. A master-client paradigm is used to exploit the parallelism in the sequence transformation analysis. Load-balancing and fault-tolerance techniques are developed to enable the execution of the explorations on grid resources. The analysis of the sequence transformations is used to resolve uncertainties associated with the phylogenetic trees, namely the intermediate sequences in the phylogenetic tree and the exact number of time-steps required for the evolution of a branch. 
The model is further used to compute statistics such as the most popular rules at a particular time-step in the evolutionary history of a branch in a phylogenetic tree. We have observed some interesting statistics regarding the unknown base pairs in the intermediate sequences of the phylogenetic tree and the most popular rules used for sequence transformations. The next part of the thesis deals with predicting future sequences from previous sequences. First, we try to identify the preserved sequences so that cellular automata rules can be applied selectively. Then, random strategies are developed as baseline benchmarks. A Roulette Wheel strategy is used for predicting future DNA sequences. Although the prediction strategies outperform the random benchmarks in most cases, the average performance improvement over the random strategies is not significant. The possible reasons are discussed.	en
dc.language.iso	en_US	en
dc.relation.ispartofseries	G21479	en
dc.subject	DNA Sequence	en
dc.subject	Grid Transformation	en
dc.subject	Phylogenetics	en
dc.subject	Cellular Automata	en
dc.subject	Phylogenetic Trees	en
dc.subject	DNA Sequence - Evolution	en
dc.subject	Grid Computing	en
dc.subject	Grids	en
dc.subject.classification	Biochemical Genetics	en
dc.title	Analysis And Predictions Of DNA Sequence Transformations On Grids	en
dc.type	Thesis	en
dc.degree.name	MSc Engg	en
dc.degree.level	Masters	en
dc.degree.discipline	Faculty of Engineering	en
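
The two mechanisms named in the abstract, a four-state cellular automaton over DNA bases and roulette-wheel selection, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names `ca_step` and `roulette_pick`, the radius-1 neighborhood, the wrap-around boundary, the rule-table format, and the idea of weighting the wheel by how often a rule was observed are all assumptions made for illustration; the abstract does not specify any of them.

```python
import random

BASES = "ACGT"  # the four cell states, one per DNA base


def ca_step(seq, rule):
    """Apply one synchronous cellular-automaton update to a DNA string.

    `rule` maps (left, centre, right) base triples to the centre cell's
    next base. Triples missing from `rule` leave the cell unchanged.
    Assumptions: radius-1 neighbourhood and wrap-around boundaries.
    """
    n = len(seq)
    return "".join(
        rule.get((seq[i - 1], seq[i], seq[(i + 1) % n]), seq[i])
        for i in range(n)
    )


def roulette_pick(weights, rng):
    """Pick a key with probability proportional to its non-negative weight.

    `weights` is a hypothetical mapping such as rule name -> number of
    times that rule was observed. Zero-weight keys are never selected.
    """
    items = [(key, w) for key, w in weights.items() if w > 0]
    total = sum(w for _, w in items)
    spin = rng.random() * total  # where the wheel stops, in [0, total)
    running = 0.0
    for key, w in items:
        running += w
        if spin < running:
            return key
    return items[-1][0]  # guard against floating-point round-off


# A toy rule: a C flanked by A and G mutates to T.
toy_rule = {("A", "C", "G"): "T"}
next_seq = ca_step("ACGT", toy_rule)

# A toy wheel: rule "r2" was seen three times as often as "r1".
chosen = roulette_pick({"r1": 1.0, "r2": 3.0}, random.Random(0))
```

In the toy call, only the second cell's neighborhood matches the rule, so the sequence `ACGT` becomes `ATGT`. Under these assumptions, a prediction strategy would spin the wheel to choose which observed rule to apply at the next time-step.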

Files in this item

Name: G21479.pdf
Size: 532.2Kb
Format: PDF

This item appears in the following Collection(s)

Supercomputer Education and Research Centre (SERC) [116]
