Repository logo
Log In(current)
  1. Home
  2. Colleges & Schools
  3. Graduate School
  4. Doctoral Dissertations
  5. Scalable Graph Algorithms with Applications in Genetics
Details

Scalable Graph Algorithms with Applications in Genetics

Date Issued
December 1, 2008
Author(s)
Zhang, Yun  
Advisor(s)
Michael A. Langston
Additional Advisor(s)
Michael W. Berry, Elissa J. Chesler, Lynne E. Parker
Link to full text
http://etd.utk.edu/2008/December2008Dissertations/ZhangYun.pdf
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/26750
Abstract

Graph theoretical approaches have been widely used to solve problems arising in bioinformatics and genomic analysis. In particular, enumeration problems such as maximal clique and maximal biclique finding are cores for addressing many biological problems, such as the integration of genome mapping data. However, the enumeration problems may become computation and memory bot- tlenecks for genome-scale elucidation of biological networks due to their NP-hard nature and the huge memory requirements.


Therefore, this research is interested in developing exact, scalable, and efficient algorithms for these biological problems. The Clique Enumerator including a maximal clique enumeration algo- rithm and its parallel implementation is developed to provide scalable and exact solutions to the maximal clique problem by taking advantage of (a) ultra-large globally addressable memory archi- tectures, (b) the theory of fixed parameter tractability, and (c) a novel bitmap data representation scheme for memory and search space reduction. An implementation of the parallel algorithm scales linearly to at least 64 processors.

Maximal biclique finding on bipartite graphs is the other focus of this research because of its key role in the study of gene-phenotype association graphs built from heterogeneous data types. A general purpose algorithm is developed for enumerating maximal bicliques in a bipartite graph, without any problem restriction. Experimental results using both gene expression data and syn- thetic data indicate that the new algorithm can be two to three orders of magnitude faster than the best previous alternatives.

The clique and biclique enumeration algorithms are applied to two applications in genetics: the linkage disequilibrium analysis in population genetics with clique algorithms, and the ontology discovery for systems genetics with biclique algorithms. The clique model is applied to extract linkage disequilibrium networks from all loci on the whole-genome. The structures of such networks are analyzed to compare the genetic structure of populations. The biclique model is applied to enumerate bicliques in gene-phenotype association graphs constructed from empirical gene sets. Bicliques are integrated into a directed acyclic graph to provide an empirical discovery of phenotype ontology.

Disciplines
Computer Sciences
Degree
Doctor of Philosophy
Major
Computer Science
Embargo Date
December 1, 2011
File(s)
Thumbnail Image
Name

ZhangYun.pdf

Size

2.32 MB

Format

Adobe PDF

Checksum (MD5)

916c2bece99520f5c13ef6f6dc949ad0

Built with DSpace-CRIS software - Extension maintained and optimized by 4Science

  • Privacy policy
  • End User Agreement
  • Send Feedback
  • Contact
  • Libraries at University of Tennessee, Knoxville
Repository logo COAR Notify