
Multipartite Graph Algorithms for the Analysis of Heterogeneous Data

Date Issued
December 1, 2015
Author(s)
Phillips, Charles Alexander  
Advisor(s)
Michael A. Langston
Additional Advisor(s)
Bruce J. MacLennan, Brynn H. Voy, David J. Icove
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/24736
Abstract

The explosive growth in the rate of data generation in recent years threatens to outpace the growth in computer power, motivating the need for new, scalable algorithms and big data analytic techniques. No field may be more emblematic of this data deluge than the life sciences, where technologies such as high-throughput mRNA arrays and next-generation genome sequencing are routinely used to generate datasets of extreme scale. Data from experiments in genomics, transcriptomics, metabolomics and proteomics are continuously being added to existing repositories. A goal of exploratory analysis of such omics data is to illuminate the functions and relationships of biomolecules within an organism. This dissertation describes the design, implementation and application of graph algorithms that seek dense structure in data derived from omics experiments, in order to detect latent associations between often heterogeneous entities such as genes, diseases and phenotypes. Exact combinatorial solutions are developed and implemented, rather than approximations or heuristics, even when problems are exceedingly large or difficult. Datasets to which the algorithms are applied include time-series transcriptomic data from an experiment on the developing mouse cerebellum, gene expression data measuring acute ethanol response in the prefrontal cortex, and a predicted protein-protein interaction network. A bipartite graph model is used to integrate heterogeneous data types, such as genes with phenotypes and microbes with mouse strains. The techniques are then extended to an algorithm that enumerates dense substructure in multipartite graphs constructed from three or more heterogeneous data sources, with applications to functional genomics. Several new theoretical results are given regarding multipartite graphs and the multipartite enumeration algorithm. In all cases, practical implementations are demonstrated to expand the frontier of computational feasibility.

Subjects
graph algorithms
combinatorics
clustering
multipartite
biclique
Disciplines
Computational Biology
Degree
Doctor of Philosophy
Major
Computer Science
Embargo Date
January 1, 2011
File(s)
Name
dissertation_final.pdf
Size
2.5 MB
Format
Adobe PDF
Checksum (MD5)
db3594e0d1c44a1192bc533142dbbfa5
