Doctoral Dissertations
Date of Award
8-2022
Degree Type
Dissertation
Degree Name
Doctor of Philosophy
Major
Energy Science and Engineering
Major Professor
Sudarsanam Babu
Committee Members
Daniel Jacobson, Helen A. Baghdoyan, Ralph Lydic, Michael A. Langston, James Fordyce
Abstract
With continuous improvements in biological data collection, new techniques are needed to better understand the complex relationships in genomic and other biological data sets. Explainable Artificial Intelligence (X-AI) techniques such as Iterative Random Forest (iRF) excel at finding interactions within data, such as genomic epistasis. This dissertation introduces new methods for mining these complex interactions and demonstrates them in a variety of scenarios. Applying iRF to Genome-Wide Epistasis Studies shows that the method robustly finds interacting sets of features in synthetic data without requiring the exponentially increasing computation time of many classic association study methods. Leveraging the non-parametric prediction capabilities of iRF, new genomic insights are used to improve Genomic Selection and Progeny Prediction. This capability enables breeders to make informed selections for crosses without first requiring a full progeny trial. A new algorithm, Tensor Iterative Random Forest (TiRF), expands upon the foundation of iRF to provide information on relationships not only between the features and targets of the model, but also between the targets themselves. This algorithm is validated by capturing information from gene regulatory networks from the DREAM competition. The impact of the SARS-CoV-2 virus has necessitated a method that can capture the changing genetic architecture of the virus and incorporate potential recombination events, paving the way for a better understanding of how the virus has changed and will change. A new method is introduced that identifies the likely parents of haplotypes designated as the result of recombination. Together, these new methods aim to provide stronger insight into the complexities of genetic architecture.
Recommended Citation
Romero, Jonathon C., "Better Understanding Genomic Architecture with the use of Applied Statistics and Explainable Artificial Intelligence." PhD diss., University of Tennessee, 2022.
https://trace.tennessee.edu/utk_graddiss/7431
Included in
Applied Statistics Commons, Artificial Intelligence and Robotics Commons, Bioinformatics Commons, Data Science Commons, Plant Breeding and Genetics Commons