Enabling Reproducibility, Scalability, and Orchestration of Scientific Workflows in HPC and Cloud-Converged Infrastructure

Date Issued
August 1, 2024
Author(s)
Olaya, Paula Fernanda
Advisor(s)
Michela Taufer
Additional Advisor(s)
Michael Jantz, Jian Huang, Rodrigo Vargas, Yoonho Park, Jay Lofstead
Abstract

Scientific communities across different domains increasingly run complex workflows for their scientific discovery. Scientists require these workflows to be robust: they must be reproducible, scale in performance, and exhibit trustworthiness in terms of the computational techniques, infrastructures, and people involved. However, as scientists leverage advanced techniques (big data analytics, AI, and ML) and infrastructure (HPC and cloud), their workflows grow in complexity, creating new challenges in scientific computing that hinder robustness.


In this dissertation, we address the needs of diverse scientific communities across different fields and identify three main challenges that hinder the robustness of workflows: (i) lack of traceability, explainability, and reproducibility; (ii) hidden intermediate data that reduces scalability; and (iii) inefficient data management in workflow orchestration. We codesign scientific workflows and converged HPC and cloud infrastructure to develop robust science, bridging the gap between computational and domain scientists.

First, we develop fine-grained containerized environments that enable data traceability and results explainability by automatically annotating provenance information, advancing widespread reproducibility. Second, we integrate the workflows into HPC and cloud infrastructure and tune the storage technology to enable better I/O and data scalability. Finally, we provide a software architecture that enables efficient data management (scalable and trustworthy data) in the orchestration of scientific workflows while leveraging the high throughput and low latency of node-local storage.

Subjects

scientific workflows

high-performance comp...

cloud computing

reproducibility

scalability

orchestration

Disciplines
Data Science
Earth Sciences
Other Computer Sciences
Degree
Doctor of Philosophy
Major
Computer Science
File(s)
Name

Olaya_Dissertation_2024_2.pdf

Size

9.07 MB

Format

Adobe PDF

Checksum (MD5)

235ce0e941e69b80607209cf2a801921

  • Libraries at University of Tennessee, Knoxville