
A Scalable Architecture for Simplifying Full-Range Scientific Data Analysis

Date Issued
December 1, 2011
Author(s)
Kendall, Wesley James
Advisor(s)
Jian Huang
Additional Advisor(s)
Jack Dongarra, Joshua Fu, Richard Mills
Abstract

According to a recent exascale roadmap report, analysis will be the limiting factor in gaining insight from exascale data. Analysis problems that must operate on the full range of a dataset are among the most difficult. Some of the primary challenges in this regard come from disk access, data management, and programmability of analysis tasks on exascale architectures. In this dissertation, I have provided an architectural approach that simplifies and scales data analysis on supercomputing architectures while masking parallel intricacies from the user. My architecture has three primary general contributions: 1) a novel design pattern and implementation for reading multi-file and variable datasets, 2) the integration of querying and sorting as a way to simplify data-parallel analysis tasks, and 3) a new parallel programming model and system for efficiently scaling domain-traversal tasks.


The design of my architecture has allowed studies in several application areas that were not previously possible. Some of these include large-scale satellite data and ocean flow analysis. The major driving example is of internal-model variability assessments of flow behavior in the GEOS-5 atmospheric modeling dataset. This application issued over 40 million particle traces for model comparison (the largest parallel flow tracing experiment to date), and my system was able to scale execution up to 65,536 processes on an IBM BlueGene/P system.

Subjects

Large-Data Analysis

Visualization

MapReduce

Parallel I/O

Parallel Processing

Disciplines
Systems Architecture
Degree
Doctor of Philosophy
Major
Computer Science
Embargo Date
December 1, 2011
File(s)
Name
kendall.pdf
Size
12.02 MB
Format
Adobe PDF
Checksum (MD5)
2d922e2bf7fabfded82b68093a3b51e2
