Interactive Data Analysis of Multi-Run Performance Data

Date Issued
May 1, 2023
Author(s)
Lama, Vanessa  
Advisor(s)
Michela Taufer
Additional Advisor(s)
Jakob Luettgau
Silvina Caino-Lores
Michael W. Berry
Olga Pearce
Stephanie Brink
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/45812
Abstract

Multi-dimensional performance data analysis presents challenges for programmers and users. Developers have to choose library and compiler options for each platform, analyze raw performance data, and keep up with new technologies. Users run codes on different platforms, validate results with collaborators, and analyze performance data as applications scale up. Site operators use multiple profiling tools to optimize performance, which requires analyzing multiple sources and data types. There is currently no comprehensive tool to support the structured analysis of unstructured data, even though holistic performance data analysis can offer actionable insights and improve performance. In this work, we present Thicket, a tool designed based on the experiences and insights of programmers and users to address these needs. Thicket is a Python-based data analysis toolkit that aims to make performance data exploration more accessible and user-friendly for application code developers, users, and site operators. It achieves this by providing a comprehensive interface that allows for the easy manipulation, modeling, and visualization of data collected from multiple tools and executions. The central element of Thicket is the "thicket object," which unifies data from multiple sources and supports data manipulation and modeling operations, including filtering, grouping, querying, and statistical operations. Thicket also supports the use of external libraries such as scikit-learn and Extra-P for data modeling and visualization in an intuitive call tree context. Overall, Thicket aims to help users make better decisions about their application's performance by providing actionable insights from complex and multi-dimensional performance data. Here, we present some capabilities extended by the components of Thicket and important use cases that have implications beyond the data structure that provides these capabilities.
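
The idea of unifying data from multiple runs and then filtering, grouping, and computing statistics over it can be illustrated with a minimal sketch in plain Python. This is not Thicket's API and does not reproduce its data structure. The run names, metadata fields, timings, and the helper functions `unify`, `filter_rows`, `group_by`, and `node_stats` are all hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical data: three runs, each with run metadata and a profile that
# maps a call-tree node path to time in seconds. Values are illustrative only.
RUNS = {
    "run_a": {"metadata": {"compiler": "gcc", "ranks": 4},
              "profile": {"main": 10.0, "main/solve": 7.0, "main/io": 2.0}},
    "run_b": {"metadata": {"compiler": "gcc", "ranks": 8},
              "profile": {"main": 6.0, "main/solve": 4.0, "main/io": 1.5}},
    "run_c": {"metadata": {"compiler": "clang", "ranks": 4},
              "profile": {"main": 9.0, "main/solve": 6.0, "main/io": 2.5}},
}


def unify(runs):
    """Flatten per-run profiles into one list of rows, one per (node, run)."""
    rows = []
    for run_id, run in runs.items():
        for node, seconds in run["profile"].items():
            rows.append({"node": node, "run": run_id, "time": seconds,
                         **run["metadata"]})
    return rows


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def group_by(rows, key):
    """Split rows into sub-lists that share the same value for a column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def node_stats(rows):
    """Compute mean and sample standard deviation of time per call-tree node."""
    stats = {}
    for node, node_rows in group_by(rows, "node").items():
        times = [row["time"] for row in node_rows]
        stats[node] = {
            "mean": mean(times),
            # stdev needs at least two samples; report 0.0 otherwise.
            "stdev": stdev(times) if len(times) > 1 else 0.0,
        }
    return stats


rows = unify(RUNS)                                         # unify multiple runs
gcc_rows = filter_rows(rows, lambda row: row["compiler"] == "gcc")  # filter
by_ranks = group_by(rows, "ranks")                         # group by metadata
gcc_stats = node_stats(gcc_rows)                           # per-node statistics
```

Keying every row by both call-tree node and run is what lets the same table serve filtering on run metadata and aggregation per node; a real toolkit would likely use a richer tabular structure than a list of dictionaries.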

Subjects

Performance Analysis

High Performance Comp...

HPC

Data Analysis

Disciplines
Data Science
Software Engineering
Degree
Master of Science
Major
Computer Science
File(s)
Name
Thesis_2023_Vanessa.pdf
Size
1.7 MB
Format
Adobe PDF
Checksum (MD5)
aea487731485fa06739e7533ae75b5a9
