Date of Award
Master of Science
Jakob Luettgau, Silvina Caino-Lores, Michael W. Berry, Olga Pearce, Stephanie Brink
Multi-dimensional performance data analysis presents challenges for programmers, and users. Developers have to choose library and compiler options for each platform, analyze raw performance data, and keep up with new technologies. Users run codes on different platforms, validate results with collaborators, and analyze performance data as applications scale up. Site operators use multiple profiling tools to optimize performance, requiring the analysis of multiple sources and data types. There is currently no comprehensive tool to support the structured analysis of unstructured data, when holistic performance data analysis can offer actionable insights and improve performance. In this work, we present thicket, a tool designed based on the experiences and insights of programmers, and users to address these needs. Thicket is a Python-based data analysis toolkit that aims to make performance data exploration more accessible and user-friendly for application code developers, users, and site operators. It achieves this by providing a comprehensive interface that allows for the easy manipulation, modeling, and visualization of data collected from multiple tools and executions. The central element of Thicket is the ”thicket object,” which unifies data from multiple sources and allows for various data manipulation and modeling operations, includingfiltering, grouping, and querying, and statistical operations. Thicket also supports the useof external libraries such as scikit-learn and Extra-P for data modeling and visualization in an intuitive call tree context. Overall, Thicket aims to help users make better decisions about their application’s performance by providing actionable insights from complex and multi-dimensional performance data. Here, we present some capabilities extended by the components of thicket and important use cases that have implications beyond the data structure that provide these capabilities.
Lama, Vanessa, "Interactive Data Analysis of Multi-Run Performance Data. " Master's Thesis, University of Tennessee, 2023.