Doctoral Dissertations

Date of Award

12-2025

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Data Science and Engineering

Major Professor

Jack J. Dongarra

Committee Members

Heike Jagode, Anthony Danalis, Russell L. Zaretzki

Abstract

Efficient execution of scientific applications on high-performance computing (HPC) systems depends heavily on effective performance analysis. Performance analysis is the process of evaluating how an application behaves on an HPC system, including its interaction with the underlying hardware components, in order to identify potential areas for improvement. It typically involves measuring performance metrics such as speed, efficiency, resource utilization, and responsiveness. The insights gained allow the user of an HPC system to optimize hardware usage, pinpoint inefficiencies and bottlenecks, and ensure scalability. Many of the most informative performance metrics can only be obtained by monitoring hardware events (i.e., low-level operations tracked by the hardware itself). Without hardware events, the only available performance metric is execution time.

However, the sheer volume of hardware events in modern HPC systems is overwhelming, making them difficult for users to comprehend and use effectively. This dissertation mitigates the problem by developing an approach that quantitatively characterizes hardware events, automatically classifies them, and automatically derives meaningful performance metrics from them. To better understand the system behaviors captured by hardware events, it presents benchmarks consisting of well-defined operations that stress different hardware attributes in isolation. In addition, it presents an automated mathematical analysis that identifies key hardware events and defines useful performance metrics from them. Lastly, it establishes strategies for benchmarking the hardware shared among processor cores and identifying key inter-core events.

Included in

Data Science Commons