Doctoral Dissertations

Date of Award

8-2025

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Science

Major Professor

Michael R. Jantz

Committee Members

Michael Jantz, Michael Berry, Micah Beck, George Bosilca

Abstract

In the era of exascale computing, the efficiency of scientific applications is increasingly constrained by communication overhead rather than computational capacity. This dissertation addresses two critical bottlenecks in MPI (Message Passing Interface) communication: inefficient handling of non-contiguous data, and unexploited opportunities to hide the overhead of compressing communication data.

The first part of this work focuses on enhancing the Open MPI datatype engine. By redesigning internal datatype representations using a flattened IOVEC format and introducing Memory Access Rearrangements (MARs), this study minimizes the bookkeeping overhead associated with packing complex datatypes. A comprehensive performance model is developed to predict packing efficiency. Additionally, Just-In-Time (JIT) compilation is integrated into the datatype engine using libgccjit, allowing runtime generation of tailored packing functions. These functions eliminate conditional branching and are particularly effective for pipelined communication, yielding speedups of up to 3.65x. To mitigate the JIT compilation overhead, an offline caching mechanism is introduced, enabling reuse of compiled functions across multiple runs.

The second part of the dissertation explores early compression techniques to hide the overhead of data compression in MPI communication. A novel framework is proposed that leverages the Linux userfaultfd (uffd) mechanism to detect write access and strategically offload compression to idle CPU resources. By overlapping compression with the delay between the last write and message transmission, the framework masks compression latency without disrupting application logic. A detailed evaluation across representative benchmarks and real-world MPI applications demonstrates that this approach can potentially reduce communication time while preserving correctness and portability.

Together, these contributions present a holistic enhancement of the Open MPI communication stack. By bridging datatype optimization and early compression hiding, this work improves communication efficiency and scalability on current and emerging HPC platforms. The solutions are implemented as lightweight extensions at the MPI user level, requiring minimal developer intervention, and are compatible with existing MPI applications.
