Doctoral Dissertations

Kernel-assisted and Topology-aware MPI Collective Communication among Multicore or Many-core Clusters

Teng MaFollow

Date of Award

12-2012

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Science

Major Professor

Jack J. Dongarra

Committee Members

George Bosilca, Jian Huang, Louis J. Gross

Abstract

Multicore or many-core clusters have become the most prominent form of High Performance Computing (HPC) systems. Hardware complexity and hierarchies not only exist in the inter-node layer, i.e., hierarchical networks, but also exist in internals of multicore compute nodes, e.g., Non Uniform Memory Accesses (NUMA), network-style interconnect, and memory and shared cache hierarchies.

Message Passing Interface (MPI), the most widely adopted in the HPC communities, suffers from decreased performance and portability due to increased hardware complexity of multiple levels. We identified three critical issues specific to collective communication: The first problem arises from the gap between logical collective topologies and underlying hardware topologies; Second, current MPI communications lack efficient shared memory message delivering approaches; Last, on distributed memory machines, like multicore clusters, a single approach cannot encompass the extreme variations not only in the bandwidth and latency capabilities, but also in features such as the aptitude to operate multiple concurrent copies simultaneously.

To bridge the gap between logical collective topologies and hardware topologies, we developed a distance-aware framework to integrate the knowledge of hardware distance into collective algorithms in order to dynamically reshape the communication patterns to suit the hardware capabilities. Based on process distance information, we used graph partitioning techniques to organize the MPI processes in a multi-level hierarchy, mapping on the hardware characteristics. Meanwhile, we took advantage of the kernel-assisted one-sided single-copy approach (KNEM) as the default shared memory delivering method. Via kernel-assisted memory copy, the collective algorithms offload copy tasks onto non-leader/not-root processes to evenly distribute copy workloads among available cores. Finally, on distributed memory machines, we developed a technique to compose multi-layered collective algorithms together to express a multi-level algorithm with tight interoperability between the levels. This tight collaboration results in more overlaps between inter- and intra-node communication.

Experimental results have confirmed that, by leveraging several technologies together, such as kernel-assisted memory copy, the distance-aware framework, and collective algorithm composition, not only do MPI collectives reach the potential maximum performance on a wide variation of platforms, but they also deliver a level of performance immune to modifications of the underlying process-core binding.

Recommended Citation

Ma, Teng, "Kernel-assisted and Topology-aware MPI Collective Communication among Multicore or Many-core Clusters. " PhD diss., University of Tennessee, 2012.
https://trace.tennessee.edu/utk_graddiss/1541

Download

Files over 3MB may be slow to open. For best results, right-click and select "save as..."

Included in

Computational Engineering Commons, Computer and Systems Architecture Commons

COinS

Doctoral Dissertations

Kernel-assisted and Topology-aware MPI Collective Communication among Multicore or Many-core Clusters

Date of Award

Degree Type

Degree Name

Major

Major Professor

Committee Members

Abstract

Recommended Citation

Included in

Search

Browse

Contributors

Links

About Trace

Doctoral Dissertations

Kernel-assisted and Topology-aware MPI Collective Communication among Multicore or Many-core Clusters

Author

Date of Award

Degree Type

Degree Name

Major

Major Professor

Committee Members

Abstract

Recommended Citation

Included in

Share

Search

Browse

Contributors

Links

About Trace