Dynamic Task Execution on Shared and Distributed Memory Architectures

Date Issued
December 1, 2012
Author(s)
YarKhan, Asim
Advisor(s)
Jack J. Dongarra
Additional Advisor(s)
Michael W. Berry, Kenneth Stephenson, Stanimire Tomov
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/22509
Abstract

Multicore architectures with high core counts have come to dominate the world of high performance computing, from shared memory machines to the largest distributed memory clusters. The multicore route to increased performance has a simpler design and better power efficiency than the traditional approach of increasing processor frequencies. However, standard programming techniques are not well adapted to this change in computer architecture design.


In this work, we study the use of dynamic runtime environments executing data-driven applications as a solution to programming multicore architectures. The goals of our runtime environments are productivity, scalability and performance. We demonstrate productivity by defining a simple programming interface to express code. Our runtime environments are experimentally shown to be scalable and give competitive performance on large multicore and distributed memory machines.

This work is driven by linear algebra algorithms, where state-of-the-art libraries (e.g., LAPACK and ScaLAPACK) using a fork-join or block-synchronous execution style do not use the available resources in the most efficient manner. Research work in linear algebra has reformulated these algorithms as tasks acting on tiles of data, with data dependency relationships between the tasks. This results in a task-based DAG for the reformulated algorithms, which can be executed via asynchronous data-driven execution paths analogous to dataflow execution.

We study an API and runtime environment for shared memory architectures that efficiently executes serially presented tile-based algorithms. This runtime is used to enable linear algebra applications and is shown to deliver performance competitive with state-of-the-art commercial and research libraries.
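
As a rough illustration of how a serially presented tile algorithm can yield a task DAG, the sketch below infers dependencies from per-task read/write annotations on tiles. It is not the dissertation's API: the function names (build_dag, wavefronts), the access-mode encoding, and the simplified task list (loosely modeled on the first step of a 3x3 tile Cholesky factorization, with every tile treated as either read or written) are all assumptions made for this example.

```python
from collections import defaultdict

READ, WRITE = "r", "w"  # hypothetical access-mode tags

# Hypothetical serial task stream: (kernel name, [(tile, mode), ...]).
TASKS = [
    ("POTRF", [("A00", WRITE)]),
    ("TRSM",  [("A00", READ), ("A10", WRITE)]),
    ("TRSM",  [("A00", READ), ("A20", WRITE)]),
    ("SYRK",  [("A10", READ), ("A11", WRITE)]),
    ("GEMM",  [("A20", READ), ("A10", READ), ("A21", WRITE)]),
    ("SYRK",  [("A20", READ), ("A22", WRITE)]),
]

def build_dag(tasks):
    """Return {task index: set of predecessor indices} from serial order."""
    last_writer = {}             # tile -> index of the last task that wrote it
    readers = defaultdict(list)  # tile -> tasks that read it since that write
    deps = {i: set() for i in range(len(tasks))}
    for i, (_, accesses) in enumerate(tasks):
        for tile, mode in accesses:
            if tile in last_writer:
                deps[i].add(last_writer[tile])   # read-after-write or write-after-write
            if mode == WRITE:
                deps[i].update(readers[tile])    # write-after-read
                last_writer[tile] = i
                readers[tile] = []
            else:
                readers[tile].append(i)
        deps[i].discard(i)  # a task never depends on itself
    return deps

def wavefronts(deps):
    """Group tasks into levels; tasks in one level could run concurrently."""
    done, levels = set(), []
    while len(done) < len(deps):
        ready = sorted(i for i in deps if i not in done and deps[i] <= done)
        levels.append(ready)
        done.update(ready)
    return levels

if __name__ == "__main__":
    print(wavefronts(build_dag(TASKS)))
```

In this sketch the two TRSM tasks depend only on the first POTRF, so they land in the same level and could run concurrently, which is the kind of parallelism a fork-join formulation tends to leave unused.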

We develop a runtime environment for distributed memory multicore architectures extended from our shared memory implementation. The runtime takes serially presented algorithms designed for the shared memory environment, and schedules and executes them on distributed memory architectures in a scalable and high performance manner. We design a distributed data coherency protocol and a distributed task scheduling mechanism which avoid global coordination. Experimental results with linear algebra applications show the scalability and performance of our runtime environment.
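
One way to picture scheduling without global coordination is sketched below, under stated assumptions: every process replays the same serial task stream and decides locally which tasks it runs and which tiles it must send or receive. The 2D block-cyclic ownership rule, the "task runs where its output tile lives" policy, and the names owner and local_plan are illustrative choices for this example and are not taken from the dissertation.

```python
def owner(tile, p_rows, p_cols):
    """Rank owning tile (i, j) under an assumed 2D block-cyclic layout."""
    i, j = tile
    return (i % p_rows) * p_cols + (j % p_cols)

def local_plan(rank, tasks, p_rows, p_cols):
    """Replay the whole serial task stream on one rank, keeping only its part.

    tasks: list of (name, input tiles, output tile), tiles as (i, j) tuples.
    Returns (names to run here, [(tile, from_rank)], [(tile, to_rank)]).
    """
    run, recv, send = [], [], []
    for name, inputs, output in tasks:
        exec_rank = owner(output, p_rows, p_cols)  # assumed placement policy
        for tile in inputs:
            src = owner(tile, p_rows, p_cols)
            if src == exec_rank:
                continue                           # tile is already local
            if rank == exec_rank:
                recv.append((tile, src))
            elif rank == src:
                send.append((tile, exec_rank))
        if rank == exec_rank:
            run.append(name)
    return run, recv, send

# Hypothetical task stream, loosely modeled on one step of a tile Cholesky.
TASKS = [
    ("POTRF_00", [], (0, 0)),
    ("TRSM_10", [(0, 0)], (1, 0)),
    ("TRSM_20", [(0, 0)], (2, 0)),
    ("SYRK_11", [(1, 0)], (1, 1)),
    ("GEMM_21", [(2, 0), (1, 0)], (2, 1)),
    ("SYRK_22", [(2, 0)], (2, 2)),
]

if __name__ == "__main__":
    for r in range(4):
        print(r, local_plan(r, TASKS, 2, 2))
```

Because each rank derives its sends and receives from the same deterministic stream, matching pairs arise without any rank asking another what to do. A real runtime would additionally need to track tile versions and overlap communication with computation, which this sketch omits.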

Disciplines
Computer and Systems Architecture
Numerical Analysis and Scientific Computing
Software Engineering
Systems Architecture
Degree
Doctor of Philosophy
Major
Computer Science
File(s)
Name
YarKhan_Asim_Dec_2012.pdf
Size
3.49 MB
Format
Adobe PDF
Checksum (MD5)
3c0802ac71d99b6a15e49e2a7f1ff2f9
