
Programming Dense Linear Algebra Kernels on Vectorized Architectures

Date Issued
May 1, 2013
Author(s)
Peyton, Jonathan Lawrence
Advisor(s)
Gregory D. Peterson
Additional Advisor(s)
Michael W. Berry, Nathanael R. Paul
Abstract

The high-performance computing (HPC) community is obsessed with the general matrix-matrix multiply (GEMM) routine, and not without reason. Most, if not all, Level 3 Basic Linear Algebra Subprograms (BLAS) can be written in terms of GEMM, and the performance of many higher-level linear algebra solvers (e.g., LU, Cholesky) depends on GEMM's performance. Achieving high GEMM performance is highly architecture dependent, so GEMM has to be programmed and tested anew for each new architecture to reach maximal performance. Moreover, as emerging computer architectures feature more vector-based and multi- to many-core processors, GEMM performance hinges on how well these technologies are utilized. In this research, three Intel processor architectures are explored, including the new Intel MIC Architecture. Each architecture has a different vector length and number of cores. The effort required to create three Level 3 BLAS routines (GEMM, TRSM, SYRK) is examined with respect to the architectural features as well as some parallel algorithmic nuances. This examination culminates in a Cholesky (POTRF) routine, which serves as a legitimate test application. Lastly, these routines are implemented in four shared-memory parallel languages (OpenMP, Pthreads, Cilk, and TBB) to assess single-node supercomputing performance. Each routine is developed in each language, which yields information about which language is superior. A clear picture develops showing how these and similar routines should be written in OpenMP and exactly which architectural features chiefly impact performance.
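To illustrate how a Cholesky (POTRF) routine can be assembled from the Level 3 BLAS routines named in the abstract, here is a minimal sketch of a right-looking blocked factorization. It is not taken from the thesis: the function names, the block size `nb`, and the pure-Python list-of-lists matrix are assumptions chosen for readability, and the loops are neither vectorized nor parallel.

```python
import math


def potrf_block(a, lo, hi):
    """Unblocked Cholesky of the diagonal block a[lo:hi][lo:hi], in place (lower)."""
    for j in range(lo, hi):
        d = a[j][j] - sum(a[j][p] * a[j][p] for p in range(lo, j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        a[j][j] = math.sqrt(d)
        for i in range(j + 1, hi):
            s = sum(a[i][p] * a[j][p] for p in range(lo, j))
            a[i][j] = (a[i][j] - s) / a[j][j]


def trsm_panel(a, lo, hi, n):
    """TRSM-like step: solve X * L^T = A for the panel below the diagonal block."""
    for i in range(hi, n):
        for j in range(lo, hi):
            s = sum(a[i][p] * a[j][p] for p in range(lo, j))
            a[i][j] = (a[i][j] - s) / a[j][j]


def trailing_update(a, lo, hi, n):
    """SYRK/GEMM-like step: subtract panel * panel^T from the trailing lower triangle."""
    for i in range(hi, n):
        for j in range(hi, i + 1):
            a[i][j] -= sum(a[i][p] * a[j][p] for p in range(lo, hi))


def blocked_cholesky(a, nb=2):
    """Return lower-triangular L with L * L^T = a, for symmetric positive definite a."""
    n = len(a)
    l = [row[:] for row in a]  # work on a copy; only the lower triangle is used
    for lo in range(0, n, nb):
        hi = min(lo + nb, n)
        potrf_block(l, lo, hi)
        trsm_panel(l, lo, hi, n)
        trailing_update(l, lo, hi, n)
    for i in range(n):
        for j in range(i + 1, n):
            l[i][j] = 0.0  # clear the untouched upper triangle
    return l
```

In this sketch nearly all arithmetic for large matrices falls in `trailing_update`, which is the matrix-multiply-shaped step. That is one way to see why the performance of the factorization tracks the performance of GEMM-like kernels.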

Subjects

MIC
Vectorization
Linear Algebra
Matrix Multiply
Cholesky

Disciplines
Computer and Systems Architecture
Computer Engineering
Numerical Analysis and Scientific Computing
Degree
Master of Science
Major
Computer Engineering
Embargo Date
January 1, 2011
File(s)
Name
my_dissertation.pdf
Size
1.13 MB
Format
Adobe PDF
Checksum (MD5)
29d8247f69df72fc58e4ceb912c9fa61
