Doctoral Dissertations
Date of Award
8-2023
Degree Type
Dissertation
Degree Name
Doctor of Philosophy
Major
Computer Science
Major Professor
Michael W. Berry
Committee Members
Michael Jantz, Audris Mockus, Joan Lind
Abstract
Tensors, or n-way arrays, are useful for storing indexable data in an arbitrary number of dimensions. Interest in tensor analysis using tensor decomposition has expanded to a variety of fields, including data mining, signal processing, computer vision, and machine learning. Tensors modelling interesting data may also be sparse, meaning that the majority of their values are zero. These tensors can be extremely large and contain millions of entries that cannot be stored explicitly. To address this problem, various formats have arisen in the past decade to compress and compact such massive data. However, most of these existing structures are static and do not support tensor updates. This motivated the proposal of a new format in 2021, Hashed Coordinate Storage (HaCOO), a mode-agnostic format that stores sparse tensor indexes and values in a separate-chaining hash table so that arbitrary entries can be inserted and accessed in constant time. To investigate the benefits of this novel format, we introduce a MATLAB class to create and manipulate sparse tensors in HaCOO format. This class was evaluated alongside the MATLAB Tensor Toolbox on several real-world sparse tensor datasets to compare tensor update capability and the performance of MTTKRP (matricized tensor times Khatri-Rao product), a key kernel in Canonical Polyadic Decomposition. Additionally, we discuss how the HaCOO format can greatly accelerate building document tensors in a practical application of sparse tensor decomposition to a text analysis model.
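As a rough illustration of the idea described in the abstract, the sketch below keeps each nonzero as an (index tuple, value) pair in a separate-chaining hash table and computes MTTKRP by looping over the stored nonzeros. It is not the dissertation's MATLAB class: it is written in Python with NumPy, and the names `HashedCOO`, `set`, `get`, `items`, and `mttkrp`, the bucket count, and the use of Python's built-in tuple hash are all assumptions made for this example. Lookups and inserts are constant time only on average, provided the chains stay short.

```python
import numpy as np


class HashedCOO:
    """Toy sparse tensor: (index tuple, value) pairs in a separate-chaining hash table.

    Hypothetical illustration only; not the HaCOO MATLAB class itself.
    """

    def __init__(self, shape, nbuckets=64):
        self.shape = tuple(shape)
        # One Python list per bucket serves as the collision chain.
        self.buckets = [[] for _ in range(nbuckets)]
        self.nnz = 0

    def _chain(self, idx):
        # Assumption: Python's built-in tuple hash stands in for the
        # format's own hash function over the full index tuple.
        return self.buckets[hash(idx) % len(self.buckets)]

    def set(self, idx, value):
        """Insert a new entry or overwrite an existing one."""
        idx = tuple(idx)
        chain = self._chain(idx)
        for pos, (key, _) in enumerate(chain):
            if key == idx:
                chain[pos] = (idx, value)  # update in place
                return
        chain.append((idx, value))
        self.nnz += 1

    def get(self, idx):
        """Return the stored value, or 0.0 for an entry that is not stored."""
        idx = tuple(idx)
        for key, value in self._chain(idx):
            if key == idx:
                return value
        return 0.0

    def items(self):
        """Yield every stored (index tuple, value) pair, in bucket order."""
        for chain in self.buckets:
            yield from chain


def mttkrp(tensor, factors, mode):
    """MTTKRP for one mode, accumulated nonzero by nonzero.

    For each stored entry, the output row idx[mode] receives the value times
    the elementwise product of the matching rows of all other factor matrices.
    """
    rank = factors[0].shape[1]
    out = np.zeros((tensor.shape[mode], rank))
    for idx, value in tensor.items():
        row = np.full(rank, value, dtype=float)
        for m, factor in enumerate(factors):
            if m != mode:
                row *= factor[idx[m]]
        out[idx[mode]] += row
    return out
```

Because the hash is taken over the whole index tuple rather than being ordered by any one mode, the same table serves MTTKRP for every mode, which is one way to read "mode-agnostic" in the abstract.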
Recommended Citation
Charles, Jama MeiLi, "Hashed Coordinate Sparse Tensor Storage with MATLAB." PhD diss., University of Tennessee, 2023.
https://trace.tennessee.edu/utk_graddiss/8721