Date of Award

8-2016

Degree Type

Dissertation

Degree Name

Doctor of Philosophy

Major

Computer Science

Major Professor

Michael W. Berry

Committee Members

Jack Dongarra, Louis Gross, Qing Cao

Abstract

Latent Semantic Analysis (LSA) is a mathematically based machine learning technology that has demonstrated success in numerous applications in text analytics and natural language processing. Central to the technique is the construction of a large hyper-dimensional space, an LSA space, which defines the relationships between the information items being processed. This space serves as a semantic mapping system that represents learned meaning derived from the input content. The meaning represented in an LSA space, and therefore the mappings generated and the quality of the results obtained from using the space, depend entirely on the content used to construct it. Modifying the content used to build an LSA space readily changes the meaning represented by the space, but in current practice the impact of these changes on the overall body of meaning represented by the space is not understood. The research described here develops quantitative measures to identify the impact of changes in the content of an LSA space on the meaning represented by that space. These measures will facilitate the comparison of different LSA spaces to assess their degree of semantic similarity. This insight will in turn provide leverage for answering questions about how the characteristics of LSA spaces relate to the overall body of meaning that they represent.
