Using latent semantic indexing for data mining

Date Issued
December 1, 1997
Author(s)
Jiang, Jingqian
Advisor(s)
Michael W. Berry
Additional Advisor(s)
Bradley Vander Zanden
June Donato
Permanent URI
https://trace.tennessee.edu/handle/20.500.14382/31787
Abstract

Data mining is the application of algorithms for extracting valuable information from large databases in order to make important business decisions. This study explores a new technique for data mining: Latent Semantic Indexing (LSI). LSI is an efficient information retrieval method for textual documents. By computing the singular value decomposition (SVD) of a large sparse term-by-document matrix, LSI constructs an approximate vector space model that represents important associative relationships between terms and documents that are not evident in individual documents. This thesis explores the applicability of the LSI model to numerical databases, especially consumer product data. By properly choosing attributes of data records as terms or documents, a term-by-document incidence matrix is built, and a distribution-based indexing scheme is then employed to construct a correlated distribution matrix. A similar LSI vector space model can then be generated to detect useful or hidden patterns in the databases. The extracted information can be validated using statistical hypothesis testing or resampling. LSI is an automatic yet intelligent indexing method, and its application to numerical data introduces a promising way to discover knowledge in important commercial application areas such as retail and consumer banking.
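The core LSI step named in the abstract, a truncated SVD of a term-by-document matrix followed by comparison in the reduced space, can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy matrix `A`, the rank `k = 2`, and the helper names `lsi_fit`, `fold_in_query`, and `cosine_scores` are all hypothetical, a dense NumPy SVD stands in for the sparse solver a large matrix would need, and the distribution-based indexing scheme is not reproduced here.

```python
import numpy as np


def lsi_fit(A, k):
    """Rank-k truncated SVD of a term-by-document matrix A (terms x docs)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in_query(q, Uk, sk):
    """Map a term-space query vector into the k-dimensional LSI space
    using the standard fold-in formula q^T U_k Sigma_k^{-1}."""
    return (q @ Uk) / sk


def cosine_scores(q_hat, Vtk):
    """Cosine similarity between the folded-in query and every document."""
    docs = Vtk.T  # one row of LSI coordinates per document
    num = docs @ q_hat
    den = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_hat)
    return num / np.where(den == 0, 1.0, den)


# Hypothetical toy data: rows are terms, columns are documents.
# Terms 0-2 co-occur across documents 0-2; terms 3-4 occur in documents 3-4.
A = np.array([
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

Uk, sk, Vtk = lsi_fit(A, k=2)

# Query containing only term 0, which never appears in document 2.
q = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
scores = cosine_scores(fold_in_query(q, Uk, sk), Vtk)
print(np.round(scores, 3))
```

In this toy case document 2 scores as highly as documents 0 and 1 even though it does not contain term 0, because it shares other terms with them. That is the kind of associative relationship "not evident in individual documents" that the abstract describes.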

Degree
Master of Science
Major
Computer Science
File(s)
Name
Thesis97J53.pdf
Size
2.17 MB
Format
Unknown
Checksum (MD5)
f019b11a6f97f9345cecad9ee17f38eb
