Masters Theses

Date of Award

12-1997

Degree Type

Thesis

Degree Name

Master of Science

Major

Computer Science

Major Professor

Bruce Whitehead

Committee Members

Kenneth Kimble, Dinesh Mehta

Abstract

Data mining is the discovery of non-obvious knowledge from a large set of related data. Pattern matching is a data mining technique often used to find non-obvious relationships and associations between data items. The problem addressed by this research is the reconciliation of two data sets of different origin that contain information about the same real-world entities. This research compares the effectiveness of the back-propagation neural network model and the least-squares multiple linear regression model by using each method to recognize when a record in one data set describes the same real-world entity as a record in the other data set. The results indicate that back-propagation can easily overfit mismatched data, but that it outperforms least-squares approximation when the number of hidden-layer neurons is carefully chosen.
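
To make the record-matching task concrete, the following is a minimal sketch of the least-squares side of the comparison. It is not taken from the thesis: the per-field string-similarity features, the 0.5 decision threshold, the helper names (similarity, pair_features, fit_least_squares, predict_match), and the toy records are all assumptions made for illustration. A back-propagation network could be trained on the same feature vectors in place of the linear model.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """String similarity in [0, 1] between two field values (assumed feature)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def pair_features(rec_a, rec_b):
    """Bias term followed by one similarity score per corresponding field."""
    return [1.0] + [similarity(x, y) for x, y in zip(rec_a, rec_b)]

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_least_squares(X, y):
    """Weights minimizing squared error, from the normal equations X^T X w = X^T y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def match_score(w, rec_a, rec_b):
    """Linear regression output for one candidate record pair."""
    return sum(wi * fi for wi, fi in zip(w, pair_features(rec_a, rec_b)))

def predict_match(w, rec_a, rec_b, threshold=0.5):
    """Declare a match when the regression output reaches the (assumed) threshold."""
    return match_score(w, rec_a, rec_b) >= threshold

# Toy labeled pairs invented for illustration: 1 = same entity, 0 = different.
# Each record is (name, city); the left and right records come from the two data sets.
training_pairs = [
    (("John Smith", "Huntsville"), ("Jon Smith", "Huntsville"), 1),
    (("Mary Jones", "Knoxville"), ("Mary Jones", "Knoxvile"), 1),
    (("Robert Brown", "Memphis"), ("Bob Brown", "Memphis"), 1),
    (("John Smith", "Huntsville"), ("Mary Jones", "Knoxville"), 0),
    (("Mary Jones", "Knoxville"), ("Bob Brown", "Memphis"), 0),
    (("Robert Brown", "Memphis"), ("Jon Smith", "Huntsville"), 0),
]

X = [pair_features(a, b) for a, b, _ in training_pairs]
y = [float(label) for _, _, label in training_pairs]
weights = fit_least_squares(X, y)

for a, b, label in training_pairs:
    print(a, b, "label:", label, "predicted match:", predict_match(weights, a, b))
```

With only six hand-made pairs this sketch says nothing about relative accuracy; it only shows how a regression model can be turned into a match/no-match decision over pairs of records.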
