Automated reading of indexing fields in document images

Date Issued
August 1, 1993
Author(s)
Floyd, Steven R.
Advisor(s)
Rafael C. Gonzalez
Additional Advisor(s)
Michael G. Thomason, Reinhold C. Mann
Abstract

When converting paper documents into a format suitable for storage in a computer system, electronic documents or document images can only be retrieved when they are indexed. Unfortunately, manual indexing can account for over 75% of the conversion cost; thus, for many applications, it is important to consider automated methods of reading indexing fields in document images.


An automated technique was developed to read document control information on labels that are affixed in a free-form manner to documents recorded by the Office of the Knox County Register of Deeds. The method consisted of two main steps: Intelligent Field Detection (IFD) and field recognition. An IFD algorithm was developed to find the labels and is discussed, with particular emphasis on accuracy, speed, and practicality in a production environment. The method used to recognize indexing fields on the label is based on commercially available Optical Character Recognition (OCR) technology and a post-processing approach that uses a modifiable set of image enhancement operations to improve recognition. The algorithms were tested extensively using a database of over 13,000 document images; rates of 97% and 87% were achieved for label detection and the recognition of indexing fields, respectively. The methods developed can be used now to verify the correctness of the existing database of 1,000,000 document images, assist in the indexing of scanned images, and monitor the print quality of the labels that are affixed to documents recorded at the Knox County registry. However, they are also applicable to similar problems at most registries in the United States and other organizations where indexing fields are positioned on documents in an unconstrained manner.
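
The abstract does not give implementation details, so the following is only an illustrative sketch of how a "modifiable set of image enhancement operations" might drive repeated recognition attempts. Every name here (`read_field`, `threshold`, `invert`, the `recognize` and `is_valid` callables) is hypothetical, the image is modeled as a plain list of grayscale rows, and applying each enhancement independently to the original image is an assumption rather than the thesis's documented behavior.

```python
from typing import Callable, List, Optional, Sequence

# Assumed representation: grayscale pixels, 0 (black) to 255 (white).
Image = List[List[int]]
Enhancement = Callable[[Image], Image]


def threshold(image: Image, cutoff: int = 128) -> Image:
    """Binarize: pixels below cutoff become 0, all others become 255."""
    return [[0 if p < cutoff else 255 for p in row] for row in image]


def invert(image: Image) -> Image:
    """Swap dark and light pixels."""
    return [[255 - p for p in row] for row in image]


def read_field(
    image: Image,
    recognize: Callable[[Image], str],   # stand-in for an OCR engine call
    is_valid: Callable[[str], bool],     # field-specific plausibility check
    enhancements: Sequence[Enhancement], # the modifiable list of operations
) -> Optional[str]:
    """Try OCR on the raw image, then on each enhanced variant in turn.

    Returns the first result that passes validation, or None if every
    attempt fails.
    """
    text = recognize(image)
    if is_valid(text):
        return text
    for enhance in enhancements:
        # Each operation is applied to the original image, not cumulatively.
        text = recognize(enhance(image))
        if is_valid(text):
            return text
    return None
```

Because the operations are passed in as a list, the set can be reordered or extended without touching the retry loop, which is one plausible reading of "modifiable" in the abstract.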

Degree
Master of Science
Major
Computer Science
File(s)
Name

Thesis93.F463.pdf

Size

8.89 MB

Format

Unknown

Checksum (MD5)

1f792dda925786d63ea13795960e5785

  • Libraries at University of Tennessee, Knoxville