Masters Theses

Date of Award

12-1996

Degree Type

Thesis

Degree Name

Master of Science

Major

Computer Science

Major Professor

Jens Gregor

Abstract

We are concerned with those problems where information can be encoded as strings, and classes of information can be modeled by regular grammars. When the string encoding is imperfect, it may be that classification is impossible unless some form of error correction is used. To handle this situation, we study the Lyon parser which is an error correcting modification of the Earley parser for context-free grammars. In particular, we simplify this parser from working with context-free grammars to working only with regular grammars. Once simplified, we try different approaches to reducing the number of deletion edits to the string.

We compare two modifications of the Lyon parser that reduce the number of deletions: one that uses an adjustable queue, and one that uses a shortest common supersequence approach. The focus of this thesis is the comparison of these two approaches. We compare these approaches by parsing strings using three different types of regular grammars: feed forward, regular expression, and free- form grammars. Also, parsing test strings that generate many deletions when using a free-form grammar are examined. For these three types of grammars we compare the number of deletions for each approach and, for the free-form grammars, their timings. In conclusion, for grammars requiring multiple deletion passes, we find using the shortest common supersequence to be the better one.

Files over 3MB may be slow to open. For best results, right-click and select "save as..."

Share

COinS