Masters Theses
Date of Award
12-1996
Degree Type
Thesis
Degree Name
Master of Science
Major
Computer Science
Major Professor
Jens Gregor
Abstract
We are concerned with those problems where information can be encoded as strings, and classes of information can be modeled by regular grammars. When the string encoding is imperfect, it may be that classification is impossible unless some form of error correction is used. To handle this situation, we study the Lyon parser which is an error correcting modification of the Earley parser for context-free grammars. In particular, we simplify this parser from working with context-free grammars to working only with regular grammars. Once simplified, we try different approaches to reducing the number of deletion edits to the string.
We compare two modifications of the Lyon parser that reduce the number of deletions: one that uses an adjustable queue, and one that uses a shortest common supersequence approach. The focus of this thesis is the comparison of these two approaches. We compare these approaches by parsing strings using three different types of regular grammars: feed forward, regular expression, and free- form grammars. Also, parsing test strings that generate many deletions when using a free-form grammar are examined. For these three types of grammars we compare the number of deletions for each approach and, for the free-form grammars, their timings. In conclusion, for grammars requiring multiple deletion passes, we find using the shortest common supersequence to be the better one.
Recommended Citation
Richardson, Eric Wright, "A shortest common supersequence approach to error correcting parsing with regular grammars. " Master's Thesis, University of Tennessee, 1996.
https://trace.tennessee.edu/utk_gradthes/10944