Unfuzzying fuzzy parsing

By Carvalho, P.; Oliveira, N.; Henriques, P.R.

OpenAccess Series in Informatics



Recognizing sentences of a language in an efficient and precise manner has always been a strong subject within computer science. Many theories, algorithms and techniques have been proposed along computing history, but at the end it all comes down to performing lexical and syntactic analysis of the source, originating a parse tree as the result. Sometimes there is no need for full precision or even a full parse tree. A good example of one of these cases is architecture extraction from source code. In this case only a small portion of the code is of interest. Another good example is recognizing handwritten expressions, because it is entirely impossible to predict the kind of calligraphy that will be analyzed, it is also impossible to perform an one hundred percent precise recognition. This need for tolerant parsing lead to the development of many forms of tolerant parsing along the years. This master work will focus on one form of tolerant parsing in particular, Fuzzy Parsing. From this work it is expected the emergence of a new Fuzzy Parsing technique based on automata, where automata states would represent context and edges would represent potential matches inside that context. The hypothesis of this work is that such an approach reduces uncertainty and recognition time. It is also expected the creation of a tool suit that facilitates the process of developing fuzzy parsers. We believe that such a tool will be a great addition to areas such as Program Comprehension or IDE construction.


Google Scholar: