|CS 331 Spring 2013 > Lecture Notes for Wednesday, January 23, 2013|
In the late 1950s, linguist Noam Chomsky published a hierarchy of types of languages, defined in terms of the kinds of grammars that could describe them. Chomsky was trying to develop a framework for studying natural languages (e.g., English); however, his hierarchy has proved to be useful in the theoretical study of computing.
Below is the hierarchy. In the Generator column, upper-case letters represent nonterminal symbols, while lower-case letters represent terminal symbols.
|Language Category||Generator||Recognizer||Why We Care|
|Type 3||Regular||Grammar in which each production is in one of the following forms: A → ε, A → b, A → bC. Another kind of generator: regular expressions.||Deterministic Finite Automaton. Think: Program that uses a small, fixed amount of memory.||In this category lie things like the set of all legal identifiers in some programming language. Thus, lexical analysis: breaking a program into words. These are also the languages that can be described by regular expressions (used, for example, in text search & replace).|
|Type 2||Context-Free||Grammar in which the left-hand side of each production consists of a single nonterminal.||Nondeterministic Push-Down Automaton. Think: Finite Automaton + Stack (roughly).||For most programming languages, the set of all syntactically correct programs lies in this category. Thus, parsing: determining whether a program is syntactically correct and, if so, how it is structured.|
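|Type 1||Context-Sensitive||Grammar in which each production is in the following form: αAβ → αγβ, where α, β, and γ are strings of terminals and nonterminals, and γ is nonempty.||Don’t worry about it (“Linear Bounded Automaton”, if you must know).||Actually, we don’t care much.|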
|Type 0||Recursively Enumerable||Arbitrary grammar.||Turing Machine. Think: Computer program.||For a language in this category, the task of analyzing a given string and saying “yes” precisely when it lies in the language neatly encompasses those things that can be done using a computer program.|
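To make the bottom two levels of the hierarchy concrete, here is a minimal Python sketch of a recognizer for each (these levels are usually called the regular and context-free languages). Everything in it is an illustrative assumption rather than part of the course material: the identifier rule (a letter or underscore, then letters, digits, or underscores), the toy bracket language, and all function names. The identifier recognizer appears twice, once as a regular expression and once as an explicit finite automaton that remembers only its current state. The bracket recognizer needs a stack, since no fixed amount of memory can track arbitrarily deep nesting. This particular bracket language happens not to need nondeterminism, which a general push-down recognizer may require.

```python
import re
import string

# Assumed identifier rule, for illustration only.
FIRST_CHARS = set(string.ascii_letters + "_")
LATER_CHARS = FIRST_CHARS | set(string.digits)
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier_regex(s: str) -> bool:
    """Recognize identifiers using a regular expression."""
    return IDENTIFIER.fullmatch(s) is not None


def is_identifier_dfa(s: str) -> bool:
    """Recognize the same language with an explicit finite automaton.

    The only memory used is the current state: "start" or "in_id".
    Any character with no legal transition rejects immediately.
    """
    state = "start"
    for ch in s:
        if state == "start" and ch in FIRST_CHARS:
            state = "in_id"
        elif state == "in_id" and ch in LATER_CHARS:
            state = "in_id"
        else:
            return False
    return state == "in_id"  # "in_id" is the only accepting state


def is_balanced(s: str) -> bool:
    """Recognize strings of properly nested brackets.

    This language is context-free but not regular, so the recognizer
    keeps a stack in addition to scanning the input left to right.
    """
    closer_to_opener = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            stack.append(ch)
        elif ch in closer_to_opener:
            if not stack or stack.pop() != closer_to_opener[ch]:
                return False
        else:
            return False  # the toy language contains only brackets
    return not stack  # accept only if every opener was closed
```

For example, `is_identifier_dfa("x1")` and `is_balanced("([]{})")` both accept, while `is_identifier_dfa("1x")` and `is_balanced("([)]")` both reject.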