
# CS 331 Spring 2013 Lecture Notes for Wednesday, January 23, 2013

## Languages & Grammars (cont’d)

### The Chomsky Hierarchy

In the late 1950s, linguist Noam Chomsky published a hierarchy of types of languages, defined in terms of the kinds of grammars that could describe them. Chomsky was trying to develop a framework for studying natural languages (e.g., English); however, his hierarchy has proved to be useful in the theoretical study of computing.

Below is the hierarchy. In the Generator column, upper-case letters represent nonterminal symbols, while lower-case letters represent terminal symbols.

| Language Category (Chomsky’s Number) | Language Category (Name) | Generator | Recognizer | Why We Care |
|---|---|---|---|---|
| Type 3 | Regular | Grammar in which each production is in one of the following forms:<br>• $$A \rightarrow \varepsilon$$<br>• $$A \rightarrow b$$<br>• $$A \rightarrow bC$$<br><br>Another kind of generator: regular expressions. | Finite Automaton<br>Think: Program that uses a small, fixed amount of memory | In this category lie things like the set of all legal identifiers in some programming language. Thus, lexical analysis: breaking a program into words.<br><br>These are also the languages that can be described by regular expressions (used, for example, in text search & replace). |
| Type 2 | Context-Free | Grammar in which the left-hand side of each production consists of a single nonterminal.<br>• $$A \rightarrow \textrm{[any]}$$ | Nondeterministic Push-Down Automaton<br>Think: Finite Automaton + Stack (roughly) | For most programming languages, the set of all syntactically correct programs lies in this category. Thus, parsing: determining whether a program is syntactically correct and, if so, how it is structured. |
| Type 1 | Context-Sensitive | Grammar in which each production is in the following form:<br>• $$\textrm{[str1]}A\textrm{[str2]} \rightarrow \textrm{[str1]}\textrm{[any]}\textrm{[str2]}$$<br>where $$\textrm{[str1]}$$, $$\textrm{[str2]}$$ are fixed strings. | (“Linear Bounded Automaton”, if you must know) | Actually, we don’t care much. |
| Type 0 | Recursively Enumerable | Arbitrary grammar.<br>• $$\textrm{[any]} \rightarrow \textrm{[any]}$$ | Turing Machine<br>Think: Computer program | For a language in this category, the task of analyzing a given string and saying “yes” precisely when it lies in the language neatly encompasses those things that can be done using a computer program. |
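
To make the Type 3 entry concrete, here is a minimal Python sketch of a finite automaton that recognizes identifiers, alongside an equivalent regular expression. The identifier rule (an ASCII letter or underscore, then any number of ASCII letters, digits, or underscores) and the names `is_identifier_dfa` and `is_identifier_regex` are assumptions chosen for illustration; they are not taken from any particular programming language. A regular grammar for the same language would have productions of the form $$S \rightarrow bR$$ for each legal first character $$b$$, $$R \rightarrow cR$$ for each legal later character $$c$$, and $$R \rightarrow \varepsilon$$.

```python
import re
import string

# Assumed identifier rule (illustrative only): first character is an ASCII
# letter or underscore; later characters may also be ASCII digits.
FIRST_CHARS = set(string.ascii_letters + "_")
LATER_CHARS = FIRST_CHARS | set(string.digits)


def is_identifier_dfa(s: str) -> bool:
    """Recognize identifiers with a finite automaton.

    The only memory used is the current state, which is one of a small,
    fixed set of values, no matter how long the input is.
    """
    state = "start"
    for ch in s:
        if state == "start":
            if ch in FIRST_CHARS:
                state = "in_identifier"
            else:
                return False  # no transition: reject
        else:  # state == "in_identifier"
            if ch not in LATER_CHARS:
                return False  # no transition: reject
    # Accept only if at least one character was read.
    return state == "in_identifier"


# The same language, described by a regular expression.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier_regex(s: str) -> bool:
    """Recognize identifiers by matching the whole string against a regex."""
    return IDENTIFIER_RE.fullmatch(s) is not None
```

Both functions accept exactly the same strings, which illustrates the table's remark that regular grammars, regular expressions, and finite automata all describe the same category of languages.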

Notes

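• Each of the above categories is contained in the next. So every regular language is context-free, every context-free language is context-sensitive, and every context-sensitive language is recursively enumerable.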
• In this class we are primarily interested in regular languages and context-free languages. See the “Why We Care” column for reasons.
• There are languages that are not recursively enumerable. Such a language cannot be described by any grammar, and we cannot write a computer program to recognize it. One example is the set of all programs (in your favorite programming language) whose execution never stops.
• For more on the Chomsky Hierarchy, see CS 451.
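
As an illustration of the step from regular to context-free languages, the following Python sketch recognizes strings of properly nested parentheses and square brackets. This language can be generated by a context-free grammar such as $$S \rightarrow \varepsilon$$, $$S \rightarrow (S)S$$, $$S \rightarrow [S]S$$, but it is not regular: nesting can be arbitrarily deep, so no fixed amount of memory suffices. The function name `is_balanced` and the choice of bracket characters are assumptions made for this example. The recognizer follows the "Finite Automaton + Stack" idea; this particular language happens to need no nondeterminism, although context-free languages in general may.

```python
# Maps each closing bracket to the opening bracket it must match.
MATCHING_OPEN = {")": "(", "]": "["}


def is_balanced(s: str) -> bool:
    """Recognize properly nested strings over the characters ( ) [ ].

    The stack holds the currently unmatched opening brackets. It can
    grow without bound, which is what a finite automaton lacks.
    """
    stack = []
    for ch in s:
        if ch in ("(", "["):
            stack.append(ch)  # remember the opening bracket
        elif ch in MATCHING_OPEN:
            # A closing bracket must match the most recent opening one.
            if not stack or stack.pop() != MATCHING_OPEN[ch]:
                return False
        else:
            return False  # character outside the alphabet
    # Accept only if every opening bracket was closed.
    return not stack
```

For example, `is_balanced("([])[]")` is true, while `is_balanced("([)]")` and `is_balanced("((")` are false.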

CS 331 Spring 2013: Lecture Notes for Wednesday, January 23, 2013 / Updated: 23 Jan 2013 / Glenn G. Chappell / ggchappell@alaska.edu