CS 331 Spring 2013
Lecture Notes for Wednesday, January 23, 2013

Languages & Grammars (cont’d)

The Chomsky Hierarchy

In the late 1950s, linguist Noam Chomsky published a hierarchy of types of languages, defined in terms of the kinds of grammars that could describe them. Chomsky was trying to develop a framework for studying natural languages (e.g., English); however, his hierarchy has proved to be useful in the theoretical study of computing.

Below is the hierarchy. In the grammar descriptions, upper-case letters represent nonterminal symbols, while lower-case letters represent terminal symbols.

Each language category is listed with Chomsky's number and its name, followed by its generator, its recognizer, and why we care.

Type 3: Regular
Generator: Grammar in which each production is in one of the following forms:
  • \(A \rightarrow \varepsilon\)
  • \(A \rightarrow b\)
  • \(A \rightarrow bC\)
Another kind of generator: regular expressions.
Recognizer: Finite Automaton. Think: program that uses a small, fixed amount of memory.
Why we care: In this category lie things like the set of all legal identifiers in some programming language. Thus, lexical analysis: breaking a program into words. These are also the languages that can be described by regular expressions (used, for example, in text search & replace).

Type 2: Context-Free
Generator: Grammar in which the left-hand side of each production consists of a single nonterminal.
  • \(A \rightarrow \textrm{[any]}\)
Recognizer: Nondeterministic Push-Down Automaton. Think: Finite Automaton + Stack (roughly).
Why we care: For most programming languages, the set of all syntactically correct programs lies in this category. Thus, parsing: determining whether a program is syntactically correct and, if so, how it is structured.

Type 1: Context-Sensitive
Generator: Grammar in which each production is in the following form:
  • \(\textrm{[str1]}A\textrm{[str2]} \rightarrow \textrm{[str1]}\textrm{[any]}\textrm{[str2]}\)
where \(\textrm{[str1]}\), \(\textrm{[str2]}\) are fixed strings and \(\textrm{[any]}\) is nonempty.
Recognizer: Don’t worry about it (“Linear Bounded Automaton”, if you must know).
Why we care: Actually, we don’t care much.

Type 0: Recursively Enumerable
Generator: Arbitrary grammar.
  • \(\textrm{[any]} \rightarrow \textrm{[any]}\)
Recognizer: Turing Machine. Think: computer program.
Why we care: For a language in this category, the task of analyzing a given string and saying “yes” precisely when it lies in the language neatly encompasses those things that can be done using a computer program.
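
To make the Type 3 (regular) category concrete, here is a minimal Python sketch of a finite automaton that recognizes identifiers. The identifier rule used (an ASCII letter or underscore, followed by any number of ASCII letters, digits, or underscores) is an assumption chosen for illustration, and the function and state names are hypothetical. The only memory the recognizer uses is the current state, which matches the "small, fixed amount of memory" idea.

```python
import string

# Assumed identifier rule (illustrative only): first character is an ASCII
# letter or underscore; each later character is a letter, digit, or underscore.
FIRST_CHARS = set(string.ascii_letters + "_")
LATER_CHARS = FIRST_CHARS | set(string.digits)


def is_identifier(text: str) -> bool:
    """Simulate a two-state finite automaton on text."""
    state = "START"  # no characters read yet
    for ch in text:
        if state == "START":
            if ch in FIRST_CHARS:
                state = "IN_ID"  # accepting state
            else:
                return False  # no transition: reject
        else:  # state == "IN_ID"
            if ch not in LATER_CHARS:
                return False  # no transition: reject
    # Accept only if the automaton ended in the accepting state.
    return state == "IN_ID"
```

For instance, `is_identifier("count_2")` is true, while `is_identifier("2count")` and `is_identifier("")` are false.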

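
The "Finite Automaton + Stack" idea for Type 2 (context-free) languages can be sketched in the same way. The example below recognizes strings of balanced brackets, a language generated by the context-free grammar \(S \rightarrow \varepsilon\), \(S \rightarrow (S)S\), \(S \rightarrow [S]S\) but not by any regular grammar, since matching arbitrarily deep nesting needs unbounded memory. The grammar, the bracket set, and the function name are assumptions for illustration. Note also that this sketch is deterministic, which suffices for this language; it does not show the nondeterminism that push-down automata need in general.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "["}
OPENERS = set(PAIRS.values())


def is_balanced(text: str) -> bool:
    """Recognize balanced strings over the alphabet ( ) [ ] using a stack."""
    stack = []
    for ch in text:
        if ch in OPENERS:
            stack.append(ch)  # remember the pending opener
        elif ch in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
        else:
            return False  # symbol outside the alphabet
    # Accept only if every opener has been matched.
    return not stack
```

For instance, `is_balanced("([])[]")` is true, while `is_balanced("([)]")` and `is_balanced("((")` are false.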

CS 331 Spring 2013: Lecture Notes for Wednesday, January 23, 2013 / Updated: 23 Jan 2013 / Glenn G. Chappell / ggchappell@alaska.edu