CS 331 Spring 2013 > Assignment 3

CS 331 Spring 2013
Assignment 3

Assignment 3 is due at 5 p.m. Thursday, February 21. It is worth 25 points.

Procedures

E-mail answers to the exercises below to ggchappell@alaska.edu, using the subject “PA3”.

Your answers should consist of four files: lex2.h & lex2.cpp, from Assignment 2, and rdparse2.h & rdparse2.cpp, from Exercise A of this assignment. The four files (or a single archive file containing them) should be attached to your e-mail message.
Be sure to include your name in your e-mail.
I may not read your homework e-mail immediately. If you wish to discuss the assignment (or anything else) with me, send me a separate message with a different subject line.

Note: If your code for Assignment 2 worked correctly, then do not modify it; simply send the same thing again. If your code did not work, or was not finished, then get it working and send it. Regardless, send it.

Exercises (25 pts total)

Exercise A — Recursive-Descent Parser/Interpreter Class

Purpose

In this exercise, you will write a C++ class that implements an interpreter for a simple programming language, including expression evaluation and output. This interpreter will be primarily a parser that uses the Recursive-Descent method. This parser will be built on top of the lexer from the previous assignment.

Instructions

Implement a C++ class that parses and executes code, according to the grammar and semantics described below.

Name your class RDParse2, and implement it in files rdparse2.h and rdparse2.cpp.
Class RDParse2 must do parsing using the Recursive-Descent method.
The interface of class RDParse2 is essentially the same as that of RDParse, which was written in class (with some differences, to be listed shortly). In particular, RDParse2 has:
- A constructor taking an optional string object.
- Copy constructor, copy assignment operator, and destructor, as usual.
- Public member functions set, parse, and done.
The interface of RDParse2 differs from that of RDParse as follows:
- Parsing is done based on the Grammar in this document. not the one distributed in class.
- Function parse takes a string parameter by reference. If the code is syntactically correct, then the output of the interpreted program (based on the Semantics below) is returned in this string.
  [C++]
```
class RDParse2 {
    ...
    bool parse(std::string & result)
    ...
};
  
```
Class RDParse2 should have no public members other than those listed above. You may write any private members you want. Similarly, the only things declared in rdparse2.h should be class RDParse2 and its members. You may declare anything you want in rdparse2.cpp.

Grammar

Your class should parse based on the following grammar. Lexical conventions are as for class Lex2 from Assignment 2. Italic lower-case words are nonterminals. Unquoted upper-case words are terminals representing tokens from class Lex2. Quoted words are terminals described using literal characters.

program → "begin" { ID "=" expr | "print" expr } "end"
expr → aexpr { ( "==" | "!=" ) aexpr }
aexpr → term { ( "+" | "-" ) term }
term → factor { ( "*" | "/" ) factor }
factor → ID | INT | FLOAT | "(" expr ")"

Note: aexpr above is supposed to stand for “arithmetic expression”.

The above grammar is in the proper form for a Recursive-Descent parser; you will not need to tranform it.

Semantics

Rough Summary—The value of an expression is the usual thing, with everything being represented as a double—even boolean values—and other meanings as in C++. Evaluation may be done using the corresponding C++ operators. What looks like an assignment statement is an assignment statement; all variables can be treated as if they have type double. The output of a program is a string returned to the caller. This should contain the numeric values of the expressions printed, one on each line. String conversions, both number-to-string and string-to-number, may be done using the C++ stream operators (“<<”, “>>”).

Details—If a program parses correctly, then the parser passes a string containing the output of the program back to the caller.

The programming language is a very simple imperative language with no control structures. Statements are executed in the order they appear in the code, and then the program terminates. The values of all literals, variables, and expressions are of a type equivalent to the C++ double (even INT literals!).

There are two kinds of statements: assignment statements and print statements.

An assignment statement is of the following form:

ID = expr

ID represents an identifier—used as the name of a variable—and expr represents an expression. For example:

abc = (3+5)*-4.2

This statement evaluates the expression and sets the named variable equal to its value. Note that variables do not need to be declared.

A print statement is of the following form.

print expr

expr represents an expression, as above. For example:

print (3+5)*-4.2

This statement evaluates the expression and outputs its value, followed by a newline ('\n'). In our parser, “output” means to append the output to a string which is eventually sent to the caller.

There are six operators that are legal in expressions: *, /, +, -, ==, !=. These can be evaluated using the C++ operators represented by the same symbols, except that, for “==” and “!=”, true is represented by 1.0, and false is represented by 0.0.

All operators are binary and left associative. Operators * and / have the highest precedence, followed by + and -, and then == and !=. Parentheses change the order of evaluation of operators in the usual manner.

String conversions (source-code literal to internal double value and internal double to output string) should be those performed by the C++ Standard Library stream operators. In particular, the following function will evaluate a source-code numeric literal correctly.

// Must include <string>, <sstream>
double stringToNumber(const std::string & inputString)
{
    std::istringstream iss(inputString);
    double val;
    iss >> val;
    return val;
}

Similarly, you can convert a number to a string using a std::ostringstream.

std::ostringstream oss;
oss << val;  // val is a double
// Now oss.str() is the string form of the number

If an identifier is in an expression, then its value is the stored value from the most recent assignment to the variable named. The value of a variable that has never been assigned to is undefined.

Things Not to Worry About

Here are some things that will not be checked or tested in any way. So you do not need to worry about them.

If the parse member function returns false, then the return value of done will not be examined, and the output of the program will be ignored.
Programs will never attempt to print the value of a variable that has never been assigned to.
There will be no computations that cause numeric overflow, underflow, or division by zero.

Test Program

A test program is available: rdparse2_test.cpp. If you compile and run your package (along with the Lex2 package) with this program—unmodified—then it will test whether your package works properly.

Do not turn in rdparse2_test.cpp.

Suggestions

Here is how I wrote class RDParse2. You are not required to follow my suggestions.

Function parse_program passes a string back to the caller through a reference parameter; the string contains the output of the program. All other parsing functions pass a double back to the caller through a reference parameter: this contains the value of the expression. For example:

class RDParse2 {
    ...
    bool parse_program(std::string & result);
    bool parse_expr(double & value);

Each of Functions parse_expr, parse_aexpr, and parse_term has a loop as in function parse_expr in class RDParseb. It keeps track of the result of the computation so far, passing it back to the caller in the reference parameter when the loop exits.

I have three helper member functions. Function matchString is much as in class RDParse. Function matchID checks for an ID token in the input. If there is a match, then the string value of the lexeme is returned via a string reference parameter, and the lexer is advanced. Function matchNumber checks for an INT or FLOAT token in the input. If there is a match, then the numeric value of the lexeme is returned via a double reference parameter, and the lexer is advanced.

class RDParse2 {
    ...
    bool matchString(const std::string & str);
    bool matchID(std::string & name);
    bool matchNumber(double & value);

I store variable values in a data member vars_ of type std::map<std::string, double>. Suppose I wish to execute an assignment statement. If n is a string holding the name of the variable, and v is a double holding its desired value, then I set the value as follows.

vars_[n] = v;

Now suppose I wish to execute a print statement. If output is a string holding the output of the program so far, then I do

std::ostringstream oss;  // Must include <sstream>
oss << vars_[n];
output += oss.str();

CS 331 Spring 2013: Assignment 3 / Updated: 17 Feb 2013 / Glenn G. Chappell / ggchappell@alaska.edu

CS 331 Spring 2013 Assignment 3