CS 331 Spring 2025
Assignment 6 (Writing an Interpreter)

Assignment 6 is due at 5 pm on Tuesday, April 15. It is worth 80 points.

Procedures

This assignment is to be done individually.

Turn in answers to the exercises below on the UA Canvas site, under Assignment 6 for this class.

Exercises (A only, 80 pts)

Exercise A — Interpreter in Lua

Purpose

In this exercise, you will write a Lua module that implements a simple tree-walk interpreter for ASTs resulting from parsing the Fulmar programming language.

Your interpreter will not directly use the Fulmar parser you have written. Instead, the interpreter will be given the AST of a Fulmar program. When your interpreter is combined with a lexer and parser for Fulmar (lexit.lua and parseit.lua) and an application that glues them all together (which I have written; see fulmar.lua), the result will be a complete interpreter that can execute Fulmar programs.

Instructions

Write a Lua module that executes a Fulmar program using the tree-walk interpretation method, given the AST of the program.

Be sure to follow the Coding Standards.

State

Fulmar Variables—The Fulmar programming language stores only integer values and functions. Integers can be stored in simple variables or in array items.

Arrays do not have specified dimensions; every integer is a legal index for every array. This includes negative integers.

The state Table—Values of all defined Fulmar variables and functions are stored in a Lua table named state. This table has three members: f, a table that holds functions, v, a table that holds simple variables, and a, a table that holds arrays.

A Fulmar function is stored as a key-value pair in the state.f table. The key is a string holding the name of the function. The associated value is a Lua table holding the AST of the function body, in the same form as the AST returned by parseit.parse.

A Fulmar simple variable is stored as a key-value pair in the state.v table. The key is a string holding the name of the variable. The associated value is a number equal to the variable’s numeric value.

A Fulmar array is stored as a key-value pair in the state.a table. The key is a string holding the name of the array. The associated value is a Lua table holding the array items. In this table, each defined item is stored as a key-value pair. The key is a number equal to the index of the item. The associated value is a number equal to the item’s numeric value.

Below are examples of where values are stored.

Kind of Variable    Example    Where the Value is Stored
Function            xyz        state.f["xyz"]
Simple Variable     xyz        state.v["xyz"]
Array Item          xyz[5]     state.a["xyz"][5]
                    xyz[0]     state.a["xyz"][0]
                    xyz[-2]    state.a["xyz"][-2]
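As a rough illustration of the layout in the table above, a state table could be built by hand in Lua as shown here. The stored values are made up for the example, and the function entry is only a placeholder; a real function value is an AST in the form returned by parseit.parse.

```lua
-- Illustrative only: a hand-built state table in the layout described above.
local state = {
    f = {},   -- functions: name -> AST of the function body
    v = {},   -- simple variables: name -> number
    a = {},   -- arrays: name -> table mapping index -> number
}

-- Placeholder, NOT a real AST (the real form comes from parseit.parse).
state.f["xyz"] = { "placeholder for a function-body AST" }

state.v["xyz"] = 42

-- An array is a nested table; any integer may be used as an index.
state.a["xyz"] = {}
state.a["xyz"][5] = 10
state.a["xyz"][0] = 20
state.a["xyz"][-2] = 30

-- The three namespaces are separate, so the name xyz can appear in each.
print(state.v["xyz"], state.a["xyz"][5], state.a["xyz"][0], state.a["xyz"][-2])
```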

Passing and Return—A state table is passed to function interp. This will always contain members f, v, and a. It may or may not include defined variables/functions. Variables that already have values in state must be treated exactly as if their values were set by previous Assignment statements.

The state table, as modified by the execution of the Fulmar program, must be returned by function interp. All variables given in the initial table must still be defined in the returned state table; Fulmar variables are never deleted. If a variable was set by the Fulmar program, then its value in the returned table must be its final value in the program. Otherwise, it must be the same as it was initially.

Justification—The above may seem a bit mysterious. Why would variables be given values before the execution of a program? The reason for this is to allow a Fulmar program to be entered interactively, as a series of statements, each of which is parsed and executed separately. Maintaining the state from one program to the next allows such statements to have the same effect as they would if they were parsed and executed as a single program.

Test Program

A test program is available in the Git repository: interpit_test.lua. If you run this program (unmodified!) with your code, then it will test whether your code works properly. Be sure to use a version of this file that says “VERSION 3” (or later) at the top.

Do not turn in the test program.

Fulmar Programming Language: Semantics

The semantics of Fulmar is specified here, using informal methods. A formal syntax of Fulmar and the format of an AST were covered in Assignment 4.

General—Fulmar is a very small programming language with simple imperative semantics. Statements are executed in order, first to last, as modified by the three flow-of-control structures: If statement, While loop, and Function call. The current statement must be executed completely, with all side effects completed, before execution of the next statement begins. When the last statement has executed, program execution terminates, with the current state being returned to the execution environment.

Fulmar has no fatal runtime errors. Fulmar programs never crash or terminate abnormally.

Fulmar programs have three kinds of side effects: variable modification, I/O, and pseudorandom-number generation. Values of variables—including functions—may be specified by the execution environment when a Fulmar program begins. Variable values are returned to the execution environment by the Fulmar program for later use. I/O is described next. Pseudorandom-number generation is described below, in Expressions.

I/O—A Fulmar program may do text input and output.

A Fulmar program does text input by reading a line of text from the standard input and interpreting this as an integer value. If the line read does not represent an integer, then it is interpreted as zero. Input is done by a call to readnum in an expression.

A Fulmar program does text output by printing a string, or integer value converted to a string, to the standard output. Output is done by a Print statement or Println statement.

*** For information on how to perform text input and output, see Implementation Notes, below.

Variables—Fulmar has three kinds of variables: functions, simple variables, and arrays. These are always named. Distinct identifiers never refer to the same variable. Identifiers for functions, identifiers for simple variables and identifiers for arrays lie in three separate namespaces.

A simple variable holds an integer value.

An array holds zero or more items, each indexed by an integer that may have any integer value: positive, negative, or zero. Array dimensions are not specified; every integer index is usable with every array. Each array item holds an integer value. The legal values for a Fulmar integer are implementation-defined.

*** For information on the legal values of a Fulmar integer, see Implementation Notes, below.

A function holds the AST for its body.

All variables in Fulmar are global. The scope of every identifier is the entire program, along with every program executed later, based on the state returned by the current program.

The value of a Fulmar simple variable or array item may be set by an Assignment statement or passed in by the execution environment in the initial state.

A function variable may be set by a Function definition, or passed in by the execution environment in the initial state.

A variable is defined if it has ever been set, or if it had a value in the initial state specified by the execution environment. The value of a defined variable is its most recently set value.

The value of a variable that is not defined is its default value as indicated below.

Kind of Variable    Default Value
Simple Variable     0 (zero)
Array Item          0 (zero)
Function            { PROGRAM }
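The numeric defaults in the table above could be handled with small lookup helpers along the following lines. This is a minimal sketch, not part of the provided code: the helper names get_simple and get_item are made up, and the function default { PROGRAM } is left out because it depends on the AST format from Assignment 4. Note that the helpers only read from the state table, so looking up an undefined variable does not define it.

```lua
-- Hypothetical helpers (names are not from the assignment) that read a
-- value from a state table without modifying the table.

-- Value of a simple variable, or 0 if it is not defined.
local function get_simple(state, name)
    local val = state.v[name]
    if val == nil then
        return 0
    end
    return val
end

-- Value of an array item, or 0 if the array or the item is not defined.
local function get_item(state, name, index)
    local arr = state.a[name]
    if arr == nil then
        return 0
    end
    local val = arr[index]
    if val == nil then
        return 0
    end
    return val
end

-- Small demonstration with made-up contents.
local state = { f = {}, v = { x = 7 }, a = { arr = { [3] = 9 } } }
print(get_simple(state, "x"), get_simple(state, "y"))
print(get_item(state, "arr", 3), get_item(state, "arr", -1), get_item(state, "nope", 0))
```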

Expressions—Fulmar expressions are evaluated eagerly; that is, expressions are evaluated when they are encountered during execution (as opposed to lazy evaluation).

The various parts of an expression may be evaluated in any order, except that the index for an array item must be evaluated before the array item is looked up. The only parts of an expression that may have side effects are function calls, readnum calls, and rnd calls; other parts of an expression have no side effects. In particular, using the value of a variable in an expression does not cause the variable to become defined.

When a NumericLiteral is encountered in an expression, it is evaluated by converting its string form to a number.

*** For information on integer conversions, and the method for evaluating a NumericLiteral, see Implementation Notes, below.

When a variable is encountered in an expression, it is evaluated to its current value in the program state, or its default value of 0 (zero) if it is not defined.

A function call inside an expression executes the AST that is the value bound to the given function identifier, or the default AST if the function identifier is not defined: { PROGRAM }. The value of the expression is the value of the simple variable return after the AST has been executed—or its default value of 0 (zero) if this variable is not defined.

Calling rnd in an expression generates and returns a pseudorandom integer. Internally, this number is obtained by calling util.random, passing the value of the argument to rnd in the Fulmar program.

You can expect that, when called with an integer n, util.random will return an integer in the range 0 to n−1, inclusive, or zero if n is less than 2. However, interpit.interp does not need to check or enforce this.

Calling readnum in an expression results in a line being read. The value of the readnum call is the result of converting the string read to an integer.

*** For information on reading a line and converting a string to an integer, see Implementation Notes, below.

The result of evaluating an expression involving a Fulmar operator is the same as for the Lua operator with the same name, followed by conversion to an integer using the appropriate provided function (numToInt or boolToInt), with the following exceptions.

Statements—Fulmar has eight kinds of statements: Print statement, Println statement, Return statement, Assignment statement, Function call, Function definition, If statement, and While loop. We discuss the semantics of each of these.

A Print statement writes one or more strings to the standard output. For each Print argument, one string is written.

*** For information on converting the numeric value of an expression to a string, see Implementation Notes, below.

A Println statement writes one or more strings to the standard output, followed by a newline. Its semantics is identical to that of a Print statement, except that, after the arguments have been output, one additional string is output: a string of length 1 containing a newline character.

When a Return statement is executed, the expression after the return is evaluated. The simple variable named return is set to this value. Note that this is the only way to set the value of this variable. Since return is a reserved word, the value of this variable cannot be set in an Assignment statement. Executing a Return statement does not terminate the execution of a function; it only sets the value of a variable.

An Assignment statement evaluates the expression on the right-hand side of the assignment operator (=) and then sets the Lvalue on the left-hand side to that value. If the Lvalue was not previously defined, then its status is defined after the Assignment statement is executed. If the Lvalue is an array item, then the expression representing its index must be evaluated before the Lvalue is set, in order to determine which item to set.
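For the array-item case, one detail is that the inner table for the array may not exist yet the first time an item is set. A minimal sketch of one way to handle this follows; the helper name set_item is made up, and the index and value are assumed to have been evaluated already.

```lua
-- Hypothetical helper (name is not from the assignment): set an array item,
-- creating the array's table on first use.
local function set_item(state, name, index, value)
    if state.a[name] == nil then
        state.a[name] = {}
    end
    state.a[name][index] = value
end

-- Small demonstration: the array "data" is not defined at first.
local state = { f = {}, v = {}, a = {} }
set_item(state, "data", -2, 15)   -- creates state.a["data"]
set_item(state, "data", 0, 99)
set_item(state, "data", -2, 16)   -- a later assignment replaces the value
print(state.a["data"][-2], state.a["data"][0])
```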

A Function call executes the AST that is the value bound to the given function identifier, or the default AST if the function identifier is not defined: { PROGRAM }.

A Function definition binds the given function identifier to the AST for the given function body.

When an If statement is executed, the expression in parentheses after the if, along with any expressions after elseif that are part of the same statement, are evaluated, in order. If any of these expressions evaluates to a nonzero value, then no more such expressions are evaluated; the corresponding body (program) is executed. If none of the expressions evaluates to a nonzero value, and there is an else, then its body is executed. If no expression evaluates to a nonzero value, and there is no else, then the If statement has no effect.
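The evaluation order described above can be sketched independently of the AST format. In the following illustration, each condition and body is represented as a plain Lua function; this representation and the name run_if are assumptions made for the example and are not how the Fulmar AST is structured.

```lua
-- Sketch of If-statement control flow. The clause representation is
-- hypothetical: each clause has a cond function (returning a number) and a
-- body function; else_body is a function or nil.
local function run_if(clauses, else_body)
    for _, clause in ipairs(clauses) do
        if clause.cond() ~= 0 then
            clause.body()
            return            -- later conditions are never evaluated
        end
    end
    if else_body ~= nil then
        else_body()
    end
end

-- Demonstration: record which conditions are evaluated and which body runs.
local log = {}
local function cond(label, n)
    return function()
        log[#log + 1] = "eval " .. label
        return n
    end
end
local function body(label)
    return function()
        log[#log + 1] = "run " .. label
    end
end

run_if({
    { cond = cond("c1", 0), body = body("b1") },
    { cond = cond("c2", 7), body = body("b2") },
    { cond = cond("c3", 1), body = body("b3") },
}, body("else"))
print(table.concat(log, ", "))
```

In this demonstration the third condition is never evaluated, because the second one is nonzero.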

*** For information on determining whether the value of an expression is nonzero, see Implementation Notes, below.

When a While statement is executed, the expression after the while is evaluated. If this value is zero, then execution of the While statement terminates. If this value is nonzero, then the loop body (program) is executed, and then execution of the While statement starts over at the beginning.

*** For information on determining whether the value of an expression is nonzero, see Implementation Notes, below.

Implementation Notes

All text input and output in a Fulmar program must be done by calling the passed functions util.input and util.output. The former inputs a line of text and returns it, without the newline. The latter outputs the given string; no newline is added. Similarly, all pseudorandom number generation must be done by calling util.random.
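When experimenting outside the test program, it can be handy to have a stand-in util table with canned input and captured output. The sketch below is one possible way to build such a table. Only the member names input, output, and random come from the description above; the helper name make_fake_util, the empty-string result when input runs out, and the fixed return value of random are assumptions made for the example.

```lua
-- Hypothetical test double for the util table.
local function make_fake_util(input_lines)
    local next_line = 1
    local written = {}     -- every string passed to util.output, in order
    local util = {}

    -- Return the next canned line, without a newline.
    function util.input()
        local line = input_lines[next_line]
        next_line = next_line + 1
        return line or ""  -- assumption: empty string when input runs out
    end

    -- Record the string; no newline is added.
    function util.output(str)
        written[#written + 1] = str
    end

    -- Deterministic stand-in for pseudorandom-number generation.
    function util.random(n)
        return 0
    end

    return util, written
end

-- Demonstration.
local util, written = make_fake_util({ "12", "abc" })
util.output("x = ")
util.output(util.input())
util.output("\n")
io.write(table.concat(written))
```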

The legal values for a Fulmar simple variable or array item are all the integers that may be represented as a Lua number.

When executing an If statement or While loop, to determine whether a Fulmar expression has a nonzero value, use a Lua expression of the form VALUE ~= 0, where VALUE is the numeric value of the Fulmar expression.
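One reason the explicit comparison matters: in Lua, only nil and false count as false in a condition, so the number 0 counts as true. Testing a numeric value directly would therefore treat every Fulmar value as nonzero. The short sketch below illustrates the difference; the helper name is_nonzero is made up for the example.

```lua
-- Hypothetical helper wrapping the comparison described above.
local function is_nonzero(value)
    return value ~= 0
end

-- In Lua, the number 0 is truthy, so this branch IS taken.
local zero_is_truthy = false
if 0 then
    zero_is_truthy = true
end

-- A countdown loop in the style of a Fulmar While loop.
local count = 3
local iterations = 0
while is_nonzero(count) do
    iterations = iterations + 1
    count = count - 1
end

print(zero_is_truthy, is_nonzero(0), iterations)
```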

In the posted file interpit.lua, I have provided five utility functions: numToInt, strToNum, numToStr, boolToInt, and astToStr. Do not modify these functions! They are to be used as follows.

numToInt
When evaluating an expression involving one of the arithmetic operators (+ - * / %), the number returned by the Lua operator is passed to this function; the return value of numToInt is the actual result of the Fulmar computation. For example, the result of evaluating the Fulmar expression 42/10 can be computed in Lua using numToInt(strToNum("42")/strToNum("10")).
strToNum
This is to be used for all string → number conversions. In particular, it is used when executing a call to readnum, to convert the entered string to a number. And it is used when evaluating NumericLiteral lexemes.
numToStr
This is to be used for all number → string conversions. In particular, it is used during Print or Println statement execution, when an argument is an expression. It converts the result of evaluating the expression into a string to be written.
boolToInt
This is to be used for all Boolean → number conversions. In particular, it is used when evaluating an expression involving one of the comparison or logical operators (== != < <= > >= and or not), to convert the Boolean returned by the Lua operator to the integer that Fulmar requires.
astToStr
This is to be used for debugging only; it must not be called in the final version of your code. This function takes a Fulmar AST and returns a human-readable string form of the AST, suitable for printing.

Provided Code

Once again, I have written a test program for your work: interpit_test.lua.

I have also provided a partially written version of file interpit.lua (you want the version that is marked “SKELETON” at the top). This includes the five utility functions mentioned above: numToInt, strToNum, numToStr, boolToInt, and astToStr. Do not modify these five functions; use them exactly as I have written them.

Thirdly, I have written a Lua application that uses the lexit, parseit, and interpit modules, forming a complete Fulmar interpreter: fulmar.lua. When fulmar.lua is executed, it displays a prompt (“>>>”), at which Fulmar code may be typed.

Alternatively, at the “>>>” prompt you may type a command beginning with a colon (:); these are listed when fulmar.lua starts up. In particular, typing “:r FILENAME”, where FILENAME is the filename of a Fulmar source file, will execute the program in that file.

If you have access to a Unix-like command line, then you may also pass the source filename to fulmar.lua as a command-line parameter. (This will probably not work on macOS, which supports the shebang syntax, but only allows recognized interpreters to be specified on it.)

[*ix command line]

fulmar.lua myprog.fmar