Home CPSC 401

Sloth Interpreter

 

Due: March 18

 

Introduction

For this assignment, you will write a complete interpreter for a programming language called Sloth (Simple Language Of Tiny Heft). The language has no functions, and the only data type is the double-precision floating point number. However, it does have variables, mathematical expressions and common control structures.

This assignment is broken into three portions: the lexical analyzer, the syntax analyzer and the interpreter. The lexical analyzer takes textual source code and reduces it to a stream of tokens. The syntax analyzer takes the stream of tokens and parses it into a tree structure. The interpreter then takes the parse tree and interprets it to execute the program.


 

Test Programs

You can use the same set of input files to test your parser:

ProgramDescription
hello.slPrints hello world
simple.slPrints 1 + 1.
area.slInputs the radius of a circle and computes its area.
fact.slInputs a number and computes the factorial of that number.
fibs.slInputs a number N and displays the first N Fibonacci numbers.
minmax.slInputs a number N, then inputs N numbers, then displays the ones with the least and greatest values.
guess.slA guess the number game
table.slPrints a 2D multiplication table
acid.slA program that tests out most aspects of the language, and simply outputs 42.

 

Representing Values

For this portion of this assignment, you will write code that plugs into ANTLR to take actions as different parts of the tree are visited. We will now add the -visitor flag to the ANTLR tool when we give it our grammar. This causes it to setup the visitor classes we can hook into. You can use the updated build and clean scripts for this project.

The way ANTLR works, each time we visit a node of the parse tree, we can return back one value for it. Nodes that represent statements will not really use this value, but those that represent expressions, such as an addition node will return back the value produced. We have to return back the same Java type for all nodes, those this may be null.

To fill this purpose, so you should create a class called SlothValue which represents any 1 value in a Sloth program. This will be able to store an int, bool, real, or string which are the four types Sloth supports. In order to make a class which can handle this, the SlothValue class should have two instance varialbes:

  1. type which tells us which of the four types this SlothValue is storing. You could use just an int for this, but I'd really recommend making an enum called SlothType to make it easier to keep straight.
  2. value which is an Object. Being an Object (the root of all classes in Java) will allow us to assign Integer, Double, Boolean, or String values into it.

You should make four constructors for this class, one for each of the four types, setting both type and value for each. This way we can make a SlothValue for any type we need one for.

This class should also have four get methods, each of which tries to return the value as the requested type. Each of these functions will have a switch based on the actual type of the SlothValue and do the thing that makes sense based on this table:

From stringFrom intFrom realFrom bool
To string - ConvertConvertConvert
To intError-TruncateError
To realErrorConvert-Error
To boolErrorErrorError-

For the error cases, you can throw an exception which halts the program.


 

Tagging the Grammar

We should be able to use our grammar and lexer files from part 2 of this project, with one small change. Now we need to be able to distinguish between different options of grammar rules with multiple choices for production. The way ANTLR handles this is by including tags in the grammar file.

These begin with the # character, followed by a unique identifier. ANTLR uses this identifier in the generated Java code, allowing us to handle the different cases. You can look at the parser of the calculator visitor example to see how this looks.

You should add these tags to the grammar rules you have which include multiple options. If there is only one option, this is not needed as ANTLR uses the name of the non-terminal itself as the name.


 

A Tree-Walk Interpreter

With tags, and the SlothValue class in place, we can now work on the interpreter itself. We will do this by creating a file called SlothVisitor.java. This will override the empty visitor class that ANTLR provides. The declaration at the top of this class can look like this:


public class SlothVisitor extends SlothParserBaseVisitor<SlothValue>

Here we extend the base visitor class, as well as fill in SlothValue for the types of each node in our parse tree. The main work to do here is override each of the methods in the base class (which you can look at after building the grammar) to do the work of interpreting the program via syntax tree.

This is a recursive process in which you will call visit on the children nodes to evaluate them, or get their value. We can use the names of the grammar symbols as methods on the context parameters passed into each method.

For the nodes which represent statements, you will carry out what the statement says to do, returning null for the value. For instance, the print statement node might look like this:


@Override
public SlothValue visitPrint(SlothParser.PrintContext ctx) {
    SlothValue expr = visit(ctx.expression());
    System.out.print(expr);
    return null;
}

We are given the "PrintContext" parameter from which we can get the parse tree node representing the expression to the left of the print. We evaluate that, storing the value which is produced, which we then pass into System.out.println.

For expression nodes, we will compute a value which is returned back from the method. For example, here is a method which visits a times or divide node. Notice that we have two expressions on the right hand side of the grammar rule, so must pass an index into the expression method.


@Override
public SlothValue visitTimesdiv(SlothParser.TimesdivContext ctx) {
    SlothValue left = visit(ctx.expression(0));
    SlothValue right = visit(ctx.expression(1));
    
    // if both are ints, do int math, otherwise try and do reals
    if (left.getType() == SlothType.INT && right.getType() == SlothType.INT) {
        if (ctx.op.getType() == SlothParser.TIMES) {
            return new SlothValue(left.toInt() * right.toInt());
        } else {
            return new SlothValue(left.toInt() / right.toInt());
        }
    } else {
        if (ctx.op.getType() == SlothParser.TIMES) {
            return new SlothValue(left.toReal() * right.toReal());
        } else {
            return new SlothValue(left.toReal() / right.toReal());
        }
    }
}

Also notice that we do integer math when both operands are integers. Otherwise, we try to get both as real values, returning a real result. If one of them is a bool or string, the toReal method will throw an exception, which will chain up from this method too.


 

Short Circuit Evaluation

One small detail int eh evaluation of expressions that nearly all languages follow is the "short-circuit" evaluation of boolean operations. If we have code like the following:


if a && b then
    print("Hi!");

if c || d then
    print("Bye");

Short circuit evaluation means we do not evaluate the other side of an && or || operator if we know the result. Here, if a is false, then we do not even need to check b — we know the result is false. Likewise, if c is true, there is no need to check d.

In simple cases, there is not much importance, but it allows programs to do interesting things such as the following patterns:


if (ptr != NULL && ptr->count > 0)

if (somethingSucceeds() || panicAndQuit())

The first example could trigger a crash without short circuit evaluation, since ptr being null means that even looking at one of its members would be illegal.

The second example will only call the panicAndQuit function when the first one returns false.

Sloth doesn't have functions yet, but you should implement short-circuit evaluation.


 

Variables

Another aspect of this assignment is keeping track of the values of any variables. To do this, you will maintain a symbol table which keeps track of variable / value pairs. You can use a Java HashMap for this, which can be an instance variable in the SlothVisitor class. The keys are strings (the variable name) and the values are SlothValue objects.

When an assignment statement assigns a value into a new variable, you must insert that variable into the symbol table. When an assignment is made to an existing variable, you must search for the variable in the symbol table and update its value. When an expression references a variable, you must search for its current value to use.

If a variable is referenced before its assigned to, your program should signal an error.


 

Testing

We no longer need the grun program to test our work — now we can run the programs directly!

The following demonstrates the output of the fact.sl program:

Please enter a number: 5
5! = 120

And here is the output for a run of the guess.sl program:

Is your number 50? no
Answers are 'yes', 'low', or 'high'.
Is your number 50? high
Is your number 25? low
Is your number 37? high
Is your number 31? low
Is your number 34? low
Is your number 35? low
Is your number 36? yes
Got it!

 

Submitting

Submit the files for your interpreter to Canvas for this assignment.

Copyright © 2024 Ian Finlayson | Licensed under a Creative Commons BY-NC-SA 4.0 License.