Project 3: higher order functions and functional languages
Goals of the project
The goals of this project are to give you experience programming in a functional
language (SML). More specifically, you will get experience with the
following language features:
-
Higher-order functions
-
Programming without assignments
-
Pattern matching
-
Type inference
Introduction
The goal of compiler optimizations is to improve the performance of the
code without changing the meaning of the code. In this project we
we consider a simple optimization, arithmetic simplification of expressions
made up of constants, variables, addition, subtraction, and negation.
Here are some examples of expressions:
x + 7 + y
a + -7 - 3
a + b - -c
The expressions are represented as a tree, each internal node of which
is a operator (addition, subtraction, or negation) and each leaf is either
a constant or a string (which represents a variable with that name).
Here is the representation that I provide:
datatype node = Constant of int
| Symbol of string
| Negation of node
| Add of node*node
| Subtract of node*node;
Your goal is to transform the tree so that it contains fewer calculations.
In the next four sections I'll give you some examples and describe the
three components of the project: transformations, simplifications, and
heuristics.
Examples
-
Consider the tree for "a + (5 + 3)":
Add(Symbol(a), Add(Constant(5), Constant(3)))
A simple analysis can find that the inner add operates on two constants
and the inner tree can be simply replaced by Constant(8) to yield:
Add(Symbol(a), Constant(8))
-
Consider the tree for "(a + -5) + 3":
Add(Add(Symbol(a), Negate(Constant(5))), Constant(3))
When we look at this expression and tree it is obvious that we can
replace it by a simpler tree: a - 2. However, the way the tree is
organized, there is no Add or Subtract node with only constant operands.
Thus, we can transform the tree in the following steps to end up with something
that is easy to simplify:
Add(Add(Symbol(a), Constant(-5)), Constant(3)) (* Simplify Negate which
has a constant operand *)
Add(Symbol(a), Add(Constant(-5), Constant(3))) (* Using (a+b)+c ==
a + (b + c) *)
Now we have an Add which has two constant operands. thus we can
replace that with Constant(-2) to get:
Add(Symbol(a), Constant(-2)
The question to consider here is: how and why did we decide to use
the associative property? We could have transformed the tree in many
other ways (e.g., by commuting it to get Add(Constant(3), Add(Symbol(a),
Constant(-5)))
-
Consider the tree for "(x + 5) + (z + 7)":
Add(Add(x, 5), Add(z, 7))
(for simplcity, in the remainder of the examples, when I write "x"
I mean Symbol(x) and when I write a constant, say 5, I mean Constant(5))
Like the previous example, none of the Add or Subtract can be immediately
simplified since none of them have only constants as arguments.
Here is one sequence of transformations that allows us to simplify the
tree:
Add(x, Add(5, Add(z, 7))) (* associate: (a + b) + c == a + (b + c)
*)
Add(x, Add(5, Add(7, z))) (* commute: a + b == b + a *)
Add(x, Add(Add(5, 7), z)) (* associate: a + (b + c) == (a + b) + c
*)
Now we have an Add with two constant operands, thus we get:
Add(x, Add(12, z))
-
Consider the tree for "4 - (x - 5)":
Subtract(4, Subtract(x, 5))
Let's transform this:
Subtract(4, Add(x, Negate(5))) (* x - y = Negate(y) + x *)
Subtract(4, Add(Negate(5), x)) (* commute: x + y = y + x *)
Subtract(Subtract(4, Negate(5)), x) (* associate: x - (y + z) == x
- y - z *)
Subtract(Subtract(4, -5), x) (* Negate(constant) == -constant *)
Now we can do the subtraction on constants.
Transformations
As we saw in the Examples section, there are a
small number of transformations that we routinely use in order to get the
expression tree in a form that we can simplify it using the simplifications.
Here they are:
-
Commute: a + b == b + a, and a - b == Negate(b) + a;
-
Associate: a + (b + c) == (a + b) + c, a + (b - c) == (a +
b) - c, a - (b + c) == (a - b) - c, a - (b - c) == (a - b) + c;
-
AntiAssociate: (reverse of associate)
Simplifications
If we consider the example, there are a very small number of simplifications
we use repeatedly:
-
NegConstant: Negate(Constant(n)) = -n if n is a positive constant,
and n if n is a negative constant;
-
AddConstants: Add(Constant(c1), Constant(c2)), c1 and c2 are constants
== Constant(c1+c2);
-
SubConstants: Subtract(Constant(c1), Constant(c2)), c1 and c2 are
constants == Constant(c1-c2);
-
AddZero or SubtractZero: Add(x, Constant(0)), or Subtract(x,
Constant(0)), where x is anything, is simply x;
All of these simplifications operate on constants. Here are two that
are more general:
-
SubSym: Subtract(Symbol(x), Symbol(x)) == Constant(0);
-
DoubleNeg: Negate(Negate(x)), where x is anything is x.
There are many other simplifications that we can imagine. For at
least Stage 0 of project we will stick with the ones above.
Heuristics
The goal of the heuristics is to decide when to apply a particular transformation.
Whenever a simplification is applicable, it makes sense to apply it;
however, it doesn't make sense to apply tranformations always. For
example, one could go into an infinite loop applying Associative and AntiAssociative
to x + (y + z). Clearly we need some strategies for applying the
heuristics. In this project, you will need to come up with some heuristics.
Stage 0
Copy the files in ~csci3155/proj3handout into your directory. These
files are mostly empty: you need to put your code in these files.
One of these files, setup.sml (described below)
contains the datatype declaration which I've already given above and a
tostring function that takes a tree and returns a string that contains
a flattened infix representation of the tree. The file test.sml gives
a small tree and invokes tostring on it. This code in test.sml is
just for illustration purposes.
For Stage 0 you need to write and submit one routine for each of the
above mentioned transformations and simplifications. Each of these
routines takes a node and return a (possibly the same) node. In addition,
you need to write a driver routine, MapTree, that you will use to invoke
the transformations and simplfications. The rest of this section
describes the functionality and interface of the MapTree routine.
MapTree takes two arguments, a node, n, and a function, f, of type node->node.
It applies the function f to each node in the tree rooted at n in a bottom-up
fashion and builds a new tree in the process using the value returned by
f. For example consider the following tree:
Add(Constant(10), Symbol(x))
MapTree will perform the following steps:
val t = f(Constant(10));
val u = f(Symbol(x));
val k = f(Add(t, u))
You should put all your routines in program.sml and submit it electronically
using our usual submission mechanism. My solution for this adds up
to about 40 lines of code. However, the hard part is writing your
first routine: once you get a handle on it, the rest are easy.
Discussion
In the previous stage we build some mechanisms: one mechanism for traversing
a tree and building a new tree from it (MapTree) and a number of mechanisms
for transforming the tree (or rather building a new transformed version
of the tree). However, we didn't implement any policy. In particular,
when should the heuristics be? This stage
proceeds similarly to the discussion stage for Project 2.
The discussion is in two part. The first part ends Monday April
23rd at 5:00 p.m. and the second part ends on Tuesday April 24th
at 5:00 p.m. Here are the description of what you need to do
for each part:
-
For the first part you need to log into PEP (as in discussion for Project
2), click on Project 3, and enter one heuristic (in the Option
1 box) and one criteria for evaluating the heuristics (in the Criterion
1 box). Note that in the earlier projects I gave you the criteria.
In this project since we are not evaluating programming language features,
the criteria may or may not be the same as ones we have been using.
As with Project 2, you can see the criteria and options that your fellow
students have picked at the bottom of the screen. In addition,
you should enter justification for why you think your option/criteria
or someone elses' option/criteria makes sense or does not make sense.
To do this, click on the "all of your submissions" link and put your comments
in the form there. Note that you can log into the system multiple
times and put in additional comments--view these comments as an online
dialog between yourself and your fellow students.
-
For the second part, you need to rate the options and the criteria that
I select from the options and criteria submitted in part 1. This
is just the same as in the earlier projects except that both the options
and criteria are provided by the students.
The discussion proceeds just as the discussion for Project 2. In
this discussion you will suggest different heuristics for how to apply
the transformations. This project is different from the other two
projects because you are not writing a language processor. Thus,
some of the criteria for evaluating the heuristics will be different than
the ones we have been using to evaluate language extensions. Thus,
during this discussion you should submit not only an option (heuristics)
but also a possible criteria for evaluating the heuristics. In addition
to submitting an
After you have submitted the options and the criteria, I will pick
some options and criteria and we will all rate the options and criteria
much like the discussions in the previous assignments.
Stage 2
In this stage we will implement the option that we picked in the discussion
stage and call MapTree with the heuristic. Here are the steps you
need to follow:
-
Write a function, say myHeuristic (type: node->node) that applies the heuristic
to a node and depending on what the heuristic says, it may apply a transformation
to the node. For example, let's suppose our heuristic is: if there
is a constant as a left operand of an Add, then commute. Then,
myHeuristic will be:
fun myHeuristic (n as Add(Constant(x), y)) = commute(n)
| myHeuristic x = x;
-
Create a function that is the composition of all your simplifications.
For example,
val allSimplifications = SubConstants o NegateConstants o ...
This function will simply try to apply each simplification once.
-
Create a composition of your heuristic and the simplifications:
val myHeuristicAndSimplifications = allSimplifications o myHeuristic;
This function when called on a node will first transform the node, based
on the heuristic and then try to simplifiy the node;
-
Test your program by calling MapTree:
MapTree (myHeuristicAndSimplifications, atree)
"atree" is a tree that you want to test your analysis on.
Note that I've already given you the code for steps 2 and 3 (for step 2,
you just need to fill in the names of all your simplifications).
In addition to the code, you need to submit an answer to the following
question: How would the coding experience have been different if you
had written your code in C++ or Java?
What to submit: Submit all the three sml files. The test.sml
files should have the code that you use to test your code.
Using the system
To use the system, add the following directory to your path:
/tools/cs/smlnj/bin
The above path works at least on the solaris and the linux machines
in the ugrad lab.
To invoke the compiler type "sml". This will give you an interpreter
prompt in which you can type in SML code. However, I recommend you
put your code in one or more files and issue the following command when
you invoke sml:
use "myfile";
This will open your file and execute all the code in it as if you had
typed it in at the interpreter prompt. My code is spread out over
three files: setup.sml, program.sml, and test.sml. I will provide
setup.sml to you; it contains declarations for the tree. program.sml
contains my solution to this project. test.sml contains the tests
for my project. Since tests.sml depends on setup.sml and program.sml
and program.sml depends on setup.sml, I've added:
use "setup.sml";
use "program.sml";
at the top of my test.sml. Thus, to run and test my program,
I simply need to type in "sml" at the shell prompt and
use "setup.sml";
at the sml interpreter prompt. Note that I'm providing you with
the full contents of setup.sml and stubs for program.sml and test.sml.
Resources
There are several good tutorials for SML. I recommend you start with
this
one since it is written very nicely and walks you through all the main
features of SML. Ignore the "Module system" chapter of the tutorial:
we won't be needing it for this project. Bob Harper, the author of
the first tutorial, has revised his tutorial and that is available here.
The new tutorial is even better but it is also a bit more advanced.
I recommend you use the first tutorial and refer to the second one only
for looking up things when you are stuck.
This
web site has a number of other tutorials on SML and SML/NJ (the compiler
that we are using).