You will often find yourself in a situation where you have to evaluate a language for a specific task or compare one language to another. Languages come and go: when I was a graduate student in the early 1990s, C and awk were the dominant languages for the kinds of things that I was doing. This changed to C++ and Perl in the mid-1990s and to Java, Python, and JavaScript in the early 2000s. In other words, every five years or so I've had to evaluate the languages out there and decide which one was right for me. Other people often made different decisions: e.g., many of my colleagues are still using C++ or have moved past Java into C#. The skills you acquire in this topic will help you systematically and objectively evaluate a language or compare one language to another.
Broadly speaking, each language has two parts: syntax and semantics. The syntax of most languages is expressed in a notation called BNF (the book describes two notations: BNF and EBNF; we will just use BNF). If you pick up any book on a language or visit a web site on a language, there is a good chance that you will be able to find a chapter or web page that describes the syntax of the language in BNF. Reading this syntax description will give you a clear and unambiguous understanding of the syntax of the language. You will most likely learn many new languages in your career and thus knowing how to read BNF will be valuable. Many of you will also design new languages in your career. For example, many games nowadays have little languages; if you work for a company that builds computer games you will most likely design a new language from scratch. For this task, knowing how to write a BNF grammar for a syntax will be invaluable.
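To give a feel for what a BNF description looks like, here is a minimal sketch: a made-up grammar for sums of single digits, with a small Python recognizer that follows the grammar rule by rule. The grammar and the names parse_digit, parse_expr, and is_valid are invented for this illustration and do not come from the book.

```python
# Hypothetical BNF grammar (invented for illustration):
#   <expr>  ::= <digit> | <digit> "+" <expr>
#   <digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

def parse_digit(text, pos):
    """Match <digit> at position pos; return the next position or None."""
    if pos < len(text) and text[pos] in "0123456789":
        return pos + 1
    return None

def parse_expr(text, pos):
    """Match <expr> at position pos; return the next position or None."""
    after_digit = parse_digit(text, pos)
    if after_digit is None:
        return None
    # Try the longer alternative first: <digit> "+" <expr>
    if after_digit < len(text) and text[after_digit] == "+":
        return parse_expr(text, after_digit + 1)
    return after_digit

def is_valid(text):
    """True if the whole string can be derived from <expr>."""
    return parse_expr(text, 0) == len(text)
```

With this sketch, is_valid("1+2+3") holds, while "1+" and "12" are rejected because no derivation from <expr> produces them.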
As discussed above, many of you will design a programming language at some point in your career. When designing the syntax, you will need to deal with issues such as ambiguity, operator precedence, and operator associativity.
A skill related to the above is to take a grammar that is ambiguous or does not respect precedence or associativity and to rewrite it so that it is unambiguous and respects precedence and associativity. While this is an important topic, we will not have time for this in our class. This material is normally covered in a compiler construction course.
Bindings are a central concept in all programming languages. A binding is just an association between an entity and an attribute. For example, there is a binding between a variable and its type or its value. The skills we are going to focus on here have to do with type bindings and storage bindings. Storage bindings, in particular, are very important to understand. Most languages support more than one storage binding and knowing the properties of the different storage bindings will enable you to pick the storage binding that best matches your needs.
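As a rough illustration of the two kinds of bindings, the Python sketch below shows a type binding that changes at run time and two names bound to the same heap storage. Python is used here only as a convenient example language; the names make_buffer, a, b, and c are hypothetical.

```python
# Type binding: in Python the name `x` gets its type when it is assigned,
# so the binding can change while the program runs (dynamic type binding).
x = 42
first_type = type(x).__name__    # "int"
x = "forty-two"
second_type = type(x).__name__   # "str"

# Storage binding: each call to make_buffer() allocates a fresh list object
# on the heap, so `a` and `b` are bound to different storage.
def make_buffer():
    return [0, 0, 0]

a = make_buffer()
b = make_buffer()
c = a                 # `c` is bound to the same storage as `a`
c[0] = 99             # visible through `a`, but not through `b`
```

After the last line, a[0] is 99 while b[0] is still 0, which is exactly the kind of consequence one needs to predict when choosing a storage binding.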
Types are the most interesting and varied aspect of programming languages. E.g., C++ and Java "look" very similar (have similar syntax, control structures, etc.) but differ greatly in their type systems. C++'s type system is very liberal: you can cast almost anything to anything else and very few kinds of type errors are detected. On the other hand, Java's type system is much more restrictive: you can cast between types only in limited situations, and it catches all type errors either at compile time or at run time. Thus, an in-depth understanding of how types work and their implications is key to understanding how to make the best use of programming languages. If you do not understand or fully appreciate the type system for your language, you will find yourself working around it (which is rarely effective) rather than working with it. This set of skills is our first foray into understanding types; we will spend a lot of time on types this semester.
One of the important principles behind modern software engineering is the "need to know" principle. If a client does not need some aspect of our code in order to use our code effectively, then we should hide that aspect from the client. For example, consider writing a list package that is internally implemented as an array. A client of the list does not have to know how the list is actually implemented. They just want to know how to add things to and remove things from the list. Scoping is one of several mechanisms that languages provide for hiding data. With scoping, a variable can be hidden from code that should not be accessing it. The skills here will help you use scopes effectively in order to hide variables.
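A minimal sketch of hiding a variable with scoping, written in Python. The names make_counter, increment, and next_ticket are hypothetical; the point is only that count lives in an inner scope that client code cannot name.

```python
def make_counter():
    count = 0                 # local to make_counter; clients cannot name it

    def increment():
        nonlocal count        # refers to the enclosing scope's `count`
        count += 1
        return count

    return increment

next_ticket = make_counter()
first = next_ticket()     # 1
second = next_ticket()    # 2
```

Client code can call next_ticket() but has no way to reset or corrupt count directly, because the variable is not visible outside make_counter.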
In order to enable programmers to model their data as naturally and cleanly as possible, languages provide a number of different data types. If you use the right data types for the job, not only will you end up with cleaner code, but most likely you will also end up with better type checking. Thus, it is good to know what is out there.
You are most likely already very familiar with some data types: arrays, records (or structs in C/C++ terminology), and primitive types (integers, booleans, ...). There are some that you are probably not as familiar with: subrange types, enumeration types, union types, and associative arrays. We will focus primarily on the ones that you don't already know well.
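For two of the less familiar types, here is a short Python sketch of an enumeration type and an associative array. The names Color, ages, and describe are invented for this example, and other languages spell these types differently.

```python
from enum import Enum

# Enumeration type: the complete set of legal values is listed up front.
class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3

# Associative array (a dict in Python): indexed by keys, not by position.
ages = {"alice": 30, "bob": 25}
ages["carol"] = 41

def describe(color):
    # Only the three declared members are valid Color values.
    return f"{color.name} has value {color.value}"
```

Asking for a value outside the enumeration, such as Color(4), raises a ValueError, which is a small example of the better checking you get from picking the right data type.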
Pointers (or references) are incredibly powerful data types: they allow programmers to build unbounded recursive data types (such as lists or trees). By "unbounded" I mean that the sizes of the recursive data structures do not need to be known in advance. However, with this power come many difficulties, two of the most common being the dangling pointer and memory leak (also called lost heap-dynamic variable in the text) bugs. Garbage collection eliminates the dangling pointer problem but can cause very subtle memory leaks, which are hard to track down. Having a good understanding of how these mechanisms work will help you use pointers more effectively and also debug memory management related bugs (which are the most common kinds of bugs in C/C++ programs).
NOTE: The book distinguishes between "reference counters" and "garbage collection". The research literature does not distinguish between them: "reference counters" are a form of "garbage collection". So when I use the term "garbage collection" I mean it to include reference counting.
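The following Python sketch shows how a memory leak can arise even with garbage collection: an object stays alive as long as something still refers to it, even if the programmer has forgotten about that reference. The class Node and the names cache and probe are hypothetical, and the weak reference is used only to observe whether the object still exists.

```python
import gc
import weakref

class Node:
    """A tiny heap-allocated object we can observe with a weak reference."""
    def __init__(self, name):
        self.name = name

cache = []                      # a long-lived container

node = Node("temp")
probe = weakref.ref(node)       # observes the object without keeping it alive
cache.append(node)              # a second, easily forgotten reference

del node                        # the program believes it is done with the object
gc.collect()
alive_while_cached = probe() is not None   # still reachable through the cache

cache.clear()                   # drop the forgotten reference
gc.collect()
alive_after_clear = probe() is not None    # now the collector can reclaim it
```

Here the object survives the first collection only because cache still points to it; once that reference is dropped, it is reclaimed.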
Expressions perform the computations in programs. To make them more intuitive, many languages borrow the concepts of precedence and associativity from mathematical conventions, with which programmers are already familiar. Precedence and associativity determine the order of evaluation of operators, i.e., given an expression, which operator will we evaluate first and which one after that, and so on. In addition, to understand the semantics of an expression, one also needs to know about operand evaluation order, i.e., the order in which one evaluates the operands of an expression. For example, in a+b+c, does one evaluate "a" first, then "b", then "c" or ...? Operand evaluation order is relevant in programming languages but not in mathematics because mathematics does not have side effects: operand evaluation order is relevant only if one has side effects. In this topic you will acquire the skills to understand what an expression means in a given language.
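To separate the two ideas, the Python sketch below uses operands with a visible side effect. The helper operand and the list calls are invented for this example; Python is chosen because it defines left-to-right operand evaluation, which not every language does.

```python
calls = []

def operand(name, value):
    """Return value, recording that this operand was evaluated."""
    calls.append(name)
    return value

# Precedence: * binds tighter than +, so this means a + (b * c).
# Operand order: Python evaluates the operands left to right: a, then b, then c.
result = operand("a", 1) + operand("b", 2) * operand("c", 3)
```

After this runs, result is 7 and calls is ["a", "b", "c"]: the multiplication is performed before the addition, yet "a" is still evaluated first.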
Note that skill 9.2 is an expanded version of skill 3.4.
Control constructs in a language determine what computation a program performs and when. Most of the control constructs in the reading are things you are already familiar with and thus we will not spend time on this in the class (but I will expect you to know them). Here are the skills related to this reading:
Subtyping and inclusion polymorphism form the backbone of modern object-oriented languages. One cannot understand object-oriented languages without understanding these two concepts. While the skill set for this topic is small, the skills themselves are large and subtle.
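A minimal Python sketch of subtyping and inclusion polymorphism follows. The classes Shape, Circle, and Square and the function total_area are hypothetical; the point is that total_area is written once against the supertype and works for every subtype.

```python
import math

class Shape:
    def area(self):
        raise NotImplementedError

class Circle(Shape):                  # Circle is a subtype of Shape
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

class Square(Shape):                  # Square is a subtype of Shape
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

def total_area(shapes):
    # Written once against Shape; each call dispatches to the subtype's area().
    return sum(shape.area() for shape in shapes)
```

For example, total_area([Square(2), Square(3)]) gives 13, and adding a new subtype of Shape later requires no change to total_area.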
Parameter passing is the preferred way of passing information from a caller to a callee and vice versa. Different parameter passing modes provide different capabilities and thus many languages support more than one parameter passing mode. For example, C++ and Modula-3 support pass-by-value and pass-by-reference. C++ also (weakly) supports pass-by-name through its macro mechanism (#define ...). Here are the skills you need to know for this topic:
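To show how two modes can differ, here is a Python sketch that contrasts pass-by-value with an emulation of pass-by-name. Python has no built-in pass-by-name, so the emulation passes a zero-argument function (a thunk) that is re-evaluated at each use; this is an assumption made for illustration, and the names next_id, double_by_value, and double_by_name are hypothetical.

```python
counter = 0

def next_id():
    """An argument expression with a side effect."""
    global counter
    counter += 1
    return counter

def double_by_value(x):
    # The argument was evaluated once, before the call.
    return x + x

def double_by_name(thunk):
    # The argument expression is re-evaluated at every use.
    return thunk() + thunk()

by_value = double_by_value(next_id())        # next_id() runs once: 1 + 1
counter = 0
by_name = double_by_name(lambda: next_id())  # next_id() runs twice: 1 + 2
```

The same argument expression yields 2 under pass-by-value and 3 under the pass-by-name emulation, which is the kind of difference a C++ macro can also expose.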
We will continue exploring parameter passing mechanisms by exploring issues having to do with types, and deep-versus-shallow binding.
Inclusion polymorphism (skill 11) allows one to reuse code for many different types. Inclusion polymorphism is the backbone behind object-oriented languages. However, it is not the only kind of polymorphism. There is at least one other kind of polymorphism which is very useful: parametric polymorphism. There are some things for which inclusion polymorphism is more suitable and others for which parametric polymorphism is more suitable. Thus, modern object-oriented languages support both kinds of polymorphism: inclusion polymorphism through subtyping of objects and parametric polymorphism through generic subprograms (or templates in C++ parlance). In this skill set we will understand how and when to use generics.
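As a sketch of parametric polymorphism, here is a generic stack written with Python's typing module. The class Stack and the variables ints and names are invented for this example. Note that Python's type parameters are checked by external tools rather than at run time, unlike Java generics or C++ templates.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")              # the type parameter

class Stack(Generic[T]):
    """One definition that works uniformly for any element type T."""
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items

ints: Stack[int] = Stack()    # a stack of integers
ints.push(1)
ints.push(2)

names: Stack[str] = Stack()   # the same code, now for strings
names.push("ada")
```

The stack's code never inspects its elements, which is why a single definition can serve every element type.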
Programmers use abstractions to handle the complexity in their programs. With large programs (e.g., it is not uncommon to have programs that are millions of lines of code) it is essential to break down the program into a number of abstract units, each of which can be understood largely in isolation. Thus, when you have a bug in one unit, you only need to understand that unit to fix the bug. Most modern languages provide two kinds of abstractions: process abstractions (i.e., subprograms) and data abstractions. In this skill set we will understand the data abstraction support in programming languages.
In this skill set you will learn the foundations behind modern object-oriented languages.
Separating interfaces from implementations is one of the key hallmarks of modern software engineering. Thus, most modern languages provide mechanisms to separate interfaces from their implementations. Since different languages often provide very different mechanisms and it would take too long to cover them all, we will focus primarily on how Java and C# do it.
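The idea can be sketched in Python using an abstract base class as a stand-in for a Java or C# interface. This is only an analogy under that assumption, and the names Queue, ListQueue, and drain_first are hypothetical.

```python
from abc import ABC, abstractmethod

class Queue(ABC):
    """Interface: the operations a client may rely on."""
    @abstractmethod
    def enqueue(self, item): ...

    @abstractmethod
    def dequeue(self): ...

class ListQueue(Queue):
    """One implementation; clients should not depend on its internals."""
    def __init__(self):
        self._items = []

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        return self._items.pop(0)

def drain_first(queue: Queue, items):
    # Client code written only against the Queue interface.
    for item in items:
        queue.enqueue(item)
    return queue.dequeue()
```

Because drain_first mentions only Queue, a different implementation can be swapped in without touching client code, and trying to instantiate Queue itself raises a TypeError.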
In this course we are focusing primarily on programming language concepts and not on implementation. A compilers course focuses on the implementation. However, sometimes having a good understanding of the implementation can shed light on the concepts. This, in my opinion, is particularly true for O-O concepts: understanding how they are implemented can help you understand not only how they work but also why languages make different choices (e.g., single versus multiple inheritance).
For many problems one can get a much more elegant and compact solution with functional languages than with imperative languages. Also, many of the features from functional languages have crept into the more commonly used imperative languages (such as garbage collection). In this skill set we will learn about the main concepts behind functional languages. In the next skill set we will get some experience writing programs in a specific functional language, Standard ML.
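As a small taste of the difference in style, the Python sketch below computes the same result twice: once with an imperative loop and once by composing functions, with no mutable variables. Both function names are invented for this example, and Python is only approximating what a functional language such as Standard ML does more naturally.

```python
from functools import reduce

def sum_of_even_squares_imperative(numbers):
    # Imperative style: update a mutable accumulator step by step.
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total

def sum_of_even_squares_functional(numbers):
    # Functional style: compose filter, map, and reduce; nothing is mutated.
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(lambda acc, n: acc + n, squares, 0)
```

Both versions return 20 for the input [1, 2, 3, 4].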
SML is one of the most commonly used functional languages today. It is also a very interesting language to study since it has many powerful features such as pattern matching, type inference, and parametric polymorphism.
In order for programs to be robust, they must detect and handle any errors that may arise. Exception handling is a mechanism for handling exceptional situations. Many commonly used languages, such as Java, C#, C++, and SML, support exceptions.
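A minimal sketch of raising and handling an exception, written in Python. The exception class InsufficientFunds and the functions withdraw and safe_withdraw are hypothetical; the same structure exists in Java, C#, C++, and SML with different syntax.

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal exceeds the balance."""

def withdraw(balance, amount):
    if amount > balance:
        # Detect the error and signal it to the caller.
        raise InsufficientFunds(f"cannot withdraw {amount} from {balance}")
    return balance - amount

def safe_withdraw(balance, amount):
    try:
        return withdraw(balance, amount)
    except InsufficientFunds:
        # Handle the exceptional case: leave the balance unchanged.
        return balance
```

Here safe_withdraw(100, 30) returns 70, while safe_withdraw(100, 500) returns 100 because the exception is caught and handled.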