Home CPSC 401

Expressions & Control Structures

Expressions

An expression is the basic way of specifying a computation in a programming language.

An expression is any piece of code that has a single value. Examples:


Statements

A statement is a piece of code that has doesn't necessarily have a value, but causes some operation to occur.

Statements represent instructions to be carried out. Examples:

These are statements in C-based languages, but not expressions. None of these pieces of code evaluates to a single value. Statements can contain expressions however.


Expressions vs. Statements

Expressions can typically be used as statements. In some cases that is rather useless:


a + 5;

In others it can make sense:


strcpy(a, b);     // returns a

Statements, can not generally be used as expressions, however:


int x = if(a < b) 5; else 10;

The if statement can not be used as a value, since it has no value. C, C++ and Java provide the ?: operator which can be used as an expression:


int x = (a < b) ? 5 : 10;

In Racket and many other functional languages, there are no statements and everything is an expression. That allows us to use the if expression inside of another:


(display (if (< a b) 5 10))

Evaluation Order

When an expression includes multiple sub-expressions in it, they must be evaluated before the overall expression can be. This is different for several types:

In some cases, the order sub-expressions are evaluated is important:


int f(int* x) {
  *x = *x + 1;
  return *x;
}

int main() {
  int a = 10;
  cout << a + f(&a) << endl;
  return 0;
}

What is printed when this code runs?

It depends on the evaluation order of operands. There are different ways to solve this problem:


Short Circuit Evaluation

Most languages do short circuit evaluation when evaluating boolean expressions. If we have the expression:


a && b

Then a is evaluated first. If a is false, then there is no possible way the entire expression can be true. Thus b will not be evaluated.

Likewise, in the expression:


a || b

If a is evaluated to true, then b will be skipped.

Programmers rely on this in code like the following:


if((current != NULL) && (current->next != NULL))

if((index < MAX) && (array[index] == value))

Control Structures

Control structures allow us to direct the flow of execution inside of computer programs. They include if statements, switches, and loops.

Expressions and statements determine computations that happen and control structures are used to determine which expressions and statements take place when.


Goto Statements

The original control flow structure is the goto which jumps directly to a line or label in a program. There are both conditional and unconditional goto statements.

The following code uses goto to compute the sum of the numbers 1 to 100:


  int i = 1;
  int sum = 0;

top:
  if(i > 100) goto done;
  sum = sum + i;
  i = i + 1;
  goto top;

done:
  printf("Sum is %d.\n", sum);

Typically a goto is the only control structure available in assembly languages.


If / Else Statements

An if /else expression is a control construct used to select between multiple sets of statements based on some condition:

if condition
  statement1
else
  statement2

The else portion is typically optional.

The compiler will map this construct onto plain goto statements as follows:


  if not condition goto label1
  statement1
  goto label2
label1:
  statement2
label2:

More than two possibilities are possible by using "else if" statements:


if(condition1) {
  statement1;
} else if(condition2) {
  statement2;
} else if(condition3) {
  statement3;
} else {
  statement4;
}

This is nothing but a combination of single if/else statements where the statement under the else happens to be an if/else itself. The statements above can be equivalently written as:


if(condition1) {
  statement1;
} else {
  if(condition2) {
    statement2;
  } else {
    if(condition3) {
      statement3;
    } else {
      statement4;
    }
  }
}

Code Blocks

The bodies of control structures can be more than one statement. Different languages use different mechanisms for grouping statements:


Switch Statements

Another selection control structure is the switch statement. Switch statements in C family languages have the form:


switch(expression) {
  case constant1:
    statement;

  case constant2:
    statement2;
  
  // ...

  default:
    statementn;
}

In C, C++ and Java, the switch expression must be an integer type, and the code "falls through" into the next statement if there is no break.

C# allows switch statements to be used for strings and specifies each statement clause ends in a break or goto.

Switch was originally created as a means to specify a jump table in C. Compilers still try to use jump tables when compiling switches. The following switch:


switch(value) {
  case 10:
    printf("A");
    break;
  case 11:
    printf("B");
  case 12:
    printf("C");
    break;
  default:
    print("D");
}

Can be translated into a jump table as follows:


table = [label1, label2, label3, label4]
index = value - 10
if index < 0 || index > 3 goto label4
goto table[index]

label1:
  prinf("A");
  goto done

label2:
  printf("B");

label3:
  printf("C");
  goto done

label4:
  printf("D");

done:

This can be much faster than testing each condition if there are many cases.

If the compiler cannot reasonably do this it will translate it to a chain of if/else statements.


Pattern Matching

Languages in the ML family provide a feature called pattern matching which is an extension of the switch statement. In Ocaml, this is introduced with the "match" keyword:


match expression with
  expression1 -> value1
  | expression1 -> value1
  | expression1 -> value1;;

Unlike switch statements, the match construct is an expression, so it evaluates to a value. the switch above could be written with a match as:


print
  match value with
    10 -> "A"
    | 11 -> "BC"
    | 12 -> "C"
    | _ -> "D";;

The _ character is a wildcard which matches anything. The power of pattern matching lies in the fact that the patterns can be more complex than simple values. For example, we can match a tuple of three items using various rules:


match tuple with
  (x, _, x) -> print "The first and last are the same"
  | (1, _, _) -> print "The first one is one"
  | (a, b, _) when a > b -> print "The first is larger than the second"
  | _ -> print "None of these.";;

This can be used to write powerful code such as this code to decide the winner of Tic-Tac-Toe:


match game_board with
  | [x, x, x,
     _, _, _,
     _, _, _] -> x

  | [_, _, _,
     x, x, x,
     _, _, _] -> x

  | [_, _, _,
     _, _, _,
     x, x, x] -> x

  | [x, _, _,
     x, _, _,
     x, _, _] -> x

  | [_, x, _,
     _, x, _,
     _, x, _] -> x


  | [_, _, x,
     _, _, x,
     _, _, x] -> x

  | [x, _, _,
     _, x, _,
     _, _, x] -> x

  | [_, _, x,
     _, x, _,
     x, _, _] -> x

  | _ -> "Nobody"

While still somewhat long, this is considerably easier to code than it would be with if or switch statements alone.


Loops

Loops allow the repetition of a block of code in a more structured way than gotos. The most common loop is the while loop which has the general form:


while condition
  statements

The condition is tested to see if it is true first. If it is, then the statements in the loop body are executed. Then the condition is checked again etc.

While loops can be translated by compilers into a series of gotos like so:


top:
  if not condition goto done
  statements
  goto top

done:

Many languages also provide a do/while loop which has the form:


do
  statements
while condition

The difference is that the statements are always executed at least once which is not the case for a while loop. The compiler can generate code for this as:


top:
  statements
  if condition goto top

C, C++ and Java also provide a for loop which has the form:


for(statement1; condition; statement2)
  body

This type of for loop is equivalent to the fiollowing while loop:


statement1;
while(condition) {
  body;
  statement2;
}

The for is a convenience because this pattern is so common.


Break & Continue

break and continue provide more fine grained control over a loop.

break is an unconditional jump to the end of the loop, and is implemented with a single goto. It can be handy if there are multiple conditions to cause a loop to exit:


for(i = 0; i < MAX; i++) {
  if(array[i] == target) {
    // process it
    break;
  }
}

This loop searches for an item in an array. The main loop condition is used to avoid reading too much of the array. The break is used to end the loop when the item is found.

This can be written without a break as:


bool found = false;
for(i = 0; (i < MAX) && !found; i++) {
  if(array[i] == target) {
    // process it
    found = true;
  }
}

continue is an unconditional jump to the top of the loop. In a for loop, it also includes the code to move onto the next iteration. Continue can be used to filter out elements we don't want to process:


for(i = 0; i < MAX; i++) {
  /* if the pointer is NULL don't process it */
  if(array[i] == NULL) continue;

  // process array[i]
}

This can be written without a continue as:


for(i = 0; i < MAX; i++) {
  /* if the pointer is NULL don't process it */
  if(array[i] != NULL) {
    // process array[i]
  }
}

Some consider break and continue to be bad form as they make the control jump around similar to a goto.


Foreach Loops

For loops that process each element of a data structure are extremely common:


for(i = 0; i < MAX; i++) {
  // process array[i]
}

for(current = head; current != NULL; current = current->next) {
  // process *current
}

Because this is so common, many languages have a loop that specifically iterates through a data structure:

Python:

for item in collection:
  # process item
Java:

for (int item : collection) {
  // process item
}
C#:

foreach (int item in collection) {
  // process item
}

These loops assign the variable "item" to each element in the collection variable in turn. Foreach loops are a convenient addition to a language.


Exceptions

Handling errors was a common use of the goto statement (or worse setjmp/longjmp). To avoid this, exception handling was introduced.

There are typcially three parts of an exception:

As an example:


int divide(int a, int b) {
  if(b == 0) {
    throw DivideByZero();
  }

  return a / b;
}

int main() {
  int a, b;
  cin >> a >> b;
  
  try {
    cout << divide(a, b);
  }
  catch(DivideByZero e) {
    cout << "You can't divide by zero!";
  }
}

Notice that the exception is caught by a catch in a different function. When an exception is thrown, the program returns from multiple functions searching for a matching catch. This is called stack unwinding.

Copyright © 2018 Ian Finlayson | Licensed under a Creative Commons Attribution 4.0 International License.