
References and Memory

Suppose we have a class declaration like this in Java:


public class Person {
    private String name;
    private int birthYear;

    private Person bestFriend;

    // ...
}

Here, we have a class declaration with a number of variables declared inside of it. A Person object contains a name, birth year, and who the person's best friend is.

We store the best friend by including a Person inside of the Person. But that inner Person must also include a Person for its own best friend, and so on. Is it possible to nest objects of the same type inside each other like this?

A picture of a Person object with name, birthYear and bestFriend
stored inside it.  The bestFriend is another Person object with those
things, and so on.
An (incorrect) view of how the Person class will be stored in memory

 

References

The answer to this riddle is that declaring a variable of an object type in Java does not create an object in memory. Instead, it creates a reference to an object, which may be created later.

A reference (also called a pointer) is essentially a variable that holds the memory address of an object. A field of an object type starts out as null, meaning that it does not refer to any object yet (null is commonly pictured as memory address 0). A local variable, by contrast, must be assigned a value before it can be used.

So the way that a Person object is actually stored in memory would look like this:

The Person object stores a 4 byte integer for the birth year, 8 bytes for
the name (which is a reference to a String), and 8 bytes for the best friend (which
is a reference to a Person)
The actual memory layout of a Person object
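
As a minimal sketch of these ideas, the code below uses a hypothetical Box class (not part of the Person example) to show that a reference field starts out as null, and that assigning one reference variable to another copies the reference rather than the object:

```java
// Hypothetical class used only for this illustration.
class Box {
    int value;   // primitive field: stored directly inside the object
    Box other;   // reference field: holds null until it is assigned
}

class ReferenceDemo {
    public static void main(String[] args) {
        Box a = new Box();             // a refers to a newly created object
        System.out.println(a.other);   // null

        Box b = a;                     // copies the reference, not the object
        b.value = 42;                  // changes the one shared object
        System.out.println(a.value);   // 42
        System.out.println(a == b);    // true: == compares references
    }
}
```

Because a and b hold the same address, there is only one Box in memory, no matter which variable is used to reach it.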

 

Instantiation

We now need to actually create objects and store references to them in these variables. Objects are generally created with the new keyword. String objects are a special case: they can be created with new or by writing text in double quotes, which Java supports for convenience.

The code below fills in some references this way:


class Person {

    private String name;
    private int birthYear;
    private Person bestFriend;

    public Person(String name, int birthYear) {
        this.birthYear = birthYear;
        this.name = new String(name);
    }

    public void setFriend(Person friend) {
        bestFriend = friend;
    }
}

public class x {
    public static void main(String[] args) {
        // make one person
        Person p1 = new Person("Alice Anderson", 1997);

        // make another person
        Person p2 = new Person("Bill Barber", 1998);

        // set them as each other's friend
        p1.setFriend(p2);
        p2.setFriend(p1);
    }
}

There are a few things happening here. First, we initialize the name field inside the constructor using new. This is optional for String objects in Java (this.name = name would also work), but it is shown here. We also initialize the birthYear field. Primitive values such as int are not objects and are stored directly in the field rather than through a reference, so new is not used for them.

The constructor leaves the bestFriend field as null. It can later be set with the setFriend method. The main method makes two Person objects and sets them as each other's best friend.

After running this program, this is the way that the objects might look in memory:

The memory layout of the fields of the two objects in the example
above.  The reference fields contain the address of the objects they
refer to in memory.
Memory diagram of the objects in the program above

The exact memory addresses used are arbitrary examples. The important thing to understand is that reference fields store the memory addresses of the objects to which they refer.

Because the exact memory addresses themselves don't really matter, we normally draw a diagram like this using arrows instead. That way we can still indicate which objects they are referring to without needing to specify addresses.

In this version of the image we replaced the memory addresses with
arrows indicating which fields refer to which objects in memory.
Memory diagram using arrows instead of addresses to show relationships

Because we draw reference variables as arrows like this, they are also called "pointers".
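
The arrows can also be followed in code. The sketch below uses a trimmed-down Person nested inside a demo class; the getFriend accessor is an assumption added for this illustration and does not appear in the class above:

```java
class FriendsDemo {
    // Trimmed-down Person; getFriend() is a hypothetical accessor for this sketch.
    static class Person {
        private final String name;
        private Person bestFriend;   // a reference, null until setFriend is called

        Person(String name) { this.name = name; }
        void setFriend(Person friend) { bestFriend = friend; }
        Person getFriend() { return bestFriend; }
    }

    public static void main(String[] args) {
        Person p1 = new Person("Alice Anderson");
        Person p2 = new Person("Bill Barber");
        p1.setFriend(p2);
        p2.setFriend(p1);

        // Following the arrows: p1 -> p2 -> p1
        System.out.println(p1.getFriend() == p2);              // true
        System.out.println(p1.getFriend().getFriend() == p1);  // true
    }
}
```

Only two Person objects exist here. Each one stores the address of the other, so the references can form a cycle without any object being nested inside another.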


 

Stack vs. Heap Memory

There are actually two distinct areas of memory programs have access to: the stack and the heap. They are used for different purposes:

Stack                            | Heap
Allocated automatically          | Allocated with new
Stores primitives and references | Stores objects
Have names                       | Are anonymous
Destroyed at end of scope        | Destroyed when not referred to

Let's say that we have the following main method:


// Requires: import java.util.Scanner; and import java.util.Random;
public static void main(String[] args) {
    Scanner in = new Scanner(System.in);

    // get user throw
    System.out.println("Enter throw (1)Rock, (2)Paper, (3)Scissors");
    int user = in.nextInt();

    // get computer throw
    Random rng = new Random();
    int comp = rng.nextInt(3) + 1;

    // figure winner
    int difference = user - comp;
    switch (difference) {
        case 0:
            System.out.println("Tie!");
            break;
        case 1:
        case -2:
            System.out.println("You won!");
            break;
        case -1:
        case 2:
            System.out.println("You lost :(");
            break;
    }
}

Which things are placed on the heap and which on the stack?


 

Stack Frames

The important thing about the stack is that each time you call a method, you are given a new place on the stack to store all of the variables that method might need. This is called a stack frame.

When you return from a method, all of the variables on that stack frame are destroyed. The only variables that can be accessed in a program are those that are in the currently executing method (or objects on the heap it has access to).
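
One consequence can be sketched as follows, using a hypothetical Counter class: a method receives its own copy of each argument in its stack frame, so changing a primitive parameter does not affect the caller, while a change made to a heap object through a reference is still visible after the method returns.

```java
class FrameDemo {
    // Hypothetical mutable object used only for this sketch.
    static class Counter {
        int count;
    }

    // n lives in this call's stack frame; the caller's variable is untouched.
    static void bumpPrimitive(int n) {
        n = n + 1;
    }

    // c is a copy of the reference, but it refers to the same heap object.
    static void bumpObject(Counter c) {
        c.count = c.count + 1;
    }

    public static void main(String[] args) {
        int n = 5;
        bumpPrimitive(n);
        System.out.println(n);         // 5

        Counter c = new Counter();
        bumpObject(c);
        System.out.println(c.count);   // 1
    }
}
```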

The stack essentially keeps track of our history of method calls from oldest to most-recent. For example, consider the following code:


class Stacks {
    public static void f(int x) {
        System.out.println(x);
    }

    public static void g(int x) {
        f(x + 1);
    }

    public static void h(int x) {
        g(x * 2);
    }

    public static void main(String[] args) {
        h(7);
    }
}

When this code runs, execution starts in main, then goes to h, then g, then f. When the methods begin to return, execution goes from f back to g, then to h, and finally back to main, where the program ends:

At first only main is on the stack.  When a function is called, a new
stack frame for it is placed on top, so the stack grows bigger.  Then the
functions begin to return and the stack frames are removed until only main
is left again.
The stack as this program is run

When a program runs, a stack is maintained to keep track of which method we are in. The block for each method is its stack frame, which contains all of the variables that method uses. We will not have to worry about the call stack very often because it is maintained for us by the virtual machine.
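
Although the virtual machine manages the stack, a program can inspect it. The sketch below is a variation on the Stacks class in which f returns the names of the methods currently on the call stack instead of printing x; it relies on the standard Throwable.getStackTrace method, and the exact frames reported can vary with the JVM and how the program is launched.

```java
import java.util.ArrayList;
import java.util.List;

class StackTraceDemo {
    // Returns the names of the methods on the call stack, most recent first.
    static List<String> f(int x) {
        List<String> names = new ArrayList<>();
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            names.add(frame.getMethodName());
        }
        return names;
    }

    static List<String> g(int x) { return f(x + 1); }

    static List<String> h(int x) { return g(x * 2); }

    public static void main(String[] args) {
        // The list begins with f, g, h, followed by the caller of h.
        System.out.println(h(7));
    }
}
```

The most recent call appears first, which matches the picture of new stack frames being placed on top of older ones.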


 

Arrays and Memory

When we make an array of primitive types, each cell in the array is large enough to physically contain the primitive value. For example if we make an array of 8 integers, it would contain 32 bytes of space:


int[] array = new int[8];
Shows that an array holds eight 4-byte values, for 32 bytes total.
An array of primitives

However, when we create an array of objects we are not making space for all of the actual objects themselves, but only for references to those objects. The references themselves are null until we initialize them to something:


Person[] array = new Person[8];
Shows that an array of objects creates references which initially refer to null
An array of objects

To be able to use the array, we would need to set the references to point to actual objects, either by instantiating new objects or by pointing them at objects that already exist.
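
The difference between the two kinds of array can be sketched like this, with a hypothetical Point class standing in for Person:

```java
class ArrayDemo {
    // Hypothetical class standing in for Person in this sketch.
    static class Point {
        int x, y;
    }

    public static void main(String[] args) {
        int[] numbers = new int[8];        // eight ints stored in the array itself
        System.out.println(numbers[0]);    // 0: primitive cells start at zero

        Point[] points = new Point[8];     // eight references, no Point objects yet
        System.out.println(points[0]);     // null

        // Instantiate an object for every slot before using the array.
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point();
        }
        points[3].x = 10;                  // safe now: slot 3 refers to a real object
        System.out.println(points[3].x);   // 10
    }
}
```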


 

Common Memory Mistakes

There are two common mistakes when dealing with memory in Java. The first is to use a reference that does not refer to an object yet. For example, we could do that with code like this:


Person p = null;   // no object has been created yet
// ...
p.show();

This will produce the famous "NullPointerException":

Exception in thread "main" java.lang.NullPointerException
	at Example.main(Example.java:28)

The fix for this is to make sure that all objects you're trying to use have actually been instantiated.
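
When a reference may legitimately be null, one option is to check it before use. The sketch below uses a String reference and a hypothetical safeLength helper to show both the exception and the guarded version:

```java
class NullCheckDemo {
    // Returns the length of s, or -1 when s does not refer to any object.
    static int safeLength(String s) {
        if (s == null) {
            return -1;
        }
        return s.length();
    }

    public static void main(String[] args) {
        String missing = null;   // a reference that refers to no object

        try {
            missing.length();    // dereferencing null throws at run time
        } catch (NullPointerException e) {
            System.out.println("caught NullPointerException");
        }

        System.out.println(safeLength(missing));   // -1
        System.out.println(safeLength("hello"));   // 5
    }
}
```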

The second most common mistake regarding memory in Java programs is not understanding that object variables are just references, and not objects themselves. Misunderstanding this will lead to countless issues.

For example, the following code works with an array of Person references. It makes a "defaultPerson" object with some default properties, sets each slot in the array to this person, and then tries to set the name of each individual element.


Person[] array = new Person[8];
Person defaultPerson = new Person("Default", 1990);
for (int i = 0; i < 8; i++) {
    array[i] = defaultPerson;
}

array[0].setName("Alice");
array[1].setName("Billy");
array[2].setName("Claire");
array[3].setName("Dominic");
// ...

What will be printed if we run the following code?


array[0].printName();
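
One way to check your answer is with a self-contained sketch like the one below. It uses a trimmed-down Person whose setName and getName methods are assumed to be simple accessors, and it compares an array whose slots share one object with an array that has a separate object in each slot:

```java
class SharedReferenceDemo {
    // Trimmed-down Person; setName/getName are assumed to be simple accessors.
    static class Person {
        private String name;
        Person(String name) { this.name = name; }
        void setName(String name) { this.name = name; }
        String getName() { return name; }
    }

    public static void main(String[] args) {
        Person[] shared = new Person[4];
        Person defaultPerson = new Person("Default");
        for (int i = 0; i < shared.length; i++) {
            shared[i] = defaultPerson;             // every slot refers to one object
        }
        shared[0].setName("Alice");
        shared[1].setName("Billy");
        System.out.println(shared[0].getName());   // Billy: the last write wins

        Person[] separate = new Person[4];
        for (int i = 0; i < separate.length; i++) {
            separate[i] = new Person("Default");   // a distinct object per slot
        }
        separate[0].setName("Alice");
        separate[1].setName("Billy");
        System.out.println(separate[0].getName()); // Alice
    }
}
```

Calling new inside the loop is what gives each slot its own object; assigning the same reference to every slot only creates more arrows to a single Person.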

Copyright © 2019 Ian Finlayson | Licensed under a Creative Commons Attribution 4.0 International License.