// allocate
Object* a = new Object;
// de-allocate
delete a;
Manual memory management leaves us open to errors:
Most modern languages do not require the programmer to manage memory themselves.
Instead they provide garbage collectors which automatically free unused objects. Garbage collection was first created for the LISP language.
A simple way of performing garbage collection is called reference counting. It works by keeping track of how many pointers are pointing to a given object.
Each time a pointer is pointed at an object, the counts are updated.
The downside of reference counting is that it does not catch cyclical references.
A cycle of references is present in several data structures such as a circularly linked list:
Because each node has a reference to it from another node, its count will never hit zero, even when the head pointer is removed.
This example demonstrates this problem with smart_ptr.
Reference counting is usually not used for garbage collection for this reason.
Garbage collectors work by tracing all of the objects in a program. If we start with all of the objects that can be directly accessed, then scan out, we will reach all accessible objects. Any object not reached is garbage.
Root Set
Objects that are directly accessible: local variables and globals.
Reachable
Any object that is part of the root set, or accessible indirectly from the root set.
Garbage
Any object on the heap that is not reachable.
The basic garbage collection algorithm is called mark and sweep. It works as follows:
In a compacting garbage collector, the heap is divided into live and dead sections:
Once the set of reachable objects have been found, they are moved to the live section. This compacts the live objects together.
This means we have to move every live object, but has a few benefits:
In typical programs, there tend to be two types of objects:
Objects that exist for most of the program are allocated early. Recently allocated objects are more likely to become garbage sooner. This is called infant mortality.
A generational garbage collector maintains several generations of objects:
A generational collector works as follows:
Generational GC has a few benefits:
Simple garbage collection halts the running program to perform collection. This can produce noticeable lags in execution.
To avoid this, modern garbage collectors use incremental collection. This means that execution and collection are interleaved, usually in separate threads.
This presents many challenges:Incremental collectors usually keep objects in three categories:
Incremental collection is complex and must still pause execution in some instances.
Many collectors can also be explicitly invoked by the programmer. For example, in Java:
System.gc();
Garbage collection involves some overhead, but modern garbage collectors are quite well optimized.
Also manual allocation and de-allocation has an overhead as well. Consider the following two programs:
The Java program will run much faster because of GC optimizations.
Manual memory management is not always faster. Large programs in C or C++ typically try to optimize memory usage by pooling objects and reusing space.
Copyright © 2024 Ian Finlayson | Licensed under a Creative Commons BY-NC-SA 4.0 License.