Big-O notation is great because it gives us a precise way of being vague.
We want to be somewhat vague because we don't want to waste time figuring out
exactly how many clock cycles an algorithm takes to run (which also depends
on the language, compiler, and machine).

But we want to be vague in a precise way. Ignoring constants is fine, and so is
ignoring coefficients, but we want to capture the essence of how the work we need
to do scales up to different input sizes.

This section will talk about the mathematical definition of Big-O,
which is as follows:

If $f(n)$ and $g(n)$ are functions, then:

$f(n) = O(g(n))$

Is true if there is some constant value $c$ that we can multiply $g(n)$ and have $g(n)$
be greater than or equal to $f(n)$ for sufficiently large values of $n$.

For example, take our function from before:

$f(n) = 2 \cdot n + 4$

Then we can say that:

$f(n) = O(n)$

Because there is some value $c$ (for example 3) where, for sufficiently
large values of $n$, $3 \cdot n$ is greater than $2 \cdot n + 4$.

Big-O analysis basically only considers the "order" of a function such as
whether it is constant, linear, quadratic or exponential, rather than the full
definition of the function.

Examples

What is the Big-O of the following functions? Try to think
of each answer before seeing it.

$f(n) = 12 \cdot n^2 + 20$

$O(n^2)$

Like before, the coefficient of 12 is dropped, as is the constant 20.
Mathematically, this satisfies the definition of Big-O, because there is a constant
(13 for example) which we can multiply $n^2$ by to get a larger value than the
original function for large values of $n$.

$f(n) = 3 \cdot n^2 + 4 \cdot n + 15$

$O(n^2)$

This one is a little trickier. In addition to dropping the coefficient of 3, and
the constant of 15, we can drop the $4 \cdot n$ term as well. The intuitive
reason is that, when $n$ gets to be really big, $n^2$ is going to be way bigger
than $4 \cdot n$, so it's the only part that matters. The mathematical reason is
that there is a constant (4 for example), which we can multiply by $n^2$ that will
be bigger than the original function.

$f(n) = 5 \cdot n^2 + 2^n + 1000$

$O(2^n)$

Here the $2^n$ is larger than the other terms, so it dominates the function
for large values of $n$.

So basically to simplify a function with Big-O notation, you find the term which is
the biggest for large values of $n$, without the coefficient.