Dynamic programming vs memoization vs tabulation
Dynamic programming is a technique for solving problems recursively. It can be implemented by memoization or tabulation.
Dynamic programming
Dynamic programming, DP for short, can be used when the computations of subproblems overlap.
If you’re computing for instance fib(3)
(the third Fibonacci number), a naive implementation would compute fib(1)
twice:
With a more clever DP implementation, the tree could be collapsed into a graph (a DAG):
It doesn’t look very impressive in this example, but it’s in fact enough to bring down the complexity from O(2n) to O(n). Here’s a better illustration that compares the full call tree of fib(7)
(left) to the corresponding collapsed DAG:
This improvement in complexity is achieved regardles of which DP technique (memoization or tabulation) is used.
Memoization
Memoization refers to the technique of caching and reusing previously computed results. Here’s a comparison of a square
function and the memoized version:
square(x) { square_mem(x) {
return x * x if (mem[x] is not set)
} mem[x] = x * x
return mem[x]
}
A memoized fib
function would thus look like this:
fib_mem(n) {
if (mem[n] is not set)
if (n < 2) result = n
else result = fib_mem(n-2) + fib_mem(n-1)
mem[n] = result
return mem[n]
}
As you can see fib_mem(k)
will only be computed at most once for each k. (Second time around it will return the memoized value.)
This is enough to cause the tree to collapse into a graph as shown in the figures above. For fib_mem(4)
the calls would be made in the following order:
fib_mem(4) fib_mem(3) fib_mem(2) fib_mem(1) fib_mem(0) fib_mem(1) already computed fib_mem(2) already computed
This approach is top-down since the original problem, fib_mem(4)
, is at the top in the above computation.
Tabulation
Tabulation is similar in the sense that it builds up a cache, but the approach is different. A tabulation algorithm focuses on filling the entries of the cache, until the target value has been reached.
While DP problems, such as the fibonacci computation, are recursive in nature, a tabulation implementation is always iterative.
The tabulation version of fib
looks like this:
fib_tab(n) {
mem[0] = 0
mem[1] = 1
for i = 2...n
mem[i] = mem[i-2] + mem[i-1]
return mem[n]
}
The computation of fib_tab(4)
can be described as follows:
mem[0] = 0 mem[1] = 1 mem[2] = mem[0] + mem[1] mem[3] = mem[1] + mem[2] mem[4] = mem[2] + mem[3]
As opposed to the memoization technique, this computation is bottom-up since original problem, fib_tab(4)
, is at the bottom of the computation.
Complexity Bonus: The complexity of recursive algorithms can be hard to analyze. With a tabulation based implentation however, you get the complexity analysis for free! Tabulation based solutions always boils down to filling in values in a vector (or matrix) using for loops, and each value is typically computed in constant time.
Should I use tabulation or memoization?
If the original problem requires all subproblems to be solved, tabulation usually outperformes memoization by a constant factor. This is because tabulation has no overhead for recursion and can use a preallocated array rather than, say, a hash map.
If only some of the subproblems need to be solved for the original problem to be solved, then memoization is preferrable since the subproblems are solved lazily, i.e. precisely the computations that are needed are carried out.
Comments (7)
Best explanation you can find! Precise & Concise!
by Anonymous |
Could you comment on the memory complexity difference between them, because it could be an important factor to choose between memorization and “tabulation”. Pseudo-tabulation and a bottom-up approach has better memory complexity when we do not need to “remember” all previous solutions. (In Fibonacci for example, only the last 2 values must be remembered)
by Mrio |
That’s a good point. In the tabulation approach you have more control over what data to save for later. Inside a memoized function it is hard to make a decision on what previous results can be discarded.
I guess the bottom line is this: If you want such finer grained control over memory usage, memoization may not be a viable option.
by Andreas Lundblad |
Could you please comment on the time complexity of both of approaches. Does memoization still take O(2n) since we are anyways making the recursive call?
by Abi |
Both algorithms are O(n), but I understand your confusion since fib_mem
is being called more than n times. There are however two important things to note:
-
We are in fact not always making the recursive calls. Even if
fib_mem(5)
is called more than once, it will only recurse intofib_mem(4)
andfib_mem(3)
the first time it's called. -
One must keep in mind what's being analyzed. In algorithms like these, it's typically arithmetic operations, such as number of additions, that's being analyzed. A call to
fib_mem(2)
incurs zero cost if the value has already been computed.
by Andreas Lundblad |
This explanation... It has different kind of simplicity. It reminds me the quote that Einstein once said, "If you can't explain it to a six year old, you don't understand it yourself" Thank you sir :)
Great explanation thanks!
by JustMe |