Contents

Benchmarking and caching decorators

Let’s say you have a really slow program and you want to benchmark where your program is taking most of the time to run. If you can find that you can just optimize that part of the program to run faster.

There is couple of way of doing this going through this manually or using some kind of library like cProfile to generate a report on the function’s workings.

You can pretty much use this on any python function as you like, small-big-has other dependency anything.

Let’s see a few example as how to use these Benchmarking decorator.

from AKDSFramework.applications.decorators import benchmark

Now after importing we write a function and before the function we mention @benchmark and it’ll generate reports of the innerworkings of the function.

We gonna see an example of implementation of benchmarking by building a max heap and adding 2 numbers to the heap and again building it. To make max heaps I’ll use AKDSFramework, let’s create a heap and build it now with around 600 elements.

from AKDSFramework.applications.decorators import benchmark
from AKDSFramework.structure import MaxHeap

@benchmark
def buildHeap(array):
    h = MaxHeap(array)
    h.build()

    h.add(68)
    h.add(13)
    h.build()

buildHeap([data**2 for data in range(601)])
         3597 function calls (3003 primitive calls) in 0.002 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002 <ipython-input-2-f022f9cc4326>:4(buildHeap)
        2    0.000    0.000    0.002    0.001 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:136(build)
 1195/601    0.001    0.000    0.002    0.000 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:155(heapify)
     1195    0.000    0.000    0.000    0.000 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:67(get_left_child)
     1195    0.000    0.000    0.000    0.000 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:53(get_right_child)
        1    0.000    0.000    0.000    0.000 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:128(__init__)
        2    0.000    0.000    0.000    0.000 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:26(add)
        1    0.000    0.000    0.000    0.000 /Users/royakash/Documents/GitHub/AKDSFramework/AKDSFramework/structure/heap.py:21(__init__)
        2    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        2    0.000    0.000    0.000    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Notice the @benchmark decorator at the beginning of the declaration of the function, that calls cProfile to start calculating what taking what.

This has all the call’s report and how much time it’s taking. If you see the second last function call {method 'append' of 'list' objects} see that’s called 2 times total as we are appending 2 elements.

So this way you can see how much each function taking time and how many times they are called. If you wish you can reduce the number of calls or use a different approach to solve the part where it’s slow.

Caching

Now there is now way you can optimize the function, what you can do instead is that you can store results from a previous computation and reuse those results in a new computation to find solutions to other problem.

Let’s create world’s worst fibonacci series computation code. The algorithm might look like this:

FIBONACCI (n):
    if n -> 0: f = 0
    elif n -> 1: f = 1
    else:
        f = FIBONACCI(n - 1) + FIBONACCI (n - 2)
    return f

This is a correct algorithm for fibonacci. But if you see the recurrence relation T(n) = T(n-1) + T(n-2) + O(1) you can see that the code is running in exponential time O(2^N) which is really really bad.

The equivalent python code would be:

def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

If you draw a recursion tree you can find that you are computing same computation over and over again in different trees. Let’s see what I mean:

+--+-----------+-----------+--------+-----------+-----------+--+
|  |           |           | Fib(n) |           |           |  |
+--+-----------+-----------+--------+-----------+-----------+--+
|  |           | Fib (n-1) |        | Fib (n-2) |           |  |
+--+-----------+-----------+--------+-----------+-----------+--+
|  | Fib (n-2) | Fib (n-3) |        | Fib (n-3) | Fib (n-4) |  |
+--+-----------+-----------+--------+-----------+-----------+--+

See for calculating fib(n) you are calculating Fib (n-1) and Fib (n-2). In a separate computation you are computing Fib (n-2) for that you are computing Fib (n-3) and Fib (n-4).

If you had Fib (n-2) from the previous computation stored, you wouldn’t have to recompute that Fib (n-2) and it’s subsequent branches. So you would’ve saved a lot of time by just not recomputing anything.

Let’s without caching how much time it would take to compute fib(35) that is 35th fibonacci number:

import time

def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


start = time.perf_counter()
print(fibonacci(35))
end = time.perf_counter()

print(f"Computed in {(end - start)} seconds")
9227465
Computed in 4.039259026000025 seconds

Total time for computation is 4.039259026000025 seconds. So our python program is taking 4.039259026000025 seconds to compute fib(35). Now let’s store intermediate step’s data in a dictionary so that we can retrieve those data at a later time in constant time.

First we import the caching decorator.

from AKDSFramework.applications.decorators import cached

Now we write the function but with the caching decorator.

import time

@cached
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


start = time.perf_counter()
print(fibonacci(40))
end = time.perf_counter()

print(f"Computed in {(end - start)} seconds")
102334155
Computed in 0.0004765559999668767 seconds

Now it takes around 0.0004765559999668767 seconds, which is a huge heap in performance.