Incremental average

There are three ways to calculate an average, depending on how we receive the data:

  1. Basic: we have all the data, so we just use the well-known formula: $$Avg = \frac{1}{n} \sum_{i=1}^n x_i$$
  2. Moving average: calculated over a rolling window.
  3. Incremental average: the one we will discuss now.

The idea is to update the basic average at every step without recalculating it from the whole sequence, i.e. the data arrives one value per time step.

Here is the formula:

$$a_n = a_{n-1} + \frac{x_n - a_{n-1}}{n}$$
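
To see where this comes from, here is a short derivation, writing $a_n$ for the average of the first $n$ values $x_1, \dots, x_n$. The intermediate steps are filled in here as a sketch; they only rearrange the basic formula:

$$\begin{aligned}
a_n &= \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}\left(x_n + \sum_{i=1}^{n-1} x_i\right) \\
&= \frac{1}{n}\left(x_n + (n-1)\,a_{n-1}\right) \\
&= a_{n-1} + \frac{x_n - a_{n-1}}{n}
\end{aligned}$$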

Here is how you can use it as a Python closure, so that you don't have to carry the state around yourself:

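def iavg():
  avg = 0  # running average so far
  n = 0    # number of values seen so far
  def calc(value):
    nonlocal n, avg
    n += 1
    # incremental update: a_n = a_{n-1} + (x_n - a_{n-1}) / n
    avg = avg + ((value - avg) / n)
    return avg
  return calc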
  
avg = iavg()
print(f'2 => {avg(2)}')
print(f'4 => {avg(4)}')
print(f'6 => {avg(6)}')

-----

2 => 2.0
4 => 3.0
6 => 4.0

# (2+4+6) / 3 = 12/3 = 4
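
As a quick sanity check, the sketch below compares the incremental update with the basic formula on every prefix of a small list. The helper name `running_averages` and the sample data are made up for illustration; the helper restates the same update as a generator so the snippet runs on its own.

```python
import math

def running_averages(values):
    """Yield the incremental average after each value (hypothetical helper)."""
    avg = 0.0
    for n, value in enumerate(values, start=1):
        avg += (value - avg) / n
        yield avg

data = [2, 4, 6, 8, 10]  # sample data, chosen only for this example
incremental = list(running_averages(data))

# The basic formula, recomputed from scratch on every prefix
basic = [sum(data[:k]) / k for k in range(1, len(data) + 1)]

# Both approaches should agree up to floating-point rounding
assert all(math.isclose(i, b) for i, b in zip(incremental, basic))
print(incremental)  # [2.0, 3.0, 4.0, 5.0, 6.0]
```

The from-scratch version re-sums the whole prefix at every step, while the incremental one does a constant amount of work per value.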

Calculating Moving (Average) ++

If you have time-series data and want to calculate a moving function such as a moving average, you can use a rolling window as shown below. Enjoy!

import numpy as np

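def rolling_window(a, window):
    # one row per window position, `window` elements per row
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    # reuse the last-axis stride so consecutive rows overlap; no data is copied
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)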

a = np.arange(0,10)

print(rolling_window(a,5))

print(np.mean(rolling_window(a,5), axis=1))

-----
[[0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]
 [3 4 5 6 7]
 [4 5 6 7 8]
 [5 6 7 8 9]]

[2. 3. 4. 5. 6. 7.]


b = np.random.randint(0,100,10)

print(rolling_window(b,5))

print(np.mean(rolling_window(b,5), axis=1))

-----
[[42 93 30 69 53]
 [93 30 69 53 93]
 [30 69 53 93 61]
 [69 53 93 61 22]
 [53 93 61 22 53]
 [93 61 22 53 71]]

[57.4 67.6 61.2 59.6 56.4 60. ]
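
NumPy 1.20 and later also ship a ready-made helper, `np.lib.stride_tricks.sliding_window_view`, which builds the same kind of overlapping view. The sketch below assumes that NumPy version is available and uses it to compute a moving average, a moving maximum and a moving minimum over the same window; the variable names are only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

a = np.arange(0, 10)

# Read-only view of shape (6, 5): one row per window position
windows = sliding_window_view(a, 5)

# Any reduction along axis 1 becomes a "moving" version of that function
moving_mean = windows.mean(axis=1)
moving_max = windows.max(axis=1)
moving_min = windows.min(axis=1)

print(moving_mean)  # [2. 3. 4. 5. 6. 7.]
print(moving_max)   # [4 5 6 7 8 9]
print(moving_min)   # [0 1 2 3 4 5]
```

The returned view is read-only by default, which avoids accidentally writing through overlapping windows.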

Or here is another way to compute a moving average, this time using a cumulative sum:

import numpy as np

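def moving_average(a, n=3):
    # running totals: ret[i] = a[0] + ... + a[i]
    ret = np.cumsum(a, dtype=float)
    # subtracting the total from n steps earlier leaves the sum of the last n values
    ret[n:] = ret[n:] - ret[:-n]
    # the first full window ends at index n - 1
    return ret[n - 1:] / n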

print(moving_average([1,2,3,4,5,4,3,2,1]))

-------

[2.0 3.0 4.0 4.33 4.0 3.0 2.0]
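
A third option, not used in the snippets above, is to express the moving average as a convolution with a uniform kernel. This is a minimal sketch built on `np.convolve` with `mode='valid'`; the function name `moving_average_conv` is hypothetical.

```python
import numpy as np

def moving_average_conv(a, n=3):
    """Moving average via convolution with n equal weights of 1/n (illustrative)."""
    kernel = np.ones(n) / n
    # mode='valid' keeps only positions where the window fully overlaps the data
    return np.convolve(a, kernel, mode='valid')

result = moving_average_conv([1, 2, 3, 4, 5, 4, 3, 2, 1])
print(np.round(result, 2))
```

Because the kernel is symmetric, the flip that convolution applies makes no difference here, so the result should match the cumulative-sum version up to floating-point rounding.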