When working with time series data in NumPy,
I often find myself needing to compute *rolling* or *moving*
statistics such as the mean and standard deviation. The simplest way
to compute them is with a for loop over the windows.
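A minimal sketch of such a loop for a 1-D array might look like the following. The name `rolling_mean_loop` and its parameters are illustrative assumptions, not taken from the original post.

```python
import numpy as np

def rolling_mean_loop(a, window):
    """Rolling mean of a 1-D array using an explicit Python loop (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0] - window + 1          # number of complete windows
    if window < 1 or n < 1:
        raise ValueError("window must be between 1 and len(a)")
    out = np.empty(n)
    for i in range(n):
        # Each iteration slices one window and reduces it
        out[i] = a[i:i + window].mean()
    return out

x = np.arange(10)
loop_result = rolling_mean_loop(x, 3)
```

Each iteration pays the Python interpreter overhead for the slice and the `mean` call, which is where the cost comes from on long series.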

Loops in Python are, however, very slow compared to loops in C code. Fortunately, there is a trick to make NumPy perform this looping internally in C code. This is achieved by adding an extra dimension with the same size as the window and an appropriate stride.
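One way to sketch the extra-dimension trick is with `np.lib.stride_tricks.as_strided`. The function name `rolling_window` is an assumption made for illustration. The view is created read-only here because the windows overlap in memory, so writing through it would silently change several windows at once.

```python
import numpy as np

def rolling_window(a, window):
    """Return a read-only view of `a` with a trailing axis of length `window`.

    Hypothetical helper: the last axis of `a` is split into overlapping
    windows without copying any data.
    """
    a = np.asarray(a)
    if window < 1 or window > a.shape[-1]:
        raise ValueError("window must be between 1 and the length of the last axis")
    # New shape: (..., number_of_windows, window)
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    # Reuse the last stride so consecutive windows start one element apart
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(
        a, shape=shape, strides=strides, writeable=False
    )

windows = rolling_window(np.arange(5), 3)
# windows has shape (3, 3): rows [0 1 2], [1 2 3], [2 3 4]
```

Because only the shape and strides change, the result shares memory with the input, and any reduction over the last axis then runs inside NumPy's C loops.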

Using this function, it is easy to calculate, for example, a rolling mean without looping in Python.
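As a hedged illustration, the rolling mean and standard deviation can be computed by reducing over the last axis of the windowed view. The helper `rolling_window` is repeated in compact form so that the snippet runs on its own; its name and signature are assumptions.

```python
import numpy as np

def rolling_window(a, window):
    # Hypothetical helper: overlapping windows along the last axis, as a read-only view
    a = np.asarray(a)
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(
        a, shape=shape, strides=strides, writeable=False
    )

x = np.arange(10, dtype=float)

# Reduce over the window axis; the looping happens inside NumPy
rolling_mean = np.mean(rolling_window(x, 3), axis=-1)
rolling_std = np.std(rolling_window(x, 3), axis=-1)
# rolling_mean is [1. 2. 3. 4. 5. 6. 7. 8.]
```

On NumPy 1.20 or later, `np.lib.stride_tricks.sliding_window_view(x, 3)` produces the same kind of windowed view and performs the bounds checks itself.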

More about the “stride trick”: SegmentAxis, GameOfLifeStrides
