
Cython: Tips and tricks

If you have more tricks, comment below …

The last couple of days I started learning Cython …

What is Cython ?

Cython is a compiled language that seamlessly integrates C with Python, plus more …

C, Cython, Python

The nice thing about it is that the learning curve is very gradual. You can start without even changing your Python code.
Simply compiling it may speed up your script.

The next step is to start using Cython-the-language constructs.
Those include :

Declaring the types of variables
Specifying how functions/methods are called and from where (def, cdef, cpdef)
Extension types : classes implemented with a C struct instead of a dict, which allows dispatch to be resolved at compile time rather than at runtime.

And finally you have the syntax to directly integrate C/C++ libraries and code.

Now on to the

tips and tricks …

Creating an array

Use a 1D array (the array type from cpython.array) instead of lists or NumPy arrays whenever you can.
I read somewhere that it is twice as fast as NumPy.
In addition, you can dynamically resize it in place with array.resize().

Here is the fastest way to create an empty/zeroed array. First you need to have array templates prepared :

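from cpython cimport array
import array  # the Cython docs pair the cimport with a regular import

cdef array.array iARY = array.array('i')  # integer
cdef array.array IARY = array.array('I')  # unsigned integer
cdef array.array fARY = array.array('f')  # float
cdef array.array dARY = array.array('d')  # double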

then :

cdef array.array ary = array.clone(fARY, size, True)  # zero=True zero-fills the new array

Other options are :

cdef ary = array.array('f')
array.resize(ary, size)
array.zero(ary)

A slower variant, which also works on other types :

cdef ary = array.array('f')
array.resize(ary, size)
ary[:] = 0
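
For comparison, the same kind of zero-filled array can be built in plain Python with the standard-library array module, without the Cython helpers. This is only a minimal sketch; the names size, zeros and zeros_from_bytes are illustrative and not part of the original code.

```python
from array import array

size = 8  # illustrative length, not from the original post

# Repeating a one-element array allocates the whole buffer in one step
zeros = array('f', [0.0]) * size

# Alternative: initialise from a zeroed byte buffer of the right length
zeros_from_bytes = array('f', bytes(size * zeros.itemsize))

print(zeros)
```

Both variants avoid building an intermediate Python list of size elements.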

Accessing array elements

Here are several ways to access the elements of an array, from slowest to fastest.

ary[i] = value
ary._f[i] = value
# fastest, because it accesses the union struct directly
ary.data.as_floats[i] = value

The last one sped up some portions of my code by ~30 times.

There are variations of the example above depending on the type :

_f, _i, _u …..
as_floats, as_ints, as_uints …..

A different len()

from cpython.object cimport Py_SIZE
#does not work on range()
cdef inline unsigned int clen(obj): return Py_SIZE(obj)

This generates cleaner code and should be faster. A cpdef'd version is slower, which is expected.

type vs isinstance

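If you have to do type checks, use "type(x) is …" instead of isinstance(), especially if you do several of them: compute type(x) once and reuse the result, as in the first timings below. Keep in mind that an exact type comparison does not match subclasses the way isinstance() does.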

: x=type(5) 

: %timeit x is int
27.2 ns ± 4.12 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit x is float
26.4 ns ± 0.731 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(5,int) 
52.6 ns ± 0.237 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(5,float) 
74.2 ns ± 1.28 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(x,int)                                                                                                                                           
71.5 ns ± 0.357 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(x,float)                                                                                                                                         
81 ns ± 1.32 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

---

: %timeit type(5) is int                                                                                                                                              
55 ns ± 0.487 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit type(5) == int                                                                                                                                              
57.6 ns ± 1.59 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit type(5) is float                                                                                                                                            
58 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
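
The two checks are not interchangeable, so the speed-up only applies when an exact type match is what you want. A small sketch of the difference, where MyInt is a hypothetical subclass used only for illustration:

```python
class MyInt(int):
    """Hypothetical int subclass, used only to illustrate the difference."""


value = MyInt(5)

# Exact type comparison: a subclass instance does not match
exact_match = type(value) is int

# isinstance() follows inheritance, so the subclass instance matches
inherited_match = isinstance(value, int)

# The same applies to bool, which is a subclass of int
bool_exact = type(True) is int
bool_inherited = isinstance(True, int)

print(exact_match, inherited_match, bool_exact, bool_inherited)
```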

2022-02-09 by myriad cython python 0

Python : How to check if variable is X ?

Here is how you can check the type of a variable :

In [240]: isinstance(5, int)                                                                                                                                                 
Out[240]: True

In [241]: var = 5                                                                                                                                                            

In [242]: isinstance(var, int)                                                                                                                                               
Out[242]: True

In [243]: isinstance(var, str)                                                                                                                                               
Out[243]: False

In [244]: isinstance([1,2,3], list)                                                                                                                                          
Out[244]: True

#is it one of many types
In [245]: isinstance([1,2,3], (list,tuple))                                                                                                                                  
Out[245]: True

Here is a function you can use to check if a variable is iterable :

def is_iter(x):
    try:
        iter(x)
        return True
    except TypeError:
        return False
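
Note that calling iter() tests whether an object is iterable, which is broader than being an iterator. The sketch below contrasts the two using collections.abc; the helper is_iterable and the class Legacy are illustrative names, not part of the original post.

```python
from collections.abc import Iterable, Iterator


def is_iterable(x):
    """Same idea as the function above: try iter() and catch TypeError."""
    try:
        iter(x)
        return True
    except TypeError:
        return False


class Legacy:
    """Hypothetical class that supports iteration only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


numbers = [1, 2, 3]

print(is_iterable(numbers))                 # a list is iterable ...
print(isinstance(numbers, Iterator))        # ... but it is not an iterator
print(isinstance(iter(numbers), Iterator))  # iter() returns an iterator
print(is_iterable(5))                       # an int is not iterable

# iter() also accepts old-style sequences that the Iterable ABC does not detect
print(is_iterable(Legacy()), isinstance(Legacy(), Iterable))
```

The try/except version catches the __getitem__ case, which is one reason to prefer it over an isinstance() check against Iterable.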

2022-01-29 by myriad python 0

Numpy : min/max of integer types

Here is how to find the MIN or MAX values for the Integer types in numpy :

In [235]: np.iinfo(np.int8)                                                                                                                                                  
Out[235]: iinfo(min=-128, max=127, dtype=int8)

In [236]: np.iinfo(np.int16)                                                                                                                                                 
Out[236]: iinfo(min=-32768, max=32767, dtype=int16)

In [237]: np.iinfo(np.int32)                                                                                                                                                 
Out[237]: iinfo(min=-2147483648, max=2147483647, dtype=int32)

In [238]: np.iinfo(np.int64)                                                                                                                                                 
Out[238]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)

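# np.int was an alias for the builtin int; it was deprecated in NumPy 1.20 and removed in 1.24, so pass int or np.int_ on newer versions
In [239]: np.iinfo(np.int)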
Out[239]: iinfo(min=-9223372036854775808, max=9223372036854775807, dtype=int64)
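
One practical use of these limits is checking whether a value fits into a dtype before casting. A minimal sketch, where fits_in is a hypothetical helper and not a NumPy function:

```python
import numpy as np


def fits_in(value, dtype):
    """Return True if the integer `value` is representable in `dtype`."""
    info = np.iinfo(dtype)
    return info.min <= value <= info.max


print(fits_in(127, np.int8))
print(fits_in(128, np.int8))
print(fits_in(255, np.uint8))
```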

2022-01-29 by myriad numpy 0

Pretty printing 2D python/numpy array

Below I show you a quick and dirty way to print a 2D array with column and row labels/indexes. It is often more convenient to have those available so you can easily track the results of operations visually.

First let's try with a NumPy array :

import numpy as np
import pandas as pd

a = np.random.randint(0,100,(5,5))

print(a)

print()
print(pd.DataFrame(a))


print()
print(pd.DataFrame(a,columns=['A','B','C','D','E']))

[[70 40 64 22 91]
 [82 41 35 42 19]
 [21  7 42 63 85]
 [26 43 23  1 34]
 [44 79 88 46 62]]

    0   1   2   3   4
0  70  40  64  22  91
1  82  41  35  42  19
2  21   7  42  63  85
3  26  43  23   1  34
4  44  79  88  46  62

    A   B   C   D   E
0  70  40  64  22  91
1  82  41  35  42  19
2  21   7  42  63  85
3  26  43  23   1  34
4  44  79  88  46  62

Of course it is similar for normal Python nested lists :

import numpy as np
import pandas as pd

b = [[1,2],[3,4]]

print()
print(pd.DataFrame(b,columns=['A','B']))

   A  B
0  1  2
1  3  4
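
If pandas is not installed, a similar labelled layout can be produced with plain string formatting. This is a rough sketch under the assumption of a non-empty, rectangular list of rows; format_table is a hypothetical helper name.

```python
def format_table(rows, columns):
    """Format a non-empty 2D list as text with column labels and row indexes."""
    cells = [[str(value) for value in row] for row in rows]
    index = [str(i) for i in range(len(rows))]
    index_width = max(len(label) for label in index)
    # Each column is as wide as its label or its widest cell
    widths = [
        max(len(columns[c]), *(len(row[c]) for row in cells))
        for c in range(len(columns))
    ]
    header = " " * index_width + "  " + "  ".join(
        columns[c].rjust(widths[c]) for c in range(len(columns))
    )
    lines = [header]
    for label, row in zip(index, cells):
        body = "  ".join(row[c].rjust(widths[c]) for c in range(len(columns)))
        lines.append(label.rjust(index_width) + "  " + body)
    return "\n".join(lines)


table = format_table([[1, 2], [3, 4]], ["A", "B"])
print(table)
```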

Here it is, if you are too lazy to type : https://onecompiler.com/python/3xm3ms6fb

2021-12-12 by myriad numpy python 0

Calculating Moving (Average) ++

If you have time series data and want to calculate a moving function such as a moving average, you can use a rolling window as shown below … enjoy

import numpy as np

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

a = np.arange(0,10)

print(rolling_window(a,5))

print(np.mean(rolling_window(a,5), axis=1))

-----
[[0 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 6]
 [3 4 5 6 7]
 [4 5 6 7 8]
 [5 6 7 8 9]]

[2. 3. 4. 5. 6. 7.]


b = np.random.randint(0,100,10)

print(rolling_window(b,5))

print(np.mean(rolling_window(b,5), axis=1))

-----
[[42 93 30 69 53]
 [93 30 69 53 93]
 [30 69 53 93 61]
 [69 53 93 61 22]
 [53 93 61 22 53]
 [93 61 22 53 71]]

[57.4 67.6 61.2 59.6 56.4 60. ]
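
On NumPy 1.20 or newer, the same windows can be obtained with numpy.lib.stride_tricks.sliding_window_view, which computes the shape and strides for you and returns a read-only view. A short sketch, assuming such a NumPy version is installed:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(0, 10)

# One row per window position: shape (len(a) - 5 + 1, 5)
windows = sliding_window_view(a, 5)

# Moving average over each window
means = windows.mean(axis=1)

print(windows.shape)
print(means)
```

Because the result is read-only by default, it avoids the accidental out-of-bounds writes that a hand-built as_strided view can allow.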

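Here is another way to compute a moving average. It uses a cumulative sum: the sum of each window of n elements is the difference between two cumulative sums that are n positions apart.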

import numpy as np

def moving_average(a, n=3) :
    ret = np.cumsum(a, dtype=float)
    ret[n:] = ret[n:] - ret[:-n]
    return ret[n - 1:] / n

print(moving_average([1,2,3,4,5,4,3,2,1]))

-------

[2.0 3.0 4.0 4.33 4.0 3.0 2.0]

2021-11-29 by myriad numpy python 0

Get the size of data structures in Python

If you want to measure how much memory a data structure or object occupies, try the function below.

import sys
    
def get_size(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if isinstance(obj, dict):
        size += sum([get_size(v, seen) for v in obj.values()])
        size += sum([get_size(k, seen) for k in obj.keys()])
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum([get_size(i, seen) for i in obj])
    return size
    

print(get_size(list(range(100))))
print(get_size(list(range(1000))))
print(get_size([[ i for i in range(10)] for j in range(100)]))
print(get_size([[ i for i in range(100)] for j in range(10)]))


-----
3804
37108
20388
12108
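
The recursion is needed because sys.getsizeof() on its own is shallow: for a container it counts the container object but not the objects it refers to. A small sketch of the difference, with illustrative variable names:

```python
import sys

nested = [list(range(100)) for _ in range(10)]

# Shallow size: only the outer list and its 10 references
shallow = sys.getsizeof(nested)

# Adding the inner lists (still ignoring the integers they hold)
with_inner_lists = shallow + sum(sys.getsizeof(inner) for inner in nested)

print(shallow, with_inner_lists)
```

The exact byte counts depend on the Python version and platform, so compare the numbers relative to each other rather than against the figures above.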

2021-11-29 by myriad python 0
