
Call method by String or Reference

Sometimes you need to call a method but don’t know in advance which one, i.e. you want to postpone the decision until runtime.
There are two basic ways to do that :

  • Store the method name as a String and build the “call” on the fly when you need it
  • Store a reference to a method in a variable and then use language specific syntax to invoke the method

For example, let's say you have written multiple standalone tests and you want to run different combinations of them depending on the need. In other words, you want to parameterize method calls.

Something along these lines :

run_tests -suite=3,5,8
run_tests -suite=2..5

If you have a few combinations you can hard code them, but it is more flexible to “pick” the tests/methods on the fly.

This is just one example of the need for selecting the function/method at the last moment. In most cases you prepare or store the name in a String.

What follows are examples of how to do this in several languages.

Python

There are many different ways to call a function or a method indirectly. Below you can see most of them.

#shortcut of : def say(*a): print(*a)
say = print

#using eval()
eval("say")("hello")
say_hi = "say('hi')"
eval(say_hi)

#call a defined function via locals()
locals()["say"]('boo hoo')

#via reference stored in a dict
funs = { 'sayit' : say }
funs['sayit']('sayit')

# ----- and now ... -----

#methods in a string
class StringMethod:
  
  def say(self, *args) : print(*args)
  
  
sm = StringMethod()

#"create" a method
method = getattr(sm,"say")
method("howdy")

#direct assignment
sm_say = sm.say
sm_say('whats sup')

#unbound
sayit = StringMethod.say
sayit(sm,'bye')


-----

hello
hi
boo hoo
sayit
howdy
whats sup
bye
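
Coming back to the run_tests idea from the introduction, here is a minimal sketch of how a suite string could be expanded and dispatched by name. The names parse_suite, run_tests, make_test and the test_N naming scheme are assumptions made for illustration; they do not belong to any existing tool.

```python
import re


def make_test(i):
    # stand-in for a real standalone test (hypothetical)
    def test():
        return f"test_{i} ok"
    return test


# registry : method name (String) -> function reference
TESTS = {f"test_{i}": make_test(i) for i in range(1, 9)}


def parse_suite(spec):
    """Expand '3,5,8' or '2..5' into a list of test numbers."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        m = re.fullmatch(r"(\d+)\.\.(\d+)", part)
        if m:
            # inclusive range, e.g. 2..5 -> 2,3,4,5
            ids.extend(range(int(m.group(1)), int(m.group(2)) + 1))
        else:
            ids.append(int(part))
    return ids


def run_tests(spec, registry):
    """Look every test up by its name and call it; unknown names are skipped."""
    results = []
    for i in parse_suite(spec):
        func = registry.get(f"test_{i}")
        if callable(func):
            results.append(func())
    return results


print(run_tests("3,5,8", TESTS))
print(run_tests("2..5", TESTS))
```

Keeping the functions in an explicit dict avoids eval() and limits what a user-supplied string can reach.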

Java

In Java you have to use Reflection. You do it in two steps :

  • First you create a Method out of the String and a description of the parameters
  • Then you call the .invoke() method with the real values

import java.util.*;
import java.lang.reflect.*;


public class MethodReflection {

  public static <ARG> void say(ARG arg) { System.out.println(arg); }

  public void say_hi() { say("hi"); }
  private void sayit(String str) { say(str); }
  
  public void test() throws Exception {
    //get the class that holds the methods you want to call
    Class<?> klass = MethodReflection.class;//Class.forName("MethodReflection");
    
    //get a method w/o arguments
    Method m1 = klass.getMethod("say_hi");// or getDeclaredMethod()
    m1.invoke(this);
    
    //private method requires getDeclaredMethod(), also using one String argument
    Method m2 = klass.getDeclaredMethod("sayit", String.class);
    m2.invoke(this, "hello");
    
    //unknown argument
    Method m3 = klass.getMethod("say", Object.class);
    //first argument null for static method
    m3.invoke(null, "howdy");
    
  }

  public static void main(String[] args) throws Exception {
    MethodReflection mc = new MethodReflection(); 
    mc.test();
  }
}

-----

hi
hello
howdy

JavaScript

say = console.log

eval("say('hi')")

sayit = "say"
//window is the global object in browsers only (elsewhere use globalThis)
window[sayit]('sayit')

-----

hi
sayit

Padding values …

Here is an example of how to pad values with fill characters :

public class Main {
  
  public static <T> String pad(T str, int n, String right_pad, String filler) {
    return String.format("%1$" + right_pad + n + "s", str).replaceAll(" ", filler);
  }
  
  public static <T> String pad_left(T str, int n, String filler) {
    return pad(str, n, "", filler);
  }  
  public static <T> String pad_right(T str, int n, String filler) {
    return pad(str, n, "-", filler);
  }  
  
  public static <T> String pad_left(T str, int n) {
    return pad(str, n, "", " ");
  }  
  public static <T> String pad_right(T str, int n) {
    return pad(str, n, "-", " ");
  }  
  
  public static void main(String[] args) {
      System.out.println("pad(5,3)     : " + pad_left(5,3));
      System.out.println("pad(5,3,\"0\") : " + pad_left(5,3,"0"));
      System.out.println("pad(5,3,\"-\") : " + pad_right(5,3,"-"));
  }
}

-------

pad(5,3)     :   5
pad(5,3,"0") : 005
pad(5,3,"-") : 5--

Calculating date difference

Here is a quick way to calculate the difference between two dates, counted in days.
You can easily modify it to use a different TimeUnit.

import java.util.*;
import java.util.concurrent.TimeUnit;

public class Main {
  
  public static int days_diff(Date from, Date to) {
    long time_diff = Math.abs(to.getTime() - from.getTime() );
    int days = (int) TimeUnit.DAYS.convert(time_diff, TimeUnit.MILLISECONDS);
    return days;
  }

  public static void main(String[] args) {
      Date now = new Date();
      //use year - 1900, and month counting starts from Zero
      Date past = new Date(2022 - 1900, 3 - 1,1);
      
      System.out.println("past: " + past);
      System.out.println("now: " + now);
      System.out.println("Days difference : " + days_diff(past,now));
  }
}

-----------------------

past: Tue Mar 01 00:00:00 UTC 2022
now: Wed Mar 23 14:24:54 UTC 2022
Days difference : 22

Managing nested structures in Java

Handling nested structures in Java is kind of a nightmare. For this reason I created EasyLoL.

LoL is an abbreviation of List-of-Lists, a general name for nested structures.

Imagine having to build a structure like the one below in pure Java, and how many keystrokes you would have to waste. No more: just use EasyLoL and you will have the last laugh …

a : >
  b : >
    c : ccc
    d : ddd
  f : >
    c : fff2
    e : [abra,cadabra]
bcd : bcd
abc : abc
structs : >
  list : [v1,v2,v3]
  hash : >
    k1 : v1
    k2 : v2
    k3 : v3

The module extends HashMap and builds on top of it.

It provides fetch(), set() and delete() methods that use a dotted syntax to access or create an element anywhere in the hierarchy, autovivifying all the intermediate elements if necessary.

autovivify : is the automatic creation of new arrays and hashes as required every time an undefined value is dereferenced.
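
To make the idea concrete outside of Java, here is a minimal Python sketch of dotted-key access with autovivification over plain dicts. It is not a port of EasyLoL; set_path and fetch_path are hypothetical helper names.

```python
def set_path(root, dotted_key, value):
    """Set value at 'a.b.c', creating intermediate dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = root
    for key in parents:
        # autovivify : create the missing level on first use
        node = node.setdefault(key, {})
        if not isinstance(node, dict):
            raise TypeError(f"can't descend into non-dict at sub-key : {key}")
    node[leaf] = value


def fetch_path(root, dotted_key):
    """Return the value at 'a.b.c', or None if any level is missing."""
    node = root
    for key in dotted_key.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


data = {}
set_path(data, "a.b.c", "ccc")
set_path(data, "a.b.d", "ddd")
print(fetch_path(data, "a.b"))    # {'c': 'ccc', 'd': 'ddd'}
print(fetch_path(data, "a.x.y"))  # None
```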

The module also supports a quick way to create and insert a hash or list element on the fly : encode it as a string and prepend a special character ( > for a hash, ] for a list ), so that the module can figure out which one you want to create.
Here is an example :

//hash in a string
e.set("structs.hash", ">k1:v1,k2:v2,k3:v3");
// list in a string
e.set("structs.list", "]v1,v2,v3");	

Below you can see the class and examples of how to use it.
You can also use the dump() function I described in a previous post to pretty print the data structure.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Easy way to Create and Access hierarchical structures
public class EasyLoL extends HashMap<String, Object> {
	
		public static <ARG> void say(ARG arg) { System.out.println(arg); }

		//extract value by going through the hierarchy
		private Object _by_key(String[] keys, HashMap val) {
			Object rv = null;
			if ( val.containsKey(keys[0])) { rv = val.get(keys[0]); }
			else {
				//log.debug("! by_key> no such key : " + keys[0]);
				return null; 
			}
			// dig deeper, (skim the first element of String keys)
			if (rv instanceof HashMap && keys.length > 1 ) return _by_key(Arrays.copyOfRange(keys, 1, keys.length), (HashMap) rv);
			if (keys.length > 1) return null; //still more keys, but no more data that deep 
			return rv;
		}
		
		//helper method for LoL objects access
		public Object by_key(String key) {
			String[] keys = key.split("\\.");
			return _by_key(keys, this);
		}
		
		//Getter methods ....
		public Object find(String key) { return key.contains(".") ? by_key(key) : get(key); }
			
		public String gets(String key) {
			Object obj = find(key);
			if (obj instanceof String) return (String) obj;
			if (obj == null) return null;
			return obj.toString();
	 	}
		
		public void delete(String key) {
			int i = key.lastIndexOf('.') ;
			if (i == -1) return;
			String upto = key.substring(0,i);
			String last = key.substring(i+1);
			EasyLoL prev = fetch(upto);
			prev.remove(last);
		}
		
		public <T> T fetch(String key) {
			Object obj = find(key);
			if (obj == null) return null;
			return (T) obj;
	 	}
		
		public ArrayList by_keys(String ... keys) {
			ArrayList<String> lst = new ArrayList<>();
			for (String key : keys) lst.add(gets(key));
			return lst;
		}
		
		//converts string to Hash
		public HashMap<String,String> str2lol(String str) {
			HashMap<String,String> hash = new HashMap<String,String>();
			String[] kvs = str.split("[,:]");
			for (int i=0; i < kvs.length; i += 2) hash.put(kvs[i], kvs[i+1]);
			return hash;
		}
		
		//handles setting complex types
		private <T> void insert(EasyLoL lol, String k, T value) {
			//hash encoded in string
			if (value instanceof String) {
				if (((String) value).startsWith(">")) {
					HashMap<String,String> val = str2lol( ((String) value).substring(1) );
					lol.put(k, val);
					return;
				} else if (((String) value).startsWith("]")) {
					String val = ((String) value).substring(1);
					lol.put(k, Arrays.asList(val.split(",")));
					return;
				}
			} 
			lol.put(k, value);			
		}
		
		public <T> void set(String key, T value) throws Exception { this.set(key, value, true); }
		
		public <T> void set(String key, T value, boolean silent) throws Exception {
			
			//no hierarchy
//			if (! (key.contains(".") && this.containsKey(key))) {
//				this.put(key, value);
//			}
			
			String[] keys = key.split("\\.");
			//start point
			EasyLoL lol = this;
			int pos = 0;//are we at a leaf 
			//incrementally vivify the keys
			for (String k : keys) {
				if (lol.containsKey(k)) {
					if (lol.get(k) instanceof Map) {
						lol = (EasyLoL) lol.get(k);//extend
					} else {
						if (pos >= keys.length-1) {
							insert(lol,k,value);//overwrite
							return;
						}
						if (silent) return;
						throw new Exception("using : " + key + " ! Can't overwrite the structure at sub-key : " + k);
					}
				} else {//autovivify
					if (pos < keys.length-1) {//Intermediary keys
						lol.put(k, new EasyLoL());
						//switch/move one level down
						lol = (EasyLoL) lol.get(k);
					} else {//tree leaf, set the value
						insert(lol,k,value);
					}
				}
				pos++;
			}
	 	}
		
		
	public static void main(String[] args) throws Exception {
		EasyLoL e = new EasyLoL();
		e.set("abc", "abc");
		//Hierarchical set
		e.set("a.b.c", "ccc");
		e.set("a.b.d", "ddd");

		e.set("a.f.c", "fff");
		//overwrite
		e.set("a.f.c", "fff2");
		//conflict : structure already in place
		e.set("a.f.c.x", "fffx");//silent
		//e.set("a.f.c.x", "fffx",false);//throw error
		
		//setting java structure
		ArrayList ary = new ArrayList() {{
			add("abra");
			add("cadabra");
		}};
		e.set("a.f.e", ary);
		
		e.set("bcd", "bcd");
		//hash in a string
		e.set("structs.hash", ">k1:v1,k2:v2,k3:v3");
		// list in a string
		e.set("structs.list", "]v1,v2,v3");
		
		say("------------------------------------------");
		say("a.b : " + e.fetch("a.b"));
		say("a.b.c : " + e.fetch("a.b.c"));
		say("structs.list : " + e.fetch("structs.list"));
		say("structs.hash : " + e.fetch("structs.hash"));
//		e.delete("a.b");
		say("==========================================");
//		say(utils.dump(e));
	}

}

------------------------------------------
a.b : {c=ccc, d=ddd}
a.b.c : ccc
structs.list : [v1, v2, v3]
structs.hash : {k1=v1, k2=v2, k3=v3}
==========================================
a : >
  b : >
    c : ccc
    d : ddd
  f : >
    c : fff2
    e : [abra,cadabra]
bcd : bcd
abc : abc
structs : >
  list : [v1,v2,v3]
  hash : >
    k1 : v1
    k2 : v2
    k3 : v3

Slices, Ranges and __getitem__()

One annoyance when implementing __getitem__() is figuring out what the slice syntax actually passes in. The easiest way is to try it interactively. Below you can see an example.

First you create a dummy class with a simple method that prints its arguments.

Then you can experiment.

class Blah: 
  def __getitem__(self,*args): 
    print(args) 
    return args

b = Blah()
b[1]
b[2:5]
b[::2,2:11:2]

s = b[::3]

print()

#filling the indices
ind = s[0].indices(12)
print(f'indices: {s[0]} => {ind}')

#make it a range
rng = range(*s[0].indices(12))
print(f'range: {rng}')

for i in rng: print(i)

-----

(1,)
(slice(2, 5, None),)
((slice(None, None, 2), slice(2, 11, 2)),)
(slice(None, None, 3),)

indices: slice(None, None, 3) => (0, 12, 3)
range: range(0, 12, 3)
0
3
6
9

In addition, inside your method you can convert a slice to a range by using the .indices() method, and then loop over that range to do the processing.
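
As a minimal sketch of that last point, the class below accepts both plain indexes and slices. Squares is a made-up example class, not something from a library.

```python
class Squares:
    """Virtual sequence of i*i for i in 0..n-1; nothing is stored."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        if isinstance(key, slice):
            # .indices() fills in the missing start/stop/step for this length
            return [i * i for i in range(*key.indices(self.n))]
        if isinstance(key, int):
            if key < 0:
                key += self.n  # support negative indexes
            if not 0 <= key < self.n:
                raise IndexError(key)
            return key * key
        raise TypeError(f"unsupported index type : {type(key).__name__}")


sq = Squares(10)
print(sq[3])     # 9
print(sq[2:5])   # [4, 9, 16]
print(sq[::3])   # [0, 9, 36, 81]
```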

Incremental average

There are three ways to calculate an Average depending on the way we receive the data :

  1. Basic : we have all the data. In this case we just use the basic well known formula : $$Avg = \frac{1}{n} \sum_{i=1}^n x_i$$
  2. Moving average : calculated using a rolling window
  3. Incremental average : the one that we will discuss now

The idea is to calculate the Basic average at every step w/o recalculating it from the whole sequence, i.e. the data arrives one value per time step.

Here is the formula :

$$a_n = a_{n-1} + \frac{x_n - a_{n-1}}{n}$$
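
The formula follows from the basic average by splitting off the last value. A short derivation, with $a_n$ denoting the average of the first $n$ values :

$$a_n = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{(n-1)\,a_{n-1} + x_n}{n} = a_{n-1} + \frac{x_n - a_{n-1}}{n}$$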

Here is how you can use it as a Python closure, so that you don’t have to carry the state around :

def iavg():
  avg = 0
  n = 0
  def calc(value):
    nonlocal n,avg
    n += 1
    avg = avg + ((value - avg) / n)
    return avg
  return calc
  
avg = iavg()
print(f'2 => {avg(2)}')
print(f'4 => {avg(4)}')
print(f'6 => {avg(6)}')

-----

2 => 2.0
4 => 3.0
6 => 4.0

# (2+4+6) / 3 = 12/3 = 4
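
For contrast, the moving average from the list above can be written as a closure in the same style. This is a minimal sketch; the name mavg and the fixed-size deque window are my own choices for illustration.

```python
from collections import deque


def mavg(window):
    # the deque drops the oldest value once it holds `window` items
    values = deque(maxlen=window)

    def calc(value):
        values.append(value)
        return sum(values) / len(values)

    return calc


avg2 = mavg(2)
print(f'2 => {avg2(2)}')  # 2.0
print(f'4 => {avg2(4)}')  # 3.0 : (2+4)/2
print(f'6 => {avg2(6)}')  # 5.0 : (4+6)/2, the 2 has left the window
```

Unlike the incremental average, this one has to keep the last `window` values around.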

Timing code execution

Python has the timeit module to test how long a piece of code takes to execute. On the other hand, the IPython %timeit magic gives much more useful information.

$ python3 -m timeit -s "5 == 55"
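100000000 loops, best of 3: 0.00543 usec per loop

Note that -s marks the code as setup, so the command above actually times the default pass statement; drop the -s to time the comparison itself.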

In [23]: %timeit 5 == 55                                                                                                                                                     
18.3 ns ± 0.393 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
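
A similar mean/std. dev. summary can be approximated with the standard library alone. This is a minimal sketch, assuming a hypothetical helper name mean_and_std and a fixed loop count instead of IPython's automatic calibration :

```python
import statistics
import timeit


def mean_and_std(stmt, setup="pass", number=100_000, repeat=7):
    """Run `stmt` number times per run, for `repeat` runs; return per-loop stats."""
    runs = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    per_loop = [t / number for t in runs]
    # population std. dev. (divide by the number of runs)
    return statistics.mean(per_loop), statistics.pstdev(per_loop)


mean, std = mean_and_std("5 == 55")
print(f"{mean * 1e9:.1f} ns +- {std * 1e9:.2f} ns per loop")
```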

Luckily @Numerlor was kind enough to extract the functionality from IPython for us :

In [26]: nice_timeit('5 == 55')                                                                                                                                              
18.4 ns +- 0.328 ns per loop (mean +- std. dev. of 7 runs, 10000000 loops each)

import math
import timeit


def _format_time(timespan, precision=3):
    """Formats the timespan in a human readable form"""
    units = ["s", "ms", "\xb5s", "ns"]
    scaling = [1, 1e3, 1e6, 1e9]
    if timespan > 0.0:
        order = min(-int(math.floor(math.log10(timespan)) // 3), 3)
    else:
        order = 3
    scaled_time = timespan * scaling[order]
    unit = units[order]
    return f"{scaled_time:.{precision}g} {unit}"


class TimeitResult(object):
    """
    Object returned by the timeit magic with info about the run.

    Contains the following attributes :

    loops: (int) number of loops done per measurement
    repeat: (int) number of times the measurement has been repeated
    best: (float) best execution time / number
    all_runs: (list of float) execution time of each run (in s)
    compile_time: (float) time of statement compilation (s)
    """

    def __init__(self, loops, repeat, best, worst, all_runs, compile_time, precision):
        self.loops = loops
        self.repeat = repeat
        self.best = best
        self.worst = worst
        self.all_runs = all_runs
        self.compile_time = compile_time
        self._precision = precision
        self.timings = [dt / self.loops for dt in all_runs]

    @property
    def average(self):
        return math.fsum(self.timings) / len(self.timings)

    @property
    def stdev(self):
        mean = self.average
        return (
            math.fsum([(x - mean) ** 2 for x in self.timings]) / len(self.timings)
        ) ** 0.5

    def __str__(self):
        return "{mean} {pm} {std} per loop (mean {pm} std. dev. of {runs} run{run_plural}, {loops} loop{loop_plural} each)".format(
            pm="+-",
            runs=self.repeat,
            loops=self.loops,
            loop_plural="" if self.loops == 1 else "s",
            run_plural="" if self.repeat == 1 else "s",
            mean=_format_time(self.average, self._precision),
            std=_format_time(self.stdev, self._precision),
        )


def nice_timeit(
    stmt="pass",
    setup="pass",
    number=0,
    repeat=None,
    precision=3,
    timer_func=timeit.default_timer,
    globals=None,
):
    """Time execution of a Python statement or expression."""

    if repeat is None:
        repeat = 7 if timeit.default_repeat < 7 else timeit.default_repeat

    timer = timeit.Timer(stmt, setup, timer=timer_func, globals=globals)

    # Get compile time
    compile_time_start = timer_func()
    compile(timer.src, "<timeit>", "exec")
    total_compile_time = timer_func() - compile_time_start

    # This is used to check if there is a huge difference between the
    # best and worst timings.
    # Issue: https://github.com/ipython/ipython/issues/6471
    if number == 0:
        # determine number so that 0.2 <= total time < 2.0
        for index in range(0, 10):
            number = 10 ** index
            time_number = timer.timeit(number)
            if time_number >= 0.2:
                break

    all_runs = timer.repeat(repeat, number)
    best = min(all_runs) / number
    worst = max(all_runs) / number
    timeit_result = TimeitResult(
        number, repeat, best, worst, all_runs, total_compile_time, precision
    )

    # Check best timing is greater than zero to avoid a
    # ZeroDivisionError.
    # In cases where the slowest timing is lesser than a microsecond
    # we assume that it does not really matter if the fastest
    # timing is 4 times faster than the slowest timing or not.
    if worst > 4 * best and best > 0 and worst > 1e-6:
        print(
            f"The slowest run took {worst / best:.2f} times longer than the "
            f"fastest. This could mean that an intermediate result "
            f"is being cached."
        )

    print(timeit_result)

    if total_compile_time > 0.1:
        print(f"Compiler time: {total_compile_time:.2f} s")
    return timeit_result


# nice_timeit("time.sleep(0.3)", "import time")

# IPython license
# BSD 3-Clause License
#
# - Copyright (c) 2008-Present, IPython Development Team
# - Copyright (c) 2001-2007, Fernando Perez <fernando.perez@colorado.edu>
# - Copyright (c) 2001, Janko Hauser <jhauser@zscout.de>
# - Copyright (c) 2001, Nathaniel Gray <n8gray@caltech.edu>
#
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice, this
#   list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright notice,
#   this list of conditions and the following disclaimer in the documentation
#   and/or other materials provided with the distribution.
#
# * Neither the name of the copyright holder nor the names of its
#   contributors may be used to endorse or promote products derived from
#   this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Cython: Tips and tricks

If you have more tricks, comment below …

The last couple of days I started learning Cython …

What is Cython ?

Cython is a compiled language that seamlessly integrates C with Python, plus more …

C, Cython, Python

The nice thing about it is that the learning curve is very gradual. You can start w/o even changing your Python code.
Simply compiling it may speed up your script.

The next step is to start using Cython-the-language constructs.
Those include :

  • Declaring the type of the variables
  • Specifying who can call a function/method and how (def, cdef, cpdef)
  • Extension types : Classes which are implemented using Struct, instead of Dict allowing the dispatch resolution to happen at compile time, rather than runtime.

And finally you have the syntax to integrate C/C++ libraries and code directly.

Now on to the tips and tricks …

Creating an array

Use a 1D array instead of a list or a numpy array whenever you can.
I read somewhere that it is twice as fast as numpy.
In addition you can dynamically .resize() it in-place.
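
The array meant here is the standard library array.array, which Cython can reach at C level through cpython.array. As a plain-Python reference point (no Cython involved, so none of the speed-ups apply), a minimal sketch :

```python
from array import array

size = 5
zeros = array("f", [0.0]) * size   # five single-precision zeros
zeros[2] = 1.5
zeros.extend([2.5, 3.5])           # grows in place
print(zeros.typecode, zeros.itemsize)
print(zeros.tolist())              # [0.0, 0.0, 1.5, 0.0, 0.0, 2.5, 3.5]
```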

Here is the fastest way to create an empty/zeroed array. First you need to have array templates prepared :

from cpython cimport array

cdef iARY = array.array('i') #integer
cdef IARY = array.array('I') #unsigned integer
cdef fARY = array.array('f') #float
cdef dARY = array.array('d') #double

then :

cdef ary = array.clone(fARY, size, 1)

Other options are :

cdef ary = array.array('f')
array.resize(ary, size)
array.zero(ary)

slower variant, but works on other types too :

cdef ary = array.array('f')
array.resize(ary, size)
ary[:] = 0

more on this here : Fast zero’ing

Accessing array elements

Here are several ways to access elements of an array … from slower to faster.

ary[i] = value
ary._f[i] = value
#fastest, because it accesses the union struct directly
ary.data.as_floats[i] = value

The last one sped up some portions of my code ~30 times.

There are variations of the example above depending on the type :

  • _f, _i, _u …..
  • as_floats, as_ints, as_uints …..

A different len()

from cpython.object cimport Py_SIZE
#does not work on range()
cdef inline unsigned int clen(obj): return Py_SIZE(obj)

This generates cleaner code, so it should be faster. A cpdef’d version is slower, which is expected.

type vs isinstance

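If you have to do type checks, use “type is …” instead of isinstance(), especially if you do several of them on the same value : the type can be looked up once and then compared cheaply. Keep in mind that “type is …” does not match subclasses, while isinstance() does.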

: x=type(5) 

: %timeit x is int
27.2 ns ± 4.12 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit x is float
26.4 ns ± 0.731 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(5,int) 
52.6 ns ± 0.237 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(5,float) 
74.2 ns ± 1.28 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(x,int)                                                                                                                                           
71.5 ns ± 0.357 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit isinstance(x,float)                                                                                                                                         
81 ns ± 1.32 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

---

: %timeit type(5) is int                                                                                                                                              
55 ns ± 0.487 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit type(5) == int                                                                                                                                              
57.6 ns ± 1.59 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

: %timeit type(5) is float                                                                                                                                            
58 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
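
The two checks are not interchangeable when subclasses are involved. A small sketch of the difference, plus the "look the type up once" pattern; classify is a hypothetical helper name :

```python
flag = True                    # bool is a subclass of int

print(type(flag) is int)       # False : exact type only
print(isinstance(flag, int))   # True  : subclasses match too


def classify(value):
    t = type(value)            # looked up once, compared several times
    if t is int:
        return "int"
    if t is float:
        return "float"
    if t is str:
        return "str"
    return "other"


print(classify(5), classify(2.0), classify(True))  # int float other
```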

Python : How to check if variable is X ?

Here is how you can check the type of a variable :

In [240]: isinstance(5, int)                                                                                                                                                 
Out[240]: True

In [241]: var = 5                                                                                                                                                            

In [242]: isinstance(var, int)                                                                                                                                               
Out[242]: True

In [243]: isinstance(var, str)                                                                                                                                               
Out[243]: False

In [244]: isinstance([1,2,3], list)                                                                                                                                          
Out[244]: True

#is it one of many types
In [245]: isinstance([1,2,3], (list,tuple))                                                                                                                                  
Out[245]: True

Here is a function you can use to check if a variable is iterable, i.e. if iter() accepts it :

def is_iter(x):
  try:
    iter(x); return True
  except TypeError: return False
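
A short usage sketch. The function is repeated so that the block runs on its own, and is_iterator is an extra helper I added to show the difference between something you can iterate over and an iterator proper :

```python
from collections.abc import Iterator


def is_iter(x):
    try:
        iter(x)
        return True
    except TypeError:
        return False


def is_iterator(x):
    # an iterator also implements __next__, not just __iter__
    return isinstance(x, Iterator)


print(is_iter([1, 2, 3]))            # True
print(is_iter("abc"))                # True, strings are iterable too
print(is_iter(5))                    # False
print(is_iterator([1, 2, 3]))        # False, a list is only iterable
print(is_iterator(iter([1, 2, 3])))  # True
```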