Learn Python: Structures & Classes

Sets

You would probably not use sets very often, but when you need them you need them badly.

As usual, let's do the basics.

First, we can create sets in different ways, some of them shown below.
We can do the usual operations between sets, like union, intersection, difference and symmetric difference. All of them can also be performed using method calls.

Below is a list of all the Set methods.

#creating a set
s1 = set({1,2,3,8})
s2 = {2,4,6}
s3 = set(range(1,10))

print("length: ",len(s1))
print(s1)
print(s2)
print(s3)

# .union()
print("s1 | s2 : ", s1 | s2)
#.intersection()
print("s1 & s2 : ", s1 & s2)
#.symmetric_difference()
print("s1 ^ s3 : ", s1 ^ s3)
-----
Output:

length:  4
{8, 1, 2, 3}
{2, 4, 6}
{1, 2, 3, 4, 5, 6, 7, 8, 9}
s1 | s2 :  {1, 2, 3, 4, 6, 8}
s1 & s2 :  {2}
s1 ^ s3 :  {4, 5, 6, 7, 9}
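
To make the "operators or method calls" point concrete, here is a small sketch that compares both forms side by side. It also shows plain difference (-) next to symmetric difference (^), since the two are easy to mix up. The variable names are just for illustration.

```python
s1 = {1, 2, 3, 8}
s2 = {2, 4, 6}

# every operator has a method twin that gives the same result
union_ok = (s1 | s2) == s1.union(s2)
inter_ok = (s1 & s2) == s1.intersection(s2)
diff_ok = (s1 - s2) == s1.difference(s2)
sym_ok = (s1 ^ s2) == s1.symmetric_difference(s2)
print(union_ok, inter_ok, diff_ok, sym_ok)

# difference: elements that are only in s1
print("s1 - s2 :", s1 - s2)
# symmetric difference: elements that are in exactly one of the two sets
print("s1 ^ s2 :", s1 ^ s2)
```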

to do .. need to find a good example
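
One candidate, offered only as a sketch: removing duplicates and comparing two groups, which is where sets tend to pay off. The visitor data below is made up for the illustration.

```python
# hypothetical data: names of visitors on two days
monday = ["ann", "bob", "ann", "cid", "bob"]
tuesday = ["bob", "dan", "dan", "ann"]

unique_monday = set(monday)                 # duplicates disappear
both_days = unique_monday & set(tuesday)    # came on both days
only_monday = unique_monday - set(tuesday)  # came on Monday only

print(len(monday), "visits,", len(unique_monday), "unique visitors")
print("both days   :", sorted(both_days))
print("only monday :", sorted(only_monday))
# membership test, fast even for big sets
print("dan came on monday ?", "dan" in unique_monday)
```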



Set methods

Method                         Description
add()                          Adds an element to the set
clear()                        Removes all the elements from the set
copy()                         Returns a copy of the set
difference()                   Returns a set containing the difference between two or more sets
difference_update()            Removes the items in this set that are also included in another, specified set
discard()                      Removes the specified element if it is present (no error if it is missing)
intersection()                 Returns a set that is the intersection of two or more sets
intersection_update()          Removes the items in this set that are not present in the other, specified set(s)
isdisjoint()                   Returns whether two sets have no elements in common
issubset()                     Returns whether another set contains this set or not
issuperset()                   Returns whether this set contains another set or not
pop()                          Removes and returns an arbitrary element from the set
remove()                       Removes the specified element (raises KeyError if it is missing)
symmetric_difference()         Returns a set with the symmetric difference of two sets
symmetric_difference_update()  Updates this set with the symmetric difference of itself and another set
union()                        Returns a set containing the union of sets
update()                       Updates the set with the union of this set and others
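
A few of these methods in action, as a short sketch. The part worth remembering is the difference between discard() and remove() when the element is not there.

```python
s = {1, 2, 3}

s.add(4)             # {1, 2, 3, 4}
s.update({5, 6})     # add several elements at once
s.discard(10)        # 10 is not in the set: nothing happens

try:
    s.remove(10)     # same request, but remove() complains
    remove_failed = False
except KeyError:
    remove_failed = True

print(s)
print("remove() raised KeyError :", remove_failed)
print("is {1, 2} a subset ?", {1, 2}.issubset(s))
print("nothing in common with {7, 8} ?", s.isdisjoint({7, 8}))
```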

Lists

Classic (linked) lists are like a chain where at every connection joint you have a node which holds “data”. Access to the nodes is sequential, i.e. to reach the N’th node we have to walk through the N-1 nodes before it. This makes element access in such lists slow.
Maybe that is the reason why Python Lists behave both like a List and like an Array. Internally they are implemented as an over-allocated Array, not as a linked list, so access by index is fast.

(If you don’t know what an array is, check the Learning programming section.) Python has a dedicated array type only as a separate module (array), which you have to import.

This explains why they have a mixed API : you can append, insert and delete elements like in a list, but we also get support for array slicing.
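
If you are curious, the over-allocation can be observed, at least roughly. The sketch below assumes the standard CPython interpreter, where sys.getsizeof() reports the memory reserved for the list object; the exact numbers differ between versions, so only the pattern matters. The second half shows the separate array module.

```python
import sys
from array import array

lst = []
sizes = []
for i in range(20):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))  # bytes reserved for the list object

# the size grows in jumps, not on every append (CPython detail)
distinct_sizes = sorted(set(sizes))
print(len(sizes), "appends, but only", len(distinct_sizes), "different sizes")

# the array module: like a list, but all elements share one type
arr = array('i', [1, 2, 3, 4])   # 'i' means signed integers
arr.append(5)
print(arr[1:3])                  # slicing works here too
```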

Let's start with several examples of how to initialize Lists.

Then we define the apply() function, which looks pretty similar to the count() function we discussed earlier, but there is something different. We had no notion of Lists earlier, we only knew about scalar variables. So our only choice was to print the result at the end if we wanted to return multiple values. Now we can collect the values in a list by using the append() method and return them as a single variable. A List variable instead of a scalar variable.
Also, the loop this time “consumes” the source list elements one by one and assigns them to the loop variable, similar to the way it worked for the range.

#initialize lists
empty = []
lst = [1,2,3,4]
odd = list(range(1,11,2))
even = list(range(2,13,2))
mix = [5,'hi','there',6,9]
LoL = [1,2,[3,4],5,6,[7]]

print('odd : ', odd)
print('even : ', even)

def mult2(a): return a*2

#returns list instead of printing
def apply(fun, lst):
  rv = []  # this will be our return value
  for i in lst :
    res = fun(i)
    #accumulate the results
    rv.append(res)
  return rv  

#multiply by 2 every element
print('*2 :', apply(mult2,lst))

#array slicing, the first element has index 0
print(lst[2]) #3rd element
print(lst[1:3]) #2nd,3rd
print(lst[2:])  #3rd onward
print(lst[::-1]) #reverse
-----
Output:

odd :  [1, 3, 5, 7, 9]
even :  [2, 4, 6, 8, 10, 12]
*2 : [2, 4, 6, 8]

#slicing
3
[2, 3]
[3, 4]
[4, 3, 2, 1]

And at the end we have examples of the array slicing syntax. It works like range, but instead of generating numbers it selects which elements of the Array/List to pick. The format is lst[start:end:step], where the element at index end is not included, and the first element has index ZERO. Most mainstream languages use zero-based indexing for arrays, including Python.
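
The step part of lst[start:end:step] was not used in the examples above, so here is a small sketch with a list of letters (made up for the illustration), where the picked positions are easier to see.

```python
letters = list("abcdefgh")     # ['a', 'b', ..., 'h']

every_second = letters[1:6:2]  # indexes 1, 3, 5 (6 is not included)
first_three = letters[:3]      # start defaults to 0
last_two = letters[-2:]        # negative index counts from the end
backwards = letters[::-2]      # negative step walks backwards

print(every_second)
print(first_three)
print(last_two)
print(backwards)
```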

Here we’ve got a demonstration of one of the coolest features Python borrowed from functional languages: the list comprehension.

It is a loop, but shorter to write. Brackets indicate that we are creating a list. The for keyword is in the middle. On the right side is the loop header and on the left is the body.
As you may suspect the body can’t be complex, but that is the idea, the actual code should be shorter than the name “list comprehension” 😉

def mult2(a): return a*2

def apply2(fun, lst):
  return [ fun(i) for i in lst ]
  
print(apply2(mult2,lst))
print(apply2(mult2,odd))
print(apply2(mult2,even))
-----
Output:

[2, 4, 6, 8]
[2, 6, 10, 14, 18]
[4, 8, 12, 16, 20, 24]
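
For comparison, here is the same kind of comprehension next to the loop it replaces, plus the optional if part at the end, which filters the elements. Just a sketch, the variable names are mine.

```python
lst = [1, 2, 3, 4]

# the long way : a loop that appends
doubled_loop = []
for i in lst:
    doubled_loop.append(i * 2)

# the short way : body on the left, loop header on the right
doubled = [i * 2 for i in lst]

# an optional condition at the end keeps only some elements
even_squares = [i * i for i in lst if i % 2 == 0]

print(doubled_loop == doubled)
print(doubled)
print(even_squares)
```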

Another handy capability is creating anonymous Functions on the fly, called lambda-Functions. Why use them ? First, they are often handy as quick, short, in-place Functions. Second, they can be assigned to a variable and passed around.

def mult2(a): return a*2

res = apply2( lambda x : x*3 ,lst)
print(res)

power = lambda x : x**2
res = apply2( power ,lst)
print(res)
-----
Output:

[3, 6, 9, 12]
[1, 4, 9, 16]

And third, lambda Functions are the gateway to function composition, which is a treasure trove of goodies.

The format is the keyword lambda, followed by the input arguments, then a colon and finally a single expression.
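
To give a taste of function composition, here is a minimal sketch. The compose() helper is not a Python built-in, it is a hypothetical function written only for this illustration.

```python
# hypothetical helper : returns a new function that computes f(g(x))
def compose(f, g):
    return lambda x: f(g(x))

double = lambda x: x * 2
square = lambda x: x ** 2
add = lambda a, b: a + b        # a lambda can take several arguments

double_then_square = compose(square, double)
square_then_double = compose(double, square)

print(double_then_square(3))    # (3*2)**2
print(square_then_double(3))    # (3**2)*2
print(add(2, 5))
```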

List methods

Method     Description
append()   Adds an element at the end of the list
clear()    Removes all the elements from the list
copy()     Returns a copy of the list
count()    Returns the number of elements with the specified value
extend()   Adds the elements of a list (or any iterable) to the end of the current list
index()    Returns the index of the first element with the specified value
insert()   Adds an element at the specified position
pop()      Removes and returns the element at the specified position (the last one by default)
remove()   Removes the first item with the specified value
reverse()  Reverses the order of the list in place
sort()     Sorts the list in place
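
A quick walk through most of these methods, as a sketch. The comments show how the list looks after every step.

```python
nums = [3, 1, 2]

nums.append(5)              # [3, 1, 2, 5]
nums.extend([4, 1])         # [3, 1, 2, 5, 4, 1]
nums.insert(0, 9)           # [9, 3, 1, 2, 5, 4, 1]

ones = nums.count(1)        # how many times 1 appears
where_is_2 = nums.index(2)  # position of the first 2

last = nums.pop()           # removes and returns the last element
nums.remove(9)              # removes the first 9 it finds
nums.sort()                 # sorts in place, returns None
nums.reverse()              # reverses in place

print(ones, where_is_2, last)
print(nums)
```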

Dictionaries

Dictionaries are used to store associations in the form of KEY => VALUE pairs.
Internally the key is hashed and stored in such a way that access by key is almost instantaneous, even if the dictionary holds thousands of kv-pairs.
This is why the key must be a hashable Python data type (e.g. a string, a number or a tuple). The value, on the other hand, can be of any type.
As we will see in a moment, it can be another dictionary, thus creating multi-level structures such as trees and graphs.
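
What "hashable" means in practice is easiest to see by trying. In this sketch (the grid dictionary is made up for the illustration) a tuple, a number and a string all work as keys, while a list is rejected because it can change after it was stored.

```python
grid = {}
grid[(0, 0)] = "origin"     # tuple key : fine
grid[42] = "a number"       # int key : fine
grid["name"] = "a string"   # str key : fine

try:
    grid[[1, 2]] = "a list" # list key : not hashable
    list_key_ok = True
except TypeError as err:
    list_key_ok = False
    print("list as a key failed :", err)

print(grid[(0, 0)])
```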

First I show different ways to initialize a dictionary.

Second, we can access, create or update a pair by specifying the key (in quotes when it is a string) surrounded by square brackets.

Third, you can see how we can get the keys and the values. You can also use them in a loop, like we do with items().

.items() returns both the key and the value as a tuple. You can also see how to loop over two variables simultaneously.

#1:initializing
empty = {}
empty = dict()
pair = {'key' : 'value' }
phones = { 
  'john': '111-111-1111',
  'peter': '222-222-2222'
}  
#2: add KV pair
phones['james'] = '333-333-3333'

#access element
print("James phone : ",phones['james'])

#3
print()
print(phones.keys())
print(phones.values())

#walk over all the items
for k,v in phones.items() :
  #using f-string
  print(f'key:{k}, value:{v}')
  
#4: complex structure
medical = {
  'john' : {
    'height' : 180, 'weight' : 200,
    'blah' : 0,
    'blood-pressure' : {
      'low' : 80, 'high' : 130
    }
  }
}

#5: update value
medical['john']['blood-pressure']['high'] = 120
#6: delete KV
del medical['john']['blah']

print(medical) 
-----
Output:

James phone :  333-333-3333

dict_keys(['john', 'peter', 'james'])
dict_values(['111-111-1111', '222-222-2222', '333-333-3333'])

key:john, value:111-111-1111
key:peter, value:222-222-2222
key:james, value:333-333-3333

{'john': {'height': 180, 'weight': 200, 'blood-pressure': {'low': 80, 'high': 120}}}

Fourth is an example of initializing a complex structure with multiple levels of hash within a hash.
Then we update a value 3 levels deep.

And finally I show you how you can delete a KV pair.

For this example let's manually create a Tree structure of animal taxonomy.

We start at the bottom of the hierarchy by creating a key-value pair with the name of the category as the key and a list of child nodes as the value. When there are no children we use the keyword None, which designates nothingness. If you want, you can use an empty list [] instead.

Once we have the base elements, we build the next layer using the already created pairs, e.g. mammals have children dog and cat … and off we go up to the top.

At the end we print the tree using the pretty print library.

#bottom elements
dog = { 'dog': None }
cat = { 'cat': None }
sardine = { 'sardine': None }
chicken = { 'chicken': None }
turkey = { 'turkey': None }

#level 1
mammals = {'mammals':[dog,cat]}
fish = { 'fish' : [sardine] }
birds={'birds':[chicken,turkey]}

#level 2
animals = { 'animals' : [mammals,fish,birds]}

import pprint
pprint.pprint(animals,width=20)
-----
Output:

{'animals': [{'mammals': [{'dog': None},
                          {'cat': None}]},
             {'fish': [{'sardine': None}]},
             {'birds': [{'chicken': None},
                        {'turkey': None}]}]}
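
Once the tree exists, you would normally want to walk it. Below is one possible way to do that with a recursive function. The walk() helper is hypothetical, written only for this sketch, and it relies on the same convention as above : None means "no children".

```python
animals = {'animals': [
    {'mammals': [{'dog': None}, {'cat': None}]},
    {'fish': [{'sardine': None}]},
    {'birds': [{'chicken': None}, {'turkey': None}]},
]}

# hypothetical helper : collects (depth, name) pairs, parents before children
def walk(node, depth=0, out=None):
    if out is None:
        out = []
    for name, children in node.items():
        out.append((depth, name))
        for child in children or []:   # None is treated as an empty list
            walk(child, depth + 1, out)
    return out

nodes = walk(animals)
for depth, name in nodes:
    print('  ' * depth + name)         # indent by depth
```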

Don’t forget to check the methods that can be used with Dictionaries, below :

Dictionary methods

Method        Description
clear()       Removes all the elements from the dictionary
copy()        Returns a copy of the dictionary
fromkeys()    Returns a dictionary with the specified keys, all set to the same value
get()         Given the key returns the value, or a default (None) if the key does not exist
items()       Returns a view containing a tuple for each key-value pair
keys()        Returns a view of the dictionary’s keys
pop()         Removes the element with the specified key and returns its value
popitem()     Removes and returns the last inserted key-value pair
setdefault()  Given the key returns the value. If the key does not exist: inserts the key-value pair
update()      Updates the dictionary with the specified key-value pairs
values()      Returns a view of all the values in the dictionary
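
A short sketch of the methods you will probably reach for most often. Note how get() avoids the KeyError you would get from phones['mary'], and how setdefault() creates the value only on first use. The groups dictionary is made up for the illustration.

```python
phones = {'john': '111-111-1111', 'peter': '222-222-2222'}

known = phones.get('john')                # the stored value
missing = phones.get('mary')              # None, no error
fallback = phones.get('mary', 'unknown')  # our own default

# setdefault : return the existing value, or insert the default first
groups = {}
groups.setdefault('friends', []).append('john')
groups.setdefault('friends', []).append('peter')

removed = phones.pop('peter')             # remove and return the value
phones.update({'james': '333-333-3333'})  # add or overwrite pairs

print(known, missing, fallback)
print(groups)
print(removed)
print(phones)
```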

Classes and Objects

As I already mentioned in the [Learn programming] section, Classes are used to combine data and behavior into a single bundle.

  • Class is the description and implementation
  • Objects are the physical instantiation of the Class definition
  • Method is a function aware of the Object variables/attributes
  • Attributes are variables belonging to a Class or an Object
    • Class variables are accessible by all the Objects instantiated from the Class
    • Object variables are accessible only to the object
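
The difference between Class variables and Object variables is easier to see in code. The Counter class below is a made-up example for this sketch : created is shared by all the objects, while name belongs to each object separately.

```python
class Counter:
    created = 0                # Class variable, shared by all objects

    def __init__(self, name):
        self.name = name       # Object variable, one per object
        Counter.created += 1   # update the shared value

a = Counter('first')
b = Counter('second')

print(a.name, b.name)          # every object has its own name
print(a.created, b.created)    # both see the same Class variable
print(Counter.created)         # also reachable via the Class itself
```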

The BASIC example concentrates only on the mechanics of Classes. On the second read, visit the other TABS.

First we will create a Dummy class so that you may get familiar with the syntax. That is all I want you to learn about classes on the first pass.

We define a Class with the keyword class. Inside it, methods are declared like functions, with the difference that the first argument is a reference to the object itself. This way we keep access to the object's internals. By convention this argument is named self, but that is not a requirement.

Then we may have an __init__ method, where we put code that we want to be executed immediately after the creation of the object.

Whenever a variable name has the prefix self. this means we are accessing an object variable.

class Dummy:
  
  def __init__(self, arg ):
    self.attr = arg
    
  def method(self, arg1, arg2='default'):
    print(f'attr: {self.attr}')
    print(f'arg1: {arg1}')
    print(f'arg2: {arg2}')
    
#creating an object
t = Dummy('one')
#calling a method with default value
t.method('two')
print()
#override the default
t.method('three','four')
-----
Output:

attr: one
arg1: two
arg2: default

attr: one
arg1: three
arg2: four

After the Class declaration you can see how we create an object based on the class definition. You can create as many objects as you like. Every object is stored in a variable. Then to call a method we use the dot-notation: variable.method_name(arguments).
In reality this is simply a fancy function call.

x.blah(arg) <=equivalent=> Class.blah(x,arg)

Look carefully, you can see the purpose of using self now.
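
You can check this equivalence yourself. The sketch below uses a cut-down version of the Dummy class (it returns a string instead of printing, so the two results can be compared) and calls the same method both ways.

```python
class Dummy:
    def __init__(self, arg):
        self.attr = arg

    def method(self, arg1, arg2='default'):
        return f'{self.attr}/{arg1}/{arg2}'

t = Dummy('one')

via_object = t.method('two')        # the usual dot-notation
via_class = Dummy.method(t, 'two')  # the object passed by hand as self

print(via_object)
print(via_object == via_class)
```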

You can also see the usage of parameters with default values, but you already knew that from the Function section, didn’t you ?

This example is a bit more complex, but it will demonstrate how to use a higher level of abstraction.

I will be very detailed, so you can catch the ideas.

Our goal here is to format input text in different ways. The input will be given as a single line of text, which we want to format either as a list of words, or split into sentences and/or paragraphs.

The markers that decide where a word, sentence or paragraph starts|ends will be space ( ), dot (.) and colon (:). Example :

word word word. word word. : word word word. : word word.

To make this happen we need functionality to do 3 things : the condition by which to SPLIT a string into multiple items, and how to format every one of them, i.e. what to print BEFORE and AFTER.

  • we will print The Words one per line, preceded by a dash
  • we will print The Sentences one per line
  • The Paragraphs will be separated by a dashed line, and then we will use Sentences objects to control the appearance of each of the sentences. This way, if we decide to change the way we want to print them, paragraphs will automatically pick up the new formatting w/o the need to change anything in the Paragraphs class.

Looking at those requirements we can see there are a lot of commonalities. For this reason we will create a more Abstract Base class called Say that implements the basic logic, and afterwards we will just inherit from this class. Later we can override or modify the behaviors we want to change.

All of the classes need to do three basic operations :

  1. Parsing the input string
  2. Transforming it to internal/intermediate structure
  3. Formatting and Printing the result

BTW compilers and interpreters often use similar stages.

Upon object creation the __init__ method is called as we discussed earlier. Here we internally store the .before and .after formatting, to be able to access them later.
Second we create a list called .many where we will store the result of parsing the input stream.
And third we call the .process() method to do the actual parsing.

The .process() method below simply splits the text according to the split_by symbol and invokes the .create_one() method which returns an item that we will store in the .many object attribute.


What the type of the item is depends on the actual Class. Remember, we use Say as a base class which we will inherit from, and then we can override any method to change behavior. E.g. we may want the Paragraphs class to return a Sentences item rather than a simple String. So sometimes we create placeholder methods with the purpose of later tweaking some process.


The .many list here represents the intermediate/internal structure from which we generate the formatted output.

class Say:
  def __init__(self, txt, split_by=' ', before="\n", after=''):
    #wrappers
    self.before = before
    self.after = after
    self.many = [] #holds all Children objects
    self.process(txt,split_by)

  #split and process the incoming stream
  def process(self,txt,split_by):
    for line in txt.split(split_by) :
      one = self.create_one(line.strip(" .,:"))
      self.many.append(one)

  #given an input text, return a Child object suitable for the Parent
  def create_one(self,one): return one

#continues down >>>
  • Why go to that trouble when we can just store the item as is in the .many list ?
  • By doing that we allow any class that inherit Say to override this behavior.
  • What do we need that for ?
  • So that, indirectly, we can store different kinds of objects in .many.
  • But why ?
  • You can’t always predict how you would want to format and print the item.
  • I see that ! if I store simple text, I can just print this text as is.
  • Correct. But imagine you want to print fancy sentence in different colors or a math formula or an image.
  • Ooo I see, I’m allowing a future class to give me back an item that can print itself…
  • Exactly and the Say class doesn’t need to know anything about it.
  • Clever 😉

We’ve taken care of Parsing and Transformation, what is left is to code the Printing part.

The main method say() is the method we call to print the collected data in the new format.

say(), as you can see, is a simple loop which calls say_one() for every item.

class Say:

  # insert here the above code ^^^

  #prints just one item
  def say_one(self,one):
    print(self.before,end='')
    #limiting condition
    if isinstance(one, str) : print(one,end='')
    else : one.say()
    print(self.after,end='')    

  def say(self): 
    for one in self.many : self.say_one(one)

This is a good rule of thumb: when you have repetitive code, separate the logic into two pieces, a method that contains the loop and another method that does the singular task.
The benefits of that are twofold. First, you can override either one of the methods in a descendant class. Second, in the current class you may have different loopy behaviors reusing the same singular method.


E.g. in the current class, as variants of say() we can write say_in_reverse(), say_twice(), …
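
Here is how such variants could look. This is only a sketch : the Say class is repeated in compact form so the example runs on its own, and the bodies of say_in_reverse() and say_twice() are my assumption of what those names suggest. Notice that both reuse say_one() untouched.

```python
class Say:
    def __init__(self, txt, split_by=' ', before="\n", after=''):
        self.before = before
        self.after = after
        self.many = []
        self.process(txt, split_by)

    def process(self, txt, split_by):
        for line in txt.split(split_by):
            self.many.append(self.create_one(line.strip(" .,:")))

    def create_one(self, one): return one

    #the singular task
    def say_one(self, one):
        print(self.before, end='')
        if isinstance(one, str): print(one, end='')
        else: one.say()
        print(self.after, end='')

    #the original loop
    def say(self):
        for one in self.many: self.say_one(one)

    #hypothetical variant : same items, walked backwards
    def say_in_reverse(self):
        for one in reversed(self.many): self.say_one(one)

    #hypothetical variant : every item is printed two times
    def say_twice(self):
        for one in self.many:
            self.say_one(one)
            self.say_one(one)

s = Say("one two three", before="- ", after="\n")
s.say_in_reverse()
```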

But let's get back to say_one(), which wraps the “core” print with the before and after parts. Depending on the object's class it will use different wrappers.
The core is a bit weird … on the surface it is a simple check :

  • if the parameter is a String just print it
  • if it is an Object call its own one.say() method, rather than print().

Now it gets even weirder … if you follow the logic of the second statement you can deduce that this is a recursive call across Classes, i.e. an object of ClassA may call ClassB.say(), which will call ClassB.say_one(), which may call ClassC.say() … and so on … until the argument is a simple String. As we discussed, recursion requires a limiting condition, and here that condition is the isinstance() check for a String.

So this ends our discussion of the base class Say.

Now let's look at how we will implement the Words functionality. It is very easy, we already did all the heavy lifting.

The first line “class Words(Say):” says that Words inherits the class Say, i.e. everything we said so far applies to Words too.

Next we override the __init__ method. Two things here. First, Words uses different default arguments.
Second, because we override the method, Say.__init__() no longer runs on its own. To solve this problem Python, like many other languages, provides a mechanism to call the parent method, and as you may have already guessed you use the built-in super().

Then we have the Sentences class … with what you know so far it's trivial … simply different arguments.

What about ….

class Say:

  # insert here the above code ^^^

#Mostly different parameters ...
class Words(Say):
  
  def __init__(self, txt, split_by=' ', before="- ", after="\n"):
    super().__init__(txt,split_by,before,after)

class Sentences(Say):
  
  def __init__(self, txt, split_by='.', before="", after=".\n"):
    super().__init__(txt,split_by,before,after)
      
      
class Paragraphs(Say):
  
  def __init__(self, txt, split_by=":", before="-----\n", after="\n"):
    super().__init__(txt, split_by,before,after)
        
  def create_one(self,para): return Sentences(para)

… Paragraphs ? Almost the same story … the difference as I hinted earlier is that we want to delegate.

It is much better for the Paragraphs class to handle only paragraph related things and, instead of hard coding how to format its components, to leave this job to the components themselves.

As I said, in the future we may add different items, not just sentences, to be a part of a paragraph.

Compare this .create_one() method with the one we overrode. Also here we did not use super(), why ?

…. I’m waiting … 😉

Ok, let's see what we’ve done.

line = "This is a sentence. And another one. : Now we start a paragraph. A short one. : End of the text"

print("Words:==========================")
w = Words(line)
w.say()

print("\nSentences:==========================")
s = Sentences(line)
s.say()

print("\nParagraphs:==========================")
p = Paragraphs(line)
p.say()

As expected, everything looks nice … in the next tab, can we get even weirder ? Sure we can !

Words:==========================
- This
- is
- a
- sentence
.....

Sentences:==========================
This is a sentence.
And another one.
Now we start a paragraph.
A short one.
End of the text.

Paragraphs:==========================
-----
This is a sentence.
And another one.

-----
Now we start a paragraph.
A short one.

-----
End of the text.

We are going Commando ;), more inheritance, more reuse, more ideas …

This time we will extend our classes to generate HTML.

First case, simple one again.. with a twist.

The obvious feature of HTMLParagraphs class is to surround the paragraph with <p> html tags.
It inherits from Say rather than Paragraphs. What are the implications ? Both will work, but what is the difference ?
Inheriting from Say means that there are no Sentences, just a simple text line. Otherwise, using Paragraphs, this text line will be split into Sentences and every one of them will be printed on a new line.

If you design your classes carefully you open the possibility to use them like Lego blocks.

Next we have a HTMLArticle class. Several notes about it :

  • paragraphs are separated by <hr>
  • paragraphs are HTML based paragraphs via overriding .create_one()
  • the Article has a Heading as a simple string
  • example of use of super() in a normal method

The heading is integrated as a string but it also could be a separate class. Think of how you can do that and what will be the benefits.

class HTMLParagraphs(Say):

  def __init__(self, txt, split_by=":", before="<p>", after="</p>\n"):
    super().__init__(txt,split_by,before,after)


class HTMLArticle(Say):

  def __init__(self, heading, txt, split_by="\n", before="<hr>\n", after=''):
    super().__init__(txt,split_by,before,after)
    self.heading = heading

  def create_one(self,para): return HTMLParagraphs(para)

  def say(self):
    print(f'<h2>{self.heading}</h2>')
    super().say()

line = "This is a sentence. And another one. : Now we start a paragraph. A short one. : End of the text"

print("\nHTMLParagraphs:==========================")
h = HTMLParagraphs(line)
h.say()

print("\nHTMLArticle==========================")
a = HTMLArticle("So much about nothing", line)
a.say()

-------------------------------------------------------

HTMLParagraphs:==========================
<p>This is a sentence. And another one</p>
<p>Now we start a paragraph. A short one</p>
<p>End of the text</p>

HTMLArticle==========================
<h2>So much about nothing</h2>
<hr>
<p>This is a sentence. And another one</p>
<p>Now we start a paragraph. A short one</p>
<p>End of the text</p>

So this is the end of Part2, maybe I will do Part3 ….. who knows.