Intermediate Python – Part 1

Intermediate Python – Part 1#

Thierry Géraud 2024

h:3cm

print("Hello world!")

Test: write a Python program that input a number of lines and stores the strings in an appropriate structure

such as:

 *
***

or:

  *
 ***
*****

email me your solution at thierry.geraud@epita.fr with the tag [INTPY] in the subject

Let’s start with a few questions…#

What language(s) do you know?

your answers : C/C++, ruby, javascript, HTML, brainfuck, ... but not "english, or french..."

What is Python?
How do you use it?
What for?
Describe Python…

answers...

We will come back to this in a few minutes…

$e^{i\pi} = 1$ gives: $e^{i\pi} = 1$

What have we here?

answer: a description language (LaTeX)... not a programming language

% for i in ls -d *; echo $i

What is it? Is it programming?

answers:
- a loop in shell (zsh); that is an instruction; we are scripting
- we do not have a program at the end, so no

What is a programming language?

answer:

read https://en.wikipedia.org/wiki/Programming_language

Courses / lectures + sessions of practical work

Your 1st project: display a map described by a file

Characterization of Python#

Python is:

free and open-source software
portable
a programming language which is
- high-level
- general-purpose
- multi-paradigm
easy to learn, easy to code, easy to read → accessible
equipped with a large standard library, plus a vast range of libraries.

history of programming languages → http://tinyurl.com/IntPyLanguages

Fetch “ → Python”

High-level language#

compare with some assembler code:

Fib PROC 
   mov eax, 1         
   xor ebx, ebx      
   xor edx, edx      
L1:
   add eax, ebx ; eax += ebx
   mov ebx, edx
   mov edx, eax
loop L1
   ret
Fib ENDP

General-purpose language#

broadly applicable across application domains
- e.g., bank, medicine, science
lacking specialized features for a particular domain
- a counterexample: solve some constraint-based problems (use Prolog instead)
- note that libraries (except the standard one(s)) are not part of the language
- a related key idea: a language now comes with an environment, and a community

A multi-paradigm language#

Python is:

imperative
procedural
structured
object-oriented
somehow a bit
- functional
- generic

Imperative#

a = 0
a += 1

a series of instructions that change the state of the running program

Procedural#

def fact(n):
    return 1 if (n == 0 or n == 1) else n * fact(n - 1)
    # an example of "literate programming" here

a procedure is great to factor code, to be called many times

Structured#

with control structures (e.g., for, if)
with blocks – thanks to indentation

for i in range(len(lst)):
    if i % 2 == 0:
        print(lst[i], end = '\n')
        i = i - 1
    elif i == 7:
        break

subroutines

print(lst)  # this is a procedure call

subroutines are procedures (functions) and methods

compare with basic code:

HOME : TEXT : REM Fibonacci numbers 
LET MAX = 5000
LET X = 1 : LET Y = 1
IF (X > MAX) GOTO 100
PRINT X
X = X + Y
IF (Y > MAX) GOTO 100
PRINT Y
Y = X + Y
GOTO 30
END

and

GOSUB

for subroutines

Object-oriented#

class Door:
    def __init__(self, number, status='closed'):
        self.__number = number
        self.status = status
    def open(self):   # this is a method
        self.status = 'opened'
    # cut

d = Door(42)
d.open()              # this is a method call
print(d.status)

in this code snippet, ‘d’ is a Door

A bit functional#

l = list(range(0, 4))
l2 = list(filter(lambda x: x > 2, l))

we have the function: x $\mapsto$ x > 2 (this is not a procedure)

A bit generic#

T = TypeVar('T')
def first(seq: Sequence[T]) -> T:
    return seq[0]

returns the first element

About types / typing#

Object = instance of a type / result of the instanciation of a type

Type = description of all the objects with this type

d0 = Door(0)
dd = [Door(i) for i in range(10)]

here d0 and dd[i] are doors / are of type Door
they are independant, yet they behave the same way

A simple test than you shall pass#

run

print(type(d))
print(first(d))

on

d = (0, 1, 2)
d = [0, 1, 2]
d = {0, 1, 2}
d = {'pi': 3.14, 'e': 2.72}

what are the output? why?

Python has some built-in types:

Numeric data types: int, float, complex
String data types: str
Sequence types: list, tuple, range
Binary types: bytes, bytearray, memoryview
Mapping data type: dict
Boolean type: bool
Set data types: set, frozenset

multiple-item data types are called collections or containers

Comment the difference(s):

T = TypeVar('T')
def first(seq: Sequence[T]) -> T:
    return seq[0]
# versus
def first(seq):
    return seq[0]

def sqr(x: int) -> int:
    return x * x
#versus
def sqr(x):
    return x * x

b: int = 2
#versus
b = 2

b: str = 2
print(type(b))

def doit(i: int):
    print('int', i)

def doit(s: str):
    print(type(s), s)

doit(0)
doit("0")

What happens? Why?

use a linter!

and a type checker:

Mypy (by Dropbox), Pytype (by Google), Pyright (by Microsoft)
Pyre/Pysa (at Facebook and Instagram)

Variables, values, and types#

py:

a = 1                 # implicit type for 'a' -- typed by the compiler
b: int = 2            # explicit type for 'b'
b = "a string"        # do compile; 'b' is now a string

Pascal:

var b : integer = 2;  // types have to be explicit

modern C++:

auto a = 1;           // implicit type
int b = 2;            // explicit type
auto b = "a string";  // do not compile: 'b' is already defined!

py:

a = 1
ida = id(a)
a = 2
print(id(a) == ida)  # gives: False

we cannot change the value of the integer whose identity is ‘ida’

$\leadsto~$ integers are immutable in Python

py:

a = 1
ida = id(a)
b = a  # 'b' is actually a reference to 'a'; we can say that 'b' is 'a'
print(id(b) == ida)  # gives: True

b = 2  # now 'b' designates a new integer
print(id(b) == ida)  # gives: False

print(a)  # gives 1

single-item data types (integers, floats, complex numbers, Booleans) are immutable
strings and tuples are immutable

whereas

lists, sets, and dictionaries are mutable

Exercice#

def foo(b):
    print(id(b), b)

def bar(c):
    print(id(c), c)
    c = 2
    print(id(c), c)
    
a = 0 
print(id(a), a)
foo(a)
bar(a)
print(id(a), a)

What is printed? Explain. What the rationale behind it?

py:

l = [1, 'a string']
idl = id(l)
l.append(Door(0))
print(id(l) == idl)  # gives: True, lists are mutable

print(l)  # gives: [1, 'a string', <__main__.Door object at 0x7f568add0790>]

for e in l:
    print(e)  # gives: ???

Explain:

why the behavior hopefully differs from the previous example
the output of print(l)
the output of the for loop, and how we can make it possible?

class Door:
    def set_number(self, number):
        self.__number = number
    # cut

py:

# cont'd
print(id(l), id(l[2]), l[2])
for e in l:
    if isinstance(e, Door):  # same as: if type(e) == Door:
        e.set_number(2)
print(id(l), id(l[2]), l[2])

what is the output? what have we done here?

Structured programming and blocks#

Python:

def doit(lst):
    for i in range(len(lst)):
        if i % 2 == 0:
            print(lst[i], end = '\n')
            i = i - 1
        elif i == 7:
            break

blocks start with ‘:’ and rely on indentation

Equivalent code in C++:

void doit(std::list<int>& lst)
{
    for (int i = 0; i < lst.size(); ++i)
        if (i % 2 == 0) {
            std::cout << lst[i] << '\n'; // do not compile!  Why?
            i = i - 1;
        }
        else if (i == 7) {
            break; }  // the two braces are useless here
}

blocks are delimited by ‘{’ and ‘}’, and indentation does not matter

an instruction (not a block of instructions) ends with ‘;’

in Python indentation matters

i = 1 
 i = 0  # gives: IndentationError: unexpected indent

and it is the key to see blocks

i = 0
while i < 5:
    print(i)
    i += 1

vs

i = 0
while i < 5:
    print(i)
i += 1

compare with COBOL (1959):

column 7 is for an indicator:

‘*’ means that it is a comment
‘-’ means that it is the continuation of the previous line
‘/’ means that, when printing your code listing, it would page break
‘D’ means that the line would only compile when in debugging mode

compare with FORTRAN 66:

      program circle
      real r, area
 
c This program reads a real number r and prints
c the area of a circle with radius r.
 
      write (*,*) 'Give radius r:'
      read  (*,*) r
      area = 3.14159*r*r
      write (*,*) 'Area = ', area
 
      stop
      end

Memory#

a = 1  # new object => memory allocation
b = a  # no new object
b = 2  # b is a new object => memory allocation
       # no need for explicit memory deallocation

py:

d = Door(1)       # new object
print(d.status)
d.open()          # method call
                  # no need for memory deallocation

Python relies on a garbage collector to free memory

Read about

heap and stack
reference counting
the mark and sweep algorithm https://en.wikipedia.org/wiki/Memory_management https://en.wikipedia.org/wiki/Tracing_garbage_collection

The programmer can not care about memory; yet he/she shall know:

del my_large_container
gc.collect()

Python as an object-oriented language#

A particular way of thinking:

a program = data types and algorithms (functions)

example in C:

struct rectangle {                   // a data type
    float width, height;
};

void scale(rectangle* r, float s) {  // an algorithm
    assert(r != NULL and s > 0);
    r->width *= s;
    r->height *= s;
}

Compare:

dta = [("EPITA", "Kremlin-Bicetre"), ("Sorbonne University", "Paris 5")]

print("name=", dta[0][0], "where=", dta[0][1])

with:

@dataclass
class University:                # what have we here?
    name: str = None
    location: str = None

dta = [University("EPITA", "Kremlin-Bicetre"),
       University("Sorbonne University", "Paris 5")]
    
print(f"name = {dta[0].name} where = {dta[0].location}")

print(dta[0])  # works directly!

from dataclasses import dataclass

@dataclass
class University:                # a data type
    name: str = None             # grouping two strings
    location: str = None 

meaning that a university is composed of two data: name and location, both being strings

u1 = University("EPITA", "Kremlin-Bicetre")   # a particular university, u1

print(u1.name)  # gives: EPITA -> so *explicitly* print its name

print("Paris" in u1.location)   # more readable than 'u1[1]'

we have defined a type:

to be able to have objects with that particular type
- a “University” is far more precise than “just two strings”
to store data and explictly access to one piece of data
- “.location” is way more explicit than “[1]”

that is what you have in all python libraries: types

they are the keystones of these libraries
the librairies also provide algorithms
- doing stuff / transforming data is the key features of these libraries

# a data type

@dataclass
class University:
    name: str = None
    location: str = None

# an algorithm, dedicated on objects being a University

def relocalize(u: University):
    if u.location == "Kremlin-Bicetre":
        u.location = "Grand Paris"

sample use:

dta = [University("EPITA", "Kremlin-Bicetre"),
       University("Sorbonne University", "Paris 5")]
# 'dta' is a list of two universities

for u in dta:
    relocalize(u)
# we have relocalized all universities in 'dta'

in:

# a data type
@dataclass
class University:
    name: str = None
    location: str = None

we say that: we have define a class (a type) with two attributes (named ‘name’ and ‘location’)

in this example, its is a particular class that has only data; it is “just a data type”

in this example, algorithms are defined on the side (usually nearby)

this is the case of ‘relocalize’
algorithms are not within the type

this example corresponds to: