In this course, we will review some mathematical concepts wich are very useful in computer science research.

To illustrate these concepts, we will use the power of (source) code.

As part of the course evaluation, you will also write some code

**(Luke, use the source)**, and obviously do some math exercises

The course language is Python. It could be matlab, R, or even Java, C, (or the language of preference), but I choose Python because:

- I like it - my course, my rules ;o)
- A lot of people are using it (so there is plenty of resources out there, including libraries, tutorials, foruns, etc);
- Its sintax is clean and simple;
- It is easy to code and mantain;
- We can have these nice notebooks that I can easily convert to presentations, were I can mix, text, code, formulas, demos, videos, etc;
- It is free.

If you don't known Python, don't worry. It's a nice opportunity to learn. There are a lot of very good material freely available on the interner.

I won't teach you Python (it's out of the scope of this course), but I can eventually help you with some language issues. We will review some import things before we startd

The course handouts are written using jupyter notebooks.

I'll make them available on the internet.

The course handouts (these notebooks) are written in English. If you can't read English, do be worried. You must so, thus start studing English

**now**.

The intended audience are graduate students pursuing studies towards conclusind a M.Sc. or Ph.D. degree, which have a good backgroung in programming, but wish (or need) to revisit some math concepts.

The student’s prior programming experience can be in pretty much any programming language. Although the course is based in Python, no deep knowledge of the language is necessary.

The topics will not (intentionally) be covered in depth. Most of the content is already covered in undergraduate courses of Calculus, Linear Algebra, Discrete Mathmatics and Statistics. However, the intention is also to show a different, and more practical perspective.

Some python advantages for scientific computing

- It's free
- General purpose programming language
- Clean syntax, easy to read and write
- Plataform independent
- Pré-defined functions
- Lots of packages for scientifc computing
- Graphing tools

- You can program using any text editor, but this is recomendable only to very simple programs
- There are some python IDEs available out there. Some of them are "general purpose" ides.
- I recommend some "scientifc oriented"
- Jupyter notebooks can be used to produce "live" documents, mixing text and code in cells
- Fell free to choose the best configuration for you

- There are two main python versions out there
- 2.7.X
- 3.3.X or 3.4.X

- The syntaxes are almost the same (with some important differences)

- Common problem among Python programmers is to choose between version 2 or 3
- The general recommendation is to go for Python 3, because this is the version that will be developed in the future
- However, much useful mathematical software in Python has not yet been ported to Python 3
- Therefore, scientific computing with Python still goes mostly with version 2

- Mathematical computing frequently involve functions such as sin, cos, tan,sinh, cosh, exp, log, etc.
- On a pocket scientific calculator you have special buttons for such functions.
- Similarly, in a program you also have ready-made functionality for evaluating these types of mathematical functions. - One could in principle write one’s own program for evaluating, e.g., the but how to do this in an efficient way is a non-trivial topic.
- Experts have worked on this problem for decades and implemented their best recipes in pieces of software that we should reuse.

Consider the formula for the height y of a ball in vertical motion, with initial upward velocity $v_0$: $ y = v_0 t + \frac{1}{2}g t^2$, where $g$ is the acceleration of gravity and $t$ is time.

How long time does it take for the ball to reach the height $y_c$?

As this is a quadratic function, we can make $y_c = y$ and solve for $t$, geting:

$t_1 = \frac{v_0 - \sqrt{v_0^2 - 2 g y_c}}{g}$, $t_2 = \frac{v_0 + \sqrt{v_0^2 - 2 g y_c}}{g}$

A possible program would be

```
v0 = 5 //arbitrarily chosen
g = 9.81 // constant
yc = 0.2 // arbitrailiry choson
import math
t1 = (v0 - math.sqrt(v0**2 - 2*g*yc))/g
t2 = (v0 + math.sqrt(v0**2 - 2*g*yc))/g
print 'At t=%g s and %g s, the height is %g m.' % (t1, t2, yc)
```

- importing the module and using its name as a prefix

```
import math
print math.sqrt(2)
```

- importing specific procedures from a module

```
from math import sqrt
print sqrt(2)
```

Imported modules and functions can be given new names in the import statement

```
import math as m
print m.sqrt(2)
from math import sqrt as square_root
print square_root(2)
```

Let's say we want to convert from Celsius to Fahrenheit. The formula is:

$ F = \frac{9}{5}C + 32$

where $C$ is the temperature in Celsius and $F$ is the temperature in Fahrenheit. A possible program is

```
C = 21 #some arbitrary value
F = (9/5)*C + 32
print F
```

In python 2.X, this program will print 53. However, in python 3.X, this same program will print 69.8 (which is the correct value)

What happened? The formula in our program looks pretty much correct (and indeed is).

The error in our program above is one of the most common errors in mathematical software and is not at all obvious for a newcomer to programming.

In many computer languages, there are two types of divisions: **float division** and **integer division**.

Float division is what you know from mathematics: 9/5 becomes 1.8 in decimal notation

Integer division a/b with integers (whole numbers) a and b results in an integer that is truncated (or mathematically, rounded down. This implies that 9/5 becomes 1.

The version 2 of python (and many other programming languages) interprets a/b as an integer division, when a and b are integers

Other languages, such as Matlab and the version 3 of python interpret a/b as float division even if both operands are integers, or complex division if one of the operands is a complex number

It would be interesting to avoid integer division when programming mathematical formulas

Python 3.X has no problem with integer division, so the problem arises when we are using python 2.X. To overcome this, we can use at the very begining of our program:

```
from __future__ import division
```

In the rare cases when a mathematical algorithm does make use of integer division, one should use a double forward slash, `//`

, as division operator, because this is Python’s way of explicitly indicating integer division

An alternative is to use the cast operation:

```
C = 21 #some arbitrary value
F = (float(9)/5)*C + 32 # it could be (9/float(5))*C + 32
print F
```

or alternativelly, rewrite the formula using floats

```
C = 21 #some arbitrary value
F = (9.0/5)*C + 32 # it could be (9/5.0)*C + 32
print F
```

Let's say we want to compute the hiporbolic sine

$sinh = \frac{1}{2}(e^x - e{-x})$

using three different ways:

- using the
`python math.sinh`

procedure from math package - computing the right hand side of the expression, by using the
`python math.exp`

procedure from math - computing the right hand side using the exponential built in operator
`python **`

In [1]:

```
from math import sinh, exp, e, pi
x = 2*pi
r1 = sinh(x)
r2 = 0.5*(exp(x) - exp(-x))
r3 = 0.5*(e**x - e**(-x))
print r1, r2, r3
```

267.744894041 267.744894041 267.744894041

At a first glance, the three results seems to be the equal

Nevertheless, this is not the whole story ...

Let's print the results using 16 decimals

In [2]:

```
print '%.16f %.16f %.16f' % (r1,r2,r3)
```

267.7448940410164369 267.7448940410164369 267.7448940410163232

Now `r1`

and `r2`

are equal, but `r3`

is different!

Why is this so?

- Our program computes with
real numbers, andreal numbersneed in general aninfinite numberof decimals to be represented exactly.- The computer
truncatesthe sequence of decimals because the storage is finite

We won't get into details, but you can read further here

We must be aware that real numbers on a computer often have a small error.

Only a few real numbers can be represented exactly within the quantity of reserved digits, the rest of the real numbers are only approximations.

For this reason, most arithmetic operations involve **inaccurate** real numbers, resulting in **inaccurate calculations**

Let's consider the following two calculations: $\frac{1}{49}\cdot 49$ and $\frac{1}{51}\cdot 51$

Both expressions are identical to 1, but when we perform the calculations in Python

In [3]:

```
print '%.16f %.16f' % (1/49.0*49, 1/51.0*51)
```

0.9999999999999999 1.0000000000000000

The two results are different!

The reason why we do not get exactly 1.0 as answer in the first case is because $\frac{1}{49}$ is not correctly represented in the computer. Also $\frac{1}{51}$ has an inexact representation, but the error does not propagate to the final answer.

To summarize, errors in floating-point numbers may propagate through mathematical calculations.

As a result, answers are only approximations to the exact underlying mathematical values.

The errors in the answers are commonly known as **round-off errors**.

Python has a special module `decimal`

which allows real numbers to be represented with adjustable accuracy so that round-off errors can be made as small as desired.

A **set** is a collection of mathematical objects in which each object is considered to occur at most once.

The objects belonging to a set are its elements.

Curly braces are used to explicit enumerate the elements of a set. For example, $\{\heartsuit, \clubsuit, \spadesuit, \diamondsuit\}$ is the set of suits in a traditional deck of cards.

Python has a built in set structure

In [4]:

```
deck = set(["heart", "club", "spade", "diadmond"])
print (deck)
```

set(['club', 'heart', 'diadmond', 'spade'])

The symbol $\in$ is used to indicate that an object belongs to a set (equivalently, that the set contains the object).

For example, $\heartsuit \in \{\heartsuit, \clubsuit, \spadesuit, \diamondsuit\}$.

We can test if an element is in a set using the keyword `in`

In [5]:

```
element = "heart"
if element in deck:
print(element+" belongs to set")
else:
print(element+" does not belong to set")
```

heart belongs to set

One set $S_1$ is contained in another set $S_2$ (written $S_1 \subseteq S_2$ ) if every element of $S_1$ belongs to $S_2$.

Two sets are equal if they contain exactly the same elements. A convenient way to prove that two sets are equal consists of two steps:

- prove the first set is contained in the second, and
- prove the second is contained in the first.

In python, you can use the equality opertar to test set equality

In [6]:

```
A = set([1, 2, 3])
B = set([3, 2, 1])
if (A==B):
print("sets are equal")
else:
print("sets are different")
```

sets are equal

A set can be infinite. The set of real numbers $\Re$, which contains all real numbers, is infinite.

If a set S is not infinite, we use $|S|$ to denote its cardinality, the number of elements it contains. For example, the set of suits has cardinality 4.

The `len`

command returns the size of a set.

In [7]:

```
len(deck)
```

Out[7]:

4

An **empty** set, denoted as $\emptyset$, no contains any element

We can create an empty set in Python by using `set()`

without arguments

```
emptyset = set()
```

Let $A$ and $B$ be two sets. The following are set operations

The **union** is a set of all elements from both sets

In [8]:

```
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
# use | operator
print(A | B)
# use union function
print(A.union(B))
# use union function on B
print(B.union(A))
```

set([1, 2, 3, 4, 5, 6, 7, 8]) set([1, 2, 3, 4, 5, 6, 7, 8]) set([1, 2, 3, 4, 5, 6, 7, 8])

The **intersection** is a set of elements that are common in both sets.

In [9]:

```
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
# use & operator
print(A & B)
# use intersection function
print(A.intersection(B))
# use intersection function on B
print(B.intersection(A))
```

set([4, 5]) set([4, 5]) set([4, 5])

The **difference** of A and B (A - B) is a set of elements that are only in A but not in B. Similarly, B - A is a set of element in B but not in A.

In [10]:

```
# initialize A and B
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
# use - operator
print(A-B)
# use the difference function on A
print(A.difference(B))
# use - operator
print(B-A)
# use the difference function on A
print(B.difference(A))
```

set([1, 2, 3]) set([1, 2, 3]) set([8, 6, 7]) set([8, 6, 7])

The **Cartesian product** of two sets $A$ and $B$, denoted by $A \times B$ is the set of all pairs $(a,b)$ where $a \in A$ and $b \in B$.

**Example**
For the sets $A=\{1,2,3\}$ and $B=\{\heartsuit, \clubsuit, \spadesuit, \diamondsuit\}$, the cartesian product is
$\{(1,\heartsuit), (1,\clubsuit), (1,\spadesuit), (1,\diamondsuit), (2,\heartsuit), (2,\clubsuit), (2,\spadesuit), (2,\diamondsuit), (3,\heartsuit), (3,\clubsuit), (3,\spadesuit), (3,\diamondsuit)\}$

For finite sets $A$ and $B$, $|A \times B| = |A|\cdot |B|$

Loosely speaking, a function is a rule that, for each element in some set $D$ of possible inputs, assigns a possible output.

The output is said to be the

**image**of the input under the function and the input is a**pre-image**of the output.The set D of possible inputs is called the domain of the function.

Formally, a function is a (possibly infinite) set of pairs $(a, b)$ no two of which share the same first entry.

**Example**

The doubling function with domain $\{1, 2, 3, \dots\}$ is $\{(1, 2), (2, 4), (3, 6), (4, 8), \dots\}$

**Example**

The multiplication function with domain $\{1, 2, 3, \dots\} × \{1, 2, 3, \dots\}$ looks something like this: $\{((1, 1), 1), ((1, 2), 2), \dots, ((2, 1), 2), ((2, 2), 4), ((2, 3), 6), \dots\}$

For a function named $f$, the image of $q$ under $f$ is denoted by $f (q)$. If $r = f (q)$, we say that $q$ maps to $r$ under $f$.

The notation for “$q$ maps to $r$” is $q \rightarrow r$.

It is convenient when specifying a function to specify a co-domain for the function.

The co-domain is a set from which the function’s output values are chosen. Note that one has some leeway in choosing the co-domain since not all of its members need be outputs.

The notation $f : D \rightarrow F$ means that $f$ is a function whose domain is the set $D$ and whose co-domain (the set of possible outputs) is the set $F$ . (More briefly: "a function from $D$ to $F$”, or “a function that maps $D$ to $F$”).

** Example **

Caesar’s cryptosystem replace each letter with the one three steps forward in the alphabet (wrapping around for $X$,$Y$, and $Z$). Thus the plaintext $MATRIX$ would be encrypted as the cyphertext $PDWULA$. The function that maps each plaintext letter to its cyphertext replacement could be written as $A \rightarrow D, B \rightarrow E, C \rightarrow F, D \rightarrow G, \dots, W \rightarrow Z, X \rightarrow A, Y \rightarrow B, Z \rightarrow C$ This function’s domain and co-domain are both the alphabet $\{A, B, \dots, Z\}$.

** Example **

The $cosine$ function, $\cos$, maps from the set of real numbers (indicated by $R$) to the set of real numbers. We would therefore write $\cos : R \rightarrow R$. Of course, the outputs of the cos function do not include all real numbers, only those between -1 and 1.

The

**image**of a function $f$ is the set of images of all domain elements. That is, the image of $f$ is the set of elements of the co-domain that actually occur as outputs.For example, the image of Caesar’s encryption function is the entire alphabet, and the image of the cosine function is the set of numbers between -1 and 1.

- A
**procedure**is a precise description of a computation; it accepts inputs (called arguments) and produces an output (called the return value).

```
'''
compute the multiplication between two integers p and q
'''
def mult(p,q):
return p*q
```

- For the sake of cleaness, we avoid the common practice of referring to procedures as “(computer) functions”.

Sometimes the arguments of procedures are sets or other container with several elements, and we want to compute and return somethig for (a combination of) all elements.

We can do the entire computation for all elements and return a container, but this may be time and memory consuming

Python provides the generator/iterator concept to help in this case

Let's say we want to implement a procedure to compute the cross product of two sets.

Without a generator we have:

In [ ]:

```
def cross_prod_no_generator(A,B):
result = []
for element_a in A:
for element_b in B:
result.append((element_a,element_b))
return result
```

Let's say we want to implement a procedure to compute the cross product of two sets.

With a generator we have:

In [12]:

```
def cross_prod(A, B):
for element_a in A:
for element_b in B:
yield (element_a,element_b)
```

Observe that, instead of the `return`

statement, we used the `yield`

statement

`return`

is used when we want a normal procedure, and the "status" of the call is destroied after the`return`

is executed`yield`

, on the other hand, produces a generator. The "status" of the call is frozen and return the next result in the next call

Each time we call the iterator, it will return a new result until all results are produced

In [14]:

```
A = set(['1','2','3'])
B = set(["heart", "club", "spade", "diadmond"])
for pair_element in cross_prod(A,B):
print pair_element
```

We can also use without a loop, and explicit call the next element

In [15]:

```
A = set(['1','2','3'])
B = set(["heart", "club", "spade", "diadmond"])
cross_product_iterator = cross_prod(A,B)
print(cross_product_iterator.next())
print(cross_product_iterator.next())
print(cross_product_iterator.next())
```

('1', 'club') ('1', 'heart') ('1', 'diadmond')

As only one item is generated/consumed in each iteration, it is not necessary to allocate memory for all retornable items.

In a loop, all elements are consumed.

Outside the loop, elements could be consumed only once. Calling next after all elements where consumed will raise an error.

If not all elements are consumed, the frozen state is kept in memory (eventually, it may be garbage collected)

- A
**computational problem**is an input-output specification that a procedure might be required to satisfy.

** Example **

input:a pair $(p, q)$ of integers greater than 1output:the product $p\times q$

** Example **

input:an integer $m$ greater than 1output:a pair $(p, q)$ of integers whose product is $m$

Unlike a procedure, a function or computational problem does not give us any idea how to compute the output from the input.

For integer multiplication, we can use:

- ordinary long multiplication (you learned this in elementary school)
- Karatsuba algorithm (used by Python for long-integer multiplication)
- the faster Schönhage-Strassen algorithm (which uses the Fast Fourier Transform)
- and the even faster Fürer algorithm, discovered in 2007.

There are often many different procedures that satisfy the same input-output specification or that implement the same function.

The

`python muly`

procedure definie before can be used for multiplying negative integers and numbers that are not integers.

A computational problem need not specify a unique output for every input.

Unlike a function, a computational problem need not specify a unique output for every input; for Example 0.3.8 (Page 4), if the input is 12, the output could be (2, 6) or (3, 4) or (4, 3) or (6, 2).

Although function and computational problem are defined differently, they are related

For each function $f$, there is a corresponding computational problem: Given an element a of $f$’s domain, compute $f(a)$, the image of $a$ under $f$.

This is known as the "forward problem"

However, there is another computational problem associated with a function: Given an element $r$ of the co-domain of the function, compute any pre-image (or report that none exists).

This is known as the "backward problem"

Suppose there is a procedure $P(x)$ for computing the image under $f$ of any element $x$ of the domain.

An obvious procedure for computing the pre-image of $r$ is to iterate through each of the domain elements $q$, and, one by one, apply the procedure $P(x) on $q$ to see if the output matches $r$.

This approach seems very slow - even if the domain is finite, it might be so large that the time required for solving the pre-image problem would be much more than that for $P(x)$ - and yet there is no better approach that works for all functions.

Sometimes, this behaviour is desirable: think about a good cryptographic function. One would like to compute the cyphred message easyly, but the decyphing process should be as hard as possible, unless we know the key

However, in some cases, we want to compute both problems efficiently. To this end, we must consider the pre-image problem not for arbitrary functions but for specific families of functions (as we have just seen, there is no general efficient procedure for all functions).

Furthermore, the family of functions must have applicability. If the family of functions is too restrictive, the existence of fast procedures for solving the pre-image problem will have no relevance to real-world problems.

Let us take the perspective of a lieutenant of Caesar who has received a cyphertext: PDWULA.

To obtain the plaintext, the lieutenant must find for each letter in the cyphertext the letter that maps to it under the encryption function.

That is, he must find the letter that maps to $P$ (namely $M$), the letter that maps to $D$ (namely $A$), and so on.

In doing so, he can be seen to be applying another function to each of the letters of the cyphertext, specifically the function that reverses the effect of the encryption function. This function is said to be the **functional inverse** of the encryption function.

The operation functional composition combines two functions to get a new function.

Given two functions $f : A \rightarrow B$ and $g : B \rightarrow C$, the function $g\circ f$, called the **composition** of $g$ and $f$, is a function whose domain is $A$ and its co-domain is $C$.

It is defined by the rule $$(g \circ f)(x) = g(f(x))$$

For any domain $D$, there is a function $id_D : D \rightarrow D$ called the **identity function** for $D$, defined by

$id_D(d) = d$

$\forall d \in D$.

We say that functions $f$ and $g$ are functional inverses of each other if

- $f \circ g$ is defined and is the
**identity function**on the domain of $g$, and - $g \circ f$ is defined and is the
**identity function**on the domain of $f$.

Not every function has an inverse. A function that has an inverse is said to be **invertible**.

Consider a function $f : D \rightarrow F$.

We say that $f$ is

**one-to-one**if for $\forall x,y \in D,f(x)=f(y)$ implies $x=y$ .We say that $f$ is

**onto**if, $\forall y,z \in F$, there exists $x \in D$ such that $f(x) = z$.

Suppose $f$ is not one-to-one, and let $x_1$ and $x_2$ be distinct elements of the domain such that $f(x_1) = f(x_2)$.

Let $y = f(x_1)$. Assume for a contradiction that $f$ is invertible.

The definition of inverse implies that $f^{−1}(y) = x1$ and also $f^{−1}(y) = x2$, but both cannot be true.

Suppose $f$ is not onto, and let $\hat{y}$ be an element of the co-domain such that $\hat{y}$ is not the image of any domain element.

Assume for a contradiction that $f$ is invertible.

Then $\hat{y}$ has an image $\hat{x}$ under $f^{−1}$. The definition of inverse implies that $f(\hat{x}) = \hat{y}$, a contradiction.

**Proof:**

For each element $\hat{y}$ of the co-domain of $f$, since $f$ is onto, $f$’s domain contains some element $\hat{x}$ for which $f(\hat{x}) = \hat{y}$; we define $g(\hat{y}) = \hat{x}$.

$g \circ f$ must bee the identity function on $f$’s domain. Because $f$ is one-to-one, $\hat{x}$ is the only element of $f$’s domain whose image under $f$ is $\hat{y}$, so $g(\hat{y}) = \hat{x}$. This shows $g \circ f$ is the identity function.

Furthermore $f \circ g$ is the identity function on $g$’s domain. Let $\hat{y}$ be any element of $g$’s domain. By the definition of $g$, $f(g(\hat{y})) = \hat{y}$.

**Proof:**

Let $f : U \rightarrow V$ be an invertible function. Suppose that $g_1$ and $g_2$ are inverses of $f$.

We show that, for every element $v \in V$ , $g_1(v) = g_2(v)$, so $g_1$ and $g_2$ are the same function.

Let $v \in V$ be any element of the co-domain of $f$. Since $f$ is onto, there is some element $u \in U$ such that $v = f(u)$. By definition of inverse, $g_1(v) = u$ and $g_2(v) = u$. Thus $g_1(v) = g_2(v)$.

If $f$ and $g$ are invertible functions and $f \circ g$ exists then $f \circ g$ is invertible and $(f \circ g)^{-1} = g^{-1} \circ f^{-1}$.