...

Sunday, May 26, 2013

Computer Science 10

A Generator in Python is yield. It is similar to return, but instead of just stopping, it actually remembers the point in the body where it was, plus all local variables. A Generator is a function, or method, that we can use as an Iterator. For every iteration, it goes back and continues the execution until another yield is met. It does so until it reaches the end of the function.

The benefit of using a generator function as opposed to returning a whole is that it saves on memory requirements of having to store an entire list in memory. It also cuts down response time. Let us look at the following example:

def generateList(n):
    ans=[]
    for i in range(n):
        ans.append(n*2)
    return ans

for i in generateList(1000000000):
    print(i)


It actually does the same thing as:
def generatorYield(n):
    for i in range(n):
        yield i*2

for i in generatorYield(1000000000):
    print(i)


However, the yield function uses much less memory as it does not have to store a full list in memory. It also has faster response time because it reacts upon the first yield, as compared to waiting for the entire list to be populated.

This is not a perfect example, but it does illustrate some little principles and basic implementation of the yield function.

For much of the history of Science, people focus on trying to find Analytic Models. An Analytic Model allows prediction of behavior through a function by having a set of initial conditions and parameters. This led to things like the calculus and the probability theories.

However, Analytic Models do not work all the time. In the 20th century, people started turning to Simulation Models. When it comes to things like systems which are not mathematically tractable, like weather forecasting, instead of spending a lot of time to build accurate mathematical models, it's easier to build successively refining series of simulations. It is often easier to extract useful intermediate results than it is to build a detailed Analytic Model. In the past, people had to build really complex mechanical clockwork models to work with simulations. The advent of computers gave rise to the adoption of Simulation Models.

A Simulation Model has the following properties:
1) It gives useful information about behavior of a system, as opposed to an Analytic Model which gives the exact outcome of the situation.
2) It gives us an approximation to reality.
3) Simulation models are descriptive, instead of prescriptive. A prescriptive model gives the exact same output for every input. In simulations, given a set of conditions, you may get different outcomes each time you run it because it allows factoring of things that cannot be mathematically modeled.

We may not get the same answer every time but if we do the simulation enough times we're going to get a good sense of what is going to happen.

In 1827, Robert Brown observed that a pollen particle suspended in water would float around randomly under microscopic conditions. They call it the Brownian Motion. The first clear mathematical model didn't really come about until in the 1900s. In 1905, Albert Einstein came up with the model that confirmed the existence of atoms.

Brownian Motion in Wikipedia

Brownian Motion is an example of a Random Walk. A Random Walk simulation is extremely useful when trying to model particles that are going to move in a random direction and velocity at each time-step. We can use it to model particles in fluid. This is also extremely useful in understanding biological processes. It is also useful in modeling stock markets.

Suppose that we have a drunken person in the middle of a field, and each second he can take one step in any cardinal direction (NSEW). After a thousand seconds, where would he be?

To model this, we're going to divide this field into x and y axes. Let's assume that the person starts of at (0,0), which is the origin.

This is the probability table mapping steps, distance and probability:

At the first step:
There is a probability of 1 that he can be 1 step away.
He would be about 1 step away.

At the second step:
There is a probability of 1/4 that he can be back at 0,
a probability of 2/4 that he can be sqrt(2) steps away,
and a probability of 1/4 that he can be 2 steps away.
He would be about sqrt(2)/2+0.5=1.2 steps away

At the third step:
There is a probability of 9/16 to be 1 step away,
3/8 to be sqrt(5) steps away,1/16 probability to be 3 steps away,
He would be about 9/16+3/16+(3sqrt(5)/8)=1.5 steps away.

As the number of steps increase, it becomes extremely difficult to calculate where he will end up at.

From here, it seems like the more steps the person takes, the further away he will be.

To simulate this, we need to model a person, a field, and a location to help with calculations.
The field object can be a collection of drunks and their location.

This is how the entire code looks like:

import math
import random
import copy

class Location(object):
    def __init__(self,x,y):
        self.x=float(x)
        self.y=float(y)

    def set_location(self,xy):
        self.x=float(xy[0])
        self.y=float(xy[1])

    def move(self,xy):
        self.x+=float(xy[0])
        self.y+=float(xy[1])
       
    def get_location(self):
        return (self.x,self.y)

    def distance_from(self,other):
        return math.sqrt((other.x-self.x)**2+(other.y-self.y)**2)

    def __str__(self):
        return str((self.x,self.y))

class Field(object):
    def __init__(self):
        self.field={}

    def add_person(self,person,location):
        if person not in self.field.keys():
            self.field[person]=copy.deepcopy(location)

    def rem_person(self,person):
        if person in self.field.keys():
            self.field.pop(person)

    def move_person(self,person):
        if person in self.field.keys():
            self.field[person].move(person.step())

    def where_is(self,person):
        if person in self.field.keys():
            return self.field[person]
           
class Person(object):
    def __init__(self,name):
        self.name=name

    def step(self):
        return random.choice(((0,1),(1,0),(0,-1),(-1,0)))

    def __str__(self):
        return self.name


We can then implement a test like this:

origin=Location(0,0)
kelvin=Person("Kelvin")
field=Field()
field.add_person(kelvin,origin)
tries=1000
steps=1000
average_distance=0.0
maximum_distance=0.0
minimum_distance=float(steps)
for i in range(tries):
    for j in range(steps):
        field.move_person(kelvin)
    distance_moved=origin.distance_from(field.where_is(kelvin))
    average_distance+=distance_moved/float(tries)
    maximum_distance=max(maximum_distance,distance_moved)
    minimum_distance=min(minimum_distance,distance_moved)
    origin.set_location(field.where_is(kelvin).get_location())
print("Average Travelled:",average_distance)
print("Maximum Travelled:",maximum_distance)
print("Minimum Travelled:",minimum_distance)


As you can see, the results vary quite a lot and in order to get a proper reading we have to set tries to a really high value.

How do we think about the results of the programs when the programs themselves are stochastic? Almost everything in the real world is stochastic, and we must be really careful in accepting results.  Let's think about the role of randomness in computation. The world is extremely non-deterministic. That means that we can only say that "something has a very high probability of occurring", and there is no such thing as "something will certainly occur".

Causal Non-Determinism is the believe that not every event is caused by previous events. Einstein and Schrodinger strongly disagreed with this. Instead, they strongly believed in Predictive Non-Determinism. The concept is that because of our inability to make accurate measurements about the physical world, it makes it impossible to make precise predictions about the future. Things are not unpredictable, it's just because we just don't / can't know enough to be able to predict it.

Stochastic Process is a process whose next state is based on the previous state and at least one random element.

Random in Python can be implemented via:
1) random.choice() - Helps you choose randomly based on what's passed to the function.
2) random.random() - Generates a number that is 0.0 < x <= 1.0.

A process that returns a value between 1 to 6 (a dice roller) is not considered Stochastic as the value returned is independent upon the previous values. In 10 throws, the probability of getting all 1's is:
1/(610)

In fact, the probability of getting all 0's is also 1/(610), as is any other combination (e.g. 1, 3, 2, 5, 3, 4, 3, 2, 4, 2).

Looking from the probability perspective, for such a process, we should be asking: What fraction of the possible results have the property we are testing for (e.g. have exactly four 1's, etc).

From the other side, the probability of us getting NOT all 1's is:
1-1/(610)

This "1 minus" trick will come in very handy.

A lot of times when we're doing computing, we use print statements to present our results. When it comes to decimal places or when dealing with large amounts of data, print statements can be quite messy and difficult to read.

In Python, there is a plotting package known as pylab. It provides quite a lot of facilities from Matlab. We can use pylab for plotting graphics very easily.

However, pylab is not part of the standard Python library, and therefore we need to install it.

Head on over here and install one of the binaries compatible with your computer.

The next article will discuss more about pylab.

Computer Science 09

Objection-Oriented Programming (OOP) is actually not a new notion. However, OOP didn't take off until Java, which was the first popular OOP language. The basis of OOP is the abstraction of data types. The idea is that we can extend our programming languages through user-defined types. These types can then be used just as though it's a built-in type.

The reason why it's called "Abstract" data type is because for each type there exists an Interface for which we can Implement. Interfaces explain what the methods do (but not how they do it). The Specification of an object tells us what the object does.

To define a Class, we use:

class class_name(object):
    #Contents of the Class


We create the Instantiator as such:

def __init__(self):
    #Contents of the instantiator


Every time the object is created, the __init__ method will be executed first. We usually introduce attributes of objects through its Instantiator like this:

def __init__(self):
    self.a=10
    self.L=[]


We then create the Object in the main code as such:

object_name = class_name()

Notice that there is a notion of self in the implementation. In __init__, self is a formal parameter, but when creating the object, we did not actually pass any parameter in.

This is because for every method in an object, the reference pointer of itself is automatically passed into the first parameter. Therefore, you'll need to include self (or anything you name) in every method's first formal parameter. The pointer in object_name and the self object inside __init__ is actually the same.

Integer a and list L are attributes of the instance of the class_name object, which is equivalent to the self object.

 To see the pointer of the object, we can print object_name directly, and this is what we'll see:

<__main__.class_name object at 0x0000000002C27B38>

This is also exactly what we'll see if we printed self inside the methods of the object.

Let's say that if we print object_name, we want to see a custom string representation of it, instead of its contents. We can do this by creating  the __str__ method:

def __str__(self):
    return "The value of a is: "+str(self.a)+"\nThe contents of L is "+str(self.L)


The print function automatically calls the __str__ method when attempting to print object_name.

If we want to let the value of a represent object_name when type-casted to an integer, we can simply create a method as such:

def __int__(self):
    return self.a


We can test by using:

print(10+int(object_name))

We can actually use class_name.a to refer to the "a" attribute in the object. However, this is not recommended. This is because "a" is a variable used in the implementation of the object. It is not in the specification that "a" even exists.

Take for instance, you have a method get_number() which returns the value of a certain number stored in variable "a". In a future version, let's say that the variable "a" doesn't exist anymore, because the implementation is different and uses perhaps variable "x" instead, to store that number.

If our programs used to reference "a", you would get a whole load of errors. However, if it used get_number(), it doesn't matter whether "a" or "x" is used, because it'll depend what's returned by the get_number() method. We still get the number we want.

The concept is known as Data Hiding, in which we abstract data into Getter and Setter methods. It is important that we do not directly access instance (created each instance of the class) and class variables (shared among all instances of the class), due to object reference problems and possibility of changes in the future.

Take for instance, we want to write a program to keep track of staff information across various departments of a company. To do this, we need to start thinking about the level of abstraction we want to have. Before we write the code in detail, we want to think about the types that would make it easier to write the code.

When dealing with stuff like these, in the end we don't want to deal with lists and dicts and floats. Instead, we want to keep track of things like department, worker, manager, etc.

We can then set up a system of hierarchy of these. We first identify the similarities between the objects. For example, worker and manager are all employees.  All employees have attributes like name, birth date, etc. We can set up the employees class as shown:

class Employee(object):
    import datetime
   
    def __init__(self,name):
        nameSplit=name.split()
        self.firstName=nameSplit[0]
        try:
            self.lastName=nameSplit[1]
        except:
            self.lastName=""
        self.birthdate=None

    def set_firstName(self,firstName):
        self.firstName=firstName

    def get_firstName(self):
        return self.firstName

    def set_lastName(self,lastName):
        self.lastName=lastName

    def get_lastName(self):
        return self.lastName

    def set_birthdate(self,date):
        self.birthdate=date

    def get_birthdate(self):
        return self.birthdate.isoformat()

    def get_age(self):
        return round((datetime.date.today()-self.birthdate).days/365.0,2)

    def __lt__(self,other):
        #If they have the same first names
        if self.firstName==other.firstName:
            #Compare their last names
            return self.lastName        else:
            #Else compare by their first names
            return self.firstName
 

    def __str__(self):
        return self.firstName+" "+self.lastName


Notice that there's a __lt__ method in there. The __lt__ method is what's called when you compare employee with something else. For example, if you have two employees, john and william, typing john < william would invoke __lt__(john,william).When sorting lists of employees, the __lt__ automatically gets used when comparing objects. Observe the following implementation in the main code:

listEmployees=[]
listEmployees.append(Employee("John Smith"))
listEmployees.append(Employee("William Smith"))
listEmployees.append(Employee("John Doe"))
listEmployees.append(Employee("William Wise"))

print("Before sort:")
for employeeInList in listEmployees:
    print(employeeInList)

print("After sort:")
listEmployees.sort()

for employeeInList in listEmployees:
    print(employeeInList)


It will be sorted according to first name, then last name, as implemented in the __lt__ method.

We can also play with the birthdate and age as shown:

def isOlder(employee1,employee2):
    if employee1.get_age()<employee2.get_age():
print(employee2,"is older")
    else:
          print(employee1,"is older")

import datetime

kelvin=Employee("Kelvin Ang")
kelvin.set_birthdate(datetime.date(1990,6,21))

ling=Employee("Lingling Pang")
ling.set_birthdate(datetime.date(1993,10,19))

print(kelvin,"is",kelvin.get_age())
print(ling,"is",ling.get_age())

isOlder(kelvin,ling)


As we can see, employee is a subclass of object, that's why we can do so much stuff with it. A worker is a subclass of employee. For example, in the company, every worker has a workerID. We can then do:

from Employee import Employee

class Worker(Employee):
    workerID=0
    def __init__(self,name):
        Employee.__init__(self,name)
        self.workerID=Worker.workerID
        Worker.workerID+=1

    def __str__(self):
        return Employee.__str__(self)+", Worker ID "+str(self.workerID)

    def __lt__(self,other):
        return self.workerID


We can then create a worker as such:

kelvin = Worker("Kelvin Ang")
print(kelvin)


(Note that I saved the Employee class in a file called Employee.py. I also didn't create the Getter and Setter methods for workerID, which should be implemented but had been left out in the interest of simplicity)

Notice that even though we inherited all methods from Employee class, we defined the __init__ method again. This is known as Method Overriding. When we Override a method, and we want its Superclass's __init__ to be run as well, we specify it in the Overriding Method.

Similarly, we Override the __str__ method to now include the Worker ID. We append the old output with our new information. If we didn't override __str__, it would use the superclass's __str__ method.

Notice that there's a workerID=0 statement at the top. This defines a Class Variable. A Class Variable is one that is shared among all objects of the same class. In this case, the first object instantiated gets assigned ID 0, and the next one gets assigned ID 1 and so on.

We can then try it out like this:

kelvin = Worker("Kelvin Ang")
ling = Worker("Lingling Pang")
print(kelvin)
print(ling)
print(kelvin<ling)


Note that we're now comparing the ID when we do kelvin<ling. Suppose that we want to compare names instead. We can use:

print(Employee.__lt__(kelvin,ling))

We, however, cannot use Worker.__lt__() for two employees, because it makes use of workerID which employees do not have.

We can also create another Class that is exactly the same as Worker. Like this:

class Worker2(Worker):
    pass


It may seem as though it's for nothing, but it allows us to do checking like this:

if type(kelvin)=='Worker'
    #Some code here
elif type(kelvin)=='Worker2'
    #Some other code here

Thursday, May 23, 2013

Computer Science 08

Hashing is how the 'dict' is implemented in Python. It trades off space (memory) for time (CPU utilization).  Take for example, we have a set of integers. We build the set of integers, and detect whether or not a particular integer is in the set very quickly.

We are first going to take an integer i, and hash it. A hashing function is one that converts integer i to some other integer in a range between 0 to k. We will then use this hash integer to index into a list of buckets (a bucket is actually a list).

We know that we can find the ith element of a list in constant time, so to find an integer in a dictionary, we hash the integer, then look directly into the index to look for it. The hash function has the following issues:
1) Many-to-One - There can be an infinite number of different inputs that may hash to the same value, resulting in Collisions.
2) Collision - When two different values hash to the same bucket, we have a collision.
3) Linear Rehashing is used to take care of Collision, which keeps a list of things with the same hash value in buckets.

A good hash function has the following properties:
1) Wide Dispersal, One-to-One - This requires a large hash size
2) Avalanche Effect - A small change in input should result in a large change in hash

If we analyze the complexity of a Hash Lookup, we know that the hashing is of constant complexity, while the lookup is actually based on the number of things stored in the Bucket. If the particular item you're looking for has a great number of other things that collided with it, then it will take a longer time to find it. If all buckets are of length 1 (no collision), then a hash lookup is actually a constant complexity operation. The more buckets there are, the less chance is it for there to be a collision. This is a type of trade off between space (take more space) and time (use less time).

A practical hash table uses a large hash such that there is really low collision.

Python's dictionary implements an expanding hash system that expands accordingly to the number of collisions detected.

The reason why the keys in a 'dict' has to be immutable is because if a list is used as a key, and it mutates somewhere along the way, the hash would come out different and we would not be able to find the object it's supposed to be the key for.

Exceptions are everywhere in programming languages. What an exception is an error that is encountered during execution. Examples are:
1) IndexError - When you try to access an invalid index
2) NameError - When you try to access an invalid variable
3) TypeError - When an unexpected type is encountered

These Exceptions are also called Unhandled Exceptions. In Unhandled Exceptions, the program crashes. Sometimes we write programs that intentionally raise exceptions, and we write code to catch these exceptions and handle them. To handle these exceptions, we use the try and except blocks:

try:
    #Code that potentially can raise exception here
except:
    #Things that run when an exception is caught

Let us look at the following code:
def readVal(valType, msgRequest, msgError):
    tries=0
    while tries<4

        val = input(msgRequest)
        try:
            val=valType(val)
            return val
        except:
            print(msgError)
            tries+=1
    raise TypeError('Number of tries exceeded')

print(readVal(int,"Please enter an int:","You did not enter an int."))


We can actually catch specific exceptions raised by an exception like this:

try:
    print(readVal(int,"Please enter an int:","You did not enter an int."))
except TypeError as tE:
    print(tE)


In this case, we can see that the tE variable stores the TypeError Exception. We can directly print the tE variable to display its message. If we do not specify the Exception type then all Exceptions are caught by the statement. We can also specify multiple except statements to represent the different cases of Exceptions.

The notion of the Class is what distinguishes a modern programming language from a primitive one.

A Module is a collection of related functions. An example of a Module is math, which we import by using "import math". From there, we can then use the log function stored in the math module using math.log(). The Dot-Notation is used to Disambiguate names. For example, item1.x and item2.x are different x's.

A Class is similar to a Module in that it is a collection of functions, but it also contains data on which its functions operate on. It allows us to pass a whole group of functions and data around. This is the basis of what we call Object-Oriented Programming. An example of a Class is the list, which contains functions like append(). The data and function associated with the object is called an object's attributes. The data, function, and everything, including the class itself, is an object.

When we deal with objects, there is a Message Passing Metaphor. The basic metaphor is that when we write something like L.append(e), we are actually passing the message "append(e)" into the L object, and it is executed within its own scope. A Method, such as append(), is a Function associated with an Object. A Class is a collection of objects with identical characteristics that forms a type. Lists and Dictionaries are actually built-in classes.

Computer Science 07

The oldest kind of list in the world is the Linked List. In a Linked List, each element has a pointer to the next element in the list. In this case, finding the ith element is O(i) as it has to traverse to the next item to find the next, and so on. In a system like this, Binary Search doesn't work very well.

In Python, however, lists are differently implemented in a system of pointers. Let us suppose that a pointer in a list occupies 4 units of memory. The definition of unit is arbitrary, as is the number 4 arbitrarily chosen. To get to the ith object of a list of 'int's, we access the object  pointed by the Pointer stored in:

start+(i*n)

We know exactly where to get the pointer to get to the actual object, instead of having to traverse each of them individually till we find it. In Python's implementation, and in fact all OOP (Object-Oriented Programming) languages, the lists require constant time to reach, which is defined as Q(1).

This implementation is called indirection. It is known that most problems we look at can be solved by adding layers of indirection. The only kind of problems that cannot be solved with indirection is when the levels of indirection is too great that it uses up the available memory.

Binary Search is a very powerful tool, but it works only when the assumption that the list is sorted is true. In order to perform Binary Search, the process is as shown:
1) Sort L
2) Perform Binary Search on L

We know that Binary Search is definitely O(log len(L)). In order for the above to make sense, we have to make sure that Sort L takes less than O(len(L)). However, there currently exists no sorting algorithm that could sort an element in sub-linear time. Therefore, if a list is unsorted, we will always take at least O(len(L))+O(log len(L)) time.

However, there are still benefits to using Binary Search after sorting an unsorted list, as opposed to Linear Search. We are interested in what's called the Anortized Complexity. The idea here is that if we can sort the list once, we can perform the search many times. Therefore, if the number of searches performed on the list is a great number, the Order of Growth is:

lim k->inf (O(len(L))+k*O(log len(L))/k)
= k*O(log len(L))/k
= O(log len(L))


Obama and Bubble Sort

A very simple sort is the Selection Sort. It is not a very good way to sort, but it is a useful method. In Selection Sort, like most other algorithms, it depends upon maintaining an Invariant. An Invariant is something that is invariantly true. The invariant we are going to maintain is a pointer into the list, which divides the list into a Prefix and a Suffix. The requirement is that the Prefix is assumed to be always sorted. This requirement is known as the Invariant. We'll start with an empty Prefix. Each step through the Algorithm, we decrease the size of the Suffix by one element, while increasing the Prefix by one element. We'll be done when the size of Suffix is 0.

In Selection Sort, suppose that we have the following objects in the list:
| 3 5 6 7 8 5 4 3 2

The | is the invariant pointer. The element looks through everything at the right side (the Suffix) and swaps the smallest element with the next element, then increments the invariant pointer:

2 | 5 6 7 8 5 4 3 3

It then looks for the next smallest element and swaps it with the next element:

2 3 | 6 7 8 5 4 5 3

This will continue on as shown:

2 3 3 | 7 8 5 4 5 6
2 3 3 4 | 8 5 7 5 6
2 3 3 4 5 | 8 7 5 6
2 3 3 4 5 5 | 7 8 6
2 3 3 4 5 5 6 | 8 7
2 3 3 4 5 5 6 7 | 8
2 3 3 4 5 5 6 7 8 |

The following code implements a simple Selection Sort:

def selectionSort(L):
    invariant=0
    minValue=L[invariant]
    minPointer=invariant
    for i in range(len(L)):
        for i in range(invariant,len(L)):
            if (L[i] < minValue):
                minValue=L[i]
                minPointer=i
        temp=L[invariant]
        L[invariant]=minValue
        L[minPointer]=temp
        invariant+=1
        if (invariant

            minValue=L[invariant]
            minPointer=invariant
    return L

L=[7,4,3,7,3,5,2,6,5,3,1]
print(selectionSort(L))


In this case, here's the breakdown of the list as it loops through the steps:
[1, 4, 3, 7, 3, 5, 2, 6, 5, 3, 7]
[1, 2, 3, 7, 3, 5, 4, 6, 5, 3, 7]
[1, 2, 3, 7, 3, 5, 4, 6, 5, 3, 7]
[1, 2, 3, 3, 7, 5, 4, 6, 5, 3, 7]
[1, 2, 3, 3, 3, 5, 4, 6, 5, 7, 7]
[1, 2, 3, 3, 3, 4, 5, 6, 5, 7, 7]
[1, 2, 3, 3, 3, 4, 5, 6, 5, 7, 7]
[1, 2, 3, 3, 3, 4, 5, 5, 6, 7, 7]
[1, 2, 3, 3, 3, 4, 5, 5, 6, 7, 7]
[1, 2, 3, 3, 3, 4, 5, 5, 6, 7, 7]

For a problem of length 11, we actually went through the middle loop 66 times. If the list is length 12, it would go through 78 times. The order of growth for this problem is:
O((n(n+1)/2))

If we take away any multiplicative and additive constants, we get:
O(n2)

For some time, people are actually unsure if we could sort this list better. This is a form of the Divide and Conquer algorithm. For a Divide and Conquer algorithm, we:
1) Choose a Threshold Input Size, which is n0 which is the smallest problem
2) How much are we going to divide the problem
3) Combine the sub-solutions back into the main solution

John Von Neuman observed that given two sorted list, we can merge them quickly. Here's an example of two lists:
1=[1,3,5,6,9]
2=[1,7,8,10,11,12]
3=[]

We first compare the first elements of each side. We would take smaller (or first, if both are the same) and put it to the merged list like this:
1=[3,5,6,9]
2=[1,7,8,10,11,12]
3=[1]

The process goes on like this:

1=[3,5,6,9]
2=[7,8,10,11,12]
3=[1,1]

1=[5,6,9]
2=[7,8,10,11,12]
3=[1,1,3]

1=[6,9]
2=[7,8,10,11,12]
3=[1,1,3,5]

1=[9]
2=[7,8,10,11,12]
3=[1,1,3,5,6]

1=[9]
2=[8,10,11,12]
3=[1,1,3,5,6,7]

1=[9]
2=[10,11,12]
3=[1,1,3,5,7,8]

1=[]
2=[10,11,12]
3=[1,1,3,5,7,8]

1=[]
2=[]
3=[1,1,3,5,7,8,10,11,12]

In this case, the maximum number of copies is Linear, while the maximum number of comparisons we are going to make is dependent on the total lengths of the lists, which is also Linear. We write this by:

O(len(L))+O(len(L))

Removing all multiplicative constant, this is the time taken to merge two lists:

O(len(L))

We can merge in Linear time.

To implement mergesort, we break up the list into a bunch of lists of length one, then we start merging them into sorted lists.

Since each merge is O(n) where n is the length of the list, and we call merge log2n depths, we end up with:

O(n log2n)

Mergesort is Logarithmic, and therefore a little slower than Linear, but a lot faster than Quadratic.

Here is the source code for a merge sort function:

def merge(L1,L2,compare):
    #Initialize an empty answer list
    ans=[]
    #Initialize positional counters
    i,j=0,0
    #As long as there are things to compare (i.e. It will run till the shorter list runs out)
    while i<len(L1) and j<len(L2):
        #Append the smaller of the two to the answer list
        if compare(L1[i],L2[j]):
            ans.append(L1[i])
            i+=1
        else:
            ans.append(L2[j])
            j+=1
    #Append the remaining list to the answer list
    while i<len(L1):
        ans.append(L1[i])
        i+=1
    while j<len(L2):
        ans.append(L2[j])
        j+=1
    #Return the list
    return ans

def mergeSort(L,compare):
    #If list is length of 1, return it
    if len(L)<2:
        return L
    else:
        #If list is longer than 1
        #Find the middle of the list
        middle=int(len(L)/2)
        #Attempt to mergeSort it further
        L1=mergeSort(L[0:middle],compare)
        L2=mergeSort(L[middle:len(L)],compare)
        #L1 and L2 should be sorted by now
        return merge(L1,L2,compare)

L=[2,4,5,6,8,12,12,12,1,3,5,6,7,13]
print(mergeSort(L,compare=lambda x,y:x<y))


Let us look at the base case for the mergeSort function:
If the list is length of 1, it is considered sorted

Next, we look at the recursive case:
If the list is more than length of 1, break them in half sort them, then merge them and return the answer.

At the point where mergeSort(L) where len(L)==2, it would be broken into L1 and L2 of list size 1 each. Next, it is merged and returned into its previous caller's L1 or L2.

The previous caller would then see sorted size 2 lists of L1 and L2 returned, and it would in turn try to merge them and return them, to its previous caller's L1 or L2.

Next, the previous caller would get sorted size 4 lists in its L1 and L2, and it will attempt to merge them.

Now, you would ask, what happens if the list is not a power of 2? For example, if the list size is 5:
1)The first iteration would call mergeSort(L[0:2]) and mergesort(L[2:5]).
    a)The first level 2 iteration would call mergeSort(L[0]) and mergeSort(L[1])
        i)The first level 3 iteration would return the 1 length list.
        ii)The second level 3 iteration would return the 1 length list.
    a)The first level 2 iteration would merge the two sorted lists and return it.
    b)The second level 2 iteration would call mergeSort(L[0]) and mergeSort(L[1:2])
        i)The first level 3 iteration would return the 1 length list.
        ii)The second level 3 iteration would call mergeSort(L[0]) and mergeSort[L[1]
            (1)The first level 4 iteration would return the 1 length list.
            (2)The second level 4 iteration would return the 1 length list.
        ii)The second level 3 iteration would merge the two lists and return it.
    b)The second level 2 iteration would merge the two sorted lists and return it.
1)The first iteration would merge the two sorted lists and return it.

Computer Science 06

Many people think about how to make their program work. However, to be an excellent Computer Scientist, we must learn to make programs work efficiently.

We learnt that Brute Force can solve almost every problem. However, Brute Force can sometimes be really impractical especially when it comes to the length of time required to complete something.

Efficiency is mostly about algorithm, and not rarely about the tiny details in the code. Creation of algorithm usually has the goal of reducing a problem to a previously solved problem. This process is known as Problem Reduction.

Efficiency is classified in two dimensions:
1) Space
2) Time

When we talk about Space and Time, we can often trade one for the other. What we are talking about is Memory and CPU Utilization. These days, we focus on Time, because Space is usually in abundance to a degree.

Computational Complexity is influenced by:
1) Speed of the machine
2) Cleverness of programming language version
3) Input

To talk about Computational Complexity, we count the number of basic steps. An ideal function to calculate the Computational Complexity would be:
T(x)=n, where n is the number of steps taken for an input x.

A step is an operation that takes constant time and is predictable. An example of a step is an assignment, comparison, array index, and so on.

Our modern computers are Random Access Machines. Instructions are executed Sequentially and we assume Constant Time required to access memory. Primitive Computers used things like tape for memory and this model cannot be mapped on it. Modern Computers somewhat maps well to this model, but if we look in detail, because of Memory Hierarchy, it may take longer for some memory to be accessed. However, generally, the Random Access Model works.

For each algorithm, we have a best case, and a worst case. For example, in Linear Search, where a list is searched from the first to the last element sequentially. The Best Case for Linear Search is when the item is found in the first index. The Worst Case is when the item does not exist in the Linear Search. We also look at the Expected Case which is the average, but it is the hardest to talk about because we need to know exactly how the elements are sorted.

In Computational Complexity Analysis, we usually look at the Worst Case.

The Worst Case provides an upper bound. Once we know the upper bound, we cannot be surprised. Even though it seems pessimistic to think about the Worst Case, the Worst Case happens really often.

Let us analyze a function that computes factorial:

def factorial(n):
    assert n >= 0
    answer = 1
    while n > 1:
        answer*=n
        n-=1
    return answer


The command assert is considered a step, while initialization of answer is another step. When it enters the loop, the conditional test is a step, and the two operations in it are considered two steps. The loop runs for n times. Finally, the return statement is considered a step. We can derive the Computational Complexity as:

2+3*n+1

However, when we have a large input, we can ignore additive constants. We will end up with the following formula:

3*n

When we look at the Computational Complexity, we are concerned with growth with respect to size. We can actually ignore multiplicative constants:

n

This model of Asymptotic Growth shows how the Complexity grows according to the size of inputs.

The Big Oh is a notion to describe Asymptotic Growth. To say that O grows linearly with n, we write:

O(n)

This gives us an upper bound for the Asymptotic Growth of the function. To describe the Asymptotic Growth of a Function, we write:
F(x) is O(x2)

This is formally read:
"Function F grows no faster than the quadratic polynomial x2"

Here are some popular Orders:
1) O(1) - Constant regardless of input
2) O(log n) - Logarithmic growth
3) O(n) - Linear growth
4) O(nlog n) - Loglinear Growth, which occurs surprisingly often
5) O(nc) - Polynomial
6) O(cn) - Exponential

A Bisection Search is an example of a Binary Search. Binary Searches are Logarithmic in growth, while Linear Search are Linear in growth.

A Recursive Factorial function can be like this:

def resursiveFactorial(n):
    assert n >= 0
    if n<=1:
        return n
    else:
        return n*recursiveFactorial(n-1)


If we observe this statement, we can derive that the complexity is:

O(n)

We can see that Recursion or Iterative doesn't change the Order of Complexity.

An example of a quadratic growth function is as such:

def iterativeAdd(n):
    for x in range(n):
        for y in range(n):
            x+=1
    return x


In this case, the growth is according to:

O(n2)

Let us now look at a Logarithmic function:

def sumOfNumbers(x)
    strX = str(int)
    answer=0
    for y in strX:
        answer+=int(y)
    return x


In this case, the Order of Complexity is dependent in the number of digits in n. To represent this, we use:

O(log10n)

In a Binary Search, we can observe that as the possible values for doubles, the number of search increases only by 1. We can represent this by:

O(log2len(L))

Here's a good read pertaining to Orders of Complexity:
When all you have is a hammer, everything looks like a nail

Wednesday, May 22, 2013

Computer Science 05

When we first learn about numbers, we learn about Base 10 (or Radix 10). We know that the decimal system represents values in the numbers 0 to 9. When we look at binary numbers, we see long strings of zeroes and ones. It is in fact, similar to the decimal system, except that it's base 2. Binary numbers take more digits to represent a value. A binary number of n digits can only represent up to a maximum of 2n values. People think in base 10, while computers work in base 2. It is difficult to explain why humans work that way, but computers actually work that way because it is the easiest to build switches that have two stable states: ON (1) or OFF (0).

For whole numbers (i.e. int), it is easy to convert from decimal to binary.

However, if we look at fractional representation of 1/8 in binary. 0.125 in binary looks like this:
0.001

Why is it the case? We first relate this to the decimal system. In the decimal system of base 10,  0.1 actually means 1*10-1 and 0.01 actually means 1*10-2 and so on.

In binary, 0.1 actually means 1*2-1 and 0.01 actually means 1*2-2. In fact, 0.11 in binary actually means 1*2-1+1*2-2. Therefore, 0.001 is actually 1*2-3. Decimal 0.625 in binary is actually 0.101.

The problem comes when we try to convert decimal 0.1 to binary. To get this, we take the process:
1) Is 0b0.1 higher than 0.1? Yes. Set the bit to 0. We have 0.
2) Is 0b0.01 higher than 0.1? Yes. Set the bit to 0. We have 0.
3) Is 0b0.001 higher than 0.1? Yes. Set the bit to 0. We have 0.
4) Is 0b0.0001 higher than 0.1? No. Set the bit to 1. We have 0.0625.
5) Is 0b0.00011 higher than 0.1? No. Set the bit to 1. We have 0.09375.
6) Is 0b0.000111 higher than 0.1? Yes. Set the bit to 0. We have 0.09375.
7) Is 0b0.0001101 higher than 0.1? Yes. Set the bit to 0. We have 0.09375.
8) Is 0b0.00011001 higher than 0.1? No. Set the bit to 1. We have 0.09765625.
9) Is 0b0.000110011 higher than 0.1? No. Set the bit to 1. We have 0.099609375.

As you can see, through this rigorous process, we inch closer and closer to 0.1, but we never ever reach 0.1.

Therefore, we should never use floating point numbers to make equal comparisons.

To get a more accurate representation of a floating point number, we can use:
print(repr(0.1))

Most of the time it is safe to assume that the floating point numbers are exactly what we think they are. However, there are times when floating point arithmetic's weaknesses show:

number=0.0
accurateNumber=10000.0
for x in range(0,1000000):
    number+=0.01
print(number)
print(number*10)
print(number==accurateNumber)


In this case, as you can see, the variable number has actually accumulated quite a bit of inaccuracy problems from having gone through too many floating point operations. It is not equal to accurateNumber as we would have expected it to.

To test for two floating numbers, we should not use the == operator. Instead, we should use:

abs(number-accurateNumber)<=epsilon

You define an epsilon yourself, which can be for instance 0.00001. When you test two floating point numbers there is an extremely high probability that we would get False when we should be getting True.

There is an urban legend on how the process of fixing programs came to be known as "debugging". It is the First actual case of a bug being found by Admiral Grace Hopper in 1947.

The goal of debugging is not to eliminate one bug quickly, but to move towards a bug-free program.

Debugging is a learnt skill and nobody does it well and instinctively. A good programmer is a good debugger, and a good debugger is a person who can think systematically and efficiently about how to move toward a bug-free program. The same skills used to debug programs can be transferred to real-life situations.

Debuggers are designed to help programmers evaluate their program. A lot of experienced programmers swear that the best debugging tool is actually print. A good debugger knows where to search for good places to place print statements.

The question to debugging is not "Why doesn't it produce the output I want it to?", but rather "How could it have produced the result it did?". There is a scientific method to debugging, and this method is based upon studying of available data. These data include:
1) Source Code
2) Output

A hypothesis is formed that is consistent with the data, and we design and run a repeatable experiment. Such an experiment would have the potential to refute the hypothesis, which allows us to form more accurate hypotheses.

It is important for the experiment to be repeatable because some programs use random inputs as well as certain timing factors. We must make sure that we can repeat the experiment with the exact same input.

When we have a bug, we need to find a smaller, simpler inputs in which the program fails. There are three reasons why:
1) It is faster to run
2) It has less variables
3) It is easier to debug

There is a debugging process known as Binary Search. In Binary Search, we split the program into half, and we ask if the program occurred above or below. We need to now find an intermediate value that we can check on, and we print it.

There is a concept known as a Test Driver or a Test Harness. The goal of this is to write a code that isolates certain parts of a program to find out what bugs it has. The Test Harness is usually written at a separate part of the program.

Computer Science 04

Divide and Conquer is the idea of taking a difficult problem and breaking them into simpler pieces. These smaller problems must have two properties:
1) The smaller problems are easier to solve than the original ones.
2) The solutions to these problems can easily can be combined to solve this big problem.

The technique of Recursion allows Divide and Conquer to be applied to problems. In Computer Science, it is:
1) A way of describing / defining problems
2) A way of designing solutions

A Recursive problem has typically two parts:
1) Base Case - It describes a direct answer.
2) Recursive (Inductive) Case - Reduce the problem to a simpler version of the same problem through some simpler operations.

Here are some problems:
1) Calculate bn resursively
-We first start by defining that bn is b*b*b*b*b...*b n times.
-We can then reduce the problem to b*bn-1

We can now say:
bn = b.bn-1 if n!=0
bn = 1 if n==0

We can create a recursive method like this:

def recursivePow(number,exponent)
    if exponent==0
        return 1
    else
        return number*recursivePow(number,exponent-1)
 

We implemented the above statements into code form. Of course, this method only works with positive values of n.

We can also use it to look at the Towers of Hanoi problem.
There is a Temple in Hanoi, and in it there are 3 tall towers. There are 64 golden discs of descending sizes. The rules are like this:
1) You can only move one disc at a time
2) A larger disc cannot cover up a smaller disc

The recursive solution is this:
1) Suppose that you have three stacks. A starting stack, an ending stack, and a spare stack.
2) The solution is to move a stack of n-1 from the starting stack to the spare stack. Then move the last piece to the ending stack. Then move everything from the spare stack to the ending stack.
3) If the stack is of size 1, just move it to the ending stack.

In code, we have:

def hanoi(n, stackFrom, stackTo, stackSpare):
    if n==1:
        print("Move from",stackFrom,"to",stackTo)
    else:
        hanoi(n-1,stackFrom,stackSpare,stackTo)
        hanoi(1,stackFrom,stackTo,stackSpare)
        hanoi(n-1,stackSpare,stackTo,stackFrom)

hanoi(5, "Starting", "Ending", "Spare")


You can test this using:
Tower of Hanoi

Next, here's another problem. Suppose that we are looking for a palindrome. A palindrome is a sentence that reads the same from both sides: "able was I ere I saw elba". To approach a palindrome, here's how we do it:
1) Check if the first and last characters are the same
2) If they are, remove one character and check again till there are no more characters left.
3) If it satisfies all the way, then it's a palindrome.

The Basecases are:
1) If string is length 1, it is a palindrome
2) If string is length 0, it is a palindrome

The Recursive case is:
1) If first and last character is the same, continue checking

Implemented in code, it would be:

def formatSentence(sentence):
    sentence=sentence.lower()
    ans=''
    for c in sentence:
        if c in {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'}:
            ans+=c
    return ans

def recursivePalindrome(sentence):
    if len(sentence)<=1:
        return True
    elif sentence[0]==sentence[len(sentence)-1]:
        return recursivePalindrome(sentence[1:len(sentence)-1])
    else:
        return False
print(recursivePalindrome(formatSentence("Able was I ere I saw Elba")))
print(recursivePalindrome(formatSentence("Are we not drawn onward, we few, drawn onward to new era")))<\code>


As we can see, we can solve problems by breaking it down into simpler problems.

Here's another problem we can look at:
1) Suppose that a pair of rabbits can mate within 1 month
2) A pair of rabbits will be born after gestation period of 1 month
3) The pair is a male and a female
4) How many female rabbits will there be at the end of a year?

We can look at the problem in a timeline:
MonthFemale
01
11
22
33
45
58
613
721

We can now derive the function:
1) f(n)=f(n-1)+f(n-2)

Here are the Base cases:
1) If month is 0, return 1
2) If month is 1, return 1

Here are the Recursive cases:
1) Otherwise, return f(n-1)+f(n-2)

We can now create the code for generating the Fibonacci number:

def fibonacci(n):
    if n<=1:
        return 1
    else:
        return fibonacci(n-1)+fibonacci(n-2)

print(fibonacci(30))


Fibonacci numbers appear in nature all the time. The number of petal on most flowers is almost always a Fibonacci number.

Here is a hair-raising read:
Fibonacci in Nature

You can actually find the golden ratio using:
lim(x->inf) fibonacci(n):fibonacci(n-1)

Computer Science 03

Recall that previously we made programs did approximation. We could print the approximations during the process onto the screen. However, these approximations are not recorded and you cannot really do anything with them.

We have three data structures to collect items in Python:
1) Tuples
2) Lists
3) Dictionaries

Tuples and Lists are Ordered Sequences of Objects. You can refer to the objects by using an index.
Dictionaries on the other hand, are NOT ordered.

You can declare a 'tuple' as such:

helloTuple = (1, 2, 3, 4, 5)

We can then print the 'tuple' as such

print(hello[1]) #Prints the second object in the tuple
print(hello) #Prints the contents of the tuple
print(len(hello)) #Prints the length of the tuple
print(hello[len(hello)-1]) #Prints the last element in the tuple


We can use tuples in a variety of ways. For example, to find numbers divisible by 3 from the range of 0 to 100, we can use:

divisor=5
divisible=() #This declares an empty tuple
for x in range(0,101):
     if x%divisor==0:
        divisible+=(x,)
print(divisible)


This would check if the number is divisible by divisor using the modulo (%) function, and if it is, it will append a tuple of length 1 to the back of the divisible tuple.

Tuples are Immutable. The values of tuples cannot be changed. For example, if we cannot use divisible[0]=1 at the end of our previous example.

Lists are Mutable. The value of the list object can be changed.

You can create lists using:

listNames = ['Kelvin', 'Ling']
listNames2 = ['Wynne', 'Elaine']


The difference is that we're using a square bracket instead of a normal one.

We can append lists as such:

listNames.append(listNames2)

Append is a method that actually Mutates the list. In this list, we have various items:
1) Object 0 is a 'str'
2) Object 1 is a 'str'
3) Object 2 is a 'list'

In this case, list2[0] and list1[2][0] is actually referring to the same object in memory. Modifying either will cause both to change since they're referring to the same thing. Append adds a Pointer to the list instead of simply copying.

The difference is that in the first example, if we are actually concatenating two tuples, a new tuple is created and it is assigned to the variable. We can treat tuples as Primitive Data-Types in Java, and tuples as Objects in Java.

If you concatenate a list into a tuple, a new tuple is created, with the list at the back. Suppose that tuple1[2] contains the Pointer to list1, tuple1[2][0] contains the same Pointer as list1[0], changing tuple1[2][0] actually changes list1[0]. The tuple1 has not changed, because it is Non-Mutable. The changes are applied to the list1 instead.

Assignment is changing the object to which an identifier binds to.
Mutation is changing the actual value of the object.

Lists and tuples can be sliced the same way as a 'str'.

You can loop through a tuple or list as such:

for x in list1
    print(x)


You can also use the remove() method to remove the first occurrence of a specified object:

allNumbers=[1,2,3,4,5,6,7,8,9]
unwantedNumbers=[1,3,7,8,9]
for x in unwantedNumbers:
    if x in allNumbers:
        allNumbers.remove(x)
print(allNumbers)


The code goes through every unwanted number and if it exists in the list of all numbers, it would remove the first occurrence of it. To remove all occurence, we use the loop "while x in unwantedNumbers" instead.

You can sort a list like this:

allNumbers.sort()

An important concept to note is highlighted here:

list1=[1]
list2=[list1,list1]
print(list2) #We should get [[1], [1]]
list1[0]='hello'
print(list2) #We should get [['hello], ['hello']]
list2[0]='a'
print(list2) #We should get ['a', ['hello']]
list1=['fish']
print(list2) #We should still get the same, which is ['a', ['hello']]


From the last print, we can see that the original list object pointed by list1 has not changed. Instead, The identifier list1 has actually been assigned to a different list.

When one Object have multiple names, we call it an Alias. We can get problems when dealing with Aliases for Mutable Objects. Take for example, the method for a list to another:

def copyList(listSource, listDestination):
    for x in listSource:
        listDestination.append(x)
        print(listDestination)


Copying list1 to list2 works, but if we copy list1 to list1, then we will have items adding to the source (when we add it to the destination), and we will never reach the end of the list.

You can slice, concatenate, index etc. with Lists and Tuples, but there is a built-in type, known as a Dictionary, which has the following propeties:

1) There is no order of things in a Dictionary
2) The indexes are called Keys and it can be of any Immutable type.

To create a dict:

dictionary = {'toilet':'paper', 'pencil':'eraser',3:'computer'}<\code>

It is in the format key:object. A dict is therefore a set of key:value pairs.

To select 'paper', we can use:

print(dictionary['toilet'])

If we want to view all the keys, we can use:

print(dictionary.keys())

We can delete items in the dictionary as such:

del(dictionary['pencil'])

We can use the keys returned by the dictionary to look up the dictionary as well:

for key in dictionary.keys():
    print(key,"=",dictionary[key])


In order to clone a Mutable type, we need to use the syntax:

item1=item2[:]

Computer Science 02

We will now look at Python, as we know previously as an Interpreted Programming Language.

A popular IDE (Integrated Development Environment) for Python is actually IDLE. IDLE consists of:
1) Text Editor - A place where the program can be typed and then sent to the Shell.
2) Shell - The actual environment that the Python code is interpreted.
3) Debugger - Useful for debugging, but most people use print statements.

Everything in Python is considered an Object. The Python code itself can be an Object. Each object has a Type. The Type determines what kind of object it is and what can we do with it.

Types can be two kinds:
1) Scalar - These are the primitive types of the programming language. They are indivisible types.
2) Non-scalar - Things that are divisible, for example, String, which can be divided into smaller 'str' of many chars.

The most common Scalar Types we work with are:
1) int - A real number.
2) float - An approximation of a real number.
3) bool - A true or false value.
4) NoneType - Representing a null.

An example of a Non-Scalar type is:

str - A string of characters.

In Python, we can type the literals (e.g. 3, 'a', '5.6) into the Shell and it would return that value. To check the type of a literal, we can use the type() function as such:
type(3)

This would tell us that 3 is an 'int' type.

Literals by themselves are useless. We cannot do much with it. What we can do, however, is to chain them together using Operands (values) and Operators to form Expressions.

In Python, when we deal with division, it is important to use floating point numbers. If we use "+" between two integers, 5+5, the "+" sign will act as an addition. If we use "+" between 'a' and 'b', we would get 'ab', because the "+" would act as concatenation when dealing with 'str'. This is because "+", like many other Operators, is Overloaded (dependent upon the types and parameters given to it).

We can do manual conversion as such, which is called Type Casting:
1) 'a'+str(3) gives you 'a3'
2) int(2.1) would truncate the float into an integer value of 2.

A Program in the Python sense is a synonym to a Script.

When typing a literal into the Shell, it prints the value directly to the screen. When things are executed in a Program, however, it will not be printed unless we use the print() command.
print 5

In order for things to work, it is important for us to have Variables. A Variable in Python is simply a name for an Object. To bind a Variable to an Object, we use the Assignment:
x=5

This assigns ("=") the int Object of value 5 to the Variable x.

The Function print() allows us to output things. We'll need to get inputs, and there are two types of input:
1) Raw Input (this is the only type that exists in Python 3.0)
2) Input

Raw Input always interprets what the user types as a 'str' type. We would need to do Type Casting:

x=int(input("Please enter an integer: "))
print("You entered "+str(x)+".")


The simplest programs are Straight Line Programs. Straight Line Programs are Programs where every line is computed exactly once, with no conditions or loops (which are referred to as "tests").

Conditional Statements are the simplest "tests". A conditional statement comprises of:
1) if
2) else
3) elif

A program where Conditional Statements are implemented is called a Branching Program. In a Branching Program, each statement is executed the most once.

In conditional statements, the indentation is extremely important. This is how a simple conditional statement is implemented:

x=int(input("Enter an integer: "))
if x%2==0:
    print("Even")
else:
    print("Odd")


Programs are intended to be readable, and by enforcing this, Python actually ensures some sort of code home-keeping.

Finally, we look at a Looping Construct which allows us to do Iterations. When we do this, we'll be able to create a Turing Complete Program.

Whenever we write a loop, we should have either an Incrementing or a Decrementing Function. This allows the loop a guarantee to exit.

The properties of a Decrementing Function is as such:
1) It will map a set of program variables to an 'int'.
2) It starts with a non-negative number.
3) The loop terminates when the value of the variable is <= 0.
4) It is decreased each time through the loop.

In a loop as seen here, there is in fact a Decrementing Function:

while a<25:
    a+=1


Even though all we can see is an Increment here, we can derive the Decrementing Function of:
25-a

Loops allow us to do something known as Exhaustive Enumeration, which we usually refer to as Bruteforce.

The for loop in Python is as such:

for x in range(0,25):
    instructions


In this case, range(0,25) actually creates a tuple with a sequence starting from 0 to 24.

Approximation in computing is finding a number that is accurate enough. It does not need to be perfect, but it must be close enough for our use. For example:

y^2=x+-epsilon

Where epsilon is the degree of error acceptable.

We would then be able to come up with a slow converging approximation algorithm as such:

x = 123
epsilon = 0.01
ans = 0
while abs(ans**2-x)>=epsilon and ans<=x:
    ans+=0.00001
print(ans," is close to sqrt of ",x)


The program is extremely dependent on size of the number given to test, and the maximum number of times it can loop can be calculated by x/0.00001.

A more effective algorithm exists to solve this problem. It is called the Bisection Search. It can be implemented by cutting the search space in half after each iteration. The worst case would be log(x) where x is the number of possible value of answers.

In this method, we first define the possible range of answers. We then try to guess the answer from the middle of the range. We then check if the answer when put through the formula gives us an answer near enough to what we want. If it's higher, then we search for a lower number. If it's lower, we search for a higher number. It is implemented as such:

x=5;
epsilon=0.00000001;
upperBound=x;
lowerBound=0.0;
ans=(lowerBound+upperBound)/2.0;
while abs(ans**2-x)>epsilon:
    if ans**2 > x:
        upperBound=ans;
    else:
        lowerBound=ans;
    if ans==(lowerBound+upperBound)/2.0:
        break
    else:
        ans=(lowerBound+upperBound)/2.0
    print(ans)
if abs(ans**2-x)>epsilon:
    print("Converged at",ans,"without meeting epsilon specifications of",epsilon)
    print("Resultant error is",ans**3-x)
else:
    print("Converged at",ans)


This Program does smart guesses at a range of values you give it. It terminates when it has converged to an answer or it's failed. However, this assumes that the answer lies between 0 to x, which means that for problems where the answer is outside this range, this could not solve it. That is a flaw of this method.

An example to observe this problem is when we try to search for numbers less than 1.0. For example, 0.5's square root is 0.707107, and if we look at the search space of 0 to 0.5, we'll never find it! We then observe this pattern that for every number that is less than 1, the highest answer is 1.0. Therefore, we can see that for the upper bound, we cannot have it less than 1. We can change our upperBound statement as such:

upperBound=max(x,1.0)

This could take the value of x, or 1.0, whichever is higher.

We typically want programs to do things with minimal coding. A smaller program is more efficient than a larger program who can do the same thing.

We can use Functions to help shorten code. A Function is something that provides:
1) Decomposition - It allows us to break our program up into modules. An example of a module is a Function and a Class. Modules should be self-contained and reusable.
2) Abstraction - Suppresses details that the programmer does not need to see. This allows the programmer to focus on what it does.

A simple function could be as such:

def accurateEnough(x, y, epsilon):
    return abs(x-y)<=epsilon


In order to use this Function, we invoke it as such:

print(accurateEnough(1,2,1))

Note that you must invoke this function below where it's been defined or it would not work.

A Function can contain:
1) Formal Parameters - are the names x, y and epsilon. You can use x in the main code and it would have nothing to do with the parameters used in the function, because it is self-contained. This is because upon entry of a function, a new Scope is created. A Scope is a mapping of names to objects. The parameter passed into the function during its call is called the Actual Parameter.
2) Return - It is a statement that causes the Function to return a value to wherever it is called. If no Return is specified, then the NoneType is returned.
3) Assert - The assert statement takes a bool value (resulting from a bool, expression or function) and stops the function if it evaluates to False, otherwise it continues.

We can create a more robust function with assert as such:

def accurateEnough(x, y, epsilon):
    assert epsilon>=0
    return abs(x-y)<=epsilon


This ensures that epsilon is a non-negative value before it continues.

What happens when we run a program:
1) The Interpreter will build a Scope, and it would find Object it sees in sequence and assigns the value of it to the Name (the Object can be a Function or a Variable, etc.).
2) If a Function is invoked, then the Interpreter would create a new Scope for it and begin executing it as in Step 1.
3) This will continue till the program reaches the end.

Each Scope is also called a Stack Frame. When we start the Program, we begin with the Main Scope. Then as we progress into Functions, we'll have Scopes adding to the top of the Stack. When we're done with the Function, its Scope gets popped from the top, and we return to the Scope of the previous Function or the Main Scope. We would eventually end up again with the Main Scope. This is because a Stack is FILO (Frst-In-Last-Out). We can use Variables defined in a Scope below its own Scope (e.g. a Function uses a Variable in the Main Scope). When a Function exits, its Variables and everything else in its scope are lost.

We now look at the 'str' type, which is the most common Non-Scalar type. A 'str' can be used as such:

sum=0
for x in str(1234):
    sum+=int(c)
print(sum)


What we get here is that the numbers 1234 is first converted into a 'str' of '1234'. For each item in the String, starting from 1, they are added to the total sum. In the end, we'll have the answer of 10.

We can deal with 'str' as if it's an array. Suppose that we have the following 'str':

helloStr = 'hello'

We can access the first character of the 'str' using:

print(helloStr[0])

The above code would give us a 'h'. We can also access a range of characters, as such:

print(helloStr[0:2])

This would print out the first (0) and second (1) char in the 'str'. This is a method known as a Slicing. It makes a new copy of the original 'str' (known as a Substring). There are also different operations that you can do with 'str' types, for example, to find patterns (and get the location of the provided pattern), we can use:

helloStr.find('lo')

We would get the output 3.

We'll now look at the following piece of code, which enables us to find patterns in a provided text:

#Finds patterns in Strings
providedString = input('Enter the text: ')
findString = input('Enter the pattern: ')
low = providedString.find(findString)
while providedString.find(findString,low)!=-1:
    print(providedString.find(findString,low),"to",low+len(findString))
    print(providedString[0:low]+'['+providedString[low:low+len(findString)]+']'+providedString[low+len(findString):])
    low=providedString.find(findString,low+1)


In this Program, we first get a String, and a Substring to search for. We have a 'low' value to indicate the last position from the last search. We find the first occurrence of the searched text, and print its start and end slice index of the providedString. Next, we print out the text, adding brackets between the found text. This repeats till no more occurrence is found.

Monday, May 20, 2013

Computer Science 01

Engineers are people who can map problems into computational frameworks.

A good computer scientist is someone who has the skills to make a computer do whatever they want it to do.

Let's start with the question... What is computation? To answer this, we first look at two kinds of knowledge:

1) Declarative - It is composed of statements of facts. It does not tell us how to arrive at a conclusion, but if a piece of declarative knowledge is correct, we can use it to test computations. An example of declarative knowledge is: "A chocolate cake is delicious".

2) Imperative - Imperative knowledge tells how to solve something. For example, a recipe for a chocolate cake is imperative knowledge.

We use the recipe to create a dish (Imperative Knowledge), and if the dish is indeed delicious, looks everything like the description of a chocolate cake (Declarative Knowledge) then we know we've made a chocolate cake.

Let's look at an approximation algorithm that lets us solve a square root. Say for example, we are looking for the value of sqrt(25).

We first get facts about the declarative knowledge we know of squares and their roots. It is as such:

g^2=x

We have x, we need to find g.

The approximation algorithm, a form of imperative knowledge, for obtaining the sqrt(x) is as such:
1) Guess a value g.
2) Compute the value of g^2.
3) If g^2 is x, or close enough to x, then g is the root of x.
4) Else, the next value of g will be (g+x/g)/2 (The average of g and x/g). Compute again from Step 2.

Programmatically, we write it like this:

while(math.pow(g,2)!=x)
{
    g=g+x/g;
}


When you get the value of g which when squared gives you x, it is said that the algorithm has converged.

An algorithm is made up of the following components:
Instructions - Operations that are carried out, e.g. "g=g+x/g"
Flow Control - The way and sequence the instructions are carried out.
Termination Condition - When are we satisfied with the answer, e.g. "g^2 equals x or is close enough to x"

There are two types of programs:

1) Fixed Program Computer - These computers are designed to do specific and fixed things. Instructions are hard-coded into circuitry (think logic gates in ASICs) and only input and output is considered data.

2) Stored Program Computer -The input, instructions and output are all considered data and are all stored alongside in memory. This allows infinite amount of flexiblity and programs can now produce programs. This is the current state of computers.

In the Stored Program Computer, people think of a computer as a Hardware Program, known as an Interpreter.

A basic Stored Program Computer has the following things:
1) Memory - Memory used for storing of data and program.
2) Control Unit - Interprets instructions from memory and turns it into a form that the ALU can understand.
3) Arithmetic Logic Unit - Accepts instructions from Control Unit, pulls and pushes data from Memory, computes and stores results in its Accumulator (the ALU's own Memory).
4) Input/Output - Allows interaction with external systems.

A British mathematician proved that there are 6 primitive instructions that can be used to do anything that can be done with the computer.

In Programming, we make use of a set of Primitive Instructions and a set of Flow Control. Using and combining these two elements, we can come up with different programs.

A Programming Language comprises of:
1) Syntax - Which sequence of characters and symbols constitute a well-formed string, e.g. g=5+6.
2) Static Semantics - Which well-formed strings have a meaning, e.g. 6+4 by itself is well-formed but does not have a proper meaning/purpose.
3) Semantics - What that meaning is. A program can have a proper syntax and proper static semantics, but it may mean differently from what the programmer wants it to do.

A program with improper semantics can:
1) Crash - The program stops running and shows an obvious sign that it's happened. A properly designed program environment will keep damages minimal and local (i.e. does not damage the program or system).
2) Never Stop - The program never finishes the job, which is also known as "Hang". These programs typically have entered an infinite loop due to a deadlock or semantics error.
3) Wrong Output - The program runs to completion, but produces the wrong answer. This is the worst kind of problem because people may think it's running fine and it isn't.

There are two types of Programming Languages:
1) Interpreted - Python, where the source code is executed directly and does not need to be compiled.
2) Compiled - Java, where a compiler changes source code into an object code, which is executed by the Hardware Interpreter.

Sunday, May 5, 2013

CCNA Review 04

You can see a plethora of different hardware in a typical network, but the most important two devices are the Switches and the Routers.

Switches were not like that from the start. In the past, everything was connected by network hubs. A hub is basically a repeater with multiple ports. Being a repeater, every device on the network actually shares the bandwidth available through it. Therefore, only one device can transmit at the same time. If more than one device transmits, then it is said that there is a collision.

To combat collisions, the CSMA/CD (Carrier Sense Multiple Access / Collision Detection) protocol is created which detects collisions whenever one happens. When two devices transmits at the same time, a collision is resulted and both devices would transmit a jam signal. The devices would back off for a random amount of time before transmitting again.

There are three types of RX/TX modes:
-Simplex is where the transmitter only transmits, and the receiver only receives. It is a one-way traffic. An example of simplex communication is radio broadcasting.
-Half-Duplex is when only one device can transmit across the wire at the same time. When one talks, all others sharing the medium must listen. An example of half-duplex communication is WiFi.
-Full-Duplex allows all devices to transmit at the same time. Modern switched networks and router interfaces are full-duplex.

The biggest problem with the hub is that it is a shared medium, and therefore it is half-duplex. If 5 devices are transmitting at the same time across a 50Mbps hub, then each of them have less than 10Mbps for transmission. In this case, an entire hub is a collision domain. The repeater and hub is a layer 1 device.

A new device came and allowed collision domains to be segmented. This is called the bridge. The bridge is an intelligent layer 2 device which allows learning of MAC addresses on each interface, much like a switch. Hubs can be connected to the bridges, allowing larger networks to be created. Each interface on the bridge is a collision domain if a hub is connected to it. If a computer is connected directly to a bridge, it has full duplex connectivity. However, the biggest problem with bridges is that they're software based, so they introduce really high latencies and limited bandwidth.

Switches were then created, which is much like a bridge, but moves frames around via ASICs. Every port on the switch is a full-duplex wire. If all ports are full-duplex, there can be no collision occuring on the switch. However, each switchport is still considered a collision domain by definition. Modern switches can support multiple speeds per interface and it is managed and intelligent. It is managed because you can modify settings, and it has a large feature-set and intelligence due to its IOS firmware.

Switches typically have one or two high-speed links for daisy chaining switches together. Switches need to be connected with crossover cables but modern switches have auto-sensing ports that allow connection through straight-through as well. Modern switches also have SFP (Small-Form Factor Pluggable Transceiver) modules which allow chaining together via Fibre Optics.

When switches are first booted up, its CAM (Content Addressable Memory) table is empty. CAM tables are used to store MAC address associations with interfaces. Once STP (Spanning Tree Protocol, covered in a later article) is completed and stable, the switches will start learning MAC addresses from its interfaces.

Suppose that two devices, A and B, connected to the switch wants to communicate. A would first create an ARP request to look for B. At this time, A's MAC will be recorded, while the ARP request is forwarded out of all ports. B would receive the ARP request and generate an ARP reply, which allows the switch to record the location of B's MAC. Now that the switch has both device's MACs, further communication from now would be switched at wire-speed.

By default, CAM entries has a timeout of 5 minutes. If a device stops communicating for 5 minutes, the entry is dropped. If A stops communicating for 5 minutes and his MAC gets dropped, further communication to A will be forwarded out of all open ports like a broadcast until his MAC is learned again.

CCNA Review 03

In this article we are going to talk about two of the most common transport layer protocols: UDP and TCP.

If the two protocols could talk, UDP would say, "I hope it gets there", while TCP would say, "I'll make sure it gets there".



UDP (User Datagram Protocol) is a connectionless protocol. It does not establish connections and packets sent by it has no guarantee of reaching the intended target. Therefore, UDP is unreliable. However, it has many practical uses due to its low overhead and is typically used in time-sensitive, real-time applications like VoIP where a dropped packet is not important but latency and jitter could render the whole protocol useless.



An example of a UDP protocol is DNS Client. The DNS Client is UDP not because of the time-sensitive nature but because everything fits into one packet and it makes no sense to have to establish a full connection with the 3-Way Handshake and the subsequent ACKs and termination just to send that one packet. If the request is dropped the client simply has to send another one after a timeout. Much more efficient than using TCP.

TCP (Transmission Control Protocol) is a connection-oriented protocol. Before the start of every transmission, there is a Three-Way Handshake that starts a session between the clients. The handshake consists of a SYN, SYN-ACK and ACK, which will be covered more in detail later. TCP requires ACK for messages, which allows dropped packets to be detected and resent. TCP is therefore reliable. However, due to the connection-oriented nature, it has costly overhead and is more suitable for applications that deal with large transfers that has a requirement for data integrity. TCP has a mechanism known as the TCP Sliding Window Protocol, which allows more efficient use of the acknowledgment system. This will be covered in greater detail later.



An example of TCP is the HTTP, where a 3-Way Handshake is first established before the client sends a GET message. Since web-browsing is data-integrity sensitive and bandwidth-heavy, TCP is the perfect protocol. Protocols like FTP use TCP for the same reason.

As previously mentioned, the Three-Way Handshake consists of a SYN, SYN-ACK, and an ACK. It is used to initialize a connection between two communicating hosts. The initiating host (in the case of a client accessing a web server) would first send a SYN 0 to the target. This is a good time to introduce the concept of sequence numbers, which TCP uses to track what's been received and what's not been received. The server would then reply with a ACK 1, stating that it has received 0 and is ready for 1. It would also send a SYN 0 along with it. Finally, the client sends a ACK 1.



At this point there would be a connection ready for data to be transmitted. The first HTTP packet would then be sent out.

The sequence number represents the amount of bytes sent out by the sender. The ACK number represents the sequence that it is ready to receive. This is particularly useful in implementing the TCP Sliding Window Protocol. To implement the Sliding Window, TCP first sends out one segment. Once it is successful, it would send more segment at one go (e.g. 2). If it is successful, it would send some more (e.g. 4), and so on. It would do so until it reaches the tolerance of the end-to-end devices and packets get dropped. It knows that packets get dropped when the ACK it receives is not what is expects. The next transmission would then start from that number of packets, and then from there it slowly increases.



The common TCP and UDP ports are:

TCP
21 - FTP (File Transfer Protocol)
22 - SSH (Secure SHell)
23 - Telnet
25 - SMTP (Simple Mail Transfer Protocol)
53 - DNS Server (DNS server-server communication)
80 - HTTP (Hyper-text Transfer Protocol)
110 - POP3 (Post Office Protocol 3)
443 - HTTPS (Hyper-text Transfer Protocol Secure)

UDP
53 - DNS Client (DNS client-server communication)
69 - TFTP (Trivial File Transfer Protocol)

Both protocols have 65535 distinct ports.
Well known ports are from 0 to 1023.
1024 to 49151 are registered ports.
Finally, 49152 to 65535 (215+214 to 216-1) are the dynamic/private ports.

Let us look at the previous examples we used again.



If A wants to visit a website on B, then this is going to happen.

Again, A would check if B is in the same network. It would realize that B is in a different network, so it needs to send it through its default gateway. It ARPs for the router's MAC and once it gets it, the following frame is created:

(v,w)
FCS (Created at the end)
SYN
S TCP - 58372 (Randomly generated)
D TCP - 80
S IP - A's IP
D IP - B's IP
S MAC - A's MAC
D MAC - R1's MAC

As the switch receives the frame, it would associate A's MAC to the interface, then forward it out of all ports because it does not know where the router is. When the router receives the frame, it would look deeper at the IP header and realize that it's intended for another network that it can reach by sending it to R2. It would replace the S and D MAC, recalculate the FCS, then send it out as:

(x)
FCS (Created at the end)
SYN
S TCP - 58372
D TCP - 80
S IP - A's IP
D IP - B's IP
S MAC - R1's MAC
D MAC - R2's MAC

Once F2 gets it, it would realize that it is intended for something connected to S2, so it sends it out of that interface like this:

(y,z)
FCS (Created at the end)
SYN
S TCP - 5837
D TCP - 80
S IP - A's IP
D IP - B's IP
S MAC - R2's MAC
D MAC - B's MAC

The switch learns that R2 is at that interface, then sends it out through all the other ports. B receives it, processes the IP header and realizes that it's for itself. Once it reaches the TCP header, it would realize that something is trying to initiate a connection to its port 80 (HTTP) from port 58372. It then replies with the following:

(y,z)
FCS (Created at the end)
SYN, ACK
S TCP - 80
D TCP - 58372
S IP - B's IP
D IP - A's IP
S MAC - B's MAC
D MAC - R2's MAC

The same process happens but the other way round, till A receives the SYN,ACK. It would then proceed to send an ACK. Thereafter, the HTTP connection would take place.
<