
Thursday, May 23, 2013

Computer Science 06

Many people think about how to make their program work. However, to be an excellent Computer Scientist, we must learn to make programs work efficiently.

We learnt that Brute Force can solve almost every problem. However, Brute Force can sometimes be really impractical, especially when it comes to the length of time required to complete something.

Efficiency is mostly about the algorithm, and rarely about the tiny details in the code. Creating an algorithm usually has the goal of reducing a problem to a previously solved problem. This process is known as Problem Reduction.

Efficiency is classified in two dimensions:
1) Space
2) Time

When we talk about Space and Time, we can often trade one for the other. What we are really talking about is Memory and CPU Utilization. These days, we focus on Time, because Space is usually available in relative abundance.
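
As a rough sketch of this trade-off (my own illustration, not from the lecture), we could spend Space storing previously computed answers in a dictionary so that repeated queries cost almost no Time:

# Hypothetical example: cache computed factorials so repeated calls
# become constant-time lookups (extra Space buys us Time).
cache = {}

def cachedFactorial(n):
    if n not in cache:
        answer = 1
        for i in range(2, n + 1):
            answer *= i
        cache[n] = answer        # extra Space used here
    return cache[n]              # repeated calls are just a lookup

print(cachedFactorial(10))       # computed once: 3628800
print(cachedFactorial(10))       # answered from the cache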

Computational Complexity is influenced by:
1) Speed of the machine
2) Cleverness of the implementation of the programming language
3) Input

To talk about Computational Complexity, we count the number of basic steps. An ideal function to calculate the Computational Complexity would be:
T(x)=n, where n is the number of steps taken for an input x.

A step is an operation that takes constant time and is predictable. Examples of steps are an assignment, a comparison, an array index, and so on.

Our modern computers are Random Access Machines. Instructions are executed Sequentially, and we assume Constant Time is required to access memory. Primitive Computers used things like tape for memory, and this model cannot be mapped onto them. Modern Computers map fairly well to this model, but if we look in detail, the Memory Hierarchy means some memory may take longer to access. However, generally, the Random Access Model works.

For each algorithm, we have a best case and a worst case. Take, for example, Linear Search, where a list is searched sequentially from the first to the last element. The Best Case for Linear Search is when the item is found at the first index. The Worst Case is when the item does not exist in the list. We also look at the Expected Case, which is the average, but it is the hardest to talk about because we need to know exactly how the inputs are distributed.
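
As a minimal sketch (my own illustration, not code from the lecture), a Linear Search makes the Best and Worst Cases easy to see:

def linearSearch(L, item):
    for i in range(len(L)):
        # Best Case: item is at index 0, so only one comparison is made
        if L[i] == item:
            return i
    # Worst Case: item is not in L, so the loop ran len(L) times
    return -1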

In Computational Complexity Analysis, we usually look at the Worst Case.

The Worst Case provides an upper bound. Once we know the upper bound, we cannot be surprised. Even though it seems pessimistic to think about the Worst Case, the Worst Case happens really often.

Let us analyze a function that computes factorial:

def factorial(n):
    assert n >= 0
    answer = 1
    while n > 1:
        answer *= n     # multiply the running product by the current n
        n -= 1          # count n down towards 1
    return answer


The assert statement is considered a step, while the initialization of answer is another step. Each time we enter the loop, the conditional test is a step, and the two operations inside it are considered two steps. The loop runs n times. Finally, the return statement is considered a step. We can derive the Computational Complexity as:

2+3*n+1

However, when we have a large input, we can ignore additive constants. We will end up with the following formula:

3*n

When we look at the Computational Complexity, we are concerned with growth with respect to size. We can actually ignore multiplicative constants:

n

This model of Asymptotic Growth shows how the Complexity grows according to the size of inputs.

The Big Oh is a notation used to describe Asymptotic Growth. To say that a function grows linearly with n, we write:

O(n)

This gives us an upper bound for the Asymptotic Growth of the function. To describe the Asymptotic Growth of a Function, we write:
F(x) is O(x^2)

This is formally read:
"Function F grows no faster than the quadratic polynomial x^2"

For example, a function like F(x)=3*x^2+2*x+1 is O(x^2): for large x, the x^2 term dominates, and we ignore the constants just as before.

Here are some popular Orders:
1) O(1) - Constant regardless of input
2) O(log n) - Logarithmic growth
3) O(n) - Linear growth
4) O(n log n) - Loglinear Growth, which occurs surprisingly often
5) O(n^c) - Polynomial
6) O(c^n) - Exponential
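
To get a rough feel for how different these Orders are, here is a small sketch (my own illustration, assuming c = 2 and n = 1000, not numbers from the lecture) that prints approximate step counts for each order:

import math

n = 1000
print("O(1):      ", 1)
print("O(log n):  ", round(math.log2(n)))        # about 10
print("O(n):      ", n)                          # 1,000
print("O(n log n):", round(n * math.log2(n)))    # about 10,000
print("O(n^c):    ", n ** 2)                     # 1,000,000 when c = 2
print("O(c^n):     2^1000 -- far too many steps to ever finish")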

A Bisection Search is an example of a Binary Search. Binary Searches are Logarithmic in growth, while Linear Searches are Linear in growth.
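
Here is a minimal sketch of a Bisection (Binary) Search on a sorted list (my own illustration, not code from the lecture). Each comparison discards half of the remaining elements, which is where the Logarithmic growth comes from:

def binarySearch(L, item):
    # L must already be sorted in ascending order
    low, high = 0, len(L) - 1
    while low <= high:
        mid = (low + high) // 2
        if L[mid] == item:
            return mid
        elif L[mid] < item:
            low = mid + 1       # discard the lower half
        else:
            high = mid - 1      # discard the upper half
    return -1                   # item is not in the list

print(binarySearch([1, 3, 5, 7, 9, 11], 7))   # prints 3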

A Recursive Factorial function can be like this:

def recursiveFactorial(n):
    assert n >= 0
    if n <= 1:
        return 1                # base case: 0! and 1! are both 1
    else:
        return n * recursiveFactorial(n - 1)


If we observe this function, we can derive that the complexity is:

O(n)

We can see that choosing Recursion or Iteration doesn't change the Order of Complexity.

An example of a function with quadratic growth is as follows:

def iterativeAdd(n):
    result = 0
    for x in range(n):
        for y in range(n):
            result += 1     # the inner statement runs n * n times in total
    return result


In this case, the growth is according to:

O(n^2)

Let us now look at a Logarithmic function:

def sumOfNumbers(x):
    strX = str(x)           # the loop below runs once per digit of x
    answer = 0
    for y in strX:
        answer += int(y)
    return answer


In this case, the Order of Complexity depends on the number of digits in x. To represent this, we use:

O(log10 x)

In a Binary Search, we can observe that as the number of possible values doubles, the number of searches increases only by 1. We can represent this by:

O(log2 len(L))
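
A quick back-of-the-envelope check (my own numbers, not from the lecture) shows this doubling behaviour:

import math

for length in (1000, 2000, 4000, 8000):
    # roughly the worst-case number of comparisons for a list of this length
    print(length, "->", math.ceil(math.log2(length)), "comparisons")

Each time the list length doubles, the count above only goes up by 1.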

Here's a good read pertaining to Orders of Complexity:
When all you have is a hammer, everything looks like a nail
