Monday, June 10, 2013

Computer Science 11

We'll start today's article with an installation of matplotlib from here:

Let's go through an example in graph plotting. The below example shows plotting of a graph that compounds annual interest of principal $100,000 for 50 years:

import pylab

principal = 100000
interestAnnual = 0.05
years = 50
values = []
for i in range(years+1):

You should see a graph that bends upwards. Notice that even though we know exactly what we're looking at, because it's a simple graph and we know what the x and y axes represent, it may not be useful for other people we are showing this to.

Therefore it is important that we include labelled axes. To label the axes, we use the following code before we use title(), xlabel() and ylabel() functions as shown:

import pylab

principal = 100000.0
interestAnnual = 0.05
years = 50
pylab.title(str(interestAnnual)+" Growth of $"+str(principal)+" Compounded Annually over "+str(years)+" years")
pylab.xlabel("Years of Compounding")
pylab.ylabel("Value of Principal (SGD)")
values = []
for i in range(years+1):

Randomness is an integral part of problem solving, because much problems in the real world are based on random (or seemingly random) variables.

Let's look at one of the simplest forms of probability: The Die. If a die is thrown 10 times, what is the probability of it not getting a 1 in ALL the throws?

On the first throw, the probability of rolling a 1 is 1/6, and NOT rolling a 1 on the first throw is 5/6.

The probability of not getting 1's on two rolls is (5/6)2.

Therefore, the probability of not getting 1's at all in 10 rolls is (5/6)10.

We can use the 1 minus rule to find the probability of getting AT LEAST ONE 1. All we have to do is to use 1-(5/6)2.

Blaise Pascal, the same person that made important contributions to the study of fluids, vacuum and pressure, also founded the probability theory. His friend asked him a question: Rolling a pair of dice 24 times, how likely is it that we'll get a pair of 6's.

Let's first look at the probability of rolling a pair of 6's in a single throw. Since the probability of getting a 6 on the first die is 1/6 and the probability of getting it on the second die is 1/6, there should be a 1/36 chance of rolling a pair of 6's on a single throw.

Most people would go ahead and calculate a (1/36)24. However, what we're actually getting over here is the probability of getting ALL rolls to be pairs of 6's.

To calculate the probability of getting AT LEAST ONE pair of 6's, we use the method found in the previous example. We first look at the probability of NOT rolling any pairs of 6's.

On a single throw, this would be defined as 1-(1/36) which is 35/36.

In 24 throws, we would have a chance of (35/36)24 NOT encounter a pair of 6's, which equates to ~0.5086.

If we inverse it, we would have a chance of 1-(35/36)24 to roll AT LEAST ONE pair of 6's. We have a ~0.4914 chance to roll at least a pair of 6's.

Let's write a quick simulation to calculate this probability:

import random

def rollDie():
    return random.randint(1,6)

def rollGame():
    for i in range(24):
        if rollDie()+rollDie()==12:
            return True
    return False

def simulateGame(n):
    for i in range(n):
        if rollGame():
    return wins/n


This code essentially has a rollDie() function which simulates the rolling of 1 die.

The rollGame() function rolls the die for 24 times and evaluates to True when a pair of 6's occurs and False when the pair does not occur.

Finally, simulateGame() runs n number of games and finds out how many of those games did we actually get at least ONE pair of 6's.

In this case, it seems counter-intuitive to have to write a simulation in order to solve this since it is pretty easy to find the probability of this through simple multiplication. However, simulation is important in solving really complex problems. It is also good for us to have at least 2 methods to prove our calculations.

The above simulation we've used is also known as the Monte Carlo Simulation which is named after a casino. This method is known as an Inferential Statistics method. It is based upon the principle that a random sample tends to exhibit the same properties as the population from which it was drawn. In simpler terms, if I draw a sample of people from across the country, the average opinion of this sample would be roughly the same as the average of every single person in the country.

In the Monte Carlo Simulation, the more trials you do, the more stable the value gets. In a sense, there is less jittery, and it gets closer to the actual value of the calculated probability. This is the Bernoulli's Law, or the Law of Large Numbers. This law applies when the tests are independent and does not depend on the results of the previous tests. As the number of trials tends to infinity, the output would converge to the value of the actual probability.

There is a term known as the Gambler's Fallacy where in a game of coin flipping people would tend to bet on heads if tails came up many times in a row. However, the problem here is that since coins have no memory, and each process is independent of each other, the deviation of having many heads will not be evened out as most people think it might.

Here's a demonstration of Bernoulli's Law:

import random
import pylab

def flipCoin():
    return bool(random.randint(0,1))

def exponentialTest(n):
    for i in range(n):
        for j in range(2**i):
    pylab.title("Demonstration of Bernoulli's Law")
    pylab.xlabel("Number of Flips")
    pylab.ylabel("Percentage of Heads")


This flips a coin 20 to n times and displays the results in a table. The output demonstrates that as the number of trials increase, the results converge to the actual probability:

No comments :

Post a Comment