The benefit of using a generator function as opposed to returning a whole is that it saves on memory requirements of having to store an entire list in memory. It also cuts down response time. Let us look at the following example:
def generateList(n):
ans=[]
for i in range(n):
ans.append(n*2)
return ans
for i in generateList(1000000000):
print(i)
It actually does the same thing as:
def generatorYield(n):
for i in range(n):
yield i*2
for i in generatorYield(1000000000):
print(i)
However, the yield function uses much less memory as it does not have to store a full list in memory. It also has faster response time because it reacts upon the first yield, as compared to waiting for the entire list to be populated.
This is not a perfect example, but it does illustrate some little principles and basic implementation of the yield function.
For much of the history of Science, people focus on trying to find Analytic Models. An Analytic Model allows prediction of behavior through a function by having a set of initial conditions and parameters. This led to things like the calculus and the probability theories.
However, Analytic Models do not work all the time. In the 20th century, people started turning to Simulation Models. When it comes to things like systems which are not mathematically tractable, like weather forecasting, instead of spending a lot of time to build accurate mathematical models, it's easier to build successively refining series of simulations. It is often easier to extract useful intermediate results than it is to build a detailed Analytic Model. In the past, people had to build really complex mechanical clockwork models to work with simulations. The advent of computers gave rise to the adoption of Simulation Models.
A Simulation Model has the following properties:
1) It gives useful information about behavior of a system, as opposed to an Analytic Model which gives the exact outcome of the situation.
2) It gives us an approximation to reality.
3) Simulation models are descriptive, instead of prescriptive. A prescriptive model gives the exact same output for every input. In simulations, given a set of conditions, you may get different outcomes each time you run it because it allows factoring of things that cannot be mathematically modeled.
We may not get the same answer every time but if we do the simulation enough times we're going to get a good sense of what is going to happen.
In 1827, Robert Brown observed that a pollen particle suspended in water would float around randomly under microscopic conditions. They call it the Brownian Motion. The first clear mathematical model didn't really come about until in the 1900s. In 1905, Albert Einstein came up with the model that confirmed the existence of atoms.
Brownian Motion in Wikipedia
Brownian Motion is an example of a Random Walk. A Random Walk simulation is extremely useful when trying to model particles that are going to move in a random direction and velocity at each time-step. We can use it to model particles in fluid. This is also extremely useful in understanding biological processes. It is also useful in modeling stock markets.
Suppose that we have a drunken person in the middle of a field, and each second he can take one step in any cardinal direction (NSEW). After a thousand seconds, where would he be?
To model this, we're going to divide this field into x and y axes. Let's assume that the person starts of at (0,0), which is the origin.
This is the probability table mapping steps, distance and probability:
At the first step:
There is a probability of 1 that he can be 1 step away.
He would be about 1 step away.
At the second step:
There is a probability of 1/4 that he can be back at 0,
a probability of 2/4 that he can be sqrt(2) steps away,
and a probability of 1/4 that he can be 2 steps away.
He would be about sqrt(2)/2+0.5=1.2 steps away
At the third step:
There is a probability of 9/16 to be 1 step away,
3/8 to be sqrt(5) steps away,1/16 probability to be 3 steps away,
He would be about 9/16+3/16+(3sqrt(5)/8)=1.5 steps away.
As the number of steps increase, it becomes extremely difficult to calculate where he will end up at.
From here, it seems like the more steps the person takes, the further away he will be.
To simulate this, we need to model a person, a field, and a location to help with calculations.
The field object can be a collection of drunks and their location.
This is how the entire code looks like:
import math
import random
import copy
class Location(object):
def __init__(self,x,y):
self.x=float(x)
self.y=float(y)
def set_location(self,xy):
self.x=float(xy[0])
self.y=float(xy[1])
def move(self,xy):
self.x+=float(xy[0])
self.y+=float(xy[1])
def get_location(self):
return (self.x,self.y)
def distance_from(self,other):
return math.sqrt((other.x-self.x)**2+(other.y-self.y)**2)
def __str__(self):
return str((self.x,self.y))
class Field(object):
def __init__(self):
self.field={}
def add_person(self,person,location):
if person not in self.field.keys():
self.field[person]=copy.deepcopy(location)
def rem_person(self,person):
if person in self.field.keys():
self.field.pop(person)
def move_person(self,person):
if person in self.field.keys():
self.field[person].move(person.step())
def where_is(self,person):
if person in self.field.keys():
return self.field[person]
class Person(object):
def __init__(self,name):
self.name=name
def step(self):
return random.choice(((0,1),(1,0),(0,-1),(-1,0)))
def __str__(self):
return self.name
We can then implement a test like this:
origin=Location(0,0)
kelvin=Person("Kelvin")
field=Field()
field.add_person(kelvin,origin)
tries=1000
steps=1000
average_distance=0.0
maximum_distance=0.0
minimum_distance=float(steps)
for i in range(tries):
for j in range(steps):
field.move_person(kelvin)
distance_moved=origin.distance_from(field.where_is(kelvin))
average_distance+=distance_moved/float(tries)
maximum_distance=max(maximum_distance,distance_moved)
minimum_distance=min(minimum_distance,distance_moved)
origin.set_location(field.where_is(kelvin).get_location())
print("Average Travelled:",average_distance)
print("Maximum Travelled:",maximum_distance)
print("Minimum Travelled:",minimum_distance)
As you can see, the results vary quite a lot and in order to get a proper reading we have to set tries to a really high value.
How do we think about the results of the programs when the programs themselves are stochastic? Almost everything in the real world is stochastic, and we must be really careful in accepting results. Let's think about the role of randomness in computation. The world is extremely non-deterministic. That means that we can only say that "something has a very high probability of occurring", and there is no such thing as "something will certainly occur".
Causal Non-Determinism is the believe that not every event is caused by previous events. Einstein and Schrodinger strongly disagreed with this. Instead, they strongly believed in Predictive Non-Determinism. The concept is that because of our inability to make accurate measurements about the physical world, it makes it impossible to make precise predictions about the future. Things are not unpredictable, it's just because we just don't / can't know enough to be able to predict it.
Stochastic Process is a process whose next state is based on the previous state and at least one random element.
Random in Python can be implemented via:
1) random.choice() - Helps you choose randomly based on what's passed to the function.
2) random.random() - Generates a number that is 0.0 < x <= 1.0.
A process that returns a value between 1 to 6 (a dice roller) is not considered Stochastic as the value returned is independent upon the previous values. In 10 throws, the probability of getting all 1's is:
1/(610)
In fact, the probability of getting all 0's is also 1/(610), as is any other combination (e.g. 1, 3, 2, 5, 3, 4, 3, 2, 4, 2).
Looking from the probability perspective, for such a process, we should be asking: What fraction of the possible results have the property we are testing for (e.g. have exactly four 1's, etc).
From the other side, the probability of us getting NOT all 1's is:
1-1/(610)
This "1 minus" trick will come in very handy.
A lot of times when we're doing computing, we use print statements to present our results. When it comes to decimal places or when dealing with large amounts of data, print statements can be quite messy and difficult to read.
In Python, there is a plotting package known as pylab. It provides quite a lot of facilities from Matlab. We can use pylab for plotting graphics very easily.
However, pylab is not part of the standard Python library, and therefore we need to install it.
Head on over here and install one of the binaries compatible with your computer.
The next article will discuss more about pylab.