...

Wednesday, May 22, 2013

Computer Science 05

When we first learn about numbers, we learn base 10 (radix 10). The decimal system represents values using the digits 0 to 9. Binary numbers look like long strings of zeroes and ones, but the system is in fact similar to decimal, except that it is base 2. Binary numbers take more digits to represent a value: a binary number of n digits can represent at most 2^n distinct values. People think in base 10, while computers work in base 2. It is difficult to explain why humans work that way, but computers work in base 2 because it is easiest to build switches that have two stable states: ON (1) or OFF (0).

For whole numbers (i.e. int), it is easy to convert from decimal to binary.
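
As a minimal sketch of that conversion (the function name to_binary is made up for this example), we can repeatedly divide by 2 and collect the remainders. Python's built-in bin() does the same job and is shown for comparison.

```python
def to_binary(n):
    """Return the binary digits of a non-negative integer as a string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next bit, least significant first
        n //= 2
    return "".join(reversed(digits))

print(to_binary(10))  # 1010
print(bin(10))        # 0b1010
```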

However, fractions are less straightforward. The fraction 1/8 (decimal 0.125) looks like this in binary:
0.001

Why is it the case? We first relate this to the decimal system. In the decimal system of base 10, 0.1 actually means 1*10^-1 and 0.01 actually means 1*10^-2 and so on.

In binary, 0.1 actually means 1*2^-1 and 0.01 actually means 1*2^-2. In fact, 0.11 in binary actually means 1*2^-1 + 1*2^-2. Therefore, 0.001 is actually 1*2^-3 = 1/8. Decimal 0.625 in binary is 0.101, that is, 1*2^-1 + 1*2^-3 = 0.5 + 0.125.
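
To check these values, here is a small sketch (binary_fraction_to_decimal is a hypothetical helper, not a library function) that adds up 2^-k for every 1 bit in position k after the binary point.

```python
def binary_fraction_to_decimal(bits):
    """Convert the digits after the binary point (e.g. "101") to a float."""
    value = 0.0
    for position, bit in enumerate(bits, start=1):
        if bit == "1":
            value += 2 ** -position  # the k-th digit after the point is worth 2^-k
    return value

print(binary_fraction_to_decimal("001"))  # 0.125
print(binary_fraction_to_decimal("101"))  # 0.625
print(binary_fraction_to_decimal("11"))   # 0.75
```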

The problem comes when we try to convert decimal 0.1 to binary. We build the binary fraction one bit at a time, keeping a bit only if the running total does not exceed 0.1:
1) Is 0b0.1 higher than 0.1? Yes. Set the bit to 0. We have 0.
2) Is 0b0.01 higher than 0.1? Yes. Set the bit to 0. We have 0.
3) Is 0b0.001 higher than 0.1? Yes. Set the bit to 0. We have 0.
4) Is 0b0.0001 higher than 0.1? No. Set the bit to 1. We have 0.0625.
5) Is 0b0.00011 higher than 0.1? No. Set the bit to 1. We have 0.09375.
6) Is 0b0.000111 higher than 0.1? Yes. Set the bit to 0. We have 0.09375.
7) Is 0b0.0001101 higher than 0.1? Yes. Set the bit to 0. We have 0.09375.
8) Is 0b0.00011001 higher than 0.1? No. Set the bit to 1. We have 0.09765625.
9) Is 0b0.000110011 higher than 0.1? No. Set the bit to 1. We have 0.099609375.

As you can see, through this laborious process we inch closer and closer to 0.1, but we never reach it: 0.1 has no finite binary representation, just as 1/3 has no finite decimal representation.
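
The bit-by-bit process can be sketched as follows. The function name fraction_bits is an assumption for illustration, and the sketch uses Fraction so that the comparisons themselves are exact rather than subject to floating point rounding.

```python
from fractions import Fraction

def fraction_bits(target, max_bits):
    """Greedily choose binary digits after the point without overshooting target."""
    target = Fraction(target)
    total = Fraction(0)
    bits = ""
    for k in range(1, max_bits + 1):
        candidate = total + Fraction(1, 2 ** k)
        if candidate > target:
            bits += "0"        # adding 2^-k would overshoot, so leave this bit off
        else:
            bits += "1"        # still at or below the target, so keep this bit
            total = candidate
    return bits, total

bits, total = fraction_bits("0.1", 9)
print("0." + bits, float(total))  # 0.000110011 0.099609375
```

However many bits we ask for, total stays strictly below 1/10, because every finite binary fraction has a power of 2 as its denominator and 1/10 does not.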

Therefore, we should never test floating point numbers for exact equality.

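To get a more accurate representation of a floating point number, we can use:
print(repr(0.1))

Note that since Python 2.7 and 3.1, repr returns the shortest string that converts back to the same float, so this prints 0.1; older versions print 0.10000000000000001.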
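
One way to see the value that is really stored, regardless of Python version, is to ask for more digits. This sketch uses the standard decimal module and format(); both show that the stored number is slightly larger than 0.1.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding.
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# format() with an explicit precision shows the first 20 decimal places.
print(format(0.1, ".20f"))
# 0.10000000000000000555
```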

Most of the time it is safe to assume that floating point numbers are close enough to what we think they are. However, there are times when the weaknesses of floating point arithmetic show:

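number = 0.0
accurateNumber = 10000.0
# Add 0.01 one million times; mathematically the sum is exactly 10000.
for x in range(1000000):
    number += 0.01
print(number)                     # close to, but not exactly, 10000.0
print(number * 10)
print(number == accurateNumber)   # False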


In this case, as you can see, the variable number has accumulated a noticeable rounding error after a million floating point additions. It is not equal to accurateNumber as we would have expected.

To test whether two floating point numbers are equal, we should not use the == operator. Instead, we should use:

abs(number-accurateNumber)<=epsilon

You define epsilon yourself, for instance 0.00001. If you compare two floating point numbers with == instead, there is a high probability that you will get False when you should be getting True.
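
A minimal sketch of the epsilon test, wrapped in a helper whose name (nearly_equal) is an assumption for this example. A fixed epsilon is an absolute tolerance; the standard library's math.isclose (Python 3.5 and later) uses a relative tolerance by default and is shown alongside it.

```python
import math

def nearly_equal(a, b, epsilon=0.00001):
    """True if a and b differ by no more than epsilon."""
    return abs(a - b) <= epsilon

number = 0.0
for _ in range(10):
    number += 0.1   # each addition rounds, so the sum drifts away from 1.0

print(number == 1.0)               # False
print(nearly_equal(number, 1.0))   # True
print(math.isclose(number, 1.0))   # True
```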

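There is an urban legend about how the process of fixing programs came to be known as "debugging". In 1947, operators of the Harvard Mark II computer, on which Grace Hopper (later a rear admiral) worked, found a moth stuck in a relay and taped it into the logbook with the note "First actual case of bug being found".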

The goal of debugging is not to eliminate one bug quickly, but to move towards a bug-free program.

Debugging is a learnt skill; nobody does it well instinctively. A good programmer is a good debugger, and a good debugger is a person who can think systematically and efficiently about how to move toward a bug-free program. The same skills used to debug programs can be transferred to real-life situations.

Debuggers are tools designed to help programmers examine their programs. Still, a lot of experienced programmers swear that the best debugging tool is actually print. A person who is good at debugging knows where to look for good places to put print statements.

The question to ask when debugging is not "Why doesn't it produce the output I want it to?", but rather "How could it have produced the result it did?". There is a scientific method to debugging, and it is based on studying the available data. These data include:
1) Source Code
2) Output

A hypothesis is formed that is consistent with the data, and we design and run a repeatable experiment. Such an experiment would have the potential to refute the hypothesis, which allows us to form more accurate hypotheses.

It is important for the experiment to be repeatable because some programs use random inputs as well as certain timing factors. We must make sure that we can repeat the experiment with the exact same input.
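
For programs that use random inputs, one common way to make an experiment repeatable is to seed the random number generator explicitly. The sketch below uses a made-up function, noisy_experiment, and an arbitrary seed of 42; the point is only that the same seed gives the same "random" input on every run.

```python
import random

def noisy_experiment(seed):
    """Generate the same pseudo-random test input every time for a given seed."""
    rng = random.Random(seed)  # private generator, seeded explicitly
    return [rng.randint(1, 100) for _ in range(5)]

first = noisy_experiment(42)
second = noisy_experiment(42)
print(first == second)  # True
```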

When we have a bug, we need to find a smaller, simpler input on which the program fails. There are three reasons why:
1) It is faster to run
2) It has fewer variables
3) It is easier to debug
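
One possible way to shrink a failing input, sketched under the assumption that the input is a list and that we have a function fails() that reports whether the bug still shows up. Both shrink and fails are hypothetical names, and the "bug" here (trouble with negative numbers) is invented for the example.

```python
def shrink(failing_input, fails):
    """Drop elements one at a time for as long as the failure still reproduces."""
    current = list(failing_input)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if fails(candidate):
            current = candidate  # element i was not needed to trigger the bug
        else:
            i += 1               # element i matters, so keep it and move on
    return current

# Hypothetical failure condition: the program under test mishandles negatives.
def fails(values):
    return any(v < 0 for v in values)

print(shrink([3, 8, -2, 7, 5], fails))  # [-2]
```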

There is a debugging process known as Binary Search. In Binary Search, we split the program in half and ask whether the bug occurred above or below the midpoint. To answer that, we need to find an intermediate value that we can check, and we print it. We then repeat the process on the half that contains the bug.
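
As a rough sketch of the idea, suppose the program can be modelled as a list of steps applied one after another, and that we can tell whether an intermediate value looks right. The names find_first_bad_step, steps and is_valid are assumptions for this example, and the sketch further assumes that the starting value is valid, the final value is not, and a value that has gone bad stays bad.

```python
def find_first_bad_step(steps, start_value, is_valid):
    """Binary-search for the index of the step that first makes the value invalid."""
    def value_after(n):
        value = start_value
        for step in steps[:n]:
            value = step(value)
        return value

    low, high = 0, len(steps)  # valid after `low` steps, invalid after `high` steps
    while high - low > 1:
        mid = (low + high) // 2
        intermediate = value_after(mid)
        print("after", mid, "steps:", intermediate)  # the intermediate value we inspect
        if is_valid(intermediate):
            low = mid    # the bug is in the second half
        else:
            high = mid   # the bug is in the first half
    return high - 1

steps = [
    lambda xs: [x + 1 for x in xs],
    lambda xs: sorted(xs),
    lambda xs: [x - 10 for x in xs],  # hypothetical bug: values become negative
    lambda xs: [x * 2 for x in xs],
    lambda xs: xs[::-1],
]
print(find_first_bad_step(steps, [3, 1, 2], lambda xs: all(x >= 0 for x in xs)))  # 2
```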

There is a concept known as a Test Driver or Test Harness. The goal is to write code that isolates certain parts of a program to find out what bugs they have. The Test Harness is usually written in a separate part of the program.
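
A very small test harness might look like the sketch below. The function under test, is_palindrome, and the test cases are invented for illustration; the harness simply runs the function on known inputs and reports any result that differs from what we expect.

```python
def is_palindrome(text):
    """Function under test (hypothetical example)."""
    cleaned = [c.lower() for c in text if c.isalnum()]
    return cleaned == cleaned[::-1]

def run_tests():
    """Test harness: exercise is_palindrome in isolation and count failures."""
    cases = [("", True), ("a", True), ("abba", True), ("Abba", True), ("abc", False)]
    failures = 0
    for text, expected in cases:
        actual = is_palindrome(text)
        if actual != expected:
            failures += 1
            print("FAIL:", repr(text), "expected", expected, "got", actual)
    print(len(cases) - failures, "of", len(cases), "tests passed")
    return failures

if __name__ == "__main__":
    run_tests()
```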
