Converting decimal to binary

Converting fraction

base-q expansion form

0.625 = x_1\cdot 2^{-1}+x_2\cdot 2^{-2}+x_3\cdot 2^{-3}

0.625 = x_1\cdot \frac{1}{2}+x_2\cdot \frac{1}{2^2}+x_3\cdot \frac{1}{2^3}

Factor out \frac{1}{2}

0.625 = \frac{1}{2}(x_1+\frac{1}{2}x_2+\frac{1}{2^2}x_3)

What is x_1?

Single out first digit

Multiply both sides by 2:

1.25 = x_1+x_2\cdot 2^{-1}+x_3\cdot 2^{-2}

x_2\cdot 2^{-1}+x_3\cdot 2^{-2} is always less than 1, and the left side is greater than 1, so x_1 must be equal to 1

Subtract x_1 = 1 from both sides and factor out \frac{1}{2}

0.25 = \frac{1}{2}\cdot(x_2+\frac{1}{2}\cdot x_3)

What is x_2?

Repeat until no digits with \frac{1}{2} left

0.5 = x_2+\frac{1}{2}\cdot x_3

The left side is less than 1, so x_2 must be equal to 0

0.5 = x_3\cdot\frac{1}{2}

1 = x_3

So the algorithm is:

multiply the fraction by 2, take the integer part of the result as the next binary digit, and continue with the fractional part until it is zero

0.625_{10} = 0.101_2

0.375_{10} = ?_2
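The multiply-by-2 steps can be sketched in JavaScript roughly as follows. The function name `fractionToBinary` and the `maxDigits` cut-off are assumptions made for this illustration; the cut-off is there because many fractions never reach zero.

```javascript
// A minimal sketch: convert a fraction in [0, 1) to a binary string.
// maxDigits is a hypothetical safety limit for fractions that never terminate.
function fractionToBinary(fraction, maxDigits) {
    var digits = "";
    while (fraction > 0 && digits.length < maxDigits) {
        fraction *= 2;              // shift the next binary digit into the integer part
        if (fraction >= 1) {
            digits += "1";
            fraction -= 1;          // continue with the fractional part only
        } else {
            digits += "0";
        }
    }
    return "0." + digits;
}

console.log(fractionToBinary(0.625, 10)); // 0.101
```

Doubling a floating point number is exact, so for inputs such as 0.625 the digits match the hand calculation.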

Converting integer

base-q expansion form

12 = x_3\cdot 2^3+x_2\cdot 2^2+x_1\cdot 2^1+x_0\cdot 2^0

12 = x_3\cdot 2^3+x_2\cdot 2^2+x_1\cdot 2^1+x_0

Factor out 2

12 = 2\cdot(x_3\cdot 2^2+x_2\cdot 2^1+x_1\cdot 2^0) +x_0

What is x_0?

Single out last digit

12 = 2y + x_0

If 12 and 2y are both even numbers, then x_0 must be 0

Divide both sides by 2 and factor out 2 again

6 = 2\cdot(x_3\cdot 2^1+x_2\cdot 2^0) + x_1

What is x_1?

Repeat until no digits with 2 left

Continue...

So the algorithm is:

divide the number by 2, take the remainder as the next binary digit (starting from the last digit), and continue with the quotient until it is zero

12_{10}=1100_2
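A short JavaScript sketch of the repeated division by 2, assuming a non-negative integer as input. The name `integerToBinary` is chosen for this example only.

```javascript
// A minimal sketch: convert a non-negative integer to a binary string.
function integerToBinary(n) {
    if (n === 0) return "0";
    var digits = "";
    while (n > 0) {
        digits = (n % 2) + digits;  // the remainder is the next digit, from last to first
        n = Math.floor(n / 2);      // continue with the quotient
    }
    return digits;
}

console.log(integerToBinary(12)); // 1100
```

The built-in `(12).toString(2)` gives the same string and can be used to check the result.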

Converting binary to decimal

Converting fraction

base-q expansion form

0.625 = x_1\cdot \frac{1}{2}+x_2\cdot \frac{1}{2^2}+x_3\cdot \frac{1}{2^3}

0.625_{10} = 0.101_2

So, the algorithm is...

start from 0 and go through the digits from last to first: add the current digit to the current result and divide by 2

0.1011_2=?_{10}
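As a rough JavaScript sketch of this rule, assuming the input is a string of the form "0.xxx" (the name `binaryFractionToDecimal` is made up for the example):

```javascript
// A minimal sketch: convert a binary fraction string such as "0.101" to a number.
function binaryFractionToDecimal(binary) {
    var digits = binary.split(".")[1] || "";
    var result = 0;
    // Walk the digits from last to first: add the digit, then divide by 2.
    for (var i = digits.length - 1; i >= 0; i--) {
        result = (result + Number(digits[i])) / 2;
    }
    return result;
}

console.log(binaryFractionToDecimal("0.101")); // 0.625
```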

Converting integer

base-q expansion form

12 = x_3\cdot 2^3+x_2\cdot 2^2+x_1\cdot 2^1+x_0\cdot 2^0

So, the algorithm is...

start from 0 and go through the digits from first to last: multiply the current result by 2 and add the current digit

12_{10} = 1100_2
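The same rule as a small JavaScript sketch; `binaryIntegerToDecimal` is a name invented for this illustration, and the input is assumed to contain only the characters 0 and 1.

```javascript
// A minimal sketch: convert a binary integer string such as "1100" to a number.
function binaryIntegerToDecimal(binary) {
    var result = 0;
    // Walk the digits from first to last: multiply by 2, then add the digit.
    for (var i = 0; i < binary.length; i++) {
        result = result * 2 + Number(binary[i]);
    }
    return result;
}

console.log(binaryIntegerToDecimal("1100")); // 12
```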

Converting 0.1 and 0.2 to binary

Converting 0.1

0.1\cdot2=0.2\enspace\enspace\enspace0.0...

0.2\cdot2=0.4\enspace\enspace\enspace0.00...

0.4\cdot2=0.8\enspace\enspace\enspace0.000...

0.8\cdot2=1.6\enspace\enspace\enspace0.0001...

0.6\cdot2=1.2\enspace\enspace\enspace0.00011...

0.2\cdot2=0.4\enspace\enspace\enspace0.000110...

The fraction 0.2 has appeared before, so the digits 0011 repeat forever

Converting 0.2

0.2\cdot2=0.4\enspace\enspace\enspace0.0...

0.4\cdot2=0.8\enspace\enspace\enspace0.00...

0.8\cdot2=1.6\enspace\enspace\enspace0.001...

0.6\cdot2=1.2\enspace\enspace\enspace0.0011...

0.2\cdot2=0.4\enspace\enspace\enspace0.00110...
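The repetition can be made visible with a small sketch that works on an exact fraction (numerator and denominator as integers) instead of a floating point number, and stops as soon as a remainder shows up a second time. The function name `repeatingBinary` and the parenthesis notation for the repeating block are assumptions of this example.

```javascript
// A minimal sketch: binary expansion of numerator/denominator (a fraction below 1),
// with the repeating block of digits wrapped in parentheses.
function repeatingBinary(numerator, denominator) {
    var seen = {};      // remainder -> position where it first appeared
    var digits = "";
    var remainder = numerator % denominator;
    while (remainder !== 0 && !(remainder in seen)) {
        seen[remainder] = digits.length;
        remainder *= 2;                                  // multiply by 2
        digits += Math.floor(remainder / denominator);   // integer part is the next digit
        remainder %= denominator;                        // continue with the fraction
    }
    if (remainder === 0) return "0." + digits;
    var start = seen[remainder];
    return "0." + digits.slice(0, start) + "(" + digits.slice(start) + ")";
}

console.log(repeatingBinary(1, 10)); // 0.0(0011)
console.log(repeatingBinary(1, 5));  // 0.(0011)
```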

Representing a number in scientific notation

Why scientific notation?

Scientific notation is a way to work easily with very large or small numbers.

General form

2 = 2\cdot10^0

Normalized form

1100.101 = 1.100101\cdot2^3
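Normalizing means moving the point so that exactly one non-zero digit stays in front of it, and recording how far it moved as the exponent. A rough JavaScript sketch, assuming the input is a binary string with a point (the name `normalizeBinary` and the returned object shape are made up for this example):

```javascript
// A minimal sketch: bring a binary string such as "1100.101" to normalized form.
function normalizeBinary(binary) {
    var parts = binary.split(".");
    var integerPart = parts[0].replace(/^0+/, "");   // drop leading zeros
    var fractionPart = parts[1] || "";
    var exponent, digits;
    if (integerPart.length > 0) {
        exponent = integerPart.length - 1;           // point moves to the left
        digits = integerPart + fractionPart;
    } else {
        var firstOne = fractionPart.indexOf("1");
        if (firstOne === -1) return null;            // zero has no normalized form
        exponent = -(firstOne + 1);                  // point moves to the right
        digits = fractionPart.slice(firstOne);
    }
    return {
        significand: digits[0] + "." + (digits.slice(1) || "0"),
        exponent: exponent
    };
}

console.log(normalizeBinary("1100.101"));    // { significand: '1.100101', exponent: 3 }
console.log(normalizeBinary("0.000110011")); // { significand: '1.10011', exponent: -4 }
```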

Rounding binary numbers

Possible rounding options

Round 0.42385 to 2 places. What numbers can it be rounded to?

0.42 and 0.43

Round 0.110110 to 2 places. What numbers can it be rounded to?

for a lower number we simply discard the remaining places
for a larger number we add 1 to the number in the last place

0.11 and 1.00 (0.11+0.01)

Defining shortest distance

How do we decide which option to round to? Calculate the distance to each option

x_1 = |0.11-0.110110|=0.000110

x_2 = |1.00-0.110110|=0.001010

Is x_1 < x_2 or x_2 < x_1?

So, x_1 < x_2 and we should round to 0.11

But!

Subtraction is cumbersome and won't work for infinite fractions

Comparing with the middle

The middle between 0.11 and 1.00 is 0.111

Is 0.110110 > 0.111 or 0.110110 < 0.111?

0.110110

0.111000 (larger)

So, 0.110110 < 0.111 and we should round to 0.11

General rule for rounding

Round 0.UVXYZ... to 2 places, the middle is 0.UV1, so

If X is 0, round down

0.UV0YZ...

0.UV1 (middle)

If X is 1 and any remaining digit is 1, round up

0.UV101...

0.UV1 (middle)

If X is 1 and all remaining digits are 0, apply tie breaker rule (round to even)

0.UV100...

0.UV1 (middle)
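The three cases can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the input is a string of the form "0.xxx", `places` is at least 1 and small enough for the kept digits to fit into an exact integer, and the name `roundBinaryFraction` is invented here.

```javascript
// A minimal sketch: round a binary fraction string to the given number of places
// using "round to nearest, ties to even".
function roundBinaryFraction(binary, places) {
    var digits = binary.split(".")[1] || "";
    var kept = digits.slice(0, places);
    while (kept.length < places) kept += "0";
    var x = digits.charAt(places);           // first discarded digit (X in the rule)
    var rest = digits.slice(places + 1);     // all remaining digits
    var roundUp;
    if (x !== "1") {
        roundUp = false;                     // below the middle
    } else if (rest.indexOf("1") !== -1) {
        roundUp = true;                      // above the middle
    } else {
        roundUp = kept.charAt(places - 1) === "1";  // tie: make the last digit even
    }
    var bits = (parseInt(kept, 2) + (roundUp ? 1 : 0)).toString(2);
    while (bits.length < places + 1) bits = "0" + bits;
    return bits.slice(0, bits.length - places) + "." + bits.slice(bits.length - places);
}

console.log(roundBinaryFraction("0.110110", 2)); // 0.11
console.log(roundBinaryFraction("0.111", 2));    // 1.00
```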

Rounding 0.1 and 0.2

Rounding 0.1 to 52 bits

Finding middle

Comparing: the number > the middle

Rounding up

Rounding 0.2 to 52 bits

Finding middle

Comparing: the number > the middle

Rounding up

Representing negative numbers with offset binary (Excess-K, biased representation)

Offset-K

How many numbers can we represent with 4 bits?

2^4 = 16

For positive numbers the range is

[0;15]

[0000;1111]

But if we include negative numbers, it can be

[-8;7]

[0000;1111]

K = 2^{n-1}

(common usage)

or

[-7;8]

[0000;1111]

K = 2^{n-1}-1

(floating point standard)

where n is the number of bits

Offset defines the range

8 and 7 are offsets, referred to as `K` (excess-K: excess-8, excess-7)

Calculating number

How to store the number 3 in 4 bits?

1. Calculating offset (K)

K = 2^{4-1}-1 = 7

If 0000 is -7, then what number should we add to get 3?

-7+10=3\rightarrow10=3+7

representation = number to store + K

3+K=3+7=10_{10}=1010_2

number to store = representation - K

1010_2=10_{10};\enspace\enspace10-K=10-7=3
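Both directions can be sketched in JavaScript, here with the offset K = 2^(n-1) - 1 used in the example above. The function names `toOffsetBinary` and `fromOffsetBinary` are assumptions of this illustration, and no range checking is done.

```javascript
// A minimal sketch of offset binary with K = 2^(bits - 1) - 1.
function toOffsetBinary(number, bits) {
    var K = Math.pow(2, bits - 1) - 1;
    var representation = (number + K).toString(2);   // representation = number + K
    while (representation.length < bits) representation = "0" + representation;
    return representation;
}

function fromOffsetBinary(representation) {
    var K = Math.pow(2, representation.length - 1) - 1;
    return parseInt(representation, 2) - K;          // number = representation - K
}

console.log(toOffsetBinary(3, 4));     // 1010
console.log(fromOffsetBinary("1010")); // 3
```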

First bit defines the sign

With K = 8, the smallest number -8 is stored as 0000 and zero is stored as 1000

-8_{10} = 0000_2

K = 8_{10} = 1000_2

-8_{10} + 8_{10} = 0000_{2}+1000_{2} = 0_{10} = 1000_{2}

0 - negative number

1 - positive number (or zero)

Floating point according to the IEEE754

Format

Name	Total bits	Sign	Exponent	Significand
Single precision	32	1	8	23
Double precision	64	1	11	52

Sign

0 - positive value

1 - negative value

Exponent

offset binary (biased representation)

Converting 0.1 to IEEE-754 double precision

Calculating exponent

0.1 in normalized form is 1.1001...\cdot2^{-4}, so the exponent is -4

K = 2^{11-1}-1 = 1023

-4+1023=1019_{10}=01111111011_2

Converting 0.2 to IEEE-754 double precision

Calculating exponent

0.2 in normalized form is 1.1001...\cdot2^{-3}, so the exponent is -3

K = 2^{11-1}-1 = 1023

-3+1023=1020_{10}=01111111100_2
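The exponent field can be computed with a few lines of JavaScript. This is a sketch under assumptions: the helper name `exponentBits` is invented, the exponent is taken from `Math.floor(Math.log2(...))`, which is fine for values like 0.1 and 0.2 but may be unreliable right at powers of two, and special cases (zero, subnormal numbers, infinity) are ignored.

```javascript
// A minimal sketch: the 11 exponent bits of a double, stored with offset K = 1023.
function exponentBits(number) {
    var K = Math.pow(2, 11 - 1) - 1;                        // 1023
    var exponent = Math.floor(Math.log2(Math.abs(number))); // -4 for 0.1, -3 for 0.2
    var bits = (exponent + K).toString(2);
    while (bits.length < 11) bits = "0" + bits;
    return bits;
}

console.log(exponentBits(0.1)); // 01111111011
console.log(exponentBits(0.2)); // 01111111100
```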

Validating conversion

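function to64bitFloat(number) {
    // Store the number as a double and look at its raw bytes.
    var f = new Float64Array(1);
    f[0] = number;
    var view = new Uint8Array(f.buffer);
    var i, result = "";
    // Assumes a little-endian platform: the most significant byte comes last.
    for (i = view.length - 1; i >= 0; i--) {
        var bits = view[i].toString(2);
        if (bits.length < 8) {
            bits = new Array(8 - bits.length).fill('0').join("") + bits;
        }
        result += bits;
    }
    // Separate the sign (1 bit), exponent (11 bits) and significand (52 bits).
    return result.slice(0, 1) + " " + result.slice(1, 12) + " " + result.slice(12);
}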

to64bitFloat(0.1);
// 0 01111111011 1001100110011001100110011001100110011001100110011010

to64bitFloat(0.2);
// 0 01111111100 1001100110011001100110011001100110011001100110011010

Calculating 0.1 + 0.2

Adjusting the exponent

Rounded 0.1

1.100110011001100110011001100110011001100110011001101\cdot2^{-4}

Rounded 0.2

1.100110011001100110011001100110011001100110011001101\cdot2^{-3}

To add the numbers, both must have the same exponent, so rewrite 0.1 with the exponent -3

1.100110011001100110011001100110011001100110011001101\cdot2^{-4} =

0.1100110011001100110011001100110011001100110011001101\cdot2^{-3}

Adding numbers

0.1100110011001100110011001100110011001100110011001101

+

1.1001100110011001100110011001100110011001100110011010

=

10.0110011001100110011001100110011001100110011001100111

Normalizing

10.0110011001100110011001100110011001100110011001100111\cdot2^{-3}=

1.00110011001100110011001100110011001100110011001100111\cdot2^{-2}

Rounding

The significand now has 53 bits after the point and must be rounded to 52

Finding middle

Comparing: the number = the middle

Round to even (round up here)

1.00110011...001100110011

+

0.00000000...000000000001

=

1.00110011...001100110100
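The adding, normalizing and rounding steps can be replayed with integer arithmetic. The sketch below uses `BigInt` (so it needs a reasonably modern JavaScript engine) and treats the 53-bit significands as whole numbers; the variable names and the choice to align both numbers to the exponent -4 are assumptions of this example, not part of the slides.

```javascript
// Significand shared by the rounded 0.1 and 0.2: hidden 1 plus 52 stored bits.
var significand = BigInt("0b1" + "1001100110011001100110011001100110011001100110011010");

// Express both numbers in units of 2^-56 (exponent -4 and 52 fraction bits).
var a = significand;        // 0.1 = significand * 2^-4 * 2^-52
var b = significand << 1n;  // 0.2 has exponent -3, so one extra factor of 2
var sum = a + b;

// The sum needs 55 bits but only 53 fit: drop 2 bits, rounding to nearest even.
var kept = sum >> 2n;
var dropped = sum & 3n;     // the two discarded bits; binary 10 is exactly the middle
if (dropped > 2n || (dropped === 2n && (kept & 1n) === 1n)) {
    kept += 1n;
}

// kept is below 2^53, so converting it to a double is exact.
var result = Number(kept) * Math.pow(2, -54);
console.log(result === 0.1 + 0.2); // true
console.log(result);               // 0.30000000000000004
```

Here the discarded bits land exactly on the middle and the kept significand is odd, so the tie breaker rounds up, which matches the manual calculation.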

Converting to decimal

1.00110011001100110011001100110011001100110011001101 \cdot 2^{-2}=

0.0100110011001100110011001100110011001100110011001101

// Uses the big.js library for exact decimal arithmetic.
Big.DP = 52; // keep up to 52 decimal places when dividing
// Binary digits after the point, reversed so that we start from the last one.
var numbers = "0100110011001100110011001100110011001100110011001101".split("").reverse();
var sum = Big(0);

numbers.forEach(function (number) {
    // Add the current digit to the current result and divide by 2.
    sum = sum.add(number).div(2);
    console.log(sum.toString());
});

Output

0.5
0.25
0.625
0.8125
0.40625
0.203125
0.6015625
0.80078125
0.400390625
0.2001953125
0.60009765625
0.800048828125
0.4000244140625
0.20001220703125
0.600006103515625
0.8000030517578125
0.40000152587890625
0.200000762939453125
0.6000003814697265625
0.80000019073486328125
0.400000095367431640625
0.2000000476837158203125
0.60000002384185791015625
0.800000011920928955078125
0.4000000059604644775390625
0.20000000298023223876953125
0.600000001490116119384765625
0.8000000007450580596923828125
0.40000000037252902984619140625
0.200000000186264514923095703125
0.6000000000931322574615478515625
0.80000000004656612873077392578125
0.400000000023283064365386962890625
0.2000000000116415321826934814453125
0.60000000000582076609134674072265625
0.800000000002910383045673370361328125
0.4000000000014551915228366851806640625
0.20000000000072759576141834259033203125
0.600000000000363797880709171295166015625
0.8000000000001818989403545856475830078125
0.40000000000009094947017729282379150390625
0.200000000000045474735088646411895751953125
0.6000000000000227373675443232059478759765625
0.80000000000001136868377216160297393798828125
0.400000000000005684341886080801486968994140625
0.2000000000000028421709430404007434844970703125
0.60000000000000142108547152020037174224853515625
0.800000000000000710542735760100185871124267578125
0.4000000000000003552713678800500929355621337890625
0.20000000000000017763568394002504646778106689453125
0.600000000000000088817841970012523233890533447265625
0.3000000000000000444089209850062616169452667236328125

Result

0.3000000000000000444089209850062616169452667236328125

Resources

http://www.exploringbinary.com/binary-converter/

http://blog.chewxy.com/2014/02/24/what-every-javascript-developer-should-know-about-floating-point-numbers/

https://en.wikipedia.org/wiki/Offset_binary

Why 0.1 + 0.2 is not equal to 0.3
