Programming Paradigms, episode 2: negative values, float precision

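These notes explain how integers and floating-point numbers are represented: how negative numbers are encoded, the rules for type conversion, and the range and precision of floating-point values. Worked examples show conversions between integers and floats and how both are stored in memory.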

episode2

// a very interesting lecture from an excellent teacher, I love it

1, why negative numbers are represented the way they are

2, how a float is represented

3, type conversion of negative integer types

4, type conversion between float and integral types

----types

----negative representation: why do we invert all the bits and add 1 to represent a negative number? // a very clear explanation from an excellent teacher.

--take 7 for example. If we just set the sign bit and use 1000 0111 as -7, then adding 7 and -7 gives 1000 1110, which reads as -14 instead of 0.
0000 0111

+ 1000 0111

= 1000 1110

of course this does not work.

--we want 7 + (-7) to come out as all 0s. First we aim for a result of all 1s, because all 1s is easier to work out than all 0s, like

0000 0111

+ ???? ????

= 1111 1111

we can see what we need is

1111 1000

it is 7 (0000 0111) with every bit inverted.

--to get a result of all 0s, we need to add 1 to the current all 1s,

1111 1111

+ 0000 0001

=1 0000 0000 the leading 1 in the result overflows out of the 8 bits and does not count, so we get 0, which is what we want.

therefore the value we calculated needs 1 added to it as well, so that the sum comes out as all 0s,

which is 1111 1000 + 1 = 1111 1001.

that is why negative numbers are represented this way: invert all the bits (the sign bit included) and add 1.

--char, short, int, long: type conversion

--unsigned types: the simplest case. Just follow the protocol; the hardware does not care about the lost bytes.

small type to big type: copy the value into the low bytes (the new high bytes are 0).

big type to small type: cut off the high bytes and keep the low bytes.

--signed types: sign extension

to preserve the sign of a negative value, all the added high bits must be 1 (the domino effect).

1000 0111 (the 8-bit source)

0000 0000 0000 0000 (the empty 16-bit target)

1111 1111 1000 0111 (after the copy: the domino effect)

----float

float:

1 bit (sign)

8 bits (exponent)

23 bits (mantissa)

double:

1 bit (sign)

11 bits (exponent)

52 bits (mantissa)

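4 bytes give 2^32 distinct bit patterns; read as an int they represent the values -2^31 to 2^31-1.

--float protocol: (-1)^s * 1.xxxxx * 2^(exp-127)

1 sign bit s, 8 bits of exp, and 23 bits for the digits after the binary point (.xxxx)

exp is magnitude only, an unsigned int from 0 to 255, which means the exponent exp-127 ranges from -127 to 128. (In IEEE 754 the patterns exp = 0 and exp = 255 are reserved for zero/subnormals and infinity/NaN, so normalized values use exponents -126 to 127.)

when the 23 bits of .xxxx are all 1, the fraction is 1 - 2^-23, a number very close to 1. Add one more unit in the last place and the carry ripples into the exponent, so the float becomes an exact power of 2.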

_______________

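_About the range

Going by the formula, the exponent of a float runs from -127 to 128 and that of a double from -1023 to 1024. The exponent field is stored with a bias (127 for float, 1023 for double), not in two's complement, and its lowest and highest values are reserved, so normalized numbers use -126 to 127 and -1022 to 1023. The negative exponents determine the smallest absolute value a floating-point number can express; the positive exponents determine the largest absolute value, and therefore the range.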

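The range of float is just inside -2^128 to +2^128, about -3.40E+38 to +3.40E+38; the largest finite value is (2 - 2^-23) * 2^127. The range of double is just inside -2^1024 to +2^1024, about -1.80E+308 to +1.80E+308.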

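_About precision: how many significant decimal digits can be kept

The precision of float and double is determined by the number of mantissa bits. Floating-point numbers are stored in memory in binary scientific notation, and the integer part is always an implicit 1. Because it never changes it is not stored, so all the stored mantissa bits go to the fractional part.

float: 2^23 = 8388608, which has 7 digits, so at most 7 significant digits are available but only 6 are guaranteed; the precision of float is 6 to 7 significant digits. double: 2^52 = 4503599627370496, which has 16 digits, so its precision is 15 to 16 digits.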

__________________________________

--eg. 7

7.0 * 2^0

3.5 * 2^1

1.75*2^2

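the factor to the left of the multiplication sign must be at least 1 and less than 2, so 1.75 * 2^2 is the form that gets stored.

the exponent to the right of it cannot go below about -127 in a float; anything smaller calls for a double.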

----float and int, type conversion

--eg1,

int i =5;

float f =i;

cout<<f<<endl; // the value is converted, so f is 5.0; cout prints 5 (printf("%f", f) prints 5.000000)

--eg2,

int i = 37;

float f = *(float*)&i; // the bits of 37 are reinterpreted, not converted: f is a very, very small number, barely meaningful

--eg3,

float f=7.0;

short s = *(short*)&f;

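of the 4 bytes of the float, the pointer points at the first one (the lowest address), so the short picks up the two bytes stored there. On a big-endian machine those are the high-order two bytes of the float, which is the case described here; on a little-endian machine such as x86 they are the low-order two bytes.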

Reposted from: https://www.cnblogs.com/aprilapril/p/4330426.html