# IEEE 754 standard

Storage method: sign bit, exponent, mantissa
IEEE floating point arithmetic standard: IEEE 754

type Sign bit index Mantissa
float 1 8 23
double 1 11 52

Take float as an example to discuss floating point numbers.

The memory representation of floating-point type is different from that of integer type, and the memory representation of floating-point type is relatively complex.

According to IEEE 754 standard, float divides its 32bits into three parts, with symbol bit S accounting for 1bit, index E accounting for 8bits and mantissa M accounting for 23bits.

The following discusses the memory representation of float type with a common body.

```union
{
float f; //Truth value
struct
{
uint32_t M : 23; // index
uint32_t E : 8;  // Mantissa
uint32_t S : 1;  //Sign bit
};
uint32_t v; //Machine code
} f;
```

Floating point number calculation formula:
\$\$f = (-1){S}2{E-127}1.M\$\$

For example: given the machine code f.v = 0x41360000, the truth value f.f.

1. Expand the machine code into binary 0x41360000 - > 0b0100 0001 0011 0110 0000; S = 0b0 = 0x0 = 0, E = 0b100 0001 0 = 0x82 = 130, M = 0b011 0110 0000 = 0x360000;
2. 1.M is the binary decimal calculation, and the cube multiplied by 2 according to S and E is multiplied by 1 M. That is, the decimal point moves three digits to the right, and the true value F.F = 0b1011 011 = 11.375.

Code verification:

```#include <stdio.h>
#include <stdint.h>

union uFloat
{
float f; //Truth value
struct
{
uint32_t M : 23; // index
uint32_t E : 8;  // Mantissa
uint32_t S : 1;  //Sign bit
};
uint32_t v; //Machine code
} f;

void uPrintln(union uFloat uf)
{
printf("f: %f\nS: 0x%x, E: 0x%x, M: 0x%x\nv: 0x%08x\n", uf.f, uf.S, uf.E, uf.M, uf.v);
}

int main(void) {
f.v = 0x41360000;
uPrintln(f);
printf("sizeof: %d\n", sizeof(f));
return 0;
}
```

Output:

```f: 11.375000
S: 0x0, E: 0x82, M: 0x360000
v: 0x41360000
sizeof: 4
```

# Imprecise type

In a limited number of bits, the floating-point type is better than uint32_t represents a larger range, resulting in partial data loss. It cannot represent all accurate values. It is an imprecise type.

For example, in the open interval (33829141358099960719503586036763590656.0339620349071475272940736969149792649216.0), at least 4202553 integers cannot be represented normally.

```int main(void) {
f.f = 338291141358099960719503586036763599657.000000;
uPrintln(f);
return 0;
}
```

Output:

```f: 338291141358099960719503586036763590656.000000
S: 0x0, E: 0xfe, M: 0x7e8081
v: 0x7f7e8081
```

The closer the floating-point data is to zero, the smaller the error is, and vice versa.

# verification

This conclusion is verified by two simple procedures.

The function of this C program is to increase the floating point number corresponding to the interval sampling of machine code and print it to the table file.

```#include <stdio.h>
#include <stdint.h>

union
{
float f;
struct
{
uint32_t M : 23; // index
uint32_t E : 8;  // Mantissa
uint32_t S : 1;  //Sign bit
};
int32_t i;
} u;

void uPrintln(float f)
{
u.f = f;
printf("i: %d, f: %f, S: %d, E: %d, M: %d\n", u.i, u.f, u.S, u.E, u.M);
}

void ufPrintln(float f, FILE *fp)
{
u.f = f;
fprintf(fp, "%d,0x%08x,%f,%d,%d,%d,\n", u.i, u.i, u.f, u.S, u.E, u.M);
}

void pPrintfln(uint32_t cur, uint32_t max)
{
static uint64_t tmp = 0;
uint64_t pten = cur * 100 / max;
if (tmp != pten)
{
tmp = pten;
printf("\rprocessing: %d", pten);
}
}

#define MAX UINT32_MAX
#define INT UINT16_MAX

int main(void)
{
FILE *fp = NULL;

fp = fopen("float.csv", "w+");
if (fp == NULL)
{
printf("open error!\n");
return -1;
}
printf("open successful!\n");
fprintf(fp, "i,i,f,S,E,M,\n");
for (uint32_t i = 0; i < MAX; i += INT)
{
u.i = i;
ufPrintln(u.f, fp);
pPrintfln(i, MAX);
}
fclose(fp);

return 0;
}
```

The function of this python file is to read table files and draw points.

```import matplotlib.pyplot as plt
import pandas as pds

data.plot(x="i.1", y="f")
plt.show()
```

It can be seen that when the machine code increases to a certain extent, the floating-point number begins to change exponentially, which means that the more floating-point numbers are crossed and lost between two adjacent machine codes. # communication

Wechat official account: IOT points north
Station B: IOT refers to the north
Thousand penguins: 658685162

Tags: C

Posted by maest on Tue, 10 May 2022 16:37:10 +0300