JavaSE_day04_ Basic data type, basic data type conversion

Data type classification

In Java, data also has types (any data must have types). The classification of data in Java is shown in the following figure:

                      

Basic data types: the types built into the Java language, covering the integer types, the floating-point types, the character type and the boolean type. These four categories are the simplest and most fundamental types.

                  

Reference data types: these are built on top of the basic data types. Java SE ships a very large class library that contains nearly 10,000 reference data types. For now, though, we focus on the basic types, so let's go through them in detail:

Integer types: byte, short, int, long

  • Each Java integer type has a fixed value range and field length that does not depend on the operating system, which keeps Java programs portable.
  • An integer constant in Java is of type int by default; to write a long constant, append "l" or "L" (this is required once the value exceeds the int range).
  • Variables in Java programs are usually declared as int; long is used only when int is too small to hold the number.
  • bit: the smallest storage unit in a computer. byte: the basic storage unit in a computer.
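The fixed ranges described above can be printed directly from the wrapper classes. The following is a minimal sketch; the class name IntegerRangeSketch and the helper sizesInBytes are made up for this illustration and are not part of the original course material.

```java
class IntegerRangeSketch {
    // Hypothetical helper: sizes in bytes of byte, short, int and long, in that order.
    static int[] sizesInBytes() {
        return new int[] { Byte.BYTES, Short.BYTES, Integer.BYTES, Long.BYTES };
    }

    public static void main(String[] args) {
        // The ranges are the same on every operating system.
        System.out.println("byte : " + Byte.MIN_VALUE + " ~ " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " ~ " + Short.MAX_VALUE);
        System.out.println("int  : " + Integer.MIN_VALUE + " ~ " + Integer.MAX_VALUE);
        System.out.println("long : " + Long.MIN_VALUE + " ~ " + Long.MAX_VALUE);

        // An integer literal is int by default, so a value beyond the int range needs the L suffix.
        long big = 3000000000L;
        // long tooBig = 3000000000; // does not compile: the int literal is out of range
        System.out.println(big);
    }
}
```

An uppercase L is usually the better choice for the suffix, because a lowercase l is easy to confuse with the digit 1.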

                          

Floating point type: float, double

Like the integer types, Java's floating-point types have a fixed value range and field length that does not depend on the operating system. Floating-point constants can be written in two forms:

  • Decimal form, for example: 5.12, 512.0f, .512 (a decimal point is required)
  • Scientific notation, for example: 5.12e2, 512E2, 100E-2

float: single precision. The mantissa is accurate to about 7 significant digits, which is often not enough.

double: double precision, roughly twice the precision of float. This is the type normally used.

A floating-point constant in Java is double by default. To write a float constant, append 'f' or 'F'.
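The difference between the two precisions, and the need for the f/F suffix, can be seen in a small experiment. This is only a sketch; the class name FloatPrecisionSketch and the helper floatDiffersFromDouble are invented for the illustration.

```java
class FloatPrecisionSketch {
    // Hypothetical helper: true when 0.1 stored as float differs from 0.1 stored as double.
    static boolean floatDiffersFromDouble() {
        float f = 0.1F;
        double d = 0.1;
        return f != d; // f is widened to double before the comparison
    }

    public static void main(String[] args) {
        double d = 0.1234567890123456789;  // a double keeps roughly 15 to 16 significant digits
        float f = 0.1234567890123456789F;  // a float keeps roughly 7 significant digits
        System.out.println(d);
        System.out.println(f);

        // float wrong = 3.14; // does not compile: 3.14 is a double literal
        float ok = 3.14F;
        System.out.println(ok);

        // Scientific notation is just another way to write the same values.
        System.out.println(5.12e2);  // 512.0
        System.out.println(100E-2);  // 1.0

        System.out.println(floatDiffersFromDouble()); // true
    }
}
```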

                     

Character type: char

char data represents a "character" in the usual sense and occupies 2 bytes.

All characters in Java are encoded in Unicode, so a character can store a letter, a Chinese character, or a character of other written languages.

A character constant can be written in three forms:

  • A single character enclosed in single quotation marks (''). For example: char c1 = 'a'; char c2 = '中'; char c3 = '9';
  • An escape sequence: Java also allows the escape character '\' to turn the following character into a special character constant. For example: char c3 = '\n'; where '\n' is the newline character.
  • A Unicode value written directly as '\uXXXX', where XXXX is a hexadecimal integer. For example, \u000a stands for \n.

char values can take part in arithmetic, because every character corresponds to a Unicode code.
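Because each char is stored as a number, arithmetic on characters behaves like arithmetic on small integers. The sketch below illustrates this; the class name CharArithmeticSketch and the helper next are assumptions made for the example.

```java
class CharArithmeticSketch {
    // Hypothetical helper: the character that follows c in Unicode order.
    static char next(char c) {
        return (char) (c + 1); // c + 1 is an int, so it must be cast back to char
    }

    public static void main(String[] args) {
        char a = 'a';
        System.out.println(a + 1);           // 98, because char + int gives an int
        System.out.println(next(a));         // b
        System.out.println('9' - '0');       // 9, the numeric value of the digit character
        System.out.println((int) '\u4e2d');  // 20013, the Unicode code of the Chinese character written as \u4e2d
    }
}
```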

Escape character

  • An escape character uses a combination of ordinary characters to represent a special character. Because the combined characters no longer carry their original meaning, they are called escape characters. Escape characters in Java start with \. The common ones are \t, \n, \u, \\, \' and \", where \t is the tab character, \n is the newline character, \\ is an ordinary \ character, \' is an ordinary ' and \" is an ordinary ".

boolean type: boolean

  • The boolean type is used to evaluate logical conditions and generally appears in program flow control.
  • boolean data can only take the values true and false; null is not allowed.
  • Unlike C, Java does not let you use 0 or a non-zero integer in place of false and true.
  • The Java virtual machine has no bytecode instructions dedicated to boolean values. After compilation, the boolean values used in Java code are represented by the int data type inside the virtual machine: true as 1 and false as 0.

Code demonstration

/*
Data types defined in Java
 1, Variables are classified by data type:
    Basic data type:
        Integer: byte \ short \ int \ long
        Floating point type: float \ double
        Character type: char
        boolean: boolean
    Reference data type:
        Class
        Interface
        Array
 2, Where the variable is declared in the class:
        Member variable vs local variable
*/
class VariableTest1 {
    public static void main(String[] args) {
        // 1. Integer types: byte (1 byte = 8 bits) \ short (2 bytes) \ int (4 bytes) \ long (8 bytes)
        // ① Range of byte: -128 ~ 127
        byte b1 = 12;
        byte b2 = -128;
        //b2 = 128;//Compilation failed
        System.out.println(b1);
        System.out.println(b2);
        // ② A long literal is written with a trailing "l" or "L"
        // ③ In general, use int when defining integer variables.
        short s1 = 128;
        int i1 = 1234;
        long l1 = 3414234324L;
        System.out.println(l1);
 
        // 2. Floating-point types: float (4 bytes) \ double (8 bytes)
        // ① Floating-point types represent values with a decimal point
        // ② The value range of float is even larger than that of long
 
        double d1 = 123.3;
        System.out.println(d1 + 1);
        // ③ When defining a float variable, the value must end with "f" or "F"
        float f1 = 12.3F;
        System.out.println(f1);
        // ④ In general, use double when defining floating-point variables.
 
        // 3. Character type: char (1 character = 2 bytes)
        // ① A char value is usually written inside a pair of '', which may contain only one character
        char c1 = 'a';
        //Compilation failed
        //c1 = 'AB';
        System.out.println(c1);
 
        char c2 = '1';
        char c3 = '中';
        char c4 = 'ス';
        System.out.println(c2);
        System.out.println(c3);
        System.out.println(c4);
 
        // ② Representations: 1. a single character 2. an escape character 3. a Unicode value written directly
        char c5 = '\n';//Line feed
        c5 = '\t';//Tab
        System.out.print("hello" + c5);
        System.out.println("world");
 
        char c6 = '\u0043';
        System.out.println(c6);
 
        //4.Boolean: boolean
        //① Only one of two values can be taken: true , false
        //② It is often used in condition judgment and loop structure
        boolean bb1 = true;
        System.out.println(bb1);
 
        boolean isMarried = true;
        if(isMarried){
            System.out.println("You can't join the \"singles\" party!\\n What a pity");
        }else{
            System.out.println("You can talk more about girlfriends!");
        }
 
    }
}

Variables are classified by data type

Java is a strongly typed language: every kind of data has a specific data type, and each type is allocated a different amount of memory.

           

Variables are divided according to the declared position

  • Variables declared in the class but outside any method are called member variables.
  • Variables declared inside a method are called local variables.

             

Garbled text and a first look at character sets

Inside the computer, all data is represented in binary. Each bit has two states, 0 and 1, so eight bits can be combined into 256 states; this group of eight bits is called a byte. A byte can therefore represent 256 different states, each corresponding to one symbol, for 256 symbols in total, from 00000000 to 11111111.

ASCII code:

In the 1960s, the United States defined a character encoding that standardized the relationship between English characters and binary bits. This is called ASCII. ASCII specifies a total of 128 characters. For example, the space character (SPACE) is 32 (binary 00100000) and the uppercase letter A is 65 (binary 01000001). These 128 symbols (including 33 non-printable control characters: codes 0 to 31 and 127) occupy only the low 7 bits of a byte; the highest bit is always 0.
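The two ASCII examples above can be checked by printing a character's code as an 8-bit binary string. This is a small sketch; the class name AsciiSketch and the helper toEightBits are hypothetical names chosen for the illustration.

```java
class AsciiSketch {
    // Hypothetical helper: formats a code from 0 to 255 as 8 binary digits with leading zeros.
    static String toEightBits(int code) {
        String bits = Integer.toBinaryString(code);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println((int) ' ' + " -> " + toEightBits(' ')); // 32 -> 00100000
        System.out.println((int) 'A' + " -> " + toEightBits('A')); // 65 -> 01000001
        System.out.println((int) 'a' + " -> " + toEightBits('a')); // 97 -> 01100001
    }
}
```

In every line the leading bit is 0, which matches the statement that ASCII uses only the low 7 bits of the byte.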

Disadvantages:

  • Cannot represent all characters.
  • The same code can represent different characters: for example, 130 represents é in a French encoding, but in a Hebrew encoding it represents the letter Gimel (ג).
/*
Mapping tables between numbers and characters (code tables):

ASCII table: American Standard Code for Information Interchange.
Unicode table: the universal code. It also maps numbers to symbols. Codes 0-127 are exactly the same as ASCII, and from 128 onward it contains many more characters.

48 - '0'
65 - 'A'
97 - 'a'
*/
public class Demo03DataTypeChar {
    public static void main(String[] args) {
        char zifu1 = '1';
        System.out.println(zifu1 + 0); // 49
        
        char zifu2 = 'A'; // Under the hood, the number 65 is stored
        
        char zifu3 = 'c';
        // The left side is int, the right side is char
        // char --> int is a conversion from small to large,
        // so automatic type conversion takes place
        int num = zifu3;
        System.out.println(num); // 99
        
        char zifu4 = '中'; // A Chinese character is also a valid char
        System.out.println(zifu4 + 0); // 20013
    }
}

Unicode encoding

Garbled text: there are many encodings in the world, and the same binary number can be interpreted as different symbols. To open a text file you therefore have to know its encoding; if the bytes are interpreted with the wrong encoding, the text comes out garbled.

Unicode: a code that includes all the symbols in the world. Each symbol is given a unique code, so garbled text does not arise as long as Unicode is used consistently.
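Garbled text is easy to reproduce on purpose: encode a string with one encoding and decode the bytes with another. The sketch below does exactly that; the class name GarbledTextSketch and the helper decodeUtf8BytesAs are assumptions made for the illustration, and ISO-8859-1 is simply one convenient "wrong" encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class GarbledTextSketch {
    // Hypothetical helper: encodes text as UTF-8, then decodes the bytes with the given charset.
    static String decodeUtf8BytesAs(String text, Charset charset) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters, written as Unicode escapes

        // Wrong charset: every byte is read as a separate character, so the text is garbled.
        String wrong = decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length());          // 6

        // Right charset: the original text comes back unchanged.
        String right = decodeUtf8BytesAs(original, StandardCharsets.UTF_8);
        System.out.println(right.equals(original));  // true
    }
}
```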

Disadvantages of Unicode:

  • Unicode only specifies the binary code of each symbol, not how that code should be stored. As a result, Unicode and ASCII cannot be told apart: the computer cannot tell whether three bytes represent one symbol or three separate symbols.
  • English letters need only one byte. If Unicode required every symbol to be stored in three or four bytes, each English letter would be preceded by two or three bytes of zeros, which wastes a great deal of storage space.

UTF-8

UTF-8 is the most widely used Unicode implementation on the Internet. UTF-8 is a variable-length encoding: it uses 1 to 4 bytes to represent a symbol (the original design allowed up to 6), and the byte length depends on the symbol.

Coding rules of UTF-8:

  • For a single-byte UTF-8 encoding, the highest bit of the byte is 0 and the other 7 bits encode the character (identical to ASCII).
  • For a multi-byte UTF-8 encoding of n bytes, the first n bits of the first byte are 1 and bit n+1 of the first byte is 0. The remaining bits of that byte are used to encode the character.
  • All bytes after the first start with the two high bits "10"; the other six bits are used to encode the character.
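The three rules above can be observed by printing the UTF-8 bytes of a few characters in binary. This is a minimal sketch; the class name Utf8Sketch and the helper utf8Bits are invented for the illustration.

```java
import java.nio.charset.StandardCharsets;

class Utf8Sketch {
    // Hypothetical helper: the UTF-8 bytes of the text as space-separated 8-bit binary strings.
    static String utf8Bits(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            String bits = Integer.toBinaryString(b & 0xFF); // mask so the byte is treated as unsigned
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%8s", bits).replace(' ', '0'));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 1 byte: highest bit 0, same as ASCII
        System.out.println(utf8Bits("A"));       // 01000001
        // 2 bytes: first byte starts with 110, second byte starts with 10
        System.out.println(utf8Bits("\u00e9"));  // 11000011 10101001
        // 3 bytes: first byte starts with 1110, the following bytes start with 10
        System.out.println(utf8Bits("\u4e2d"));  // 11100100 10111000 10101101
    }
}
```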

Basic data type conversion

Automatic type conversion: a type with a smaller capacity is automatically converted into a data type with a larger capacity.

  • When data of several types is mixed in one operation, the system first automatically converts all operands to the type with the largest capacity and then computes.
  • byte, short and char are not converted into each other; they are first converted to int during calculation.
  • The boolean type cannot take part in operations with other data types.
  • When a value of any basic data type is joined to a string with +, the value is automatically converted to a string.

The data types are sorted by capacity:

                       

Code demonstration, automatic type conversion

/*
When the data types are different, data type conversion will occur.
Automatic type conversion (implicit)
    1. Features: the code does not need special processing and is completed automatically.
    2. Rule: data range from small to large.
Cast (explicit)
*/
public class Demo01DataType {
    public static void main(String[] args) {
        System.out.println(1024); // This is an integer. The default is int type
        System.out.println(3.14); // This is a floating point number. The default is double type
 
        // The left side is long; the int literal on the right has a different type
        // The equal sign is an assignment: the int constant on the right is stored in the long variable on the left
        // int --> long satisfies the small-to-large range rule
        // This line performs an automatic type conversion.
        long num1 = 100;
        System.out.println(num1); // 100
 
        // The left side is double, the right side is float: the types differ
        // float --> double follows the small-to-large rule
        // Automatic type conversion takes place here as well
        double num2 = 2.5F;
        System.out.println(num2); // 2.5
 
        // The left side is float, the right side is long: the types differ
        // long --> float: float has the larger range, so this follows the small-to-large rule
        // Automatic type conversion takes place here as well
        float num3 = 30L;
        System.out.println(num3); // 30.0
 
        byte b = 1;
        short s = 2;
        char c = '3';
        //byte + short +char --->int + int + int -->int
        int result = b + s + c;
        System.out.println(result);//54
 
    }
}
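The demonstration above does not cover the rule about joining basic values to a string with +. Since + is evaluated from left to right, the position of the string decides whether the numbers are added first or appended one by one. The following sketch shows the difference; the class name StringConcatSketch and the helper describe are hypothetical.

```java
class StringConcatSketch {
    // Hypothetical helper: once the left operand is a String, every later + is a concatenation.
    static String describe(int a, int b) {
        return "sum = " + a + b;
    }

    public static void main(String[] args) {
        System.out.println(1 + 2 + "3");     // 33  (1 + 2 is computed first, then "3" is appended)
        System.out.println("1" + 2 + 3);     // 123 (the String comes first, so 2 and 3 are appended)
        System.out.println('a' + 1 + "");    // 98  (char + int gives an int before the String appears)
        System.out.println("" + 'a' + 1);    // a1
        System.out.println(describe(1, 2));  // sum = 12
    }
}
```

Wrapping the arithmetic in parentheses, as in "sum = " + (a + b), is the usual way to force the addition to happen first.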

Forced type conversion: the reverse process of automatic type conversion. Converting a data type with large capacity into a data type with small capacity is forced type conversion.

  • Features: the code needs a special format, and the conversion does not happen automatically.
  • Format: smallerRangeType variableName = (smallerRangeType) largerRangeValue;

Code demonstration, cast type:

public class Demo02DataType {
    public static void main(String[] args) {
        // The left side is int, the right side is long: the types differ
        // long --> int is not a small-to-large conversion
        // Automatic type conversion cannot take place!
        // Format: smallerRangeType variableName = (smallerRangeType) largerRangeValue;
        int num = (int) 100L;
        System.out.println(num);//100
 
    }
}

Forced type conversion is generally not recommended, because it may cause precision loss and data overflow.

public class Demo03DataType {
    public static void main(String[] args) {
 
        // Forcing a long into an int: data overflow
        int num2 = (int) 6000000000L;
        System.out.println(num2); // 1705032704
 
        // double --> int, forced conversion: precision loss
        int num3 = (int) 3.99;
        System.out.println(num3); // 3. This is not rounding; all decimal places are discarded
 
    }
}

Notes

  • Forced type conversion is not recommended.
  • Converting a floating-point value to an integer simply discards the fractional part, which may lose precision.
  • Data overflow occurs when the value of a data type with a large range exceeds the maximum range of a data type with a small range.
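Both pitfalls in the notes above are silent: the program keeps running with a wrong value. The sketch below shows the effect and two standard library calls that help avoid it, Math.round for rounding and Math.toIntExact for detecting overflow. The class name NarrowingSketch and the helper fitsInInt are assumptions made for the illustration.

```java
class NarrowingSketch {
    // Hypothetical helper: true when the long value can be stored in an int without overflow.
    static boolean fitsInInt(long value) {
        try {
            Math.toIntExact(value); // throws ArithmeticException on overflow
            return true;
        } catch (ArithmeticException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println((byte) 200);        // -56, because 200 is outside -128 ~ 127 and wraps around
        System.out.println((int) 3.99);        // 3, the fractional part is discarded
        System.out.println((int) -3.99);       // -3, truncation goes toward zero
        System.out.println(Math.round(3.99));  // 4, use rounding when that is what is wanted

        System.out.println(fitsInInt(100L));         // true
        System.out.println(fitsInInt(6000000000L));  // false
    }
}
```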

Two compiler optimizations

Optimization 1: for the three types byte/short/char, if the constant assigned on the right does not exceed the range of the type on the left, the javac compiler implicitly adds a (byte), (short) or (char) cast for us.

  • If the value on the right is within the range of the type on the left, the compiler supplies the cast.
  • If the value on the right exceeds the range of the type on the left, the compiler reports an error.
public class DemoNotice {
    public static void main(String[] args) {
        // The right side is an int literal, but it is within the range of the left side, so this is correct.
        // int --> byte is not an automatic type conversion
        byte num1 = /*(byte)*/ 30; // The right side does not exceed the range of the left side
        System.out.println(num1); // 30
        
        // byte num2 = 128; // The right side exceeds the range of the left side
        
        // int --> char, within range
        // The compiler implicitly adds a (char) cast
        char zifu = /*(char)*/ 65;
        System.out.println(zifu); // A
    }
}

Optimization 2: when assigning to a variable, if the expression on the right consists entirely of constants with no variables, the javac compiler evaluates the constant expression at compile time. For example, in short result = 5 + 8; the right side of the equal sign contains only constants, and no variables take part in the operation. The compiled .class bytecode file is therefore equivalent to short result = 13; and because the constant result on the right does not exceed the range of the type on the left, the statement is correct. This is called compiler constant optimization.

  • Note: once there are variables in the expression, this optimization cannot be carried out.

Code example

public class DemoNotice {
    public static void main(String[] args) {
        short num1 = 10; // Correct: the right side does not exceed the range of the left side
        
        short a = 5;
        short b = 8;
        // short + short --> int + int --> int
        // short result = a + b; // Wrong! The left side would need to be int
        
        // The right side contains no variables, only two constants
        short result = 5 + 8;
        System.out.println(result); // 13
        
        // short result2 = 5 + a + 8; // Wrong! The variable a makes the expression an int (value 18)
    }
}
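One refinement to the note about variables is worth knowing: a final variable initialized with a constant counts as a compile-time constant in Java, so the compiler can still fold expressions that use it. The sketch below contrasts the two cases; the class name ConstantFoldingSketch and the helper sum are hypothetical names for this illustration.

```java
class ConstantFoldingSketch {
    // Hypothetical helper: adds two final shorts without any cast.
    static short sum() {
        final short a = 5;    // final with a constant initializer: a compile-time constant
        final short b = 8;
        short result = a + b; // compiles: the constant expression 13 fits in short
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sum()); // 13

        short x = 5;               // an ordinary variable, not a constant
        // short y = x + 8;        // does not compile: x + 8 is an int
        short y = (short) (x + 8); // an explicit cast is required
        System.out.println(y);     // 13
    }
}
```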

There are fixed conversion rules between the basic data types. They can be summarized in the following six rules, which apply to any program:

  • Of the eight basic data types, all except boolean can be converted into one another;
  • An integer literal that does not exceed the value range of byte, short or char can be assigned directly to a variable of that type;
  • Conversion from small capacity to large capacity is called automatic type conversion. The order of capacity from small to large is: byte < short (char) < int < long < float < double. Both short and char occupy two bytes, but char can represent larger positive integers;
  • Conversion from large capacity to small capacity is called forced type conversion and requires an explicit cast operator; precision may be lost at run time, so use it with caution;
  • When byte, short and char are mixed in an operation, each is converted to int first;
  • When several data types are mixed in an operation, the operands are converted to the type with the largest capacity first.
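The capacity order in the rules above is about value range, not about precision. A long (8 bytes) converts automatically to a float (4 bytes) because float covers a wider range, yet a float keeps only about 7 significant digits, so large long values can change slightly on the way. The sketch below illustrates this; the class name WideningPrecisionSketch and the helper throughFloat are assumptions made for the example.

```java
class WideningPrecisionSketch {
    // Hypothetical helper: sends a long through float and back to see what survives.
    static long throughFloat(long value) {
        float f = value;  // automatic conversion: long --> float, no cast needed
        return (long) f;  // forced conversion back to long
    }

    public static void main(String[] args) {
        System.out.println(throughFloat(30L));         // 30, small values survive unchanged
        System.out.println(throughFloat(123456789L));  // 123456792, the low digits are rounded away
    }
}
```

When exact large integers matter, keeping the value in long (or converting to double, which holds about 15 to 16 significant digits) avoids this rounding.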

Tags: JavaSE

Posted by xcoderx on Tue, 24 May 2022 03:37:05 +0300