The fundamental integer data type in Java is the int, a 4-byte, big-endian, two's complement integer.
byte is an integer type.
Long literals take the suffix L.
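As a small illustration (class name is my own) of why the L suffix matters, a literal too large for an int is rejected without it, while int arithmetic silently wraps around:

```java
public class LongLiteralDemo {
    public static void main(String[] args) {
        // Without the L suffix, 10000000000 would be read as an int
        // literal and rejected by the compiler (out of int range).
        long big = 10000000000L;
        int i = 2147483647;        // Integer.MAX_VALUE, the largest int
        System.out.println(big);
        System.out.println(i + 1); // wraps around to -2147483648
    }
}
```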
While the difference between an 8-bit byte and a 32-bit int is insignificant for a single number, it can be very significant when several thousand to several million numbers are read. In fact, a single byte still takes up four bytes of space inside the Java virtual machine, but a byte array occupies only the amount of space it actually needs. The virtual machine includes special instructions for operating on byte arrays but does not include any instructions for operating on single bytes. They're just promoted to ints.
PS: A good demonstration that Java promotes byte values to int before any arithmetic:
byte b1 = 22;
byte b2 = 23;
byte b3 = b1 + b2;
The code above produces a compile-time error:
Error: Incompatible type for declaration.
Explicit cast needed to convert int to byte.
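Because b1 + b2 is computed as an int, narrowing the result back to a byte needs an explicit cast. A minimal sketch of the fix (class name is my own):

```java
public class BytePromotionDemo {
    public static void main(String[] args) {
        byte b1 = 22;
        byte b2 = 23;
        // b1 + b2 is evaluated as an int; the explicit cast narrows
        // the result back to a byte, discarding the high-order bits.
        byte b3 = (byte) (b1 + b2);
        System.out.println(b3); // prints 45
    }
}
```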
Java has no short or byte literals. When you write the literal 42 or 24000, the compiler always reads it as an int, never as a byte or a short, even when used on the right-hand side of an assignment to a byte or short, like this: byte b = 42;
PS: When you write byte b = 42;, the compiler still reads 42 as a 4-byte int; it is implicitly narrowed to a byte only because the value fits in byte's range of -128 to 127.
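A sketch of where this implicit narrowing stops (class name is my own):

```java
public class ByteLiteralDemo {
    public static void main(String[] args) {
        byte ok = 127;          // the int literal 127 fits in a byte, so it narrows implicitly
        // byte bad = 128;      // compile-time error: 128 is outside byte's range
        byte cast = (byte) 128; // an explicit cast compiles; the value wraps to -128
        System.out.println(ok);
        System.out.println(cast); // prints -128
    }
}
```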
Floating-point types
float and double
A float literal must carry the suffix F.
A floating-point literal without a suffix is always a double.
All floating-point arithmetic follows the IEEE 754 standard. There are three special floating-point values: positive infinity, negative infinity, and NaN.
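The three special values can be produced directly in ordinary arithmetic; a small sketch (class name is my own):

```java
public class SpecialFloatDemo {
    public static void main(String[] args) {
        // Dividing a nonzero double by zero yields infinity, not an exception.
        double posInf = 1.0 / 0.0;  // Double.POSITIVE_INFINITY
        double negInf = -1.0 / 0.0; // Double.NEGATIVE_INFINITY
        double nan = 0.0 / 0.0;     // Double.NaN
        System.out.println(posInf);
        System.out.println(negInf);
        // NaN is the only value that is not equal to itself.
        System.out.println(nan == nan);        // prints false
        System.out.println(Double.isNaN(nan)); // prints true
    }
}
```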
Char type
char
In Java, a char is a 2-byte, unsigned integer, the only unsigned type in Java.
A char can hold Unicode code points from 0 to 65,535.
The escape prefix \u introduces a Unicode value; \u2122 is the trademark sign ™.
PS: The following three chars are the same character:
char c = 0x10;
char c2 = '\u0010';
char c3 = 16;
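A quick check that the three forms above denote the same value, plus a demonstration of char being unsigned (class name is my own):

```java
public class CharDemo {
    public static void main(String[] args) {
        char c = 0x10;      // hexadecimal int literal
        char c2 = '\u0010'; // Unicode escape
        char c3 = 16;       // decimal int literal
        System.out.println(c == c2 && c2 == c3); // prints true
        // char is unsigned: casting -1 wraps to the maximum value, 65535.
        char max = (char) -1;
        System.out.println((int) max); // prints 65535
    }
}
```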
Unicode
Latin-1 suffices for most Western European languages (with the notable exception of Greek), but it doesn't have anywhere near the number of characters required to represent Cyrillic, Greek, Arabic, Hebrew, or Devanagari, not to mention pictographic languages like Chinese and Japanese. Chinese alone has over 80,000 different characters. To handle these scripts and many others, the Unicode character set was invented. Unicode has space for over one million different possible characters. Only about 100,000 are used in practice, the rest being reserved for future expansion. Unicode can handle most of the world's living languages and a number of dead ones as well.
The first 256 characters of Unicode are identical to the characters of the Latin-1 character set. Thus 65 is ASCII A and Unicode A; 66 is ASCII B and Unicode B, and so on.
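This overlap is easy to verify in code, since a char's numeric value is its code point (a sketch; class name is my own):

```java
public class CodePointDemo {
    public static void main(String[] args) {
        // The numeric value of a char is its Unicode code point,
        // which matches ASCII/Latin-1 for the first 256 characters.
        System.out.println((int) 'A'); // prints 65
        System.out.println((int) 'B'); // prints 66
        System.out.println((char) 65); // prints A
    }
}
```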
Unicode is only a character set. It is not a character encoding. That is, although Unicode specifies that the letter A has character code 65, it doesn't say whether the number 65 is written using one byte, two bytes, or four bytes, or whether the bytes used are written in big- or little-endian order. However, there are certain standard encodings of Unicode into bytes, the most common of which are UTF-8, UTF-16, and UTF-32.
UTF-32 is the most naïve encoding. It simply represents each character as a single 4-byte (32-bit) int.
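Since UTF-32 simply stores the code point as a 4-byte big-endian int, the encoding of a single character can be sketched directly with a ByteBuffer (an illustration, not a production encoder; class name is my own):

```java
import java.nio.ByteBuffer;

public class Utf32Sketch {
    public static void main(String[] args) {
        int codePoint = "A".codePointAt(0); // 65
        // UTF-32 (big-endian): the code point written as a 4-byte int.
        byte[] utf32 = ByteBuffer.allocate(4).putInt(codePoint).array();
        for (byte b : utf32) {
            System.out.print(b + " "); // prints 0 0 0 65
        }
        System.out.println();
    }
}
```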
UTF-16 represents most characters as a 2-byte, unsigned short. However, certain less common Chinese characters, musical and mathematical symbols, and characters from dead languages such as Linear B are represented in four bytes each. The Java virtual machine uses UTF-16 internally. In fact, a Java char is not really a Unicode character. Rather, it is a UTF-16 code unit, and sometimes two Java chars (a surrogate pair) are required to make up one Unicode character.
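The surrogate-pair behavior is observable from ordinary String methods; a sketch using a character outside the Basic Multilingual Plane (class name is my own):

```java
public class SurrogatePairDemo {
    public static void main(String[] args) {
        // U+1F600 lies outside the Basic Multilingual Plane, so it
        // needs two UTF-16 code units: a surrogate pair.
        String s = new String(Character.toChars(0x1F600));
        System.out.println(s.length());                      // prints 2 (char count)
        System.out.println(s.codePointCount(0, s.length())); // prints 1 (one character)
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // prints true
    }
}
```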
Finally, UTF-8 is a relatively efficient encoding (especially when most of your text is ASCII) that uses one byte for each of the ASCII characters, two bytes for each character in many other alphabets, and three-to-four bytes for characters from Asian languages. Java's .class files use UTF-8 internally to store string literals.
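The varying byte counts are easy to observe with String.getBytes (a sketch; class name is my own):

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    public static void main(String[] args) {
        // ASCII characters take one byte in UTF-8.
        System.out.println("A".getBytes(StandardCharsets.UTF_8).length); // prints 1
        // Characters from many other alphabets take two bytes (U+00E9).
        System.out.println("é".getBytes(StandardCharsets.UTF_8).length); // prints 2
        // CJK characters take three bytes (U+4E2D).
        System.out.println("中".getBytes(StandardCharsets.UTF_8).length); // prints 3
    }
}
```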