而不是编写一个算法来做到这一点,它最好使用内置函数。
Integer.bitCount()
这使得这个特别高效的是JVM可以将其视为内在的。即在支持它的平台上用单个机器代码指令识别和替换整个事物。英特尔/ AMD
演示这种优化的有效性
public static void main(String... args) {
perfTestIntrinsic();
perfTestACopy();
}
private static void perfTestIntrinsic() {
long start = System.nanoTime();
long countBits = 0;
for (int i = 0; i < Integer.MAX_VALUE; i++)
countBits += Integer.bitCount(i);
long time = System.nanoTime() - start;
System.out.printf("Intrinsic: Each bit count took %.1f ns, countBits=%d%n", (double) time / Integer.MAX_VALUE, countBits);
}
private static void perfTestACopy() {
long start2 = System.nanoTime();
long countBits2 = 0;
for (int i = 0; i < Integer.MAX_VALUE; i++)
countBits2 += myBitCount(i);
long time2 = System.nanoTime() - start2;
System.out.printf("Copy of same code: Each bit count took %.1f ns, countBits=%d%n", (double) time2 / Integer.MAX_VALUE, countBits2);
}
// Copied from Integer.bitCount()
public static int myBitCount(int i) {
// HD, Figure 5-2
i = i - ((i >>> 1) & 0x55555555);
i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
i = (i + (i >>> 4)) & 0x0f0f0f0f;
i = i + (i >>> 8);
i = i + (i >>> 16);
return i & 0x3f;
}
打印
Intrinsic: Each bit count took 0.4 ns, countBits=33285996513
Copy of same code: Each bit count took 2.4 ns, countBits=33285996513
使用固有版本和循环的每个位计数平均只需0.4纳秒。使用相同代码的副本需要6倍的时间(获得相同的结果)