How to compile TensorFlow with SSE4.2 and AVX instructions?

Latest recommended article published on 2021-04-24 13:24:13


License: CC 4.0 BY-SA

Tags: #SSE4.2 #Tensorflow


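This article explains the warnings TensorFlow prints when the SSE4.2 and AVX instruction sets are not being used, and shows how to take full advantage of these instruction sets by installing from source.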

This is the message I received when running a script to check whether TensorFlow is working:

Warning: The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

Warning: The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.

What are SSE4.2 and AVX?

You can think of them as extra CPU instructions that apply a single operation to multiple data points at once (SIMD: single instruction, multiple data). They help with operations that parallelize naturally, for example adding two arrays element by element. In short, they are instructions that speed up your computation.
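To make the idea concrete, here is a small conceptual sketch in pure Python. It does not use real SIMD instructions; it only models how an addition over 16 values could be issued as 2 vector operations when each operation covers 8 values (a 256-bit AVX register holds eight 32-bit floats). The function name `simd_style_add` and the `lanes` parameter are made up for this illustration.

```python
def simd_style_add(a, b, lanes=8):
    """Add two equal-length lists in chunks of `lanes` elements.

    Each chunk stands in for one vector instruction. Real SIMD hardware
    would process the whole chunk in a single instruction; this model only
    counts how many such instructions would be issued.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    result = []
    instructions = 0
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        # One "vector instruction": all lanes are added together.
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
        instructions += 1
    return result, instructions


a = [float(i) for i in range(16)]
b = [1.0] * 16
total, vector_ops = simd_style_add(a, b, lanes=8)
print(vector_ops, "vector additions instead of", len(a), "scalar additions")
```

With `lanes=4` the same call would model a 128-bit SSE register, which holds four 32-bit floats.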

Why do you get the warning?

Most probably you did not install TF from source and instead used something like pip install tensorflow. That means you installed pre-built binaries, compiled by someone else, that were not optimized for your architecture. The warnings tell you exactly this: an instruction set is available on your CPU, but it will not be used because the binary was not compiled with it. The TensorFlow documentation describes this as well.

The good news is that, most probably, you just want to learn or experiment with TF, so everything will work properly and you do not need to worry about it.
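If you want to confirm what the CPU itself supports, one option on Linux is to look at the `flags` line in `/proc/cpuinfo`, where these instruction sets appear as `sse4_2` and `avx`. The sketch below assumes a Linux x86 machine; the helper names `cpu_flags_from_cpuinfo` and `report` are hypothetical and not part of TensorFlow.

```python
def cpu_flags_from_cpuinfo(text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def report(text, wanted=("sse4_2", "avx")):
    """Return a dict saying whether each wanted flag is present."""
    flags = cpu_flags_from_cpuinfo(text)
    return {name: name in flags for name in wanted}


try:
    # Assumption: Linux exposes CPU details at this path.
    with open("/proc/cpuinfo") as f:
        print(report(f.read()))
except OSError:
    print("no /proc/cpuinfo on this system")
```

This only tells you what the hardware offers. Whether TensorFlow uses it still depends on how the binary was compiled.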

What if you want SSE4.2 and AVX compiled in?

If you want to make full use of your CPU architecture, you can enable SSE4.2 and AVX by installing TF from source. Just follow the installation guide on the official TensorFlow website. You do not need to uninstall the existing pip-installed TensorFlow.

In the configure step, you will be asked:

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

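Keep the default and the build will be optimized for your CPU architecture, because -march=native tells the compiler to use the instruction sets supported by the machine doing the build.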
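As a rough sketch of what the build step after configuration might look like, the snippet below only assembles and prints a bazel command line; it does not run anything. The target label `//tensorflow/tools/pip_package:build_pip_package` is an assumption taken from older TensorFlow build instructions and may differ in your version, and `bazel_build_command` is a hypothetical helper. Passing `-msse4.2` and `-mavx` through `--copt` is one way to request these instruction sets explicitly instead of relying on `-march=native`.

```python
def bazel_build_command(
    extra_copts=(),
    # Assumption: pip package target used by older TensorFlow releases.
    target="//tensorflow/tools/pip_package:build_pip_package",
):
    """Build the argument list for an optimized TensorFlow bazel build."""
    cmd = ["bazel", "build", "--config=opt"]
    # Each compiler flag is forwarded to the C/C++ compiler via --copt.
    cmd += ["--copt=" + flag for flag in extra_copts]
    cmd.append(target)
    return cmd


# Default: rely on the flags chosen in the configure step.
print(" ".join(bazel_build_command()))

# Explicit: name the instruction sets yourself.
print(" ".join(bazel_build_command(["-msse4.2", "-mavx"])))
```

Check the official build guide for the exact target and any extra steps for your TensorFlow version before running the printed command.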