[Python] Speeding Up Python with Numba

-徐徐图之-

Article link: https://blog.youkuaiyun.com/zylooooooooong/article/details/115580829

Sources:
https://www.jianshu.com/p/69d9d7e37bc5
https://zhuanlan.zhihu.com/p/193035135
This post only lightly reorganizes them.

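Numba is an open-source JIT compiler for Python array and numerical functions. It translates a subset of Python and NumPy code into efficient machine code and can greatly speed up functions written directly in Python.
JIT stands for just-in-time; in Numba it specifically means just-in-time compilation.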

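Compilation approaches include:

Dynamic compilation: compiling at run time. Its opposite is ahead-of-time compilation (AOT), also called static compilation.
JIT compilation (just-in-time compilation): in the narrow sense, compiling a piece of code when it is about to be executed for the first time, hence "just in time". JIT compilation is a special case of dynamic compilation. The term was later generalized and is often used interchangeably with dynamic compilation, so keep the difference between the broad and narrow senses in mind.
Adaptive dynamic compilation: also a form of dynamic compilation, but it usually kicks in later than JIT compilation. The program first gets running "in some way", and dynamic compilation happens only after some runtime information has been collected. Compilation done this way can optimize more aggressively.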

The jit decorator

Basic usage

import numba

@numba.jit
def add(x, y):
    return x + y

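The code above is a simple use of numba.jit: when the function is executed for the first time, Numba infers the argument types and generates optimized code based on that information.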

Specifying a signature

import numba
from numba import int32

@numba.jit(int32(int32, int32))
def add_signatured(x, y):
    return x + y

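The expression inside the parentheses of @numba.jit is the signature. With it, the compiler builds exactly this one specialization up front and allows no other, which gives you control over the types the compiler chooses and can bring a speed advantage.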

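Compilation modes (nopython mode and object mode)

nopython and object are Numba's two compilation modes. Code compiled in nopython mode is faster, but because of certain restrictions compilation may fall back to object mode; passing nopython=True prevents the fallback and raises an exception instead. (In recent Numba releases, 0.59 and later, @jit already uses nopython mode by default.)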

nopython

@numba.jit(nopython=True)
def f(x, y):
    return x + y

@numba.njit is equivalent to @numba.jit(nopython=True).

@numba.njit
def f(x, y):
    return x + y

Sometimes relaxing the strict rules of floating-point arithmetic improves performance; in that case you can use the fastmath keyword argument:

import numpy as np
from numba import njit

@njit(fastmath=False)
def do_sum(A):
    acc = 0.
    # without fastmath, this loop must accumulate in strict order
    for x in A:
        acc += np.sqrt(x)
    return acc

@njit(fastmath=True)  # faster
def do_sum_fast(A):
    acc = 0.
    # with fastmath, the reduction can be vectorized as floating point
    # reassociation is permitted.
    for x in A:
        acc += np.sqrt(x)
    return acc

nogil

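When Numba does not need to hold the global interpreter lock (GIL), you can set nogil=True; Numba then releases the GIL on entering such a compiled function. This lets the code take advantage of multi-core systems, but it is not available for functions compiled in object mode.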

@numba.jit(nogil=True)
def f(x, y):
    return x + y

cache

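To avoid paying the compilation time every time you run a Python program, pass cache=True so that Numba saves the compiled result of the function to a file-based cache.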

@numba.jit(cache=True)
def f(x, y):
    return x + y

parallel

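parallel=True automatically parallelizes the operations in a function and must be used together with nopython=True. The compiler produces a version that runs multiple native threads in parallel (without the GIL).
(Use with caution, since it can end up slower if misapplied; the effect of enabling it tends to be more noticeable when the code consists of simple for loops.)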

@numba.jit(nopython=True, parallel=True)
def f(x, y):
    return x + y

generated_jit

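Sometimes you want a function whose implementation differs depending on the types of its inputs; the generated_jit() decorator lets you control that selection at compile time. Note that generated_jit was deprecated in Numba 0.57 and removed in later releases in favor of numba.extending.overload.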

import numpy as np

from numba import generated_jit, types

@generated_jit(nopython=True)
def is_missing(x):
    """
    Return True if the value is missing, False otherwise.
    """
    if isinstance(x, types.Float):
        return lambda x: np.isnan(x)
    elif isinstance(x, (types.NPDatetime, types.NPTimedelta)):
        # The corresponding Not-a-Time value
        missing = x('NaT')
        return lambda x: x == missing
    else:
        return lambda x: False

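The vectorize decorator

Numba's vectorize lets a Python function that takes scalar arguments be used as a NumPy ufunc: it compiles a pure Python function into a ufunc that runs as fast as a traditional ufunc written in C. The difference between vectorize and the jit decorator is that a NumPy ufunc automatically gains other features such as reduction, accumulation, and broadcasting.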

import numpy as np
from numba import vectorize, float64

@vectorize([float64(float64, float64)])
def f(x, y):
    return x + y

a = np.arange(12).reshape(3, 4)
f.reduce(a, axis=0)
f.reduce(a, axis=1)
f.accumulate(a)
f.accumulate(a, axis=1)

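The jitclass decorator

Numba supports code generation for classes through the jitclass decorator. A class marked with this decorator is optimized, and all of its methods are compiled as nopython functions.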

import numpy as np
from numba.experimental import jitclass  # import the decorator
from numba import int32, float32    # import the types

spec = [
    ('value', int32),               # a simple scalar field
    ('array', float32[:]),          # an array field
]

@jitclass(spec)
class Bag(object):
    def __init__(self, value):
        self.value = value
        self.array = np.zeros(value, dtype=np.float32)

    @property
    def size(self):
        return self.array.size

    def increment(self, val):
        for i in range(self.size):
            self.array[i] += val
        return self.array