A Beginner's Guide to NumPy

Reposted from http://blog.youkuaiyun.com/liyuanbhu/article/details/28611429

Reposted from http://blog.youkuaiyun.com/liyuanbhu/article/details/28870439

Translated, with abridgments, from the official Tentative NumPy Tutorial.

NumPy Beginner's Tutorial

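NumPy provides support for multidimensional arrays. Unlike Python's built-in list type, all elements of an array must have the same type. The dimensions of an array are called axes, and the number of axes is called the rank.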

NumPy's array type is ndarray. Its most important attributes are:

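ndarray.ndim: the number of axes (dimensions) of the array, also called the rank.
ndarray.shape: the size of the array along each axis, as a tuple. For a matrix with n rows and m columns, shape is (n, m).
ndarray.size: the total number of elements.
ndarray.dtype: the type of each element, for example numpy.int32, numpy.int16, or numpy.float64.
ndarray.itemsize: the number of bytes occupied by each element.
ndarray.data: the buffer holding the array's actual elements.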

A simple example:

 >>> from numpy  import *  
 >>> a = arange(15).reshape(3, 5)  
 >>> a  
 array([[ 0,  1,  2,  3,  4],  
        [ 5,  6,  7,  8,  9],  
        [10, 11, 12, 13, 14]])  
 >>> a.shape  
 (3, 5)  
 >>> a.ndim  
 2  
 >>> a.dtype.name  
 'int32'  
 >>> a.itemsize  
 4  
 >>> a.size  
 15  
 >>> type(a)  
 numpy.ndarray  
 >>> b = array([6, 7, 8])  
 >>> b  
 array([6, 7, 8])  
 >>> type(b)  
 numpy.ndarray  
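
As a quick check of how these attributes relate to one another, here is a minimal sketch. It uses the conventional `import numpy as np` instead of the star import above, and the variable names are illustrative only. The dtype is set explicitly because the default integer width depends on the platform.

```python
import numpy as np

# Explicit dtype so the example does not depend on the platform's default integer.
a = np.arange(15, dtype=np.int32).reshape(3, 5)

ndim = a.ndim                       # number of axes
shape = a.shape                     # length of each axis
size = a.size                       # total number of elements
total_bytes = a.size * a.itemsize   # elements times bytes per element

print(ndim, shape, size)            # 2 (3, 5) 15
print(total_bytes == a.nbytes)      # True
```

The `size` is always the product of the entries of `shape`, and `size * itemsize` gives the number of bytes in the data buffer.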

Creating Arrays

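There are many ways to create an array. For example, a Python list or tuple can be converted to an array; the element type of the resulting array is deduced from the types of the elements in the original sequence.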

 >>> from numpy  import *  
 >>> a = array( [2,3,4] )  
 >>> a  
 array([2, 3, 4])  
 >>> a.dtype  
 dtype('int32')  
 >>> b = array([1.2, 3.5, 5.1])  
 >>> b.dtype  
 dtype('float64')  
 >>> b = array( [ (1.5,2,3), (4,5,6) ] )  
 >>> b  
 array([[ 1.5,  2. ,  3. ],  
        [ 4. ,  5. ,  6. ]])  
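
The type deduction described above can be sketched as follows (the names are illustrative; `import numpy as np` is used in place of the star import). The sketch also shows a common mistake: `array` expects a single sequence, so passing the numbers as separate arguments is rejected. The exception type is assumed to be `TypeError` or `ValueError` depending on the NumPy version.

```python
import numpy as np

ints = np.array([2, 3, 4])                    # all ints -> an integer dtype (width is platform dependent)
floats = np.array([1.2, 3.5, 5.1])            # all floats -> float64
mixed = np.array([(1.5, 2, 3), (4, 5, 6)])    # one float is enough to make every element float64

print(ints.dtype.kind, floats.dtype, mixed.dtype)

# array() expects ONE sequence argument; separate numbers are rejected.
try:
    np.array(1, 2, 3, 4)
    rejected = False
except (TypeError, ValueError):
    rejected = True
print(rejected)
```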

The element data type can also be specified when the array is created:

 >>> c = array( [ [1,2], [3,4] ], dtype=complex )  
 >>> c  
 array([[ 1.+0.j,  2.+0.j],  
        [ 3.+0.j,  4.+0.j]])  

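Often the values of the elements are not known in advance, but the size of the array is. In that case the array can be created with one of the following functions.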

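The zeros function creates an array filled with 0s, and the ones function creates an array filled with 1s. The empty function creates an array whose elements are not initialized, so their values are whatever happened to be in that memory. By default, the element type of the created array is float64.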

 >>> zeros( (3,4) )  
 array([[0.,  0.,  0.,  0.],  
        [0.,  0.,  0.,  0.],  
        [0.,  0.,  0.,  0.]])  
 >>> ones( (2,3,4), dtype=int16 )                # dtype can also be specified  
 array([[[ 1, 1, 1, 1],  
         [ 1, 1, 1, 1],  
         [ 1, 1, 1, 1]],  
        [[ 1, 1, 1, 1],  
         [ 1, 1, 1, 1],  
         [ 1, 1, 1, 1]]], dtype=int16)  
 >>> empty( (2,3) )  
 array([[  3.73603959e-262,   6.02658058e-154,   6.55490914e-260],  
        [  5.30498948e-313,   3.14673309e-307,   1.00000000e+000]])  
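
A small sketch of the same three functions, written with `import numpy as np` and illustrative names. Since the contents of an `empty` array are arbitrary, the sketch assigns to it before reading from it.

```python
import numpy as np

z = np.zeros((3, 4))                      # float64 by default
o = np.ones((2, 3, 4), dtype=np.int16)    # dtype given explicitly
e = np.empty((2, 3))                      # contents are arbitrary until assigned

e[:] = 7.0                                # fill every element before using the array
print(z.dtype, o.dtype, e.dtype)          # float64 int16 float64
print(e.tolist())                         # [[7.0, 7.0, 7.0], [7.0, 7.0, 7.0]]
```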

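The arange function creates an array whose elements form an arithmetic sequence (evenly spaced values), much like Python's range function.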

 >>> arange( 10, 30, 5 )  
 array([10, 15, 20, 25])  
 >>> arange( 0, 2, 0.3 )                 # it accepts float arguments  
 array([ 0. ,  0.3,  0.6,  0.9,  1.2,  1.5,  1.8])  

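The linspace function is similar to the MATLAB function of the same name: it takes the number of elements wanted rather than the step. An example: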

 >>> linspace( 0, 2, 9 )                 # 9 numbers from 0 to 2  
 array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ,  1.25,  1.5 ,  1.75,  2.  ])  
 >>> x = linspace( 0, 2*pi, 100 )        # useful to evaluate function at lots of points  
 >>> f = sin(x)  
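
To make the difference between the two functions concrete, here is a short sketch (illustrative names, `import numpy as np`). `arange` is given a step and excludes the stop value, while `linspace` is given a count and includes both endpoints.

```python
import numpy as np

by_step = np.arange(10, 30, 5)        # step of 5; the stop value 30 is excluded
by_count = np.linspace(0, 2, 9)       # exactly 9 values; both endpoints included

print(by_step.tolist())                              # [10, 15, 20, 25]
print(len(by_count), by_count[0], by_count[-1])      # 9 0.0 2.0

spacing = np.diff(by_count)           # differences between neighbours
print(np.allclose(spacing, 0.25))     # True: a constant difference, i.e. an arithmetic sequence
```

With a floating-point step, the number of elements `arange` returns can be hard to predict because of rounding, so `linspace` is usually the safer choice when a specific count is needed.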

Printing Arrays

When an array is printed, the output looks like a nested list:

 >>> a = arange(6)                         # 1d array  
 >>> print(a)  
 [0 1 2 3 4 5]  
 >>>  
 >>> b = arange(12).reshape(4,3)           # 2d array  
 >>> print(b)  
 [[ 0  1  2]  
  [ 3  4  5]  
  [ 6  7  8]  
  [ 9 10 11]]  
 >>>  
 >>> c = arange(24).reshape(2,3,4)         # 3d array  
 >>> print(c)  
 [[[ 0  1  2  3]  
   [ 4  5  6  7]  
   [ 8  9 10 11]]  
   
  [[12 13 14 15]  
   [16 17 18 19]  
   [20 21 22 23]]]  

If an array is too large, the output is abbreviated with ellipses:

 >>> print(arange(10000))  
 [   0    1    2 ..., 9997 9998 9999]  
 >>>  
 >>> print(arange(10000).reshape(100,100))  
 [[   0    1    2 ...,   97   98   99]  
  [ 100  101  102 ...,  197  198  199]  
  [ 200  201  202 ...,  297  298  299]  
  ...,  
  [9700 9701 9702 ..., 9797 9798 9799]  
  [9800 9801 9802 ..., 9897 9898 9899]  
  [9900 9901 9902 ..., 9997 9998 9999]]  

To display the complete array, change the print options as follows:

 >>> import sys  
 >>> set_printoptions(threshold=sys.maxsize)     # threshold must be a number; 'nan' is rejected by current NumPy  
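
If the full output is only needed temporarily, one option is the `printoptions` context manager, which restores the previous settings on exit. The sketch below assumes a NumPy version that provides `np.printoptions` (to my knowledge 1.15 or later) and uses `np.array2string` to capture the text that would be printed.

```python
import sys
import numpy as np

big = np.arange(2000)

short_text = np.array2string(big)                 # default threshold -> abbreviated with an ellipsis
with np.printoptions(threshold=sys.maxsize):      # temporary setting, restored on exit
    full_text = np.array2string(big)

print("..." in short_text)   # True
print("..." in full_text)    # False
```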

Basic Operations

The basic arithmetic operators can all be applied to arrays. They operate elementwise, and the result is returned as a new array.

 >>> a = array( [20,30,40,50] )  
 >>> b = arange( 4 )  
 >>> b  
 array([0, 1, 2, 3])  
 >>> c = a-b  
 >>> c  
 array([20, 29, 38, 47])  
 >>> b**2  
 array([0, 1, 4, 9])  
 >>> 10*sin(a)  
 array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])  
 >>> a<35  
 array([True, True, False, False], dtype=bool)  
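
The point that each operator returns a new array, leaving its operands untouched, can be checked with a short sketch (illustrative names, `import numpy as np`):

```python
import numpy as np

a = np.array([20, 30, 40, 50])
b = np.arange(4)

c = a - b                    # new array; a and b are unchanged
mask = a < 35                # elementwise comparison -> boolean array

print(c.tolist())            # [20, 29, 38, 47]
print(mask.tolist())         # [True, True, False, False]
print(c is a, a.tolist())    # False [20, 30, 40, 50]
```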

The multiplication operator * is also elementwise. For matrix multiplication, use the dot function or create a matrix object.

 >>> A = array( [[1,1],  
 ...             [0,1]] )  
 >>> B = array( [[2,0],  
 ...             [3,4]] )  
 >>> A*B                         # elementwise product  
 array([[2, 0],  
        [0, 4]])  
 >>> dot(A,B)                    # matrix product  
 array([[5, 4],  
        [3, 4]])  
 >>> a = ones((2,3), dtype=int)  
 >>> b = random.random((2,3))  
 >>> a *= 3  
 >>> a  
 array([[3, 3, 3],  
        [3, 3, 3]])  
 >>> b += a  
 >>> b  
 array([[ 3.69092703,  3.8324276 ,  3.0114541 ],  
        [ 3.18679111,  3.3039349 ,  3.37600289]])  
 >>> a += b.astype(int)                      # cast b explicitly: current NumPy rejects a bare a += b (float into int)  
 >>> a  
 array([[6, 6, 6],  
        [6, 6, 6]])  
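
On Python 3.5 and later, the `@` operator is another way to write the matrix product. The sketch below (illustrative names, `import numpy as np`) compares it with the elementwise product and also shows what happens when floats are added in place to an integer array. It assumes a recent NumPy, where this raises a `TypeError` subclass rather than truncating silently.

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

elementwise = A * B          # [[2, 0], [0, 4]]
matrix = A @ B               # same result as np.dot(A, B): [[5, 4], [3, 4]]

# In-place operators keep the left operand's dtype, so adding floats into an
# integer array is rejected instead of being silently truncated.
ints = np.ones((2, 3), dtype=int)
floats = np.full((2, 3), 0.5)
try:
    ints += floats
    in_place_ok = True
except TypeError:
    in_place_ok = False

print(matrix.tolist())       # [[5, 4], [3, 4]]
print(in_place_ok)           # False on recent NumPy versions
```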

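When arrays with different element types are combined, the element type of the result is the more general or more precise of the two. This is known as upcasting (type promotion).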

 >>> a = ones(3, dtype=int32)  
 >>> b = linspace(0,pi,3)  
 >>> b.dtype.name  
 'float64'  
 >>> c = a+b  
 >>> c  
 array([ 1.        ,  2.57079633,  4.14159265])  
 >>> c.dtype.name  
 'float64'  
 >>> d = exp(c*1j)  
 >>> d  
 array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,  
        -0.54030231-0.84147098j])  
 >>> d.dtype.name  
 'complex128'  
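
The promoted type can also be queried without performing the operation, using `np.result_type`. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)              # float64

promoted = np.result_type(a, b)           # type an operation on a and b would produce
summed = a + b
rotated = np.exp(summed * 1j)             # multiplying by a complex scalar promotes again

print(promoted, summed.dtype, rotated.dtype)   # float64 float64 complex128
```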

The array type provides many built-in methods for common computations, for example:

 >>> a = random.random((2,3))  
 >>> a  
 array([[ 0.6903007 ,  0.39168346,  0.16524769],  
        [ 0.48819875,  0.77188505,  0.94792155]])  
 >>> a.sum()  
 3.4552372100521485  
 >>> a.min()  
 0.16524768654743593  
 >>> a.max()  
 0.9479215542670073  

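By default, these methods operate on the whole array, regardless of its shape. By specifying the axis parameter, they can be applied along a single axis: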

 >>> b = arange(12).reshape(3,4)  
 >>> b  
 array([[ 0,  1,  2,  3],  
        [ 4,  5,  6,  7],  
        [ 8,  9, 10, 11]])  
 >>>  
 >>> b.sum(axis=0)                            # sum of each column  
 array([12, 15, 18, 21])  
 >>>  
 >>> b.min(axis=1)                            # min of each row  
 array([0, 4, 8])  
 >>>  
 >>> b.cumsum(axis=1)                         # cumulative sum along each row  
 array([[ 0,  1,  3,  6],  
        [ 4,  9, 15, 22],  
        [ 8, 17, 27, 38]])  
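
One way to read the axis argument is that the named axis is the one that gets collapsed. The sketch below (illustrative names, `import numpy as np`) checks the resulting shapes:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

col_sums = b.sum(axis=0)     # collapses the rows    -> shape (4,)
row_mins = b.min(axis=1)     # collapses the columns -> shape (3,)
total = b.sum()              # no axis: reduces over every element

print(col_sums.shape, row_mins.shape, total)   # (4,) (3,) 66
```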

Common Functions

NumPy provides many common mathematical functions, such as sin, cos, and exp. These functions also operate on each element of an array and return a new array.

 >>> B = arange(3)  
 >>> B  
 array([0, 1, 2])  
 >>> exp(B)  
 array([ 1.        ,  2.71828183,  7.3890561 ])  
 >>> sqrt(B)  
 array([ 0.        ,  1.        ,  1.41421356])  
 >>> C = array([2., -1., 4.])  
 >>> add(B, C)  
 array([ 2.,  0.,  6.])  

Other commonly used functions include:

all, alltrue, any, apply_along_axis, argmax, argmin, argsort, average, bincount, ceil, clip, conj, conjugate, corrcoef, cov, cross, cumprod, cumsum, diff, dot, floor, inner, inv, lexsort, max, maximum, mean, median, min, minimum, nonzero, outer, prod, re, round, sometrue, sort, std, sum, trace, transpose, var, vdot, vectorize, where
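
A few of the functions from this list, applied to a small sample vector. This is only an illustrative sketch; the data and variable names are made up for the example.

```python
import numpy as np

v = np.array([3, -1, 4, 1, -5])

largest_at = int(np.argmax(v))        # index of the largest value
order = np.argsort(v)                 # indices that would sort v
clipped = np.clip(v, 0, 3)            # limit every value to the range [0, 3]
positives = np.where(v > 0, v, 0)     # keep positive values, replace the rest with 0
running = np.cumsum(v)                # running total

print(largest_at)                     # 2
print(order.tolist())                 # [4, 1, 3, 0, 2]
print(clipped.tolist())               # [3, 0, 3, 1, 0]
print(positives.tolist())             # [3, 0, 4, 1, 0]
print(running.tolist())               # [3, 2, 6, 7, 2]
```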

Indexing, Slicing, and Iterating

Like lists, arrays can be indexed by position to get a single element, sliced, and iterated over.

 >>> a = arange(10)**3  
 >>> a  
 array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])  
 >>> a[2]  
 8  
 >>> a[2:5]  
 array([ 8, 27, 64])  
 >>> a[:6:2] = -1000    # equivalent to a[0:6:2] = -1000; from start to position 6, exclusive, set every 2nd element to -1000  
 >>> a  
 array([-1000,     1, -1000,    27, -1000,   125,   216,   343,   512,   729])  
 >>> a[ : :-1]                                 # reversed a  
 array([  729,   512,   343,   216,   125, -1000,    27, -1000,     1, -1000])  
 >>> for i in a:  
 ...         print('%.1f' % i**(1/3.), end=' ')   # one decimal place; negative bases give nan  
 ...  
 nan 1.0 nan 3.0 nan 5.0 6.0 7.0 8.0 9.0  
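
The nan entries in the output above appear because a negative number raised to a fractional power has no real-valued result. If a real cube root is what is wanted, `np.cbrt` is one alternative, sketched here with illustrative names. The `np.errstate` block only silences the warning NumPy would otherwise emit for the invalid values.

```python
import numpy as np

a = np.arange(10) ** 3
a[:6:2] = -1000                       # same array as in the example above

with np.errstate(invalid="ignore"):   # suppress the "invalid value" warning
    powered = a ** (1 / 3.)           # nan wherever a is negative
roots = np.cbrt(a)                    # real cube root, defined for negative inputs too

print(int(np.isnan(powered).sum()))   # 3
print(np.allclose(roots[:3], [-10, 1, -10]))   # True
```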

Multidimensional arrays can be indexed with a tuple, giving one index per axis.

 >>> def f(x,y):  
 ...         return 10*x+y  
 ...  
 >>> b = fromfunction(f,(5,4),dtype=int)  
 >>> b  
 array([[ 0,  1,  2,  3],  
        [10, 11, 12, 13],  
        [20, 21, 22, 23],  
        [30, 31, 32, 33],  
        [40, 41, 42, 43]])  
 >>> b[2,3]  
 23  
 >>> b[0:5, 1]                       # each row in the second column of b  
 array([ 1, 11, 21, 31, 41])  
 >>> b[ : ,1]                        # equivalent to the previous example  
 array([ 1, 11, 21, 31, 41])  
 >>> b[1:3, : ]                      # each column in the second and third row of b  
 array([[10, 11, 12, 13],  
        [20, 21, 22, 23]])  
 >>> b[-1]                                  # the last row. Equivalent to b[-1,:]  
 array([40, 41, 42, 43])  
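
As the last example suggests, indices that are left out are treated as complete slices. The sketch below (illustrative names, `import numpy as np`) shows the equivalent `...` notation and how iteration behaves on a multidimensional array: a plain loop runs over the first axis, while the `flat` attribute visits every element.

```python
import numpy as np

def f(x, y):
    return 10 * x + y

b = np.fromfunction(f, (5, 4), dtype=int)

last_row = b[-1]                  # missing indices are treated as complete slices
same_row = b[-1, ...]             # ... stands for as many ':' as are needed
rows = [row.tolist() for row in b]          # iterating yields one row at a time
flat_head = [int(x) for x in b.flat][:5]    # b.flat iterates over every element

print(last_row.tolist())          # [40, 41, 42, 43]
print(rows[:2])                   # [[0, 1, 2, 3], [10, 11, 12, 13]]
print(flat_head)                  # [0, 1, 2, 3, 10]
```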