Python NumPy Basics 2

春夏秋冬又一年

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/huangxia73/article/details/38063231

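This article covers more advanced array operations in NumPy: conditional logic, mathematical and statistical methods, methods for boolean arrays, sorting, uniqueness and other set logic, and linear algebra.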

1 Conditional logic inside arrays (using where)

Suppose we have two arrays of real values, xarr and yarr, and a boolean array cond:

In [140]: xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
In [141]: yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
In [142]: cond = np.array([True, False, True, True, False])

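We want to take each value from xarr where the corresponding entry in cond is True, and from yarr otherwise. A list comprehension can do this:

In [143]: result = [(x if c else y)
   .....:           for x, y, c in zip(xarr, yarr, cond)]
In [144]: result
Out[144]: [1.1000000000000001, 2.2000000000000002, 1.3, 1.3999999999999999, 2.5]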

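This approach has two problems: (1) it is slow for large arrays, because all the work happens in interpreted Python; (2) it does not work with multidimensional arrays.

NumPy's where solves both:

In [145]: result = np.where(cond, xarr, yarr)
In [146]: result
Out[146]: array([ 1.1,  2.2,  1.3,  1.4,  2.5])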
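To check that the two approaches agree, and that where also handles the multidimensional case, here is a small self-contained sketch. The 2-D arrays x2, y2, and cond2d are made up for illustration and are not part of the original session.

```python
import numpy as np

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

# Pure-Python version: one interpreter-level iteration per element.
loop_result = [(x if c else y) for x, y, c in zip(xarr, yarr, cond)]

# Vectorized version: a single call that works on whole arrays.
where_result = np.where(cond, xarr, yarr)
print(np.allclose(loop_result, where_result))

# Hypothetical 2-D inputs, to show that where works on any shape.
x2 = np.arange(6.0).reshape(2, 3)
y2 = -x2
cond2d = (x2 % 2 == 0)             # True where the entry of x2 is even
picked = np.where(cond2d, x2, y2)  # x2 where even, otherwise y2
print(picked)
```

The zip-based comprehension would need nested loops to handle the 2-D case, while the where call is unchanged.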

The second and third arguments to where do not have to be arrays; either or both can be scalars (single values).

In [147]: arr = np.random.randn(4, 4)
In [148]: arr
Out[148]:
array([[ 0.6372,  2.2043,  1.7904,  0.0752],
       [-1.5926, -1.1536,  0.4413,  0.3483],
       [-0.1798,  0.3299,  0.7827, -0.7585],
       [ 0.5857,  0.1619,  1.3583, -1.3865]])

In [149]: np.where(arr > 0, 2, -2)
Out[149]:
array([[ 2,  2,  2,  2],
       [-2, -2,  2,  2],
       [-2,  2,  2, -2],
       [ 2,  2,  2, -2]])

# set only positive values to 2
In [150]: np.where(arr > 0, 2, arr)
Out[150]:
array([[ 2.    ,  2.    ,  2.    ,  2.    ],
       [-1.5926, -1.1536,  2.    ,  2.    ],
       [-0.1798,  2.    ,  2.    , -0.7585],
       [ 2.    ,  2.    ,  2.    , -1.3865]])
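Because randn produces different numbers on every run, the outputs above cannot be reproduced exactly. A minimal sketch with a fixed, made-up array (small is a hypothetical name) shows the same two patterns. It also shows that the result's dtype follows the values being chosen, and that the input array is left untouched.

```python
import numpy as np

# Hypothetical fixed input so the result is reproducible.
small = np.array([[0.5, -1.5],
                  [-0.25, 2.0]])

# Both choices are integer scalars, so the result is an integer array.
signs = np.where(small > 0, 2, -2)

# One choice is the float array itself, so the result stays floating point.
capped = np.where(small > 0, 2, small)

print(signs)
print(capped)
print(small)  # where returns a new array; small is not modified
```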

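where can also express more complicated logic. (In a boolean context, 1 is treated as True and 0 as False.) Consider this loop over two boolean arrays, cond1 and cond2:

# cond1 and cond2 are boolean arrays of length n
result = []
for i in range(n):
    if cond1[i] and cond2[i]:
        result.append(0)
    elif cond1[i]:
        result.append(1)
    elif cond2[i]:
        result.append(2)
    else:
        result.append(3)

The loop is equivalent to this nested where expression:

np.where(cond1 & cond2, 0,
         np.where(cond1, 1,
                  np.where(cond2, 2, 3)))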
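The equivalence can be checked on a tiny input that covers all four combinations. The sketch below also shows two alternatives that are not in the original text: np.select, which picks the first matching condition, and an arithmetic form that relies on booleans behaving as 0 and 1. The arrays cond1 and cond2 here are made-up test values.

```python
import numpy as np

# Hypothetical inputs covering every combination of the two conditions.
cond1 = np.array([True, True, False, False])
cond2 = np.array([True, False, True, False])

# Reference: the explicit loop.
expected = []
for c1, c2 in zip(cond1, cond2):
    if c1 and c2:
        expected.append(0)
    elif c1:
        expected.append(1)
    elif c2:
        expected.append(2)
    else:
        expected.append(3)

# Nested where, as in the text.
nested = np.where(cond1 & cond2, 0,
                  np.where(cond1, 1,
                           np.where(cond2, 2, 3)))

# np.select evaluates the conditions in order; the first match wins.
selected = np.select([cond1 & cond2, cond1, cond2], [0, 1, 2], default=3)

# Arithmetic on booleans: True counts as 1 and False as 0.
arithmetic = 1 * (cond1 & ~cond2) + 2 * (cond2 & ~cond1) + 3 * ~(cond1 | cond2)

print(expected, nested, selected, arithmetic)
```

For more than two or three branches, np.select tends to be easier to read than deeply nested where calls.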

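2 Mathematical and statistical methods

For example, mean computes the average and sum computes the total. Both are available as array methods and as top-level NumPy functions: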

In [151]: arr = np.random.randn(5, 4) 
In [152]: arr.mean()
Out[152]: 0.062814911084854597
In [153]: np.mean(arr) 
Out[153]: 0.062814911084854597
In [154]: arr.sum() 
Out[154]: 1.2562982216970919

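Both mean and sum take an optional axis argument. It computes the statistic along the given axis and returns an array with one fewer dimension: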

In [155]: arr.mean(axis=1) 
Out[155]: array([-1.2833,  0.2844,  0.6574,  0.6743, -0.0187])
In [156]: arr.sum(0) 
Out[156]: array([-3.1003, -1.6189,  1.4044,  4.5712])
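The axis argument is easier to follow on an array whose contents are known. In this sketch, grid is a made-up 3x4 array: axis=1 collapses the columns and gives one value per row, while axis=0 collapses the rows and gives one value per column.

```python
import numpy as np

# Hypothetical 3x4 array with rows [0..3], [4..7], [8..11].
grid = np.arange(12.0).reshape(3, 4)

row_means = grid.mean(axis=1)  # one mean per row    -> shape (3,)
col_sums = grid.sum(axis=0)    # one sum per column  -> shape (4,)
total = grid.sum()             # no axis: reduce over every element

print(row_means)
print(col_sums)
print(total)
```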

Basic statistical methods:

Method          Description
sum             Sum of all the elements in the array or along an axis. Zero-length arrays have sum 0.
mean            Arithmetic mean. Zero-length arrays have NaN mean.
std, var        Standard deviation and variance, respectively, with optional degrees-of-freedom adjustment (default denominator n).
min, max        Minimum and maximum.
argmin, argmax  Indices of the minimum and maximum elements, respectively.
cumsum          Cumulative sum of elements, starting from 0.
cumprod         Cumulative product of elements, starting from 1.
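A short sketch of several of these methods on made-up data. The ddof parameter of std and var is the degrees-of-freedom adjustment mentioned in the table: the denominator is n - ddof, so ddof=1 gives the sample estimate. The arrays data and steps are hypothetical.

```python
import numpy as np

# Hypothetical data with mean 5 and population variance 4.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_std = data.std()            # denominator n (the default)
sample_var = data.var(ddof=1)   # denominator n - 1

lowest_at = data.argmin()       # index of the smallest element
highest_at = data.argmax()      # index of the largest element

steps = np.array([1, 2, 3, 4])
running_sum = steps.cumsum()    # partial sums
running_prod = steps.cumprod()  # partial products

print(pop_std, sample_var, lowest_at, highest_at)
print(running_sum, running_prod)
```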

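Methods for boolean arrays

In these methods, True is treated as 1 and False as 0, so calling sum on a boolean array counts its True values.

Two further methods are especially useful for boolean arrays: any tests whether at least one value is True, and all tests whether every value is True.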

In [162]: bools = np.array([False, False, True, False])
In [163]: bools.any() 
Out[163]: True
In [164]: bools.all()
Out[164]: False
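A common use of sum on a boolean array is counting how many elements satisfy a condition. The sketch below uses a made-up array, readings. It also shows that any and all work on non-boolean arrays, where nonzero elements count as True.

```python
import numpy as np

# Hypothetical data.
readings = np.array([-1.0, 0.5, 2.0, -3.0, 4.0])

positive = readings > 0            # boolean array
n_positive = positive.sum()        # number of True values
any_positive = positive.any()      # is at least one value positive?
all_positive = positive.all()      # are all values positive?

# Non-boolean arrays: nonzero elements are treated as True.
counts = np.array([1, 0, 3])
some_nonzero = counts.any()
all_nonzero = counts.all()

print(n_positive, any_positive, all_positive, some_nonzero, all_nonzero)
```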

Sorting

A one-dimensional array can be sorted in place with its sort method:

In [165]: arr = np.random.randn(8)
In [166]: arr
Out[166]:
array([ 0.6903,  0.4678,  0.0968, -0.1349,  0.9879,  0.0185, -1.3147,
       -0.5425])
In [167]: arr.sort()
In [168]: arr
Out[168]:
array([-1.3147, -0.5425, -0.1349,  0.0185,  0.0968,  0.4678,  0.6903,
        0.9879])

For a two-dimensional array, pass an axis number to sort; each one-dimensional section along that axis is sorted in place. Here every row is sorted:

In [169]: arr = np.random.randn(5, 3)
In [170]: arr
Out[170]:
array([[-0.7139, -1.6331, -0.4959],
       [ 0.8236, -1.3132, -0.1935],
       [-1.6748,  3.0336, -0.863 ],
       [-0.3161,  0.5362, -2.468 ],
       [ 0.9058,  1.1184, -1.0516]])

In [171]: arr.sort(1)
In [172]: arr
Out[172]:
array([[-1.6331, -0.7139, -0.4959],
       [-1.3132, -0.1935,  0.8236],
       [-1.6748, -0.863 ,  3.0336],
       [-2.468 , -0.3161,  0.5362],
       [-1.0516,  0.9058,  1.1184]])
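The sort method modifies the array it is called on. When the original order must be kept, the top-level function np.sort returns a sorted copy instead. The sketch below contrasts the two on made-up arrays (original and table are hypothetical names) and shows sorting along each axis of a 2-D array.

```python
import numpy as np

original = np.array([3.0, 1.0, 2.0])

sorted_copy = np.sort(original)       # new array; original is untouched
before_inplace = original.tolist()    # still [3.0, 1.0, 2.0]

original.sort()                       # in place; the method returns None
after_inplace = original.tolist()

table = np.array([[9, 1, 8],
                  [3, 7, 2]])
by_row = np.sort(table, axis=1)       # sort within each row
by_col = np.sort(table, axis=0)       # sort within each column

print(sorted_copy, before_inplace, after_inplace)
print(by_row)
print(by_col)
```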

3 Uniqueness (like SQL DISTINCT) and other set logic

np.unique returns the sorted unique values in an array:

In [176]: names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
In [177]: np.unique(names)
Out[177]:
array(['Bob', 'Joe', 'Will'],
      dtype='|S4')

The pure-Python alternative produces the same values, as a list rather than an array:

In [180]: sorted(set(names)) 
Out[180]: ['Bob', 'Joe', 'Will']
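A small sketch comparing the two approaches on the same names array. It also uses return_counts, an optional parameter of np.unique that the original text does not cover, to get the SQL-style "GROUP BY with COUNT" result in one call.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

uniques = np.unique(names)     # sorted NumPy array of distinct values
via_set = sorted(set(names))   # plain Python list with the same values

# Optional extra: how many times each distinct value occurs.
labels, counts = np.unique(names, return_counts=True)

print(uniques)
print(via_set)
print(dict(zip(labels.tolist(), counts.tolist())))
```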

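np.in1d tests, element by element, whether each value in the first array is a member of the second array, and returns a boolean array of the same length as the first: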

In [181]: values = np.array([6, 0, 0, 3, 2, 5, 6])
In [182]: np.in1d(values, [2, 3, 6])
Out[182]: array([ True, False, False,  True,  True, False,  True], dtype=bool)
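The returned mask has one entry per element of values, so it can be used directly for filtering. This sketch uses np.isin, which the NumPy documentation recommends over in1d for new code; for 1-D input like this it gives the same result. The name allowed is made up for illustration.

```python
import numpy as np

values = np.array([6, 0, 0, 3, 2, 5, 6])
allowed = [2, 3, 6]  # hypothetical name for the membership set

mask = np.isin(values, allowed)   # one boolean per element of values
members = values[mask]            # keep only the elements found in allowed
every_value_allowed = mask.all()  # a single yes/no answer, if that is what you need

print(mask)
print(members)
print(every_value_allowed)
```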

Array set operations:

Method             Description
unique(x)          Compute the sorted, unique elements in x
intersect1d(x, y)  Compute the sorted, common elements in x and y
union1d(x, y)      Compute the sorted union of elements
in1d(x, y)         Compute a boolean array indicating whether each element of x is contained in y
setdiff1d(x, y)    Set difference: elements in x that are not in y
setxor1d(x, y)     Set symmetric difference: elements that are in either of the arrays, but not both
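The remaining functions in the table can be tried on two small made-up arrays, x and y. Each call returns a sorted array with duplicates removed.

```python
import numpy as np

# Hypothetical inputs; x contains a duplicate on purpose.
x = np.array([1, 2, 3, 4, 4])
y = np.array([3, 4, 5, 6])

common = np.intersect1d(x, y)    # in both x and y
combined = np.union1d(x, y)      # in x or y
only_in_x = np.setdiff1d(x, y)   # in x but not in y
exclusive = np.setxor1d(x, y)    # in exactly one of the two

print(common, combined, only_in_x, exclusive)
```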

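4 Linear algebra

Examples include matrix multiplication, matrix decompositions, and determinants.

NumPy's dot, available both as an array method and as a top-level function, computes the product of two matrices:

In [194]: x = np.array([[1., 2., 3.], [4., 5., 6.]])
In [195]: y = np.array([[6., 23.], [-1, 7], [8, 9]])
In [198]: x.dot(y)  # equivalently np.dot(x, y)
Out[198]:
array([[  28.,   64.],
       [  67.,  181.]])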
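For a matrix product the inner dimensions must match: here a 2x3 array times a 3x2 array gives a 2x2 result. The sketch below repeats the example and adds two things not shown in the original: the @ operator, which computes the same product for 2-D arrays, and a matrix-vector product with a 1-D array.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
y = np.array([[6., 23.], [-1, 7], [8, 9]])   # shape (3, 2)

product = x.dot(y)             # shape (2, 2)
same_product = x @ y           # operator form of the same product
reversed_product = y.dot(x)    # shape (3, 3): the order of the operands matters

# A 2-D array times a suitably sized 1-D array gives a 1-D array.
row_sums = x.dot(np.ones(3))

print(product)
print(reversed_product.shape)
print(row_sums)
```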

numpy.linalg provides a standard set of matrix decompositions, along with operations such as inverse and determinant. Under the hood it relies on the same industry-standard Fortran libraries, such as BLAS and LAPACK, that languages like MATLAB and R use.

Example:

In [201]: from numpy.linalg import inv, qr
In [202]: X = np.random.randn(5, 5)
In [203]: mat = X.T.dot(X)
In [204]: inv(mat)
Out[204]:
array([[ 3.0361, -0.1808, -0.6878, -2.8285, -1.1911],
       [-0.1808,  0.5035,  0.1215,  0.6702,  0.0956],
       [-0.6878,  0.1215,  0.2904,  0.8081,  0.3049],
       [-2.8285,  0.6702,  0.8081,  3.4152,  1.1557],
       [-1.1911,  0.0956,  0.3049,  1.1557,  0.6051]])

In [206]: q, r = qr(mat)
In [207]: r
Out[207]:
array([[ -6.9271,   7.389 ,   6.1227,  -7.1163,  -4.9215],
       [  0.    ,  -3.9735,  -0.8671,   2.9747,  -5.7402],
       [  0.    ,   0.    , -10.2681,   1.8909,   1.6079],
       [  0.    ,   0.    ,   0.    ,  -1.2996,   3.3577],
       [  0.    ,   0.    ,   0.    ,   0.    ,   0.5571]])
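Since the random matrix above changes on every run, it helps to verify the properties of the results rather than the numbers themselves. This sketch swaps in a small fixed matrix (X here is a made-up 3x3 example, not the 5x5 random one) and checks that inv really produces an inverse, that q and r multiply back to mat, that r is upper triangular, and that q is orthogonal. It also computes a determinant with det, which the text mentions but does not demonstrate.

```python
import numpy as np
from numpy.linalg import det, inv, qr

# Hypothetical fixed, well-conditioned matrix so the checks are reproducible.
X = np.array([[2., 0., 1.],
              [1., 3., 0.],
              [0., 1., 4.]])
mat = X.T.dot(X)  # symmetric, as in the example above

mat_inv = inv(mat)
inverse_ok = np.allclose(mat.dot(mat_inv), np.eye(3))  # mat * inv(mat) = I

q, r = qr(mat)
qr_ok = np.allclose(q.dot(r), mat)                     # q * r rebuilds mat
r_is_upper = np.allclose(r, np.triu(r))                # zeros below the diagonal
q_is_orthogonal = np.allclose(q.T.dot(q), np.eye(3))   # q^T * q = I

det_X = det(X)       # determinant of X
det_mat = det(mat)   # equals det(X) squared, since mat = X^T X

print(inverse_ok, qr_ok, r_is_upper, q_is_orthogonal)
print(det_X, det_mat)
```

Comparisons use np.allclose rather than == because floating-point results are only accurate up to rounding error.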