NumPy: Fancy Indexing and Array Indexing


License: CC 4.0 BY-SA

Tags: #python #numpy


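This article introduces the advanced ways of indexing arrays in NumPy, covering indexing with integer arrays and with boolean arrays. Worked examples show how to pick specific elements with index arrays and how to use boolean arrays for conditional selection and assignment. It also looks at how the ix_() function combines different vectors to obtain all possible combinations.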

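NumPy offers more indexing facilities than regular Python sequences. In addition to indexing by integers and slices, as we saw before, arrays can be indexed by arrays of integers and arrays of booleans.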

Indexing with arrays of indices
import numpy as np
a=np.arange(12)**2 # the first 12 square numbers
a
array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121],
      dtype=int32)
i=np.array([1,1,3,8,5]) # an array of indices
i
array([1, 1, 3, 8, 5])
a[i] # the elements of a at the positions i
array([ 1,  1,  9, 64, 25], dtype=int32)
j=np.array([[3,4],
            [9,7]]) # a two-dimensional array of indices
j
array([[3, 4],
       [9, 7]])
a[j] # the result has the same shape as j
array([[ 9, 16],
       [81, 49]], dtype=int32)
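A few properties of this kind of indexing are easy to miss in the session above, so here is a small sketch that spells them out: the result takes the shape of the index array, it is a copy rather than a view, and negative indices count from the end. The names idx_1d, idx_2d, picked_1d, picked_2d and last_two are only illustrative and do not come from the original post.

```python
import numpy as np

a = np.arange(12) ** 2              # squares of 0..11

idx_1d = np.array([1, 1, 3, 8, 5])
idx_2d = np.array([[3, 4], [9, 7]])

# The result takes the shape of the index array, not the shape of `a`.
picked_1d = a[idx_1d]
picked_2d = a[idx_2d]

# Indexing with an integer array returns a copy,
# so modifying the result leaves `a` untouched.
picked_1d[0] = -1

# Negative indices count from the end, as with plain Python sequences.
last_two = a[np.array([-1, -2])]

print(picked_2d.shape)   # same shape as idx_2d
print(a[1])              # still the original value
print(last_two)
```

The copy semantics matter mostly when you expect a slice-like view: a slice such as a[1:4] shares memory with a, while a[idx_1d] does not.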
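When the indexed array a is multidimensional, a single array of indices refers to the first dimension of a. The following example shows this behavior by converting an image of labels into a color image using a palette.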

palette=np.array([[0,0,0],        # black
                  [255,0,0],      # red
                  [0,255,0],      # green
                  [0,0,255],      # blue
                  [255,255,255]]) # white
palette
array([[  0,   0,   0],
       [255,   0,   0],
       [  0, 255,   0],
       [  0,   0, 255],
       [255, 255, 255]])
image=np.array([[0,1,2,0], # each value corresponds to a color in the palette
                [0,3,4,0]])
image
array([[0, 1, 2, 0],
       [0, 3, 4, 0]])
palette[image] # the (2,4,3) color image
array([[[  0,   0,   0],
        [255,   0,   0],
        [  0, 255,   0],
        [  0,   0,   0]],

       [[  0,   0,   0],
        [  0,   0, 255],
        [255, 255, 255],
        [  0,   0,   0]]])
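The shape of the palette lookup follows a simple rule that the following sketch checks: the shape of the index array replaces the first axis of the indexed array, and the remaining axes are appended. The variable names colored, expected_shape and same are assumptions made for this illustration; np.take with axis=0 is used only as a cross-check.

```python
import numpy as np

palette = np.array([[0, 0, 0],          # black
                    [255, 0, 0],        # red
                    [0, 255, 0],        # green
                    [0, 0, 255],        # blue
                    [255, 255, 255]])   # white
image = np.array([[0, 1, 2, 0],
                  [0, 3, 4, 0]])        # labels into the palette

colored = palette[image]

# The index shape (2, 4) replaces the first axis of `palette`;
# the remaining axis of length 3 (the RGB triple) is appended.
expected_shape = image.shape + palette.shape[1:]

# np.take along axis 0 performs the same lookup.
same = np.take(palette, image, axis=0)

print(colored.shape, expected_shape)
print(colored[1, 1])     # label 3 maps to blue
```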

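We can also give indices for more than one dimension. The arrays of indices for each dimension must have the same shape (or shapes that broadcast to a common one, as the a[i,2] example shows).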

a=np.arange(12).reshape((3,4))
a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
i=np.array([[0,1],  # indices for the first dimension of a
            [1,2]])
i
array([[0, 1],
       [1, 2]])
j=np.array([[2,1],  # indices for the second dimension of a
            [3,3]])
j
array([[2, 1],
       [3, 3]])
a[i,j] # i and j must have the same shape
array([[ 2,  5],
       [ 7, 11]])
a[i,2] # the scalar 2 is broadcast against i
array([[ 2,  6],
       [ 6, 10]])
a[:,j]
array([[[ 2,  1],
        [ 3,  3]],

       [[ 6,  5],
        [ 7,  7]],

       [[10,  9],
        [11, 11]]])
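Because index arrays are broadcast against each other, the same two 1D arrays can select either element pairs or a whole sub-block, depending on their shapes. The sketch below illustrates the difference; rows, cols, pairs and block are hypothetical names chosen for this example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

rows = np.array([0, 2])
cols = np.array([1, 3])

# Same shape (2,) and (2,): elements are paired up,
# giving a[0, 1] and a[2, 3].
pairs = a[rows, cols]

# Shapes (2, 1) and (2,) broadcast to (2, 2):
# every selected row is combined with every selected column.
block = a[rows[:, np.newaxis], cols]

print(pairs)
print(block)
```

The pairing form is what a[i,j] does in the session above; the broadcast form is the pattern that np.ix_ builds automatically.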

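Naturally, we can put i and j in a sequence and then index with that sequence. The sequence has to be a tuple: recent NumPy versions interpret a list of arrays as one index array, so the list form found in older tutorials no longer selects these elements.

l=(i,j)
l
(array([[0, 1],
       [1, 2]]), array([[2, 1],
       [3, 3]]))
a[l] # equivalent to a[i,j]
array([[ 2,  5],
       [ 7, 11]])
However, we cannot do this by putting i and j into an array, because that array is interpreted as indexing the first dimension of a.

s=np.array([i,j])
s
array([[[0, 1],
        [1, 2]],

       [[2, 1],
        [3, 3]]])
a[s] # not what we want

IndexError Traceback (most recent call last)
in ()
----> 1 a[s]

IndexError: index 3 is out of bounds for axis 0 with size 3

a[tuple(s)] # converting s to a tuple gives the same result as a[i,j]
array([[ 2,  5],
       [ 7, 11]])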
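To make the difference between a tuple of index arrays and a single stacked index array concrete, here is a small sketch. It assumes a reasonably recent NumPy; the names via_tuple, via_unpack and raised are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i = np.array([[0, 1], [1, 2]])
j = np.array([[2, 1], [3, 3]])
s = np.array([i, j])          # shape (2, 2, 2)

# A tuple supplies one index array per dimension of `a`.
via_tuple = a[tuple(s)]

# Unpacking the first axis by hand is the same thing.
via_unpack = a[s[0], s[1]]

# A single array indexes only the first dimension; here it contains 3,
# which is out of range for an axis of length 3.
try:
    a[s]
    raised = False
except IndexError:
    raised = True

print(via_tuple)
print(raised)
```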
Another common use of indexing with arrays is the search for the maximum value of time-dependent series.

time=np.linspace(20,145,5) # time scale
time
array([ 20.  ,  51.25,  82.5 , 113.75, 145.  ])
data=np.sin(np.arange(20)).reshape((5,4)) # 4 time-dependent series
data
array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ,  0.6569866 ],
       [ 0.98935825,  0.41211849, -0.54402111, -0.99999021],
       [-0.53657292,  0.42016704,  0.99060736,  0.65028784],
       [-0.28790332, -0.96139749, -0.75098725,  0.14987721]])
ind=data.argmax(axis=0) # row index of the maximum of each series (column)
ind
array([2, 0, 3, 1], dtype=int64)
time_max=time[ind] # the times corresponding to the maxima
time_max
array([ 82.5 ,  20.  , 113.75,  51.25])
data_max=data[ind,range(data.shape[1])] # => data[ind[0],0], data[ind[1],1], ...
data_max
array([0.98935825, 0.84147098, 0.99060736, 0.6569866 ])
np.all(data_max==data.max(axis=0))
True
range(data.shape[1])
range(0, 4)
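The data[ind, range(...)] idiom pairs each column with the row index of its maximum. As an alternative sketch (not part of the original post), np.take_along_axis expresses the same lookup directly in terms of the argmax result; the name data_max_alt is an assumption made for this example.

```python
import numpy as np

time = np.linspace(20, 145, 5)                    # time scale
data = np.sin(np.arange(20)).reshape(5, 4)        # 4 time-dependent series

ind = data.argmax(axis=0)                         # row of the maximum, per column

# The pairing idiom: row ind[k] is combined with column k.
data_max = data[ind, range(data.shape[1])]

# take_along_axis needs indices with the same number of dimensions as `data`,
# so ind is given a leading axis of length 1 and the result is flattened again.
data_max_alt = np.take_along_axis(data, ind[np.newaxis, :], axis=0)[0]

time_max = time[ind]                              # when each maximum occurs

print(np.array_equal(data_max, data_max_alt))
print(time_max)
```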
You can also use indexing with arrays as a target to assign to.

a=np.arange(5)
a
array([0, 1, 2, 3, 4])
a[[1,3,4]]=0
a
array([0, 0, 2, 0, 0])
However, when the list of indices contains repetitions, the assignment is done several times, leaving behind the last value.

a=np.arange(5)
a
array([0, 1, 2, 3, 4])
a[[0,0,2]]=[1,2,3]
a
array([2, 1, 3, 3, 4])
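This is reasonable enough, but watch out if you want to use Python's += construct, as it may not do what you expect.

a=np.arange(5)
a
array([0, 1, 2, 3, 4])
a[[0,0,2]]+=1
a
array([1, 1, 3, 3, 4])
Even though 0 occurs twice in the list of indices, the 0th element is incremented only once. This is because Python requires a+=1 to be equivalent to a=a+1: the expression a[[0,0,2]]+1 is evaluated once, and the result is then assigned back, so the repeated index simply receives the same value twice.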
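If the intent really is to add once per occurrence of an index, NumPy provides the unbuffered ufunc method np.add.at. The sketch below contrasts it with +=; the names buffered, accumulated and counts are illustrative, and np.bincount is shown only as a related way to count repeated indices.

```python
import numpy as np

idx = [0, 0, 2]

# += evaluates a[idx] + 1 once and assigns the result back,
# so a repeated index is incremented only once.
buffered = np.arange(5)
buffered[idx] += 1

# np.add.at applies the addition once per occurrence of each index.
accumulated = np.arange(5)
np.add.at(accumulated, idx, 1)

# Counting how often each index occurs gives the per-element increments.
counts = np.bincount(idx, minlength=5)

print(buffered)
print(accumulated)
print(counts)
```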

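Indexing with boolean arrays
When we index arrays with arrays of (integer) indices, we provide the list of indices to pick. With boolean indices the approach is different: we explicitly choose which items in the array we want and which ones we do not. The most natural way to think of boolean indexing is to use a boolean array that has the same shape as the original array.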

a=np.arange(12).reshape((3,4))
a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
b=a>4
b # b is a boolean array with the shape of a
array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])
a[b] # 1D array with the selected elements
array([ 5,  6,  7,  8,  9, 10, 11])
This property can be very useful in assignments.

a[b]=0 # all elements of a greater than 4 become 0
a
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])
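Masks can be combined with the element-wise operators & and | before they are used for selection, and np.where offers a variant of the masked assignment that leaves the original array unchanged. The following is a minimal sketch; mask, selected and zeroed are names invented for this example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Combine conditions element-wise; each comparison needs its own
# parentheses because & binds more tightly than > and ==.
mask = (a > 4) & (a % 2 == 0)
selected = a[mask]

# np.where builds a new array instead of modifying `a` in place:
# take 0 where the condition holds, the original value elsewhere.
zeroed = np.where(a > 4, 0, a)

print(selected)
print(zeroed)
print(a[1, 1])   # the original array is unchanged
```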
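The second way of indexing with booleans is more similar to integer indexing: for each dimension of the array we give a 1D boolean array selecting the slices we want.

a=np.arange(12).reshape((3,4))
a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
b1=np.array([False,True,True]) # first dimension selection
b1
array([False,  True,  True])
b2=np.array([True,False,True,False]) # second dimension selection
b2
array([ True, False,  True, False])
a[b1,:] # selecting rows
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
a[:,b2] # selecting columns
array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])
a[b1,b2] # pairs up the selected rows and columns
array([ 4, 10])
Note that the length of the 1D boolean array must match the length of the dimension (axis) it indexes. In the example above, b1 has length 3 (the number of rows of a), and b2 has length 4 (the number of columns of a).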
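The a[b1,b2] result can be surprising: the two masks are converted to integer positions and paired up, rather than selecting the rectangular block where both are true. The sketch below shows the pairing, one way to get the block with np.ix_, and what happens when a mask has the wrong length. The names paired, block, short_mask and mismatch_raised are assumptions for this illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])           # rows 1 and 2
b2 = np.array([True, False, True, False])    # columns 0 and 2

# The masks become positions [1, 2] and [0, 2], which are paired up:
# a[1, 0] and a[2, 2].
paired = a[b1, b2]

# np.ix_ also accepts boolean sequences and selects the full block.
block = a[np.ix_(b1, b2)]

# A mask whose length does not match the axis is rejected.
short_mask = np.array([True, False])
try:
    a[short_mask, :]
    mismatch_raised = False
except IndexError:
    mismatch_raised = True

print(paired)
print(block)
print(mismatch_raised)
```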

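The ix_() function
The ix_ function can be used to combine different vectors so as to obtain the result for each n-tuple. For example, if you want to compute a+b*c for all the triplets taken from each of the vectors a, b and c: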

a=np.array([2,3,4,5])
a
array([2, 3, 4, 5])
b=np.array([8,5,4])
b
array([8, 5, 4])
c=np.array([5,4,6,8,3])
c
array([5, 4, 6, 8, 3])
ax,bx,cx=np.ix_(a,b,c)
ax
array([[[2]],

       [[3]],

       [[4]],

       [[5]]])
bx
array([[[8],
        [5],
        [4]]])
cx
array([[[5, 4, 6, 8, 3]]])
ax.shape
(4, 1, 1)
bx.shape
(1, 3, 1)
cx.shape
(1, 1, 5)
result=ax+bx*cx
result
array([[[42, 34, 50, 66, 26],
        [27, 22, 32, 42, 17],
        [22, 18, 26, 34, 14]],

       [[43, 35, 51, 67, 27],
        [28, 23, 33, 43, 18],
        [23, 19, 27, 35, 15]],

       [[44, 36, 52, 68, 28],
        [29, 24, 34, 44, 19],
        [24, 20, 28, 36, 16]],

       [[45, 37, 53, 69, 29],
        [30, 25, 35, 45, 20],
        [25, 21, 29, 37, 17]]])
bx*cx
array([[[40, 32, 48, 64, 24],
        [25, 20, 30, 40, 15],
        [20, 16, 24, 32, 12]]])
result[3,2,4]
17
a[3]+b[2]*c[4]
17
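What ix_ does here can be reproduced by hand: each vector is reshaped so that it occupies its own axis, and broadcasting does the rest. The following sketch compares the two; the name manual is hypothetical and the None-based reshaping is simply one way to write it.

```python
import numpy as np

a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

# np.ix_ returns the vectors reshaped to (4,1,1), (1,3,1) and (1,1,5).
ax, bx, cx = np.ix_(a, b, c)
result = ax + bx * cx

# The same reshaping written out with None (np.newaxis).
manual = a[:, None, None] + b[None, :, None] * c[None, None, :]

print(result.shape)                   # one axis per input vector
print(np.array_equal(result, manual))
print(result[3, 2, 4], a[3] + b[2] * c[4])
```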
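You could also implement the reduce as follows:
def ufunc_reduce(ufct,*vectors):
    vs=np.ix_(*vectors) # reshape each vector onto its own axis
    r=ufct.identity     # start from the ufunc's identity element
    for v in vs:
        r=ufct(r,v)     # broadcasting combines all the vectors
    return r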
ufunc_reduce(np.add,a,b,c)
array([[[15, 14, 16, 18, 13],
        [12, 11, 13, 15, 10],
        [11, 10, 12, 14,  9]],

       [[16, 15, 17, 19, 14],
        [13, 12, 14, 16, 11],
        [12, 11, 13, 15, 10]],

       [[17, 16, 18, 20, 15],
        [14, 13, 15, 17, 12],
        [13, 12, 14, 16, 11]],

       [[18, 17, 19, 21, 16],
        [15, 14, 16, 18, 13],
        [14, 13, 15, 17, 12]]])
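The ufunc_reduce helper above starts from ufct.identity, which works for np.add and np.multiply but, as far as I know, not for ufuncs that have no identity element (np.maximum appears to be one). A possible variant, sketched here under that assumption, uses functools.reduce without an initial value; the name ufunc_reduce_alt is hypothetical and not from the original post.

```python
import functools

import numpy as np


def ufunc_reduce_alt(ufct, *vectors):
    """Combine the vectors over all index tuples with a binary ufunc.

    Hypothetical variant of ufunc_reduce: it folds the reshaped vectors
    pairwise and therefore does not rely on ufct.identity.
    """
    vs = np.ix_(*vectors)             # one axis per input vector
    return functools.reduce(ufct, vs)


a = np.array([2, 3, 4, 5])
b = np.array([8, 5, 4])
c = np.array([5, 4, 6, 8, 3])

sums = ufunc_reduce_alt(np.add, a, b, c)        # a[i] + b[j] + c[k]
maxima = ufunc_reduce_alt(np.maximum, a, b, c)  # max(a[i], b[j], c[k])

print(sums.shape)
print(sums[0, 0, 0], maxima[0, 0, 0])
```

For np.add the result should match the output of ufunc_reduce shown above; the only intended difference is the handling of ufuncs without an identity.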
