A Detailed Guide to NumPy Array Indexing and Slicing

Original article, last modified 2023-04-10 23:55:39

License: CC 4.0 BY-SA

Tags: #numpy #python #data-analysis

First published 2023-04-10 23:51:47

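This article covers indexing and slicing of NumPy ndarrays in detail, including basic indexing, advanced indexing (integer array indexing and Boolean array indexing), and slicing of multidimensional arrays. It also discusses how to use Ellipsis and newaxis to expand dimensions, and shows how to access the fields of a structured array.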

NumPy: Indexing and Slicing on ndarrays

1. Types of indexing

ndarrays can be indexed using the standard Python x[obj] syntax. Depending on the type of obj, indexing falls into three kinds: basic indexing, advanced indexing, and field access.

Note that in Python, x[(exp1, exp2, …, expN)] is equivalent to x[exp1, exp2, …, expN]; the latter is just syntactic sugar for the former.
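The equivalence can be checked directly. The following is a minimal sketch; the array `a` and the tuple `idx` are names chosen here for illustration and do not come from the original text.

```python
import numpy as np

# Illustrative 2-D array (hypothetical, not from the article)
a = np.arange(12).reshape(4, 3)

# Python packs a comma-separated index into a tuple before NumPy sees it
idx = (1, 2)
same_element = a[1, 2] == a[idx]

# A list is not a tuple: a[[1, 2]] is advanced indexing and selects rows 1 and 2
rows = a[[1, 2]]
```

Building the index as a tuple is mostly useful when the number of dimensions is only known at run time.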

2. Basic indexing

2.1 Single element indexing

Indices are 0-based.
Negative indices are accepted and count from the end of the array.

For 1-D array

import numpy as np

x=np.arange(12) #array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

x[0] #0
x[2] #2
x[-1] #11
x[-2] #10

For 2-D array

x.shape=(4,3)
'''
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
'''

There is no need to put each dimension's index in its own pair of square brackets.

Within a single pair of square brackets, use a comma to separate the indices for the different axes.

x[1,2] #5

x[2,-2] #7
x[2][-2] #7

#If one indexes a multidimensional array with fewer indices than dimensions, one gets a subdimensional array
x[0] #array([0, 1, 2])

Note that x[2,-2] == x[2][-2], but the second form is less efficient: the first index creates a new temporary array, which is then indexed by -2.

2.2 Slicing and striding

NumPy's basic slicing extends Python's basic concept of slicing to N dimensions.

Note: all arrays generated by basic slicing are always views of the original array.
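Because a basic slice is a view, writing to it changes the original array. The sketch below illustrates this under the assumption of a small 1-D array; the names `base`, `view` and `independent` are made up for the example.

```python
import numpy as np

base = np.arange(10)
view = base[2:6]            # basic slice: no data is copied

view[0] = 100               # writing through the view...
changed = base[2]           # ...changes the original array as well

shares = np.shares_memory(base, view)

# An explicit copy breaks the link to the original
independent = base[2:6].copy()
independent[1] = -1
untouched = base[3]
```

Calling `.copy()` on the slice is the usual way to get an independent array when the original must stay unchanged.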

For 1-D array

The basic slice syntax is i:j:k, where i is the starting index, j is the stopping index, and k is the step (k != 0). This selects the elements in the corresponding dimension with index values i, i+k, i+2k, … up to, but not including, j.
```
x=np.arange(10) # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

x[1:7:2] #array([1, 3, 5])
```
Negative i and j are interpreted as n+i and n+j, where n is the number of elements in the corresponding dimension. A negative k steps toward smaller indices.
```
x[-2:10] #array([8, 9])

x[-3:3:-1] #array([7, 6, 5, 4])
```

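Indices can be omitted: for a positive step, a missing i defaults to the start of the axis and a missing j to its end, and a missing k defaults to 1.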

x[5:] #array([5, 6, 7, 8, 9])

x[:] #select all indices along the axis

For 2-D array

arr2d = np.arange(12).reshape((4,3))
'''
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
'''

Slicing only along axis 0:

arr2d[:2]
'''
array([[0, 1, 2],
       [3, 4, 5]])
'''

Slicing along all axes:

arr2d[:2, 1:]
'''
array([[1, 2],
       [4, 5]])
'''

Note: an integer i returns the same values as i:i+1, except that the dimensionality of the returned object is reduced by 1.

arr2d[1] #array([3, 4, 5])
arr2d[1:2] #array([[3, 4, 5]])

arr2d[1,1] # 4
arr2d[1,1:2] #array([4])
arr2d[1:2,1:2] #array([[4]])
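The difference is easiest to see in the shapes of the results. A small sketch, reusing the arr2d array from above (the `shape_*` names are only for illustration):

```python
import numpy as np

arr2d = np.arange(12).reshape((4, 3))

shape_int = arr2d[1].shape           # integer index drops axis 0
shape_slice = arr2d[1:2].shape       # length-1 slice keeps axis 0
shape_mixed = arr2d[1, 1:2].shape    # axis 0 dropped, axis 1 kept
shape_both = arr2d[1:2, 1:2].shape   # both axes kept
```

Here the four shapes are (3,), (1, 3), (1,) and (1, 1) respectively, while the selected values are the same as in the examples above.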

A slicing tuple can always be constructed as obj and used in the x[obj] notation.

arr2d[:3:1,1:2]
'''
array([[1],
       [4],
       [7]])
'''

# slice(i, j, k) is the object form of i:j:k; None stands for an omitted index
obj=(slice(None,3,1),slice(1,2))
arr2d[obj]
'''
array([[1],
       [4],
       [7]])
'''

2.3 Dimensional indexing tools

There are some tools to facilitate the easy matching of array shapes with expressions and in assignments.

Ellipsis (written ...) expands to the number of : objects needed for the selection tuple to index all dimensions.

# N = 3 dimensions
arr3d = np.arange(12).reshape((2,2,3))
'''
array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])
'''

arr3d[...,0]
'''
array([[0, 3],
       [6, 9]])
'''

arr3d[:,:,0]
'''
array([[0, 3],
       [6, 9]])
'''
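Ellipsis does not have to come first in the index. The sketch below shows it in trailing and middle positions, assuming the same arr3d as above; the result names are illustrative.

```python
import numpy as np

arr3d = np.arange(12).reshape((2, 2, 3))

first_block = arr3d[0, ...]     # same as arr3d[0, :, :]
corner = arr3d[0, ..., 0]       # same as arr3d[0, :, 0]

same_first = np.array_equal(first_block, arr3d[0, :, :])
same_corner = np.array_equal(corner, arr3d[0, :, 0])
```

An index may contain only one Ellipsis; NumPy raises an IndexError if there is more than one.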

A newaxis object in the selection tuple expands the dimensions of the resulting selection by one unit-length dimension. The added dimension is at the position of the newaxis object in the selection tuple. newaxis is an alias for None, so None can be used in its place with the same result.

arr3d
'''
array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])
'''

arr3d.shape #(2, 2, 3)

arr3d[:, np.newaxis, :, :]
'''
array([[[[ 0,  1,  2],
         [ 3,  4,  5]]],


       [[[ 6,  7,  8],
         [ 9, 10, 11]]]])
'''
arr3d[:, np.newaxis, :, :].shape #(2, 1, 2, 3)

arr3d[:, None, :, :]
'''
array([[[[ 0,  1,  2],
         [ 3,  4,  5]]],


       [[[ 6,  7,  8],
         [ 9, 10, 11]]]])
'''
arr3d[:, None, :, :].shape #(2, 1, 2, 3)

newaxis is handy for combining two arrays through broadcasting:

x = np.arange(5)

x[:, np.newaxis]
'''
array([[0],
       [1],
       [2],
       [3],
       [4]])
'''

x[np.newaxis,:]
'''
array([[0, 1, 2, 3, 4]])
'''

x[:, np.newaxis]+x[np.newaxis,:]
'''
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8]])
'''

3. Advanced indexing

There are two types of advanced indexing: integer array indexing and Boolean array indexing.

Unlike basic slicing, advanced indexing always returns a copy of the data!
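The copy semantics can be verified with a short experiment. This is a sketch with made-up names (`base`, `picked`); it also shows that assigning through an advanced index still writes to the original array.

```python
import numpy as np

base = np.arange(10)

picked = base[[1, 3, 5]]     # integer array index: the result is a copy
picked[0] = -1               # does not touch base
still_one = base[1]

shares = np.shares_memory(base, picked)

# Assignment *through* an advanced index does modify the original
base[[1, 3, 5]] = 0
```

So reading with an advanced index gives independent data, while `base[index] = value` is an in-place update of `base`.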

3.1 Integer array indexing

Integer array indexing allows selection of arbitrary items in the array based on their N-dimensional index.

For 1-D array

x=np.arange(10, 1, -1)
# array([10,  9,  8,  7,  6,  5,  4,  3,  2])

x[[3,3,1,8]]
# array([7, 7, 9, 2])

x[[3,3,-3,8]]
# array([7, 7, 4, 2])

For multidimensional array

y =np.arange(35).reshape(5,7)
'''
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])
'''

# select the elements located at (0,0), (2,1) and (4,2)
y[[0,2,4],[0,1,2]] #array([ 0, 15, 30])

# select the elements located at (0,1), (2,1) and (4,1); the scalar 1 is broadcast against the row indices
y[[0, 2, 4], 1] # array([ 1, 15, 29])

y[np.array([0, 2, 4])]
'''
array([[ 0,  1,  2,  3,  4,  5,  6],
      [14, 15, 16, 17, 18, 19, 20],
      [28, 29, 30, 31, 32, 33, 34]])
'''

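Note the difference between y[[0,2,4],[0,1,2]] and y[np.ix_([0,2,4],[0,1,2])]: the first pairs the two index arrays element by element, while np.ix_ selects every combination of the given rows and columns.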

y[[0,2,4],[0,1,2]] #array([ 0, 15, 30])

y[np.ix_([0,2,4],[0,1,2])] # [[(0,0),(0,1),(0,2)],[(2,0),(2,1),(2,2)],[(4,0),(4,1),(4,2)]]
'''
array([[ 0,  1,  2],
       [14, 15, 16],
       [28, 29, 30]])
'''

y[[0,2,4]][:,[0,1,2]]
'''
array([[ 0,  1,  2],
       [14, 15, 16],
       [28, 29, 30]])
'''
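What np.ix_ does can be made visible by looking at the index arrays it returns. The following sketch assumes the same y as above; `rows`, `cols` and `by_hand` are illustrative names, and the hand-written version is just one way to get the same open mesh.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)

# np.ix_ returns one index array per axis, shaped so that they broadcast
rows_idx, cols_idx = np.ix_([0, 2, 4], [0, 1, 2])
block = y[rows_idx, cols_idx]

# The same open mesh written by hand with newaxis
rows = np.array([0, 2, 4])
cols = np.array([0, 1, 2])
by_hand = y[rows[:, np.newaxis], cols]
```

Here rows_idx has shape (3, 1) and cols_idx has shape (1, 3), so broadcasting them yields a (3, 3) selection.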

3.2 Boolean array indexing

This kind of advanced indexing occurs when obj is an array object of Boolean type, such as may be returned from comparison operators.

A common use case for Boolean array indexing is filtering for desired element values. For example, one may wish to select all entries from an array which are not NaN:

x=np.array([[1.,2.],[np.nan,3.],[np.nan,np.nan]])
'''
array([[ 1.,  2.],
       [nan,  3.],
       [nan, nan]])
'''

np.isnan(x)
'''
array([[False, False],
       [ True, False],
       [ True,  True]])
'''

x[~np.isnan(x)] # array([1., 2., 3.])
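A Boolean mask can also be used on the left-hand side of an assignment. The sketch below replaces the NaN entries with 0 in a copy of the array; the names `mask`, `kept` and `filled` are assumptions made for this example.

```python
import numpy as np

x = np.array([[1., 2.], [np.nan, 3.], [np.nan, np.nan]])

mask = np.isnan(x)       # Boolean array with the same shape as x
kept = x[~mask]          # 1-D array of the entries that are not NaN

filled = x.copy()
filled[mask] = 0.0       # overwrite every NaN position in the copy

nan_count = int(mask.sum())
```

Selection with a full-shape mask always yields a 1-D result, whereas assignment through the mask keeps the original shape.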

4. Combining advanced and basic indexing

y = np.arange(35).reshape(5,7)

# The slice selects the columns with index 1 and 2; the index array selects the rows with index 0, 2 and 4
y[np.array([0, 2, 4]), 1:3]
'''
array([[ 1,  2],
       [15, 16],
       [29, 30]])
'''
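One way to read the combined index is as a slice followed by an index-array lookup. A minimal sketch, assuming the same y as above (`combined` and `two_step` are illustrative names):

```python
import numpy as np

y = np.arange(35).reshape(5, 7)

combined = y[np.array([0, 2, 4]), 1:3]

# Slice first (a view), then pick rows with the index array (a copy)
two_step = y[:, 1:3][np.array([0, 2, 4]), :]

same = np.array_equal(combined, two_step)
is_copy = not np.shares_memory(y, combined)
```

Because an index array is involved, the result is a copy even though part of the index is a plain slice.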

5. Field access

If the ndarray object is a structured array, its fields can be accessed by indexing the array with strings, in a dictionary-like way.

x = np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])
'''
array([[(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]),
        (0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])],
       [(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]),
        (0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])]],
      dtype=[('a', '<i4'), ('b', '<f8', (3, 3))])
'''

x['a'].shape # (2,2)
x['a'].dtype # dtype('int32')
x['a']
'''
array([[0, 0],
       [0, 0]])
'''

x['b'].shape #(2,2,3,3)
x['b'].dtype # dtype('float64')
x['b']
'''
array([[[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]],


       [[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]]])
'''
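Accessing a single field gives a view onto the structured array, so writing to the field updates the array itself. The sketch below uses the same dtype as above; `a_field` and the other result names are chosen for illustration.

```python
import numpy as np

x = np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])

a_field = x['a']             # view onto the 'a' field of every element
a_field[0, 0] = 7            # writes into x itself

value_in_x = x[0, 0]['a']    # read the field back from one element
shares = np.shares_memory(x, a_field)

field_names = x.dtype.names  # names that can be used as string indices
```

The available field names can be read from `x.dtype.names`, which is ('a', 'b') here.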