NumPy: Indexing and Slicing on ndarrays
1. Types of indexing
ndarrays can be indexed using the standard Python x[obj] syntax. Depending on the type of obj, indexing falls into three kinds: basic indexing, advanced indexing, and field access.
Note that in Python, x[(exp1, exp2, …, expN)] is equivalent to x[exp1, exp2, …, expN]; the second form is just syntactic sugar for the first.
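The tuple equivalence can be checked directly. The following is a minimal sketch, assuming NumPy is imported as np; the names a, idx, same and rows are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)   # illustrative array

idx = (1, 2)                      # an explicit index tuple
same = a[idx] == a[1, 2]          # both forms select row 1, column 2

# A list is not a tuple: a[[1, 2]] triggers advanced indexing
# and selects rows 1 and 2 instead of a single element.
rows = a[[1, 2]]
```

The list case is worth remembering, since only a tuple is unpacked into one index per axis.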
2. Basic indexing
2.1 Single element indexing
- 0-based
- accepts negative indices for indexing from the end of the array
For 1-D array
x=np.arange(12) #array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
x[0] #0
x[2] #2
x[-1] #11
x[-2] #10
For 2-D array
x.shape=(4,3)
'''
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
'''
It is not necessary to put each dimension's index in its own pair of square brackets.
Within a single pair of square brackets, use a comma to separate the indices for the different axes.
x[1,2] #5
x[2,-2] #7
x[2][-2] #7
#If one indexes a multidimensional array with fewer indices than dimensions, one gets a subdimensional array
x[0] #array([0, 1, 2])
Note that x[2,-2] == x[2][-2], but the second form is less efficient: the first index creates a new temporary array, which is then indexed by -2.
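To make the intermediate step visible, the chained form can be split into its two operations. This is only a sketch; the variable names row, chained and direct are assumptions for illustration.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

row = x[2]          # intermediate 1-D array produced by the first index
chained = row[-2]   # second index applied to that intermediate
direct = x[2, -2]   # one indexing operation, no intermediate array
```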
2.2 Slicing and striding
NumPy's basic slicing extends Python's basic concept of slicing to N dimensions.
Note: arrays produced by basic slicing are always views of the original array.
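A short sketch of what "view" means in practice: writing through a slice changes the original array, and an explicit copy is needed for isolation. The names base, view and independent are illustrative.

```python
import numpy as np

base = np.arange(10)
view = base[2:8:2]                 # basic slice: elements 2, 4, 6

view[0] = -1                       # writing through the view changes `base`
shares = np.shares_memory(base, view)

independent = base[2:8:2].copy()   # explicit copy when isolation is needed
independent[1] = 100               # does not touch `base`
```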
For 1-D array
- The basic slice syntax is i:j:k, where i is the starting index, j is the stopping index (exclusive), and k is the step (k != 0). This selects the m elements in the corresponding dimension with index values i, i+k, …
x=np.arange(10) # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x[1:7:2] #array([1, 3, 5])
- Negative i and j are interpreted as n+i and n+j, where n is the number of elements in the corresponding dimension. A negative k steps towards smaller indices.
x[-2:10] #array([8, 9])
x[-3:3:-1] #array([7, 6, 5, 4])
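- Indices can be omitted. For k > 0, an omitted i defaults to 0 and an omitted j defaults to n; an omitted k defaults to 1.
x[5:] #array([5, 6, 7, 8, 9])
x[:] #select all indices along the axis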
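The slice rules above can be sketched together, including how the defaults behave when the step is negative. The variable names are illustrative only.

```python
import numpy as np

x = np.arange(10)

head = x[:3]           # start omitted: begins at index 0
every_other = x[::2]   # start and stop omitted, step 2
reversed_x = x[::-1]   # negative step: the defaults flip, so the array is reversed
tail_rev = x[:6:-1]    # starts at the last element, stops before index 6
```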
For 2-D array
arr2d = np.arange(12).reshape((4,3))
'''
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
'''
- Slicing along axis 0 only
arr2d[:2]
'''
array([[0, 1, 2],
[3, 4, 5]])
'''
- Slicing along all axes
arr2d[:2, 1:]
'''
array([[1, 2],
[4, 5]])
'''
- Note: an integer i returns the same values as i:i+1, except that the dimensionality of the returned object is reduced by 1.
arr2d[1] #array([3, 4, 5])
arr2d[1:2] #array([[3, 4, 5]])
arr2d[1,1] # 4
arr2d[1,1:2] #array([4])
arr2d[1:2,1:2] #array([[4]])
A slicing tuple can always be constructed as obj and used in the x[obj] notation. Using the 2-D array arr2d from above:
arr2d[:3:1,1:2]
'''
array([[1],
[4],
[7]])
'''
obj=(slice(None,3,1),slice(1,2))
arr2d[obj]
'''
array([[1],
[4],
[7]])
'''
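As a sketch of why this is useful, a selection can be stored in a variable and reused. The example also uses np.s_, NumPy's helper that builds the same tuple from slice syntax; the variable names are assumptions for illustration.

```python
import numpy as np

arr2d = np.arange(12).reshape(4, 3)

obj = (slice(None, 3, 1), slice(1, 2))   # same selection as [:3:1, 1:2]
via_tuple = arr2d[obj]
via_literal = arr2d[:3:1, 1:2]

# np.s_ turns slice syntax into the equivalent tuple of slice objects
obj_s = np.s_[:3:1, 1:2]
via_s = arr2d[obj_s]
```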
2.3 Dimensional indexing tools
There are some tools to facilitate the easy matching of array shapes with expressions and in assignments.
- Ellipsis (...) expands to the number of : objects needed for the selection tuple to index all dimensions.
#N = 3
arr3d = np.arange(12).reshape((2,2,3))
'''
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]]])
'''
arr3d[...,0]
'''
array([[0, 3],
[6, 9]])
'''
arr3d[:,:,0]
'''
array([[0, 3],
[6, 9]])
'''
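Ellipsis is most useful when code should not depend on the number of dimensions. The sketch below assumes a hypothetical helper, last_along_final_axis, which is not part of NumPy and is defined here only for illustration.

```python
import numpy as np

arr3d = np.arange(12).reshape(2, 2, 3)

leading = arr3d[0, ...]     # same as arr3d[0, :, :]
trailing = arr3d[..., 0]    # same as arr3d[:, :, 0]

def last_along_final_axis(a):
    # hypothetical helper: works for any number of dimensions
    return a[..., -1]

from_3d = last_along_final_axis(arr3d)               # shape (2, 2)
from_2d = last_along_final_axis(arr3d.reshape(4, 3)) # shape (4,)
```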
- The newaxis object in the selection tuple expands the dimensions of the resulting selection by one unit-length dimension. The added dimension is at the position of the newaxis object in the selection tuple. newaxis is an alias for None, and None can be used in its place with the same result.
arr3d
'''
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]]])
'''
arr3d.shape #(2, 2, 3)
arr3d[:, np.newaxis, :, :]
'''
array([[[[ 0, 1, 2],
[ 3, 4, 5]]],
[[[ 6, 7, 8],
[ 9, 10, 11]]]])
'''
arr3d[:, np.newaxis, :, :].shape #(2, 1, 2, 3)
arr3d[:, None, :, :]
'''
array([[[[ 0, 1, 2],
[ 3, 4, 5]]],
[[[ 6, 7, 8],
[ 9, 10, 11]]]])
'''
arr3d[:, None, :, :].shape #(2, 1, 2, 3)
newaxis is handy for combining two arrays through broadcasting:
x = np.arange(5)
x[:, np.newaxis]
'''
array([[0],
[1],
[2],
[3],
[4]])
'''
x[np.newaxis,:]
'''
array([[0, 1, 2, 3, 4]])
'''
x[:, np.newaxis]+x[np.newaxis,:]
'''
array([[0, 1, 2, 3, 4],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]])
'''
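The same pattern works for arrays of different lengths. As a sketch (names a, b, col, row and table are illustrative), a column of shape (3, 1) combined with a row of shape (1, 2) broadcasts to a (3, 2) table:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20])

col = a[:, np.newaxis]   # shape (3, 1)
row = b[np.newaxis, :]   # shape (1, 2)
table = col * row        # broadcasts to shape (3, 2)
```

Here table matches np.outer(a, b); the newaxis form has the advantage of working with any elementwise operation, not only multiplication.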
3. Advanced indexing
There are two types of advanced indexing: integer and Boolean.
Advanced indexing always returns a copy of the data, in contrast to basic slicing, which returns a view.
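The copy behaviour can be contrasted with a basic slice over the same elements. This is a minimal sketch with illustrative names; note that assignment through an advanced index still writes into the original array.

```python
import numpy as np

base = np.arange(10)

picked = base[[1, 3, 5]]   # integer array indexing: a copy
picked[0] = -1             # `base` is not affected by this write

sliced = base[1:6:2]       # basic slicing of the same elements: a view

# Assigning through an advanced index modifies `base` in place
base[[1, 3, 5]] = 0
```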
3.1 Integer array indexing
Integer array indexing allows selection of arbitrary items in the array based on their N-dimensional index.
For 1-D array
x=np.arange(10, 1, -1)
# array([10, 9, 8, 7, 6, 5, 4, 3, 2])
x[[3,3,1,8]]
# array([7, 7, 9, 2])
x[[3,3,-3,8]]
# array([7, 7, 4, 2])
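The result of integer array indexing takes the shape of the index array, not of the source. A small sketch, where idx and picked are illustrative names:

```python
import numpy as np

x = np.arange(10, 1, -1)           # array([10, 9, 8, 7, 6, 5, 4, 3, 2])

idx = np.array([[1, 1], [2, 3]])   # a 2-D index array
picked = x[idx]                    # result has the shape of idx
```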
For multidimensional array
y =np.arange(35).reshape(5,7)
'''
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
'''
#select the elements located at (0,0), (2,1) and (4,2)
y[[0,2,4],[0,1,2]] #array([ 0, 15, 30])
#select the elements located at (0,1), (2,1) and (4,1)
y[[0, 2, 4], 1] # array([ 1, 15, 29])
y[np.array([0, 2, 4])]
'''
array([[ 0, 1, 2, 3, 4, 5, 6],
[14, 15, 16, 17, 18, 19, 20],
[28, 29, 30, 31, 32, 33, 34]])
'''
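Note the difference between y[[0,2,4],[0,1,2]] and y[np.ix_([0,2,4],[0,1,2])]: the first pairs the two index lists element-wise, while the second selects every combination of the given rows and columns.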
y[[0,2,4],[0,1,2]] #array([ 0, 15, 30])
y[np.ix_([0,2,4],[0,1,2])] # [[(0,0),(0,1),(0,2)],[(2,0),(2,1),(2,2)],[(4,0),(4,1),(4,2)]]
'''
array([[ 0, 1, 2],
[14, 15, 16],
[28, 29, 30]])
'''
y[[0,2,4]][:,[0,1,2]]
'''
array([[ 0, 1, 2],
[14, 15, 16],
[28, 29, 30]])
'''
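What np.ix_ does can be sketched by looking at the arrays it returns: a column-shaped and a row-shaped index array that broadcast against each other. The names rows, cols, r, c, block, manual and paired are illustrative.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
rows, cols = [0, 2, 4], [0, 1, 2]

r, c = np.ix_(rows, cols)   # r has shape (3, 1), c has shape (1, 3)
block = y[r, c]             # index arrays broadcast to shape (3, 3)

# The same selection by hand: add the extra axis to the row indices
manual = y[np.array(rows)[:, np.newaxis], np.array(cols)]

paired = y[rows, cols]      # pairs the indices element-wise instead
```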
3.2 Boolean array indexing
This kind of advanced indexing occurs when obj is an array object of Boolean type, such as may be returned from comparison operators.
A common use case for Boolean array indexing is filtering for desired element values. For example, one may wish to select all entries from an array which are not NaN:
x=np.array([[1.,2.],[np.nan,3.],[np.nan,np.nan]])
'''
array([[ 1., 2.],
[nan, 3.],
[nan, nan]])
'''
np.isnan(x)
'''
array([[False, False],
[ True, False],
[ True, True]])
'''
x[~np.isnan(x)] # array([1., 2., 3.])
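A Boolean mask can also be counted and used on the left-hand side of an assignment. The following sketch uses illustrative names (mask, n_missing, valid, cleaned) and replaces NaN with 0.0 purely as an example value.

```python
import numpy as np

x = np.array([[1., 2.], [np.nan, 3.], [np.nan, np.nan]])

mask = np.isnan(x)       # Boolean array with the same shape as x
n_missing = mask.sum()   # each True counts as 1
valid = x[~mask]         # 1-D array of the non-NaN values

cleaned = x.copy()
cleaned[mask] = 0.0      # assign through the mask, leaving x intact
```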
4. Combining advanced and basic indexing
y = np.arange(35).reshape(5,7)
#The slice operation extracts columns with index 1 and 2, followed by the index array operation which extracts rows with index 0,2 and 4
y[np.array([0, 2, 4]), 1:3]
'''
array([[ 1, 2],
[15, 16],
[29, 30]])
'''
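As a sketch, the mixed index gives the same values as selecting the rows first and slicing the columns afterwards, and because an advanced index is involved the result is a copy. The names mixed, two_step and is_copy are illustrative.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)

mixed = y[np.array([0, 2, 4]), 1:3]   # index array and slice in one step
two_step = y[[0, 2, 4]][:, 1:3]       # rows first, then the column slice

is_copy = not np.shares_memory(y, mixed)
```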
5. Field access
If the ndarray object is a structured array, the fields of the array can be accessed by indexing the array with strings, much like a dictionary.
x = np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])
'''
array([[(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]),
(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])],
[(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]),
(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])]],
dtype=[('a', '<i4'), ('b', '<f8', (3, 3))])
'''
x['a'].shape # (2,2)
x['a'].dtype # dtype('int32')
x['a']
'''
array([[0, 0],
[0, 0]])
'''
x['b'].shape #(2,2,3,3)
x['b'].dtype # dtype('float64')
x['b']
'''
array([[[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]]],
[[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]],
[[0., 0., 0.],
[0., 0., 0.],
[0., 0., 0.]]]])
'''
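Accessing a single field returns a view, so it can also be used to fill the structured array. A minimal sketch, assuming the same dtype as above; the assigned values are arbitrary examples.

```python
import numpy as np

x = np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])

x['a'] = [[1, 2], [3, 4]]   # assign to a whole field
x['b'][0, 1] = np.eye(3)    # write one (3, 3) sub-array through the field view

names = x.dtype.names       # field names in declaration order
```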