Array set functions, array sorting, and file reading

Latest recommended article published on 2021-04-21 20:45:36

License: CC 4.0 BY-SA

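This article walks through several higher-level NumPy array operations: set functions, sorting, looking up indices, and reading and writing files. Examples show how to deduplicate and sort an array, compute intersections, unions, differences, and symmetric differences, find the indices of array elements, and save arrays to and load them from npy and npz files. It also covers saving an array in CSV format.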

6. Array set functions

import numpy as np

arr1 = np.arange(10,20)
arr2 = np.arange(0,5)
arr3 = np.array([10,10,10,20,30,40,50,1])

print('######## Deduplicate and sort #######')
print(np.unique(arr3))
print('#### Intersection #######')
print(np.intersect1d(arr1,arr3))
print('#### Union ########')
print(np.union1d(arr1,arr2))
print('#### Difference ######')
print(np.setdiff1d(arr1,arr3))
print(np.setdiff1d(arr3,arr1))
print('##### Symmetric difference: elements in exactly one of the two arrays ####')
print(np.setxor1d(arr1,arr3))
print(np.setxor1d(arr3,arr1))
print('#### Membership test ####')
# np.isin replaces np.in1d, which is deprecated since NumPy 2.0
print(np.isin(arr1,arr3))
print(np.isin(arr3,arr2))

Output:
######## Deduplicate and sort #######
[ 1 10 20 30 40 50]
#### Intersection #######
[10]
#### Union ########
[ 0  1  2  3  4 10 11 12 13 14 15 16 17 18 19]
#### Difference ######
[11 12 13 14 15 16 17 18 19]
[ 1 20 30 40 50]
##### Symmetric difference: elements in exactly one of the two arrays ####
[ 1 11 12 13 14 15 16 17 18 19 20 30 40 50]
[ 1 11 12 13 14 15 16 17 18 19 20 30 40 50]
#### Membership test ####
[ True False False False False False False False False False]
[False False False False False False False  True]
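
As a small extension of the set functions above, the sketch below shows that np.unique can also report where each distinct value first appears and how often it occurs. The return_index and return_counts arguments are not used in the original example; they are added here only for illustration.

```python
import numpy as np

arr3 = np.array([10, 10, 10, 20, 30, 40, 50, 1])

# return_index: position of the first occurrence of each unique value
# return_counts: number of times each unique value appears
values, first_idx, counts = np.unique(arr3, return_index=True, return_counts=True)

print(values)     # sorted unique values
print(first_idx)  # index into arr3 of each value's first occurrence
print(counts)     # occurrences of each value
```

Here the value 10 appears three times, so its entry in counts is 3 while every other entry is 1.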

7. Array sorting

7.1 Sorting an array

def sort(a, axis=-1, kind='quicksort', order=None):
    """
    Return a sorted copy of an array.
    """

Note:
np.sort returns a sorted copy; the original array is not modified.

def sort(self, axis=-1, kind='quicksort', order=None):
    """
    a.sort(axis=-1, kind='quicksort', order=None)

    Sort an array, in-place.

    Parameters
    ----------
    axis : int, optional
        Axis along which to sort. Default is -1, which means sort along the
        last axis.
    kind : {'quicksort', 'mergesort', 'heapsort', 'stable'}, optional
        Sorting algorithm. Default is 'quicksort'.
    order : str or list of str, optional
        When `a` is an array with fields defined, this argument specifies
        which fields to compare first, second, etc.  A single field can
        be specified as a string, and not all fields need be specified,
        but unspecified fields will still be used, in the order in which
        they come up in the dtype, to break ties.
    """

Note:
ndarray.sort sorts the original array in place and returns None.
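
The order parameter described in the docstring only applies to arrays with named fields. The following is a minimal sketch of that case; the field names and records are hypothetical and do not come from the original article.

```python
import numpy as np

# Hypothetical structured array with two fields: name and age
people = np.array(
    [('Bob', 30), ('Carol', 25), ('Alice', 25)],
    dtype=[('name', 'U10'), ('age', 'i4')],
)

# Sort by age; ties are broken by the remaining fields in dtype order (name)
by_age = np.sort(people, order='age')

print(by_age['name'])
print(by_age['age'])
```

Because Carol and Alice share the same age, the unspecified name field decides their relative order, which matches the tie-breaking rule in the docstring.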

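Example:

import numpy as np

arr3 = np.array([100,10,10,10,20,30,40,50,1])

# np.sort returns a sorted copy
print(np.sort(arr3))
# arr3 itself is unchanged
print(arr3)
# ndarray.sort sorts arr3 in place
arr3.sort()
print(arr3)


Output: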
[  1  10  10  10  20  30  40  50 100]
[100  10  10  10  20  30  40  50   1]
[  1  10  10  10  20  30  40  50 100]
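
Two related operations that often come up next to sorting are getting the sorting indices and sorting in descending order. The sketch below uses np.argsort and slice reversal, neither of which appears in the original example, so treat it as one possible approach rather than the author's method.

```python
import numpy as np

arr = np.array([100, 10, 10, 10, 20, 30, 40, 50, 1])

# argsort returns the indices that would sort the array;
# kind='stable' keeps equal elements in their original relative order
order = np.argsort(arr, kind='stable')
print(order)
print(arr[order])

# One simple way to sort in descending order: sort ascending, then reverse
descending = np.sort(arr)[::-1]
print(descending)
```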

7.2 Getting the indices of array elements (argwhere)

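print('######## argwhere: get array indices ########')
# arr1 here is a 2-D array of strings whose contents are not shown in this article
# Goal: look up the value for Iran ('伊朗') in 2009 ('2009年')
iran = np.argwhere(arr1=='伊朗')
year = np.argwhere(arr1=='2009年')
print(iran)
print(year)
print(arr1[11,13])

# The row index comes from the year match, the column index from the country match
index1 = np.argwhere(arr1=='2009年')[0,0]
index2 = np.argwhere(arr1=='伊朗')[0,1]
print(arr1[index1,index2])


Output:
######## argwhere: get array indices ########
[[ 4 13]]
[[11  0]]
73370982
73370982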
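
Since the dataset behind arr1 is not included in the article, here is a self-contained sketch of the same lookup pattern on a tiny made-up table. The table contents and the labels 'A', 'B', '2008', and '2009' are hypothetical.

```python
import numpy as np

# Hypothetical table: the first row holds column labels, the first column holds years
table = np.array([
    ['year', 'A', 'B'],
    ['2008', '1', '2'],
    ['2009', '3', '4'],
])

# argwhere returns one [row, col] pair per match
row = np.argwhere(table == '2009')[0, 0]  # row of the year label
col = np.argwhere(table == 'B')[0, 1]     # column of the column label

print(row, col)
print(table[row, col])
```

Taking [0, 0] and [0, 1] assumes each label occurs exactly once; if a label is missing, argwhere returns an empty array and the indexing raises an IndexError.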

8. Reading and writing files

8.1 Saving arrays in npy and npz format

import numpy as np

arr1 = np.arange(20).reshape(4,5)
arr2 = np.arange(10).reshape(2,5)

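# Save the array to a binary file (np.save appends the .npy extension)
np.save('a1',arr1)
# Load the array back from the binary file
arr3 = np.load('a1.npy')
print(arr3)

print('##############')
# Save several arrays into one .npz archive, each stored under its keyword name
np.savez('test',nd1=arr1,nd2=arr2)
arr4 = np.load('test.npz')
print(arr4['nd1'])
print(arr4['nd2'])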

Output:

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
##############
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
[[0 1 2 3 4]
 [5 6 7 8 9]]
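
A variation on the npz example, sketched under a few assumptions: it writes into a temporary directory so no files are left behind, uses np.savez_compressed instead of np.savez, and lists the stored names through the archive's files attribute. None of these choices come from the original article.

```python
import os
import tempfile

import numpy as np

arr1 = np.arange(20).reshape(4, 5)
arr2 = np.arange(10).reshape(2, 5)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.npz')
    # Same keyword-name scheme as np.savez, but the archive is compressed
    np.savez_compressed(path, nd1=arr1, nd2=arr2)

    # Using the loaded archive as a context manager closes the file afterwards
    with np.load(path) as archive:
        names = sorted(archive.files)  # names of the stored arrays
        loaded1 = archive['nd1']
        loaded2 = archive['nd2']

print(names)
print(np.array_equal(loaded1, arr1), np.array_equal(loaded2, arr2))
```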

8.2 Saving in a specified text format

import numpy as np

arr1 = np.array([
    ['col1','col2','col3'],
    ['java','python','go'],
    ['mysql','redis','mongodb']
])
print(arr1)
np.savetxt('aaa.csv',arr1,fmt='%s',delimiter=',')


Note:
def savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='',
            footer='', comments='# ', encoding=None)

fmt specifies the format of each value, e.g. '%s', '%d', '%f'
delimiter: the separator between columns; for a CSV file it must be set to ','

# arr2 = np.genfromtxt('aaa.csv',delimiter=',',dtype=str,usecols=(1,))
arr2 = np.genfromtxt('aaa.csv',delimiter=',',dtype=str,usecols=[1])

print(arr2)


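Note:
Key genfromtxt parameters: fname, dtype=float, comments='#', delimiter=None, usecols=None
usecols specifies which columns to read, with indices starting at 0; a single-element tuple needs a trailing comma, e.g. (1,)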
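
For numeric data, the same two functions can be combined into a round trip. The sketch below is an illustration only: the file name, the header text, and the use of skip_header are assumptions that do not appear in the original example.

```python
import os
import tempfile

import numpy as np

data = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'numbers.csv')
    # comments='' keeps the header line from being prefixed with '# '
    np.savetxt(path, data, fmt='%d', delimiter=',', header='a,b,c', comments='')
    # skip_header=1 skips the header row; usecols=(0, 2) keeps the first and third columns
    cols = np.genfromtxt(path, delimiter=',', skip_header=1, usecols=(0, 2))

print(cols)
```

With the default dtype=float, the values come back as floating-point numbers even though they were written with '%d'.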