np.bincount(): counting occurrences
Its signature is:
numpy.bincount(x, weights=None, minlength=0)
It is especially handy for computing the distribution of a dataset's label column (y_train), i.e. the class distribution:
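>>> np.bincount(y_train.astype(np.int32))
>>> np.bincount(np.array([0, 1, 1, 3, 2, 1, 7]))
array([1, 3, 1, 1, 0, 0, 0, 1])
# counts of the values 0 through 7, in order
# (the original run displayed dtype=int32; the integer dtype of the result is platform-dependent)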
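As a minimal sketch of the class-distribution use case: the y_train array below is made up for illustration (it does not come from any real dataset), and the step that turns counts into proportions is an addition to the snippet above.

```python
import numpy as np

# Hypothetical label column; in practice this would come from your dataset.
y_train = np.array([0., 1., 1., 2., 1., 0., 2., 2., 2.])

# bincount needs non-negative integers, so float labels are cast first.
counts = np.bincount(y_train.astype(np.int32))

# Dividing by the total turns raw counts into class proportions.
proportions = counts / counts.sum()

for label, (n, p) in enumerate(zip(counts, proportions)):
    print(f"class {label}: {n} samples ({p:.1%})")
```

Because the result is indexed by label value, counts[k] can be read directly as the number of samples in class k.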
If weights is specified the input array is weighted by it, i.e. if a value n is found at position i, out[n] += weight[i] instead of out[n] += 1.
>>> w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6]) # weights
>>> x = np.array([0, 1, 1, 3, 2, 2])
>>> np.bincount(x, w)
array([ 0.3, 0.7, 0.4, 0.7])
# 0: 0.3
# 1: 0.5 + 0.2 = 0.7
# 2: 1.0 + (-0.6) = 0.4
# 3: 0.7
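One common way to use weights is as a per-group sum, which combines with a plain count to give per-group means. The sketch below reuses the numbers from the example above; the names group and value are chosen here for illustration, and the division assumes every bin is non-empty.

```python
import numpy as np

group = np.array([0, 1, 1, 3, 2, 2])                  # bin index of each item
value = np.array([0.3, 0.5, 0.2, 0.7, 1.0, -0.6])     # quantity to accumulate

sums = np.bincount(group, weights=value)   # sum of value within each group
counts = np.bincount(group)                # number of items in each group

# Safe here because every group 0..3 occurs at least once;
# an empty bin would produce a division by zero.
means = sums / counts

print(sums)
print(counts)
print(means)
```

With weights supplied the result is a float array, whereas the unweighted call returns integers.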
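np.bincount() always starts counting from zero (negative values are not allowed in the input), so values that never occur still get a bin with a count of 0: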
>>> np.bincount([3, 4, 4, 3, 3, 5])
array([0, 0, 0, 3, 2, 1])
# index 0: number of times 0 occurs,
# index 1: number of times 1 occurs,
# index 2: number of times 2 occurs,
# ...
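Two consequences of counting from zero can be sketched as follows. The minlength parameter from the signature pads the result to a fixed number of bins, and negative input is rejected with a ValueError. Shifting the data by its minimum is one possible workaround added here for illustration, not something the text above prescribes; the array y is made up.

```python
import numpy as np

x = np.array([3, 4, 4, 3, 3, 5])

# minlength guarantees at least that many bins, padding with zeros.
padded = np.bincount(x, minlength=10)
print(padded)

# Negative values are rejected with a ValueError.
try:
    np.bincount(np.array([-1, 0, 1]))
    negative_rejected = False
except ValueError:
    negative_rejected = True
print(negative_rejected)

# Possible workaround (an assumption of this sketch): shift by the minimum
# so the smallest value maps to bin 0, and remember the offset.
y = np.array([-2, -1, -1, 0, 2])
offset = y.min()
shifted_counts = np.bincount(y - offset)
# shifted_counts[k] is the count of the value k + offset
print(shifted_counts)
```

A fixed minlength is useful when counts from several arrays must line up, for example when some classes are missing from one split of the data.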
---------------------
Original: https://blog.youkuaiyun.com/lanchunhui/article/details/50491632