ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 400, 'y' with size 400. Course assignment problem summary
- Problem summary
- deeplearning.ai_01_week3 (Planar data classification with one hidden layer)
- ① ERROR: ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 400, 'y' with size 400.
- ② WARNING: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel()
- ③ WARNING: FutureWarning: The default value of cv will change from 3 to 5 in version 0.22. Specify it explicitly to silence this warning. warnings.warn(CV_WARNING, FutureWarning)
- Walkthrough of planar_utils
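Problem summary
deeplearning.ai_01_week3 (Planar data classification with one hidden layer)
① ERROR: ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 400, 'y' with size 400.
Running the Planar data classification with one hidden layer notebook raises this error: (ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 400, 'y' with size 400.) The message says that c holds only one element, in other words the Y passed as c does not match the 400 points in x and y. Y here is a two-dimensional array with a single row of 400 labels, and it appears that the installed matplotlib counts that row as one element. This is most likely caused by a library version difference.
The signature of scatter(), from the matplotlib scatter plot tutorial: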
matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, *, edgecolors=None, plotnonfinite=False, data=None, **kwargs)
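Parameter | Meaning |
---|---|
x, y | Arrays of the same length: the input data, i.e. the coordinates of the points to plot. |
s | Marker size; default 20 in older matplotlib releases (rcParams['lines.markersize'] ** 2 in current ones). May also be an array giving the size of each point. |
c | Marker color; default blue 'b' in older releases. May also be a list of n colors, a sequence of n numbers mapped to colors through cmap, or a 2D array whose rows are RGB or RGBA values. |
marker | Marker style; default is a small circle 'o'. |
cmap | Colormap, default None. A Colormap instance or the name of a colormap; used only when c is an array of floats. If not specified, the image.cmap setting is used. |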
An example:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([1, 4, 9, 16, 7, 11, 23, 18])
colors = np.array(["red","green","black","orange","purple","beige","cyan","magenta"])
plt.scatter(x, y, c=colors)
plt.show()
colors should be a one-dimensional array, or a list that stores the colors.
Method 1: pass Y to c as a one-dimensional array
Change the scatter call to:
plt.scatter(X[0, :], X[1, :], c=Y[0,:], s=40, cmap=plt.cm.Spectral);
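Note: the week 3 notebook, Planar data classification with one hidden layer, hits this problem in many places. For example, plotting the decision boundary often raises the same error; remember to pass Y[0, :] instead of Y.
Method 2: build Y as a list (worth a try)
My approach is to modify the load_planar_dataset() function in planar_utils.py so that Y holds plain color strings ('red' for class 0, 'blue' for class 1). With this method the later parts of the assignment need a few small changes, but working through them deepens your understanding of some expressions and functions, so it is not a bad approach: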
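To make the shape mismatch concrete, here is a minimal NumPy sketch. The array below is only a stand-in for the course labels (it assumes a (1, 400) array of 0/1 values), and as_point_colors is a hypothetical helper name, not part of planar_utils.

```python
import numpy as np

# Stand-in for the course labels: assumed to be a (1, 400) array of 0/1 values.
Y = np.zeros((1, 400), dtype="uint8")
Y[0, 200:] = 1

print(Y.shape)        # two-dimensional: one row holding 400 labels
print(Y[0, :].shape)  # one-dimensional: one label per point


def as_point_colors(labels):
    """Hypothetical helper: flatten a (1, m) or (m, 1) label array to shape (m,)."""
    return np.asarray(labels).ravel()


c_values = as_point_colors(Y)
print(c_values.shape)
```

Passing c_values (or Y[0, :]) as c gives scatter one value per point, which is the form Method 1 relies on.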
import numpy as np

def load_planar_dataset():
    np.random.seed(1)
    m = 400 # number of examples
    N = int(m/2) # number of points per class
    D = 2 # dimensionality
    X = np.zeros((m,D)) # data matrix where each row is a single example
    ### START ORIGINAL CODE
    # labels vector (0 for red, 1 for blue)
    # Y = np.zeros((m,1), dtype='uint8')
    ### END ORIGINAL CODE
    ### START MY CODE
    # A list can hold the color strings
    Y = list()
    ### END MY CODE
    a = 4 # maximum ray of the flower
    for j in range(2):
        ix = range(N*j,N*(j+1))
        t = np.linspace(j*3.12,(j+1)*3.12,N) + np.random.randn(N)*0.2 # theta
        r = a*np.sin(4*t) + np.random.randn(N)*0.2 # radius
        X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]
        ### START ORIGINAL CODE
        # Y[ix] = j
        ### END ORIGINAL CODE
        ### START MY CODE
        # Slice assignment past the end of the list appends the new labels
        Y[N*j: N*(j+1)] = ['red']*len(ix) if j==0 else ['blue']*len(ix)
        ### END MY CODE
    X = X.T
    # Y is a list now, so the transpose is commented out
    # Y = Y.T
    return X, Y
This resolves the problem.
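If editing planar_utils.py is not desirable, a similar list of color names can probably be derived from the numeric labels at plotting time. The sketch below is an alternative, not the method used above; it assumes Y is a (1, 400) array of 0/1 labels, and color_names is an illustrative variable name.

```python
import numpy as np

# Stand-in for the numeric labels returned by the unmodified loader (assumed shape).
Y = np.zeros((1, 400), dtype="uint8")
Y[0, 200:] = 1

# Map class 0 to 'red' and class 1 to 'blue', one color name per point.
color_names = np.where(Y.ravel() == 0, "red", "blue").tolist()

print(len(color_names), color_names[0], color_names[-1])
```

The numeric Y stays untouched, so the later cells that do arithmetic on Y need no changes; color_names is only passed to scatter as c.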
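② WARNING: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel()
Running the code in this cell produces a warning (or a FutureWarning). The code should still run, but the message asks for Y.T to be reshaped to the form (n_samples, ); in this cell Y should have shape (400, ).
Interestingly, Professor Andrew Ng mentioned earlier, almost in passing, that an object with obj.shape = (samples, ) is not really a vector in the row or column sense; it is a one-dimensional (rank-1) array, and what exactly to call it hardly matters. As the code below shows, whatever colors is, colors.shape = (8, ):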
colors = np.array(["red","green","black","orange","purple","beige","cyan","magenta"])
print(colors.shape)
>>>(8,)
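So this warning presumably wants the same form. Following its hint, "for example using ravel()", rewrite Y.
As usual, first look at the Runoob tutorial that covers ravel. The example on that page shown here demonstrates the related flat iterator, which visits the elements in the same order that ravel returns them: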
import numpy as np
a = np.arange(9).reshape(3,3)
print('Original array:')
for row in a:
    print(row)
# To process every element of an array, use the flat attribute, which is an iterator over the array's elements:
print('Array after iteration:')
for element in a.flat:
    print(element)
>>>--------------------Output--------------------------
>>>Original array:
>>>[0 1 2]
>>>[3 4 5]
>>>[6 7 8]
>>>Array after iteration:
>>>0
>>>1
>>>2
>>>3
>>>4
>>>5
>>>6
>>>7
>>>8
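The example above iterates with flat; ravel itself returns the flattened array in one call. A short sketch of what its order argument does, using small illustrative arrays that are not from the assignment:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

row_major = a.ravel()            # default order='C': read row by row
col_major = a.ravel(order="F")   # order='F': read column by column

y_col = np.zeros((4, 1))         # a column vector, like the y in the warning
y_flat = y_col.ravel()           # one-dimensional, shape (4,)

print(row_major, col_major, y_flat.shape)
```

For an array with a single row or a single column, both orders give the same result, so the choice does not matter for Y.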
So it is enough to flatten Y with ravel:
# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV();
Y = np.ravel(Y, order = 'C') # flatten Y to one dimension (row-major order)
clf.fit(X.T, Y);
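That takes care of the first warning. Later cells still use Y in its original array form, however, so Y has to be restored; otherwise operations on a Y in list form raise errors. At the final section, 4.5 Prediction, reload the array form of Y: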
X, Y = load_planar_dataset()
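Reloading works, but a possibly simpler option is to never overwrite Y in the first place. The sketch below assumes Y is a (1, 400) NumPy array; Y_flat and Y_back are illustrative names.

```python
import numpy as np

# Stand-in for the labels (assumed shape (1, 400)).
Y = np.zeros((1, 400), dtype="uint8")

# Keep the flattened copy under its own name for the classifier...
Y_flat = Y.ravel()

# ...so Y keeps its shape for the later cells. If Y was already overwritten,
# reshape recovers the row form without reloading the dataset.
Y_back = Y_flat.reshape(1, -1)

print(Y.shape, Y_flat.shape, Y_back.shape)
```

With this naming, clf.fit would receive Y_flat while the neural-network cells keep using Y.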
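Now for the second warning:
③ WARNING: FutureWarning: The default value of cv will change from 3 to 5 in version 0.22. Specify it explicitly to silence this warning. warnings.warn(CV_WARNING, FutureWarning)
As the message says, the default value of the cv parameter of sklearn.linear_model.LogisticRegressionCV is changing from 3 to 5, so it has to be specified explicitly. Since the version used on Coursera is fairly old, I keep the old default here; with cv=3 the warning disappears: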
# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV(cv = 3); # specify cv = 3
Y = np.ravel(Y, order = 'C')
clf.fit(X.T, Y); # Y is already one-dimensional, so no transpose is needed
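Why does passing cv explicitly silence the message? The pattern behind such a FutureWarning can be sketched with the standard warnings module. fit_with_cv below is a hypothetical stand-in written for illustration; it is not scikit-learn code and only mimics the idea of warning when a changing default is relied on.

```python
import warnings


def fit_with_cv(cv=None):
    """Hypothetical stand-in: warn only when the caller relies on the default."""
    if cv is None:
        warnings.warn(
            "The default value of cv will change from 3 to 5. "
            "Specify it explicitly to silence this warning.",
            FutureWarning,
        )
        cv = 3  # the old default
    return cv


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fit_with_cv()       # relies on the default: one FutureWarning is recorded
    fit_with_cv(cv=3)   # explicit value: nothing is recorded

print(len(caught))
```

The same number of folds is used in both calls; the explicit argument only tells the library that the choice was deliberate.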
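Walkthrough of planar_utils
The plot_decision_boundary(model, X, y) function in planar_utils is rather interesting: it divides the plane into regions according to the predicted class and shows the result graphically. As a beginner, I found it well worth a closer look.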
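The usual way to draw such regions is to predict a label for every point on a dense grid and then color the grid by label. The sketch below shows only that grid step, under stated assumptions: X has shape (2, m), model maps an (n, 2) array of points to n labels, and decision_grid and toy_model are illustrative names rather than the planar_utils implementation.

```python
import numpy as np


def decision_grid(model, X, h=0.5):
    """Predict a label for every point of a grid covering the data.

    X has shape (2, m); model maps an (n, 2) array of points to n labels.
    """
    # Bounds of the data, padded by 1 on each side.
    x_min, x_max = X[0, :].min() - 1, X[0, :].max() + 1
    y_min, y_max = X[1, :].min() - 1, X[1, :].max() + 1
    # Grid of points spaced h apart.
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Flatten the grid into a list of (x, y) points, predict, restore the grid shape.
    grid_points = np.c_[xx.ravel(), yy.ravel()]
    Z = model(grid_points).reshape(xx.shape)
    return xx, yy, Z


def toy_model(points):
    """Toy classifier for the demo: class 1 when the first coordinate is positive."""
    return (points[:, 0] > 0).astype(int)


X_demo = np.array([[-1.0, 1.0], [-1.0, 1.0]])
xx, yy, Z = decision_grid(toy_model, X_demo, h=0.5)
print(Z.shape)
```

The resulting xx, yy and Z can then be passed to plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral) to fill the regions, with the training points drawn on top using scatter.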
(Continuing->)