[Machine Learning: Andrew Ng deeplearning.ai_01_week3 problem summary]

神龍Str

Last modified 2022-07-22 15:56:51

Category: Deep learning. Tags: artificial intelligence, machine learning, python

First published 2022-07-17 15:34:48

Article link: https://blog.youkuaiyun.com/qq_43502119/article/details/125831884


Problem summary

deeplearning.ai_01_week3 (Planar data classification with one hidden layer)

① ERROR: ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 400, 'y' with size 400.

Running the Planar data classification with one hidden layer notebook raises this ValueError. The message says that scatter sees c as having only 1 element: Y has shape (1, 400), a single row, so it does not match the 400 values in x and y. This is most likely caused by a library version newer than the one the course was written for.

Check the documentation for scatter() (matplotlib scatter plot):

matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, *, edgecolors=None, plotnonfinite=False, data=None, **kwargs)

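Parameter	Meaning
x, y	Arrays of equal length: the data points to plot.
s	Marker size; may also be an array giving one size per point. Older versions default to 20; current versions use rcParams['lines.markersize'] ** 2.
c	Marker color: a single color, a sequence of colors, a sequence of numbers mapped to colors through cmap, or a 2D array whose rows are RGB or RGBA values. Older versions default to blue 'b'.
marker	Marker style, default circle 'o'.
cmap	A Colormap instance or the name of a registered colormap; used only when c is an array of floats. Defaults to None, in which case the rc setting image.cmap is used.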

An example:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([1, 4, 9, 16, 7, 11, 23, 18])
colors = np.array(["red","green","black","orange","purple","beige","cyan","magenta"])

plt.scatter(x, y, c=colors)
plt.show()

So colors should be a 1-D array or a list of colors, with one entry per point.

Method 1: pass Y to c as a 1-D array

Change the scatter call to:

plt.scatter(X[0, :], X[1, :], c=Y[0,:], s=40, cmap=plt.cm.Spectral);

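Note: the week 3 notebook (Planar data classification with one hidden layer) has many calls like this one. The decision-boundary plots, for example, often fail in the same way; remember to pass Y[0, :] instead of Y.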
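The shape difference behind Method 1 can be checked without plotting. This is a minimal sketch; the label array is a stand-in built with np.zeros that only mimics the (1, 400) shape of the course labels, not the real dataset:

```python
import numpy as np

# Stand-in for the course labels: one row of 400 values (assumed shape).
Y = np.zeros((1, 400), dtype="uint8")
Y[0, 200:] = 1

print(Y.shape)        # (1, 400): a 2-D array with a single row
print(Y[0, :].shape)  # (400,): the 1-D form that matches x and y

# Three equivalent ways to get the 1-D form
flat_a = Y[0, :]
flat_b = Y.ravel()
flat_c = np.squeeze(Y)
```

Any of the three 1-D forms can be passed as c, since each has one entry per plotted point.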

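Method 2: build Y as a list (worth a try)

My approach is to modify load_planar_dataset() in planar_utils.py so that Y holds conventional color strings ('red' for 0, 'blue' for 1). With this method, later parts of the assignment need small changes, but working through them deepens your understanding of some expressions and functions, so it is not a bad approach: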

def load_planar_dataset():
    np.random.seed(1)
    m = 400 # number of examples
    N = int(m/2) # number of points per class
    D = 2 # dimensionality
    X = np.zeros((m,D)) # data matrix where each row is a single example
    
    ### START ORIGINAL CODE
    # labels vector (0 for red, 1 for blue)
    # Y = np.zeros((m,1), dtype='uint8') 
    ### END ORIGINAL CODE
    
    ### START MY CODE
    # List can contain various strings
    Y = list()
    ### END MY CODE

    a = 4 # maximum ray of the flower

    for j in range(2):
        ix = range(N*j,N*(j+1))
        t = np.linspace(j*3.12,(j+1)*3.12,N) + np.random.randn(N)*0.2 # theta
        r = a*np.sin(4*t) + np.random.randn(N)*0.2 # radius
        X[ix] = np.c_[r*np.sin(t), r*np.cos(t)]
        ### START ORIGINAL CODE
        # Y[ix] = j
        ### END ORIGINAL CODE
        
        ### START MY CODE
        Y[N*j: N*(j+1)] = ['red']*len(ix) if j==0 else ['blue']*len(ix)
        ### END MY CODE
    X = X.T
    # make the following line commented
    # Y = Y.T

    return X, Y

With this change the error goes away.
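An alternative that leaves planar_utils.py untouched is to build the color names from the numeric labels at plot time. This is a sketch rather than the author's method, and the label array is again a stand-in with the same assumed shape as the course data:

```python
import numpy as np

# Stand-in labels with shape (1, 400); the real ones come from load_planar_dataset().
Y = np.zeros((1, 400), dtype="uint8")
Y[0, 200:] = 1

# Map 0 -> 'red' and 1 -> 'blue', producing a 1-D array of color names.
colors = np.where(Y.ravel() == 0, "red", "blue")

print(colors.shape)           # (400,)
print(colors[0], colors[-1])  # red blue
```

The colors array can then be passed as c=colors to plt.scatter, while Y keeps its numeric form for the rest of the assignment.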

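② WARNING: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().

Running the cell produces this warning, followed by a further warning. The code still runs, but the message asks for Y.T to be reshaped to (n_samples,); in this cell Y should have shape (400,).

Interestingly, Professor Ng points out earlier in the course that an object with shape (n,) is a "rank-1 array", which is neither a row vector nor a column vector. In the code below it does not matter what colors contains; the point is that colors.shape is (8,):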

colors = np.array(["red","green","black","orange","purple","beige","cyan","magenta"])
print(colors.shape)
>>>(8,)

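The warning is asking for this form, so follow its hint ("for example using ravel()") and rewrite Y.

As usual, first look at a tutorial example (from the Runoob tutorial, 菜鸟教程). Note that it demonstrates the flat iterator, which visits the elements in the same row-major order that ravel uses by default: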

import numpy as np

a = np.arange(9).reshape(3, 3)
print('Original array:')
for row in a:
    print(row)

# To process every element of the array, use the flat attribute,
# which is an iterator over the array's elements:
print('Array after iteration:')
for element in a.flat:
    print(element)
>>>--------------------Output--------------------------
>>>Original array:
>>>[0 1 2]
>>>[3 4 5]
>>>[6 7 8]
>>>Array after iteration:
>>>0
>>>1
>>>2
>>>3
>>>4
>>>5
>>>6
>>>7
>>>8

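So flatten Y with ravel. Note that order='C' means row-major (row by row), not column by column; for a (1, 400) array the result is the same 400 labels whichever order is chosen:

# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV();
Y = np.ravel(Y, order='C')  # flatten Y to shape (400,), reading in row-major order
clf.fit(X.T, Y);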
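On the order argument used above: ravel returns a flattened 1-D array, and order controls the reading direction. A short sketch on a 3x3 array shows the difference:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

row_major = np.ravel(a, order="C")  # read row by row (the default)
col_major = np.ravel(a, order="F")  # read column by column

print(row_major)  # [0 1 2 3 4 5 6 7 8]
print(col_major)  # [0 3 6 1 4 7 2 5 8]
```

For an array with a single row, such as the (1, 400) labels, both orders give the same result.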

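That removes the first warning. Later cells still need Y in its original 2-D array form, and using the flattened Y in those computations raises errors, so Y has to be restored. At the final step, 4.5 Prediction, load the array form of Y again: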

X, Y = load_planar_dataset()
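Reloading works, but it can be avoided by keeping the flattened labels in a separate variable. This is a sketch assuming a (1, 400) label array; Y_flat and Y_restored are hypothetical names, not ones used in the assignment:

```python
import numpy as np

# Stand-in for the course labels (assumed shape (1, 400)).
Y = np.zeros((1, 400), dtype="uint8")
Y[0, 200:] = 1

Y_flat = Y.ravel()            # shape (400,), suitable for clf.fit
print(Y.shape, Y_flat.shape)  # (1, 400) (400,)

# If Y has already been overwritten, reshape restores the row-vector form.
Y_restored = Y_flat.reshape(1, -1)
print(Y_restored.shape)       # (1, 400)
```

Because ravel does not change the shape of Y itself, the later cells can keep using Y as before.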

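Now for the second warning:

③ WARNING: FutureWarning: The default value of cv will change from 3 to 5 in version 0.22. Specify it explicitly to silence this warning. warnings.warn(CV_WARNING, FutureWarning)

According to the message, the default value of the cv parameter of sklearn.linear_model.LogisticRegressionCV changes from 3 to 5 in version 0.22, so it needs to be specified explicitly. Since the Coursera environment uses an older version, I chose cv=3, which keeps the old behavior, and the warning disappears: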

# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV(cv=3);  # specify cv = 3 explicitly
Y = np.ravel(Y, order='C')
clf.fit(X.T, Y);  # Y is already 1-D, so no transpose is needed
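Both fixes can be tried together outside the notebook. The sketch below uses synthetic stand-in data in the course layout (X of shape (2, m), Y of shape (1, m)), not the flower dataset, and assumes scikit-learn is installed:

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(0)

# Synthetic stand-in data: two overlapping Gaussian blobs, one per class.
m = 60
X = np.hstack([rng.normal(-1.0, 1.0, (2, m // 2)),
               rng.normal(1.0, 1.0, (2, m // 2))])
Y = np.hstack([np.zeros((1, m // 2), dtype=int),
               np.ones((1, m // 2), dtype=int)])

clf = LogisticRegressionCV(cv=3)  # cv given explicitly
clf.fit(X.T, Y.ravel())           # samples as rows, labels as a 1-D array

predictions = clf.predict(X.T)
print(predictions.shape)          # (60,)
```

Passing Y.ravel() only at the call site leaves Y in its (1, m) form for any later code that expects it.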