Machine Learning week 6 quiz: Machine Learning System Design

License: CC 4.0 BY-SA

Tags:

#Machine Learning #quiz #System Design #Coursera #Andrew Ng

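This article covers key concepts in machine learning system design: labeling positive and negative classes, computing recall, the conditions under which training on a very large dataset helps, the effect of adjusting the decision threshold on classifier performance, and how to evaluate a model on different datasets.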

Machine Learning System Design

5 questions

You are working on a spam classification system using regularized logistic regression. "Spam" is a positive class (y = 1) and "not spam" is the negative class (y = 0). You have trained your classifier and there are m = 1000 examples in the cross-validation set. The chart of predicted class vs. actual class is:

                      Actual Class: 1    Actual Class: 0
Predicted Class: 1    85                 890
Predicted Class: 0    15                 10

For reference:

Accuracy = (true positives + true negatives) / (total examples)
Precision = (true positives) / (true positives + false positives)
Recall = (true positives) / (true positives + false negatives)
F1 score = (2 * precision * recall) / (precision + recall)

What is the classifier's recall (as a value from 0 to 1)?

Enter your answer in the box below. If necessary, provide at least two values after the decimal point.
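The reference formulas above can be evaluated directly from the four cells of the table. The following is a minimal sketch in plain Python; the variable names are chosen here for illustration and are not part of the quiz.

```python
# Confusion-matrix counts read from the table in the question.
tp = 85   # predicted 1, actual 1 (true positives)
fp = 890  # predicted 1, actual 0 (false positives)
fn = 15   # predicted 0, actual 1 (false negatives)
tn = 10   # predicted 0, actual 0 (true negatives)

total = tp + fp + fn + tn  # should match m = 1000

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note that recall uses only the "Actual Class: 1" column, while precision uses only the "Predicted Class: 1" row.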

Suppose a massive dataset is available for training a learning algorithm. Training on a lot of data is likely to give good performance when two of the following conditions hold true.

Which are the two?

The classes are not too skewed.

A human expert on the application domain can confidently predict y when given only the features x (or more generally, if we have some way to be confident that x contains sufficient information to predict y accurately).

Our learning algorithm is able to represent fairly complex functions (for example, if we train a neural network or other model with a large number of parameters).

When we are willing to include high order polynomial features of x (such as x_1^2, x_2^2, x_1 x_2, etc.).

Suppose you have trained a logistic regression classifier which is outputting hθ(x).

Currently, you predict 1 if hθ(x) ≥ threshold, and predict 0 if hθ(x) < threshold, where currently the threshold is set to 0.5.

Suppose you decrease the threshold to 0.1. Which of the following are true? Check all that apply.

The classifier is likely to now have higher recall.

The classifier is likely to have unchanged precision and recall, but higher accuracy.

The classifier is likely to now have higher precision.

The classifier is likely to have unchanged precision and recall, but lower accuracy.
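To see how the threshold interacts with precision and recall, one can recompute both metrics at 0.5 and at 0.1 on a small set of scores. The sketch below uses made-up scores and labels (they are assumptions for illustration, not data from the quiz), and the helper name `precision_recall` is hypothetical.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when predicting 1 for score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    # Guard against division by zero when nothing is predicted positive
    # or when there are no actual positives.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up classifier outputs h(x) and true labels, for illustration only.
scores = [0.95, 0.80, 0.60, 0.45, 0.30, 0.20, 0.15, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

p_hi, r_hi = precision_recall(scores, labels, 0.5)
p_lo, r_lo = precision_recall(scores, labels, 0.1)

print(f"threshold 0.5: precision={p_hi:.3f} recall={r_hi:.3f}")
print(f"threshold 0.1: precision={p_lo:.3f} recall={r_lo:.3f}")
```

Lowering the threshold can only add predicted positives, so the number of false negatives cannot increase and recall cannot go down. Precision has no such guarantee, because the extra predicted positives may be mostly false positives.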

Suppose you are working on a spam classifier, where spam emails are positive examples (y=1) and non-spam emails are negative examples (y=0). You have a training set of emails in which 99% of the emails are non-spam and the other 1% is spam. Which of the following statements are true? Check all that apply.

If you always predict non-spam (output y=0), your classifier will have 99% accuracy on the training set, and it will likely perform similarly on the cross validation set.

If you always predict non-spam (output y=0), your classifier will have an accuracy of 99%.

A good classifier should have both a high precision and high recall on the cross validation set.

If you always predict non-spam (output y=0), your classifier will have 99% accuracy on the training set, but it will do much worse on the cross validation set because it has overfit the training data.
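The "always predict non-spam" baseline from the statements above is easy to simulate. This is a minimal sketch assuming a hypothetical set of 1000 emails with the 99% / 1% split described in the question; the set size is an assumption made for the example.

```python
# Hypothetical skewed set: 990 non-spam (y=0) and 10 spam (y=1).
labels = [0] * 990 + [1] * 10

# A "classifier" that ignores its input and always outputs y=0.
preds = [0] * len(labels)

correct = sum(1 for p, y in zip(preds, labels) if p == y)
accuracy = correct / len(labels)

tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

Because this baseline has no learned parameters, its accuracy depends only on the class proportions of whatever set it is scored on. Its precision is undefined here (no example is predicted positive), which is one reason accuracy alone is a poor summary on skewed classes.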

Which of the following statements are true? Check all that apply.

It is a good idea to spend a lot of time collecting a large amount of data before building your first version of a learning algorithm.

If your model is underfitting the training set, then obtaining more data is likely to help.

On skewed datasets (e.g., when there are more positive examples than negative examples), accuracy is not a good measure of performance and you should instead use F1 score based on the precision and recall.

After training a logistic regression classifier, you must use 0.5 as your threshold for predicting whether an example is positive or negative.

Using a very large training set makes it unlikely for the model to overfit the training data.
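As background for the statements about thresholds and F1 score, one common approach is to try several candidate thresholds on a cross-validation set and compare their F1 scores. The sketch below assumes made-up cross-validation scores and labels and a hypothetical helper `f1_at`; it is an illustration, not the quiz's reference solution.

```python
def f1_at(scores, labels, threshold):
    """F1 score when predicting 1 for score >= threshold (0.0 if undefined)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    denom = 2 * tp + fp + fn
    # 2*TP / (2*TP + FP + FN) is algebraically the same as
    # 2 * precision * recall / (precision + recall).
    return 2 * tp / denom if denom else 0.0

# Made-up cross-validation outputs and labels, for illustration only.
cv_scores = [0.95, 0.80, 0.60, 0.45, 0.30, 0.20, 0.15, 0.05]
cv_labels = [1, 1, 0, 1, 0, 1, 0, 0]

candidates = [0.1, 0.3, 0.5, 0.7]
f1_by_threshold = {t: f1_at(cv_scores, cv_labels, t) for t in candidates}
best_threshold = max(candidates, key=lambda t: f1_by_threshold[t])

for t in candidates:
    print(f"threshold={t:.1f} f1={f1_by_threshold[t]:.3f}")
print("best threshold on this toy set:", best_threshold)
```

On this toy data the highest F1 is not at 0.5, which illustrates that the threshold is a tunable choice rather than a fixed constant. The specific winner depends entirely on the made-up numbers.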

Author: GarfieldEr007