《learning from data》读书笔记---第一章: The Learning Problem

本文是《Learning from Data》第一章的学习笔记,主要讨论了学习问题的设置,包括监督学习、非监督学习和强化学习等不同类型,并探讨了学习的可行性,强调了误差和噪声在学习过程中的影响。文中提到了感知机算法作为简单学习模型的例子,并分析了有限数据集如何揭示目标函数的挑战。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

1 The Learning Problem

learning from data instead of analytic solution

1.1 Problem Setup

电影推荐系统:需要给电影评分,以确定是否给用户推荐此电影

模型基本步骤:基于之前的用户评分:1.构建向量描述电影 2.构建向量描述用户 3.计算这两个向量的相似度,预测评分

 It starts with random factors, then tunes these factors to make them more and more aligned with how viewers have rated movies before, until they are ultimately able to predict how viewers rate movies in general.

 1.1.1 Components of Learning 

There is a target to be learned. It is unknown to us. We have a set of examples generated by the target. The learning algorithm uses these examples to look for a hypothesis that approximates the target.  

1.1.2 A Simple Learning Model 

perceptron learning algorithm (PLA感知机算法)

 b与threshold有关,对于二维模型来说,分界线为:w1x1 + w2x2 + b = 0

PLA算法:1.初始化模型,w0->零向量 2.选择一个被误分类的记录用于更新w(t),其中t为迭代次数,w(t + 1) = w(t) + y(t)x(t),直到不存在误分类记录,停止迭代。更新的物理意义: moves in the direction of classifying x(t) correctly

课后练习

1.1. 3 Learning versus Design 

In the design approa

earning Data Mining with Python - Second Edition by Robert Layton English | 4 May 2017 | ASIN: B01MRP7VFV | 358 Pages | AZW3 | 2.85 MB Key Features Use a wide variety of Python libraries for practical data mining purposes. Learn how to find, manipulate, analyze, and visualize data using Python. Step-by-step instructions on data mining techniques with Python that have real-world applications. Book Description This book teaches you to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis. This book covers a large number of libraries available in Python, including the Jupyter Notebook, pandas, scikit-learn, and NLTK. You will gain hands on experience with complex data types including text, images, and graphs. You will also discover object detection using Deep Neural Networks, which is one of the big, difficult areas of machine learning right now. With restructured examples and code samples updated for the latest edition of Python, each chapter of this book introduces you to new algorithms and techniques. By the end of the book, you will have great insights into using Python for data mining and understanding of the algorithms as well as implementations. What you will learn Apply data mining concepts to real-world problems Predict the outcome of sports matches based on past results Determine the author of a document based on their writing style Use APIs to download datasets from social media and other online services Find and extract good features from difficult datasets Create models that solve real-world problems Design and develop data mining applications using a variety of datasets Perform object detection in images using Deep Neural Networks Find meaningful insights from your data through intuitive visualizations Compute on big data, including real-time data from the internet About the Author Robert Layton is a data scientist working mainly on text mining problems for industries including the finance, information security, and transport sectors. He runs dataPipeline to build algorithms for practical use, and Eurekative, helping bringing start-ups to life in regional Australia. He has presented at the last four PyCon AU conferences, at multiple international research conferences, and has been training in some capacity for five years. He has a PhD in cybercrime analytics from the Internet Commerce Security Laboratory at Federation University Australia, where he was the Inaugural Young Alumni of the Year in 2014 and is currently and Honorary Research Fellow. You can find him on LinkedIn at https://www.linkedin.com/in/drrobertlayton and on Twitter at @robertlayton. Robert writes regularly on data mining and cybercrime, in a private, consultancy, and a research capacity. Robert is an Official Member of the Ballarat Hackerspace, where he helps grow the future-tech sector in regional Victoria.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值