kaggle菜市场菜鸟 - CSDN Blog

[Original] ML-Watermelonbook - Derivation of the formula showing that generalization error is independent of the algorithm

2025-09-28 19:19:21 93

[Original] ML-Watermelonbook

Today I decided to write things out by hand, hhhh, to understand this formula better.

2025-09-22 18:39:49 162

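[…] find a curve that passes through all the sample points. Clearly many such curves exist in the figure, so our learning algorithm must have some kind of […]. […] more) the corresponding learning algorithm may prefer the smoother curve A in the figure over the more rugged curve B. […] principle, and if we assume that "smoother" means "simpler", the algorithm will naturally prefer curve A. At this point, the learning algorithm's own "preference" plays a key role. […] definition; this is the so-called "hypothesis set"; […] is one point (x, y) in the figure, and we want to learn a […] consistent with […]. For example, similar samples should have similar outputs (among 5 […]. […], so how do we guide the algorithm to establish a relatively "correct" […]. […] before it can produce what it considers a "correct" […].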

2025-09-19 21:09:10 1119

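[Original] ML-Watermelonbook

testing: the process of using the model to predict. […]: the sample being predicted, also called a test instance. y = f(x): after learning f, for […] (denoted x) we can obtain its predicted label y = f(x). […]: generalization ability, i.e. how well the learned model applies to new samples (the learned model must not only do well on the training samples but also apply well to "unseen instances"; a model with strong generalization ability applies well to the whole of X). […] i.i.d. (independent and identically distributed): assume all instances in X follow an unknown distribution "D", and each instance we obtain is drawn independently from this […].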

2025-09-18 19:11:27 793

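[Original] ML-Watermelonbook

d (dimensionality) = the number of features: each instance is described by d features (for example, yesterday's Melbourne housing data used 5 features, so d = 5). D = {x1, x2, x3, x4, x5, x6, ..., xm}: a dataset containing m instances; the dataset is D and x6 is the 6th instance. label: marking those instances in the Melbourne housing dataset whose price is ≥ 1 million AUD (the price is really outcome data, but it is also indispensable for testing goodness of fit). […]", which generally means learning from the training set to build a mapping from X to Y)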
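The notation above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the feature names are taken to be the five Melbourne housing columns, and every number in the toy dataset is invented rather than taken from the real data.

```python
# Assumed feature names (spelled as in the Melbourne housing columns).
feature_names = ["rooms", "bathroom", "landsize", "lattitude", "longtitude"]

# D = {x1, x2, ..., xm}: each instance is a list of d feature values.
# All values below are hypothetical, for illustration only.
D = [
    [3, 1, 450.0, -37.80, 144.99],
    [2, 1, 120.0, -37.81, 144.95],
    [4, 2, 600.0, -37.72, 145.05],
]
prices = [1_250_000, 870_000, 1_480_000]  # hypothetical prices in AUD

m = len(D)     # number of instances in the dataset
d = len(D[0])  # dimensionality = number of features per instance

# label: mark the instances whose price is at least 1 million AUD
labels = [price >= 1_000_000 for price in prices]

print(m, d, labels)
```

Here `m` counts instances and `d` counts features, so adding a row changes `m` while adding a column changes `d`.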

2025-09-17 18:48:16 287

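[Original] ML-Watermelonbook

For example, if we take rooms, bathroom, landsize, lattitude and longtitude as 5 coordinate axes, they span a five-dimensional space for describing properties, and every house has its own coordinate position in that space. Because each point in the space corresponds to a coordinate vector, we also call a sample a […]. On ML: in 1997 Mitchell gave a more formal definition: suppose P is used to evaluate a computer program's performance on some task T; if a program improves its performance on T by using experience E, then we say that, with respect to T and P, the program has learned from E. The collection of these records is called a '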

2025-09-16 14:04:22 403

[Original] Building Your Model-Exercise

[Code] Building Your Model-Exercise.

2024-07-26 22:37:16 236

[Original] Basic Data Exploration-Exercise

[Code] Basic Data Exploration-Exercise.

2024-07-25 20:56:35 947

[Original] 4 Types of Kaggle Competitions

These are comprehensive machine learning challenges posed by difficult, often business-oriented predictive problems. For example: 1⃣️ Using a customer's history of buying insurance to predict the price they are more likely to accept 2⃣️ Predicting the presence an

2024-07-25 00:14:06 1019

[Original] Deep Neural Networks

Like this: max w*x.

2024-06-12 00:23:08 901

[Original] (Mac)Download Kaggle datasets with API

Then a file called kaggle.json will be generated. Why pip3? Cuz my version is python3.11. If the installation is successful, then the part I circled is the location of kaggle. Moved kaggle.json to kaggle. 4---Start Downloading: Search a csv file from the data

2024-06-11 21:20:22 514

[Original] Deep Learning(3)

[Code] Deep Learning(3)

2024-06-10 17:36:41 401

[Original] Deep Learning(2)

w: 2.5, b: 90. y = w0x0 + w1x1 + w2x2 + b.
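A minimal sketch of the linear unit described by that formula may help. The function name `linear_unit` and all input values are illustrative assumptions; only w = 2.5 and b = 90 come from the text, and the three-input weights are invented.

```python
def linear_unit(inputs, weights, bias):
    """Compute w0*x0 + w1*x1 + ... + b for a single linear unit."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# One input with w = 2.5 and b = 90 from the text; x = 5 is a hypothetical input.
y_single = linear_unit([5], [2.5], 90)

# Three inputs: y = w0*x0 + w1*x1 + w2*x2 + b, with hypothetical weights and bias.
y_multi = linear_unit([1.0, 2.0, 3.0], [0.5, -1.0, 2.0], 10.0)

print(y_single, y_multi)
```

With one input the unit is just a line with slope w and intercept b; adding inputs adds one weight per input while the bias stays a single number.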

2024-06-07 09:26:09 297

[Original] Deep Learning(1)

The linear neuron

2024-06-03 18:18:23 448 1

[Original] Random Forest

[Code] Random Forest.

2024-04-06 13:10:54 321 1

[Original] Underfitting and Overfitting

[Code] Underfitting and Overfitting.

2024-04-03 17:47:57 1959 1

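[Original] XGBoost Parameter Tuning

[…] L2 regularization penalizes the leaf scores; the L1 and L2 regularization terms together penalize tree complexity, and the larger the value, the more conservative the model (less overfitting). (4) Lower the learning rate and keep tuning the parameters; suitable candidate learning rates are [0.01, 0.015, 0.025, 0.05, 0.1]. […] The smaller the value, the more complex the model and the more easily it overfits (in a decision tree, a node is split further only if the loss reduction exceeds this value). […] A penalty on each base model that reduces the influence of any single model; the closer the value is to 1, the more easily it overfits, and the closer to 0, the lower the accuracy. (1) Choose a relatively high learning rate, e.g. 0.1, to reduce iteration time. […] The larger the value, the more complex the model and the more easily it overfits.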
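The staged approach in steps (1) and (4) can be sketched as plain parameter dictionaries, without training anything. This is an assumption-laden illustration: the keys `learning_rate`, `gamma`, `reg_alpha` and `reg_lambda` follow the names used by XGBoost's scikit-learn wrapper, the regularization grid values are invented, and choosing the first candidate stands in for a real cross-validated choice.

```python
from itertools import product

# Step (1): start from a relatively high learning rate to keep iterations fast.
base_params = {"learning_rate": 0.1}

# Step (4): candidate learning rates listed in the text.
learning_rate_candidates = [0.01, 0.015, 0.025, 0.05, 0.1]

# Hypothetical grid for the regularization-related parameters.
reg_grid = {
    "reg_alpha": [0.0, 0.1, 1.0],  # L1 regularization term
    "reg_lambda": [1.0, 5.0],      # L2 regularization term
    "gamma": [0.0, 0.1],           # minimum loss reduction needed to split
}


def expand_grid(grid):
    """Return every combination of the grid values as a list of dicts."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


# Intermediate stage: vary regularization at the high learning rate.
candidates = [{**base_params, **combo} for combo in expand_grid(reg_grid)]

# Placeholder: a real run would pick the candidate with the best validation score.
best_so_far = candidates[0]

# Final stage: keep the chosen parameters and sweep the lower learning rates.
final_candidates = [{**best_so_far, "learning_rate": lr} for lr in learning_rate_candidates]

print(len(candidates), len(final_candidates))
```

Each dictionary could then be passed to a model constructor and scored on held-out data; that part is left out here on purpose.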

2024-03-12 01:45:49 2242 1

[Original] Measure Your Model Validation

(Metrics and loss functions - the Error series: mean absolute error, MAE)
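As a rough illustration of the metric named above, MAE is the average of the absolute differences between actual and predicted values. The function below is a hand-written sketch rather than any library's implementation, and the numbers are made up.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical actual and predicted values.
actual = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 230.0]

# Absolute errors are 10, 10 and 30, so the MAE is their mean.
mae = mean_absolute_error(actual, predicted)
print(mae)
```

Because the errors are not squared, one large miss affects MAE less than it would affect a squared-error metric.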

2024-03-11 02:52:16 454

[Original] Building Your Model

🥺🥺。

2024-03-10 02:48:52 406 1

[Original] Basic Data Exploration(2)

[Code] How Models Work-Basic Data Exploration(2)

2024-03-10 01:59:17 1715 1

[Original] Basic Data Exploration(1)

[Code] How models work-Basic Data Exploration.

2024-03-07 23:15:58 448 1

[Original] How Models Work?- Improving the Decision Tree

If there are two decision trees🌲: one with only one branch👆🏻 and the other with many leaves🖐🏻, which do you think is more likely to result from fitting the real estate training data? In fact, both decision trees can be derived from the real estate

2024-03-06 19:08:35 453

[Original] How Models Work?-Introduction(2)

We need to divide houses according to the historical average price. (Because the predicted price for any house under consideration is the historical average price of houses in the same category.) We use data to decide how to break the houses into two groups,
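The idea of predicting with the historical average of a house's group can be sketched as a one-split tree. Everything specific here is an assumption for illustration: splitting on bedroom count, the threshold of 2, the function names, and all prices.

```python
def fit_two_groups(houses, threshold):
    """Split houses by bedroom count and store each group's average price."""
    small = [h["price"] for h in houses if h["bedrooms"] <= threshold]
    large = [h["price"] for h in houses if h["bedrooms"] > threshold]
    if not small or not large:
        raise ValueError("both groups need at least one house")
    return {
        "threshold": threshold,
        "small_mean": sum(small) / len(small),
        "large_mean": sum(large) / len(large),
    }


def predict(model, bedrooms):
    """Predict the average price of the group the house falls into."""
    if bedrooms <= model["threshold"]:
        return model["small_mean"]
    return model["large_mean"]


# Hypothetical historical sales.
history = [
    {"bedrooms": 1, "price": 150_000},
    {"bedrooms": 2, "price": 210_000},
    {"bedrooms": 3, "price": 300_000},
    {"bedrooms": 4, "price": 360_000},
]

model = fit_two_groups(history, threshold=2)
print(predict(model, 2), predict(model, 5))
```

A fuller decision tree would also search for the split that fits the data best and could split each group again; this sketch fixes the split by hand to keep the averaging step visible.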

2024-03-05 18:37:12 354

[Original] How Models Work?-Introduction(1)

1---I will build models as I go through the following scenario: Your cousin has made millions of dollars speculating on real estate. He's offered to become business partners with you because of your interest in data science. He'll supply the money, and you'll s

2024-03-04 19:24:56 465

weixin_59907082's blog

[Original] ML-NFLT

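[Original] Efficient Office Work - Reading the names of all files under a folder

[Original] A must-read for the indecisive - Leave the choice to python~

[Original] Machine Learning - Creating a Student class in object-oriented style, then a Stock class modeled on it

[Original] Machine Learning - Determining a grade from data

[Original] Efficient Office Work - Moving multi-level files from a source folder to a target folder

[Original] Efficient Office Work - Moving second-level files in a folder to a new folder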
