House Price Prediction in Python (Part 1)

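This article walks through a house price prediction project using data from Kaggle. It covers the analysis steps in detail: understanding the project goal, exploring the data, selecting features, processing the data, and building a model. Using tools such as the Pearson correlation coefficient, box plots, and heat maps, the author picks out key features such as 'OverallQual', 'GrLivArea', 'GarageCars', and 'TotalBsmtSF', and discusses how to handle missing values and build an initial linear regression model.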

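This is a house price prediction project, based on the housing competition on Kaggle.

The goal is to predict house prices: from the many candidate factors, we need to pick the ones that best predict the price and use them to build a model.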

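Analysis steps:

1. Understand the goal of the project, then organize the analysis around it. Here the goal is to predict house prices from the data;

2. Understand how the data is distributed and, using knowledge of the actual project, what each column means. In data analysis the most important thing is to know the business domain; analysis built on that knowledge gets far more done with far less effort;

3. Select features. Characterize the relationship between each feature and the target variable and identify the most important features; at the same time, to avoid multicollinearity, when two features are very strongly correlated with each other, drop one of them;

4. Apply cross-validation to build a suitable model on the training set, then evaluate it on the test set;

5. Build the final model for predicting house prices.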
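Steps 3 and 4 above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the code used in this project: the data is synthetic (generated with NumPy, not the Kaggle file), the column names merely imitate the real ones, `GarageArea` and `Noise` are hypothetical stand-ins for a near-duplicate and an irrelevant feature, and the helper `select_features` with its two thresholds (0.3 and 0.9) is an arbitrary choice made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the housing data (NOT the Kaggle file).
rng = np.random.default_rng(0)
n = 300
qual = rng.integers(1, 11, n).astype(float)       # imitates OverallQual (1..10)
area = rng.normal(1500, 400, n)                   # imitates GrLivArea
cars = rng.integers(0, 4, n).astype(float)        # imitates GarageCars (0..3)
garage_area = cars * 250 + rng.normal(0, 40, n)   # nearly duplicates GarageCars
noise_col = rng.normal(0, 1, n)                   # unrelated to the price
price = 12000 * qual + 80 * area + 25000 * cars + rng.normal(0, 10000, n)

df = pd.DataFrame({
    "OverallQual": qual,
    "GrLivArea": area,
    "GarageCars": cars,
    "GarageArea": garage_area,
    "Noise": noise_col,
    "SalePrice": price,
})


def select_features(data, target, min_target_corr=0.3, max_pair_corr=0.9):
    """Keep features correlated with the target, dropping near-duplicates.

    Hypothetical helper: both thresholds are illustrative assumptions.
    """
    corr = data.corr(method="pearson")
    # Rank candidate features by absolute Pearson correlation with the target.
    ranked = corr[target].drop(target).abs().sort_values(ascending=False)
    selected = []
    for name, strength in ranked.items():
        if strength < min_target_corr:
            break  # remaining features are even weaker
        # Skip a feature that is strongly correlated with one already kept.
        if all(abs(corr.loc[name, kept]) < max_pair_corr for kept in selected):
            selected.append(name)
    return selected


features = select_features(df, "SalePrice")

# Hold out a test set, then cross-validate on the training part only.
X, y = df[features], df["SalePrice"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")

# Fit on the full training set and check once on the held-out test set.
model.fit(X_train, y_train)
test_r2 = model.score(X_test, y_test)

print("selected features:", features)
print("mean CV R^2:", cv_scores.mean())
print("test R^2:", test_r2)
```

Of the two garage columns only one should survive the selection, because their pairwise correlation is above the 0.9 cutoff; which one is kept depends on which happens to correlate slightly more strongly with the price.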

The practical part follows:

1. First, look at the data

Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.

With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

  • SalePrice - the property's sale price in dollars. This is the target variable that you're trying to predict.
  • MSSubClass: The building class
  • MSZoning: The general zoning classification
  • LotFrontage: Linear feet of street connected to property
  • LotArea: Lot size in square feet
  • Street: Type of road access
  • Alley: Type of alley access
  • LotShape: General shape of property
  • LandContour: Flatness of the property
  • Utilities: Type of utilities available
  • LotConfig: Lot configuration
  • LandSlope: Slope of property
  • Neighborhood: Physical locations within Ames city limits
  • Condition1: Proximity to main road or railroad
  • Condition2: Proximity to main road or railroad (if a second is present)
  • BldgType: Type of dwelling
  • HouseStyle: Style of dwelling
  • OverallQual: Overall material and finish quality
  • OverallCond: Overall condition rating
  • YearBuilt: Original construction date
  • YearRemodAdd: Remodel date
  • RoofStyle: Type of roof
  • RoofMatl: Roof material
  • Exterior1st: Exterior covering on house
  • Exterior2nd: Exterior covering on house (if more than one material)
  • MasVnrType: Masonry veneer type
  • MasVnrArea: Masonry veneer area in square feet
  • ExterQual: Exterior material quality
  • ExterCond: Present condition of the material on the exterior
  • Foundation: Type of foundation
  • BsmtQual: Height of the basement
  • BsmtCond: General condition of the basement
  • BsmtExposure: Walkout or garden level basement walls
  • BsmtFinType1: Quality of basement finished area
  • BsmtFinSF1: Type 1 finished square feet
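As a first look at data with fields like these, one would typically check the shape, the column types, the distribution of the target, and which columns have missing values. The sketch below shows one way to do that with pandas. It is an illustration only: the small table is made up by hand using a few of the column names above (the values are invented, not rows from the Kaggle data), and `missing_summary` is a hypothetical helper. With the real data, the DataFrame would presumably come from `pd.read_csv` on the downloaded training file instead.

```python
import numpy as np
import pandas as pd

# Hand-made sample using a few column names from the list above.
# All values are invented for illustration.
sample = pd.DataFrame({
    "LotFrontage": [60.0, 75.0, np.nan, 55.0, 90.0],
    "LotArea": [8000, 9500, 11000, 7200, 12500],
    "Alley": [np.nan, np.nan, "Grvl", np.nan, np.nan],
    "OverallQual": [6, 5, 7, 4, 8],
    "SalePrice": [150000, 180000, 210000, 130000, 260000],
})


def missing_summary(data):
    """Return the count and ratio of missing values per column, worst first.

    Hypothetical helper; columns without missing values are left out.
    """
    counts = data.isnull().sum()
    summary = pd.DataFrame({"missing": counts, "ratio": counts / len(data)})
    summary = summary[summary["missing"] > 0]
    return summary.sort_values("missing", ascending=False)


summary = missing_summary(sample)

print(sample.shape)                    # number of rows and columns
print(sample.dtypes)                   # numeric vs. categorical columns
print(sample["SalePrice"].describe())  # distribution of the target variable
print(summary)                         # columns that need missing-value handling
```

A column that is missing in most rows (like `Alley` in this made-up sample) is a candidate for dropping, while a column with only a few gaps (like `LotFrontage` here) is a candidate for filling in.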