Building Your Model

I've been looking forward to this moment for a long time!

1---Which library to call to create?🤔

I will use the scikit-learn library to create my models. When coding, this library is written as sklearn. Scikit-learn is easily the most popular library for modeling the types of data typically stored in DataFrames.

2---What are the steps to build and use a model?😯

  • Define: What type of model will it be? A decision tree? Some other type of model? Some other parameters of the model type are specified too.
  • Fit: Capture patterns from provided data. This is the heart of modeling.
  • Predict: Just what it sounds like
  • Evaluate: Determine how accurate the model's predictions are.

Here is an example of defining a decision tree model with scikit-learn and fitting it with the features and target variable💁🏼‍♀️:

from sklearn.tree import DecisionTreeRegressor

# Define model. Specify a number for random_state to ensure same results ea
#ch run. This is considered a good practice. 
#You use any number, and model quality won't depend meaningfully on exact
#ly what value you choose.
melbourne_model = DecisionTreeRegressor(random_state=1)

# Fit model
melbourne_model.fit(X, y)
Out[6]: DecisionTreeRegressor(random_state=1)

3---Let's get started!

Now I have a fitted model that we can use to predict☝️.

In practice, I am interested in predicting the price of new houses coming on the market rather than the houses I already have prices for.

But I still need to predict the first few rows of the training data to see how the predict function works🧐.

#To choose variables/columns,
#we’ll need to look at a list of all the columns in the dataset.
import pandas as pd
melbourne_file_path = '/Users/mac/Desktop/melb_data.csv'
melbourne_data = pd.read_csv(melbourne_file_path) 
melbourne_data.columns
#Selecting the prediction target: y = 'Price'
y = melbourne_data.Price
#For now, we'll build a model with only a few features
melbourne_features = ['Rooms', 'Bathroom', 'Landsize', 
                      'Lattitude', 'Longtitude']
X = melbourne_data[melbourne_features]
#Predict house prices using the head method
X.describe()
X.head()
from sklearn.tree import DecisionTreeRegressor

# Define model. Specify a number for random_state to ensure same results each run
melbourne_model = DecisionTreeRegressor(random_state=1)

# Fit model
melbourne_model.fit(X, y)
print("Making predictions for the following 5 houses:")
print(X.head())
print("The predictions are")
print(melbourne_model.predict(X.head()))
Making predictions for the following 5 houses:
   Rooms  Bathroom  Landsize  Lattitude  Longtitude
0      2       1.0     202.0   -37.7996    144.9984
1      2       1.0     156.0   -37.8079    144.9934
2      3       2.0     134.0   -37.8093    144.9944
3      3       2.0      94.0   -37.7969    144.9969
4      4       1.0     120.0   -37.8072    144.9941
The predictions are
[1480000. 1035000. 1465000.  850000. 1600000.]

评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值