First, import the required modules
import pandas as pd
import matplotlib.pyplot as plt
-----------------------------------------------
# Read and parse the data in train.csv
titanic = pd.read_csv("train.csv")
# Median of the Age column
print(titanic.Age.median())
print("\n")
Output:
28.0
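The median above is computed even though the Age column contains missing values, because pandas skips NaN by default (`skipna=True`). The following is a minimal sketch of that behavior on a small made-up Series; the values are hypothetical and are not taken from train.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Age column (not from train.csv)
ages = pd.Series([22.0, 38.0, np.nan, 35.0, np.nan, 54.0], name="Age")

# Count the missing entries
missing = int(ages.isna().sum())

# By default, median() ignores NaN values (skipna=True)
median_default = ages.median()

# Dropping NaN explicitly gives the same result
median_dropna = ages.dropna().median()

# With skipna=False, any NaN in the data makes the result NaN
median_strict = ages.median(skipna=False)

print(missing, median_default, median_dropna, median_strict)
```

So there is no need to drop the missing ages before calling `median()`.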
-----------------------------------------------
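# Fill all missing values in the Age column with the median
# (fillna returns a new Series; the source data is not modified)
print(titanic.Age.fillna(titanic.Age.median()))
print("\n")
# Print the first five rows
print(titanic.head())
print("\n")
Output: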
   PassengerId  Survived  Pclass  ...     Fare  Cabin  Embarked
0            1         0       3  ...   7.2500    NaN         S
1            2         1       1  ...  71.2833    C85         C
2            3         1       3  ...   7.9250    NaN         S
3            4         1       1  ...  53.1000   C123         S
4            5         0       3  ...   8.0500    NaN         S

[5 rows x 12 columns]
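Because `fillna` returns a new Series, the DataFrame itself still contains the missing ages after the call above. To keep the filled values, one option is to assign the result back to the column. This is a minimal sketch on a made-up DataFrame (`df` and its values are hypothetical and do not come from train.csv):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the Titanic data
df = pd.DataFrame({"Age": [22.0, np.nan, 26.0, np.nan, 35.0]})

# Median of the non-missing ages (22, 26, 35)
median_age = df["Age"].median()

# fillna returns a new Series; df is left unchanged
filled = df["Age"].fillna(median_age)
n_missing_original = int(df["Age"].isna().sum())
n_missing_filled = int(filled.isna().sum())

# Assign back to make the change persistent
df["Age"] = df["Age"].fillna(median_age)

print(n_missing_original, n_missing_filled, df["Age"].tolist())
```

Assigning back is used here instead of `inplace=True` because it makes the modification explicit and works the same way for a single column.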
-----------------------------------------------
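# Inspect the column data types
# (info() prints its summary directly and returns None, so it is not wrapped in print())
titanic.info()
print("\n")
Output: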
Analyzing Titanic Passenger Data with Python and pandas