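Machine Learning Fundamentals: Multivariate Analysis and Regression Models
1. Multivariate Analysis
In multivariate analysis, we try to establish the relationships among all the variables. Taking the Iris dataset as an example, we can compute the mean of each feature, grouped by species.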
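To make the idea of a per-species mean concrete, here is a minimal standard-library sketch of the aggregation. The `rows` list and the `group_means` helper are hypothetical names introduced for illustration, and the numbers are made-up toy values, not measurements from the Iris dataset.

```python
from collections import defaultdict
from statistics import mean

# Toy rows of (species, sepal length, petal length). The values are made up
# for illustration and are NOT taken from the Iris dataset.
rows = [
    ("setosa", 5.0, 1.5),
    ("setosa", 5.5, 1.0),
    ("versicolor", 6.0, 4.0),
    ("versicolor", 6.5, 4.5),
]


def group_means(rows):
    """Return {label: [mean of each numeric column]} for rows of (label, *values)."""
    groups = defaultdict(list)
    for label, *values in rows:
        groups[label].append(values)
    # zip(*members) transposes a group's rows into columns before averaging
    return {
        label: [mean(column) for column in zip(*members)]
        for label, members in groups.items()
    }


means = group_means(rows)
for label, values in means.items():
    print(label, values)
```

On a pandas DataFrame, `groupby(...).mean()` performs the same kind of aggregation in a single call.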
import matplotlib.pyplot as plt

# Assumes `iris` is already loaded as a pandas DataFrame with a "species" column
# Mean of each feature, grouped by species
species_means = iris.groupby(by="species").mean()
print(species_means)

# Bar plot of the per-species feature means
species_means.plot(kind="bar")
plt.title("Class vs Measurements")
plt.ylabel("mean measurement (cm)")
plt.xticks(rotation=0)  # keep the x tick labels horizontal
plt.grid(True)
# Use bbox_to_anchor to place the legend outside the plot area to keep it tidy
plt.legend(loc="upper left", bbox_to_anchor=(1, 1))
plt.show()
Running the code above produces the following output:
| species | sepallength(cm) | sepalwidth(cm) | petallength(cm) | petalwidth(cm) |
| ---- | ---- | ---- | ---- | ---- |
| setosa | 5.006 | 3.418 | 1.464