Python for Data Analysis
Chapter 8: Plotting and Visualization
pandas plotting tools
>>> from pandas.plotting import scatter_matrix
>>> from pandas import Series, DataFrame
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
1. Scatter Matrix Plot
These functions can be imported from pandas.plotting and take a Series or DataFrame as an argument.
In other words, import the plotting helper you need (here scatter_matrix) from the pandas.plotting module and pass it a Series or DataFrame.
>>> df = pd.DataFrame(np.random.randn(1000, 4), columns=['a', 'b', 'c', 'd'])
>>> scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')
>>> plt.show()
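This produces a 4x4 grid of 16 subplots: the diagonal shows a density (KDE) plot for each column, and every other cell is a scatter plot of one column against another.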
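As a quick check of the grid layout described above, the following sketch inspects the object that scatter_matrix returns. It is a minimal illustration under a few assumptions that are not in the original notes: a seeded generator and a smaller sample for reproducibility, the non-interactive "Agg" backend so it runs without a display, and diagonal="hist" instead of "kde" so that SciPy is not required.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
df = pd.DataFrame(rng.standard_normal((200, 4)), columns=["a", "b", "c", "d"])

# scatter_matrix returns a 2-D NumPy array of matplotlib Axes,
# one row and one column per DataFrame column.
axes = scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal="hist")

n_rows, n_cols = axes.shape
print(n_rows, n_cols, axes.size)
```

Each entry axes[i, j] is an ordinary Axes object, so individual panels can be given titles or labels after the call.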
2. Density Plot
You can create density plots using the Series.plot.kde() and DataFrame.plot.kde() methods.
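That is, call Series.plot.kde() or DataFrame.plot.kde() to draw a density plot (a kernel density estimate).
np.random.randn(1000) draws 1000 samples from the standard normal distribution; it returns the samples, not a curve.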
>>> ser = pd.Series(np.random.randn(1000))
>>> ser.plot.kde()
>>> plt.show()
This plots a smooth, roughly bell-shaped curve: the estimated density of the normally distributed samples.
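The text mentions DataFrame.plot.kde() but only demonstrates the Series version. The sketch below shows what the DataFrame call might look like: one density curve per column on a shared Axes. The column names, distribution parameters, seed and "Agg" backend are illustrative assumptions, and plot.kde() needs SciPy to be installed. The last lines numerically integrate one curve with the trapezoidal rule to illustrate that a density should enclose an area close to 1.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # assumption: fixed seed for reproducibility
df = pd.DataFrame({
    "narrow": rng.normal(loc=0.0, scale=1.0, size=1000),  # hypothetical column
    "wide": rng.normal(loc=2.0, scale=3.0, size=1000),    # hypothetical column
})

# One KDE curve is drawn per column on a single Axes (requires SciPy).
ax = df.plot.kde()
lines = ax.get_lines()

# Trapezoidal-rule integral of the first curve over the plotted range.
x, y = lines[0].get_data()
x = np.asarray(x, dtype=float)
y = np.asarray(y, dtype=float)
area = float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))
print(len(lines), round(area, 2))
```

Plotting both columns together makes it easy to compare their spread: the column with the larger standard deviation gives a lower, wider curve.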
3. Andrews Curves
Andrews curves allow one to plot multivariate data as a large number of curves that are created using the attributes of samples as coefficients for Fourier series. By coloring these curves differently for each class it is possible to visualize data clustering. Curves belonging to samples of the same class will usually