1. Creating a DataFrame
import pandas as pd
fruit = pd.DataFrame({'Apple': [35, 41, 50], 'Bananas': [21, 34, 10]}, index=['2017 Sales', '2018 Sales', '2019 Sales'])
Importing data from a CSV file
s3 = pd.read_csv('Titanic.csv')
Saving to a CSV file
s3.to_csv('testsave.csv')
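A minimal sketch of the save/load round trip. It uses a temporary file instead of Titanic.csv, and the file name fruit.csv is an assumption made up for illustration. The point it shows: to_csv writes the index as the first column by default, so reading the file back needs index_col=0 to restore it.

```python
import os
import tempfile

import pandas as pd

fruit = pd.DataFrame(
    {'Apple': [35, 41, 50], 'Bananas': [21, 34, 10]},
    index=['2017 Sales', '2018 Sales', '2019 Sales'],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'fruit.csv')  # hypothetical file name

    # to_csv writes the index as the first column by default.
    fruit.to_csv(path)

    # Without index_col, the saved index comes back as an ordinary column.
    plain = pd.read_csv(path)

    # index_col=0 turns the first column back into the index.
    restored = pd.read_csv(path, index_col=0)
```

If the index carries no information, passing index=False to to_csv leaves it out of the file altogether.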
2. Creating a Series
s2 = pd.Series(['4 cups', '1 cup', '2 large', '1 can'], index=['Flour', 'Milk', 'Eggs', 'Spam'])
If index is omitted, it defaults to the positional labels 0, 1, 2, ...
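A short sketch of the default index. The variable names here are made up for illustration; the values are the ones from the Series above.

```python
import pandas as pd

# No index given: pandas assigns the labels 0, 1, 2, ...
amounts = pd.Series(['4 cups', '1 cup', '2 large', '1 can'])

default_labels = list(amounts.index)
first_by_label = amounts[0]  # label 0 happens to equal position 0 here

# With explicit string labels, use .iloc for positional access.
labelled = pd.Series(['4 cups', '1 cup'], index=['Flour', 'Milk'])
first_by_position = labelled.iloc[0]
```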
Deleting
s2.drop('Milk')  # returns a new Series; s2 itself is unchanged
Modifying
s2['Milk'] = '3 cups'
Looking up
s2['Milk']
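The three operations above can be sketched together as follows. The labels 'Sugar' and 'Butter' and the variable names are assumptions added for illustration. The sketch shows that drop leaves the original Series alone, while assignment through a label changes it.

```python
import pandas as pd

s2 = pd.Series(['4 cups', '1 cup', '2 large', '1 can'],
               index=['Flour', 'Milk', 'Eggs', 'Spam'])

# drop returns a new Series; s2 still contains 'Milk'.
without_milk = s2.drop('Milk')

# Assignment through an existing label modifies s2 itself.
s2['Milk'] = '3 cups'

# Assigning to a label that does not exist yet appends a new entry.
s2['Sugar'] = '1 tbsp'  # 'Sugar' is a made-up label

# Lookup by label; .get returns a default instead of raising KeyError.
milk = s2['Milk']
missing = s2.get('Butter', 'not listed')
```

To keep the result of a deletion, assign it back: s2 = s2.drop('Milk').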
3. Pandas functions I have used
1) apply
DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
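# Assumes review_points_mean was computed earlier, e.g. reviews.points.mean().
def remean_points(srs):
    # srs is one row of reviews, passed in as a Series.
    srs.points = srs.points - review_points_mean
    return srs

reviews.apply(remean_points, axis=1)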
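where:
func : function | The function to apply to each column or row.
axis : {0 or 'index', 1 or 'columns'}, default 0 | 0 applies func to each column; 1 applies func to each row.
broadcast : boolean, default False | For aggregation functions, return an object of the same size with the values propagated. Deprecated in pandas 0.23 and removed in 1.0; use result_type='broadcast' instead.
raw : boolean, default False | If False, each row or column is passed to func as a Series. If True, func receives ndarray objects instead.
reduce : boolean or None, default None | Try to apply reduction procedures. Deprecated in pandas 0.23 and removed in 1.0; use result_type='reduce' instead.
args : tuple | Positional arguments passed to func in addition to the row or column.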
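A self-contained sketch of the axis and args parameters. The small reviews table, the shift helper, and the variable names are assumptions made up for illustration; only remean_points and review_points_mean come from the notes above.

```python
import pandas as pd

# Hypothetical stand-in for the reviews table used above.
reviews = pd.DataFrame({'points': [87, 90, 93], 'price': [15.0, 20.0, 25.0]})
review_points_mean = reviews.points.mean()

def remean_points(row):
    # With axis=1, func receives one row at a time as a Series.
    row = row.copy()
    row['points'] = row['points'] - review_points_mean
    return row

centered = reviews.apply(remean_points, axis=1)

# Default axis=0: func receives one column at a time.
column_ranges = reviews.apply(lambda col: col.max() - col.min())

def shift(col, offset):
    # offset is supplied through args.
    return col - offset

# args passes extra positional arguments to func after the row or column.
shifted = reviews.apply(shift, args=(10,))

# The same centering without apply, usually faster on large tables.
centered_points = reviews.points - review_points_mean
```

For simple arithmetic like this, the vectorized form on the last line is generally the better choice; apply is more useful when the per-row logic cannot be written as column operations.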