DataFrame indexing
- Get column data by column name; selecting a single column name returns a Series
desc = reviews['description']
df[['a','b']]  # select multiple columns with a list of names; returns a DataFrame
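As a minimal sketch of the column selection described above: the small `reviews` table below is a hypothetical stand-in for the real dataset (its values are made up for illustration), and the names `subset` and `one_col` are not from the original notes.

```python
import pandas as pd

# Hypothetical stand-in for the `reviews` table used in these notes.
reviews = pd.DataFrame({
    "country": ["Italy", "France", "US"],
    "description": ["Aromas of fruit", "Ripe and smooth", "Tart and snappy"],
    "points": [87, 90, 85],
})

desc = reviews["description"]            # single name -> Series
subset = reviews[["country", "points"]]  # list of names -> DataFrame
one_col = reviews[["description"]]       # one-element list -> still a DataFrame

print(type(desc).__name__, type(subset).__name__, type(one_col).__name__)
```

The return type depends on whether a single name or a list is passed, not on how many columns end up selected.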
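- Get row data
- loc
loc selects data by index label, so the values you pass must match labels in the row index. Label slices are inclusive at both ends: 0:10 includes the row labeled 10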
col = ['country','province','region_1','region_2']
ind = [0,1,10,100]
df = reviews.loc[ind,col]
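The label-based behaviour of loc is easiest to see on a frame whose index is not 0, 1, 2, ... The sketch below assumes a small made-up table with index labels 0, 1, 10 and 100; the names `picked` and `sliced` are illustrative only.

```python
import pandas as pd

# Hypothetical data; the index labels are deliberately non-contiguous.
df = pd.DataFrame(
    {
        "country": ["Italy", "France", "US", "Spain"],
        "province": ["Sicily", "Alsace", "Oregon", "Rioja"],
        "points": [87, 90, 85, 88],
    },
    index=[0, 1, 10, 100],
)

# Row labels and column names are both given as lists.
picked = df.loc[[0, 10, 100], ["country", "province"]]

# A label slice keeps every row from label 0 through label 10, inclusive.
sliced = df.loc[0:10]

print(picked)
print(sliced)
```

Here `df.loc[0:10]` keeps three rows (labels 0, 1 and 10), because the slice is resolved against the labels rather than against row positions.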
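- iloc
iloc selects data by integer position; both the row and the column selectors must be integers (or lists/slices of integers). Slices follow the usual Python convention: 0:10 excludes position 10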
first_row = reviews.iloc[0]
sample_reviews = reviews.iloc[[1,2,3,5,8]]  # non-consecutive row positions
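To make the slicing difference between loc and iloc concrete, here is a small sketch on a frame with the default 0..19 index. The data is invented, and the names `by_label`, `by_pos`, and `picked` are not from the original notes.

```python
import pandas as pd

# Hypothetical frame with 20 rows and the default RangeIndex 0..19.
df = pd.DataFrame({"points": range(80, 100)})

by_label = df.loc[0:10]    # labels 0 through 10, end included
by_pos = df.iloc[0:10]     # positions 0 through 9, end excluded

first_row = df.iloc[0]               # a single position returns a Series
picked = df.iloc[[1, 2, 3, 5, 8]]    # a list of positions returns a DataFrame

print(len(by_label), len(by_pos))
```

With the default index the labels and positions coincide, so the only visible difference is whether the end of the slice is kept.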
- Get data by condition
italian_wines = reviews.loc[reviews.country == 'Italy']
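Equivalently: italian_wines = reviews[reviews.country == 'Italy']
To combine several conditions, use & (and) and | (or), and wrap each condition in parentheses. In Python, & and | bind more tightly than comparison operators such as == and >=, so conditions without parentheses are not grouped the way you intend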
top_oceania_wines = reviews.loc[
(reviews.country.isin(['Australia', 'New Zealand']))
& (reviews.points >= 95)]
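The condition-based selection above can be sketched as follows. The table is a made-up stand-in for `reviews`, and the intermediate names `in_oceania`, `high_score`, and `italian_or_top` are illustrative. Naming each mask is one way to sidestep the parenthesization issue.

```python
import pandas as pd

# Hypothetical stand-in for the `reviews` table.
reviews = pd.DataFrame({
    "country": ["Italy", "Australia", "New Zealand", "Australia", "France"],
    "points": [88, 96, 95, 90, 97],
})

# Each comparison yields a boolean Series aligned with the rows.
in_oceania = reviews.country.isin(["Australia", "New Zealand"])
high_score = reviews.points >= 95

# & keeps rows where both masks are True.
top_oceania_wines = reviews.loc[in_oceania & high_score]

# | keeps rows where at least one condition is True; when written inline,
# each comparison needs its own parentheses.
italian_or_top = reviews.loc[(reviews.country == "Italy") | (reviews.points >= 95)]

print(top_oceania_wines)
print(italian_or_top)
```

Without the parentheses, `reviews.country == "Italy" | reviews.points >= 95` would evaluate `"Italy" | reviews.points` first, which is not the intended comparison and typically fails with an error.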