For a detailed explanation of Series, see:
http://blog.youkuaiyun.com/starter_____/article/details/79179417
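Creating a DataFrame
A DataFrame is a two-dimensional table structure containing an ordered collection of columns. It has both a row index and a column index, and it can be thought of as a dict of Series that share the same row index.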
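The "dict of Series" view can be sketched as follows. This is a minimal illustration, and the Series names, labels, and values are made up for the example rather than taken from the sessions in this post.

```python
import pandas as pd

# Two Series that share the same row labels (illustrative values).
age = pd.Series([18, 19, 20], index=['a', 'b', 'c'])
score = pd.Series([90, 85, 100], index=['a', 'b', 'c'])

# A dict of Series becomes a DataFrame:
# dict keys -> column labels, the shared Series index -> row labels.
frame = pd.DataFrame({'age': age, 'score': score})

# Selecting a single column gives back a Series with the same row index.
col = frame['age']
print(type(col).__name__)
print(col.index.equals(frame.index))
```

Selecting a column with `frame['age']` mirrors looking up a key in a dict, which is why the dict analogy is a convenient mental model.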
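Creating a DataFrame from a dict
If no row index is specified, an integer index from 0 to N-1 is created automatically (N is the length of the arrays).
If no column index is specified, the dict keys are used as the column labels. (The output below comes from an older pandas release, which sorts the keys alphabetically; pandas 0.23 and later on Python 3.6+ keep the dict's insertion order.)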
In [43]: import pandas as pd
In [44]: data={'name':['Mike','Lily','Jhon','Amily'],
    ...:       'age':[18,19,20,18],
    ...:       'score':[90,85,100,75]}
In [45]: frame=pd.DataFrame(data)
In [46]: frame
Out[46]:
   age   name  score
0   18   Mike     90
1   19   Lily     85
2   20   Jhon    100
3   18  Amily     75
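If a column index is specified, the DataFrame's columns are arranged in that order; a specified column that does not exist in the data is filled with NaN.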
In [48]: frame1=pd.DataFrame(data,columns=['score','name','age','sex'])
In [49]: frame1
Out[49]:
   score   name  age  sex
0     90   Mike   18  NaN
1     85   Lily   19  NaN
2    100   Jhon   20  NaN
3     75  Amily   18  NaN
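As a quick check of the NaN-filling behavior, the sketch below builds a smaller frame and verifies that the requested column missing from the data is entirely NaN. The shortened `data` dict is an assumption made for brevity.

```python
import pandas as pd

# A shortened version of the data used above (illustrative).
data = {'name': ['Mike', 'Lily'], 'age': [18, 19]}

# 'sex' is not a key in data, so that column is created and filled with NaN.
# The columns argument also fixes the column order.
frame = pd.DataFrame(data, columns=['age', 'name', 'sex'])

print(list(frame.columns))
print(frame['sex'].isna().all())
```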
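If a row index is specified, its length must match the length of the data. (The row index behaves differently from the column index: a missing column is filled with NaN, but a row index of the wrong length raises an error.)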
In [51]: frame2=pd.DataFrame(data,index=['one','two','three','four'])
In [52]: frame2
Out[52]:
       age   name  score
one     18   Mike     90
two     19   Lily     85
three   20   Jhon    100
four    18  Amily     75
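The error case mentioned above can be reproduced with a short sketch. It passes three row labels for four rows of data and catches the resulting `ValueError`; the exact error message varies between pandas versions, so it is printed rather than assumed.

```python
import pandas as pd

data = {'name': ['Mike', 'Lily', 'Jhon', 'Amily'],
        'age': [18, 19, 20, 18]}

# Four rows of data but only three row labels: the lengths do not match.
try:
    pd.DataFrame(data, index=['one', 'two', 'three'])
    raised = False
except ValueError as exc:
    raised = True
    print('ValueError:', exc)
```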
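Creating a DataFrame from a nested dict
If no row index is specified, the keys of the inner dicts are used as the row index.
If no column index is specified, the keys of the outer dict are used as the column labels.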
In [19]: pop={'Nevada':{2001:2.4,2002:2.9},
    ...:      'Ohio':{2000:1.5,2001:1.8,2002:1.9}}
In [20]: frame3=pd.DataFrame(pop)
In [21]: frame3
Out[21]:
      Nevada  Ohio
2000     NaN   1.5
2001     2.4   1.8
2002     2.9   1.9
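With a nested dict, an explicit row index does not have to match the inner keys: as far as I can tell, the inner dicts are aligned against it, so labels absent from the data become NaN and inner keys not listed are dropped. The sketch below assumes the same `pop` data and wraps the labels in `pd.Index`, since some pandas versions have had trouble with a plain list here.

```python
import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.8, 2002: 1.9}}

# The inner keys are aligned against the explicit row index:
# 2000 is dropped, and 2003 (absent from the data) is filled with NaN.
frame = pd.DataFrame(pop, index=pd.Index([2001, 2002, 2003]))
print(frame)
```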