pandas (1): Creating a Series
import pandas as pd
import numpy as np
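The Series object is one of the basic data structures in pandas. It is similar to an ndarray, except that a Series carries an index of labels that correspond one-to-one with its data. In this respect a Series also resembles a dictionary (an ordered one). There are roughly the following ways to create a Series, starting with an ndarray: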
arr = np.arange(5)
s1 = pd.Series(arr)
s1
0 0
1 1
2 2
3 3
4 4
dtype: int32
By default, the labels of a Series are integers starting from 0, but they can be changed by passing explicit index values at creation time:
arr1 = np.arange(5)
s2 = pd.Series(arr1,index=["a","b","c","d","e"])
s2
a 0
b 1
c 2
d 3
e 4
dtype: int32
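The difference between labels and positions can be illustrated with a minimal sketch. The variable names (`labeled`, `by_label`, `by_position`, `length_error`) are illustrative and not part of the original post; `.loc` and `.iloc` are the standard pandas accessors for label-based and position-based lookup.

```python
import numpy as np
import pandas as pd

# Same data as s2 above: five integers labelled "a" through "e".
labeled = pd.Series(np.arange(5), index=["a", "b", "c", "d", "e"])

# .loc looks a value up by its index label, much like a dictionary key.
by_label = labeled.loc["c"]

# .iloc looks a value up by its integer position, much like an ndarray.
by_position = labeled.iloc[2]

# Both expressions refer to the same element.
print(by_label, by_position)

# The index needs exactly one label per data value; a mismatch is rejected.
try:
    pd.Series(np.arange(5), index=["a", "b", "c"])
    length_error = None
except ValueError as exc:
    length_error = exc
print(type(length_error).__name__)
```

Passing three labels for five values fails at construction time, which is how the one-to-one correspondence between index and data is enforced.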
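A Series can also be created from a dictionary. In this case, the keys of the dictionary become the index of the Series, and the values of the dictionary become its values: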
dict1 = {"Alice":99,"Bob":93,"Cindy":87}
s3 = pd.Series(dict1)
s3
Alice 99
Bob 93
Cindy 87
dtype: int64
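For this reason, when a Series is created from a dictionary and an explicit index is passed as well, every label that is not a key of the dictionary gets NaN as its value (and the dtype becomes float64 to accommodate NaN):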
s4 = pd.Series(dict1,index=["Cindy","a","b"])
s4
Cindy 87.0
a NaN
b NaN
dtype: float64
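As a rough sketch of how to work with the resulting NaN entries, the example below uses a hypothetical label "Dave" that has no key in the dictionary. The variable names are illustrative; `isna()`, `dropna()` and `astype()` are standard Series methods.

```python
import pandas as pd

scores = {"Alice": 99, "Bob": 93, "Cindy": 87}

# Labels are matched against the dictionary keys; "Dave" has no entry.
selected = pd.Series(scores, index=["Cindy", "Alice", "Dave"])

# A missing label yields NaN, so the dtype is float64 rather than int64.
print(selected.dtype)

# isna() marks the labels that had no matching key.
missing_labels = selected[selected.isna()].index.tolist()
print(missing_labels)

# dropna() keeps only the labels that were found; the result can then
# be converted back to an integer dtype.
found = selected.dropna().astype("int64")
print(found.to_dict())
```

Note that the order of the result follows the index that was passed in, not the order of the dictionary.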
# Create a Series from a Python list
s5 = pd.Series([1,2,3,4,5])
s5
0 1
1 2
2 3
3 4
4 5
dtype: int64
# Create a Series from a tuple
s6 = pd.Series((1,2,3,4,5))
s6
0 1
1 2
2 3
3 4
4 5
dtype: int64
# Create a Series from a scalar; the value is repeated for every index label
s7 = pd.Series(7,index=["a","b","c"])
s7
a 7
b 7
c 7
dtype: int64
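The list, tuple and scalar constructions above can be compared side by side in a short sketch. The variable names are illustrative only; `Series.equals` is the standard method for checking that two Series have the same values and index.

```python
import pandas as pd

# A list and a tuple with the same elements produce equal Series.
from_list = pd.Series([1, 2, 3, 4, 5])
from_tuple = pd.Series((1, 2, 3, 4, 5))
same = from_list.equals(from_tuple)

# With a scalar, the index decides the length: the value is repeated
# once per label.
broadcast = pd.Series(7, index=["a", "b", "c"])

# Without an index, a scalar yields a Series with a single element.
single = pd.Series(7)

print(same, len(broadcast), len(single))
```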
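The index and values attributes expose the index labels and the data of a Series, respectively. The index attribute can be replaced by assignment, whereas the values attribute cannot be assigned to: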
s7.index
Index(['a', 'b', 'c'], dtype='object')
s7.values
array([7, 7, 7], dtype=int64)
s7.index = ["d","e","f"]
s7
d 7
e 7
f 7
dtype: int64
s7.values = np.array([7,8,9])
s7
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~\AppData\Local\conda\conda\envs\DataScience\lib\site-packages\pandas\core\generic.py in __setattr__(self, name, value)
5097 else:
-> 5098 object.__setattr__(self, name, value)
5099 except (AttributeError, TypeError):
AttributeError: can't set attribute
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
<ipython-input-28-38f1347c2791> in <module>
----> 1 s7.values = np.array([7,8,9])
2 s7
~\AppData\Local\conda\conda\envs\DataScience\lib\site-packages\pandas\core\generic.py in __setattr__(self, name, value)
5104 "stable/indexing.html#attribute-access",
5105 stacklevel=2)
-> 5106 object.__setattr__(self, name, value)
5107
5108 def _dir_additions(self):
AttributeError: can't set attribute
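To modify the data of a Series, select the elements by indexing and then assign to them:
# .iloc assigns by position; on a Series with string labels, plain s7[0] is deprecated in recent pandas versions
s7.iloc[0] = 9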
s7
d 9
e 7
f 7
dtype: int64
s7[:] = 999
s7
d 999
e 999
f 999
dtype: int64
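The behaviour described in this section can be put together in one small sketch: assigning to `values` fails, while assignment through indexing works. The variable names (`s`, `assign_failed`, `after_single`) are illustrative, and the sketch uses the explicit `.iloc` and `.loc` accessors as one reasonable way to update single elements.

```python
import numpy as np
import pandas as pd

s = pd.Series(7, index=["d", "e", "f"])

# Assigning to .values fails because the attribute has no setter.
try:
    s.values = np.array([7, 8, 9])
    assign_failed = False
except AttributeError:
    assign_failed = True

# Update a single element by position or by label.
s.iloc[0] = 9
s.loc["e"] = 8
after_single = s.tolist()

# Replace every value at once; the index labels are kept.
s[:] = [1, 2, 3]

print(assign_failed, after_single, s.to_dict())
```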