Series in pandas

98岁带病上岗

Last modified 2022-04-22 08:53:39

Tags: python

First published 2022-04-19 14:06:27

Article link: https://blog.youkuaiyun.com/qq_41670002/article/details/124272274

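This article introduces the Series data structure in the pandas library: how to create a Series (an empty Series, and from an ndarray, list, dict, or scalar), how to access its data (by position and by label), and its commonly used attributes (such as axes, dtype, and empty). It also covers the head() and tail() methods for viewing data, and the isnull() and notnull() functions for detecting missing values.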

import pandas as pd
import numpy as np

I. Series

1. Creating a Series object

pandas.Series(data, index, dtype, name, copy)

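Parameters:

data: the values to store; an array-like (such as an ndarray or a list), a dict, or a scalar.

index: the index labels for the data; if omitted, it defaults to the integers 0, 1, 2, ... (a RangeIndex).

dtype: the data type; inferred from the data if omitted.

name: the name of the Series.

copy: whether to copy the input data; defaults to False.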
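To make the parameters concrete, here is a minimal sketch that passes all five of them at once. The values, the labels, and the name "demo" are made up for illustration; copy=True is chosen so that later changes to the source array do not reach the Series.

```python
import numpy as np
import pandas as pd

arr = np.array([1.5, 2.5, 3.5])

# Illustrative values only: every parameter of the constructor is passed explicitly.
s = pd.Series(data=arr, index=["x", "y", "z"], dtype="float64", name="demo", copy=True)

# Because the data was copied, modifying the original array leaves the Series unchanged.
arr[0] = 99.0
print(s)
```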

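1.1 Creating an empty Series

s = pd.Series(dtype='float64')   # Without dtype, pandas 1.x warns that the default dtype of an empty Series will change to object in a future version (pandas 2.0 and later use object)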
s

Series([], dtype: float64)

1.2 Creating a Series from an ndarray

arr = np.array([1,2,3,4,5])
s = pd.Series(arr)
s

0    1
1    2
2    3
3    4
4    5
dtype: int32

1.3 Creating a Series from a list

lst = ["a","b","c"]
pd.Series(lst)

0    a
1    b
2    c
dtype: object

1.4 Creating a Series from a dict

dct = {"a": 1,"b": 2,"c": 3}  # The dict keys are used as the index labels
s = pd.Series(dct)
s

a    1
b    2
c    3
dtype: int64
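When a dict is combined with an explicit index argument, pandas looks up each requested label among the dict keys. The sketch below uses a hypothetical label "x" that is not in the dict to show what happens in that case: the value becomes NaN, and the dtype is promoted to float64 to hold it.

```python
import pandas as pd

dct = {"a": 1, "b": 2, "c": 3}

# The index controls both the order and the selection of keys.
# "x" is a made-up label that does not exist in the dict, so its value is NaN.
s = pd.Series(dct, index=["c", "a", "x"])
print(s)
```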

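1.5 Creating a Series from a scalar

s = pd.Series(5,index=["a","b","c"])  # Pass an index to repeat the scalar: the length of the index determines how many copies are made (without an index, the result has a single element)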
s

a    5
b    5
c    5
dtype: int64
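The role of the index when building a Series from a scalar can be checked with a short sketch. The variable names are illustrative.

```python
import pandas as pd

# Without an index, the scalar produces a Series with exactly one element, labeled 0.
one = pd.Series(5)

# With an index, the scalar is repeated once per label.
three = pd.Series(5, index=["a", "b", "c"])

print(len(one), len(three))
```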

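2. Accessing Series data

There are two ways to access Series data: by position and by label.
When no index argument is given, the default integer labels coincide with the positions, so s[0] returns the first element. When an index argument is given, values can be accessed by label and also by position; for positional access on a labeled Series, s.iloc[i] is the safer spelling, because plain s[i] with an integer on a non-integer index is deprecated since pandas 2.1.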

2.1 Access by position

The syntax is the same as indexing a list.

s = pd.Series(["a","d","c","b"])
s[0]

'a'

s[0] = "e"  # Modify the first value
s

0    e
1    d
2    c
3    b
dtype: object
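One difference from a list is worth a small sketch: with the default integer index, s[0] is really a label lookup that happens to match the position, so list-style negative indexing does not carry over. The explicit positional accessor .iloc behaves like list indexing in every case. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series(["a", "d", "c", "b"])

first = s.iloc[0]              # first element by position
last = s.iloc[-1]              # negative positions work with .iloc
middle = s.iloc[1:3].tolist()  # positions 1 and 2, end exclusive like a list slice

# Plain s[-1] looks for the label -1, which does not exist in the default index.
try:
    s[-1]
    negative_label_found = True
except KeyError:
    negative_label_found = False

s.iloc[0] = "e"  # assignment by position
print(first, last, middle, negative_label_found)
```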

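2.2 Access by label

The syntax is the same as looking up a key in a dict.

s = pd.Series([98, 94, 91, 87],index=["zs","ls","ww","bl"])
s.iloc[3]  # Positional access still works; plain s[3] on a labeled index is deprecated since pandas 2.1

87

s["zs"]

98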

s["zs"] = 100
s

zs    100
ls     94
ww     91
bl     87
dtype: int64
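Continuing the dict analogy, the following sketch shows a few label-based operations that mirror dict usage: explicit lookup with .loc, a membership test with in, and a lookup with a default through get(). The label "nobody" and the default -1 are made up for the example.

```python
import pandas as pd

s = pd.Series([98, 94, 91, 87], index=["zs", "ls", "ww", "bl"])

score = s.loc["zs"]            # explicit label-based lookup
subset = s.loc[["zs", "bl"]]   # a list of labels returns a smaller Series
has_ls = "ls" in s             # membership is tested against the index labels
fallback = s.get("nobody", -1) # like dict.get: returns the default for a missing label

s.loc["zs"] = 100              # assignment by label
print(score, has_ls, fallback)
```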

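3. Common Series attributes

axes: returns the row index labels wrapped in a list.
dtype: returns the data type of the values.
empty: returns True if the Series has no elements, otherwise False.
ndim: returns the number of dimensions of the data, which is always 1 for a Series.
size: returns the number of elements.
values: returns the data of the Series as an ndarray.
index: returns the Index object that holds the labels (a RangeIndex when no index was specified).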

s = pd.Series([98, 94, 91,87],index=["zs","ls","ww","bl"])

s.axes

[Index(['zs', 'ls', 'ww', 'bl'], dtype='object')]

s.dtype

dtype('int64')

s.empty

False

s.ndim

1

s.size

4

s.values

array([98, 94, 91, 87], dtype=int64)

s.index

Index(['zs', 'ls', 'ww', 'bl'], dtype='object')
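The attributes can also be checked side by side. The sketch below contrasts a labeled Series with an empty one and with one that uses the default index; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([98, 94, 91, 87], index=["zs", "ls", "ww", "bl"])

labels = s.axes[0].tolist()                  # axes is a list holding the single row index
is_empty = s.empty                           # False: the Series has elements
empty_one = pd.Series(dtype="float64").empty # True: no elements
dims = s.ndim                                # a Series is always one-dimensional
count = s.size                               # number of elements
raw = s.values                               # the underlying NumPy array

# Only a Series built without an index argument gets a RangeIndex.
default_index = pd.Series([1, 2, 3]).index
print(labels, is_empty, empty_one, dims, count, type(default_index).__name__)
```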

4. Common Series methods

head(), tail(): view the first or last rows
isnull() & notnull(): detect missing values

4.1 The head() and tail() methods

s = pd.Series([98, 94, 91,87,68,93,78],index=["zs","ls","ww","bl","zqh","sxj","wyy"])
s

zs     98
ls     94
ww     91
bl     87
zqh    68
sxj    93
wyy    78
dtype: int64

s.head(6)  # Returns the first 6 rows; the default is the first 5

zs     98
ls     94
ww     91
bl     87
zqh    68
sxj    93
dtype: int64

s.tail()  # Returns the last 5 rows by default

ww     91
bl     87
zqh    68
sxj    93
wyy    78
dtype: int64
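A few edge cases of head() and tail() are sketched here with the same data: an explicit n, a negative n (which returns everything except the last |n| rows), and an n larger than the Series, which simply returns all rows.

```python
import pandas as pd

s = pd.Series([98, 94, 91, 87, 68, 93, 78],
              index=["zs", "ls", "ww", "bl", "zqh", "sxj", "wyy"])

first_two = s.head(2)       # the first 2 rows
last_two = s.tail(2)        # the last 2 rows
all_but_last_two = s.head(-2)  # negative n: everything except the last 2 rows
everything = s.head(10)     # n larger than the length returns the whole Series

print(first_two.index.tolist(), last_two.index.tolist())
```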

4.2 Detecting missing values with isnull() & notnull()

s = pd.Series([98, 94, 91,87,None,89,78],index=["zs","ls","ww","bl","zqh",None,"wyy"])
s

zs     98.0
ls     94.0
ww     91.0
bl     87.0
zqh     NaN
NaN    89.0
wyy    78.0
dtype: float64

# Using the isnull() method of the Series object
s.isnull()

zs     False
ls     False
ww     False
bl     False
zqh     True
NaN    False
wyy    False
dtype: bool

# Using the top-level pandas isnull() function
pd.isnull(s)

zs     False
ls     False
ww     False
bl     False
zqh     True
NaN    False
wyy    False
dtype: bool

s.index.isnull()

array([False, False, False, False, False,  True, False])

pd.notnull(s)

zs      True
ls      True
ww      True
bl      True
zqh    False
NaN     True
wyy     True
dtype: bool

s.notnull()

zs      True
ls      True
ww      True
bl      True
zqh    False
NaN     True
wyy     True
dtype: bool
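The boolean results above are usually put to work as masks. As a minimal sketch with the same data, the following counts the missing values and filters them out; it also uses dropna(), which is not covered in this article, as an equivalent shortcut. Note that isnull() and notnull() are aliases of isna() and notna().

```python
import pandas as pd

s = pd.Series([98, 94, 91, 87, None, 89, 78],
              index=["zs", "ls", "ww", "bl", "zqh", None, "wyy"])

missing_count = int(s.isnull().sum())  # True counts as 1, so the sum is the number of missing values
present = s[s.notnull()]               # boolean mask keeps only the rows with a value
dropped = s.dropna()                   # shortcut that gives the same result as the mask

missing_labels = int(s.index.isnull().sum())  # missing labels in the index are counted separately
print(missing_count, len(present), missing_labels)
```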