Notes on some problems encountered while using pandas

License: CC 4.0 BY-SA

Tags: #python

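This post covers pandas' Series.map and apply functions and how to use them for dictionary mapping and function mapping. It also discusses DataFrame.equals and Series.equals, which test whether two objects hold the same data, and the use of Series.str.translate for character-level string conversion.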

pandas study notes

pandas.Series.map
The apply() function in pandas
- pandas.DataFrame.apply
- Series.apply
Checking whether two DataFrames or Series are equal
- pandas.DataFrame.equals
- pandas.Series.equals
The pandas.Series.str.translate function
Examples

pandas.Series.map

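map() can be called on a Series, or on a single column of a DataFrame (which is itself a Series). It takes a function, a dict, or a Series as its argument and returns a new Series holding the mapped values.

Usage: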
Series.map(arg, na_action=None)

Map values of Series according to input correspondence.
Used for substituting each value in a Series with another value, that may be derived from a function, a dict or a Series.

Parameters:

arg : function, collections.abc.Mapping subclass or Series
Mapping correspondence.
na_action : {None, 'ignore'}, default None
If 'ignore', propagate NaN values, without passing them to the mapping correspondence.

Returns:

Series
Same index as caller.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html

Examples

import pandas as pd

data = pd.Series(['PC - Garnitury/Nausniki','Aksessuary - PS2','Aksessuary - PS3','Aksessuary - PS4','Aksessuary - PSP'])
data

Output:

0    PC - Garnitury/Nausniki
1           Aksessuary - PS2
2           Aksessuary - PS3
3           Aksessuary - PS4
4           Aksessuary - PSP
dtype: object

Dictionary mapping

new_data = data.map({'PC - Garnitury/Nausniki':'A','Aksessuary - PS2':'B','Aksessuary - PS3':'C','Aksessuary - PS4':'D','Aksessuary - PSP':'E'})
new_data

Output:

0    A
1    B
2    C
3    D
4    E
dtype: object
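The dictionary above covers every value in the Series. When a value has no matching key, map() leaves NaN in its place instead of raising an error. The following is a minimal sketch of that behaviour; the `partial` dict and the `fillna` fallback are illustrative choices, not part of the original notes.

```python
import pandas as pd

data = pd.Series(['Aksessuary - PS2', 'Aksessuary - PS3', 'Aksessuary - PSP'])

# Hypothetical partial mapping: 'Aksessuary - PSP' is deliberately left out.
partial = {'Aksessuary - PS2': 'B', 'Aksessuary - PS3': 'C'}

# Values without a key in the dict become NaN.
mapped = data.map(partial)
print(mapped)

# One possible fallback: keep the original value where no mapping exists.
filled = mapped.fillna(data)
print(filled)
```

Checking `mapped.isna()` after a dictionary mapping is a cheap way to spot keys that were forgotten.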

Note: Series.map() returns a new Series and does not modify the original data:

print(data)

Output:

0    PC - Garnitury/Nausniki
1           Aksessuary - PS2
2           Aksessuary - PS3
3           Aksessuary - PS4
4           Aksessuary - PSP
dtype: object

Function mapping

data.map(lambda x:x.split('-'))

Output:

0    [PC ,  Garnitury/Nausniki]
1           [Aksessuary ,  PS2]
2           [Aksessuary ,  PS3]
3           [Aksessuary ,  PS4]
4           [Aksessuary ,  PSP]
dtype: object
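The na_action parameter described earlier matters for function mapping: a missing value reaches the function as NaN (a float), so a string method such as split would fail on it. Below is a small sketch with made-up data, assuming one entry is missing. It also splits on ' - ' rather than '-', which avoids the stray spaces visible in the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the middle.
s = pd.Series(['PC - Garnitury', np.nan, 'Aksessuary - PS2'])

# na_action='ignore' skips NaN entries, so the lambda only ever sees strings.
parts = s.map(lambda x: x.split(' - '), na_action='ignore')
print(parts)
```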

The apply() function in pandas

apply() in pandas calls a function on the data. It can operate on either a Series or a DataFrame.

pandas.DataFrame.apply

Syntax:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
Apply a function along an axis of the DataFrame.
Objects passed to the function are Series objects whose index is either the DataFrame’s index (axis=0) or the DataFrame’s columns (axis=1). By default (result_type=None), the final return type is inferred from the return type of the applied function. Otherwise, it depends on the result_type argument.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html#pandas.DataFrame.apply
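To make the axis argument concrete, here is a minimal sketch on a made-up two-column frame (the column names `a` and `b` are assumptions for illustration). With axis=0 the function receives one column at a time; with axis=1 it receives one row at a time.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# axis=0 (default): the lambda is called once per column, with that column as a Series.
col_range = df.apply(lambda col: col.max() - col.min())

# axis=1: the lambda is called once per row, with that row as a Series indexed by column name.
row_sum = df.apply(lambda row: row['a'] + row['b'], axis=1)

print(col_range)
print(row_sum)
```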

Series.apply

Series.apply(func, convert_dtype=True, args=(), **kwargs)
Invoke function on values of Series.
Can be ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values.

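Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.apply.html#pandas.Series.apply

In a sense, map and apply do similar work, but apply is more flexible: as the signatures above show, it forwards extra positional and keyword arguments (args, **kwargs) to the function, and DataFrame.apply can work along either axis of a DataFrame.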
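One place where the difference shows up is passing extra arguments. The sketch below uses a hypothetical helper, `add_tax`, and made-up prices. Series.apply forwards `args` and keyword arguments to the function, whereas with the map signature quoted earlier (arg, na_action) the same effect needs a wrapping lambda.

```python
import pandas as pd

prices = pd.Series([100.0, 250.0, 40.0])

def add_tax(value, rate, rounding=2):
    """Hypothetical helper: add a tax rate to a single value and round the result."""
    return round(value * (1 + rate), rounding)

# apply forwards the extra positional and keyword arguments to add_tax.
with_tax = prices.apply(add_tax, args=(0.2,), rounding=1)

# With map, the extra arguments are bound inside a lambda instead.
same = prices.map(lambda v: add_tax(v, 0.2, rounding=1))

print(with_tax)
```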

Two articles on Zhihu explain the difference well.

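Checking whether two DataFrames or Series are equal

pandas.DataFrame.equals

DataFrame.equals() tests whether two DataFrames have the same shape and the same elements. NaNs in the same location are considered equal, and corresponding columns must have the same dtype.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.equals.html?highlight=equals#pandas.DataFrame.equals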
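A short sketch of how equals() differs from an element-wise comparison, using made-up frames. It illustrates two points that are easy to miss: NaN values in the same position count as equal, and columns with different dtypes do not.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'x': [1.0, np.nan], 'y': ['a', 'b']})
df2 = pd.DataFrame({'x': [1.0, np.nan], 'y': ['a', 'b']})

# equals() treats NaN in the same position as equal.
same = df1.equals(df2)

# An element-wise == does not, because NaN != NaN.
elementwise = (df1 == df2).all().all()

# Same numbers, different dtypes (int64 vs float64): equals() reports a mismatch.
df3 = pd.DataFrame({'x': [1, 2]})
df4 = pd.DataFrame({'x': [1.0, 2.0]})
dtype_mismatch = df3.equals(df4)

print(same, elementwise, dtype_mismatch)
```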

pandas.Series.equals

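Series.equals() tests whether two Series have the same shape and the same elements, with NaNs in the same location considered equal.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.equals.html?highlight=equals#pandas.Series.equals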
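The same check on Series, as a minimal sketch with made-up data. The third Series holds identical values under different index labels, which is enough for equals() to report a difference.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
s2 = pd.Series([1, 2, 3])
s3 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

same_values_same_index = s1.equals(s2)
same_values_other_index = s1.equals(s3)

print(same_values_same_index)
print(same_values_other_index)
```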

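The pandas.Series.str.translate function

Series.str.translate maps every character of each string in the Series through a translation table, in the same way as Python's str.translate(). When more than one character needs to be translated, pass a dict to str.maketrans() to build the table.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.translate.html?highlight=translate#pandas.Series.str.translate

Examples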
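As a minimal sketch of the maketrans route mentioned above, the following builds a table from a dict with Python's built-in str.maketrans and applies it to a made-up Series. A value of None in the dict deletes the character.

```python
import pandas as pd

s = pd.Series(['a-b_c', 'x_y-z'])

# Keys must be single characters; None as a value removes that character.
table = str.maketrans({'-': ' ', '_': None})

cleaned = s.str.translate(table)
print(cleaned)
```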

data = pd.DataFrame(['Книги - Открытки','Книги - Познавательная литература','Книги - Путеводители','Книги - Художественная литература'],columns=['eyu'])
data['eyu']

Output:

0                       Книги - Открытки
1      Книги - Познавательная литература
2                   Книги - Путеводители
3      Книги - Художественная литература
Name: eyu, dtype: object

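Transliterating the Russian (Cyrillic) text into Latin letters:

# Two parallel strings: each Cyrillic letter maps to the Latin character at the same position
symbols = ("абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ", "abvgdeejzijklmnoprstufhzcss_y_euaABVGDEEJZIJKLMNOPRSTUFHZCSS_Y_EUA")
# Translation table: Unicode code point of the source letter -> code point of its replacement
english = {ord(a): ord(b) for a, b in zip(*symbols)}
data['english'] = data['eyu'].str.translate(english)
print(data)

Output: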

                                 eyu                            english
0                   Книги - Открытки                   Knigi - Otkrytki
1  Книги - Познавательная литература  Knigi - Poznavatel_naa literatura
2               Книги - Путеводители               Knigi - Putevoditeli
3  Книги - Художественная литература  Knigi - Hudojestvennaa literatura
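The table above maps each letter to exactly one character, which is why 'ь' shows up as '_' and 'я' as 'a'. A translation table may also map a character to a longer string. The sketch below uses a hypothetical partial table that covers only the letters of one title; the chosen romanization (for example 'ж' to 'zh') is an assumption for illustration, not a standard the original notes follow.

```python
import pandas as pd

titles = pd.Series(['Книги - Художественная литература'])

# Hypothetical partial table: only the letters needed for this one title.
# Keys are single characters; values may be strings of any length.
multi = str.maketrans({
    'К': 'K', 'н': 'n', 'и': 'i', 'г': 'g', 'Х': 'Kh', 'у': 'u',
    'д': 'd', 'о': 'o', 'ж': 'zh', 'е': 'e', 'с': 's', 'т': 't',
    'в': 'v', 'а': 'a', 'я': 'ya', 'л': 'l', 'р': 'r',
})

result = titles.str.translate(multi)
print(result.iloc[0])
```

Characters that are missing from the table, such as the space and the hyphen here, pass through unchanged.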