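Notes on some problems encountered while using pandas

This post covers pandas' Series.map and apply functions in detail, and how to use them for dict-based and function-based mapping. It also discusses DataFrame.equals and Series.equals for testing whether data are equal, and the use of Series.str.translate for converting strings.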

pandas.Series.map

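map() can be called on a Series, or on a single column of a DataFrame (which is itself a Series). It takes a function, a dict, or a Series as its argument and returns a new Series containing the mapped values.

Usage: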
Series.map(arg, na_action=None)

  • Map values of Series according to input correspondence.
  • Used for substituting each value in a Series with another value, that may be derived from a function, a dict or a Series.

Parameters:

  • arg: function, collections.abc.Mapping subclass or Series
    Mapping correspondence.
  • na_action: {None, 'ignore'}, default None
    If 'ignore', propagate NaN values, without passing them to the mapping correspondence.

Returns:

  • Series
    Same index as caller.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html
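The parameter list above says that arg may also be a Series. A minimal sketch of that case, assuming a small made-up lookup table (the 'cat'/'dog' values are only for illustration): each value of the caller is looked up in the index of the lookup Series.

```python
import pandas as pd

animals = pd.Series(['cat', 'dog', 'cat'])

# The index of the lookup Series holds the old values,
# its values hold the replacements.
lookup = pd.Series({'cat': 'kitten', 'dog': 'puppy'})

mapped = animals.map(lookup)
print(mapped.tolist())  # ['kitten', 'puppy', 'kitten']
```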

Example

import pandas as pd

data = pd.Series(['PC - Garnitury/Nausniki','Aksessuary - PS2','Aksessuary - PS3','Aksessuary - PS4','Aksessuary - PSP'])
print(data)

Output:

0    PC - Garnitury/Nausniki
1           Aksessuary - PS2
2           Aksessuary - PS3
3           Aksessuary - PS4
4           Aksessuary - PSP
dtype: object

Mapping with a dict

new_data = data.map({'PC - Garnitury/Nausniki':'A','Aksessuary - PS2':'B','Aksessuary - PS3':'C','Aksessuary - PS4':'D','Aksessuary - PSP':'E'})
print(new_data)

Output:

0    A
1    B
2    C
3    D
4    E
dtype: object
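One thing worth knowing about dict mapping is what happens to values that are not keys of the dict: they become NaN. The sketch below shows this, together with one possible workaround using a defaultdict (a dict subclass that defines __missing__). The category 'Igry - PS4' and the 'other' label are made up for illustration.

```python
import pandas as pd
from collections import defaultdict

s = pd.Series(['Aksessuary - PS2', 'Aksessuary - PS3', 'Igry - PS4'])

# Plain dict: 'Igry - PS4' is not a key, so it is mapped to NaN.
plain = s.map({'Aksessuary - PS2': 'B', 'Aksessuary - PS3': 'C'})

# defaultdict defines __missing__, so unknown values get the default instead of NaN.
codes = defaultdict(lambda: 'other',
                    {'Aksessuary - PS2': 'B', 'Aksessuary - PS3': 'C'})
with_default = s.map(codes)

print(plain)
print(with_default)
```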

Note: Series.map() returns a new Series; the original data is left unchanged:

print(data)

Output:

0    PC - Garnitury/Nausniki
1           Aksessuary - PS2
2           Aksessuary - PS3
3           Aksessuary - PS4
4           Aksessuary - PSP
dtype: object

Mapping with a function

data.map(lambda x: x.split('-'))

Output:

0    [PC ,  Garnitury/Nausniki]
1           [Aksessuary ,  PS2]
2           [Aksessuary ,  PS3]
3           [Aksessuary ,  PS4]
4           [Aksessuary ,  PSP]
dtype: object
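The na_action parameter matters for function mapping when the Series contains missing values. A small sketch, assuming a Series with one NaN in it (not part of the original data): with na_action='ignore' the NaN is passed through untouched instead of being handed to the function.

```python
import numpy as np
import pandas as pd

s = pd.Series(['Aksessuary - PS2', np.nan, 'Aksessuary - PSP'])

# Without na_action='ignore' the lambda would receive NaN (a float) and
# raise AttributeError, because a float has no .split method.
parts = s.map(lambda x: x.split(' - '), na_action='ignore')
print(parts)
```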

The apply() function in pandas

apply() calls a function on pandas data. It can be used on either a Series or a DataFrame.

pandas.DataFrame.apply

Syntax:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
Apply a function along an axis of the DataFrame.
Objects passed to the function are Series objects whose index is either the DataFrame’s index (axis=0) or the DataFrame’s columns (axis=1). By default (result_type=None), the final return type is inferred from the return type of the applied function. Otherwise, it depends on the result_type argument.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html#pandas.DataFrame.apply
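To make the axis argument concrete, here is a minimal sketch on a made-up two-column DataFrame (the column names 'price' and 'qty' are assumptions for illustration). With axis=0 the function sees one column at a time, with axis=1 one row at a time.

```python
import pandas as pd

df = pd.DataFrame({'price': [10.0, 20.0, 30.0], 'qty': [1, 2, 3]})

# axis=0 (default): the function receives each column as a Series.
col_max = df.apply(lambda col: col.max())

# axis=1: the function receives each row as a Series indexed by column name.
revenue = df.apply(lambda row: row['price'] * row['qty'], axis=1)

print(col_max)
print(revenue.tolist())  # [10.0, 40.0, 90.0]
```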

Series.apply

Series.apply(func, convert_dtype=True, args=(), **kwargs)
Invoke function on values of Series.
Can be ufunc (a NumPy function that applies to the entire Series) or a Python function that only works on single values.

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.apply.html#pandas.Series.apply

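On a Series, map and apply do similar things: both call a function on each value. apply is the more flexible of the two when working with functions, because it can forward extra positional and keyword arguments (args, **kwargs) to the function and, on a DataFrame, can work along rows or columns. map, on the other hand, also accepts a dict or a Series as the mapping.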
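A short sketch of that difference, using a hypothetical helper called pick (not a pandas function): apply forwards the extra arguments itself, while one portable way to get the same result with map is to bind them in a lambda.

```python
import pandas as pd

s = pd.Series(['PC - Garnitury/Nausniki', 'Aksessuary - PS2'])

def pick(text, sep, pos=0):
    """Split text on sep and return the stripped piece at position pos."""
    return text.split(sep)[pos].strip()

# apply forwards extra positional (args) and keyword arguments to the function.
second = s.apply(pick, args=('-',), pos=1)

# With map, the extra arguments can be bound by hand in a lambda.
second_via_map = s.map(lambda x: pick(x, '-', pos=1))

print(second.tolist())  # ['Garnitury/Nausniki', 'PS2']
```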

These two articles explain the topic well:
Zhihu
Zhihu

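Checking whether two DataFrames or Series are equal

pandas.DataFrame.equals

DataFrame.equals() tests whether two DataFrames have the same shape and the same elements. NaNs in the same location are considered equal, and corresponding columns must have the same dtype.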

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.equals.html?highlight=equals#pandas.DataFrame.equals
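A minimal sketch of how equals() differs from an element-wise == comparison, using small made-up frames: NaN handling and dtype are the two points that tend to surprise.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'x': [1.0, np.nan], 'y': ['u', 'v']})
b = pd.DataFrame({'x': [1.0, np.nan], 'y': ['u', 'v']})

# Element-wise == treats NaN as unequal to itself, so not every cell is True.
print((a == b).all().all())  # False

# equals() treats NaN in the same position as equal.
print(a.equals(b))           # True

# Same values but a different dtype (int64 vs float64) do not count as equal.
ints = pd.DataFrame({'x': [1, 2]})
floats = pd.DataFrame({'x': [1.0, 2.0]})
print(ints.equals(floats))   # False
```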

pandas.Series.equals

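Series.equals() tests whether two Series have the same shape and the same elements. As with DataFrames, NaNs in the same location are considered equal.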

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.equals.html?highlight=equals#pandas.Series.equals
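For Series, a quick sketch with made-up data. Besides the values, the index labels take part in the comparison, so two Series holding the same numbers under different labels are not reported as equal.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])
s2 = pd.Series([1, 2, 3])
s3 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

print(s1.equals(s2))  # True
print(s1.equals(s3))  # False: same values, different index labels
```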

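pandas.Series.str.translate

Series.str.translate maps every character of each string in the calling Series through a translation table. The table maps Unicode ordinals to Unicode ordinals, strings, or None. When more than one character has to be translated, a dict can be passed to str.maketrans to build the table.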

Official documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.translate.html?highlight=translate#pandas.Series.str.translate
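As a small sketch of building the table with str.maketrans (the sample strings are made up): the dict keys are single characters, and each value is either a replacement string or None to delete the character.

```python
import pandas as pd

s = pd.Series(['a-b_c', 'x_y-z'])

# Replace '-' with a space and delete '_' entirely.
table = str.maketrans({'-': ' ', '_': None})
cleaned = s.str.translate(table)

print(cleaned.tolist())  # ['a bc', 'xy z']
```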

Example

data = pd.DataFrame(['Книги - Открытки','Книги - Познавательная литература','Книги - Путеводители','Книги - Художественная литература'],columns=['eyu'])
print(data)

Output:

                                 eyu
0                   Книги - Открытки
1  Книги - Познавательная литература
2               Книги - Путеводители
3  Книги - Художественная литература

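Transliterate the Russian text into Latin characters:

# Cyrillic letters and their one-to-one Latin replacements (same length, same order)
symbols = ("абвгдеёжзийклмнопрстуфхцчшщъыьэюяАБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ", "abvgdeejzijklmnoprstufhzcss_y_euaABVGDEEJZIJKLMNOPRSTUFHZCSS_Y_EUA")
# Translation table: Unicode code point -> Unicode code point
english = {ord(a): ord(b) for a, b in zip(*symbols)}
data['english'] = data.eyu.str.translate(english)
print(data)

Output: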

                                 eyu                            english
0                   Книги - Открытки                   Knigi - Otkrytki
1  Книги - Познавательная литература  Knigi - Poznavatel_naa literatura
2               Книги - Путеводители               Knigi - Putevoditeli
3  Книги - Художественная литература  Knigi - Hudojestvennaa literatura