2021-08-25


License: CC 4.0 BY-SA

Tags: #python


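This post looks at a warning raised while working with a DataFrame. The warning says that when setting values on a slice of a DataFrame, `.loc[row_indexer,col_indexer] = value` is preferable to chaining square brackets. `.loc` handles both axes in a single call, which can be faster and avoids the pitfall of chained indexing, where the assignment may land on a temporary copy rather than on the original DataFrame. When filtering by condition and then assigning, prefer `.loc`.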

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

df2[df2['是否滞销']==1][df2['滞销比例%']<1]['滞销比例%'] = df2[df2['是否滞销']==1][df2['滞销比例%']<1]['滞销比例%']*100

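Running the code above produced the warning shown at the top. After reading the official documentation that the message points to, I found that I was using the DataFrame incorrectly: each chained `[]` is a separate indexing step, and boolean filtering returns a copy, so the assignment can end up on a temporary object while df2 itself stays unchanged.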
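One way the assignment above could be rewritten is with a single `.loc` call, as the warning suggests. This is a minimal sketch: only the column names come from the original code, while the sample rows and the `mask` variable are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the code above.
df2 = pd.DataFrame({
    "是否滞销": [1, 1, 0, 1],
    "滞销比例%": [0.25, 30.0, 0.5, 0.75],
})

# Combine both conditions into one boolean mask.
mask = (df2["是否滞销"] == 1) & (df2["滞销比例%"] < 1)

# Select rows and column in a single .loc call, so the assignment
# is applied to df2 itself rather than to a temporary object.
df2.loc[mask, "滞销比例%"] = df2.loc[mask, "滞销比例%"] * 100

print(df2)
```

Only the rows that satisfy both conditions are scaled; the other rows keep their values.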

The official documentation explains its example as follows:

These both yield the same results, so which should you use? It is instructive to understand the order of operations on these and why method 2 (.loc) is much preferred over method 1 (chained []).

`dfmi['one']` selects the first level of the columns and returns a DataFrame that is singly-indexed. Then another Python operation `dfmi_with_one['second']` selects the series indexed by `'second'`. This is indicated by the variable `dfmi_with_one` because pandas sees these operations as separate events. e.g. separate calls to `__getitem__`, so it has to treat them as linear operations, they happen one after another.

Contrast this to `df.loc[:,('one','second')]` which passes a nested tuple of `(slice(None),('one','second'))` to a single call to `__getitem__`. This allows pandas to deal with this as a single entity. Furthermore this order of operations can be significantly faster, and allows one to index both axes if so desired.
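The two access patterns described in the quoted passage can be compared side by side. The sketch below builds a small frame with two-level columns; the name `dfmi` follows the documentation excerpt, but the data values are invented for illustration.

```python
import pandas as pd

# Two-level column index, mirroring the 'one'/'second' labels in the excerpt.
columns = pd.MultiIndex.from_tuples(
    [("one", "first"), ("one", "second"), ("two", "first"), ("two", "second")]
)
dfmi = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], columns=columns)

# Method 1: chained [] is two separate indexing calls.
dfmi_with_one = dfmi["one"]          # first call returns an intermediate DataFrame
chained = dfmi_with_one["second"]    # second call selects from that intermediate
chained_values = chained.tolist()

# Method 2: one .loc call receives the whole key at once.
single = dfmi.loc[:, ("one", "second")]
single_values = single.tolist()

# Both approaches read the same values.
print(chained_values == single_values)

# For assignment, the single .loc call targets dfmi directly.
dfmi.loc[:, ("one", "second")] = 0
print(dfmi)
```

Reading gives the same result either way; the difference matters for assignment, where only the single `.loc` call is guaranteed to write to `dfmi` itself.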