Getting Started with pandas (Part 1)

10 Minutes to pandas

See the official documentation.

In [1]:
# Use the inline plotting backend
%matplotlib inline
In [2]:
# Import packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Creating data objects

In [3]:
# A Series can be thought of as a one-dimensional array
s = pd.Series([1, 3, 5, np.nan, 6, 8])
s
Out[3]:
0      1
1      3
2      5
3    NaN
4      6
5      8
dtype: float64
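The output above reports dtype float64 even though most of the values were written as integers. A minimal sketch of why, assuming a reasonably recent pandas (the name s_int is illustrative and not part of the original notebook): np.nan is itself a float, so the whole Series is stored as floats.

```python
import numpy as np
import pandas as pd

# np.nan is a float, so mixing it with integers yields a float64 Series.
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print(s.dtype)          # float64
print(s.isna().sum())   # number of missing values: 1

# Without the NaN, pandas infers an integer dtype instead.
s_int = pd.Series([1, 3, 5, 6, 8])
print(s_int.dtype)
```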
In [4]:
# A DataFrame can be thought of as a two-dimensional array; its index can be specified explicitly
dates = pd.date_range('20160301', periods=6)
# periods: integer or None (default None); the number of date index values to generate.
# If periods is None, both start and end must be given.
dates
Out[4]:
DatetimeIndex(['2016-03-01', '2016-03-02', '2016-03-03', '2016-03-04',
               '2016-03-05', '2016-03-06'],
              dtype='datetime64[ns]', freq='D')
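The comment about periods can be illustrated with a short sketch. It assumes the default daily frequency; the names by_periods and by_end are illustrative only. The same six dates are built once from a start and a count, and once from a start and an end.

```python
import pandas as pd

# Six daily dates, given as a start date plus a count...
by_periods = pd.date_range('20160301', periods=6)
# ...or as a start date plus an end date (periods left as None).
by_end = pd.date_range(start='20160301', end='20160306')

print(by_periods.equals(by_end))  # True
```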
In [5]:
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df
Out[5]:
                   A         B         C         D
2016-03-01  1.188983 -1.150119 -0.700588  0.439065
2016-03-02 -2.041544  1.084507 -0.335441  1.969754
2016-03-03  1.204151 -1.277714 -0.230671  0.629063
2016-03-04 -0.352351 -1.701585 -0.034294 -0.330139
2016-03-05  0.627601 -0.292939  0.457975  2.262402
2016-03-06 -1.121869 -0.533223  0.627452  0.412665
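Because the values come from np.random.randn, re-running the cell produces different numbers each time. One possible way to make such a table reproducible is sketched below. It assumes NumPy 1.17 or later for np.random.default_rng, and the seed 0 and the names df_seeded and df_again are arbitrary choices that do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20160301', periods=6)

# A seeded generator returns the same draws on every run (seed 0 is arbitrary).
rng = np.random.default_rng(0)
df_seeded = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list('ABCD'))

# Rebuilding with the same seed reproduces the table exactly.
df_again = pd.DataFrame(np.random.default_rng(0).standard_normal((6, 4)),
                        index=dates, columns=list('ABCD'))
print(df_seeded.equals(df_again))  # True
```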
In [6]:
df.values
Out[6]:
array([[ 1.18898298, -1.15011854, -0.70058776,  0.43906549],
       [-2.04154443,  1.08450747, -0.33544069,  1.96975377],
       [ 1.2041512 , -1.27771421, -0.23067059,  0.62906316],
       [-0.35235094, -1.70158492, -0.03429361, -0.33013878],
       [ 0.62760104, -0.29293918,  0.45797463,  2.26240237],
       [-1.12186945, -0.53322343,  0.6274522 ,  0.41266481]])
In [7]:
# Create from a dict: each key becomes a DataFrame column; each value supplies that column's data
df = pd.DataFrame({
    'A': 1,
    'B': pd.Timestamp('20160301'),
    'C': range(4),
    'D': np.arange(5, 9),
    'E': 'text',
    'F': ['AA', 'BB', 'CC', 'DD']})
df
Out[7]:
   A          B  C  D     E   F
0  1 2016-03-01  0  5  text  AA
1  1 2016-03-01  1  6  text  BB
2  1 2016-03-01  2  7  text  CC
3  1 2016-03-01  3  8  text  DD
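In the table above the scalar values (1, the Timestamp, 'text') are repeated on every row, while the list-like values set the number of rows. A reduced sketch of that behaviour, using the illustrative name df_mixed, which is not part of the original notebook:

```python
import pandas as pd

df_mixed = pd.DataFrame({
    'A': 1,          # scalar: repeated on every row
    'C': range(4),   # list-like: determines that there are 4 rows
    'E': 'text',     # scalar: repeated on every row
})

print(len(df_mixed))            # 4
print(df_mixed['A'].tolist())   # [1, 1, 1, 1]
print(df_mixed['E'].tolist())   # ['text', 'text', 'text', 'text']
```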
In [8]:
df.dtypes
Out[8]:
A             int64
B    datetime64[ns]
C             int64
D             int64
E            object
F            object
dtype: object
In [9]:
df.A
Out[9]:
0    1
1    1
2    1
3    1
Name: A, dtype: int64
In [10]:
type(df.A)
Out[10]:
pandas.core.series.Series

Viewing data

In [11]:
# Create the dataset
n_rows = 6
dates = pd.date_range('20160301', periods=n_rows)
df = pd.DataFrame(np.random.randn(n_rows, 4), index=dates, columns=list('ABCD'))
df
Out[11]:
                   A         B         C         D
2016-03-01  1.313419  0.826457 -1.574146  0.525008
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923
In [12]:
df.shape
Out[12]:
(6, 4)
In [13]:
df.head()
Out[13]:
                   A         B         C         D
2016-03-01  1.313419  0.826457 -1.574146  0.525008
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
In [14]:
df.head(3)
Out[14]:
                   A         B         C         D
2016-03-01  1.313419  0.826457 -1.574146  0.525008
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
In [15]:
df.tail()
Out[15]:
                   A         B         C         D
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923
In [16]:
df.tail(2)
Out[16]:
                   A         B         C         D
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923
In [17]:
df.index
Out[17]:
DatetimeIndex(['2016-03-01', '2016-03-02', '2016-03-03', '2016-03-04',
               '2016-03-05', '2016-03-06'],
              dtype='datetime64[ns]', freq='D')
In [18]:
df.columns
Out[18]:
Index([u'A', u'B', u'C', u'D'], dtype='object')
In [19]:
df.values
Out[19]:
array([[ 1.31341924,  0.82645709, -1.57414606,  0.52500758],
       [ 0.02839742, -1.00934929,  0.32701362,  0.91824786],
       [-0.85700833, -1.68269525,  0.646229  , -0.18337746],
       [-1.11288513, -1.49166212, -1.11482404, -0.11561882],
       [-0.44871305, -0.16365107, -1.23029491,  1.10665563],
       [-0.26786722,  0.09231292, -0.48023763, -0.80992272]])
In [20]:
df.describe()
Out[20]:
              A         B         C         D
count  6.000000  6.000000  6.000000  6.000000
mean  -0.224110 -0.571431 -0.571043  0.240165
std    0.856808  0.983304  0.898112  0.734900
min   -1.112885 -1.682695 -1.574146 -0.809923
25%   -0.754935 -1.371084 -1.201427 -0.166438
50%   -0.358290 -0.586500 -0.797531  0.204694
75%   -0.045669  0.028322  0.125201  0.819938
max    1.313419  0.826457  0.646229  1.106656
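To make the rows of the describe() summary more concrete, the sketch below recomputes two of them by hand on a tiny frame. The name df_demo and its values are illustrative and not from the original notebook. It suggests that the std row is the sample standard deviation (ddof=1) and that the 50% row is the median.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({'A': [1.0, 2.0, 3.0, 4.0]})
summary = df_demo.describe()

# The 'std' row matches Series.std(), which uses ddof=1 (sample standard deviation).
print(summary.loc['std', 'A'])
print(np.std(df_demo['A'].to_numpy(), ddof=1))

# The '50%' row is the median of the column.
print(summary.loc['50%', 'A'])   # 2.5
print(df_demo['A'].median())     # 2.5
```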
In [21]:
df.T
Out[21]:
   2016-03-01 00:00:00  2016-03-02 00:00:00  2016-03-03 00:00:00  2016-03-04 00:00:00  2016-03-05 00:00:00  2016-03-06 00:00:00
A             1.313419             0.028397            -0.857008            -1.112885            -0.448713            -0.267867
B             0.826457            -1.009349            -1.682695            -1.491662            -0.163651             0.092313
C            -1.574146             0.327014             0.646229            -1.114824            -1.230295            -0.480238
D             0.525008             0.918248            -0.183377            -0.115619             1.106656            -0.809923
In [22]:
df.T.shape
Out[22]:
(4, 6)
In [23]:
df.sort_index(axis=1, ascending=False)

# sort_index() sorts by index labels
# df.sort_index()  # sorts by row labels, ascending by default
# df.sort_index(axis=1, ascending=False)  # sorts by column labels, in descending order
Out[23]:
                   D         C         B         A
2016-03-01  0.525008 -1.574146  0.826457  1.313419
2016-03-02  0.918248  0.327014 -1.009349  0.028397
2016-03-03 -0.183377  0.646229 -1.682695 -0.857008
2016-03-04 -0.115619 -1.114824 -1.491662 -1.112885
2016-03-05  1.106656 -1.230295 -0.163651 -0.448713
2016-03-06 -0.809923 -0.480238  0.092313 -0.267867
In [24]:
df.sort_values(by='C')
# sort_values() orders rows by the values in the given column.
# ascending defaults to True (low to high); pass ascending=False for high to low,
# e.g. df.sort_values('mpg', ascending=False) on a DataFrame that has an 'mpg' column.
Out[24]:
                   A         B         C         D
2016-03-01  1.313419  0.826457 -1.574146  0.525008
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
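The table above is ordered from the lowest to the highest value of C, which is the default direction for sort_values. A small sketch of both directions, with a made-up three-row frame (df_demo and its labels x, y, z are illustrative only):

```python
import pandas as pd

df_demo = pd.DataFrame({'C': [0.3, -1.5, 0.6]}, index=['x', 'y', 'z'])

asc = df_demo.sort_values(by='C')                     # default: ascending=True
desc = df_demo.sort_values(by='C', ascending=False)   # high to low

print(asc.index.tolist())    # ['y', 'x', 'z']
print(desc.index.tolist())   # ['z', 'x', 'y']
```

Note that sort_values returns a new DataFrame; df_demo itself keeps its original row order.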

Selecting data

In [25]:
df['A']
# df[['A', 'B']] selects two columns
Out[25]:
2016-03-01    1.313419
2016-03-02    0.028397
2016-03-03   -0.857008
2016-03-04   -1.112885
2016-03-05   -0.448713
2016-03-06   -0.267867
Freq: D, Name: A, dtype: float64
In [26]:
df[2:4]
Out[26]:
                   A         B         C         D
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
In [27]:
df['20160302':'20160305']
Out[27]:
                   A         B         C         D
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
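The two slices above behave differently at their end points: the integer slice stops before position 4, while the date-string slice includes both end dates. A minimal sketch with deterministic data follows. The name df_demo is illustrative, and the behaviour of plain [] slicing is assumed to match the pandas version in use.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20160301', periods=6)
df_demo = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

by_position = df_demo[2:4]                   # positions 2 and 3; the stop is excluded
by_label = df_demo['20160302':'20160305']    # both end dates are included

print(len(by_position))   # 2
print(len(by_label))      # 4
```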

Selection by label

In [28]:
df.loc['20160301']
Out[28]:
A    1.313419
B    0.826457
C   -1.574146
D    0.525008
Name: 2016-03-01 00:00:00, dtype: float64
In [29]:
type(df.loc['20160301'])
Out[29]:
pandas.core.series.Series
In [30]:
df.loc[:, ['A', 'B']]
# select columns A and B
Out[30]:
                   A         B
2016-03-01  1.313419  0.826457
2016-03-02  0.028397 -1.009349
2016-03-03 -0.857008 -1.682695
2016-03-04 -1.112885 -1.491662
2016-03-05 -0.448713 -0.163651
2016-03-06 -0.267867  0.092313
In [31]:
df.loc['20160301':'20160305', ['A', 'B']]

# select specific rows and specific columns
Out[31]:
                   A         B
2016-03-01  1.313419  0.826457
2016-03-02  0.028397 -1.009349
2016-03-03 -0.857008 -1.682695
2016-03-04 -1.112885 -1.491662
2016-03-05 -0.448713 -0.163651
In [32]:
df.loc['2016-03-01', 'A']
Out[32]:
1.3134192362700037
In [33]:
df.at[pd.Timestamp('2016-03-01'), 'A']
# df.at['2016-03-01', 'A'] will raise an error
Out[33]:
1.3134192362700037
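The two scalar lookups above return the same value. The sketch below repeats them on deterministic data (df_demo is an illustrative name). Whether .at accepts a plain date string may depend on the pandas version, so the sketch passes an explicit pd.Timestamp, as the notebook does.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20160301', periods=6)
df_demo = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

via_loc = df_demo.loc['2016-03-01', 'A']                # .loc parses the date string
via_at = df_demo.at[pd.Timestamp('2016-03-01'), 'A']    # .at is given an exact Timestamp key

print(via_loc == via_at)   # True
```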

Selection by position

In [34]:
df.iloc[1]
Out[34]:
A    0.028397
B   -1.009349
C    0.327014
D    0.918248
Name: 2016-03-02 00:00:00, dtype: float64
In [35]:
df.iloc[2:5, 0:2]
# select rows 2, 3, 4 and columns 0, 1
Out[35]:
                   A         B
2016-03-03 -0.857008 -1.682695
2016-03-04 -1.112885 -1.491662
2016-03-05 -0.448713 -0.163651
In [36]:
df.iloc[1:5, :]
# df.iloc[1:5] also works
Out[36]:
                   A         B         C         D
2016-03-02  0.028397 -1.009349  0.327014  0.918248
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
In [37]:
df.iloc[1, 1]  # the single value at row position 1, column position 1
Out[37]:
-1.009349292057921
In [38]:
df.iat[1, 1]  # iat returns the same scalar
Out[38]:
-1.009349292057921
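The positional selections above can be checked on a frame with predictable values. This is a minimal sketch: the frame `demo` and its integer contents are made up for illustration and are not the random data used in this walkthrough.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with predictable values (not the random df used above)
dates = pd.date_range("20160301", periods=6)
demo = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

row = demo.iloc[1]            # one row, returned as a Series
block = demo.iloc[2:5, 0:2]   # rows 2, 3, 4 and columns 0, 1 (slice ends are excluded)
value = demo.iat[1, 1]        # scalar at row position 1, column position 1

print(row.tolist())
print(block.shape)
print(value == demo.iloc[1, 1])
```

`iat` only accepts a single row/column position, so it suits scalar lookups; `iloc` also accepts slices and lists.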

Boolean indexing

In [39]:
df[df.A < 0]  # filter: keep the rows where column A is negative
Out[39]:
                   A         B         C         D
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923
In [40]:
df[df > 0]
Out[40]:
                   A         B         C         D
2016-03-01  1.313419  0.826457       NaN  0.525008
2016-03-02  0.028397       NaN  0.327014  0.918248
2016-03-03       NaN       NaN  0.646229       NaN
2016-03-04       NaN       NaN       NaN       NaN
2016-03-05       NaN       NaN       NaN  1.106656
2016-03-06       NaN  0.092313       NaN       NaN
In [41]:
df['tag'] = ['a'] * 2 + ['b'] * 2 + ['c'] * 2  # add a column
In [42]:
df
Out[42]:
                   A         B         C         D tag
2016-03-01  1.313419  0.826457 -1.574146  0.525008   a
2016-03-02  0.028397 -1.009349  0.327014  0.918248   a
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377   b
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619   b
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656   c
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923   c
In [43]:
df[df.tag.isin(['a', 'c'])]  # filter with isin
Out[43]:
                   A         B         C         D tag
2016-03-01  1.313419  0.826457 -1.574146  0.525008   a
2016-03-02  0.028397 -1.009349  0.327014  0.918248   a
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656   c
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923   c
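The three kinds of mask used in this section (a row mask from one column, an element-wise mask over the whole frame, and `isin`) can be compared side by side. The sketch below uses a small made-up frame `demo`; combining two conditions with `&` is an addition for illustration and does not appear in the cells above.

```python
import pandas as pd

# Hypothetical data, chosen so the results are easy to verify by eye
demo = pd.DataFrame({"A": [1.5, -0.5, -2.0, 0.3], "tag": ["a", "b", "c", "a"]})

neg = demo[demo.A < 0]                              # row mask: keeps whole rows
neg_c = demo[(demo.A < 0) & demo.tag.isin(["c"])]   # two conditions; parentheses are required

nums = demo[["A"]]
masked = nums[nums > 0]                             # element-wise mask: False cells become NaN

print(neg.index.tolist())
print(neg_c.index.tolist())
print(masked.equals(nums.where(nums > 0)))
```

A row mask drops rows, while an element-wise mask keeps the shape and fills the non-matching cells with NaN, which is the same result `DataFrame.where` gives.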

Modifying data

In [44]:
Python
<span class="n">df</span>
1
< span class = "n" > df < / span >
Out[44]:
ABCDtag
2016-03-011.3134190.826457-1.5741460.525008a
2016-03-020.028397-1.0093490.3270140.918248a
2016-03-03-0.857008-1.6826950.646229-0.183377b
2016-03-04-1.112885-1.491662-1.114824-0.115619b
2016-03-05-0.448713-0.163651-1.2302951.106656c
2016-03-06-0.2678670.092313-0.480238-0.809923c
In [45]:
s = pd.Series(np.arange(6), index=pd.date_range('20160301', periods=6))
s
Out[45]:
2016-03-01    0
2016-03-02    1
2016-03-03    2
2016-03-04    3
2016-03-05    4
2016-03-06    5
Freq: D, dtype: int64
In [46]:
df['E'] = s
In [47]:
df
Out[47]:
                   A         B         C         D tag  E
2016-03-01  1.313419  0.826457 -1.574146  0.525008   a  0
2016-03-02  0.028397 -1.009349  0.327014  0.918248   a  1
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377   b  2
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619   b  3
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656   c  4
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923   c  5
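Assigning a Series as a new column matches values by index label, not by position. In the cell above the Series and the frame share the same dates, so this is easy to miss. The sketch below makes it visible with a made-up Series `part` that covers only three of the dates, in reverse order.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20160301", periods=6)
demo = pd.DataFrame({"A": np.arange(6.0)}, index=dates)

# Hypothetical Series: only the first three dates, listed in reverse order
part = pd.Series([30, 20, 10], index=dates[[2, 1, 0]])

demo["E"] = part   # values are aligned on the date labels
print(demo)
```

Each value lands on the row with the matching date, and the dates missing from `part` are filled with NaN, which also turns the column into floats.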
In [48]:
df.loc['20160301', 'A'] = 0.2  # use .loc here; df['20160301', 'A'] = 0.2 would not update this cell
In [49]:
df
Out[49]:
                   A         B         C         D tag  E
2016-03-01  0.200000  0.826457 -1.574146  0.525008   a  0
2016-03-02  0.028397 -1.009349  0.327014  0.918248   a  1
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377   b  2
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619   b  3
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656   c  4
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923   c  5
In [50]:
df.at[pd.Timestamp('20160301'), 'A'] = 0.4
In [51]:
df
Out[51]:
                   A         B         C         D tag  E
2016-03-01  0.400000  0.826457 -1.574146  0.525008   a  0
2016-03-02  0.028397 -1.009349  0.327014  0.918248   a  1
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377   b  2
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619   b  3
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656   c  4
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923   c  5
In [52]:
df.iat[0, 0] = 0.6
df
Out[52]:
                   A         B         C         D tag  E
2016-03-01  0.600000  0.826457 -1.574146  0.525008   a  0
2016-03-02  0.028397 -1.009349  0.327014  0.918248   a  1
2016-03-03 -0.857008 -1.682695  0.646229 -0.183377   b  2
2016-03-04 -1.112885 -1.491662 -1.114824 -0.115619   b  3
2016-03-05 -0.448713 -0.163651 -1.230295  1.106656   c  4
2016-03-06 -0.267867  0.092313 -0.480238 -0.809923   c  5
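The last three assignments all write to the same cell through different accessors. The sketch below repeats them on a made-up single-column frame `demo` and reads the cell back after each step.

```python
import numpy as np
import pandas as pd

# Hypothetical single-column frame on the same date index as above
dates = pd.date_range("20160301", periods=6)
demo = pd.DataFrame({"A": np.zeros(6)}, index=dates)

demo.loc["20160301", "A"] = 0.2                # label-based; the string is parsed as a date
after_loc = demo.iat[0, 0]

demo.at[pd.Timestamp("20160301"), "A"] = 0.4   # label-based scalar access
after_at = demo.iat[0, 0]

demo.iat[0, 0] = 0.6                           # position-based scalar access
after_iat = demo.loc[dates[0], "A"]

print(after_loc, after_at, after_iat)
```

`loc` and `at` address the cell by label, `iat` by position. `at` and `iat` are limited to a single cell, whereas `loc` also accepts slices and lists.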
In [53]:
df.loc[:, 'A'] = np.arange(10, 16)
df
Out[53]:
             A         B         C         D tag  E
2016-03-01  10  0.826457 -1.574146  0.525008   a  0
2016-03-02  11 -1.009349  0.327014  0.918248   a  1
2016-03-03  12 -1.682695  0.646229 -0.183377   b  2
2016-03-04  13 -1.491662 -1.114824 -0.115619   b  3
2016-03-05  14 -0.163651 -1.230295  1.106656   c  4
2016-03-06  15  0.092313 -0.480238 -0.809923   c  5
In [54]:
df2 = df.loc[:, ['B', 'C']].copy()
df2[df2 > 0] = -df2
df2
Out[54]:
                   B         C
2016-03-01 -0.826457 -1.574146
2016-03-02 -1.009349 -0.327014
2016-03-03 -1.682695 -0.646229
2016-03-04 -1.491662 -1.114824
2016-03-05 -0.163651 -1.230295
2016-03-06 -0.092313 -0.480238
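The masked assignment `df2[df2 > 0] = -df2` negates only the positive entries, so every value ends up non-positive. A minimal sketch on made-up numbers follows; the comparison with `-demo.abs()` is added here as a cross-check and is not part of the original notebook.

```python
import pandas as pd

# Hypothetical values with mixed signs
demo = pd.DataFrame({"B": [0.8, -1.0, 0.09], "C": [-1.5, 0.3, -0.4]})

flipped = demo.copy()             # work on a copy so demo stays unchanged
flipped[flipped > 0] = -flipped   # only cells where the mask is True are overwritten

alt = -demo.abs()                 # same result without a mask

print(flipped)
print(flipped.equals(alt))
```

Taking `.copy()` first, as the cell above does, keeps the assignment from touching the source frame.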


