1. Joining DataFrames on their index
# Assumes pandas is already imported: import pandas as pd
In [296]: piror = pd.DataFrame({'key1': ['k0', 'k1', 'k2', 'k3'], 'A': ['A0', 'A1', 'A2', 'A3']})
In [297]: rear = pd.DataFrame({'key2': ['k0', 'k1', 'k2', 'k100'], 'B': ['B0', 'B1', 'B2', 'B100']})
In [298]: piror
Out[298]:
    A key1
0  A0   k0
1  A1   k1
2  A2   k2
3  A3   k3
In [299]: rear
Out[299]:
      B  key2
0    B0    k0
1    B1    k1
2    B2    k2
3  B100  k100
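# join() aligns the two frames on their row index (0 to 3 here).
# lsuffix/rsuffix only rename overlapping column names; there are none here, so they have no effect.
In [300]: piror.join(rear, lsuffix='_piror', rsuffix='_rear')
Out[300]:
    A key1     B  key2
0  A0   k0    B0    k0
1  A1   k1    B1    k1
2  A2   k2    B2    k2
3  A3   k3  B100  k100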
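The suffix arguments in the call above only matter when both frames share a column name. The following is a minimal sketch of that case, assuming two hypothetical frames `left` and `right` (they are not part of the original session) that both contain a column called `key`:

```python
import pandas as pd

# Hypothetical frames for illustration only; both have a column named 'key'.
left = pd.DataFrame({'key': ['k0', 'k1', 'k2'], 'A': ['A0', 'A1', 'A2']})
right = pd.DataFrame({'key': ['k0', 'k1', 'k9'], 'B': ['B0', 'B1', 'B9']})

# join() matches rows by index label (0, 1, 2), not by the values in 'key'.
# Because 'key' exists on both sides, the suffixes are required to tell the
# two columns apart; without them pandas raises a ValueError.
joined = left.join(right, lsuffix='_left', rsuffix='_right')

print(list(joined.columns))
print(joined)
```

Row 2 is still paired up even though its `key` values differ (`k2` vs `k9`), which shows that the match is purely positional on the index.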
2. Joining DataFrames on a specified column
Joining on a specified index
Join:
# Make key1 the index of piror and key2 the index of rear, then attach the rows of rear whose index value matches an index value in piror
In [313]: piror.set_index('key1').join(rear.set_index('key2'))
Out[313]:
       A    B
key1
k0    A0   B0
k1    A1   B1
k2    A2   B2
k3    A3  NaN
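The call above can be broken into separate steps to make the index alignment easier to follow. This is a minimal sketch that rebuilds the same two frames; the intermediate names `left_indexed` and `right_indexed` are introduced here for illustration only:

```python
import pandas as pd

piror = pd.DataFrame({'key1': ['k0', 'k1', 'k2', 'k3'],
                      'A': ['A0', 'A1', 'A2', 'A3']})
rear = pd.DataFrame({'key2': ['k0', 'k1', 'k2', 'k100'],
                     'B': ['B0', 'B1', 'B2', 'B100']})

# Move the key columns into the index so join() can align on them.
left_indexed = piror.set_index('key1')
right_indexed = rear.set_index('key2')

# Default how='left': every row of the left frame is kept.
result = left_indexed.join(right_indexed)

# 'k3' has no partner in rear, so its B value is missing (NaN);
# 'k100' exists only in rear and is dropped by the left join.
print(result)
print(result['B'].isna().tolist())
```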
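Alternatively, set 'key2' as the index of rear only, then use the on parameter to name the column of piror that should be matched against that index.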
In [314]: piror.join(rear.set_index('key2'), on='key1')
Out[314]:
    A key1    B
0  A0   k0   B0
1  A1   k1   B1
2  A2   k2   B2
3  A3   k3  NaN
In [315]: piror.join(rear.set_index('key2'), on='key1', how='inner')
Out[315]:
    A key1   B
0  A0   k0  B0
1  A1   k1  B1
2  A2   k2  B2
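The two calls above differ only in `how`. The sketch below reruns them on the same data and, as an assumption about how one might cross-check the result, expresses the inner join with `DataFrame.merge` using `left_on` and `right_index=True`. The names `rear_by_key`, `left_join`, `inner_join` and `via_merge` are illustrative and do not appear in the original session:

```python
import pandas as pd

piror = pd.DataFrame({'key1': ['k0', 'k1', 'k2', 'k3'],
                      'A': ['A0', 'A1', 'A2', 'A3']})
rear = pd.DataFrame({'key2': ['k0', 'k1', 'k2', 'k100'],
                     'B': ['B0', 'B1', 'B2', 'B100']})

rear_by_key = rear.set_index('key2')

# how='left' is the default: all rows of piror are kept.
left_join = piror.join(rear_by_key, on='key1')

# how='inner': only keys present in both frames survive.
inner_join = piror.join(rear_by_key, on='key1', how='inner')

# The same inner join written with merge(): match piror's 'key1' column
# against the index of rear_by_key.
via_merge = piror.merge(rear_by_key, left_on='key1', right_index=True,
                        how='inner')

print(len(left_join), len(inner_join))
print(inner_join)
print(via_merge)
```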
Other join types:
# Right join
In [316]: piror.join(rear, how='right', lsuffix='_piror', rsuffix='_rear')
Out[316]:
    A key1     B  key2
0  A0   k0    B0    k0
1  A1   k1    B1    k1
2  A2   k2    B2    k2
3  A3   k3  B100  k100
# Outer join
In [317]: piror.join(rear.set_index('key2'), on='key1', how='outer')
Out[317]:
     A  key1     B
0   A0    k0    B0
1   A1    k1    B1
2   A2    k2    B2
3   A3    k3   NaN
3  NaN  k100  B100
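To see how the `how` argument changes which keys survive, the four join types can be run side by side. This is a minimal sketch on the same data; the names `rear_by_key` and `row_counts` are illustrative. It compares row counts only, because the row labels given to rows that exist only in the right frame may differ between pandas versions:

```python
import pandas as pd

piror = pd.DataFrame({'key1': ['k0', 'k1', 'k2', 'k3'],
                      'A': ['A0', 'A1', 'A2', 'A3']})
rear = pd.DataFrame({'key2': ['k0', 'k1', 'k2', 'k100'],
                     'B': ['B0', 'B1', 'B2', 'B100']})

rear_by_key = rear.set_index('key2')

row_counts = {}
for how in ('left', 'right', 'inner', 'outer'):
    joined = piror.join(rear_by_key, on='key1', how=how)
    # left:  all keys of piror        right: all keys of rear
    # inner: keys found in both       outer: keys found in either
    row_counts[how] = len(joined)

print(row_counts)
```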