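Test environment: Python 3.5, Impala 2.10.0, Impyla 0.15.0
Impyla is a Python client for HiveServer2 implementations of distributed query engines (such as Impala and Hive).
1. Install Impyla
Install the dependencies: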
sudo pip install six
sudo pip install bitarray
sudo pip install thriftpy
Install Impyla:
sudo pip install impyla
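Before moving on, it can help to confirm that the packages are visible to the interpreter. The following is a minimal sketch, not part of Impyla: the helper name missing_modules and the REQUIRED_MODULES list are hypothetical, and the list assumes the usual import names (Impyla is imported as "impala").

```python
from importlib.util import find_spec

# Import names of the packages installed above (assumed; note that
# Impyla's import name is "impala", not "impyla").
REQUIRED_MODULES = ["six", "bitarray", "thriftpy", "impala"]


def missing_modules(names):
    """Return the top-level module names that the interpreter cannot find."""
    return [name for name in names if find_spec(name) is None]
```

Calling print(missing_modules(REQUIRED_MODULES)) should print an empty list when everything is installed for the Python that runs the script; any name it prints still needs to be installed.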
2. Test the connection to Impala
#-*- coding: utf-8 -*-
from impala.dbapi import connect
conn = connect(host='192.168.1.188', port=21050)
#conn = connect(host=host, port=port_impala, user='', password='', auth_mechanism='')
cur = conn.cursor()
cur.execute('select name from user limit 10')
data_list = cur.fetchall()
for data in data_list:
    print("User name: " + str(data[0]))
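The script above never closes its cursor. As a sketch of one way to tidy that up, the helper below wraps the cursor in contextlib.closing so it is released even if the query fails. The name fetch_rows is hypothetical; it relies only on the standard DB-API calls (cursor, execute, fetchall, close) that the script above already uses.

```python
from contextlib import closing


def fetch_rows(conn, sql):
    """Run one query on a DB-API connection and return all rows.

    The cursor is closed when the block exits, even on error.
    The connection itself is left open for the caller to close.
    """
    with closing(conn.cursor()) as cur:
        cur.execute(sql)
        return cur.fetchall()
```

With the connection from the script above, this would be called as rows = fetch_rows(conn, 'select name from user limit 10'), followed by conn.close() once no more queries are needed.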
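3. Install thrift_sasl
thrift_sasl is a dependency for connecting to Hive. Version 0.2.1 is required here, because version 0.3.0, which pip installs by default, fails with the error: 'TSocket' object has no attribute 'isOpen'.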
sudo pip install thrift-sasl==0.2.1
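To check that the pinned version is the one actually in use, the installed version can be read from the package metadata. This is a rough sketch with hypothetical helper names (installed_version, matches_pin). It uses importlib.metadata where available (Python 3.8+) and falls back to pkg_resources from setuptools on older interpreters such as the Python 3.5 used here.

```python
def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Python < 3.8: fall back to pkg_resources (ships with setuptools).
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def matches_pin(dist_name, wanted):
    """True only if the distribution is installed at exactly `wanted`."""
    return installed_version(dist_name) == wanted
```

For this setup, matches_pin("thrift-sasl", "0.2.1") is expected to return True after the install command above; a False result suggests that a different version, or none, is installed for the running interpreter.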
4. Test the connection to Hive
#-*- coding: utf-8 -*-
from impala.dbapi import connect
conn = connect(host='192.168.1.188', port=10000)
#conn = connect(host=host, port=port_hive, user='', password='', auth_mechanism='')
cur = conn.cursor()
cur.execute("select * from abc where date='2019-05-28'")
data_list = cur.fetchall()
for data in data_list:
    print(data)