一、python连接外部数据表
https://clickhouse-driver.readthedocs.io/en/latest/features.html
https://github.com/mymarilyn/clickhouse-driver
from clickhouse_driver import Client
client = Client('172.xx.x.xx')
res= client.execute(
'''
select XXXXXXXXXXXX from xxx
''',external_tables=tables, with_column_types=True
)
res[0],res[1]
引号里写正常的sql,可以写外部表的也可以写数据库表的。
二、用户权限设置
https://www.cnblogs.com/DBArtist/p/clickhouse_access.html 转载
https://clickhouse.tech/docs/zh/sql-reference/statements/grant/ 转载
创建账号:
CREATE USER [IF NOT EXISTS | OR REPLACE] name [ON CLUSTER cluster_name]
[IDENTIFIED [WITH {NO_PASSWORD|PLAINTEXT_PASSWORD|SHA256_PASSWORD|SHA256_HASH|DOUBLE_SHA1_PASSWORD|DOUBLE_SHA1_HASH}] BY {'password'|'hash'}]
[HOST {LOCAL | NAME 'name' | REGEXP 'name_regexp' | IP 'address' | LIKE 'pattern'} [,...] | ANY | NONE]
[DEFAULT ROLE role [,...]]
[SETTINGS variable [= value] [MIN [=] min_value] [MAX [=] max_value] [READONLY|WRITABLE] | PROFILE 'profile_name'] [,...]
<-- 例如 -->
CREATE USER test1 on cluster `cluster_name`IDENTIFIED WITH PLAINTEXT_PASSWORD BY '1234qwer'
# 或者在每台机器上
cd /etc/clickhouse-server
# 修改users.xml中的<users>
vim users.xml
授权用户:
GRANT [ON CLUSTER cluster_name] role [,...] TO {user | another_role | CURRENT_USER} [,...] [WITH ADMIN OPTION]
<-- 例如 -->
GRANT ON CLUSTER `cluster_name` SELECT ON default.* TO test1
创建角色:
因为没找到如何在集群上创建角色,只成功了单机上创建角色。所以先不做记录。:(
三、分片和副本
1. 使用ReplicatedMergeTree家族引擎创建副本:
https://blog.youkuaiyun.com/qq_28603127/article/details/109753395 转载
2. 使用分布表引擎:
https://blog.youkuaiyun.com/weixin_39025362/article/details/109091321 转载
<-- 登录任意节点客户端,在集群xxx上建立本地表table_name_local -->
CREATE TABLE db_name.table_name_local on cluster cluster_name
(
`xx` UInt64,
`xx_date` Date
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/table_name_local', '{replica}')
PARTITION BY toYYYYMM(xx_date)
ORDER BY (xx)
SETTINGS index_granularity = 8192
<-- 建立分布式表 -->
CREATE TABLE db_name.table_name on cluster cluster_name
(
`xx` UInt64,
`xx_date` Date
)
ENGINE = Distributed('cluster_name', 'db_name', 'table_name_local', intHash64(xx))
<-- 可在任意节点上对分布式表进行读写 -->
clickhouse-client -h xx.xx.xx.x -u xx --password xx --query="insert into db_name.table_name format TSV" < table_name.tsv
3. 集群方案ReplicatedMergeTree+Distributed:
https://zhuanlan.zhihu.com/p/161242274?utm_source=wechat_session 转载
分布式表尽量只读不写,写入时轮巡local表。
4. 分布式子查询会遇到的坑:
https://blog.youkuaiyun.com/qq_23160237/article/details/105683601 转载,博主举例详细解释了各种情况,超棒!
四、grafana监控
1. 安装配置grafana
#centos系统
wget https://dl.grafana.com/oss/release/grafana-6.7.2-1.x86_64.rpm
sudo yum install grafana-6.7.2-1.x86_64.rpm
#下载clickhouse插件
grafana-cli plugins install vertamedia-clickhouse-datasource
#开启grafana服务
systemctl start grafana-server.service
#测试服务是否启动
curl 192.xx.xx.xx:3000
#浏览器访问grafana,默认账号密码皆为admin
2. 建立dashboard
https://blog.youkuaiyun.com/jiangshouzhuang/article/details/103759969 转载
可以直接下载别人配置好的dashboard的json文件,直接在grafana中导入:
https://grafana.com/grafana/dashboards/2515/revisions 转载 这个主要是query_log的模板。