Compilation
See the official documentation: https://clickhouse.tech/docs/en/development/build/#how-to-build-clickhouse-on-any-linux
1. Environment preparation
a. gcc
The build requires GCC 10 or later. Most systems do not ship a GCC this new by default, so it has to be installed manually. For installation steps see: https://blog.youkuaiyun.com/wulinncom/article/details/107773145
b. ninja
Reference: https://www.cnblogs.com/bjarnescottlee/p/13872893.html
A rough sketch of both installs follows below.
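As a sketch only (the version numbers and the choice to build from source are assumptions; follow the linked articles for details):

# Build and install GCC 10 from source (10.2.0 is an assumed version)
wget https://ftp.gnu.org/gnu/gcc/gcc-10.2.0/gcc-10.2.0.tar.gz
tar -xzf gcc-10.2.0.tar.gz && cd gcc-10.2.0
./contrib/download_prerequisites    # fetches GMP/MPFR/MPC into the source tree
mkdir build && cd build
../configure --disable-multilib --enable-languages=c,c++
make -j"$(nproc)" && sudo make install

# Install a prebuilt ninja binary (v1.10.2 is an assumed version)
wget https://github.com/ninja-build/ninja/releases/download/v1.10.2/ninja-linux.zip
unzip ninja-linux.zip && sudo install -m 0755 ninja /usr/local/bin/ninja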
2. Build
1. git clone --recursive https://github.com/ClickHouse/ClickHouse.git
2. cd ClickHouse
   mkdir build && cd build
3. export CC=gcc CXX=g++
   cmake -D CMAKE_INSTALL_PREFIX=/home/ck-programs ../../ClickHouse    # the path resolves back to the ClickHouse source root; the prefix controls where step 4 installs
   ninja
4. cmake -P cmake_install.cmake
When this finishes, ClickHouse is installed under /home/ck-programs.
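A quick sanity check of the installed binaries (the paths follow from the CMAKE_INSTALL_PREFIX used above):

/home/ck-programs/bin/clickhouse-server --version
/home/ck-programs/bin/clickhouse-client --version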
Single-Machine Distributed Deployment
Since only one machine is available, the whole ClickHouse cluster has to be deployed on it. The steps are as follows:
1. Install ZooKeeper
Installing ZooKeeper is straightforward: download the binary package from the official site and extract it.
Then start ZooKeeper on this machine.
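A minimal sketch of that (the ZooKeeper version and mirror are assumptions; any recent release works):

wget https://archive.apache.org/dist/zookeeper/zookeeper-3.6.3/apache-zookeeper-3.6.3-bin.tar.gz
tar -xzf apache-zookeeper-3.6.3-bin.tar.gz && cd apache-zookeeper-3.6.3-bin
cp conf/zoo_sample.cfg conf/zoo.cfg    # the sample config listens on the default port 2181
bin/zkServer.sh start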
2. Create the configuration files
I plan to deploy two nodes, with data directories at:
/home/ck-data/node1
/home/ck-data/node2
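Create the directory layout up front (the log and tmp subdirectories match the paths referenced in the configs below):

mkdir -p /home/ck-data/node1/log /home/ck-data/node1/tmp
mkdir -p /home/ck-data/node2/log /home/ck-data/node2/tmp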
Then create the configuration files for node1 and node2.
node1's config.xml:
<?xml version="1.0"?>
<yandex>
    <!-- Logging -->
    <logger>
        <level>trace</level>
        <log>/home/ck-data/node1/log/server.log</log>
        <errorlog>/home/ck-data/node1/log/error.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
    <!-- Ports -->
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <!-- This machine's hostname -->
    <interserver_http_host>a resolvable domain name is needed here if replication is used later</interserver_http_host>
    <!-- Listen address -->
    <listen_host>::</listen_host>
    <!-- Maximum number of connections -->
    <max_connections>64</max_connections>
    <!-- Seconds an idle HTTP keep-alive connection stays open -->
    <keep_alive_timeout>3</keep_alive_timeout>
    <!-- Maximum number of concurrent queries -->
    <max_concurrent_queries>16</max_concurrent_queries>
    <!-- Cache sizes, in bytes -->
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>
    <mark_cache_size>10737418240</mark_cache_size>
    <!-- Storage paths -->
    <path>/home/ck-data/node1/</path>
    <tmp_path>/home/ck-data/node1/tmp/</tmp_path>
    <!-- User configuration -->
    <users_config>users.xml</users_config>
    <default_profile>default</default_profile>
    <default_database>default</default_database>
    <remote_servers incl="clickhouse_remote_servers" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <!-- Interval (seconds) for reloading built-in dictionaries -->
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <!-- Restricts dropping large tables; 0 means no limit -->
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <include_from>/home/ck-data/metrika.xml</include_from>
    <distributed_ddl>
        <!-- Path in ZooKeeper to the queue of DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
        <cleanup_delay_period>60</cleanup_delay_period>
        <task_max_lifetime>86400</task_max_lifetime>
        <max_tasks_in_queue>1000</max_tasks_in_queue>
    </distributed_ddl>
</yandex>
node1's users.xml:
<?xml version="1.0"?>
<yandex>
    <profiles>
        <!-- Profile for read-write users -->
        <default>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
        </default>
        <!-- Profile for read-only users -->
        <readonly>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>0</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
            <readonly>1</readonly>
        </readonly>
    </profiles>
    <!-- Quotas -->
    <quotas>
        <!-- Name of quota. -->
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
    <users>
        <!-- Read-write user -->
        <default>
            <password_sha256_hex>ef797c8118f02dfb649607dd5d3f8c7623048c9c063d532cc95c5ed7a898a64f</password_sha256_hex>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <!-- Read-only user -->
        <ck>
            <password_sha256_hex>ef797c8118f02dfb649607dd5d3f8c7623048c9c063d532cc95c5ed7a898a64f</password_sha256_hex>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ck>
    </users>
</yandex>
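The password_sha256_hex values above are the SHA-256 digest of the plaintext password (here 12345678, matching the password used in metrika.xml below). The digest can be generated with:

echo -n "12345678" | sha256sum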
node2's config.xml and users.xml are almost identical to node1's; only the paths and port numbers need to change, for example:
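A sketch of the node2 settings that differ (tcp_port must be 9100 to match metrika.xml below; the HTTP and interserver ports only need to avoid clashing with node1, so 8124 and 9010 are assumed values):

<http_port>8124</http_port>
<tcp_port>9100</tcp_port>
<interserver_http_port>9010</interserver_http_port>
<log>/home/ck-data/node2/log/server.log</log>
<errorlog>/home/ck-data/node2/log/error.log</errorlog>
<path>/home/ck-data/node2/</path>
<tmp_path>/home/ck-data/node2/tmp/</tmp_path>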
The shared metrika.xml (referenced by include_from in both nodes' config.xml):
<yandex>
    <!-- Cluster configuration -->
    <clickhouse_remote_servers>
        <bip_ck_cluster>
            <!-- Shard 1 -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>localhost</host>
                    <port>9000</port>
                    <user>default</user>
                    <password>12345678</password>
                </replica>
            </shard>
            <!-- Shard 2 -->
            <shard>
                <internal_replication>false</internal_replication>
                <replica>
                    <host>localhost</host>
                    <port>9100</port>
                    <user>default</user>
                    <password>12345678</password>
                </replica>
            </shard>
        </bip_ck_cluster>
    </clickhouse_remote_servers>
    <!-- Replica name of this node (not used in this setup) -->
    <macros>
        <replica>ck1</replica>
    </macros>
    <!-- Allowed networks (seemingly duplicates the users.xml setting) -->
    <networks>
        <ip>::/0</ip>
    </networks>
    <!-- ZooKeeper -->
    <zookeeper-servers>
        <node index="1">
            <host>localhost</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <!-- Data compression -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
3. Start the ClickHouse nodes
The following script starts or stops both servers:
#!/bin/bash
if [ "$1" = "start" ]; then
    nohup /home/ck-programs/bin/clickhouse-server --config=/home/ck-data/node1/config.xml --pid-file=/home/ck-data/node1/clickhouse-server.pid &
    nohup /home/ck-programs/bin/clickhouse-server --config=/home/ck-data/node2/config.xml --pid-file=/home/ck-data/node2/clickhouse-server.pid &
else
    # Collect the PIDs of all clickhouse-server processes, excluding the watchdog and this grep itself
    pids=$(ps ux | grep clickhouse-server | grep -v clickhouse-watchdog | grep -v grep | awk '{print $2}')
    pids=$(echo $pids)    # collapse newlines into one space-separated list
    echo $pids
    # SIGINT (signal 2) asks the servers to shut down gracefully
    kill -s 2 $pids
fi
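Usage, assuming the script is saved as ck.sh:

sh ck.sh start    # start both nodes
sh ck.sh stop     # any non-start argument stops them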
4. At this point the two-node ClickHouse cluster is deployed.
You can query system.clusters to inspect the deployment:

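For example, through clickhouse-client (port, user, and password follow the configs above):

/home/ck-programs/bin/clickhouse-client --port 9000 --user default --password 12345678 --query "SELECT cluster, shard_num, replica_num, host_name, port FROM system.clusters WHERE cluster = 'bip_ck_cluster'"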
After that you can start running SQL:

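A minimal end-to-end sketch (the events table name and schema are invented for illustration): create a local MergeTree table on every node with ON CLUSTER, put a Distributed table on top of it, then insert and query through the Distributed table.

CREATE TABLE default.events_local ON CLUSTER bip_ck_cluster
(
    event_date Date,
    event_id UInt64
)
ENGINE = MergeTree()
ORDER BY event_id;

CREATE TABLE default.events ON CLUSTER bip_ck_cluster
AS default.events_local
ENGINE = Distributed(bip_ck_cluster, default, events_local, rand());

INSERT INTO default.events VALUES ('2021-01-01', 1), ('2021-01-02', 2);
SELECT count() FROM default.events;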
This article walked through compiling ClickHouse on Linux, covering the installation of the required gcc and ninja and the build steps themselves. For the single-machine distributed deployment, it showed how to configure and start a ClickHouse cluster on one machine, including installing ZooKeeper and writing the per-node configuration files, and finally how to verify the deployment and start running SQL.