Log in to the control machine as a regular (non-root) user, the tidb user in this example. All subsequent TiUP installation and cluster management operations are performed as this user.
Install TiUP and the cluster component
Run the following command to install the TiUP tool:
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh
Set the TiUP environment variables by reloading the shell profile:
source .bash_profile
Confirm that the TiUP tool is installed:
which tiup
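If which tiup returns a path, you can also print the installed version as a quick sanity check (a minimal sketch; the version flag is standard in TiUP):
tiup --version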
Install the TiUP cluster component:
tiup cluster
If it is already installed, update the TiUP cluster component to the latest version:
tiup update --self && tiup update cluster
Verify the current TiUP cluster version. Run the following command to check the TiUP cluster component version:
tiup --binary cluster
Because this simulates a multi-machine deployment, increase the connection limit of the sshd service as the root user:
Edit /etc/ssh/sshd_config and raise MaxSessions to 20.
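The same change can be applied non-interactively, as in the sketch below (assumes a single MaxSessions entry that may be commented out; a backup copy is written first):
sed -i.bak -E 's/^#?MaxSessions.*/MaxSessions 20/' /etc/ssh/sshd_config
grep MaxSessions /etc/ssh/sshd_config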
Restart the sshd service:
service sshd restart
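To confirm that sshd picked up the new limit, you can dump its effective configuration (a sketch; sshd -T requires root and a reasonably recent OpenSSH):
sshd -T | grep -i maxsessions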
Create and start the cluster
Following the configuration template below, edit the configuration file and name it prd-mfg-cluster.yaml:
# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/app/tidb/deploy"
  data_dir: "/app/tidb/data"
  arch: "amd64"    # Supported values: "amd64", "arm64" (default: "amd64")

# # Monitored variables are applied to all the machines.
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: "/app/tidb/monitored/monitored-9100"
  data_dir: "/app/tidb/monitored/monitored-9100/data"
  log_dir: "/app/tidb/monitored/monitored-9100/log"

server_configs:
  tidb:
    alter-primary-key: true
    binlog.enable: false
    binlog.ignore-error: false
    log.slow-threshold: 2000
    lower-case-table-names: 1
    max-index-length: 12288
    mem-quota-query: 2147483648
    oom-action: log
    performance.max-txn-ttl: 3600000
  tikv:
    # server.grpc-concurrency: 4
    # raftstore.apply-pool-size: 2
    # raftstore.store-pool-size: 2
    # rocksdb.max-sub-compactions: 1
    # storage.block-cache.capacity: "16GB"
    # readpool.unified.max-thread-count: 12
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    schedule.leader-schedule-limit: 4
    schedule.region-schedule-limit: 2048
    schedule.replica-schedule-limit: 64

pd_servers:
  - host: 10.254.212.17
    ssh_port: 22
    name: "pd-1"
    client_port: 2379
    peer_port: 2380
    deploy_dir: "/app/tidb/deploy/pd-2379"
    data_dir: "/app/tidb/data/pd-2379"
    log_dir: "/app/tidb/deploy/pd-2379/log"
  - host: 10.254.212.18
    ssh_port: 22
    name: "pd-2"
    client_port: 2379
    peer_port: 2380
    deploy_dir: "/app/tidb/deploy/pd-2379"
    data_dir: "/app/tidb/data/pd-2379"
    log_dir: "/app/tidb/deploy/pd-2379/log"
  - host: 10.254.212.19
    ssh_port: 22
    name: "pd-3"
    client_port: 2379
    peer_port: 2380
    deploy_dir: "/app/tidb/deploy/pd-2379"
    data_dir: "/app/tidb/data/pd-2379"
    log_dir: "/app/tidb/deploy/pd-2379/log"

tidb_servers:
  - host: 10.254.212.11
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: "/app/tidb/deploy/tidb-4000"
    log_dir: "/app/tidb/deploy/tidb-4000/log"
    config:
      log.level: warn
      log.slow-query-file: tidb_slow_query.log
  - host: 10.254.212.12
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: "/app/tidb/deploy/tidb-4000"
    log_dir: "/app/tidb/deploy/tidb-4000/log"
    config:
      log.level: warn
      log.slow-query-file: tidb_slow_query.log
  - host: 10.254.212.13
    ssh_port: 22
    port: 4000
    status_port: 10080
    deploy_dir: "/app/tidb/deploy/tidb-4000"
    log_dir: "/app/tidb/deploy/tidb-4000/log"
    config:
      log.level: warn
      log.slow-query-file: tidb_slow_query.log

tikv_servers:
  - host: 10.254.212.14
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: "/app/tidb/deploy/tikv-20160"
    data_dir: "/app/tidb/data/tikv-20160"
    log_dir: "/app/tidb/deploy/tikv-20160/log"
  - host: 10.254.212.15
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: "/app/tidb/deploy/tikv-20160"
    data_dir: "/app/tidb/data/tikv-20160"
    log_dir: "/app/tidb/deploy/tikv-20160/log"
  - host: 10.254.212.16
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: "/app/tidb/deploy/tikv-20160"
    data_dir: "/app/tidb/data/tikv-20160"
    log_dir: "/app/tidb/deploy/tikv-20160/log"

cdc_servers:
  - host: 10.254.212.20
    port: 8300
    deploy_dir: "/app/tidb/deploy/cdc-8300"
    log_dir: "/app/tidb/deploy/cdc-8300/log"
  - host: 10.254.212.21
    port: 8300
    deploy_dir: "/app/tidb/deploy/cdc-8300"
    log_dir: "/app/tidb/deploy/cdc-8300/log"
  - host: 10.254.212.22
    port: 8300
    deploy_dir: "/app/tidb/deploy/cdc-8300"
    log_dir: "/app/tidb/deploy/cdc-8300/log"

monitoring_servers:
  - host: 10.254.212.23
    ssh_port: 22
    port: 9090
    deploy_dir: "/app/tidb/deploy/prometheus-8249"
    data_dir: "/app/tidb/data/prometheus-8249"
    log_dir: "/app/tidb/deploy/prometheus-8249/log"

grafana_servers:
  - host: 10.254.212.23
    port: 3000
    deploy_dir: /app/tidb/deploy/grafana-3000

alertmanager_servers:
  - host: 10.254.212.23
    ssh_port: 22
    web_port: 9093
    cluster_port: 9094
    deploy_dir: "/app/tidb/deploy/alertmanager-9093"
    data_dir: "/app/tidb/data/alertmanager-9093"
    log_dir: "/app/tidb/deploy/alertmanager-9093/log"
    ## config_file: "/home/tidb/prd-mfg-cluster/alertfeishu.yml"
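Before deploying, it can be useful to run TiUP's environment check against this topology file (a sketch; the check subcommand and its flags assume a recent tiup cluster version, and --apply asks TiUP to try to fix some of the reported issues automatically):
tiup cluster check /home/tidb/prd-mfg-cluster/prd-mfg-cluster.yaml --user root -p
# optionally let TiUP attempt automatic fixes for the reported issues
tiup cluster check /home/tidb/prd-mfg-cluster/prd-mfg-cluster.yaml --user root -p --apply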
Run the cluster deployment command:
tiup cluster deploy prd-mfg-cluster v4.0.10 /home/tidb/prd-mfg-cluster/prd-mfg-cluster.yaml --user root -p
In this command, the first parameter sets the cluster name and the second sets the cluster version. You can run the tiup list tidb command to see the TiDB versions currently available for deployment.
Start the cluster:
tiup cluster start prd-mfg-cluster
Check the cluster status:
tiup cluster display prd-mfg-cluster
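Once display shows all components as Up, you can verify SQL connectivity through any of the TiDB servers with a MySQL client (a sketch; assumes the mysql client is installed on the control machine and the root password is still the default empty password):
mysql -h 10.254.212.11 -P 4000 -u root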
Access the TiDB Grafana monitoring:
Visit http://10.254.212.23:3000 to open the cluster's Grafana monitoring page. The default username and password are both admin.
Access the TiDB Dashboard:
Visit http://10.254.212.19:2379/dashboard to open the TiDB Dashboard page. The default username is root and the password is empty.
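The HTTP status endpoints can also be polled directly, which is handy for scripted health checks (a sketch; 10080 is the TiDB status port and 2379 the PD client port configured above):
curl http://10.254.212.11:10080/status
curl http://10.254.212.19:2379/pd/api/v1/members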
Run the following command to list the clusters that are currently deployed:
tiup cluster list
Upgrade the cluster to a specified version:
tiup cluster upgrade <cluster-name> <version>
For example, to upgrade to v5.0.0:
tiup cluster upgrade <cluster-name> v5.0.0
A rolling upgrade upgrades all components one by one. While a TiKV instance is being upgraded, all leaders on it are transferred away before the instance is stopped. The default timeout is 5 minutes; once it is exceeded, the instance is stopped directly. If you do not want to evict the leaders and prefer to upgrade immediately, you can add --force to the command above; this causes performance jitter but no data loss. If you want to keep performance stable, make sure all leaders on each TiKV instance have been evicted before it is stopped by setting --transfer-timeout to a very large value, for example --transfer-timeout 100000000 (in seconds).
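For this cluster, the two variants described above would look roughly as follows (a sketch based on the flags mentioned in this section):
# upgrade immediately without evicting leaders (performance jitter, no data loss)
tiup cluster upgrade prd-mfg-cluster v5.0.0 --force
# wait for leader eviction to finish on each TiKV instance before stopping it
tiup cluster upgrade prd-mfg-cluster v5.0.0 --transfer-timeout 100000000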
Restart (reload) a specific node or role:
tiup cluster reload ${cluster-name} [-N <nodes>] [-R <roles>]
tiup cluster reload tidbtest -N 10.254.241.66:9001
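The -R flag restarts all instances of a given role instead of a single node; for example, for this cluster (a sketch, assuming the standard role names used by TiUP, such as tidb, tikv, and pd):
tiup cluster reload prd-mfg-cluster -R tikv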
Rename the cluster:
tiup cluster rename ${cluster-name} ${new-name}
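For example, to rename this cluster (the new name below is purely illustrative):
tiup cluster rename prd-mfg-cluster prd-mfg-cluster-new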