The etcdctl tool: managing and operating an etcd cluster

1. Introduction

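etcd is a distributed key-value store, not a relational database.
A cluster of 3 nodes can tolerate the failure of 1 node, because a majority (2 of 3) remains available.
In production, a single-node etcd cluster is not recommended.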
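
The fault-tolerance figure follows from the majority rule of the Raft consensus algorithm: a cluster of n members needs n/2 + 1 of them (integer division) to agree, and it survives the loss of the rest. A minimal sketch of that arithmetic; the function names `quorum` and `fault_tolerance` are illustrative and not part of etcd:

```shell
#!/bin/sh
# Majority (quorum) size for a cluster of n members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of member failures the cluster survives while keeping quorum.
fault_tolerance() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

By the same arithmetic a 4-member cluster tolerates only 1 failure, the same as 3 members, which is why odd cluster sizes are the usual choice.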

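  1. etcd keeps multiple versions (revisions) of the data and lets you query the historical versions of a given key.
  2. To bound the total size of the data, historical versions are cleaned up periodically (compaction); a version that has been compacted can no longer be queried.
  3. etcd does not allow data in old versions to be modified.
  4. etcd stores its data on disk in binary form.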
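
To make the multi-version behaviour concrete, the sketch below wraps two etcdctl calls: `get --rev` reads a key as of an earlier store revision, and `compaction` discards history older than a revision. The helper names are illustrative, and the functions assume `etcdctl` is on the PATH and can reach a cluster.

```shell
#!/bin/sh
# Read the value a key had at a given store revision.
get_at_revision() {
  etcdctl get "$1" --rev="$2" --print-value-only
}

# Discard all history older than the given revision.
# Reading an older revision afterwards is rejected as compacted.
compact_up_to() {
  etcdctl compaction "$1"
}

# Example session (actual revision numbers depend on the cluster's history):
#   etcdctl put foo v1        # suppose this lands at revision 2
#   etcdctl put foo v2        # revision 3
#   get_at_revision foo 2     # value as of revision 2
#   compact_up_to 3           # revision 2 is no longer readable
```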

2. Installation

Reference: Releases · etcd-io/etcd · GitHub

2.1 Deploying with a script

ETCD_VER=v3.4.20

# choose either URL
GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GOOGLE_URL}

rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf /tmp/etcd-download-test && mkdir -p /tmp/etcd-download-test

curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp/etcd-download-test --strip-components=1
rm -f /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz

cp /tmp/etcd-download-test/etcd /usr/bin/
cp /tmp/etcd-download-test/etcdctl /usr/bin/

etcd --version
etcdctl version

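To use the v3 API with etcdctl, set the environment variable ETCDCTL_API=3. etcdctl 3.4 and later already default to the v3 API, so this mainly matters for older clients.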

vim /etc/profile

...
export ETCDCTL_API=3
...

###
source /etc/profile

2.2 Verification

[root@k8s-master][16:09:03][FAIL] ~/etcdctl/etcd-v3.4.20-linux-amd64 
#etcd --version
etcd Version: 3.4.20
Git SHA: 1e26823
Go Version: go1.16.15
Go OS/Arch: linux/amd64

[root@k8s-master][16:09:05][OK] ~/etcdctl/etcd-v3.4.20-linux-amd64 
#etcdctl version
etcdctl version: 3.4.20
API version: 3.4

3. Usage

3.1 Help output

#etcdctl --help
NAME:
	etcdctl - A simple command line client for etcd3.

USAGE:
	etcdctl [flags]

VERSION:
	3.4.20

API VERSION:
	3.4


COMMANDS:
	alarm disarm		Disarms all alarms
	alarm list		Lists all alarms
	auth disable		Disables authentication
	auth enable		Enables authentication
	check datascale		Check the memory usage of holding data for different workloads on a given server endpoint.
	check perf		Check the performance of the etcd cluster
	compaction		Compacts the event history in etcd
	defrag			Defragments the storage of the etcd members with given endpoints
	del			Removes the specified key or range of keys [key, range_end)
	elect			Observes and participates in leader election
	endpoint hashkv		Prints the KV history hash for each endpoint in --endpoints
	endpoint health		Checks the healthiness of endpoints specified in `--endpoints` flag
	endpoint status		Prints out the status of endpoints specified in `--endpoints` flag
	get			Gets the key or a range of keys
	help			Help about any command
	lease grant		Creates leases
	lease keep-alive	Keeps leases alive (renew)
	lease list		List all active leases
	lease revoke		Revokes leases
	lease timetolive	Get lease information
	lock			Acquires a named lock
	make-mirror		Makes a mirror at the destination etcd cluster
	member add		Adds a member into the cluster
	member list		Lists all members in the cluster
	member promote		Promotes a non-voting member in the cluster
	member remove		Removes a member from the cluster
	member update		Updates a member in the cluster
	migrate			Migrates keys in a v2 store to a mvcc store
	move-leader		Transfers leadership to another etcd cluster member.
	put			Puts the given key into the store
	role add		Adds a new role
	role delete		Deletes a role
	role get		Gets detailed information of a role
	role grant-permission	Grants a key to a role
	role list		Lists all roles
	role revoke-permission	Revokes a key from a role
	snapshot restore	Restores an etcd member snapshot to an etcd directory
	snapshot save		Stores an etcd node backend snapshot to a given file
	snapshot status		Gets backend snapshot status of a given file
	txn			Txn processes all the requests in one transaction
	user add		Adds a new user
	user delete		Deletes a user
	user get		Gets detailed information of a user
	user grant-role		Grants a role to a user
	user list		Lists all users
	user passwd		Changes password of user
	user revoke-role	Revokes a role from a user
	version			Prints the version of etcdctl
	watch			Watches events stream on keys or prefixes

OPTIONS:
      --cacert=""				verify certificates of TLS-enabled secure servers using this CA bundle
      --cert=""					identify secure client using this TLS certificate file
      --command-timeout=5s			timeout for short running command (excluding dial timeout)
      --debug[=false]				enable client-side debug logging
      --dial-timeout=2s				dial timeout for client connections
  -d, --discovery-srv=""			domain name to query for SRV records describing cluster endpoints
      --discovery-srv-name=""			service name to query when using DNS discovery
      --endpoints=[127.0.0.1:2379]		gRPC endpoints
  -h, --help[=false]				help for etcdctl
      --hex[=false]				print byte strings as hex encoded strings
      --insecure-discovery[=true]		accept insecure SRV records describing cluster endpoints
      --insecure-skip-tls-verify[=false]	skip server certificate verification (CAUTION: this option should be enabled only for testing purposes)
      --insecure-transport[=true]		disable transport security for client connections
      --keepalive-time=2s			keepalive time for client connections
      --keepalive-timeout=6s			keepalive timeout for client connections
      --key=""					identify secure client using this TLS key file
      --password=""				password for authentication (if this option is used, --user option shouldn't include password)
      --user=""					username[:password] for authentication (prompt if password is not supplied)
  -w, --write-out="simple"			set the output format (fields, json, protobuf, simple, table)

3.2 Specifying the etcd cluster

HOST_1=10.240.0.17
HOST_2=10.240.0.18
HOST_3=10.240.0.19
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379

etcdctl --endpoints=$ENDPOINTS member list
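
When the host list is longer or comes from elsewhere, the endpoint string can be assembled instead of typed. A small sketch, assuming every member serves clients on port 2379 as in the example above; `join_endpoints` is an illustrative helper, not an etcdctl feature:

```shell
#!/bin/sh
# Build a comma-separated host:port list from the host arguments.
join_endpoints() {
  port=2379
  out=""
  for host in "$@"; do
    out="${out:+$out,}$host:$port"
  done
  printf '%s\n' "$out"
}

ENDPOINTS=$(join_endpoints 10.240.0.17 10.240.0.18 10.240.0.19)
echo "$ENDPOINTS"
# Then: etcdctl --endpoints="$ENDPOINTS" member list
```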

3.3 Create, read, update, delete

3.3.1 Create

 etcdctl --endpoints=$ENDPOINTS put foo "Hello World!"

3.3.2 Read

etcdctl --endpoints=$ENDPOINTS get foo
etcdctl --endpoints=$ENDPOINTS --write-out="json" get foo 
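
In the JSON output the key and value fields are base64-encoded, so `foo` does not appear literally. A minimal sketch of decoding them, assuming the GNU coreutils `base64` utility is available; `b64decode` is an illustrative name:

```shell
#!/bin/sh
# Decode one base64 string, as found in the "key" and "value" fields
# of `etcdctl get -w json` output.
b64decode() {
  printf '%s' "$1" | base64 -d
}

b64decode "Zm9v"; echo              # the key:   foo
b64decode "SGVsbG8gV29ybGQh"; echo  # the value: Hello World!
```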

Looking up keys that share a prefix

etcdctl --endpoints=$ENDPOINTS put web1 value1
etcdctl --endpoints=$ENDPOINTS put web2 value2
etcdctl --endpoints=$ENDPOINTS put web3 value3

etcdctl --endpoints=$ENDPOINTS get web --prefix
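
A prefix query is a range query over [key, range_end), where the range end is the prefix with its last byte incremented. The sketch below computes that range end; it is a simplification that assumes a non-empty ASCII prefix whose last byte is not 0xff, and `prefix_range_end` is an illustrative name:

```shell
#!/bin/sh
# Compute the exclusive range end for a prefix by incrementing its last byte.
prefix_range_end() {
  prefix=$1
  head=${prefix%?}            # everything except the last character
  last=${prefix#"$head"}      # the last character
  code=$(printf '%d' "'$last")
  printf '%s' "$head"
  printf "\\$(printf '%03o' $((code + 1)))\n"
}

prefix_range_end web   # prints "wec"
# So these two commands select the same keys:
#   etcdctl --endpoints=$ENDPOINTS get web --prefix
#   etcdctl --endpoints=$ENDPOINTS get web wec
```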

3.3.3 Delete

etcdctl --endpoints=$ENDPOINTS put key myvalue
etcdctl --endpoints=$ENDPOINTS del key

etcdctl --endpoints=$ENDPOINTS put k1 value1
etcdctl --endpoints=$ENDPOINTS put k2 value2
etcdctl --endpoints=$ENDPOINTS del k --prefix

3.3.4 Cluster status

Cluster status is checked mainly with two commands: etcdctl endpoint status and etcdctl endpoint health.

etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status

+------------------+------------------+---------+---------+-----------+-----------+------------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------+------------------+---------+---------+-----------+-----------+------------+
| 10.240.0.17:2379 | 4917a7ab173fabe7 | 3.0.0   | 45 kB   | true      |         4 |      16726 |
| 10.240.0.18:2379 | 59796ba9cd1bcd72 | 3.0.0   | 45 kB   | false     |         4 |      16726 |
| 10.240.0.19:2379 | 94df724b66343e6c | 3.0.0   | 45 kB   | false     |         4 |      16726 |
+------------------+------------------+---------+---------+-----------+-----------+------------+

etcdctl --endpoints=$ENDPOINTS endpoint health

10.240.0.17:2379 is healthy: successfully committed proposal: took = 3.345431ms
10.240.0.19:2379 is healthy: successfully committed proposal: took = 3.767967ms
10.240.0.18:2379 is healthy: successfully committed proposal: took = 4.025451ms
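
The columns of the endpoint status table mean:

  • ENDPOINT: the access endpoint of the etcd instance.
  • ID: the unique identifier of the etcd instance.
  • VERSION: the version of the etcd instance.
  • DB SIZE: the size of the etcd database.
  • IS LEADER: whether this instance is the current cluster leader.
  • RAFT TERM: the current Raft term.
  • RAFT INDEX: the current Raft log index.

Two of these values help judge whether the data in the cluster is consistent:

  • DB SIZE:

    • Meaning: the size of the etcd database file, that is, the physical size of the data the instance currently stores.
    • Use: monitors etcd storage usage and helps an administrator decide whether to expand storage or clean up data. In most cases the database grows steadily, at a rate that depends on the amount of data stored and the write rate.
  • RAFT INDEX:

    • Meaning: the Raft log index of this etcd instance. Raft is the consensus algorithm etcd uses for distributed consistency, and RAFT INDEX is the position of the latest log entry.
    • Use: shows the state of log replication and consistency in the cluster. Members that are in sync report the same or nearly the same RAFT INDEX; a member whose index lags well behind the others still has log entries to catch up on. The value also helps administrators investigate and debug consistency problems.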
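
A quick consistency check is to compare the RAFT INDEX column across members. The sketch below reads the table printed by `endpoint status --write-out=table` on standard input and prints the difference between the highest and lowest index; `raft_index_spread` is an illustrative helper that locates the column by its header, so it does not depend on how many columns a given etcdctl version prints.

```shell
#!/bin/sh
# Print max(RAFT INDEX) - min(RAFT INDEX) for a table read from stdin.
raft_index_spread() {
  awk -F'|' '
    /RAFT INDEX/ && !col {
      # Header row: find the column whose trimmed title is exactly "RAFT INDEX".
      for (i = 1; i <= NF; i++) {
        h = $i
        gsub(/^ +| +$/, "", h)
        if (h == "RAFT INDEX") col = i
      }
      next
    }
    col && NF > 1 {
      # Data row (border rows contain no "|" and are skipped).
      v = $col + 0
      if (!seen || v < min) min = v
      if (!seen || v > max) max = v
      seen = 1
    }
    END { if (seen) print max - min; else exit 1 }
  '
}

# Usage:
#   etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status | raft_index_spread
```

For the three-member table shown earlier, where every member reports 16726, the spread is 0. A small non-zero value on a busy cluster is normal, since the members are queried one after another.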

3.3.5 Cluster members

The commands related to cluster membership are:

member add          Adds a member into the cluster
member remove    Removes a member from the cluster
member update     Updates a member in the cluster
member list            Lists all members in the cluster 

For example, etcdctl member list lists the members of the cluster.

etcdctl --endpoints=http://172.16.5.4:12379 member list -w table

+-----------------+---------+-------+------------------------+-----------------------------------------------+
|       ID        | STATUS  | NAME  |       PEER ADDRS       |                 CLIENT ADDRS                  |
+-----------------+---------+-------+------------------------+-----------------------------------------------+
| c856d92a82ba66a | started | etcd0 | http://172.16.5.4:2380 | http://172.16.5.4:2379,http://172.16.5.4:4001 |
+-----------------+---------+-------+------------------------+-----------------------------------------------+

3.4 Specifying the TLS credential files

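Every etcdctl command against a TLS-enabled cluster needs the certificate files, so wrap the endpoint and certificate flags in an etcdctl alias to keep the commands short.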

# Set the etcdctl API version to 3
$ export ETCDCTL_API=3

# Create an etcdctl alias that sets the endpoint and the certificates
$ alias etcdctl='etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key'
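
In bash, aliases are not expanded in non-interactive shells by default, so the alias does not carry over to scripts or scheduled jobs. etcdctl also reads each flag from an environment variable named `ETCDCTL_` followed by the upper-cased flag name, which works in both cases. A sketch with the same kubeadm certificate paths as the alias above:

```shell
#!/bin/sh
# Same settings as the alias, expressed as environment variables.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/healthcheck-client.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/healthcheck-client.key

# Any later invocation picks these up without extra flags, e.g.:
#   etcdctl member list -w table
```

Use one mechanism or the other: as far as I know, etcdctl reports a conflict when the same setting is supplied both as a flag and as an environment variable.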

3.4.1 Listing the members of the etcd cluster

#etcdctl member list -w table
+------------------+---------+------------+------------------------+------------------------+------------+
|        ID        | STATUS  |    NAME    |       PEER ADDRS       |      CLIENT ADDRS      | IS LEARNER |
+------------------+---------+------------+------------------------+------------------------+------------+
| 8dc8eb40f5ed7ad6 | started | k8s-master | https://10.0.0.16:2380 | https://10.0.0.16:2379 |      false |
+------------------+---------+------------+------------------------+------------------------+------------+

3.4.2 Checking the status of the etcd endpoints

[root@k8s-master][16:19:15][OK] ~/etcdctl/etcd-v3.4.20-linux-amd64 
#etcdctl endpoint status -w table
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://127.0.0.1:2379 | 8dc8eb40f5ed7ad6 |   3.5.3 |   46 MB |      true |      false |        10 |     380897 |             380897 |        |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

[root@k8s-master][16:20:35][OK] ~/etcdctl/etcd-v3.4.20-linux-amd64 
#etcdctl endpoint health -w table
+------------------------+--------+-------------+-------+
|        ENDPOINT        | HEALTH |    TOOK     | ERROR |
+------------------------+--------+-------------+-------+
| https://127.0.0.1:2379 |   true | 11.021122ms |       |
+------------------------+--------+-------------+-------+

3.5 Backing up data

# The file name is assembled from the hostname and a timestamp so the command can run as a scheduled job
etcdctl snapshot save `hostname`-etcd_`date +%Y%m%d%H%M`.db
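
For a scheduled job it is convenient to wrap the command in a small script. The sketch below keeps the same file naming; the function names and the target directory in the usage comment are illustrative, and it assumes etcdctl can already reach the cluster through flags, an alias or `ETCDCTL_*` variables.

```shell
#!/bin/sh
# Build the snapshot file name used above: <hostname>-etcd_<YYYYmmddHHMM>.db
snapshot_name() {
  printf '%s-etcd_%s.db\n' "$(hostname)" "$(date +%Y%m%d%H%M)"
}

# Save a snapshot into the given directory and print the resulting path.
backup_etcd() {
  dir=${1:-.}
  file="$dir/$(snapshot_name)"
  etcdctl snapshot save "$file" >/dev/null && printf '%s\n' "$file"
}

# Usage (the directory is an example): backup_etcd /var/backups/etcd
```

Calling a script from cron also sidesteps a crontab quirk: a bare `%` in a crontab command line is treated as a newline, so the `date +%Y%m%d%H%M` form would need every `%` escaped as `\%` if written directly in the crontab.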

3.6 Restoring a snapshot

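# Stop etcd and the apiserver first
## Move the current data directory out of the way
mv /var/lib/etcd/ /var/lib/etcd.bak

# Restore the snapshot. SNAPSHOT_FILE is a placeholder for the file that the backup
# actually wrote; re-running `date` here would produce a different name than the saved one.
etcdctl snapshot restore "$SNAPSHOT_FILE" --data-dir=/var/lib/etcd/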
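
The restore has to point at a file that an earlier backup actually wrote. Because the backup file names embed a timestamp, the newest one can be picked by name. A sketch, assuming all snapshots of one host sit in a single directory; `latest_snapshot` and the directory in the usage comment are illustrative:

```shell
#!/bin/sh
# Print the newest snapshot in a directory. Relies on the timestamp in the
# file name: for one host, glob (lexical) order equals chronological order.
latest_snapshot() {
  dir=${1:-.}
  latest=""
  for f in "$dir"/*-etcd_*.db; do
    if [ -e "$f" ]; then latest=$f; fi
  done
  if [ -n "$latest" ]; then
    printf '%s\n' "$latest"
  else
    return 1
  fi
}

# Usage (the directory is an example):
#   snap=$(latest_snapshot /var/backups/etcd) || exit 1
#   etcdctl snapshot status "$snap" -w table     # sanity-check before restoring
#   etcdctl snapshot restore "$snap" --data-dir=/var/lib/etcd/
```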

Restoring a snapshot on an etcd deployed from binaries


4. Troubleshooting

Export the logs with journalctl -u etcd > a.log and go through them carefully.