Cloudera 5.4.x cluster randomly reports "Clock Offset Bad" with working NTP Server

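This article explains how the NTP health check in Cloudera Manager works. The check runs the ntpdc -np command and is treated as failed if no result comes back within 2 seconds. The result includes the clock offset, among other information, and is sent to the Host Monitor service for processing. The article also gives troubleshooting guidance.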


The NTP clock health check is executed by the agent running on each node of your cluster. The command executed is:

ntpdc -np

 
A timeout of 2 seconds is used, so if the NTP client does not return within 2 seconds, the health check will fail.
If there is a result, the agent script parses the result text and returns a result metric that includes the clock offset. This metric is sent to the Host Monitor Management Service for processing.
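To reproduce the timeout behaviour by hand, something like the following sketch may help. It is not the agent's actual code; the function name run_ntp_check and its parameters are made up for illustration, and only the command and the 2-second limit come from the description above.

```python
import subprocess


def run_ntp_check(cmd=("ntpdc", "-np"), timeout_s=2.0):
    """Run the command and return its stdout, or None on failure or timeout.

    Hypothetical helper: it only mimics the behaviour described above
    (no answer within the timeout counts as a failed check).
    """
    try:
        proc = subprocess.run(
            list(cmd),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Either no answer within timeout_s, or the binary could not be started.
        return None
    if proc.returncode != 0:
        return None
    return proc.stdout
```

Calling run_ntp_check() on an affected node a few times in a row is one way to see whether ntpdc occasionally takes longer than 2 seconds to answer, which would explain a check that fails only now and then.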
 
You have 2 options here:  
 
If you are convinced there are no problems, you can turn off the Cloudera Manager Server Clock Offset Thresholds health check, or adjust its thresholds as necessary, in the Cloudera Manager management services.
 
Or, if you wish to troubleshoot, check the /var/log/cloudera-scm-agent/cloudera-scm-agent.log file for clues.
Search that file for "ntpdc". If the command failed to run, the log will contain a stack trace.
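The log search can also be scripted. The sketch below is only an illustration: find_lines is a hypothetical helper, and the default path is the agent log location given above.

```python
AGENT_LOG = "/var/log/cloudera-scm-agent/cloudera-scm-agent.log"


def find_lines(path=AGENT_LOG, needle="ntpdc"):
    """Return (line_number, text) pairs for every log line containing needle."""
    matches = []
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(path, encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if needle in line:
                matches.append((number, line.rstrip("\n")))
    return matches
```

The returned line numbers make it easier to open the log at the right place and read the stack trace that follows a failed run.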
 
The agent merely parses the ntpdc output, so assuming your output looks something like this:
 
ntpdc -np
     remote           local      st poll reach  delay   offset    disp
=======================================================================
*132.163.4.101   10.17.81.194     1 1024  377 0.02972  0.001681 0.13664
=198.55.111.5    10.17.81.194     2 1024  377 0.01395  0.002177 0.13667
=50.116.55.65    10.17.81.194     2 1024  377 0.07263  0.001220 0.12172
 
The script will look for a line that starts with an "*" character.  So, in our example:
 
*132.163.4.101   10.17.81.194     1 1024  377 0.02972  0.001681 0.13664
 
Then, it will get the 'offset' column.
This value is returned to the Host Monitor, which will pull the metric and filter it through your health check configuration to decide whether it warrants an alert.
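The parsing step can be sketched roughly as follows. This is an approximation of what is described above, not the agent's real script; parse_offset is a hypothetical name, and the column position is taken from the header line of the sample output.

```python
def parse_offset(ntpdc_output):
    """Return the offset of the peer marked with '*', or None if there is none.

    Assumes the column layout shown in the sample output:
    remote, local, st, poll, reach, delay, offset, disp.
    """
    for line in ntpdc_output.splitlines():
        if line.startswith("*"):
            fields = line.split()
            if len(fields) < 7:
                # Unexpected layout; treat it as "no usable result".
                return None
            return float(fields[6])
    return None


sample = """\
     remote           local      st poll reach  delay   offset    disp
=======================================================================
*132.163.4.101   10.17.81.194     1 1024  377 0.02972  0.001681 0.13664
=198.55.111.5    10.17.81.194     2 1024  377 0.01395  0.002177 0.13667
=50.116.55.65    10.17.81.194     2 1024  377 0.07263  0.001220 0.12172
"""
print(parse_offset(sample))  # prints 0.001681
```

Note that if no line starts with "*", there is nothing to extract, so output without a selected peer is worth watching for when the alert appears at random.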
 
Lastly, I'm not aware of any change to the offset health check between CM 5.3 and 5.4, so I would recommend troubleshooting this to figure out why the clock is offset. Timing is important in Hadoop, so it is worth a look.