503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x0000558877143430] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)
1. Checked the SSL certificate; it had not expired.
2. Logged in to the shell and checked the service status; vpxd could not start:
service-control --status --all
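Tried to start the service manually: service-control --start vpxd
The final error was as follows:
After searching through various sources, I found that the certificates of other services might have expired.
Run the following command to check the status of the certificates other than STS: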
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done
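Look at the screenshot below: some certificates have dates that differ from the others. Several of them, including VPXD and vSphere-Client, are shown as expired. This prevents the services from starting and makes the page return the 503 error. (Expired bak_* certificates at the bottom of the list can also cause this; the fix for those is described further down.)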
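Reading dates off a long listing by eye is error-prone. As a minimal sketch (not part of the original procedure), the filter below takes the output of the loop above on stdin and marks each alias as expired or ok. The function name `flag_expired` and its optional epoch-seconds argument are hypothetical, and the sketch assumes GNU `date -d` can parse the `Not After` value.

```shell
#!/bin/bash
# Hypothetical helper: reads "STORE", "Alias" and "Not After" lines on stdin
# and prints one status line per certificate. Assumes GNU date (-d).
# The body runs in a subshell, so its variables do not leak to the caller.
flag_expired() (
    now=${1:-$(date +%s)}    # reference time in epoch seconds (default: now)
    name=""
    while IFS= read -r line; do
        case "$line" in
            STORE*)
                echo "$line"
                ;;
            *Alias*)
                # Text after the first colon, with tabs and spaces removed
                name=$(printf '%s' "${line#*:}" | tr -d ' \t')
                ;;
            *"Not After"*)
                when=${line#*:}                          # e.g. " Nov 16 22:41:17 2022 GMT"
                when=${when#"${when%%[![:space:]]*}"}    # trim leading whitespace
                ts=$(date -d "$when" +%s 2>/dev/null) || continue   # skip unparsable dates
                if [ "$ts" -lt "$now" ]; then
                    echo "  EXPIRED $name ($when)"
                else
                    echo "  ok      $name ($when)"
                fi
                ;;
        esac
    done
)

# Usage on the appliance (same loop as above, piped into the filter):
# for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done | flag_expired
```

Passing an epoch value as the first argument (for example the time 30 days from now) turns the same filter into an "expiring soon" check.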
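Solution: regenerate the certificates
Run /usr/lib/vmware-vmca/bin/certificate-manager
Choose option 4, accept the defaults for nearly all of the prompts that follow, and wait for it to finish.
(Do not use option 8 lightly. My problem came from using option 8: it failed with an error partway through and left some certificates un-renewed.)
Check the service status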
service-control --status --all
The key services are back to normal.
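To confirm that nothing important is still down, the status output can be filtered for the stopped group. The sketch below is an illustration only: `stopped_services` is a hypothetical helper, and it assumes `service-control --status --all` prints group headers such as `Running:` and `Stopped:` followed by space-separated service names, which may differ between vCenter versions.

```shell
#!/bin/bash
# Hypothetical helper: prints one stopped service per line.
# Assumed input layout (on stdin):
#   Running:
#    applmgmt lwsmd ...
#   Stopped:
#    vmcam vpxd ...
stopped_services() {
    awk '
        /^[A-Za-z]+:/ { section = $1; $1 = "" }    # a header line starts a new group
        section == "Stopped:" {
            for (i = 1; i <= NF; i++) if ($i != "") print $i
        }
    '
}

# Usage on the appliance:
# service-control --status --all | stopped_services | grep -qx vpxd && echo "vpxd is still stopped"
```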
Check the certificate status again
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done
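Note that in the screenshot above the date of the data-encipherment certificate is still not updated. Fixing it requires VMware's official fix script; I give the file contents here so you can copy them yourself.
Create: vi fix_encipherment_cert.sh
Make it executable: chmod +x fix_encipherment_cert.sh
Run: ./fix_encipherment_cert.sh # vpxd must be restarted afterwards; the script as listed prints the restart command rather than running it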
#!/bin/bash
# Run this from the vCenter Server where data-encipherment certificate is expired and needs to be replaced
echo "Replacing Certificate in data-encipherment VECS Store"
echo ""
PNID=$(/opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\vmafd\Parameters]' | grep PNID | awk '{print $4}'|tr -d '"')
echo "Detected PNID: $PNID"
echo ""
PSC=$(/opt/likewise/bin/lwregshell list_values '[HKEY_THIS_MACHINE\Services\vmafd\Parameters]' | grep DCName | awk '{print $4}'|tr -d '"')
echo "Detected PSC: $PSC"
echo ""
echo "Taking backup of old certificate and private key to /tmp directory"
/usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store data-encipherment --alias data-encipherment --output /tmp/old-data-encipherment.crt
/usr/lib/vmware-vmafd/bin/vecs-cli entry getkey --store data-encipherment --alias data-encipherment --output /tmp/old-data-encipherment.key
echo ""
echo "Deleting the existing certificate from the VECS store"
/usr/lib/vmware-vmafd/bin/vecs-cli entry delete -y --store data-encipherment --alias data-encipherment
echo ""
echo "Generating new certificate using the existing private key and add to the VECS store"
/usr/lib/vmware-vmca/bin/certool --server=$PSC --genCIScert --dataencipherment --privkey=/tmp/old-data-encipherment.key --cert=/tmp/tmp-data-encipherment.crt --Name=data-encipherment --FQDN=$PNID
echo ""
echo "Listing the new certificate in VECS Store"
/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store data-encipherment --text | egrep 'Alias|Serial Number:|Subject:|Not Before|Not After'
echo ""
echo "*************************************************************************************************************************"
echo " Completed the script execution, please follow the manual steps in case the script fails to replace the Certificate"
echo ""
echo " VPXD Service needs to be restarted for the changes to take effect, otherwise Guest OS Customizations might fail"
echo " Please execute following command to restart the service: "
echo ""
echo " service-control --stop vpxd && service-control --start vpxd "
echo "*************************************************************************************************************************"
After the fix completes, check the certificate status
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done
The certificate has been updated.
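The renewal can also be checked by exit code instead of by reading the date. This is a sketch under two assumptions: `vecs-cli entry getcert` prints the PEM certificate to stdout when `--output` is omitted, and `openssl` is on the PATH. The function name `cert_still_valid` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) if the PEM certificate on stdin
# will still be valid N seconds from now (default 0, i.e. right now).
cert_still_valid() {
    openssl x509 -noout -checkend "${1:-0}" >/dev/null
}

# Usage on the appliance:
# /usr/lib/vmware-vmafd/bin/vecs-cli entry getcert --store data-encipherment --alias data-encipherment | cert_still_valid && echo "data-encipherment certificate is valid"
```

Passing a larger number of seconds, such as 2592000 for 30 days, reports certificates that are about to expire as well.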
Finally, restart the services or reboot the appliance
service-control --stop --all && service-control --start --all
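About expired non-STS certificates in the bak_* entries
Handle these with the official script, which is attached at the bottom left of the following page: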
Certificate alarm - Clearing BACKUP_STORES certificates in the VCSA
In case it can no longer be downloaded later, here is the code:
#!/bin/bash
#Cesar Badilla Monday, November 16, 2020 10:41:17 PM
echo "######################################################"
echo;echo "These are the current Certificate Stores:";echo
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done;
echo;echo "If there is any expired or expiring Certificates within the BACKUP_STORES please continue to run this script";echo "######################################################";echo
read -p "Have you taken powered off snapshots of all PSC's and VCSA's within the SSO domain(Y|y|N|n)" -n 1 -r
if [[ ! $REPLY =~ ^[Yy]$ ]]
then
exit 1
fi
echo
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store BACKUP_STORE |grep -i "alias" | cut -d ":" -f2);do echo BACKUP_STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store BACKUP_STORE --alias $i -y; done
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done | grep -i 'BACKUP_STORE_H5C'&> /dev/null
if [ $? == 0 ]; then
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli entry list --store BACKUP_STORE_H5C |grep -i "alias" | cut -d ":" -f2); do echo BACKUP_STORE_H5C $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store BACKUP_STORE_H5C --alias $i -y; done
echo
echo "--------------------------------------------------------";
fi
echo "######################################################";
echo;echo "The resulting BACKUP_STORES after the cleanups are: ";echo
for i in $(/usr/lib/vmware-vmafd/bin/vecs-cli store list); do echo STORE $i; /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store $i --text | egrep "Alias|Not After"; done
echo "######################################################";echo "--------------------------------------------------------"; echo "--------------------------------------------------------";
echo "Results: ";
echo "--------------------------------------------------------"; echo "--------------------------------------------------------";
echo;echo "The Certificate BACKUP_STORES were successfully cleaned";echo;
echo "Please acknowledge and reset to green any certificate related alarm."
echo "Restart services on all PSC's and VCSA's in the SSO Domain with command.";echo;echo "service-control --stop --all && service-control --start --all(optional)."
echo "--------------------------------------------------------";
echo;echo "If you could not restart the services, please monitor
the VCSA for 24 hours and the alarm should not reappear
after the acknowledgement."
echo;echo "######################################################"
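Create: vi clean_backup_stores.sh
Make it executable: chmod +x clean_backup_stores.sh
Run: ./clean_backup_stores.sh
The script first lists the certificate stores so you can confirm whether the backup stores (BACKUP_STORE and BACKUP_STORE_H5C) really contain expired certificates. It then asks whether you have taken snapshots (you must create a fresh, full snapshot first; this is very important):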
Have you taken powered off snapshots of all PSC's and VCSA's within the SSO domain(Y|y|N|n)
Enter y
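Because the deletion is only reversible through the snapshot, it can help to preview what the script would remove before answering y. The sketch below reuses the script's own alias extraction (`grep -i "alias" | cut -d ":" -f2`) but only prints the names. `list_aliases` is a hypothetical name, and nothing is deleted.

```shell
#!/bin/bash
# Hypothetical helper: extracts alias names from `vecs-cli entry list` output
# on stdin, the same way the cleanup script does, without deleting anything.
list_aliases() {
    grep -i 'alias' | cut -d ':' -f2 | tr -d ' \t'
}

# Usage on the appliance (preview only):
# /usr/lib/vmware-vmafd/bin/vecs-cli entry list --store BACKUP_STORE | list_aliases
```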
If you get the following error
-bash: ./clean_backup_stores.sh: /bin/bash^M: bad interpreter: No such file or directory.
then run
sed -i -e 's/\r$//' clean_backup_stores.sh
./clean_backup_stores.sh
# Pay attention to the prompts
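The `^M` in the error is a carriage return left by Windows line endings, typically introduced when the script is pasted or saved through a Windows editor. A small sketch for checking a file before running it (the helper names `has_crlf` and `fix_crlf` are hypothetical, and `sed -i` with `\r` assumes GNU sed, as in the command above):

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the file contains at least one carriage return.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Strip trailing carriage returns only when they are present.
fix_crlf() {
    if has_crlf "$1"; then
        sed -i -e 's/\r$//' "$1"
    fi
}

# Usage:
# fix_crlf clean_backup_stores.sh && ./clean_backup_stores.sh
```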
Restart all services
service-control --stop --all && service-control --start --all