Enable and Use Automatic Alerting
Automatic alerting on congestion indications significantly helps in troubleshooting congestion. For example, a host generating alerts at the same time that a switch reports credit loss strongly suggests a cause-and-effect relationship between the two events.
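The time correlation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `correlate`, the 30-second window, and all sample alerts are hypothetical and not taken from any Cisco tool. It also assumes that the host and switch clocks are synchronized (for example, by NTP).

```python
from datetime import datetime, timedelta

def correlate(host_alerts, switch_events, window=timedelta(seconds=30)):
    """Return (host_alert, switch_event) pairs whose timestamps differ by at most `window`."""
    pairs = []
    for h_time, h_msg in host_alerts:
        for s_time, s_msg in switch_events:
            if abs(h_time - s_time) <= window:
                pairs.append(((h_time, h_msg), (s_time, s_msg)))
    return pairs

# Hypothetical sample data, for illustration only.
host_alerts = [
    (datetime(2024, 1, 10, 9, 15, 5), "host-a: I/O timeout"),
    (datetime(2024, 1, 10, 11, 40, 0), "host-a: path degraded"),
]
switch_events = [
    (datetime(2024, 1, 10, 9, 15, 0), "fc1/1: credit loss recovery"),
    (datetime(2024, 1, 10, 14, 0, 0), "fc1/7: credit loss recovery"),
]

matches = correlate(host_alerts, switch_events)
for host, switch in matches:
    print(host[1], "<->", switch[1])
```

A match within the window is a starting point for investigation, not proof of cause and effect. The window size is a judgment call that depends on how often each device polls and reports.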
Cisco MDS switches have the Port-Monitor feature for alerting on congestion symptoms, as explained in Chapter 3, the section on Port-Monitor on Cisco MDS Switches.
Do not just enable the alerts. It is equally important to use them and to make them part of your workflow. We often come across production environments where alerts are enabled but nobody knows where they go. If the alerts do not generate notifications, such as emails, they fail to get the attention they deserve. Essentially, there is a gap between the devices detecting a problem and the users becoming aware of it.
Use a Remote Monitoring Platform (NDFC/DCNM)
Remote monitoring platforms, such as Cisco NDFC/DCNM, can significantly help in troubleshooting congestion because they show the end-to-end traffic path on a topology view, monitor congestion metrics centrally, and collect the alerts generated by multiple switches in one place. For more details, refer to Chapter 3, the section on Detecting Congestion on a Remote Monitoring Platform.
The case studies in this chapter use only the NX-OS commands on the Cisco MDS switches for educational purposes. A remote monitoring platform (such as Cisco NDFC/DCNM) would have significantly simplified many basic steps, such as finding the FCID, switchport, and VSAN of an end device, the members of a port-channel, the topology, and the traffic distribution.
Cisco MDS NX-OS Commands for Troubleshooting Congestion
Cisco MDS switches have a wide variety of commands to det