最近做到always on监控,记录一下比较常用的表和字段
表:sys.dm_hadr_database_replica_states
主要字段:database_state ,synchronization_state,log_send_rate ,log_send_queue_size, last_commit_time(主辅差值表示滞后程度)
表:sys.databases
主要字段:target_recovery_time_in_seconds
表:sys.availability_replicas
主要字段:failove_mode , availability_mode
表:sys.dm_hadr_availability_replica_states
主要字段:role,recovery_health,synchronization_health,connected_state
同步延时和redo同步延时:
https://www.sqlshack.com/measuring-availability-group-synchronization-lag/
T_SQL:
SELECT Pri_CommitTime.replica_server_name[primary_replica]
,Pri_CommitTime.[DBName] AS [DatabaseName]
,Sec_CommitTime.replica_server_name [secondary_replica]
,DATEDIFF(ss,Sec_CommitTime.last_commit_time,Pri_CommitTime.last_commit_time) AS[Sync_Lag_Secs]
FROM
(SELECT replica_server_name
,DBName
,last_commit_time
FROM (SELECTAR.replica_server_name,
HARS.role_desc,
Db_name(DRS.database_id)[DBName],
DRS.last_commit_time
FROM sys.dm_hadr_database_replica_states DRS
INNER JOIN sys.availability_replicas AR ONDRS.replica_id = AR.replica_id
INNER JOINsys.dm_hadr_availability_replica_states HARS ON AR.group_id = HARS.group_id
ANDAR.replica_id = HARS.replica_id) AG_Stats
WHERE role_desc = 'PRIMARY') Pri_CommitTime
INNER JOIN
(SELECT replica_server_name
,DBName
,last_commit_time
FROM(SELECT AR.replica_server_name,
HARS.role_desc,
Db_name(DRS.database_id)[DBName],
DRS.last_commit_time
FROM sys.dm_hadr_database_replica_states DRS
INNER JOIN sys.availability_replicas AR ONDRS.replica_id = AR.replica_id
INNER JOINsys.dm_hadr_availability_replica_states HARS ON AR.group_id = HARS.group_id
ANDAR.replica_id = HARS.replica_id) AG_Stats
WHERE role_desc = 'SECONDARY') Sec_CommitTime
ON [Sec_CommitTime].[DBName] =[Pri_CommitTime].[DBName]
dashboard相关参数 tsql:
https://www.sqlskills.com/blogs/joe/answering-questions-with-the-alwayson-dashboard/
https://blog.youkuaiyun.com/dba_huangzj/article/details/59056716
强制故障转移(可能丢失数据)
ALTER AVAILABILITY GROUP group_nameFORCE_FAILOVER_ALLOW_DATA_LOSS;
计划故障转移
ALTER AVAILABILITY GROUP group_nameFAILOVER;
恢复本地挂起的辅助数据库
ALTER DATABASE database_name SET HADRRESUME
计数器
T-SQL:
selectobject_name,counter_name,instance_name,cntr_value
from sys.dm_os_performance_counters
whereobject_name like '%replica%'
先决条件
计划切换:AVAILABILITY_MODE 需要时SYNCHRONOUS_COMMIT状态(主和备)。 当is_failover_ready = 0,不可用于计划的手动故障转移。 如果您强制故障转移到主机辅助副本,则在此数据库上数据将丢失。(主和备都需要是1)
查询is_failover_ready TSQL:SELECT is_failover_ready FROM sys.dm_hadr_database_replica_cluster_states
WHERE replica_id=(SELECT replica_id FROM sys.availability_replicas
WHERE replica_server_name ='NODE4')
灾难切换:当辅助数据库处于 REVERTING 或 INITIALIZING 状态时,强制故障转移将导致该数据库无法作为主数据库启动。切换后,原来的主库挂起,需要 手动的恢复。
恢复挂起数据库TSQL:ALTER DATABASE TestAG SET HADR RESUME
传输模式切换:切换时FAILOVER_MODE需要是MANUAL,当FAILOVER_MODE=MANUAL或者AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT时 无法使用自动故障转移。