糟糕的周日—Mysql主从故障
(声明:本教程仅供本人学习使用,如有人使用该技术触犯法律与本人无关)
(如果有错误,还希望指出。共同进步)
解决问题时的资料
【资料一:mysql主从复制不同步】
【资料二:mysql在线开启/关闭GTID】
【资料三:mysql主从(gtid模式)】
【资料四:转mysql在线开启/关闭GTID】
【资料五:linux命令ps aux|grep xxx详解】
问题
- 某个周日我正在悠哉的看着某综艺,突然主从监控脚本告知我Master_Host:‘xxxxxxxx’ Last_IO_Errno:‘1593’ Last_IO_Error:‘The replication receiver thread cannot start because the master has GTID_MODE = ON and this server has GTID_MODE = OFF.’ Last_SQL_Errno:‘0’ Last_SQL_Error:’’
排查
- 在linux系统上查看日志 /var/log/syslog 发现如下
Nov 8 15:11:58 oss kernel: [32383401.929173] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Nov 8 15:11:58 oss kernel: [32383401.929181] [ 250] 0 250 8667 174 114688 301 -1000 systemd-udevd
Nov 8 15:11:58 oss kernel: [32383401.929183] [ 305] 100 305 20010 55 180224 141 0 systemd-network
Nov 8 15:11:58 oss kernel: [32383401.929185] [ 494] 101 494 17782 134 180224 164 0 systemd-resolve
Nov 8 15:11:58 oss kernel: [32383401.929187] [ 536] 102 536 65758 379 159744 100 0 rsyslogd
Nov 8 15:11:58 oss kernel: [32383401.929189] [ 547] 0 547 17764 196 180224 123 0 systemd-logind
Nov 8 15:11:58 oss kernel: [32383401.929191] [ 564] 112 564 545826 46010 2846720 68586 0 mongod
Nov 8 15:11:58 oss kernel: [32383401.929193] [ 567] 0 567 27602 32 110592 51 0 irqbalance
Nov 8 15:11:58 oss kernel: [32383401.929195] [ 568] 103 568 12619 156 139264 102 -900 dbus-daemon
Nov 8 15:11:58 oss kernel: [32383401.929197] [ 610] 0 610 7083 4 102400 48 0 atd
Nov 8 15:11:58 oss kernel: [32383401.929199] [ 611] 0 611 72057 301 188416 110 0 accounts-daemon
Nov 8 15:11:58 oss kernel: [32383401.929201] [ 621] 0 621 3989 0 77824 37 0 agetty
Nov 8 15:11:58 oss kernel: [32383401.929203] [ 637] 108 637 26423 74 114688 41 0 chronyd
Nov 8 15:11:58 oss kernel: [32383401.929205] [ 843] 0 843 18074 32 188416 154 -1000 sshd
Nov 8 15:11:58 oss kernel: [32383401.929208] [ 1482] 1000 1482 19160 89 184320 187 0 systemd
Nov 8 15:11:58 oss kernel: [32383401.929209] [ 1483] 1000 1483 27879 74 249856 454 0 (sd-pam)
Nov 8 15:11:58 oss kernel: [32383401.929211] [ 9668] 0 9668 15933 1504 122880 177 0 redis-server
Nov 8 15:11:58 oss kernel: [32383401.929213] [15168] 1000 15168 92360 1980 499712 14767 0 python3
Nov 8 15:11:58 oss kernel: [32383401.929215] [15767] 1000 15767 94050 179 503808 16635 0 python3
Nov 8 15:11:58 oss kernel: [32383401.929216] [15768] 1000 15768 93765 299 503808 16252 0 python3
Nov 8 15:11:58 oss kernel: [32383401.929219] [17439] 105 17439 7034 1 106496 45 0 uuidd
Nov 8 15:11:58 oss kernel: [32383401.929221] [26700] 110 26700 421728 71851 1888256 115060 0 mysqld
Nov 8 15:11:58 oss kernel: [32383401.929223] [30056] 0 30056 19722 710 196608 3336 0 supervisord
...... 前面省略......
Nov 8 15:11:58 oss kernel: [32383401.929754] Out of memory: Kill process 26700 (mysqld) score 149 or sacrifice child
Nov 8 15:11:58 oss kernel: [32383401.930962] Killed process 26700 (mysqld) total-vm:1686912kB, anon-rss:287404kB, file-rss:0kB, shmem-rss:0kB
Nov 8 15:11:58 oss kernel: [32383401.973558] oom_reaper: reaped process 26700 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Nov 8 15:11:59 oss kernel: [32383402.563776] systemd[1]: systemd-journald.service: Failed with result 'timeout'.
......注意看上面......
Nov 8 15:12:17 oss systemd[1]: Started MySQL Community Server.
Nov 8 15:12:17 oss cron[27361]: Please install an MTA on this system if you want to use sendmail!
- 发现了 OOM(Out of memory) 系统把最大占用内存的mysql给kill掉了。
- mysql 重新启动。
原因
- mysql 重新启动后配置文件中没有指定 GTID_MODE 参数,与主库设置不一样。
解决方案
1、在线开启(进入mysql中)
停止主从复制
stop slave;
设置 执行gtid一致性开启 必须在设置gtid_mode前设置
set global enforce_gtid_consistency=on;
设置gtid_mode参数,这里有个坑!!!!!!
【记录坑位】gtid_mode 从off 不能直接变为on 会进行如下报错
The value of @@GLOBAL.GTID_MODE can only be changed one step at a time: OFF <-> OFF_PERMISSIVE <-> ON_PERMISSIVE <-> ON. Also note that this value must be stepped up or down simultaneously on all servers. See the Manual for instructions.
# 从OFF => ON
set global gtid_mode=OFF_PERMISSIVE;
set global gtid_mode=ON_PERMISSIVE;
set global gtid_mode=ON;
# 少一步都不可以~
开启主从
start slave;
2、配置文件中配置(再次重启时无需在线开启)
查找mysql配置文件位置
-
mysqld --verbose --help | grep -A 1 'Default options'
- mysql 配置文件顺序为/etc/my.cnf、 /etc/mysql/my.cnf、 ~/.my.cnf, 如果第一个没有的话,会是第二个,依次…
配置中增加的参数
gtid_mode=on
enforce_gtid_consistency=on
重启mysql
# !!!!!ubuntu系统!!!!!
# 查看mysql状态
systemctl status mysql
# 重启mysql
systemctl restart mysql
确认mysql重启启动
进程确认
# 执行命令
ps aux | grep mysql
# 结果
mysql 19393 16.4 9.0 1626228 365548 ? Sl 09:56 59:10 /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysqld.pid
admin 25659 0.0 0.0 14428 1032 pts/0 S+ 15:55 0:00 grep --color=auto mysql
- 图中红框表示启动时间,如果不是今天启动,会带日期
参数确认
# 进入mysql中,输入
show variables like '%gtid%';
# 确认gtid_mode参数状态