https://access.redhat.com/articles/2642741
Updated September 20 2016 at 7:32 AM
How can you distinguish between a system crash and a graceful reboot or shutdown in RHEL 7? This article outlines 4 approaches:
- Inspect wtmp with last -x
- Inspect auditd logs with ausearch
- Create a custom service unit
- Inspect previous boots with journalctl
(1) Inspect wtmp with last -x
Running last -n2 -x shutdown reboot reports the two most recent shutdown and reboot records from the system wtmp file. A reboot record marks the system booting up, and a shutdown record marks the system going down. Because last lists the newest record first, a graceful shutdown shows up as a reboot line with a shutdown line directly below it, as in the following example:
~]# last -n2 -x shutdown reboot
reboot system boot 3.10.0-327.el7.x Tue Sep 20 01:22 - 01:22 (00:00)
shutdown system down 3.10.0-327.el7.x Tue Sep 20 01:21 - 01:21 (00:00)
In contrast, an ungraceful shutdown can be inferred from the absence of a shutdown record between boots: there are 2 reboot lines in a row, as in this example:
~]# last -n2 -x shutdown reboot
reboot system boot 3.10.0-327.el7.x Tue Sep 20 01:11 - 01:20 (00:08)
reboot system boot 3.10.0-327.el7.x Tue Sep 20 01:10 - 01:20 (00:09)
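The comparison above can be automated. The following is a minimal sketch, not part of the original procedure: classify_last is a hypothetical helper name, and it assumes the newest-first ordering shown in the examples. It reads the output of last on stdin, so it can be tried against saved output as well as a live system.

```shell
# classify_last (hypothetical helper): read `last -n2 -x shutdown reboot`
# output on stdin and print graceful, ungraceful, or unknown.
classify_last() {
    awk '
        # Keep only the wtmp pseudo-user records; this skips blank lines
        # and the trailing "wtmp begins ..." footer.
        $1 == "reboot" || $1 == "shutdown" { rec[++n] = $1 }
        END {
            # last prints the newest record first, so rec[1] is the
            # current boot and rec[2] is whatever happened before it.
            if (n >= 2 && rec[1] == "reboot" && rec[2] == "shutdown")
                print "graceful"
            else if (n >= 2 && rec[1] == "reboot" && rec[2] == "reboot")
                print "ungraceful"
            else
                print "unknown"
        }'
}
```

Usage: last -n2 -x shutdown reboot | classify_last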
(2) Inspect auditd logs with ausearch
auditd logs many kinds of events; the full list of types can be seen by checking ausearch -m. Relevant to the problem at hand, it records system boots and system shutdowns. The command ausearch -i -m system_boot,system_shutdown | tail -4 reports the 2 most recent of these events (each event takes two lines of output: a ---- separator and the record itself). If it reports a SYSTEM_SHUTDOWN followed by a SYSTEM_BOOT, all is well; however, if it reports 2 SYSTEM_BOOT lines in a row, then the system did not shut down gracefully, as in the following example:
~]# ausearch -i -m system_boot,system_shutdown | tail -4
----
type=SYSTEM_BOOT msg=audit(09/20/2016 01:10:32.392:7) : pid=657 uid=root auid=unset ses=unset subj=system_u:system_r:init_t:s0 msg=' comm=systemd-update-utmp exe=/usr/lib/systemd/systemd-update-utmp hostname=? addr=? terminal=? res=success'
----
type=SYSTEM_BOOT msg=audit(09/20/2016 01:11:41.134:7) : pid=656 uid=root auid=unset ses=unset subj=system_u:system_r:init_t:s0 msg=' comm=systemd-update-utmp exe=/usr/lib/systemd/systemd-update-utmp hostname=? addr=? terminal=? res=success'
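The same check can be scripted. This is a sketch under assumptions: classify_audit is a hypothetical helper name, and it relies on ausearch printing events oldest first, as in the example above. It reads ausearch output on stdin and looks only at the last two boot or shutdown records.

```shell
# classify_audit (hypothetical helper): read the output of
# `ausearch -i -m system_boot,system_shutdown` on stdin and print
# graceful, ungraceful, or unknown.
classify_audit() {
    awk '
        # Remember the two most recent matching records; the "----"
        # separator lines are ignored.
        $1 == "type=SYSTEM_BOOT" || $1 == "type=SYSTEM_SHUTDOWN" {
            prev = cur
            cur = $1
        }
        END {
            if (cur == "type=SYSTEM_BOOT" && prev == "type=SYSTEM_SHUTDOWN")
                print "graceful"
            else if (cur == "type=SYSTEM_BOOT" && prev == "type=SYSTEM_BOOT")
                print "ungraceful"
            else
                print "unknown"
        }'
}
```

Usage: ausearch -i -m system_boot,system_shutdown | classify_audit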
(3) Create a custom service unit
This approach allows for complete control over how a graceful shutdown is recorded and checked. Here's an example of how to do it.
- Create a service that runs only at shutdown
  (Optionally customize the service name and the graceful_shutdown file)
  ~]# cat /etc/systemd/system/set_gracefulshutdown.service
  [Unit]
  Description=Set flag for graceful shutdown
  DefaultDependencies=no
  RefuseManualStart=true
  Before=shutdown.target

  [Service]
  Type=oneshot
  ExecStart=/bin/touch /root/graceful_shutdown

  [Install]
  WantedBy=shutdown.target
  ~]# systemctl daemon-reload
  ~]# systemctl enable set_gracefulshutdown
- Create a service that runs only at startup and only IF the graceful_shutdown file created by the above service exists
  (Optionally customize the service name and ensure the graceful_shutdown file matches the above service)
  ~]# cat /etc/systemd/system/check_graceful.service
  [Unit]
  Description=Check if previous system shutdown was graceful
  ConditionPathExists=/root/graceful_shutdown
  RefuseManualStart=true
  RefuseManualStop=true

  [Service]
  Type=oneshot
  RemainAfterExit=true
  ExecStart=/bin/rm /root/graceful_shutdown

  [Install]
  WantedBy=multi-user.target
  ~]# systemctl daemon-reload
  ~]# systemctl enable check_graceful
- Any time after a graceful reboot, systemctl is-active check_graceful confirms that the previous shutdown was graceful.
  Example output:
  ~]# systemctl is-active check_graceful && echo GOOD || echo BAD
  active
  GOOD
  ~]# systemctl status check_graceful
  ● check_graceful.service - Check if system booted after a graceful shutdown
     Loaded: loaded (/etc/systemd/system/check_graceful.service; enabled; vendor preset: disabled)
     Active: active (exited) since Tue 2016-09-20 01:10:32 EDT; 20s ago
    Process: 669 ExecStart=/bin/rm /root/graceful_shutdown (code=exited, status=0/SUCCESS)
   Main PID: 669 (code=exited, status=0/SUCCESS)
     CGroup: /system.slice/check_graceful.service

  Sep 20 01:10:32 a72.example.com systemd[1]: Starting Check if system booted after a graceful shutdown...
  Sep 20 01:10:32 a72.example.com systemd[1]: Started Check if system booted after a graceful shutdown.
- After a crash or otherwise ungraceful shutdown, the following would be seen:
  ~]# systemctl is-active check_graceful && echo GOOD || echo BAD
  inactive
  BAD
  ~]# systemctl status check_graceful
  ● check_graceful.service - Check if system booted after a graceful shutdown
     Loaded: loaded (/etc/systemd/system/check_graceful.service; enabled; vendor preset: disabled)
     Active: inactive (dead)
  Condition: start condition failed at Tue 2016-09-20 01:11:41 EDT; 16s ago
             ConditionPathExists=/root/graceful_shutdown was not met

  Sep 20 01:11:41 a72.example.com systemd[1]: Started Check if system booted after a graceful shutdown.
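The flag-file logic behind the two units can be sketched in plain shell to see why it works. This is only an illustration, not a replacement for the units: both function names are hypothetical, and the flag path is passed as an argument so the sketch can be tried on a scratch file rather than /root/graceful_shutdown.

```shell
# Counterpart of set_gracefulshutdown.service (hypothetical name):
# leave a marker file behind during an orderly shutdown.
mark_graceful_shutdown() {
    touch -- "$1"
}

# Counterpart of check_graceful.service (hypothetical name): succeed only
# if the marker exists, and consume it so that the next boot starts clean.
# After a crash nothing created the marker, so this returns non-zero.
previous_shutdown_was_graceful() {
    [ -e "$1" ] && rm -f -- "$1"
}
```

For example, after mark_graceful_shutdown /tmp/flag, running previous_shutdown_was_graceful /tmp/flag && echo GOOD || echo BAD prints GOOD once; repeating the check prints BAD because the marker has been consumed. The units get the same effect from ConditionPathExists= and the rm in ExecStart=.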
(4) Inspect previous boots with journalctl
- Configure systemd-journald to keep a persistent journal on-disk
  ~]# mkdir /var/log/journal
  ~]# systemctl -s SIGUSR1 kill systemd-journald
  ~]# reboot
- Use journalctl -b -1 -n to look at the last few (10 by default) lines of the previous boot
  (Note that -b -2 is the boot before that, etc.)
  The following example output shows that the previous system reboot was graceful
  ~]# journalctl -b -1 -n
  -- Logs begin at Tue 2016-09-20 01:01:15 EDT, end at Tue 2016-09-20 01:21:33 EDT. --
  Sep 20 01:21:19 a72.example.com systemd[1]: Stopped Create Static Device Nodes in /dev.
  Sep 20 01:21:19 a72.example.com systemd[1]: Stopping Create Static Device Nodes in /dev...
  Sep 20 01:21:19 a72.example.com systemd[1]: Reached target Shutdown.
  Sep 20 01:21:19 a72.example.com systemd[1]: Starting Shutdown.
  Sep 20 01:21:19 a72.example.com systemd[1]: Reached target Final Step.
  Sep 20 01:21:19 a72.example.com systemd[1]: Starting Final Step.
  Sep 20 01:21:19 a72.example.com systemd[1]: Starting Reboot...
  Sep 20 01:21:19 a72.example.com systemd[1]: Shutting down.
  Sep 20 01:21:19 a72.example.com systemd-shutdown[1]: Sending SIGTERM to remaining processes...
  Sep 20 01:21:19 a72.example.com systemd-journal[483]: Journal stopped
Note from the author: In my experience, this is not perfect. When bad things happen, I've seen the indexing in journald screw up to where the journalctl -b -1 command only gives an error.
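Reading the tail of the previous boot can also be scripted. The sketch below is a heuristic under assumptions, not a definitive test: classify_journal_tail is a hypothetical helper name, and the two marker strings are taken from the example output above, so other systemd versions may word them differently. Empty input, which is what a failing journalctl produces on stdout, is reported as unknown rather than as a crash.

```shell
# classify_journal_tail (hypothetical helper): read `journalctl -b -1 -n`
# output on stdin and print graceful, ungraceful, or unknown.
classify_journal_tail() {
    awk '
        /^-- / { next }    # skip the "-- Logs begin at ..." header line
        {
            n++
            # Marker strings copied from the example output (assumption:
            # they are worded the same on the system being checked).
            if (index($0, "systemd[1]: Shutting down.")) seen = 1
            if (index($0, "systemd[1]: Reached target Shutdown.")) seen = 1
        }
        END {
            if (n == 0)      print "unknown"
            else if (seen)   print "graceful"
            else             print "ungraceful"
        }'
}
```

Usage: journalctl -b -1 -n 2>/dev/null | classify_journal_tail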