When restarting a service outside of systemd, why are the processes killed in a non-graceful manner?

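This article explains how systemd uses cgroups to manage processes and services so that services can be stopped gracefully at system shutdown, avoiding the problems caused by abnormal termination. It describes how systemd and cgroups work together, and how to start and stop services correctly through systemd interfaces so that graceful shutdown is preserved.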


https://access.redhat.com/articles/3079091

Updated June 14, 2017, 02:52

systemd and cgroups

During normal operation, systemd maintains an association between each unit and the underlying processes active on the system. This is documented in the systemd(1) man page:

       Processes systemd spawns are placed in individual Linux control groups named after the unit
       which they belong to in the private systemd hierarchy. (see cgroups.txt[1] for more information
       about control groups, or short "cgroups"). systemd uses this to effectively keep track of
       processes. Control group information is maintained in the kernel, and is accessible via the
       file system hierarchy (beneath /sys/fs/cgroup/systemd/), or in tools such as ps(1) (ps xawf -eo
       pid,user,cgroup,args is particularly useful to list all processes and the systemd units they
       belong to.).

When a process forks, the child inherits the cgroup of its parent. As a result, all of the processes associated with a given unit can be listed by reading the applicable cgroup.procs file, for example:

# cat /sys/fs/cgroup/systemd/system.slice/httpd.service/cgroup.procs 
1253
1254
1255
1256
1257
1258
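
The inheritance described above can be observed directly from a shell. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/cgroup is readable; the variable names are illustrative only.

```shell
#!/bin/sh
# Cgroup membership of this shell ($$ is the shell's own PID).
parent_cgroups=$(cat "/proc/$$/cgroup")

# Cgroup membership of a freshly started child shell.
# Inside the single quotes, $$ expands to the child's PID.
child_cgroups=$(sh -c 'cat "/proc/$$/cgroup"')

if [ "$parent_cgroups" = "$child_cgroups" ]; then
  inherited=yes
else
  inherited=no
fi
echo "child inherited parent cgroups: $inherited"
```

The same mechanism applies to a daemon launched from that shell: it starts in whatever cgroup the shell is in, not in the cgroup of a service unit.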

This output will match the CGroup information returned during a systemctl status <unit> operation:

# systemctl status httpd
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-06-13 13:45:47 EDT; 899ms ago
     Docs: man:httpd(8)
           man:apachectl(8)
 Main PID: 1253 (httpd)
   Status: "Processing requests..."
   CGroup: /system.slice/httpd.service
           ├─1253 /usr/sbin/httpd -DFOREGROUND
           ├─1254 /usr/sbin/httpd -DFOREGROUND
           ├─1255 /usr/sbin/httpd -DFOREGROUND
           ├─1256 /usr/sbin/httpd -DFOREGROUND
           ├─1257 /usr/sbin/httpd -DFOREGROUND
           └─1258 /usr/sbin/httpd -DFOREGROUND

<DATE> <TIME> host.example.com systemd[1]: Starting The Apache HTTP Server...
<DATE> <TIME> host.example.com systemd[1]: Started The Apache HTTP Server.

To directly view these groupings of processes system-wide, the systemd-cgls utility can be used:

# systemd-cgls | head -17
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
├─user.slice
│ └─user-0.slice
│   └─session-2.scope
│     ├─1206 sshd: root@pts/0
│     ├─1209 -bash
│     ├─1332 systemd-cgls
│     └─1333 head -17
└─system.slice
  ├─httpd.service
  │ ├─1253 /usr/sbin/httpd -DFOREGROUND
  │ ├─1254 /usr/sbin/httpd -DFOREGROUND
  │ ├─1255 /usr/sbin/httpd -DFOREGROUND
  │ ├─1256 /usr/sbin/httpd -DFOREGROUND
  │ ├─1257 /usr/sbin/httpd -DFOREGROUND
  │ └─1258 /usr/sbin/httpd -DFOREGROUND
  ├─atd.service

Issue example

In the above example output, the six httpd processes are logically included in the httpd.service unit. As a result, the directives in that unit file are used whenever the service is stopped, including during system shutdown.

Specifically, the following configuration from a Red Hat Enterprise Linux 7.3 system:

# systemctl cat httpd.service
# /usr/lib/systemd/system/httpd.service
[Unit]
Description=The Apache HTTP Server
After=network.target remote-fs.target nss-lookup.target
Documentation=man:httpd(8)
Documentation=man:apachectl(8)

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/httpd
ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND
ExecReload=/usr/sbin/httpd $OPTIONS -k graceful
ExecStop=/bin/kill -WINCH ${MAINPID}
# We want systemd to give httpd some time to finish gracefully, but still want
# it to kill httpd after TimeoutStopSec if something went wrong during the
# graceful stop. Normally, Systemd sends SIGTERM signal right after the
# ExecStop, which would kill httpd. We are sending useless SIGCONT here to give
# httpd time to finish.
KillSignal=SIGCONT
PrivateTmp=true

[Install]
WantedBy=multi-user.target
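
The interplay between ExecStop and KillSignal in this unit file can be illustrated with a small stand-in process. This is only a sketch: the worker below is a hypothetical shell loop, not httpd, and the signals are sent by hand rather than by systemd. It sends SIGCONT first purely to show that it has no effect on a running process, whereas systemd would run ExecStop (SIGWINCH) first and send KillSignal afterwards.

```shell
#!/bin/sh
# Stand-in worker (not httpd itself): treats SIGWINCH as "stop gracefully".
log=$(mktemp)
(
  trap 'echo "graceful stop" >> "$log"; exit 0' WINCH
  echo "ready" >> "$log"
  while :; do
    sleep 1 &
    wait $!
  done
) &
worker=$!

# Wait until the worker has installed its handler.
while ! grep -q '^ready$' "$log"; do sleep 0.1; done

# KillSignal=SIGCONT: sent to a process that is not stopped, it does nothing.
kill -CONT "$worker"
sleep 0.2
survived_cont=no
kill -0 "$worker" 2>/dev/null && survived_cont=yes

# ExecStop=/bin/kill -WINCH ${MAINPID}: the actual graceful-stop request.
kill -WINCH "$worker"
wait "$worker"
last_line=$(tail -n 1 "$log")
rm -f "$log"
echo "survived SIGCONT: $survived_cont; last log line: $last_line"
```

Because the signal that follows ExecStop is harmless, the only thing that ends the worker is its own handler for the graceful-stop request.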

It is imperative that the service be started and stopped through systemd in order to maintain the correct process-to-unit grouping. When a service is started by any other means, its processes are not placed in the cgroup of the service unit, simply because systemd is not aware that those processes belong to the service.

As an example, when the above httpd processes are stopped and then started from a local shell, note the outcome in terms of process and unit grouping.

# killall httpd           # Stop the currently running httpd processes
# /usr/sbin/httpd  # Start a new instance that will daemonize itself

# systemd-cgls | head -17
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
├─user.slice
│ └─user-0.slice
│   └─session-2.scope
│     ├─1206 sshd: root@pts/0
│     ├─1209 -bash
│     ├─1407 /usr/sbin/httpd
│     ├─1408 /usr/sbin/httpd
│     ├─1409 /usr/sbin/httpd
│     ├─1410 /usr/sbin/httpd
│     ├─1411 /usr/sbin/httpd
│     ├─1412 /usr/sbin/httpd
│     ├─1413 systemd-cgls
│     └─1414 head -17
└─system.slice
  ├─atd.service
  │ └─1143 /usr/sbin/atd -f

The httpd processes have been started; however, they are now associated with the session-2.scope unit. Such scope units are created during user login via an interaction between systemd-logind and the pam_systemd module.
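
To check which unit a single process has ended up in, without scanning the whole tree, the cgroup file of that process can be read. The following is a minimal sketch, assuming the name=systemd hierarchy line that corresponds to the paths shown above (or the single unified "0::" line on systems that use it); unit_from_cgroup is a hypothetical helper name, and the sample input is copied from the layout above rather than read from a live system.

```shell
#!/bin/sh
# Read the contents of a /proc/<pid>/cgroup file on stdin and print the last
# component of the systemd cgroup path, which is the unit name.
unit_from_cgroup() {
  awk -F: '$2 == "name=systemd" || $2 == "" {
    n = split($3, parts, "/")
    print parts[n]
    exit
  }'
}

# Sample input mirroring the session scope shown above.
sample='4:memory:/
1:name=systemd:/user.slice/user-0.slice/session-2.scope'
unit=$(printf '%s\n' "$sample" | unit_from_cgroup)
echo "$unit"
```

On a live system the helper would be fed a real file, for example unit_from_cgroup < /proc/1407/cgroup for one of the PIDs in the listing. A result ending in .scope instead of httpd.service indicates that the process was started outside of systemd.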

From the pam_systemd(8) man page:

       pam_systemd registers user sessions with the systemd login manager systemd-logind.service(8),
       and hence the systemd control group hierarchy.

       On login, this module ensures the following:

        1. If it does not exist yet, the user runtime directory /run/user/$USER is created and its
           ownership changed to the user that is logging in.

        2. The $XDG_SESSION_ID environment variable is initialized. If auditing is available and
           pam_loginuid.so was run before this module (which is highly recommended), the variable is
           initialized from the auditing session id (/proc/self/sessionid). Otherwise, an independent
           session counter is used.

        3. A new systemd scope unit is created for the session. If this is the first concurrent
           session of the user, an implicit slice below user.slice is automatically created and the
           scope placed into it.

When a service is started outside of systemd, its processes are generally subject to the stop behaviour defined for the user session scope rather than for the service unit. Session scopes are ephemeral: they are instantiated on login and are expected to last only for the duration of that particular session:

# systemctl cat session-2.scope
# /run/systemd/system/session-2.scope
# Transient stub

# /run/systemd/system/session-2.scope.d/50-After-systemd-logind\x2eservice.conf
[Unit]
After=systemd-logind.service
# /run/systemd/system/session-2.scope.d/50-After-systemd-user-sessions\x2eservice.conf
[Unit]
After=systemd-user-sessions.service
# /run/systemd/system/session-2.scope.d/50-Description.conf
[Unit]
Description=Session 2 of user root
# /run/systemd/system/session-2.scope.d/50-SendSIGHUP.conf
[Scope]
SendSIGHUP=yes
# /run/systemd/system/session-2.scope.d/50-Slice.conf
[Scope]
Slice=user-0.slice

# systemctl show session-2.scope | grep Kill
KillMode=control-group
KillSignal=15

With the above configuration, stopping the session sends a SIGTERM signal, followed by a SIGHUP, to each of the processes in the scope's cgroup. Had httpd been started through systemd, it would instead have been sent SIGWINCH, which httpd handles by shutting down gracefully.
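
The KillSignal=15 value in the output above is a signal number. As a small sketch, the shell's kill -l can translate such a number into a name; the line variable below holds a copy of that output line and is not read from a live system.

```shell
#!/bin/sh
# Output line copied from the "systemctl show" listing above.
line='KillSignal=15'

# Strip the property name, then ask the shell for the signal's name.
signum=${line#KillSignal=}
signame=$(kill -l "$signum")
echo "KillSignal=$signum is SIG$signame"
```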

How to avoid non-graceful service shutdowns

In order to avoid this behaviour, one of two separate strategies must be used.

1 - Alter service management operations to use only the interfaces provided by systemd. Operations such as systemctl start/stop/restart <service>, as well as the underlying D-Bus API, are available for this purpose.

2 - Alter the signal handling of the service's processes so that they respond gracefully when the indicated SIGTERM and SIGHUP signals are delivered.
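
The second strategy can be sketched with a stand-in worker. This is an illustration under stated assumptions, not a recipe for any particular application: the worker is a hypothetical shell loop, and the two signals are sent by hand in the order a session scope stop would deliver them. Both handlers only set a flag, so a SIGHUP arriving right after SIGTERM cannot interrupt or repeat the cleanup.

```shell
#!/bin/sh
# Stand-in worker that shuts down cleanly on SIGTERM or SIGHUP.
log=$(mktemp)
(
  stopping=0
  trap 'stopping=1' TERM HUP
  echo "ready" >> "$log"
  while [ "$stopping" -eq 0 ]; do
    sleep 1 &
    wait $!
  done
  # Placeholder for the real cleanup (finish requests, flush data, and so on).
  echo "cleanup done" >> "$log"
) &
worker=$!

# Wait until the worker has installed its handlers.
while ! grep -q '^ready$' "$log"; do sleep 0.1; done

# Deliver SIGTERM followed immediately by SIGHUP.
kill -TERM "$worker"
kill -HUP "$worker" 2>/dev/null || true
wait "$worker"

cleanups=$(grep -c '^cleanup done$' "$log")
rm -f "$log"
echo "cleanup ran $cleanups time(s)"
```

A real service would apply the same idea in its own language: treat SIGTERM and SIGHUP as requests to finish outstanding work and exit, rather than leaving them at their default action of immediate termination.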

In the event that the application encountering this behaviour is provided by a third party, it is recommended that an issue be raised with the vendor's support organization. This will allow the vendor to verify compatibility and intended operation between the application and the surrounding operating system when systemd is used as the init system.

Additional details

Please see the following documentation:

Red Hat Enterprise Linux 7 - System Administrator's Guide - Chapter 9. Managing Services with systemd
