Calling fork from a Multithreaded Environment

Tags: #UNIX #fork #pthread #c

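This article takes an in-depth look at how to use fork and pthread_atfork correctly in a concurrent Pthreads program, so that locks and other resources shared between processes and threads are managed properly and common deadlocks and resource leaks are avoided.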

Threads and Process Management


	On a Pthreads-compliant system, calls that manipulate processes, like fork and exec, still behave in the way they always have for nonthreaded programs. Let's see what happens when we make these calls from a multithreaded process.


	Calling fork from a Thread


	A process creates another process by issuing a fork call. The newly created child process has a new process ID but starts with the same memory image and state as its parent. At its birth it's an exact clone of its parent, starting execution at the point of its parent's fork call in the same program. Often, the new process immediately calls exec to replace its parent's program with a new program. It then sets out on its own business.


	In a Pthreads-compliant implementation, the fork call always creates a new child process with a single thread, regardless of how many threads its parent may have had at the time of the call. Furthermore, the child's thread is a replica of the thread in the parent that called fork, including a copy of the process address space shared by all of its parent's threads and a copy of the calling thread's per-thread stack.
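
The single-thread rule can be observed with a small experiment. The sketch below is illustrative only: the names ticks, worker, and child_has_no_worker are hypothetical, and it assumes a POSIX system with C11 atomics. A worker thread keeps incrementing a counter in the parent; after the fork, the child samples the counter twice and finds it frozen, because no replica of the worker exists there.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static atomic_long ticks;        /* incremented only by the worker thread */
static atomic_int stop_worker;   /* set by the parent to end the worker */

static void *worker(void *arg) {
    (void)arg;
    while (!atomic_load(&stop_worker))
        atomic_fetch_add(&ticks, 1);
    return NULL;
}

/* Returns 0 if the counter stays frozen in the child, 1 if it moved,
 * and -1 if the fork or wait failed. */
int child_has_no_worker(void) {
    pthread_t t;
    atomic_store(&stop_worker, 0);
    int rc = pthread_create(&t, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    pid_t pid = fork();          /* called from the main thread */
    if (pid == 0) {
        /* Child: only a replica of the forking thread runs here. */
        struct timespec pause = {0, 20 * 1000 * 1000};   /* 20 ms */
        long before = atomic_load(&ticks);
        nanosleep(&pause, NULL); /* time in which a worker would have run */
        long after = atomic_load(&ticks);
        _exit(before == after ? 0 : 1);
    }

    int status = -1;
    if (pid > 0)
        waitpid(pid, &status, 0);
    atomic_store(&stop_worker, 1);
    pthread_join(t, NULL);
    return (pid > 0 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Calling child_has_no_worker() from main and checking for a return value of 0 is enough to run the experiment. The parent's worker keeps counting the whole time; only the child's copy of the counter stands still.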


	Consider the headaches:


	∙		The new single-threaded child process could inherit held locks from threads in the parent that don't exist in the child. It may have no idea what these locks mean, let alone realize that it holds one of them. Confusion and deadlock are in the forecast.


	∙		The child process could inherit heap areas that were allocated by threads in the parent that don't exist in the child. Here we see memory leaks, data loss, and bug reports.


	The Pthreads standard defines the pthread_atfork call to help you manage these problems. The pthread_atfork function allows a parent process to specify preparation and cleanup routines that parent and child processes run as part of the fork operation. Using these routines a parent or child process can manage the release and reacquisition of locks and resources before and after the fork.


	This is pretty complex stuff, so please bear with us.


	Fork-handling stacks


	To perform its magic, the pthread_atfork call pushes addresses of preparation and cleanup routines on any of three fork-handling stacks:


	∙		Routines placed on the prepare stack are run in the parent before the fork.


	∙		Routines placed on the parent stack are run in the parent after the fork.


	∙		Routines placed on the child stack are run in the child after the fork.


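	A single call to pthread_atfork places a routine on one or more of these stacks. With multiple calls you can place several routines on any given stack. POSIX specifies that routines on the prepare stack run in last-in first-out order, while routines on the parent and child stacks run in the order in which they were registered. Because the fork-handling stacks are a processwide resource, any thread (not just the one that will call fork) can push routines on them.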
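
As a rough illustration of how registration works, the sketch below calls pthread_atfork twice and records the order in which the handlers run in the parent. The trace buffer and the handler names are hypothetical. Note that POSIX specifies last-in first-out order only for the prepare handlers; the parent and child handlers run in registration order.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical trace buffer: each handler appends one character. */
static char trace[16];
static size_t trace_len;

static void record(char c) {
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

static void prepare_first(void)  { record('a'); }
static void prepare_second(void) { record('b'); }
static void parent_first(void)   { record('A'); }
static void parent_second(void)  { record('B'); }

/* Registers two sets of handlers, forks once, and returns the trace seen
 * in the parent. Call it only once: handlers cannot be unregistered. */
const char *fork_and_trace(void) {
    memset(trace, 0, sizeof trace);
    trace_len = 0;

    /* Passing NULL leaves the corresponding stack untouched. */
    int rc = pthread_atfork(prepare_first, parent_first, NULL);
    assert(rc == 0);
    rc = pthread_atfork(prepare_second, parent_second, NULL);
    assert(rc == 0);
    (void)rc;

    pid_t pid = fork();
    if (pid == 0)
        _exit(0);                /* child: nothing to do in this sketch */
    if (pid > 0)
        waitpid(pid, NULL, 0);
    return trace;                /* "baAB" on a POSIX-conforming system */
}
```

The prepare handlers run newest first so that locks can be taken in the reverse of their registration order, and the parent handlers then release them oldest first.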


	In those carefree times when we throw caution to the winds and decide to fork from the middle of a multithreaded program, we typically use pthread_atfork to push mutex-locking calls on the prepare fork-handling stack and mutex-unlocking calls on the parent and child stacks. We might also place routines that release resources and reset variables on the child stack.


	Let's demonstrate what would happen if we did not use pthread_atfork's capabilities in one of those fork-crazy programs of ours. In Figure 5-1, we have two threads, a mutex (Lock L), and the data the mutex protects. Thread A acquires Lock L and starts to modify the data. Meanwhile, Thread B decides to fork. Now, the fork creates a child process that's a clone of its parent process, and this child shows a locked Lock L. The child process has a single thread, a replica of Thread B (the thread in the parent process that called fork). The assortment of clones and replicas that result from the fork has little effect on the threads in the parent process. However, things are not okay in the child. The locked Lock L is an utter mystery to the new Thread B in the child. If it tries to acquire Lock L, it will deadlock. (There's no Thread A in the child that will ever release Lock L in the child process's context.) If it tries to access the data without first obtaining Lock L, it may see the data in an inconsistent form. Life's never easy for our kids.


	Figure 5-1: Results of a fork when pthread_atfork is not used
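
The scenario in Figure 5-1 can be reproduced with a short sketch. The names below (lock_l, gate, child_sees_lock_held) are hypothetical. To avoid actually deadlocking, the child probes the lock with pthread_mutex_trylock instead of pthread_mutex_lock. Strictly speaking, POSIX only promises that async-signal-safe functions work in the child of a multithreaded process, and pthread_mutex_trylock is not one of them, so treat this as a demonstration rather than a pattern to copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t lock_l = PTHREAD_MUTEX_INITIALIZER;   /* Lock L */

/* Helper state used only to sequence the two threads. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cv = PTHREAD_COND_INITIALIZER;
static int a_holds_lock, a_may_release;

/* Thread A: takes Lock L and keeps it until the parent says otherwise. */
static void *thread_a(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock_l);
    pthread_mutex_lock(&gate);
    a_holds_lock = 1;
    pthread_cond_broadcast(&gate_cv);
    while (!a_may_release)
        pthread_cond_wait(&gate_cv, &gate);
    pthread_mutex_unlock(&gate);
    pthread_mutex_unlock(&lock_l);
    return NULL;
}

/* Plays the role of Thread B. Returns 1 if the child found Lock L locked,
 * 0 if it found it free, and -1 on error. */
int child_sees_lock_held(void) {
    pthread_t a;
    a_holds_lock = a_may_release = 0;
    int rc = pthread_create(&a, NULL, thread_a, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&gate);
    while (!a_holds_lock)
        pthread_cond_wait(&gate_cv, &gate);
    pthread_mutex_unlock(&gate);

    pid_t pid = fork();          /* fork while Thread A owns Lock L */
    if (pid == 0) {
        /* No Thread A exists here, so nobody will ever unlock lock_l. */
        int probe = pthread_mutex_trylock(&lock_l);
        _exit(probe == EBUSY ? 1 : 0);
    }

    int status = -1;
    if (pid > 0)
        waitpid(pid, &status, 0);

    pthread_mutex_lock(&gate);   /* let Thread A finish in the parent */
    a_may_release = 1;
    pthread_cond_broadcast(&gate_cv);
    pthread_mutex_unlock(&gate);
    pthread_join(a, NULL);
    return (pid > 0 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Had the child called pthread_mutex_lock instead of probing, it would have blocked forever, which is exactly the deadlock described above.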


	Now, let's use pthread_atfork to control Lock L's state at the time of the fork. The program we show in Figure 5-2 also has Threads A and B, Lock L, and scrupulously guarded data. However, we've added an initialization routine that pushes a routine that locks L on the prepare fork-handling stack, and a routine that unlocks L on the child and parent fork-handling stacks. We've taken care to do this in a routine that executes before any thread actually uses the lock.


	Figure 5-2: Results of a fork when pthread_atfork is used


	Sometime later, Thread A acquires the lock and starts to modify the data. When Thread B calls fork, the routine on the prepare stack runs in Thread B's context. This routine tries to obtain Lock L and blocks, because Lock L is still held by Thread A. The fork is delayed until Thread A releases Lock L. When this happens, the prepare routine succeeds, Thread B becomes the owner of the lock, and the fork proceeds. As expected, a child process is created that's a replica of its parent. However, in this case, the newly cloned Thread B in the child knows about the locked lock it finds in the child's context. At this point, the routine we placed on the child fork-handling stack runs and releases Lock L. The same routine runs from the parent fork-handling stack and releases the lock in the parent process. When the dust settles, the lock is unowned in both parent and child, and the data it protects is in a consistent state. Who could ask for more?
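
A minimal sketch of this arrangement follows. The data (two account balances whose sum must stay at 100) and all function names are hypothetical. It assumes a default, non-error-checking mutex, for which unlocking in the child works on common implementations. The handlers are installed once, before any thread uses the lock, and the child then checks that it finds Lock L free and the data consistent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t lock_l = PTHREAD_MUTEX_INITIALIZER;   /* Lock L */
static int account_a = 100, account_b = 0;  /* under L: the sum stays 100 */

static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;     /* sequencing */
static pthread_cond_t gate_cv = PTHREAD_COND_INITIALIZER;
static int a_started;

static void lock_before_fork(void)  { pthread_mutex_lock(&lock_l); }
static void unlock_after_fork(void) { pthread_mutex_unlock(&lock_l); }

static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;
static void install_handlers(void) {
    /* prepare locks L; the parent and child handlers both unlock it */
    pthread_atfork(lock_before_fork, unlock_after_fork, unlock_after_fork);
}

/* Thread A: holds L while the data is briefly inconsistent. */
static void *thread_a(void *arg) {
    (void)arg;
    struct timespec pause = {0, 50 * 1000 * 1000};   /* 50 ms */
    pthread_mutex_lock(&lock_l);
    account_a -= 10;                 /* the sum is now 90 */
    pthread_mutex_lock(&gate);
    a_started = 1;
    pthread_cond_signal(&gate_cv);
    pthread_mutex_unlock(&gate);
    nanosleep(&pause, NULL);
    account_b += 10;                 /* the sum is back to 100 */
    pthread_mutex_unlock(&lock_l);
    return NULL;
}

/* Plays the role of Thread B. Returns 0 if the child found L unlocked and
 * the data consistent, 1 if not, and -1 on error. */
int fork_with_handlers(void) {
    pthread_t a;
    pthread_once(&handlers_once, install_handlers);
    a_started = 0;
    int rc = pthread_create(&a, NULL, thread_a, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&gate);
    while (!a_started)
        pthread_cond_wait(&gate_cv, &gate);
    pthread_mutex_unlock(&gate);

    pid_t pid = fork();   /* the prepare handler waits for A to unlock L */
    if (pid == 0) {
        int lock_free = pthread_mutex_trylock(&lock_l) == 0;
        _exit(lock_free && account_a + account_b == 100 ? 0 : 1);
    }

    int status = -1;
    if (pid > 0)
        waitpid(pid, &status, 0);
    pthread_join(a, NULL);
    return (pid > 0 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Because the prepare handler cannot acquire Lock L until Thread A has restored the invariant, the snapshot the child inherits is always a consistent one.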


	Even given the capabilities of pthread_atfork, forking from a multithreaded program is no picnic. We kept our example simple. Imagine having to track every lock and every resource that may be held by every thread in your program and in every library call it makes! Before pursuing this course, you should consider a less complex alternative:


	∙		If possible, fork before you've created any threads.


	∙		Instead of forking, create a new thread. If you are forking to exec a binary image, can you convert the image to a callable shared library to which you could simply link?


	∙		Consider the surrogate parent model.

In the surrogate parent model, a program forks a child process at initialization time. The sole purpose of the child is to serve as a sort of "surrogate parent" for the original process should it ever need to fork another child. After initialization, the original parent can proceed to create its additional threads. When it wants to exec an image, it communicates this to its child (which has remained single-threaded). The child then performs the fork and exec on behalf of the original process.
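
One possible shape for the surrogate parent model is sketched below. Everything specific in it is an assumption made for illustration: the socket pair used for communication, the fixed-size request holding a program path, and the names surrogate_start and surrogate_run. Short reads on the stream socket are simply treated as shutdown, and argument passing, concurrent requests, and fuller error handling are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REQ_SIZE 256             /* hypothetical fixed-size request: a path */

static int surrogate_fd = -1;    /* the original process's end of the pair */

/* Runs in the single-threaded surrogate: fork and exec on request. */
static void surrogate_loop(int fd) {
    char path[REQ_SIZE];
    while (read(fd, path, sizeof path) == (ssize_t)sizeof path) {
        path[REQ_SIZE - 1] = '\0';
        int status = -1;
        pid_t pid = fork();      /* safe: this process has one thread */
        if (pid == 0) {
            close(fd);
            execl(path, path, (char *)NULL);
            _exit(127);          /* reached only if exec failed */
        }
        if (pid > 0)
            waitpid(pid, &status, 0);
        if (write(fd, &status, sizeof status) != (ssize_t)sizeof status)
            break;
    }
    _exit(0);                    /* the original process has gone away */
}

/* Call once at initialization, before creating any threads.
 * Returns 0 on success and -1 on failure. */
int surrogate_start(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        surrogate_loop(sv[1]);   /* never returns */
    }
    close(sv[1]);
    surrogate_fd = sv[0];
    return 0;
}

/* Asks the surrogate to run the program at path and waits for it.
 * Returns the program's exit code, or -1 on error. Callers in different
 * threads must serialize their calls. */
int surrogate_run(const char *path) {
    char req[REQ_SIZE] = {0};
    int status;
    assert(path != NULL);
    if (surrogate_fd < 0 || strlen(path) >= sizeof req)
        return -1;
    strcpy(req, path);
    if (write(surrogate_fd, req, sizeof req) != (ssize_t)sizeof req)
        return -1;
    if (read(surrogate_fd, &status, sizeof status) != (ssize_t)sizeof status)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A program would call surrogate_start() first thing in main, create its threads afterward, and later call surrogate_run with the path of the image to execute, for example surrogate_run("/bin/true"). The multithreaded process itself never calls fork again, so none of its locks can be stranded in a child.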

Reference: http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_44.html#1137916