Learn the Ins and Outs of Implementing Signals in the Solaris Operating Environment
Abstract
Signals are a process event notification mechanism that has been part of the UNIX® system from the earliest days. The APIs and underlying behavioral characteristics of signals have evolved over the years, at times diverging between the BSD and SVR4 releases of UNIX. Fortunately, industry standards brought things together, and you now have a well-understood and consistent foundation for signals.
Rather than work through a tutorial on writing code with signals (W. Richard Stevens's Advanced Programming in the UNIX Environment (see Resources) is an outstanding source for learning to program with signals), this article opts instead to help you build a solid foundation around signals with detailed background and implementation discussions.
Signals are used to notify a process or thread of a particular event. Many engineers compare signals with hardware interrupts, which occur when a hardware subsystem such as a disk I/O interface (an SCSI host adapter, for example) generates an interrupt to a processor as a result of a completed I/O. This event in turn causes the processor to enter an interrupt handler, so subsequent processing can be done in the operating system based on the source and cause of the interrupt.
UNIX® guru W. Richard Stevens, however, aptly describes signals as software interrupts. When a signal is sent to a process or thread, a signal handler may be entered (depending on the current disposition of the signal), which is similar to the system entering an interrupt handler as the result of receiving an interrupt.
There is quite a bit of history related to signals, design changes in the signal code, and various implementations of UNIX. This was due in part to some deficiencies in the early implementation of signals, as well as the parallel development work done on different versions of UNIX, primarily BSD UNIX and AT&T System V. W. Richard Stevens, James Cox, and Berny Goodheart (see Resources) cover these details in their respective books. What does warrant mention is that early implementations of signals were deemed unreliable. The unreliability stemmed from the fact that in the old days the kernel would reset the signal's disposition to its default when a process caught a signal with its own handler, and the reset occurred before the handler was invoked. Attempts to address this issue in user code by having the signal handler first reinstall itself did not always solve the problem: successive occurrences of the same signal resulted in race conditions, where the default action was invoked before the user-defined handler was reinstalled. For signals that had a default action of terminating the process, this created severe problems. This problem (and some others) was addressed in 4.3BSD UNIX and SVR3 in the mid-'80s.
The implementation of reliable signals has been in place for many years now, where an installed signal handler remains persistent and is not reset by the kernel. The POSIX standards provided a fairly well-defined set of interfaces for using signals in code, and today the Solaris Operating Environment implementation of signals is fully POSIX-compliant. Note that reliable signals require the use of the newer sigaction(2) interface, as opposed to the traditional signal(3C) call.
The occurrence of a signal may be synchronous or asynchronous to the process or thread, depending on the source of the signal and the underlying reason or cause. Synchronous signals occur as a direct result of the executing instruction stream, where an unrecoverable error (such as an illegal instruction or illegal address reference) requires an immediate termination of the process. Such signals are directed to the thread whose execution stream caused the error. Because an error of this type causes a trap into a kernel trap handler, synchronous signals are sometimes referred to as traps. Asynchronous signals are external to (and in some cases unrelated to) the current execution context. One obvious example is the sending of a signal to a process from another process or thread, via a kill(2), _lwp_kill(2), or sigsend(2) system call, or a thr_kill(3T), pthread_kill(3T), or sigqueue(3R) library invocation. Asynchronous signals are also referred to as interrupts.
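To make the asynchronous case concrete, here is a small sketch in C in which a parent process sends a signal to a child with kill(2) and then inspects how the child ended. The function name kill_child_with is an assumption made for this example; only standard POSIX calls are used.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that does nothing but wait for signals, send it signo,
 * and return the number of the signal that terminated it. Returns 0 if
 * the child exited normally and -1 on error.
 */
int kill_child_with(int signo)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();           /* child: sleep until a signal arrives */
    }
    if (kill(pid, signo) != 0) /* parent: asynchronous signal to the child */
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

From the child's point of view the signal is unrelated to whatever it was executing at the time, which is what distinguishes it from a trap.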
Every signal has a unique signal name, an abbreviation that begins with SIG (SIGINT for interrupt signal, for example) and a corresponding signal number. Additionally, for all possible signals, the system defines a default disposition, or action to take when a signal occurs. There are four possible default dispositions:
- Exit: Forces the process to exit
- Core: Forces the process to exit, and creates a core file
- Stop: Stops the process
- Ignore: Ignores the signal; no action taken
A signal's disposition within a process's context defines what action the system will take on behalf of the process when a signal is delivered. All threads and LWPs (lightweight processes) within a process share the signal disposition, which is processwide and cannot be unique among threads within the same process. The table below provides a complete list of signals, along with a description and default action.
Table: Signal description and default action
Note that SIGLOST first appeared in Solaris release 2.6. Solaris 2.5 and 2.5.1 do not define this signal, and instead have SIGRTMIN and SIGRTMAX at signal numbers 37 and 44, respectively. The kernel defines MAXSIG (available for user code in /usr/include/sys/signal.h) as a symbolic constant used in various places in kernel signal support code. MAXSIG is 44 in Solaris 2.5 and 2.5.1, and 45 in Solaris 2.6 and 7.
The disposition of a signal can be changed from its default, and a process can arrange to catch a signal and invoke a signal handling routine of its own, or ignore a signal that may not have a default disposition of Ignore. The only exceptions are SIGKILL and SIGSTOP, whose default dispositions cannot be changed. The interfaces for defining and changing signal disposition are the signal(3C) and sigset(3C) library routines, and the sigaction(2) system call. Signals can also be blocked, which means the process has temporarily prevented delivery of a signal. The generation of a signal that has been blocked will result in the signal remaining pending to the process until it is explicitly unblocked, or the disposition is changed to Ignore. The sigprocmask(2) system call will set or get a process's signal mask, the bit array that is inspected by the kernel to determine if a signal is blocked or not. thr_sigsetmask(3T) and pthread_sigmask(3T) are the equivalent interfaces for setting and retrieving the signal mask at the user-threads level.
I mentioned earlier that a signal may originate from several different places, for a variety of different reasons. The first three signals listed in the table above (SIGHUP, SIGINT, and SIGQUIT) are generated by a keyboard entry from the controlling terminal (SIGINT and SIGQUIT), or when the controlling terminal becomes disconnected (SIGHUP; the nohup(1) command makes processes "immune" to hangups by setting the disposition of SIGHUP to Ignore). Other terminal I/O-related signals include SIGSTOP, SIGTTIN, SIGTTOU, and SIGTSTP. For the signals that originate from a keyboard command, the actual key sequence that generates each signal is defined within the parameters of the terminal session, typically via stty(1). The interrupt sequence, usually Ctrl-C, results in a SIGINT being sent to the process; SIGINT has a default disposition of Exit.
Signals generated as a direct result of an error encountered during instruction execution start with a hardware trap on the system. Different processor architectures define various traps that result in an immediate vectored transfer of control to a kernel trap-handling function. The Solaris kernel builds a trap table and inserts trap-handling routines in the appropriate locations based on the architecture specification of the processors that Solaris supports: SPARC V7 (early Sun-4 architectures), SPARC V8 (SuperSPARC - Sun-4m and Sun-4d architectures), SPARC V9 (UltraSPARC), and x86 (in Intel parlance they're called interrupt descriptor tables or IDTs; on SPARC, they're called trap tables). The kernel-installed trap handler will ultimately generate a signal to the thread that caused the trap. The signals that result from hardware traps are SIGILL, SIGFPE, SIGSEGV, SIGTRAP, SIGBUS, and SIGEMT.
In addition to terminal I/O and error trap conditions, signals can originate from sources such as an explicit send programmatically via kill(2) or thr_kill(3T), or from a shell issuing a kill(1) command. Parent processes are notified of a status change in a child process via SIGCHLD. The alarm(2) system call sends a SIGALRM when the timer expires. Applications can create user-defined signals as a somewhat crude form of interprocess communication by defining handlers for SIGUSR1 or SIGUSR2 and then sending those signals between processes. The kernel sends SIGXCPU if a process exceeds its processor time resource limit or SIGXFSZ if a file write exceeds the file size resource limit. A SIGABRT is sent as a result of an invocation of the abort(3C) library routine. If a process is writing to a pipe and the reader has terminated, SIGPIPE is generated.
These examples of signals generated as a result of events beyond hard errors and terminal I/O do not represent the complete list, but rather provide you with a well-rounded set of examples of the process-induced and external events that can generate signals. You can find a complete list in any number of texts on UNIX programming.
In terms of actual implementation, a signal is represented as a bit in a data structure (several data structures, actually, as you'll see shortly). More precisely, the posting of a signal by the kernel results in a bit getting set in a structure member at either the process or thread level. Because each signal has a unique signal number, a structure member of sufficient width is used, which allows every signal to be represented by simply setting the bit that corresponds to the signal number of the signal you wish to post (for example, setting the 17th bit to post signal 17, SIGUSR2).
Because Solaris includes more than 32 possible signals, a 32-bit long or int data type is not sufficiently wide to represent each possible signal as a unique bit, so a data structure is required. The k_sigset_t data structure defined in /usr/include/sys/signal.h is used in several of the process data structures to store the posted signal bits. It's an array of two unsigned long data types (array members 0 and 1), providing a bit width of 64 bits.
Figure 1. k_sigset_t data structure
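The bit-per-signal representation can be sketched in user-level C as follows. This is an illustrative model, not the kernel's code: the type demo_sigset_t and the helpers demo_post and demo_is_posted are hypothetical names, and fixed 32-bit words stand in for the two unsigned longs so that the 64-bit total width is explicit.

```c
#include <stdint.h>

/* Illustrative stand-in for a two-word signal set (2 x 32 = 64 bits). */
typedef struct {
    uint32_t bits[2];
} demo_sigset_t;

/* Signal numbers start at 1, so signal n lives at bit position n - 1. */
static unsigned demo_word(int signo)
{
    return (unsigned)(signo - 1) / 32;       /* which array member */
}

static uint32_t demo_mask(int signo)
{
    return (uint32_t)1 << ((signo - 1) % 32); /* which bit in that member */
}

/* Post a signal by setting its bit. */
void demo_post(demo_sigset_t *set, int signo)
{
    set->bits[demo_word(signo)] |= demo_mask(signo);
}

/* Return nonzero if the signal's bit is set. */
int demo_is_posted(const demo_sigset_t *set, int signo)
{
    return (set->bits[demo_word(signo)] & demo_mask(signo)) != 0;
}
```

With this layout, posting signal 17 sets the 17th bit of array member 0, while a signal number above 32 (45, for example) lands in array member 1.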
Signals in Solaris
The multithreaded architecture of Solaris made for some interesting challenges in developing a means of supporting signals that comply with the UNIX signal semantics, as defined by industry standards such as POSIX. Signals traditionally go through two well-defined stages: generation and delivery. Signal generation is the point of origin of the signal, or the sending phase. A signal is said to be delivered when whatever disposition that has been established for the signal is invoked, even if it is to be ignored. If a signal is being blocked, thus postponing delivery, it is considered pending.
User threads in Solaris, created via explicit calls to either thr_create(3T) or pthread_create(3T), all have their own signal masks. Threads can choose to block signals independent of other threads executing in the same process, which allows different threads to take delivery of different signals at various times during process execution. The threads libraries (POSIX and Solaris threads) provide thr_sigsetmask(3T) and pthread_sigmask(3T) interfaces for establishing per-user-thread signal masks. The disposition and handlers for all signals are shared by all the threads in a process. So, for example, a SIGINT with the default disposition in place will cause the entire process to exit.
Signals generated as a result of a trap (SIGFPE, SIGILL, etc.) are sent to the thread that caused the trap. Asynchronous signals are delivered to the first thread that is found not blocking the signal.
The difficulty in implementing semantically correct signals in Solaris arises from the fact that user-level threads are not visible to the kernel; the low-level kernel signal code has no way of knowing which threads have which signals blocked and, thus, which thread a signal should be sent to. Some sort of intermediary phase needed to be implemented, something that had visibility to the user-thread signal masks as well as to the kernel. The solution comes in the form of a special LWP that is created by the threads library for programs that are linked to libthread, called the aslwp (it's actually an LWP/kthread pair). The implementation of the aslwp extends the traditional signal generation and delivery phases by adding two additional steps: notification and redirection.
Generation -> Notification -> Redirection -> Delivery
When a signal is sent to a process (generation), the aslwp is notified, at which point the aslwp will look for a thread that can take delivery of the signal. Once such a thread is located, the signal is redirected and delivered to that thread.
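The redirection step can be modeled, very loosely, as a search over per-thread signal masks for the first thread that is not blocking the signal. The sketch below is a user-level simulation under that assumption only; it is not how libthread or the aslwp is actually coded, and demo_thread_t, demo_sigbit, and demo_redirect are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a user thread: an id and a blocked-signal mask. */
typedef struct {
    int      id;
    uint64_t blocked;
} demo_thread_t;

/* Bit for a signal number (signal n occupies bit n - 1). */
static uint64_t demo_sigbit(int signo)
{
    return (uint64_t)1 << (signo - 1);
}

/*
 * Return the id of the first thread not blocking signo, or -1 if every
 * thread blocks it (in which case the signal would remain pending).
 */
int demo_redirect(const demo_thread_t *threads, size_t n, int signo)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if ((threads[i].blocked & demo_sigbit(signo)) == 0)
            return threads[i].id;
    }
    return -1;
}
```

In this model a signal blocked by the first thread is handed to the second, and a signal blocked by every thread is not delivered to anyone until a mask changes.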
Figure 2 shows the LWP/kthread and user-thread structures used to support signals in the process.
Figure 2. LWP/kthread and user-thread structures used to support signals in the process
Resources
- Advanced Programming in the UNIX Environment (Addison-Wesley, ISBN 0-201-56317-7) Stevens, W. Richard.
- Multithreaded Programming with Pthreads (Sun Microsystems Press/Prentice Hall, ISBN 0-13-680729-1) Berg, Daniel J. and Lewis, Bill.
- Programming with Threads (Sun Microsystems Press/Prentice Hall, ISBN 0-13-172389-8) Kleiman, Steve, Shah, Devang, and Smaalders, Bart.
- The Magic Garden Explained: The Internals of UNIX System V Release 4 (Prentice Hall, ISBN 0-13-098138-9) Goodheart, Berny, Cox, James, and Mashey, John R.
- UNIX Internals: The New Frontiers (Prentice Hall, ISBN 0-13-101908-2) Vahalia, Uresh.
About the author
Jim Mauro is a Senior Staff Engineer in the Performance and Availability Engineering group at Sun Microsystems, where he focuses on system availability and failure recovery. When not working or writing, Jim enjoys building Legos with his 2 sons, reading a wide variety of fiction and non-fiction, listening to music, and drooling over the next upgrade of his stereo system.
2001
Reprinted with permission from the April 1999 edition of SunWorld magazine. Copyright Web Publishing Inc., an IDG Communications company.