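I. Process Control Block (PCB)
- Concept:
The process control block (PCB) is a data structure in the operating system kernel that records the state of a process. It is what turns a program into a basic unit that can run independently and execute concurrently with other processes. In other words, the system controls and manages concurrently executing processes through their PCBs. A PCB normally holds all the information the operating system uses to describe a process and to control its execution.
- Information stored in the process control block:
  - Process identifiers
    - Internal identifier: the identifier the operating system assigns to every process, that is, the process's PID.
    - External identifier: supplied by the creator of the process and used by users when they refer to the process. To describe the family relationships of a process, the PCB also records the parent process identifier and the child process identifiers, and it may record a user identifier naming the user who owns the process.
  - Processor state
    Consists mainly of the contents of the processor's registers. When the processor is interrupted, all of this information must be saved in the PCB so that the process can resume from the point of interruption when it runs again.
  - Process scheduling information
    - Process state
    - Process priority
    - The event that caused the process to move from the running state to the blocked state, that is, the reason for blocking
    - Other information needed for scheduling, such as the process's time slice
  - Process control information
    - Addresses of the program and its data
    - Process synchronization and communication mechanisms
    - Resource list
    - Link pointer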
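The four groups of PCB information can be gathered into a toy structure. The following is a minimal sketch and is not taken from any real kernel: every name (`toy_pcb`, `toy_cpu_context`, `toy_save_context`, `toy_restore_context`) is hypothetical, and the register set is reduced to three values only to show how saving the processor state in the PCB lets a process resume where it stopped.

```c
/* Hypothetical, heavily simplified register set. */
struct toy_cpu_context {
    unsigned long pc;   /* program counter */
    unsigned long sp;   /* stack pointer */
    unsigned long gpr;  /* stand-in for the general-purpose registers */
};

enum toy_state { TOY_READY, TOY_RUNNING, TOY_BLOCKED };

/* Hypothetical PCB grouping the four kinds of information. */
struct toy_pcb {
    /* identifiers */
    int pid;
    int parent_pid;
    int uid;
    /* processor state */
    struct toy_cpu_context context;
    /* scheduling information */
    enum toy_state state;
    int priority;
    int block_reason;       /* event that caused blocking, 0 if none */
    unsigned int time_slice;
    /* control information */
    void *program_addr;
    void *data_addr;
    struct toy_pcb *next;   /* link pointer */
};

/* Called when the process is interrupted and has to wait for an event. */
void toy_save_context(struct toy_pcb *p, const struct toy_cpu_context *cpu,
                      int reason)
{
    p->context = *cpu;
    p->state = TOY_BLOCKED;
    p->block_reason = reason;
}

/* Called when the process is chosen to run again. */
void toy_restore_context(struct toy_pcb *p, struct toy_cpu_context *cpu)
{
    *cpu = p->context;
    p->state = TOY_RUNNING;
    p->block_reason = 0;
}
```

Saving and restoring are plain structure copies here; a real kernel does this in architecture-specific code.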
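II. The task_struct structure
task_struct is a Linux kernel data structure. It is loaded into RAM and holds the information about a process. Every process keeps its information in one such structure, which is defined in include/linux/sched.h in the kernel source tree.
Main information in the task_struct structure:
1. Process state: records whether the process is running or waiting
2. Scheduling information: which scheduler code schedules the process and how it is scheduled
3. Communication status between processes
4. Family relationships between processes: pointers of type struct task_struct link a parent process to its child processes
5. Time data: the CPU time each process has used
6. Process flags
7. Process identifier: a unique identifier that distinguishes the process from all others
8. Signal handling information
9. File information: information about the files the process can read and write
10. Page management information
11. Priority: the priority of the process relative to other processes
12. The ptrace system call
13. Virtual memory handling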
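Some of this information can be observed from user space. The sketch below assumes a Linux system with /proc mounted and reads the first fields of /proc/self/stat (process ID, command name, state letter, parent process ID). The names `proc_stat_view` and `read_self_stat` are hypothetical, and the correspondence between these fields and individual task_struct members is only approximate.

```c
#include <stdio.h>
#include <string.h>

struct proc_stat_view {
    int pid;
    char comm[64];
    char state;
    int ppid;
};

/* Parse "pid (comm) state ppid ..." from /proc/self/stat. Returns 0 on success. */
int read_self_stat(struct proc_stat_view *out)
{
    char buf[1024];
    FILE *f = fopen("/proc/self/stat", "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* comm may contain spaces or parentheses: use the first '(' and last ')'. */
    char *lparen = strchr(buf, '(');
    char *rparen = strrchr(buf, ')');
    if (!lparen || !rparen || rparen < lparen)
        return -1;
    if (sscanf(buf, "%d", &out->pid) != 1)
        return -1;

    size_t len = (size_t)(rparen - lparen - 1);
    if (len >= sizeof out->comm)
        len = sizeof out->comm - 1;
    memcpy(out->comm, lparen + 1, len);
    out->comm[len] = '\0';

    if (sscanf(rparen + 1, " %c %d", &out->state, &out->ppid) != 2)
        return -1;
    return 0;
}
```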
Source of the task_struct structure from kernel 2.6.32:
struct task_struct {
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
void *stack;
atomic_t usage;
unsigned int flags; /* per process flags, defined below */
unsigned int ptrace;
int lock_depth; /* BKL lock depth */
#ifdef CONFIG_SMP
#ifdef __ARCH_WANT_UNLOCKED_CTXSW
int oncpu;
#endif
#endif
int prio, static_prio, normal_prio;
unsigned int rt_priority;
const struct sched_class *sched_class;
struct sched_entity se;
struct sched_rt_entity rt;
#ifdef CONFIG_PREEMPT_NOTIFIERS
/* list of struct preempt_notifier: */
struct hlist_head preempt_notifiers;
#endif
/*
* fpu_counter contains the number of consecutive context switches
* that the FPU is used. If this is over a threshold, the lazy fpu
* saving becomes unlazy to save the trap. This is an unsigned char
* so that after 256 times the counter wraps and the behavior turns
* lazy again; this to deal with bursty apps that only use FPU for
* a short time
*/
unsigned char fpu_counter;
#ifdef CONFIG_BLK_DEV_IO_TRACE
unsigned int btrace_seq;
#endif
unsigned int policy;
cpumask_t cpus_allowed;
#ifdef CONFIG_TREE_PREEMPT_RCU
int rcu_read_lock_nesting;
char rcu_read_unlock_special;
struct rcu_node *rcu_blocked_node;
struct list_head rcu_node_entry;
#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
#if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
struct sched_info sched_info;
#endif
struct list_head tasks;
struct plist_node pushable_tasks;
struct mm_struct *mm, *active_mm;
/* task state */
int exit_state;
int exit_code, exit_signal;
int pdeath_signal; /* The signal sent when the parent dies */
unsigned int personality;
unsigned did_exec:1;
unsigned in_execve:1; /* Tell the LSMs that the process is doing an
* execve */
unsigned in_iowait:1;
/* Revert to default priority/policy when forking */
unsigned sched_reset_on_fork:1;
pid_t pid;
pid_t tgid;
#ifdef CONFIG_CC_STACKPROTECTOR
/* Canary value for the -fstack-protector gcc feature */
unsigned long stack_canary;
#endif
/*
* pointers to (original) parent process, youngest child, younger sibling,
* older sibling, respectively. (p->father can be replaced with
* p->real_parent->pid)
*/
struct task_struct *real_parent; /* real parent process */
struct task_struct *parent; /* recipient of SIGCHLD, wait4() reports */
/*
* children/sibling forms the list of my natural children
*/
struct list_head children; /* list of my children */
struct list_head sibling; /* linkage in my parent's children list */
struct task_struct *group_leader; /* threadgroup leader */
/*
* ptraced is the list of tasks this task is using ptrace on.
* This includes both natural children and PTRACE_ATTACH targets.
* p->ptrace_entry is p's link on the p->parent->ptraced list.
*/
struct list_head ptraced;
struct list_head ptrace_entry;
/*
* This is the tracer handle for the ptrace BTS extension.
* This field actually belongs to the ptracer task.
*/
struct bts_context *bts;
/* PID/PID hash table linkage. */
struct pid_link pids[PIDTYPE_MAX];
struct list_head thread_group;
struct completion *vfork_done; /* for vfork() */
int __user *set_child_tid; /* CLONE_CHILD_SETTID */
int __user *clear_child_tid; /* CLONE_CHILD_CLEARTID */
cputime_t utime, stime, utimescaled, stimescaled;
cputime_t gtime;
cputime_t prev_utime, prev_stime;
unsigned long nvcsw, nivcsw; /* context switch counts */
struct timespec start_time; /* monotonic time */
struct timespec real_start_time; /* boot based time */
/* mm fault and swap info: this can arguably be seen as either mm-specific or thread-specific */
unsigned long min_flt, maj_flt;
struct task_cputime cputime_expires;
struct list_head cpu_timers[3];
/* process credentials */
const struct cred *real_cred; /* objective and real subjective task
* credentials (COW) */
const struct cred *cred; /* effective (overridable) subjective task
* credentials (COW) */
struct mutex cred_guard_mutex; /* guard against foreign influences on
* credential calculations
* (notably. ptrace) */
struct cred *replacement_session_keyring; /* for KEYCTL_SESSION_TO_PARENT */
char comm[TASK_COMM_LEN]; /* executable name excluding path
- access with [gs]et_task_comm (which lock
it with task_lock())
- initialized normally by flush_old_exec */
/* file system info */
int link_count, total_link_count;
#ifdef CONFIG_SYSVIPC
/* ipc stuff */
struct sysv_sem sysvsem;
#endif
#ifdef CONFIG_DETECT_HUNG_TASK
/* hung task detection */
unsigned long last_switch_count;
#endif
/* CPU-specific state of this task */
struct thread_struct thread;
/* filesystem information */
struct fs_struct *fs;
/* open file information */
struct files_struct *files;
/* namespaces */
struct nsproxy *nsproxy;
/* signal handlers */
struct signal_struct *signal;
struct sighand_struct *sighand;
sigset_t blocked, real_blocked;
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
unsigned long sas_ss_sp;
size_t sas_ss_size;
int (*notifier)(void *priv);
void *notifier_data;
sigset_t *notifier_mask;
struct audit_context *audit_context;
#ifdef CONFIG_AUDITSYSCALL
uid_t loginuid;
unsigned int sessionid;
#endif
seccomp_t seccomp;
/* Thread group tracking */
u32 parent_exec_id;
u32 self_exec_id;
/* Protection of (de-)allocation: mm, files, fs, tty, keyrings, mems_allowed,
* mempolicy */
spinlock_t alloc_lock;
#ifdef CONFIG_GENERIC_HARDIRQS
/* IRQ handler threads */
struct irqaction *irqaction;
#endif
/* Protection of the PI data structures: */
spinlock_t pi_lock;
#ifdef CONFIG_RT_MUTEXES
/* PI waiters blocked on a rt_mutex held by this task */
struct plist_head pi_waiters;
/* Deadlock detection and priority inheritance handling */
struct rt_mutex_waiter *pi_blocked_on;
#endif
#ifdef CONFIG_DEBUG_MUTEXES
/* mutex deadlock detection */
struct mutex_waiter *blocked_on;
#endif
#ifdef CONFIG_TRACE_IRQFLAGS
unsigned int irq_events;
int hardirqs_enabled;
unsigned long hardirq_enable_ip;
unsigned int hardirq_enable_event;
unsigned long hardirq_disable_ip;
unsigned int hardirq_disable_event;
int softirqs_enabled;
unsigned long softirq_disable_ip;
unsigned int softirq_disable_event;
unsigned long softirq_enable_ip;
unsigned int softirq_enable_event;
int hardirq_context;
int softirq_context;
#endif
#ifdef CONFIG_LOCKDEP
# define MAX_LOCK_DEPTH 48UL
u64 curr_chain_key;
int lockdep_depth;
unsigned int lockdep_recursion;
struct held_lock held_locks[MAX_LOCK_DEPTH];
gfp_t lockdep_reclaim_gfp;
#endif
/* journalling filesystem info */
void *journal_info;
/* stacked block device info */
struct bio *bio_list, **bio_tail;
/* VM state */
struct reclaim_state *reclaim_state;
struct backing_dev_info *backing_dev_info;
struct io_context *io_context;
unsigned long ptrace_message;
siginfo_t *last_siginfo; /* For ptrace use. */
struct task_io_accounting ioac;
#if defined(CONFIG_TASK_XACCT)
u64 acct_rss_mem1; /* accumulated rss usage */
u64 acct_vm_mem1; /* accumulated virtual memory usage */
cputime_t acct_timexpd; /* stime + utime since last update */
#endif
#ifdef CONFIG_CPUSETS
nodemask_t mems_allowed; /* Protected by alloc_lock */
int cpuset_mem_spread_rotor;
#endif
#ifdef CONFIG_CGROUPS
/* Control Group info protected by css_set_lock */
struct css_set *cgroups;
/* cg_list protected by css_set_lock and tsk->alloc_lock */
struct list_head cg_list;
#endif
#ifdef CONFIG_FUTEX
struct robust_list_head __user *robust_list;
#ifdef CONFIG_COMPAT
struct compat_robust_list_head __user *compat_robust_list;
#endif
struct list_head pi_state_list;
struct futex_pi_state *pi_state_cache;
#endif
#ifdef CONFIG_PERF_EVENTS
struct perf_event_context *perf_event_ctxp;
struct mutex perf_event_mutex;
struct list_head perf_event_list;
#endif
#ifdef CONFIG_NUMA
struct mempolicy *mempolicy; /* Protected by alloc_lock */
short il_next;
#endif
atomic_t fs_excl; /* holding fs exclusive resources */
struct rcu_head rcu;
/*
* cache last used pipe for splice
*/
struct pipe_inode_info *splice_pipe;
#ifdef CONFIG_TASK_DELAY_ACCT
struct task_delay_info *delays;
#endif
#ifdef CONFIG_FAULT_INJECTION
int make_it_fail;
#endif
struct prop_local_single dirties;
#ifdef CONFIG_LATENCYTOP
int latency_record_count;
struct latency_record latency_record[LT_SAVECOUNT];
#endif
/*
* time slack values; these are used to round up poll() and
* select() etc timeout values. These are in nanoseconds.
*/
unsigned long timer_slack_ns;
unsigned long default_timer_slack_ns;
struct list_head *scm_work_list;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
/* Index of current stored adress in ret_stack */
int curr_ret_stack;
/* Stack of return addresses for return function tracing */
struct ftrace_ret_stack *ret_stack;
/* time stamp for last schedule */
unsigned long long ftrace_timestamp;
/*
* Number of functions that haven't been traced
* because of depth overrun.
*/
atomic_t trace_overrun;
/* Pause for the tracing */
atomic_t tracing_graph_pause;
#endif
#ifdef CONFIG_TRACING
/* state flags for use by tracers */
unsigned long trace;
/* bitmask of trace recursion */
unsigned long trace_recursion;
#endif /* CONFIG_TRACING */
unsigned long stack_start;
};
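III. A closer look at task_struct
1. Process state
A Linux process can be in one of several states. As it runs, the process moves between these states under the scheduler's control, and the state information is what the kernel relies on when it schedules and swaps processes.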
volatile long state; /* -1 unrunnable, 0 runnable, >0 stopped */
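This field gives the run state of the process: -1 means unrunnable, 0 means runnable, and a value greater than 0 means stopped.
The kernel defines a set of constants for the possible values of this field.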
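The comment on the state field can be turned into a small classifier. This is a minimal sketch: the `TOY_` names are hypothetical, and the constant values are modeled on the TASK_* definitions as recalled from the 2.6.32 sched.h, so they should be checked against the actual kernel tree before being relied on.

```c
/* Assumed values, modeled on the kernel's TASK_* constants (verify in sched.h). */
#define TOY_TASK_RUNNING          0
#define TOY_TASK_INTERRUPTIBLE    1
#define TOY_TASK_UNINTERRUPTIBLE  2
#define TOY_TASK_STOPPED          4
#define TOY_TASK_TRACED           8

enum toy_state_kind {
    TOY_KIND_UNRUNNABLE = -1,
    TOY_KIND_RUNNABLE   = 0,
    TOY_KIND_STOPPED    = 1
};

/* Follows the source comment: -1 unrunnable, 0 runnable, >0 stopped. */
enum toy_state_kind toy_classify_state(long state)
{
    if (state < 0)
        return TOY_KIND_UNRUNNABLE;
    if (state == TOY_TASK_RUNNING)
        return TOY_KIND_RUNNABLE;
    return TOY_KIND_STOPPED;
}

/* The positive values are bit masks, so they can be tested individually. */
int toy_is_sleeping(long state)
{
    if (state <= 0)
        return 0;
    return (state & (TOY_TASK_INTERRUPTIBLE | TOY_TASK_UNINTERRUPTIBLE)) != 0;
}
```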
2. Process flags
unsigned int flags; /* per process flags, defined below */
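These are the flags of the current process. The kernel uses them to recognize what the process is currently doing so that it can decide what to do next.
The possible values of flags are bit masks, defined with the PF_ prefix in the same header.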
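Because flags is a bit field, several conditions can be recorded at once and tested independently. The sketch below uses hypothetical `TOY_PF_` masks; the values are modeled on the PF_* flags as recalled from 2.6.32 and are assumptions, not a copy of the header.

```c
/* Assumed values, modeled on the kernel's PF_* masks (verify in sched.h). */
#define TOY_PF_EXITING    0x00000004u  /* process is shutting down */
#define TOY_PF_FORKNOEXEC 0x00000040u  /* forked but has not called exec */
#define TOY_PF_KTHREAD    0x00200000u  /* kernel thread */

unsigned int toy_set_flag(unsigned int flags, unsigned int mask)
{
    return flags | mask;
}

unsigned int toy_clear_flag(unsigned int flags, unsigned int mask)
{
    return flags & ~mask;
}

int toy_test_flag(unsigned int flags, unsigned int mask)
{
    return (flags & mask) != 0;
}
```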
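3. Process identifiers
Every process has its own process identifier, user identifier, and group identifier.
The process identifier (PID) distinguishes one process from another: each process has a unique identifier, and the kernel uses it to tell processes apart.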
pid_t pid;
pid_t tgid;
pid: the process identifier
tgid: the thread group ID
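From user space on Linux, both values can be queried. In this minimal sketch (the function names `kernel_pid` and `thread_group_id` are hypothetical), the gettid system call reports the per-task pid and getpid() reports the thread group ID, so the two agree in a single-threaded program and differ in the secondary threads of a multithreaded one.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-task identifier of the calling thread (corresponds to task_struct.pid). */
pid_t kernel_pid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Thread group ID (corresponds to task_struct.tgid); this is what getpid() returns. */
pid_t thread_group_id(void)
{
    return getpid();
}
```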
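4. Family relationships between processes
Processes are created through inheritance: one process can create several child processes, it is the parent of those children, and the children are siblings of one another.
When a child is created it inherits most of its parent's information; that is, the child copies most of the parent's task_struct, with the exception of the pid. The system therefore has to record these family relationships so that processes can cooperate.
Each task_struct contains a number of pointers that link all the task_struct structures together into a process tree.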
struct task_struct *real_parent; /* real parent process */
struct task_struct *parent; /* recipient of SIGCHLD, wait4() reports */
/*
* children/sibling forms the list of my natural children
*/
struct list_head children; /* list of my children */
struct list_head sibling; /* linkage in my parent's children list */
struct task_struct *group_leader; /* threadgroup leader */
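The way these links form a tree can be shown with a toy model. This is only a sketch: `toy_task` and its helper functions are hypothetical, and plain pointers stand in for the kernel's struct list_head lists.

```c
#include <stddef.h>

struct toy_task {
    int pid;
    struct toy_task *parent;        /* plays the role of real_parent */
    struct toy_task *first_child;   /* head of this task's child list */
    struct toy_task *next_sibling;  /* next entry in the parent's child list */
};

/* Link child under parent, at the head of the parent's child list. */
void toy_add_child(struct toy_task *parent, struct toy_task *child)
{
    child->parent = parent;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

int toy_count_children(const struct toy_task *t)
{
    int n = 0;
    for (const struct toy_task *c = t->first_child; c; c = c->next_sibling)
        n++;
    return n;
}

/* Number of parent links between t and the root of the tree. */
int toy_depth(const struct toy_task *t)
{
    int d = 0;
    while (t->parent) {
        t = t->parent;
        d++;
    }
    return d;
}
```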
5. The ptrace system call
unsigned int ptrace;
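The ptrace system call lets a parent process observe and control the execution of a child process, and lets the parent inspect and modify the child's core image, including its registers. The basic principle: once a child is being traced, every signal sent to it (other than SIGKILL) causes the child to stop, and the parent is notified instead. The parent can then examine and modify the stopped child and let it continue running. The familiar debugger gdb is built on ptrace.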
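The stop-and-resume cycle can be sketched as follows, assuming a Linux system where ptrace is permitted. The function name `trace_child_once` and the exit code 42 are arbitrary choices for illustration. The child asks to be traced and sends itself SIGSTOP; the parent sees the stop through waitpid() and resumes the child with PTRACE_CONT.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the child's exit code if the parent observed the stop and
 * resumed the child, or -1 on any failure.
 */
int trace_child_once(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: request tracing by the parent, then stop. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(42);
    }

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
        return -1;  /* the child exited instead of stopping */

    /* The child is stopped here; a debugger would inspect it now. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }

    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing NULL as the last argument of PTRACE_CONT resumes the child without delivering the signal that stopped it.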
6. Process scheduling information
const struct sched_class *sched_class;
struct sched_entity se;
struct sched_rt_entity rt;
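sched_class: the scheduling class
se: the scheduling entity for normal processes
rt: the scheduling entity for real-time processes
Every task_struct embeds both entities; which one the scheduler uses depends on the task's scheduling class.
Process scheduling uses this information to decide the order in which processes run and, together with the process state information, keeps processes running in a reasonable, orderly way. A process can be scheduled in several ways, according to its scheduling policy.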
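The scheduling policy of a process can be queried from user space. This sketch assumes Linux with glibc; `current_policy` and `policy_is_realtime` are hypothetical helper names. As far as I know, the kernel picks the scheduling class from the policy, with SCHED_FIFO and SCHED_RR handled by the real-time class, but treat that mapping as an assumption.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Scheduling policy of the calling process, or -1 on error. */
int current_policy(void)
{
    return sched_getscheduler(0);
}

/* The two POSIX real-time policies. */
int policy_is_realtime(int policy)
{
    return policy == SCHED_FIFO || policy == SCHED_RR;
}
```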
7. Process priority
int prio, static_prio, normal_prio;
unsigned int rt_priority;
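The relationship between these fields can be sketched with a little arithmetic. The constants and formulas below are written from memory of the 2.6.32 scheduler and are assumptions to verify against the kernel source; the `toy_` names are hypothetical. In this scheme a lower number means a higher priority: static_prio is derived from the nice value, and a real-time task's priority is derived from rt_priority.

```c
/* Assumed values, modeled on the kernel's MAX_RT_PRIO and MAX_PRIO. */
#define TOY_MAX_RT_PRIO 100
#define TOY_MAX_PRIO    140

/* nice in [-20, 19] maps to a static priority in [100, 139]. */
int toy_nice_to_static_prio(int nice)
{
    return TOY_MAX_RT_PRIO + nice + 20;
}

/* rt_priority in [1, 99] maps to [98, 0]; a larger rt_priority is more urgent. */
int toy_rt_normal_prio(unsigned int rt_priority)
{
    return TOY_MAX_RT_PRIO - 1 - (int)rt_priority;
}
```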
8. Time data
cputime_t utime, stime, utimescaled, stimescaled;
cputime_t gtime;
cputime_t prev_utime, prev_stime;
unsigned long nvcsw, nivcsw; /* context switch counts */
struct timespec start_time; /* monotonic time */
struct timespec real_start_time; /* boot based time */
/* mm fault and swap info: this can arguably be seen as either mm-specific or thread-specific */
unsigned long min_flt, maj_flt;
struct task_cputime cputime_expires;
struct list_head cpu_timers[3];
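Several of these counters have user-space counterparts that getrusage() reports. A minimal sketch follows; `usage_view` and `read_usage` are hypothetical names, and RUSAGE_SELF aggregates all threads of the process, so the numbers are not those of a single task_struct.

```c
#include <sys/resource.h>
#include <sys/time.h>

struct usage_view {
    long long user_us;          /* user CPU time, cf. utime */
    long long system_us;        /* system CPU time, cf. stime */
    long voluntary_switches;    /* cf. nvcsw */
    long involuntary_switches;  /* cf. nivcsw */
    long minor_faults;          /* cf. min_flt */
    long major_faults;          /* cf. maj_flt */
};

/* Fill out with the resource usage of the calling process. Returns 0 on success. */
int read_usage(struct usage_view *out)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->user_us = (long long)ru.ru_utime.tv_sec * 1000000 + ru.ru_utime.tv_usec;
    out->system_us = (long long)ru.ru_stime.tv_sec * 1000000 + ru.ru_stime.tv_usec;
    out->voluntary_switches = ru.ru_nvcsw;
    out->involuntary_switches = ru.ru_nivcsw;
    out->minor_faults = ru.ru_minflt;
    out->major_faults = ru.ru_majflt;
    return 0;
}
```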
9. Inter-process communication
#ifdef CONFIG_SYSVIPC
/* ipc stuff */
struct sysv_sem sysvsem;
#endif
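When several processes cooperate on one task, they need to access each other's resources and communicate with each other.
The main inter-process communication mechanisms in Linux are pipes, semaphores, shared memory, signals, and message queues.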
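Of these mechanisms, the pipe is the simplest to demonstrate. In the sketch below a child process writes a message into a pipe and the parent reads it back; the function name `pipe_roundtrip` is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Send msg from a child process to the parent through a pipe.
 * Copies at most cap-1 bytes into buf and NUL-terminates it.
 * Returns the number of bytes received, or -1 on error.
 */
long pipe_roundtrip(const char *msg, char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        /* Child: writer. Close the unused read end first. */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t w = write(fds[1], msg, len);
        close(fds[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* Parent: reader. Closing the write end lets read() see end-of-file. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return (long)total;
}
```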
10. File information
/* file system info */
int link_count, total_link_count;
/* filesystem information */
struct fs_struct *fs;
/* open file information */
struct files_struct *files;
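A process can open and close files. Files are a system resource, so the Linux kernel has to keep track of how each process uses them. task_struct has two data structures that describe the file-related information of a process. fs_struct refers to two VFS objects, root and pwd, which point to the root directory and the current (working) directory of the process's executable image. files_struct records the file descriptors the process has opened.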
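Both kinds of information are visible from user space. The following minimal sketch (helper names are hypothetical) checks whether a descriptor is present in the process's open-file table and reads the current working directory; it assumes /dev/null exists.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd is an open descriptor in this process, 0 if it is not. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/*
 * Open /dev/null, check that the descriptor is open, close it, and
 * check that it is gone. Returns 1 if both checks behave as expected,
 * 0 if not, -1 if the file could not be opened.
 */
int descriptor_lifecycle(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;
    int open_before = fd_is_open(fd);
    close(fd);
    int open_after = fd_is_open(fd);
    return open_before && !open_after;
}

/* Copy the current working directory into buf. Returns 0 on success. */
int current_dir(char *buf, size_t cap)
{
    return getcwd(buf, cap) ? 0 : -1;
}
```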
11. Signal handling information
struct signal_struct *signal;
struct sighand_struct *sighand;
sigset_t blocked, real_blocked;
sigset_t saved_sigmask; /* restored if set_restore_sigmask() was used */
struct sigpending pending;
unsigned long sas_ss_sp;
size_t sas_ss_size;
int (*notifier)(void *priv);
void *notifier_data;
sigset_t *notifier_mask;
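The roles of the blocked mask and the pending set can be seen from user space. In this sketch (all names hypothetical) SIGUSR1 is blocked, raised, found in the pending set, and delivered to the handler only once it is unblocked. It assumes a single-threaded program in which SIGUSR1 was not already blocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_usr1 = 0;

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;
}

/*
 * Returns 1 if SIGUSR1 stayed pending while blocked and was delivered
 * after being unblocked, 0 otherwise, -1 if setup failed.
 */
int blocked_signal_demo(void)
{
    struct sigaction sa;
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigset_t block, old, pending;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    got_usr1 = 0;
    raise(SIGUSR1);                       /* queued, not delivered */
    sigemptyset(&pending);
    sigpending(&pending);
    int was_pending = sigismember(&pending, SIGUSR1) == 1;
    int handled_while_blocked = got_usr1;

    sigprocmask(SIG_SETMASK, &old, NULL); /* delivery happens here */
    return was_pending && !handled_while_blocked && got_usr1;
}
```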
12. Virtual memory handling
struct mm_struct *mm, *active_mm;
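mm_struct describes a process's address space (its virtual memory). active_mm was introduced for kernel threads: a kernel thread has no address space of its own, so to let kernel threads and ordinary processes share one context-switch path, when a kernel thread is switched in, its active_mm is set to the active_mm of the task that was just switched out.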
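The borrowing rule can be modeled in a few lines. This is a toy sketch with hypothetical names (`toy_mm`, `toy_mm_task`, `toy_switch_mm`); it leaves out the reference counting and the hardware address-space switch that a real kernel performs.

```c
#include <stddef.h>

struct toy_mm {
    int id;
};

struct toy_mm_task {
    struct toy_mm *mm;         /* NULL for a kernel thread */
    struct toy_mm *active_mm;  /* address space in use while the task runs */
};

/* Decide which address space next runs on when it replaces prev. */
void toy_switch_mm(struct toy_mm_task *prev, struct toy_mm_task *next)
{
    if (next->mm == NULL)
        next->active_mm = prev->active_mm;  /* kernel thread borrows */
    else
        next->active_mm = next->mm;         /* ordinary process uses its own */
}
```

Because a kernel thread keeps using whatever address space was last active, switching to it needs no change of user address space.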
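13. Page management information
When physical memory runs short, the Linux memory manager must move some pages out to secondary storage; this swapping is done one page at a time.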
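Since the page is the unit of swapping, sizes are naturally expressed in pages. The sketch below (hypothetical helper names) queries the page size with sysconf() and rounds a byte count up to whole pages.

```c
#include <unistd.h>

/* Page size of the running system in bytes, or -1 on error. */
long page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Number of whole pages needed to hold nbytes (rounds up). */
unsigned long pages_needed(unsigned long nbytes, unsigned long page)
{
    return (nbytes + page - 1) / page;
}
```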
Only the more important fields are covered here; the remaining ones are not discussed.