Linux Time Management (Part 6)

1      NOHZ Mode (Dynamic Tick)

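Before dynamic tick was introduced, the kernel relied on a periodic, HZ-based tick. The traditional tick keeps raising periodic interrupts even when the system is idle, and these frequent interrupts prevent the CPU from entering deeper sleep states. If this restriction is lifted, so that the tick stops when the system goes idle and resumes when there is work to do, ticks are generated only on demand. The CPU then gets more chances to sleep, and to sleep more deeply, which saves more power. Dynamic tick was created to replace the periodic tick mechanism entirely. The periodic tick is responsible for many jobs, such as process time-slice accounting, updating profiling data, and assisting CPU load balancing; dynamic tick provides an emulation mechanism for each of them.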
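To make the power argument concrete, the following is a minimal userspace sketch (not kernel code) that counts how many timer interrupts a CPU takes during one idle window under each scheme. The function names, the millisecond units, and the assumption that every pending timer event inside the window causes exactly one wakeup are all hypothetical simplifications.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: with a periodic tick, the CPU is interrupted once
 * per period for the whole idle window, whether or not there is work.
 */
uint64_t periodic_wakeups(uint64_t idle_ms, uint64_t period_ms)
{
    return period_ms ? idle_ms / period_ms : 0;
}

/*
 * Hypothetical model: with the tick stopped, the CPU is interrupted only
 * by pending timer events that expire inside the idle window.
 * events_ms[] holds expiry offsets measured from idle entry.
 */
uint64_t dyntick_wakeups(uint64_t idle_ms, const uint64_t *events_ms, size_t n)
{
    uint64_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (events_ms[i] < idle_ms)
            count++;
    return count;
}
```

Under these assumptions, a one-second idle window with a 4 ms tick period costs 250 interrupts in periodic mode, while in the dynamic model the cost depends only on how many timers actually expire inside the window.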

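As the earlier parts showed, the kernel's time subsystem supports both a low-resolution and a high-resolution mode, so dynamic tick needs a separate handling path for each.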

Its core data structure is:

/**
 * struct tick_sched - sched tick emulation and no idle tick control/stats
 * @sched_timer:       hrtimer to schedule the periodic tick in high
 *                     resolution mode
 * @idle_tick:         Store the last idle tick expiry time when the tick
 *                     timer is modified for idle sleeps. This is necessary
 *                     to resume the tick timer operation in the timeline
 *                     when the CPU returns from idle
 * @tick_stopped:      Indicator that the idle tick has been stopped
 * @idle_jiffies:      jiffies at the entry to idle for idle time accounting
 * @idle_calls:        Total number of idle calls
 * @idle_sleeps:       Number of idle calls, where the sched tick was stopped
 * @idle_entrytime:    Time when the idle call was entered
 * @idle_waketime:     Time when the idle was interrupted
 * @idle_exittime:     Time when the idle state was left
 * @idle_sleeptime:    Sum of the time slept in idle with sched tick stopped
 * @iowait_sleeptime:  Sum of the time slept in idle with sched tick stopped,
 *                     with IO outstanding
 * @sleep_length:      Duration of the current idle sleep
 * @do_timer_last:     CPU was the last one doing do_timer before going idle
 */

struct tick_sched {
    struct hrtimer          sched_timer;
    unsigned long           check_clocks;
    enum tick_nohz_mode     nohz_mode;
    ktime_t                 idle_tick;
    int                     inidle;
    int                     tick_stopped;
    unsigned long           idle_jiffies;
    unsigned long           idle_calls;
    unsigned long           idle_sleeps;
    int                     idle_active;
    ktime_t                 idle_entrytime;
    ktime_t                 idle_waketime;
    ktime_t                 idle_exittime;
    ktime_t                 idle_sleeptime;
    ktime_t                 iowait_sleeptime;
    ktime_t                 sleep_length;
    unsigned long           last_jiffies;
    unsigned long           next_jiffies;
    ktime_t                 idle_expires;
    int                     do_timer_last;
};

/*
 * Per cpu nohz control structure
 */
static DEFINE_PER_CPU(struct tick_sched, tick_cpu_sched);
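As a rough illustration of how the statistics fields relate to each other, here is a userspace sketch that follows the field descriptions in the comment above. It is not the kernel's actual update code: the struct name, the two helper functions, the nanosecond units, and the `can_stop` parameter are hypothetical, and only a handful of fields are modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for a few tick_sched fields; times are in ns. */
struct tick_sched_model {
    int           tick_stopped;
    unsigned long idle_calls;
    unsigned long idle_sleeps;
    int64_t       idle_entrytime;
    int64_t       idle_sleeptime;
};

/*
 * Idle entry: every call is counted in idle_calls; idle_sleeps is only
 * incremented when the tick is actually stopped for this idle period.
 */
void model_idle_enter(struct tick_sched_model *ts, int64_t now, int can_stop)
{
    ts->idle_calls++;
    ts->idle_entrytime = now;
    if (can_stop) {
        ts->tick_stopped = 1;
        ts->idle_sleeps++;
    }
}

/*
 * Idle exit: the time slept is added to idle_sleeptime only if the tick
 * was stopped, matching the "with sched tick stopped" wording.
 */
void model_idle_exit(struct tick_sched_model *ts, int64_t now)
{
    if (ts->tick_stopped) {
        ts->idle_sleeptime += now - ts->idle_entrytime;
        ts->tick_stopped = 0;
    }
}
```

In this model, an idle period during which the tick could not be stopped still counts toward idle_calls but contributes nothing to idle_sleeps or idle_sleeptime.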

1.1      Low-Resolution NOHZ Mode

In low-resolution mode, every tick raises the TIMER_SOFTIRQ softirq, and its handler, run_timer_softirq, may switch the clock to NOHZ mode. The switch follows this call chain:

run_timer_softirq
    hrtimer_run_pending
        tick_check_oneshot_change
            tick_nohz_switch_to_nohz();
                tick_switch_to_oneshot(tick_nohz_handler)

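This call chain is taken only when the CONFIG_HIGH_RES_TIMERS option is not set, that is, when high-resolution mode is not enabled while NOHZ mode is enabled in the kernel.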
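The precondition can be summarized as a small decision function. This is a simplified userspace sketch, not the kernel's tick_check_oneshot_change: the enum, the struct, and the function name are hypothetical, and the real code checks further conditions that are omitted here.

```c
#include <stdbool.h>

/* Hypothetical labels for the three tick modes discussed in this article. */
enum tick_mode_model {
    MODE_PERIODIC,      /* classic HZ-based periodic tick */
    MODE_NOHZ_LOWRES,   /* dynamic tick driven by tick_nohz_handler */
    MODE_HIGHRES        /* high-resolution mode, tick emulated by an hrtimer */
};

/* Hypothetical summary of the two switches the text refers to. */
struct clock_config_model {
    bool nohz_enabled;      /* NOHZ support enabled in the kernel */
    bool highres_enabled;   /* high-resolution timers enabled */
};

/*
 * High-resolution mode takes precedence; low-resolution NOHZ is chosen
 * only when high-resolution mode is off and NOHZ is on.
 */
enum tick_mode_model choose_tick_mode(const struct clock_config_model *cfg)
{
    if (cfg->highres_enabled)
        return MODE_HIGHRES;
    if (cfg->nohz_enabled)
        return MODE_NOHZ_LOWRES;
    return MODE_PERIODIC;
}
```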

The core handler for dynamic tick in low-resolution mode is tick_nohz_handler, shown below.

static void tick_nohz_handler(struct clock_event_device *dev)
{
    struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
    struct pt_regs *regs = get_irq_regs();
    int cpu = smp_processor_id();
    ktime_t now = ktime_get();

    dev->next_event.tv64 = KTIME_MAX;

    if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
        tick_do_timer_cpu = cpu;

    /* Check, if the jiffies need an update */
    if (tick_do_timer_cpu == cpu)
        tick_do_update_jiffies64(now);

    /*
     * When we are idle and the tick is stopped, we have to touch
     * the watchdog as we might not schedule for a really long
     * time. This happens on complete idle SMP systems while
     * waiting on the login prompt. We also increment the "start
     * of idle" jiffy stamp so the idle accounting adjustment we
     * do when we go busy again does not account too much ticks.
     */
    if (ts->tick_stopped) {
        touch_softlockup_watchdog();
        ts->idle_jiffies++;
    }

    update_process_times(user_mode(regs));
    profile_tick(CPU_PROFILING);

    /* Program the next tick event; retry if it has already expired */
    while (tick_nohz_reprogram(ts, now)) {
        now = ktime_get();
        tick_do_update_jiffies64(now);
    }
}

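This function first emulates the work of a periodic tick device: if the current CPU is responsible for the global tick, it updates jiffies, and it also performs the per-CPU work such as process time accounting. If the tick had already been stopped before this event, it may have been off for so long that the watchdog would time out and report a soft lockup, so touch_softlockup_watchdog is called to reset the software watchdog. As the code comment notes, this can happen on a completely idle SMP system waiting at the login prompt after boot. Finally, the next tick expiry must be programmed. If more than one jiffy elapses while tick_nohz_reprogram runs, the expiry it has just programmed is already in the past, so it must be programmed again, and jiffies must be updated again as well. Although the next timer event is programmed here, the tick is stopped when the system goes idle, so that event may or may not actually fire. This is the defining property of dynamic tick.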
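The division of labour described above, with one CPU advancing the global time and every CPU doing its own local accounting, can be sketched in userspace as follows. This is a toy model under stated assumptions: all names are hypothetical, jiffies advances by exactly one per call (the real tick_do_update_jiffies64 advances it by the elapsed time), and `local_ticks` merely stands in for update_process_times.

```c
#include <stdint.h>

#define DO_TIMER_NONE_MODEL (-1)   /* hypothetical: nobody owns the duty */

/* Hypothetical global state shared by all CPUs. */
struct global_tick_model {
    int      do_timer_cpu;   /* CPU currently responsible for jiffies */
    uint64_t jiffies;
};

/* Hypothetical per-CPU state. */
struct cpu_tick_model {
    int           tick_stopped;
    unsigned long idle_jiffies;
    unsigned long local_ticks;   /* stands in for update_process_times() */
};

/* One tick event delivered to CPU `cpu`. */
void model_tick(struct global_tick_model *g, struct cpu_tick_model *cpus, int cpu)
{
    /* Take over the jiffies duty if the previous owner dropped it. */
    if (g->do_timer_cpu == DO_TIMER_NONE_MODEL)
        g->do_timer_cpu = cpu;

    /* Only the owning CPU advances the global time. */
    if (g->do_timer_cpu == cpu)
        g->jiffies++;

    /* A tick that arrives while the tick is stopped shifts the idle stamp. */
    if (cpus[cpu].tick_stopped)
        cpus[cpu].idle_jiffies++;

    /* Per-CPU accounting happens on every CPU that receives the tick. */
    cpus[cpu].local_ticks++;
}
```

In this model, whichever CPU first sees the dropped duty keeps it, so a tick on any other CPU leaves jiffies untouched while still updating that CPU's own counters.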

1.2      High-Resolution NOHZ Mode

The setup call flow is as follows:

hrtimer_switch_to_hres();
    tick_init_highres
        tick_switch_to_oneshot(hrtimer_interrupt);
    tick_setup_sched_timer();
        ts->sched_timer.function = tick_sched_timer;

The core handler in high-resolution NOHZ mode is tick_sched_timer, implemented as follows:

/*
 * We rearm the timer until we get disabled by the idle code.
 * Called with interrupts disabled and timer->base->cpu_base->lock held.
 */
static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
{
    struct tick_sched *ts =
        container_of(timer, struct tick_sched, sched_timer);
    struct pt_regs *regs = get_irq_regs();
    ktime_t now = ktime_get();
    int cpu = smp_processor_id();

#ifdef CONFIG_NO_HZ
    /*
     * Check if the do_timer duty was dropped. We don't care about
     * concurrency: This happens only when the cpu in charge went
     * into a long sleep. If two cpus happen to assign themself to
     * this duty, then the jiffies update is still serialized by
     * xtime_lock.
     */
    if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE))
        tick_do_timer_cpu = cpu;
#endif

    /* Check, if the jiffies need an update */
    if (tick_do_timer_cpu == cpu)
        tick_do_update_jiffies64(now);

    /*
     * Do not call, when we are not in irq context and have
     * no valid regs pointer
     */
    if (regs) {
        /*
         * When we are idle and the tick is stopped, we have to touch
         * the watchdog as we might not schedule for a really long
         * time. This happens on complete idle SMP systems while
         * waiting on the login prompt. We also increment the "start of
         * idle" jiffy stamp so the idle accounting adjustment we do
         * when we go busy again does not account too much ticks.
         */
        if (ts->tick_stopped) {
            touch_softlockup_watchdog();
            ts->idle_jiffies++;
        }
        update_process_times(user_mode(regs));
        profile_tick(CPU_PROFILING);
    }

    /* Move the expiry forward to the next tick period */
    hrtimer_forward(timer, now, tick_period);

    return HRTIMER_RESTART;
}

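As the simplified implementation above shows, in high-resolution mode tick_sched_timer emulates a periodic tick device. Note that tick_sched_timer is itself called from hrtimer_interrupt. Dynamic tick uses the same function, because in high-resolution mode hrtimer must drive the tick device in one-shot mode, which is also what dynamic tick requires. Although the same function is used and it nominally produces a periodic tick interrupt, a system with dynamic tick stops the tick while idle, so the tick interrupts are not strictly periodic.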
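The re-arming step at the end of tick_sched_timer relies on hrtimer_forward to push the expiry past the current time in whole tick periods. The arithmetic can be sketched in userspace as below. This is a simplified model, not the kernel function: the name `forward_expiry` and the plain nanosecond integers are hypothetical, and the kernel's clamping of the interval to the timer resolution is left out.

```c
#include <stdint.h>

/*
 * Hypothetical model of the forwarding arithmetic.
 *
 * If *expires_ns is still in the future, nothing changes and 0 is
 * returned. Otherwise *expires_ns is advanced by the smallest whole
 * number of intervals that places it strictly after now_ns, and that
 * number (the overrun count) is returned.
 */
uint64_t forward_expiry(int64_t *expires_ns, int64_t now_ns, int64_t interval_ns)
{
    int64_t delta = now_ns - *expires_ns;
    uint64_t overruns;

    if (interval_ns <= 0 || delta < 0)
        return 0;

    overruns = (uint64_t)(delta / interval_ns) + 1;
    *expires_ns += (int64_t)overruns * interval_ns;
    return overruns;
}
```

Because the expiry always advances in multiples of the period, the emulated tick stays on its original time grid even after the handler runs late or the CPU has skipped many periods while idle.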

1.3      Starting and Stopping the Dynamic Tick

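The best moment to engage the dynamic tick mechanism and stop the tick is when a CPU goes idle. Conversely, when the CPU leaves idle and resumes work, dynamic tick can be disengaged and the tick restarted, as shown below: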

Starting and stopping the dynamic tick in the CPU idle loop

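void cpu_idle(void)
{
    . . . .
    while (1) {
        /* Entering idle: try to stop the tick */
        tick_nohz_stop_sched_tick(1);
        while (!need_resched()) {
            . . . .
        }

        /* Work is pending: restart the tick before scheduling */
        tick_nohz_restart_sched_tick();
    }
}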
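The pairing of stop and restart calls around the inner wait loop can be exercised with a small userspace model. Everything here is hypothetical and heavily simplified: the struct, the helper names, and the `polls[]` array (how many wait iterations pass before work shows up in each round) are invented for illustration, and the real stop function may decide not to stop the tick at all.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state for the idle loop model. */
struct idle_cpu_model {
    bool          tick_stopped;
    unsigned long stops;        /* times the tick was stopped */
    unsigned long restarts;     /* times the tick was restarted */
    unsigned long idle_polls;   /* iterations spent waiting for work */
};

/* Stop the tick on idle entry; a no-op if it is already stopped. */
void model_stop_tick(struct idle_cpu_model *c)
{
    if (!c->tick_stopped) {
        c->tick_stopped = true;
        c->stops++;
    }
}

/* Restart the tick on idle exit; a no-op if it is already running. */
void model_restart_tick(struct idle_cpu_model *c)
{
    if (c->tick_stopped) {
        c->tick_stopped = false;
        c->restarts++;
    }
}

/*
 * Run `rounds` passes of the outer idle loop. polls[i] is the number of
 * inner-loop iterations before need_resched() would become true.
 */
void model_cpu_idle(struct idle_cpu_model *c, const int *polls, int rounds)
{
    for (int i = 0; i < rounds; i++) {
        model_stop_tick(c);
        for (int p = 0; p < polls[i]; p++)
            c->idle_polls++;    /* stands in for the low-power wait */
        model_restart_tick(c);
    }
}
```

In this model, every pass through the outer loop produces exactly one stop and one restart, so the tick is always running again by the time the CPU goes back to scheduling work.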
