High-Precision Timing on x86 by Reading the Time Stamp Counter with the rdtsc Instruction

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/cnmaizi/archive/2011/01/17/1937772.html

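This article introduces the Time Stamp Counter (TSC) found in 80x86 microprocessors, which can be used to work out the CPU clock frequency and to measure the speed of processing units. It explains how to use the rdtsc instruction and gives concrete code for reading the TSC under Linux.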

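Starting with the Pentium, many 80x86 microprocessors include the TSC (Time Stamp Counter), a 64-bit register that is incremented by one on every clock signal (CLK is the input pin on the microprocessor that receives the clock signal from an external oscillator).
It can be used to calculate the CPU clock frequency: for example, if the processor runs at 1 MHz, the TSC increases by 1,000,000 in one second. Note that on newer processors with an invariant TSC the counter ticks at a fixed rate that need not equal the current core frequency. Beyond frequency, the TSC can also be used to measure the speed of other processing units in the microprocessor; http://www.h52617221.de/dictionary.php?id=278 covers this.
So how do we read the TSC? With rdtsc, an instruction that reads the TSC: it stores the low 32 bits in the eax register and the high 32 bits in edx. See reference [1] for a more detailed description.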
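The frequency calculation described above can be sketched as follows. This is only an illustration under a few assumptions: it uses the `__rdtsc()` intrinsic from `<x86intrin.h>` (GCC/Clang on x86) rather than hand-written assembly, it measures wall-clock time with POSIX `clock_gettime` and `nanosleep`, and the helper names `ticks_to_hz` and `estimate_tsc_hz` are made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime and nanosleep */
#endif
#include <stdint.h>
#include <time.h>
#include <x86intrin.h>            /* __rdtsc(), GCC/Clang on x86 only */

/* Hypothetical helper: turn a tick delta and an elapsed time into ticks/second. */
static double ticks_to_hz(uint64_t ticks, double seconds)
{
    return (double)ticks / seconds;
}

/* Hypothetical helper: count TSC ticks across a sleep of sleep_ms milliseconds
 * and divide by the wall-clock time that actually elapsed. */
static double estimate_tsc_hz(long sleep_ms)
{
    struct timespec t0, t1, req;
    req.tv_sec  = sleep_ms / 1000;
    req.tv_nsec = (sleep_ms % 1000) * 1000000L;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    uint64_t c0 = __rdtsc();
    nanosleep(&req, NULL);
    uint64_t c1 = __rdtsc();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return ticks_to_hz(c1 - c0, secs);
}
```

Calling `estimate_tsc_hz(1000)` counts ticks over roughly one second, mirroring the 1 MHz example: a counter that advanced by 1,000,000 in one second yields 1,000,000 ticks per second. The result is the TSC tick rate, which may differ from the momentary core clock on CPUs that scale frequency.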
Now for how rdtsc is used in practice. In the Linux source file include/asm-i386/msr.h there are three macro definitions for rdtsc:

#define rdtsc(low,high) \
      __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))
#define rdtscl(low) \
       __asm__ __volatile__("rdtsc" : "=a" (low) : : "edx")
#define rdtscll(val) \
       __asm__ __volatile__("rdtsc" : "=A" (val))
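These macros come from the 32-bit i386 headers. One caveat: in GCC the `"=A"` constraint denotes the edx:eax pair only on 32-bit x86; on x86-64 it lets the compiler pick either rax or rdx, so `rdtscll` as written would not return the full counter there. A minimal sketch that works on both, assuming GCC-style inline assembly (the names `combine_tsc` and `read_tsc` are invented for this example), reads the two halves explicitly and joins them:

```c
#include <stdint.h>

/* Join the two 32-bit halves that rdtsc leaves in edx (high) and eax (low). */
static uint64_t combine_tsc(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Read the TSC with explicit "=a"/"=d" outputs, which is valid on both
 * i386 and x86-64 (unlike "=A", which is only a register pair on i386). */
static uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return combine_tsc(hi, lo);
}
```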

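To find out precisely how long a piece of code or a function takes, first execute rdtsc twice in a row with nothing in between and count the cycles consumed by the two reads themselves. That baseline varies from machine to machine and depends on the hardware, but it is largely independent of load: even with many processes or threads running, it is rare for a large number of cycles to be inserted between two consecutive rdtsc instructions.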
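The back-to-back measurement can be sketched like this. It is an illustration, not the author's code: it assumes the `__rdtsc()` intrinsic from `<x86intrin.h>` (GCC/Clang on x86), and it takes the minimum over several trials so that an occasional interrupt or context switch between the two reads does not inflate the baseline. The names `rdtsc_overhead` and `net_cycles` are hypothetical.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc(), GCC/Clang on x86 only */

/* Smallest observed gap, in TSC ticks, between two adjacent reads. */
static uint64_t rdtsc_overhead(int trials)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < trials; i++) {
        uint64_t t0 = __rdtsc();
        uint64_t t1 = __rdtsc();   /* nothing between the two reads */
        uint64_t d = t1 - t0;
        if (d < best)
            best = d;
    }
    return best;
}

/* Subtract the read overhead from a raw measurement, clamping at zero. */
static uint64_t net_cycles(uint64_t raw, uint64_t overhead)
{
    return raw > overhead ? raw - overhead : 0;
}
```

The value returned by `rdtsc_overhead` can then be subtracted from later measurements with `net_cycles` to approximate the cost of the measured code alone.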

#include <stdio.h>      /* printf, used by get_counter */

static unsigned cyc_hi = 0;
static unsigned cyc_lo = 0;

/* Set *hi and *lo to the high and low order bits of the cycle counter.
 * Implementation requires assembly code to use the rdtsc instruction.
 * __volatile__ keeps the compiler from removing or merging repeated reads. */
void access_counter(unsigned *hi, unsigned *lo)
{
        __asm__ __volatile__("rdtsc; movl %%edx, %0; movl %%eax, %1"
                        : "=r" (*hi), "=r" (*lo)
                        : /* No input */
                        : "%edx", "%eax");
}

/* Record the current value of the cycle counter. */
void start_counter(void)
{
        access_counter(&cyc_hi, &cyc_lo);
        return;
}

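RDTSC only exists on x86; other platforms have similar instructions for precise counting. The precision of RDTSC is acceptable, since very few cycles can slip in between reads. If your timing requirements are less strict, you can use general-purpose library functions instead, and you can check the precision of those functions the same way: call them twice in a row.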
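As one way to apply the "call it twice in a row" check to a library timer, the sketch below probes POSIX `clock_gettime` with `CLOCK_MONOTONIC`. The choice of this particular function is an assumption, since the text does not name a library routine, and the helpers `timespec_diff_ns` and `clock_min_step_ns` are invented for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime */
#endif
#include <time.h>

/* Difference b - a in nanoseconds. */
static long long timespec_diff_ns(struct timespec a, struct timespec b)
{
    return (long long)(b.tv_sec - a.tv_sec) * 1000000000LL
         + (long long)(b.tv_nsec - a.tv_nsec);
}

/* Smallest nonzero step seen between two adjacent clock_gettime calls,
 * or -1 if every pair of calls returned the same timestamp. */
static long long clock_min_step_ns(int trials)
{
    long long best = -1;
    for (int i = 0; i < trials; i++) {
        struct timespec a, b;
        clock_gettime(CLOCK_MONOTONIC, &a);
        clock_gettime(CLOCK_MONOTONIC, &b);
        long long d = timespec_diff_ns(a, b);
        if (d > 0 && (best < 0 || d < best))
            best = d;
    }
    return best;
}
```

The smallest step gives a rough lower bound on what this timer can resolve on a given machine, which can be compared against the cycle-level resolution of rdtsc.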

/* Return the number of cycles since the last call to start_counter. */
double get_counter(void)
{
        unsigned int    ncyc_hi, ncyc_lo;
        unsigned int    hi, lo, borrow;
        double  result;

        /* Get cycle counter */
        access_counter(&ncyc_hi, &ncyc_lo);

        /* Do double precision subtraction */
        lo = ncyc_lo - cyc_lo;
        borrow = lo > ncyc_lo;
        hi = ncyc_hi - cyc_hi - borrow;

        result = (double)hi * (1 << 30) * 4 + lo;
        if (result < 0) {
                printf("Error: counter returns neg value: %.0f\n", result);
        }
        return result;
}
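The borrow logic in get_counter can be checked against plain 64-bit arithmetic. The sketch below is illustrative only: `elapsed_split` restates the same split subtraction on explicit arguments, and `elapsed_u64` is a hypothetical alternative that joins each pair of halves into a `uint64_t` first. Both should agree for any start and end values.

```c
#include <stdint.h>

/* Elapsed ticks using split 32-bit subtraction with a manual borrow,
 * following the same steps as get_counter. */
static double elapsed_split(uint32_t hi0, uint32_t lo0,
                            uint32_t hi1, uint32_t lo1)
{
    uint32_t lo = lo1 - lo0;        /* wraps modulo 2^32 if lo1 < lo0 */
    uint32_t borrow = lo > lo1;     /* 1 exactly when the low half wrapped */
    uint32_t hi = hi1 - hi0 - borrow;
    return (double)hi * 4294967296.0 + lo;   /* hi * 2^32 + lo */
}

/* The same quantity computed with a single 64-bit subtraction. */
static uint64_t elapsed_u64(uint32_t hi0, uint32_t lo0,
                            uint32_t hi1, uint32_t lo1)
{
    uint64_t start = ((uint64_t)hi0 << 32) | lo0;
    uint64_t end   = ((uint64_t)hi1 << 32) | lo1;
    return end - start;
}
```

For example, going from hi=0, lo=0xFFFFFFFF to hi=1, lo=0 is a single tick: the low halves subtract to 1 with a borrow, and the borrow cancels the difference in the high halves.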

Usage:

start_counter();
tmp = fast_log(input);
cnt = get_counter();
printf("tmp = %f. clk counter = %lf.\n",tmp,cnt);
