Linux Time Subsystem 1: An Analysis of the gettimeofday and clock_gettime Implementations

Author: 致守

Published 2025-02-09 22:17:06

Original link: https://blog.youkuaiyun.com/Bluetangos/article/details/136721196

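1. Functions for getting the time from Linux user space

a. Second-resolution functions: time and stime

time and stime are declared as follows:

#include <time.h>
time_t time(time_t *t);
int stime(const time_t *t);

time returns the number of seconds from the Unix epoch (1970-01-01 00:00:00 UTC) to the current moment; in the kernel, the timekeeper module stores this value in timekeeper->xtime_sec. stime sets that value. On Linux, a process that sets the time must hold the CAP_SYS_TIME capability, otherwise the call fails with EPERM. Note that stime is deprecated: since glibc 2.31 it is no longer declared in <time.h>, and clock_settime should be used instead.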

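b. Microsecond-resolution functions: gettimeofday and settimeofday

#include <sys/time.h>
int gettimeofday(struct timeval *tv, struct timezone *tz);
int settimeofday(const struct timeval *tv, const struct timezone *tz);

These two functions mirror the second-resolution functions above, except that their resolution is one microsecond. gettimeofday returns the seconds and microseconds elapsed from the Unix epoch to the current moment.

The sys_gettimeofday and sys_settimeofday system calls exist to back these two functions. It is worth noting that POSIX.1-2008 marks gettimeofday as obsolescent and recommends clock_gettime instead; settimeofday was never part of POSIX, and its standardized counterpart is clock_settime.

That description is not entirely accurate, though. Section 5.1 of the book 《Linux多线程服务端编程》 states that on x86-64 Linux, gettimeofday is not a system call and does not trap into the kernel. That claim is also imprecise: gettimeofday really is a system call, but Linux's vDSO (virtual dynamic shared object) mechanism lets user space obtain the result without trapping into the kernel, which improves performance. We will see this when we read the code later.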

c. Nanosecond-resolution functions: clock_gettime and clock_settime

#include <time.h>
int clock_getres(clockid_t clk_id, struct timespec *res);
int clock_gettime(clockid_t clk_id, struct timespec *tp);
int clock_settime(clockid_t clk_id, const struct timespec *tp);

Apart from the clk_id parameter, clock_gettime and clock_settime need little explanation: they are conceptually the same as gettimeofday and settimeofday, except that their resolution is one nanosecond. Linux 5.10 defines the following clock IDs:

/*
 * The IDs of the various system clocks (for POSIX.1b interval timers):
 */
#define CLOCK_REALTIME              0
#define CLOCK_MONOTONIC             1
#define CLOCK_PROCESS_CPUTIME_ID    2
#define CLOCK_THREAD_CPUTIME_ID     3
#define CLOCK_MONOTONIC_RAW         4
#define CLOCK_REALTIME_COARSE       5
#define CLOCK_MONOTONIC_COARSE      6
#define CLOCK_BOOTTIME              7
#define CLOCK_REALTIME_ALARM        8
#define CLOCK_BOOTTIME_ALARM        9

The IDs above do not cover every clock type, however. We will analyze the clock types, and how time relates to clock sources, later.

2. Implementation of gettimeofday and clock_gettime

a. Implementation of gettimeofday

Let us start with gettimeofday. A sample program that uses it:

#include <sys/time.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct timeval tv_begin, tv_end;

    /* Sample the wall clock before and after a 1 ms sleep. */
    gettimeofday(&tv_begin, NULL);
    printf("start tv_sec %lld tv_usec %ld\n",
           (long long)tv_begin.tv_sec, (long)tv_begin.tv_usec);
    usleep(1000);
    gettimeofday(&tv_end, NULL);
    printf("end tv_sec %lld tv_usec %ld\n",
           (long long)tv_end.tv_sec, (long)tv_end.tv_usec);
    return 0;
}

In the Linux kernel, the file kernel/time/time.c contains the following code:

SYSCALL_DEFINE2(gettimeofday, struct __kernel_old_timeval __user *, tv,
                struct timezone __user *, tz)
{
    if (likely(tv != NULL)) {
        struct timespec64 ts;

        ktime_get_real_ts64(&ts);
        if (put_user(ts.tv_sec, &tv->tv_sec) ||
            put_user(ts.tv_nsec / 1000, &tv->tv_usec))
            return -EFAULT;
    }
    if (unlikely(tz != NULL)) {
        if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
            return -EFAULT;
    }
    return 0;
}

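Without looking at the C library's implementation, one would assume that gettimeofday simply invokes the system call above; that is what I assumed at first. Let us look at the C library instead (glibc or musl). In musl 1.2.3 the function is defined as follows: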

int gettimeofday(struct timeval *restrict tv, void *restrict tz)
{
    struct timespec ts;
    if (!tv) return 0;
    clock_gettime(CLOCK_REALTIME, &ts);
    tv->tv_sec = ts.tv_sec;
    tv->tv_usec = (int)ts.tv_nsec / 1000;
    return 0;
}

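musl's gettimeofday does not invoke the gettimeofday system call directly. It calls clock_gettime with the clock ID fixed to CLOCK_REALTIME, then truncates the nanoseconds to microseconds. So the next thing to analyze is clock_gettime.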

b. Implementation of clock_gettime

Here is clock_gettime in musl 1.2.3:

int __clock_gettime(clockid_t clk, struct timespec *ts)
{
    int r;

#ifdef VDSO_CGT_SYM
    int (*f)(clockid_t, struct timespec *) =
        (int (*)(clockid_t, struct timespec *))vdso_func;
    if (f) {
        r = f(clk, ts);
        if (!r) return r;
        if (r == -EINVAL) return __syscall_ret(r);
        /* Fall through on errors other than EINVAL. Some buggy
         * vdso implementations return ENOSYS for clocks they
         * can't handle, rather than making the syscall. This
         * also handles the case where cgt_init fails to find
         * a vdso function to use. */
    }
#endif

#ifdef SYS_clock_gettime64
    r = -ENOSYS;
    if (sizeof(time_t) > 4)
        r = __syscall(SYS_clock_gettime64, clk, ts);
    if (SYS_clock_gettime == SYS_clock_gettime64 || r!=-ENOSYS)
        return __syscall_ret(r);
    long ts32[2];
    r = __syscall(SYS_clock_gettime, clk, ts32);
    if (r==-ENOSYS && clk==CLOCK_REALTIME) {
        r = __syscall(SYS_gettimeofday, ts32, 0);
        ts32[1] *= 1000;
    }
    if (!r) {
        ts->tv_sec = ts32[0];
        ts->tv_nsec = ts32[1];
        return r;
    }
    return __syscall_ret(r);
#else
    r = __syscall(SYS_clock_gettime, clk, ts);
    if (r == -ENOSYS) {
        if (clk == CLOCK_REALTIME) {
            __syscall(SYS_gettimeofday, ts, 0);
            ts->tv_nsec = (int)ts->tv_nsec * 1000;
            return 0;
        }
        r = -EINVAL;
    }
    return __syscall_ret(r);
#endif
}
weak_alias(__clock_gettime, clock_gettime);

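The function clearly has two paths. Consider the first one, guarded by the VDSO_CGT_SYM macro. The vDSO is covered separately later, so in brief: to avoid the overhead of a system call, the kernel maps kernel-maintained data, along with code that reads it, into the user address space.

How does that data get updated? In the kernel, the time-update function timekeeping_update calls update_vsyscall, which refreshes the vDSO data structures.

Can clock_gettime then always obtain the time from user space? No. Even with the vDSO enabled, some cases still trap into the kernel, for example reading a PHC (PTP hardware clock). The kernel therefore still provides the regular system call, implemented as follows: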

SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
                struct __kernel_timespec __user *, tp)
{
    const struct k_clock *kc = clockid_to_kclock(which_clock);
    struct timespec64 kernel_tp;
    int error;

    if (!kc)
        return -EINVAL;

    error = kc->clock_get_timespec(which_clock, &kernel_tp);

    if (!error && put_timespec64(&kernel_tp, tp))
        error = -EFAULT;

    return error;
}

As before, the clock ID selects which time is returned, through the following table:

static const struct k_clock * const posix_clocks[] = {
    [CLOCK_REALTIME]            = &clock_realtime,
    [CLOCK_MONOTONIC]           = &clock_monotonic,
    [CLOCK_PROCESS_CPUTIME_ID]  = &clock_process,
    [CLOCK_THREAD_CPUTIME_ID]   = &clock_thread,
    [CLOCK_MONOTONIC_RAW]       = &clock_monotonic_raw,
    [CLOCK_REALTIME_COARSE]     = &clock_realtime_coarse,
    [CLOCK_MONOTONIC_COARSE]    = &clock_monotonic_coarse,
    [CLOCK_BOOTTIME]            = &clock_boottime,
    [CLOCK_REALTIME_ALARM]      = &alarm_clock,
    [CLOCK_BOOTTIME_ALARM]      = &alarm_clock,
    [CLOCK_TAI]                 = &clock_tai,
};

What these times mean, and which one we actually get, is discussed later.

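3. Open questions

a. The vDSO mechanism: how does the vDSO let user space obtain the time without trapping into the kernel?

b. How do the various times that clock_gettime can return differ?

Both questions are covered in the next article: Linux Time Subsystem 2: An Analysis of the VDSO Mechanism of clock_gettime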
