why define size_t ssize_t ?

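This article discusses how the size_t and ssize_t types are defined in Linux and how they are used on different platforms. size_t represents size information such as string lengths, while ssize_t represents a signed size of a data block.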


"William Xuuu" <william_xuuu@163.com> wrote in message
news:87oefozb8g.fsf@163.com...
> This's a way of defining size_t and ssize_t in Linux:
>
> //"linux/types.h"
> typedef __kernel_size_t size_t;
> typedef __kernel_ssize_t ssize_t;
>
> //"asm/posix_types.h"
> typedef unsigned int __kernel_size_t;
> typedef int __kernel_ssize_t;
>
> It seems so tricky to me. What benefits do we get from such a tricky
> typedef ? better name (size_t is shorter than unsigned int) ?

On some 64-bit platforms, 'int' and 'unsigned int' may be 32 bits wide,
while addresses and even memory sizes could require 64 bits
(e.g. for the size of an allocated memory block larger than 4 GB).
On such platforms, size_t could be typedef'd to a different
type, such as 'unsigned long long'.

Note also that 'size_t' is a typedef required by the ISO C standard
(it must be available if <stddef.h> is included). However, 'ssize_t'
does not exist in the C standard -- the standard 'ptrdiff_t'
typedef is nearly equivalent.
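
A minimal sketch of the two points above, assuming a hosted C99 compiler. The helper names `fits_in_unsigned_int` and `span_between` are made up for illustration and do not come from any header. The first shows why a byte count cannot always be squeezed into an `unsigned int`; the second shows `ptrdiff_t` carrying a signed distance.

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if the byte count n can be stored in an unsigned int
   without truncation, 0 otherwise. Where size_t is wider than
   unsigned int, large counts fail this check. */
int fits_in_unsigned_int(size_t n)
{
    return n <= UINT_MAX;
}

/* ptrdiff_t is the standard signed type for the difference of two
   pointers into the same array; the result may be negative. */
ptrdiff_t span_between(const char *from, const char *to)
{
    return to - from;
}
```

Calling `span_between(buf + 5, buf + 2)` on a `char buf[8]` gives a negative distance, which an unsigned type such as size_t could not represent.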


# 3
Old November 14th, 2005, 04:37 PM
 
Default Re: why define size_t ssize_t ?

William Xuuu wrote:
>
> Hi,
>
> This's a way of defining size_t and ssize_t in Linux:
>
> //"linux/types.h"
> typedef __kernel_size_t size_t;
> typedef __kernel_ssize_t ssize_t;
>
> //"asm/posix_types.h"
> typedef unsigned int __kernel_size_t;
> typedef int __kernel_ssize_t;
>
> It seems so tricky to me. What benefits do we get from such a tricky
> typedef ? better name (size_t is shorter than unsigned int) ?

On ssize_t I have no comment, since the C language's definition
does not incorporate such a type.

The size_t type is defined by ISO for use by functions such as
malloc and strlen (among others, of course), for representing
size information. It is required to be an unsigned integral
type, but need not be unsigned int. It could equally be
unsigned long or unsigned long long on some systems. It could
even (perhaps a touch pathologically) be unsigned char!
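
Because size_t is unsigned, arithmetic on it wraps around instead of going negative. The sketch below illustrates that consequence under the assumption of a C99 compiler (for `SIZE_MAX` in `<stdint.h>`); `one_below_zero` and `checked_sub` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* Subtracting 1 from a size_t holding 0 wraps around to SIZE_MAX,
   because unsigned arithmetic is performed modulo 2^N. */
size_t one_below_zero(void)
{
    size_t zero = 0;
    return zero - 1;
}

/* Guarded "a - b" for sizes: returns 1 and stores the difference in
   *out, or returns 0 and leaves *out untouched if a < b. */
int checked_sub(size_t a, size_t b, size_t *out)
{
    if (a < b)
        return 0;
    *out = a - b;
    return 1;
}
```

Checking `a < b` before subtracting is one common way to avoid the wraparound when the difference of two sizes is needed.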

Consider the following code:

size_t len = strlen(s);

If we didn't have size_t, and wanted our code to run on the Zag,
Zeg, Zig, Zog, and Zug compilers, all of which have different
types to represent the size returned by strlen, we'd have to do
something like:

#if _ZAG
unsigned char len = strlen(s);
#elif _ZEG
unsigned short len = strlen(s);
#elif _ZIG
unsigned int len = strlen(s);
#elif _ZOG
unsigned long len = strlen(s);
#elif _ZUG
unsigned long long len = strlen(s);
#else
#error Unsupported platform.
#endif

The obvious step would be to add a typedef to handle this:

#if _ZAG
typedef unsigned char size_type;
#elif _ZEG
typedef unsigned short size_type;
#elif _ZIG
typedef unsigned int size_type;
#elif _ZOG
typedef unsigned long size_type;
#elif _ZUG
typedef unsigned long long size_type;
#else
#error Unsupported platform.
#endif

size_type len = strlen(s);

This is still a nuisance, although not such a bad nuisance, because
we can hide the cpp nonsense in a header.

If only we could persuade each implementor to include a standard
definition for strlen's return type, and put it in, say, stddef.h.
Then we wouldn't have to put all this nonsense in our own header,
and our code would be simpler as a result (and easier to port).

We're in luck. Fortunately, the C Standard actually /requires/
conforming implementations to do this; the name they settled on
was size_t.
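
With size_t in place, the portable version needs no preprocessor checks at all. A small sketch, assuming C99; `count_char` is an illustrative name, not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Counts how often c occurs in the string s. The length, the index
   and the result are all size_t, so no platform-specific type
   appears anywhere. */
size_t count_char(const char *s, char c)
{
    size_t len = strlen(s);
    size_t count = 0;
    for (size_t i = 0; i < len; ++i) {
        if (s[i] == c)
            ++count;
    }
    return count;
}
```

With `<stdio.h>` included, `printf("%zu\n", count_char("banana", 'a'));` prints 3; `%zu` is the C99 conversion for size_t.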

I hope that answers your question.
  
Default Re: why define size_t ssize_t ?

William Xuuu wrote:
> Hi,
>
> This's a way of defining size_t and ssize_t in Linux:
>
> //"linux/types.h"
> typedef __kernel_size_t size_t;
> typedef __kernel_ssize_t ssize_t;
>
> //"asm/posix_types.h"
> typedef unsigned int __kernel_size_t;
> typedef int __kernel_ssize_t;
>
> It seems so tricky to me. What benefits do we get from such a tricky
> typedef ? better name (size_t is shorter than unsigned int) ?
>
> thanks.
>

Remember to use `size_t` as the type for indexing variables
used to index into C-strings. For example,

#include <stdio.h>

int
main ()
{
    size_t index;
    char * foo = "Example string.";

    for (index = 0; index < (size_t) 5; ++index)
        (void) putchar (foo[ index ]);

    return 0;
}

Just my two cents. :)

Regards,
Jonathan.

--
"If unsigned integers look like two's complement (signed) integers,
it's because two's complement integers look like unsigned integers."
-Peter Nilsson
# 5
Old November 14th, 2005, 04:38 PM
 
Default Re: why define size_t ssize_t ?

Jonathan Burd wrote:
>
> Remember to use `size_t` as the type for indexing variables
> used to index into C-strings. For example,
>
> #include <stdio.h>
>
> int
> main ()
> {
> size_t index;
> char * foo = "Example string.";

char const *foo = "Example string.";

>
> for (index = 0; index < (size_t) 5; ++index)

Obfuscatory cast. 5 will be converted to size_t anyway.
Admittedly there are inferior compilers that will
issue a warning for signed-unsigned comparison in the
case of (index < 5).
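
The conversion is harmless for a constant like 5, but the same rule is what makes those warnings useful when the signed operand can be negative. A sketch, assuming size_t is at least as wide as int (true on mainstream platforms); `naive_less` and `safe_less` are hypothetical names.

```c
#include <stddef.h>

/* Mirrors what the implicit conversion does in "a < b" when a is int
   and b is size_t: a is converted to size_t first, so a negative a
   becomes a huge value. The cast here only makes that step visible. */
int naive_less(int a, size_t b)
{
    return (size_t)a < b;
}

/* Sign-aware comparison: a negative value is less than any size. */
int safe_less(int a, size_t b)
{
    return a < 0 || (size_t)a < b;
}
```

For `a = -1` and `b = 5`, `naive_less` reports 0 while `safe_less` reports 1.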

> (void) putchar (foo[ index ]);

useless cast. Sorry, but I disagree with the philosophy
of putting (void) before every unused return value. It
adds volumes of useless crap to your source code and
achieves almost nothing. If you want to search your code
for failure to check malloc's return value then you can
grep for malloc, etc.

>
> return 0;
> }
# 6
Old November 14th, 2005, 04:38 PM
 
Default Re: why define size_t ssize_t ?

infobahn <infobahn@btinternet.com> writes:
[...]
> Consider the following code:
>
> size_t len = strlen(s);
>
> If we didn't have size_t, and wanted our code to run on the Zag,
> Zeg, Zig, Zog, and Zug compilers, all of which have different
> types to represent the size returned by strlen, we'd have to do
> something like:
>
> #if _ZAG
> unsigned char len = strlen(s);
> #elif _ZEG
> unsigned short len = strlen(s);
> #elif _ZIG
> unsigned int len = strlen(s);
> #elif _ZOG
> unsigned long len = strlen(s);
> #elif _ZUG
> unsigned long long len = strlen(s);
> #else
> #error Unsupported platform.
> #endif

Actually, if strlen() returned some unknown unsigned type, we could
just use:

unsigned long long len = strlen(s);

and let the result be converted implicitly. (It gets slightly more
complex if we need to worry about implementations that don't support
long long).
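
One way this could look, including the fallback for compilers without long long, is sketched below. The typedef name `widest_len` and the function `length_widened` are invented for this example, and the version check is only one possible way to detect C99.

```c
#include <string.h>

/* Widest unsigned type we can name: unsigned long long where C99 is
   available, unsigned long on older compilers. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
typedef unsigned long long widest_len;
#else
typedef unsigned long widest_len;
#endif

/* Whatever unsigned type strlen really returns, the result is
   converted implicitly to the wider type on return. */
widest_len length_widened(const char *s)
{
    return strlen(s);
}
```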

But of course size_t is still a useful thing to have.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
# 7
Old November 14th, 2005, 04:38 PM
 
Default Re: why define size_t ssize_t ?

Old Wolf wrote:
> Jonathan Burd wrote:
>
>>Remember to use `size_t` as the type for indexing variables
>>used to index into C-strings. For example,
>>
>>#include <stdio.h>
>>
>>int
>>main ()
>>{
>> size_t index;
>> char * foo = "Example string.";
>
>
> char const *foo = "Example string.";
>
>
>> for (index = 0; index < (size_t) 5; ++index)
>
>
> Obfuscatory cast. 5 will be converted to size_t anyway.
> Admittedly there are inferior compilers that will
> issue a warning for signed-unsigned comparison in the
> case of (index < 5).
>
>
>
>> (void) putchar (foo[ index ]);
>
>
> useless cast. Sorry, but I disagree with the philosophy
> of putting (void) before every unused return value. It
> adds volumes of useless crap to your source code and
> achieves almost nothing. If you want to search your code
> for failure to check malloc's return value then you can
> grep for malloc, etc.
>
>
>> return 0;
>>}
>
>

Perhaps, splint should be a bit more forgiving in that case.

--
"If unsigned integers look like two's complement (signed) integers,
it's because two's complement integers look like unsigned integers."
-Peter Nilsson
# 8
Old November 14th, 2005, 04:50 PM
 
Default Re: why define size_t ssize_t ?

infobahn <infobahn@btinternet.com> writes:
> On ssize_t I have no comment, since the C language's definition
> does not incorporate such a type.
>
> The size_t type is defined by ISO for use by functions such as
> malloc and strlen (among others, of course), for representing

Hmm, thanks.

So, in other words, for portability reasons, functions that deal with
memory or address manipulation need size_t.

[...]
> We're in luck. Fortunately, the C Standard actually /requires/
> conforming implementations to do this; the name they settled on
> was size_t.
>
> I hope that answers your question.

--
William

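size_t is defined to make code easier to port between systems.

On 32-bit systems it is typically defined as unsigned int.
On 64-bit Linux systems it is typically defined as unsigned long.

More precisely, it is a 32-bit unsigned integer on 32-bit systems
and a 64-bit unsigned integer on 64-bit systems.

size_t is generally used for counts, such as how many items were copied.
The result of the sizeof operator has type size_t,
which is guaranteed to hold the size in bytes of the largest object the implementation can create.

Its meaning is roughly "an unsigned integer type suited to counting the data items that fit in memory".
That is why it is widely used for array subscripts and in memory-management functions.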
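
A short sketch of size_t as an array subscript, assuming C99. The function names `sum_ints` and `find_last` are illustrative only. The second function shows the usual way to count down with an unsigned index, since a condition like `i >= 0` is always true for size_t.

```c
#include <stddef.h>

/* Sums an int array. n would typically come from
   sizeof(arr) / sizeof(arr[0]), an expression of type size_t. */
long sum_ints(const int *arr, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += arr[i];
    return total;
}

/* Reverse traversal with an unsigned index: test "i-- > 0" rather
   than "i >= 0". Returns the last index holding value, or n if the
   value is absent. */
size_t find_last(const int *arr, size_t n, int value)
{
    for (size_t i = n; i-- > 0; ) {
        if (arr[i] == value)
            return i;
    }
    return n;
}
```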


ssize_t:

This data type represents the size of a block of data that can be read or written. It is similar to size_t, but it must be signed.
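
The signedness matters because POSIX I/O calls such as read() return -1 on failure. Below is a minimal sketch of a read loop, assuming a POSIX system and a buffer capacity no larger than SSIZE_MAX; `read_fully` is a hypothetical helper, not a standard function.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads until cap bytes are stored or end-of-file is reached.
   Returns the number of bytes read, or -1 on error (errno is set by
   read). The return type has to be signed to carry the -1.
   Assumes cap <= SSIZE_MAX. */
ssize_t read_fully(int fd, char *buf, size_t cap)
{
    size_t got = 0;
    while (got < cap) {
        ssize_t n = read(fd, buf + got, cap - got);
        if (n == -1) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;               /* end of file */
        got += (size_t)n;        /* n > 0 here, so the cast is safe */
    }
    return (ssize_t)got;
}
```

Storing the result of read() in a size_t instead would turn the -1 into a very large positive number, and the error would go unnoticed.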


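Here is a detailed look at the difference between `size_t` and `ssize_t` in C, two data types that matter a great deal in systems programming.

### Core differences at a glance

| **Property** | **`size_t`** | **`ssize_t`** |
| :--- | :--- | :--- |
| **Defining standard** | ISO C (declared in `<stddef.h>`, `<stdio.h>`, and others) | **POSIX** (mainly Unix-like systems; declared in `<sys/types.h>`) |
| **Signedness** | **Unsigned** integer | **Signed** integer |
| **Main use** | Sizes of objects in memory, array indices, loop counters | Sizes or counts returned by operations that may report an error as a negative value |
| **Meaning of a value** | **Non-negative only** ($\geq 0$) | **Non-negative or error** ($\geq 0$: size; $< 0$: error) |
| **Minimum value** | `0` | Implementation-defined; POSIX only requires that `-1` be representable |
| **`printf` format specifier** | `%zu` (C99 and later) | `%zd` in practice, or `%jd` after a cast to `intmax_t` |

---

### 1. `size_t`: an unsigned type for sizes

* **Nature:** an **unsigned** integer type.
* **Purpose:** represents the size of objects in memory and array indices. It is the type of the result of the `sizeof` operator.
* **Key properties:**
    * **Non-negative:** its value is **always** $\geq 0$; it cannot represent a negative number.
    * **Portability:** its width is defined by the implementation (compiler and target platform) so that it can represent the size of the **largest object possible** on that platform. It is usually 32 bits on 32-bit systems (equivalent to `unsigned int`) and 64 bits on 64-bit systems (equivalent to `unsigned long` on Linux).
* **Typical uses:**
    * Holding the result of `sizeof`: `size_t size = sizeof(int);`
    * Array indexing: `for (size_t i = 0; i < array_len; i++) { ... }`
    * Byte counts for memory allocation and copying: `memcpy(dest, src, num_bytes);`, where `num_bytes` is normally declared as `size_t`.
    * String lengths: `size_t len = strlen(str);`
* **Output:** C99 introduced `%zu` as the format specifier for `size_t` in the `printf` and `scanf` families:

```c
size_t file_size = 1024;
printf("File size: %zu bytes\n", file_size); // correct usage
```

---

### 2. `ssize_t`: a signed type for a size or an error

* **Nature:** a **signed** integer type.
* **Purpose (POSIX):** introduced by the POSIX standard (Linux, macOS, and other Unix-like systems) for sizes or byte counts returned by **system calls or library functions that may fail and need a negative value to say so**.
* **Key properties:**
    * **Non-negative or error:** on success the return value is $\geq 0$ and means the same as a `size_t`. On **failure** the function **returns a negative value** (normally `-1`), and the specific error code is stored in `errno`.
    * **Width:** on a given platform `ssize_t` usually has the same width as `size_t`, with one bit used for the sign.
* **Typical uses (POSIX systems):**
    * The return value of system calls such as `read`, `write`, `send`, and `recv`: the number of bytes actually transferred, or `-1` on error.

```c
ssize_t bytes_read = read(fd, buffer, sizeof(buffer));
if (bytes_read == -1) {
    perror("Read failed"); // handle the error
} else {
    // 'bytes_read' bytes were read successfully
}
```

    * Some functions that return an element count or another count may also return `(ssize_t)-1` on error.
* **Output:** POSIX does not specify a dedicated format specifier for `ssize_t`. In practice `%zd` is common (it is defined for the signed type corresponding to `size_t`); the cautious alternative is to cast to `intmax_t` and print with `%jd`:

```c
printf("Bytes read: %zd\n", bytes_read); // common; assumes ssize_t is the signed counterpart of size_t
// More cautious: convert to intmax_t (from <stdint.h>) and print with %jd
printf("Bytes read: %jd\n", (intmax_t)bytes_read);
```

---

### 3. Summary of the core differences

1. **Signed vs. unsigned:** this is the fundamental difference. `size_t` is **never negative** and only expresses a size (`0` means a size of zero). `ssize_t` **can be negative**: a non-negative value is the size from a successful operation, and a negative value (in particular `-1`) signals failure.
2. **Different standards:**
    * `size_t` is a **core part of ISO C**, used for basic size representation.
    * `ssize_t` is a **POSIX extension**, mainly serving system-level operations (especially I/O) that need to report errors.
3. **Semantics:**
    * `size_t` only conveys **"how much"**.
    * `ssize_t` conveys **"how much"** on success and also **"did the operation succeed"** (a negative value on failure).
4. **Relationship:** on the same platform the two types usually have the same width (both 64 bits on 64-bit Linux, for example), but because `ssize_t` spends one bit on the sign, its maximum (`SSIZE_MAX`) is typically about half of `SIZE_MAX`. Loosely speaking, `ssize_t` is the signed counterpart of `size_t` plus the convention that negative values mean an error.

---

### 4. Important notes

1. **Portability:** `ssize_t` is **not** part of ISO C. If your code must be **strictly standard C** or must run on a **non-POSIX platform** (such as the native Windows API), it is usually best to avoid `ssize_t`. Windows APIs usually report sizes in an unsigned type such as `DWORD` together with a separate error check (`GetLastError()`), or use a sentinel value such as `INVALID_FILE_SIZE`.
2. **Comparison:** be careful with signedness when comparing `size_t` and `ssize_t` variables. The signed value is converted to unsigned for the comparison, so a negative `ssize_t` becomes a very large number and the result can be surprising.
3. **Checking return values:** when calling a function that returns `ssize_t`, **always check** for a negative value (normally `-1`) and handle the error.
4. **Correct printing:** use the matching format specifier (`%zu` for `size_t`; `%zd`, or `%jd` with a cast, for `ssize_t`) to avoid undefined behavior from a mismatched argument type.

### Choosing between `size_t` and `ssize_t`

* Use `size_t` when the operation **cannot fail** or does not report failure through a negative return value, and the value **can never be negative** (object sizes, array indices).
* Use `ssize_t` (on systems that support POSIX) when the operation **may fail** and a POSIX function reports that failure by **returning a negative value**, in particular `-1`. If you need a quantity that can legitimately be negative, such as an offset, you may need to define a similar signed type yourself on non-POSIX platforms.

Understanding the difference between `size_t` and `ssize_t` is essential for writing robust, portable system-level C code.[^1][^2]

---

**References:**

[^1]: `<stddef.h>` and `<stdio.h>` headers define size_t as an unsigned type for sizes and indices.
[^2]: POSIX `<sys/types.h>` defines ssize_t as a signed type primarily for I/O operations that may indicate errors with negative return values.

**Related questions:**

1. On a non-POSIX platform (such as Windows), how can sizes or counts that may signal an error be handled safely?
2. What is the maximum value of a `size_t` variable, and how can code obtain it? (`SIZE_MAX`)
3. If a function returns `-1` as an `ssize_t`, how can the specific cause of the error be found? (`errno`)
4. Which older compilers may not support `%zu` or `%zd` for printing `size_t` / `ssize_t`, and how can that be worked around?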