
Article link: https://blog.youkuaiyun.com/m0_74447750/article/details/146378493

Linux Basic I/O

An In-Depth Look at C File I/O and System File I/O


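File handling is one of the core skills in C programming and a good entry point for understanding what the operating system does underneath. This post starts with the C standard library's file I/O interface and works down to the Linux system calls, covering file descriptors, redirection, buffering, and the file system, with code examples and explanations along the way.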

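I. C File I/O Basics

1.1 Overview of the C File Interface

C provides a set of standard library functions for working with files. They wrap the underlying system calls and make file code portable across platforms. The most commonly used ones are listed below: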

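Function	Description	Return type	Typical use
`fopen`	Opens a file in one of several modes (read, write, append, ...)	`FILE*`	Starting point for any file read or write
`fclose`	Closes the file and releases its resources	`int` (0 on success, EOF on failure)	Called when file operations are finished
`fputc`	Writes a single character to a file	`int` (the character written, or EOF)	Character-by-character writes
`fgetc`	Reads a single character from a file	`int` (the character read, or EOF)	Character-by-character reads
`fputs`	Writes a string to a file (without the terminating null character)	`int` (non-negative on success, EOF on failure)	String writes
`fgets`	Reads one line, or up to the given length, from a file (the newline is kept)	`char*` (the buffer pointer on success, NULL on failure or end of file)	Line-by-line reads
`fprintf`	Writes data according to a format string	`int` (number of characters written, negative on failure)	Formatted output (e.g. logs)
`fscanf`	Reads data according to a format string	`int` (number of items successfully assigned, or EOF)	Formatted input
`fwrite`	Writes binary data to a file (element size and count are given)	`size_t` (number of elements written)	Binary file operations
`fread`	Reads binary data from a file (element size and count are given)	`size_t` (number of elements read)	Binary file operations
`fseek`	Sets the file position (offset relative to the start, the current position, or the end)	`int` (0 on success, -1 on failure)	Random access
`ftell`	Returns the current file position as an offset from the start of the file	`long` (the offset, or -1 on failure)	Querying the file position
`rewind`	Resets the file position to the start of the file	`void`	Resetting the file position
`ferror`	Checks whether an error has occurred on the stream	`int` (0 if no error, nonzero otherwise)	Error detection
`feof`	Checks whether the end-of-file indicator is set (it is set only after a read has hit the end)	`int` (0 if not set, nonzero if set)	Telling end of file apart from an error after a failed read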

These functions are the core tools for file handling in C. This post explains a subset of them in detail and shows how they are used in practice.
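The positioning functions in the table are not demonstrated elsewhere in this post, so here is a minimal sketch of one common use: seeking to the end of a file and calling `ftell` to obtain its size. The helper name `file_size` is made up for illustration, and the approach assumes a regular file opened in binary mode whose size fits in a `long`.

```c
#include <stdio.h>

/* Returns the size of the file in bytes, or -1 on error.
   file_size is a hypothetical helper, not part of the C library. */
long file_size(const char *path) {
    FILE *fp = fopen(path, "rb");        /* binary mode: no newline translation */
    if (fp == NULL) {
        return -1;
    }
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0) {   /* move the position to the end */
        size = ftell(fp);                /* offset from the start == size */
    }
    fclose(fp);
    return size;
}
```

Calling `rewind(fp)` after the `ftell` would bring the position back to the start if the same stream were going to be read afterwards.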

1.2 Opening and Closing Files (`fopen` and `fclose`)

Using `fopen`

fopen is the entry point for file operations. Its prototype is:

FILE *fopen(const char *pathname, const char *mode);

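pathname: a relative path (such as "log.txt") or an absolute path (such as "/home/user/log.txt"). If the file does not exist and the mode allows it, the file is created.
mode: how the file is opened. Common values:
- "r": read only; the file must exist.
- "w": write only; an existing file is truncated, a missing file is created.
- "a": append; writes go to the end of an existing file, a missing file is created.
- "r+": read and write, starting at the beginning of the file; the file must exist.
- "w+": read and write; the file is truncated or created.
- "a+": read and write; writes always go to the end, and with glibc reads start at the beginning.
- Adding "b" (as in "rb"): binary mode, which prevents newline translation in text files. It has no effect on Linux but matters on systems such as Windows.

Return value:

Success: returns a FILE* pointer.
Failure: returns NULL; perror prints the reason (for example insufficient permissions or a bad path).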
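To make the failure path concrete, the following sketch opens a file and reports which `errno` value `fopen` left behind. The helper name `try_open` is hypothetical; the only assumption is a POSIX system, where `fopen` sets `errno` on failure.

```c
#include <errno.h>
#include <stdio.h>

/* Tries to open path with the given mode. Returns 0 on success,
   otherwise the errno value set by fopen.
   try_open is a hypothetical helper used only for this illustration. */
int try_open(const char *path, const char *mode) {
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        int saved = errno;   /* save errno before calling anything else */
        perror("fopen");     /* prints "fopen: " followed by the reason */
        return saved;
    }
    fclose(fp);
    return 0;
}
```

For example, `try_open("/no/such/dir/missing.txt", "r")` returns `ENOENT`, because "r" never creates a file.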

Example: writing to a file

#include <stdio.h>
int main() {
    FILE *fp = fopen("log.txt", "w"); // open for writing; an existing file is truncated
    if (fp == NULL) {
        perror("fopen failed"); // prints the reason, e.g. "fopen failed: Permission denied"
        return 1;
    }
    int count = 5;
    while (count--) {
        fputs("hello world\n", fp); // write one line
    }
    fclose(fp); // flush the buffer and close the file
    return 0;
}

After the program runs, log.txt appears in the current directory with this content:

hello world
hello world
hello world
hello world
hello world

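Notes on `fclose`

fclose flushes the stream's buffer and releases the file's resources. If it is never called and the program terminates abnormally, data still sitting in the buffer can be lost. It returns 0 on success and EOF on failure (for example, when the final flush cannot write the buffered data).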

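1.3 Sequential Reads and Writes

Writing (`fputc` and `fputs`)

fputc: writes one character at a time, which suits cases that need precise control.

FILE *fp = fopen("log.txt", "w");
fputc('H', fp); // write the character 'H'
fclose(fp);

fputs: writes a whole string in one call, which is simpler and usually cheaper than calling fputc for every character.

FILE *fp = fopen("log.txt", "w");
fputs("Hello, C!\n", fp);
fclose(fp);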

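Reading (`fgetc` and `fgets`)

fgetc: reads one character at a time. It returns an int so that EOF can be told apart from every valid character.

FILE *fp = fopen("log.txt", "r");
int ch;
while ((ch = fgetc(fp)) != EOF) {
    putchar(ch); // echo to the screen
}
fclose(fp);

fgets: reads line by line, which suits text files.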

#include <stdio.h>
int main() {
    FILE *fp = fopen("log.txt", "r");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    char buffer[64];
    while (fgets(buffer, sizeof(buffer), fp) != NULL) {
        printf("%s", buffer); // print each line (fgets keeps the newline)
    }
    fclose(fp);
    return 0;
}

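1.4 What Is the Current Path?

Definition and Verification

In fopen("log.txt", "w") no absolute path is given, so the file is created in the "current path". What is that, exactly?

The current path (Current Working Directory, CWD) is the directory the process is working in while it runs, not the directory that contains the executable. For example:

Running ./myproc from /home/user/BasicIO creates log.txt in /home/user/BasicIO.
Running BasicIO/myproc from /home/user creates log.txt in /home/user.

To verify:

Get the process PID (with ps or getpid()).
Look at /proc/<PID>/cwd, a symbolic link that points to the current working directory.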

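Verification Example

#include <stdio.h>
#include <unistd.h>
int main() {
    printf("PID: %d\n", getpid());
    FILE *fp = fopen("log.txt", "w");
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    fputs("test\n", fp);
    fclose(fp);
    pause(); // wait here so that /proc/<PID>/cwd can be inspected
    return 0;
}

While the program is paused, run ls -l /proc/<PID>/cwd in another terminal to see the current path.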
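The same check can be done from inside the process. The sketch below compares `getcwd` with the target of `/proc/self/cwd` (the calling process's own entry under `/proc`). It is Linux-specific and assumes `/proc` is mounted; the function name `cwd_matches_proc` and the 4096-byte buffers are choices made for this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if getcwd() and the /proc/self/cwd link agree, 0 if they
   differ, -1 on error. Hypothetical helper; needs /proc to be mounted. */
int cwd_matches_proc(void) {
    char from_getcwd[4096];
    char from_proc[4096];

    if (getcwd(from_getcwd, sizeof(from_getcwd)) == NULL) {
        return -1;
    }
    /* readlink does not append '\0', so keep one byte free and add it. */
    ssize_t n = readlink("/proc/self/cwd", from_proc, sizeof(from_proc) - 1);
    if (n < 0) {
        return -1;
    }
    from_proc[n] = '\0';
    printf("cwd: %s\n", from_getcwd);
    return strcmp(from_getcwd, from_proc) == 0;
}
```

Calling `chdir` before this function changes both values together, which is another way to see that relative paths follow the process rather than the executable.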

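1.5 The Three Streams Opened by Default

Background

On Linux "everything is a file": devices such as the keyboard and the display are also abstracted as files. Every process normally starts with three standard streams already open:

stdin (file descriptor 0): standard input, usually connected to the keyboard.
stdout (file descriptor 1): standard output, usually connected to the display.
stderr (file descriptor 2): standard error, usually connected to the display.

These streams have type FILE* and are declared in stdio.h: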

extern FILE *stdin;
extern FILE *stdout;
extern FILE *stderr;

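Usage Example

#include <stdio.h>
int main() {
    fputs("To stdout\n", stdout); // goes to the display
    fputs("To stderr\n", stderr); // goes to the display (error stream)
    char buf[64];
    if (fgets(buf, sizeof(buf), stdin) != NULL) { // read a line from the keyboard
        fputs(buf, stdout); // echo the input
    }
    return 0;
}

When the program runs, whatever is typed is echoed back to the screen. stderr is normally reserved for error messages.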

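Why are they open by default?

When a process is created it receives a files_struct whose descriptor table is copied from its parent. A program started from a shell therefore inherits descriptors 0, 1, and 2 already open on the terminal, and the C library wraps them as stdin, stdout, and stderr. This is why a program can use the keyboard and the display without opening anything explicitly.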
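The link between the three `FILE*` streams and descriptors 0, 1, and 2 can be checked with `fileno`. A minimal sketch, with a function name (`std_streams_use_012`) invented for the example:

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the three standard streams wrap descriptors 0, 1 and 2.
   std_streams_use_012 is a hypothetical helper name. */
int std_streams_use_012(void) {
    return fileno(stdin) == STDIN_FILENO       /* 0 */
        && fileno(stdout) == STDOUT_FILENO     /* 1 */
        && fileno(stderr) == STDERR_FILENO;    /* 2 */
}
```

`STDIN_FILENO`, `STDOUT_FILENO`, and `STDERR_FILENO` are the POSIX names from `unistd.h` for the constants 0, 1, and 2.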

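II. System File I/O

2.1 System Calls vs. Library Functions

C standard library functions such as fopen are wrappers around system calls:

Linux: they wrap the POSIX interface (open, read, and so on).
Windows: they wrap the Windows API (CreateFile and so on).
System calls are lower level and talk to the kernel directly, at the cost of portability. They skip the library layer, but that does not make them automatically faster: every call crosses into the kernel, so many small unbuffered reads or writes are often slower than going through the stdio buffer.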
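One way to see the wrapping is to line up each `fopen` mode with the `open` flags it corresponds to. The mapping below follows the table in the Linux fopen(3) manual page; the function `mode_to_flags` itself is a hypothetical helper written for this post, not something the C library exposes, and it ignores the "b" suffix.

```c
#include <fcntl.h>
#include <string.h>

/* Maps an fopen mode string to the equivalent open() flags, following
   the Linux fopen(3) manual page. Returns -1 for an unknown mode.
   mode_to_flags is a hypothetical helper for illustration only. */
int mode_to_flags(const char *mode) {
    if (strcmp(mode, "r") == 0)  return O_RDONLY;
    if (strcmp(mode, "w") == 0)  return O_WRONLY | O_CREAT | O_TRUNC;
    if (strcmp(mode, "a") == 0)  return O_WRONLY | O_CREAT | O_APPEND;
    if (strcmp(mode, "r+") == 0) return O_RDWR;
    if (strcmp(mode, "w+") == 0) return O_RDWR | O_CREAT | O_TRUNC;
    if (strcmp(mode, "a+") == 0) return O_RDWR | O_CREAT | O_APPEND;
    return -1;
}
```

So `fopen("log.txt", "w")` ends up doing roughly what `open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666)` does, plus setting up a user-space buffer.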

2.2 The `open` Function

Prototype

#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

Parameters

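pathname:
- Relative path: resolved against the current working directory.
- Absolute path: starts at the root directory.
- Examples: "log.txt" or "/tmp/log.txt".
flags:
- Common flags:
  - O_RDONLY: read only.
  - O_WRONLY: write only.
  - O_RDWR: read and write.
  - O_CREAT: create the file if it does not exist.
  - O_APPEND: append on every write.
- Combining: use the | operator, as in O_WRONLY | O_CREAT.
- How it works: flags is an int used as a bitmask. The access mode (O_RDONLY, O_WRONLY, or O_RDWR) occupies the two lowest bits, and each remaining option such as O_CREAT has a bit of its own. Typical values on x86 Linux, written in octal:
```
#define O_RDONLY    00   // binary 00000000
#define O_WRONLY    01   // binary 00000001
#define O_CREAT   0100   // binary 01000000
```
  Options are tested with a bitwise AND. Because O_RDONLY is 0, the access mode is compared as (flags & O_ACCMODE) == O_RDONLY instead:
```
if (flags & O_CREAT) { /* create the file */ }
```
mode:
- The permissions of a newly created file, for example 0666 for rw-rw-rw-.
- It is filtered by umask: the final permissions are mode & ~umask. With a umask of 0002, 0666 & ~0002 = 0664; with the equally common 0022 the result is 0644.
- The argument can be omitted when flags does not include O_CREAT; when O_CREAT is present it must be supplied.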
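The two calculations described above, decoding the access mode and applying umask, can be written out as small functions. This is a sketch for illustration: `access_mode_name` and `effective_mode` are hypothetical names, and `O_ACCMODE` is the standard mask from `fcntl.h` that selects the access-mode bits.

```c
#include <fcntl.h>
#include <sys/types.h>

/* Names the access mode stored in the low bits of flags.
   O_RDONLY is 0, so it cannot be tested with a plain AND. */
const char *access_mode_name(int flags) {
    switch (flags & O_ACCMODE) {
    case O_RDONLY: return "read-only";
    case O_WRONLY: return "write-only";
    case O_RDWR:   return "read-write";
    default:       return "unknown";
    }
}

/* Permission bits a new file ends up with: mode filtered by the umask. */
mode_t effective_mode(mode_t mode, mode_t mask) {
    return mode & ~mask & 0777;
}
```

With these, `effective_mode(0666, 0002)` gives 0664 and `effective_mode(0666, 0022)` gives 0644, matching the arithmetic in the list above.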

Return Value

Success: returns a file descriptor (a non-negative integer).
Failure: returns -1 and sets errno.

Example: opening several files in a row

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
int main() {
    umask(0); // clear the umask so that 0666 is applied unchanged
    int fd1 = open("log1.txt", O_RDONLY | O_CREAT, 0666);
    int fd2 = open("log2.txt", O_RDONLY | O_CREAT, 0666);
    int fd3 = open("log3.txt", O_RDONLY | O_CREAT, 0666);
    printf("fd1: %d, fd2: %d, fd3: %d\n", fd1, fd2, fd3);
    close(fd1); close(fd2); close(fd3);
    return 0;
}

The output is typically fd1: 3, fd2: 4, fd3: 5, because 0, 1, and 2 are already taken by the standard streams.

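2.3 The `close` Function

Prototype

int close(int fd);

Closes the given file descriptor and releases its resources.
Return value: 0 on success, -1 on failure (for example, when fd is not a valid descriptor).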

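2.4 `write` and `read`

`write`: writing data

ssize_t write(int fd, const void *buf, size_t count);

fd: the target file descriptor.
buf: the buffer holding the data.
count: the number of bytes to write.
Return value: the number of bytes actually written, which can be less than count; -1 on failure.

Example: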

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
    int fd = open("log.txt", O_WRONLY | O_CREAT, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    const char *msg = "hello syscall\n";
    ssize_t written = write(fd, msg, strlen(msg));
    printf("Wrote %zd bytes\n", written);
    close(fd);
    return 0;
}

`read`: reading data

ssize_t read(int fd, void *buf, size_t count);

Return value: the number of bytes read, 0 at end of file, -1 on failure.

Example:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
    int fd = open("log.txt", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    char buf[64];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = '\0'; // null-terminate so the buffer can be printed as a string
        printf("%s", buf);
    }
    close(fd);
    return 0;
}
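Since `write` may write fewer bytes than requested and either call can be interrupted by a signal, code that has to move every byte usually loops. The following is a minimal sketch of a file copy built on `read` and `write`; `write_all` and `copy_file` are hypothetical helper names, and the 4096-byte buffer and 0644 permissions are arbitrary choices for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Writes exactly count bytes, retrying after short writes and EINTR.
   Returns 0 on success, -1 on error. Hypothetical helper. */
static int write_all(int fd, const char *buf, size_t count) {
    while (count > 0) {
        ssize_t n = write(fd, buf, count);
        if (n < 0) {
            if (errno == EINTR) continue;   /* interrupted: try again */
            return -1;
        }
        buf += n;                           /* skip what was written */
        count -= (size_t)n;
    }
    return 0;
}

/* Copies src to dst, truncating dst. Returns 0 on success, -1 on error. */
int copy_file(const char *src, const char *dst) {
    int in = open(src, O_RDONLY);
    if (in < 0) return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) { close(in); return -1; }

    char buf[4096];
    ssize_t n;
    int status = 0;
    while ((n = read(in, buf, sizeof(buf))) != 0) {   /* 0 means end of file */
        if (n < 0) {
            if (errno == EINTR) continue;
            status = -1;
            break;
        }
        if (write_all(out, buf, (size_t)n) < 0) { status = -1; break; }
    }
    close(in);
    if (close(out) < 0) status = -1;
    return status;
}
```

Unlike the `read` example above, the buffer is never treated as a string here, so no terminating `'\0'` is needed and binary files are copied correctly.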

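2.5 File Descriptors (`fd`)

How They Work

A file descriptor is the index a process uses to reach an open file. The kernel keeps a task_struct for every process; its files member points to a files_struct, which holds the descriptor table (the fd_array array):

Index: the file descriptor (0, 1, 2, ...).
Value: a pointer to a struct file, which represents the open file.

Descriptors 0, 1, and 2 are open by default and correspond to the keyboard (0) and the display (1 and 2), so the first file a program opens normally gets 3.

Allocation Rule

open returns the lowest-numbered descriptor that is not in use.
Example: after fd 0 is closed, the next file opened is assigned 0.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
int main() {
    close(0); // close stdin
    int fd = open("log.txt", O_RDONLY | O_CREAT, 0666);
    printf("fd: %d\n", fd); // prints 0
    close(fd);
    return 0;
}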
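The same rule can be observed without touching the standard streams. In this sketch a descriptor is opened, closed, and opened again; because the freed slot is once more the lowest unused index, the second `open` returns the same number. The function name `reuses_lowest_fd` is hypothetical, and the code assumes a single-threaded process in which nothing else opens files in between.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a descriptor freed by close() is handed out again by the
   next open(), 0 if not, -1 on error. Assumes a single-threaded process. */
int reuses_lowest_fd(void) {
    int first = open("/dev/null", O_RDONLY);   /* lowest free slot */
    if (first < 0) return -1;
    close(first);                              /* slot is free again */

    int second = open("/dev/null", O_RDONLY);  /* gets the same slot */
    if (second < 0) return -1;
    close(second);
    return second == first;
}
```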

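2.6 How Redirection Works

Output Redirection

Redirecting standard output (fd 1) to a file:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
    close(1); // free fd 1 (stdout)
    int fd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("hello redirect\n"); // ends up in log.txt
    fflush(stdout); // flush the buffer before closing the descriptor
    close(fd);
    return 0;
}

close(1) frees fd 1.
open assigns fd 1, the lowest free descriptor, to the new file.
printf writes to fd 1, which now refers to log.txt.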

Append Redirection

Add O_APPEND:

int fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0666);

Input Redirection

Redirecting standard input (fd 0):

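#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
    close(0); // free fd 0 (stdin)
    int fd = open("log.txt", O_RDONLY); // lowest free descriptor, so fd is 0
    if (fd < 0) {
        perror("open");
        return 1;
    }
    char buf[64];
    if (scanf("%63s", buf) == 1) { // reads from log.txt; the width keeps buf from overflowing
        printf("%s\n", buf);
    }
    close(fd);
    return 0;
}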

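`stdout` vs. `stderr`

stdout (fd 1) and stderr (fd 2) both go to the display by default, but the shell's > operator redirects only fd 1:

#include <stdio.h>
int main() {
    printf("To stdout\n"); // fd 1
    fprintf(stderr, "To stderr\n"); // fd 2
    return 0;
}

After running ./a.out > log.txt, log.txt contains only To stdout, while To stderr still appears on the screen. To capture both, use ./a.out > log.txt 2>&1.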

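2.7 The `dup2` Function

Prototype

int dup2(int oldfd, int newfd);

Makes newfd refer to the same open file as oldfd by copying oldfd's entry in the descriptor table; if newfd is already open, it is closed first.
Return value: newfd on success, -1 on failure.

Example

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
    int fd = open("log.txt", O_WRONLY | O_CREAT | O_TRUNC, 0666); // O_TRUNC discards old content
    if (fd < 0) {
        perror("open");
        return 1;
    }
    dup2(fd, 1); // redirect stdout to log.txt
    printf("hello dup2\n");
    close(fd); // fd 1 still refers to log.txt; the buffered text is flushed at exit
    return 0;
}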
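A redirection made with `dup2` can also be undone if the original descriptor is saved first with `dup`. The sketch below redirects stdout to a file for a single message and then restores it. `print_to_file` is a hypothetical helper name; the `fflush` calls matter because `printf` output sits in the stdio buffer until it is flushed.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Prints msg into path through stdout, then restores the original stdout.
   Returns 0 on success, -1 on error. Hypothetical helper. */
int print_to_file(const char *path, const char *msg) {
    fflush(stdout);                      /* flush output meant for the old target */
    int saved = dup(STDOUT_FILENO);      /* keep a copy of the current fd 1 */
    if (saved < 0) return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { close(saved); return -1; }
    if (dup2(fd, STDOUT_FILENO) < 0) { close(fd); close(saved); return -1; }
    close(fd);                           /* fd 1 now refers to the file */

    printf("%s", msg);
    fflush(stdout);                      /* push the text out before switching back */

    dup2(saved, STDOUT_FILENO);          /* restore the original stdout */
    close(saved);
    return 0;
}
```

After `print_to_file("dup2_demo.txt", "hello dup2\n")` returns, later `printf` calls go to the terminal again.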

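2.8 Adding Redirection to a Minishell

Approach

A simple shell parses each command line, recognizes >, >>, and <, and uses dup2 to perform the redirection:

Parse the input and separate the command from the redirection target.
Choose the open flags according to the redirection type (output, append, or input).
Use dup2 to replace the standard stream.

Complete Code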

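#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>
#include <sys/wait.h>
#define LEN 1024
#define NUM 32
int main() {
    char cmd[LEN], *myargv[NUM];
    while (1) {
        printf("$ ");
        fflush(stdout); // the prompt has no newline, so flush it explicitly
        if (fgets(cmd, LEN, stdin) == NULL) break; // EOF (Ctrl-D) or read error
        cmd[strcspn(cmd, "\n")] = '\0'; // strip the trailing newline, if any

        int type = 0; // 0: >, 1: >>, 2: <
        char *file = NULL;
        char *p = cmd;
        while (*p) {
            if (*p == '>') {
                type = (*(p + 1) == '>') ? 1 : 0;
                *p = '\0'; // end the command part here
                file = p + (type ? 2 : 1);
                break;
            } else if (*p == '<') {
                type = 2;
                *p = '\0';
                file = p + 1;
                break;
            }
            p++;
        }
        if (file) {
            // trim whitespace on both sides of the file name
            while (isspace((unsigned char)*file)) file++;
            char *end = file + strlen(file);
            while (end > file && isspace((unsigned char)end[-1])) *--end = '\0';
        }

        int i = 0;
        myargv[i] = strtok(cmd, " ");
        while (myargv[i] && i < NUM - 1) myargv[++i] = strtok(NULL, " ");
        myargv[i] = NULL; // argv must end with NULL even if it was truncated
        if (myargv[0] == NULL) continue; // empty line

        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            continue;
        }
        if (pid == 0) {
            if (file) {
                int fd;
                if (type == 0) {
                    fd = open(file, O_WRONLY | O_CREAT | O_TRUNC, 0664);
                } else if (type == 1) {
                    fd = open(file, O_WRONLY | O_CREAT | O_APPEND, 0664);
                } else {
                    fd = open(file, O_RDONLY);
                }
                if (fd < 0) { // check before handing fd to dup2
                    perror("open");
                    exit(1);
                }
                dup2(fd, type == 2 ? 0 : 1); // < replaces stdin; > and >> replace stdout
                close(fd); // the copy on 0 or 1 keeps the file open
            }
            execvp(myargv[0], myargv);
            perror("execvp"); // reached only if execvp fails
            exit(1);
        }
        wait(NULL);
    }
    return 0;
}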

Test Cases

echo hello > out.txt: writes hello to out.txt.
echo world >> out.txt: appends world to out.txt.
cat < out.txt: reads out.txt and prints it.
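The parsing step is easier to test when it is pulled out of `main`. The sketch below does the same split as the minishell loop but as a standalone function. `parse_redirect` and the `redirect_type` enum are hypothetical names introduced here; like the shell above, it handles only one redirection per command and treats the file name as a single word.

```c
#include <ctype.h>
#include <stddef.h>

enum redirect_type { REDIR_NONE, REDIR_OUT, REDIR_APPEND, REDIR_IN };

/* Splits cmd in place at the first '>', '>>' or '<'. The command part
   stays in cmd; *file points at the target name, or is NULL if there is
   no redirection or no name follows the operator. Hypothetical helper. */
enum redirect_type parse_redirect(char *cmd, char **file) {
    *file = NULL;
    for (char *p = cmd; *p != '\0'; p++) {
        enum redirect_type type;
        char *target;
        if (*p == '>') {
            if (p[1] == '>') { type = REDIR_APPEND; target = p + 2; }
            else             { type = REDIR_OUT;    target = p + 1; }
        } else if (*p == '<') {
            type = REDIR_IN;
            target = p + 1;
        } else {
            continue;
        }
        *p = '\0';                                   /* end the command part */
        while (isspace((unsigned char)*target)) target++;
        char *end = target;                          /* cut at the next space */
        while (*end != '\0' && !isspace((unsigned char)*end)) end++;
        *end = '\0';
        if (*target != '\0') *file = target;
        return type;
    }
    return REDIR_NONE;
}
```

For the input `echo hello > out.txt` the function returns `REDIR_OUT`, leaves `echo hello ` in `cmd`, and points `*file` at `out.txt`.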

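III. The File System and Buffering

3.1 Inside the `FILE` Structure

File Descriptor

FILE is the C library's abstraction of an open file. In glibc it is a typedef for struct _IO_FILE (declared in /usr/include/libio.h on older glibc versions; newer versions keep it in bits/types/struct_FILE.h). Key members include:

_fileno: the file descriptor the stream wraps.

Example:

FILE *fp = fopen("log.txt", "r");
printf("fd: %d\n", fileno(fp)); // fileno returns the descriptor stored in _fileno

Buffer

FILE also holds the pointers that describe its buffer:

_IO_write_base: start of the write buffer.
_IO_write_ptr: current write position.
_IO_write_end: end of the write buffer.

Buffering modes:

Unbuffered: data is written straight through (stderr, for example).
Line buffered: flushed when a \n is written (stdout when it is attached to a terminal, for example).
Fully buffered: flushed when the buffer is full or on an explicit flush (regular files, for example).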
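Full buffering on a regular file can be observed by checking the file's size before and after `fflush`. This is a sketch under the assumption that the C library fully buffers streams opened on regular files and that its buffer is larger than the 9 bytes written, which holds for glibc; `size_on_disk` and `observe_full_buffering` are hypothetical names.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns the size the kernel reports for path, or -1 if stat fails. */
static long size_on_disk(const char *path) {
    struct stat st;
    return stat(path, &st) == 0 ? (long)st.st_size : -1;
}

/* Writes one short line through stdio and records the file size before
   and after fflush. Returns 0 on success, -1 if the file cannot be opened. */
int observe_full_buffering(const char *path, long *before, long *after) {
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;
    fputs("buffered\n", fp);         /* 9 bytes, still in the user-space buffer */
    *before = size_on_disk(path);    /* the kernel has not seen the data yet */
    fflush(fp);                      /* hand the buffer to write() */
    *after = size_on_disk(path);     /* now the 9 bytes are in the file */
    fclose(fp);
    return 0;
}
```

On a glibc system `*before` comes back as 0 and `*after` as 9, which shows that the text only reaches the kernel when the buffer is flushed.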

3.2 Analyzing Buffer Behavior

Example

#include <stdio.h>
#include <unistd.h>
int main() {
    printf("printf\n"); // line buffered when stdout is a terminal
    fputs("fputs\n", stdout);
    write(1, "write\n", 6); // system call: no user-space buffer
    fork();
    return 0;
}