Rust Workbook 81: Grep and File Handling

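On Unix/Linux systems, grep is a powerful text search tool: it searches files with a regular expression and prints the lines that match. The Exercism "grep" exercise asks us to implement a simplified version of the command that supports several search flags. It is good practice for file handling and pattern matching, and it also exercises error handling, command-line argument parsing, and systems programming in Rust.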

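What is the Grep problem?

Grep (Global Regular Expression Print) is a classic text search tool on Unix/Linux systems. It can:

Search files for a given pattern
Support several options, such as case-insensitive matching, line numbers, and inverted matching
Print the matching lines or only the file names

Let's first look at the structure and function signatures the exercise provides: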

use anyhow::Error;

/// While using `&[&str]` to handle flags is convenient for exercise purposes,
/// and resembles the output of [`std::env::args`], in real-world projects it is
/// both more convenient and more idiomatic to contain runtime configuration in
/// a dedicated struct. Therefore, we suggest that you do so in this exercise.
///
/// In the real world, it's common to use crates such as [`clap`] or
/// [`structopt`] to handle argument parsing, and of course doing so is
/// permitted in this exercise as well, though it may be somewhat overkill.
///
/// [`clap`]: https://crates.io/crates/clap
/// [`std::env::args`]: https://doc.rust-lang.org/std/env/fn.args.html
/// [`structopt`]: https://crates.io/crates/structopt
#[derive(Debug)]
pub struct Flags;

impl Flags {
    pub fn new(flags: &[&str]) -> Self {
        unimplemented!(
            "Given the flags {:?} implement your own 'Flags' struct to handle flags-related logic",
            flags
        );
    }
}

pub fn grep(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    unimplemented!(
        "Search the files '{:?}' for '{}' pattern and save the matches in a vector. Your search logic should be aware of the given flags '{:?}'",
        files,
        pattern,
        flags
    );
}

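We need to implement a working grep that supports several search flags and can search multiple files.

Design analysis

1. Core components

Flags struct: stores the command-line options
grep function: runs the search
File handling: reads and searches multiple files
Pattern matching: regular expressions or plain substring matching

2. Supported flags

According to the test cases, we need to support the following flags:

-n: print the line number of each matching line
-l: print only the names of files that contain at least one match
-i: match case-insensitively
-x: match only when the pattern matches the entire line
-v: invert the match (select the lines that do not match)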
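How the three matching flags combine on a single line can be sketched with plain substring matching. This is a minimal illustration, not the exercise's reference solution: the function name `line_selected` and its boolean parameters are hypothetical, and it assumes -i, -x, and -v act as independent switches.

```rust
/// Decide whether one line is selected, given the three matching flags.
/// `line_selected` is a hypothetical helper used only for illustration.
fn line_selected(
    line: &str,
    pattern: &str,
    case_insensitive: bool,  // -i
    match_entire_line: bool, // -x
    invert_match: bool,      // -v
) -> bool {
    // -i: compare lowercased copies of both strings.
    let (line, pattern) = if case_insensitive {
        (line.to_lowercase(), pattern.to_lowercase())
    } else {
        (line.to_string(), pattern.to_string())
    };

    // -x: the whole line must equal the pattern; otherwise a substring is enough.
    let is_match = if match_entire_line {
        line == pattern
    } else {
        line.contains(pattern.as_str())
    };

    // -v: flip the result.
    is_match != invert_match
}
```

The -n and -l flags do not appear here because they only affect what is printed, not which lines are selected.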

Complete implementation

1. Implementing the Flags struct

use anyhow::Error;
use std::fs;
use regex::Regex;

#[derive(Debug, Clone)]
pub struct Flags {
    print_line_numbers: bool,
    print_file_names_only: bool,
    case_insensitive: bool,
    match_entire_line: bool,
    invert_match: bool,
}

impl Flags {
    pub fn new(flags: &[&str]) -> Self {
        let mut result = Flags {
            print_line_numbers: false,
            print_file_names_only: false,
            case_insensitive: false,
            match_entire_line: false,
            invert_match: false,
        };
        
        for flag in flags {
            match *flag {
                "-n" => result.print_line_numbers = true,
                "-l" => result.print_file_names_only = true,
                "-i" => result.case_insensitive = true,
                "-x" => result.match_entire_line = true,
                "-v" => result.invert_match = true,
                _ => {} // ignore unknown flags
            }
        }
        
        result
    }
    
    pub fn print_line_numbers(&self) -> bool {
        self.print_line_numbers
    }
    
    pub fn print_file_names_only(&self) -> bool {
        self.print_file_names_only
    }
    
    pub fn case_insensitive(&self) -> bool {
        self.case_insensitive
    }
    
    pub fn match_entire_line(&self) -> bool {
        self.match_entire_line
    }
    
    pub fn invert_match(&self) -> bool {
        self.invert_match
    }
}
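
Because every field starts out as false, the constructor could also lean on `#[derive(Default)]`. The sketch below shows that variant under a different, hypothetical name (`FlagSet`) so that it does not clash with the Flags struct above; it is one possible shape, not the one the exercise prescribes.

```rust
/// Hypothetical variant of the flags struct; every `bool` defaults to `false`.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct FlagSet {
    pub print_line_numbers: bool,
    pub print_file_names_only: bool,
    pub case_insensitive: bool,
    pub match_entire_line: bool,
    pub invert_match: bool,
}

impl FlagSet {
    pub fn new(flags: &[&str]) -> Self {
        // Start from the all-false default and switch on what was passed.
        let mut result = FlagSet::default();
        for &flag in flags {
            match flag {
                "-n" => result.print_line_numbers = true,
                "-l" => result.print_file_names_only = true,
                "-i" => result.case_insensitive = true,
                "-x" => result.match_entire_line = true,
                "-v" => result.invert_match = true,
                _ => {} // unknown flags are ignored, as in the version above
            }
        }
        result
    }
}
```

With public fields the getter methods are no longer needed; whether that trade-off is acceptable depends on how much of the struct should be visible outside the module.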

2. Implementing the grep function

pub fn grep(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    let mut results = Vec::new();

    // Build the regular expression from the flags. The non-capturing group
    // keeps alternations such as `a|b` intact when anchors are added.
    let mut regex_pattern = format!("(?:{})", pattern);
    if flags.match_entire_line() {
        // -x: the pattern has to match the whole line, so anchor both ends.
        regex_pattern = format!("^{}$", regex_pattern);
    }
    if flags.case_insensitive() {
        // -i: switch on case-insensitive matching.
        regex_pattern = format!("(?i){}", regex_pattern);
    }

    let regex = Regex::new(&regex_pattern)?;

    // Process each file in the order given.
    for &file_name in files {
        let content = fs::read_to_string(file_name)?;

        for (line_index, line) in content.lines().enumerate() {
            let line_number = line_index + 1;

            // -v inverts the match: select the lines that do NOT match.
            let should_include = regex.is_match(line) != flags.invert_match();
            if !should_include {
                continue;
            }

            // -l: one selected line is enough, so record the file name and
            // stop scanning this file.
            if flags.print_file_names_only() {
                results.push(file_name.to_string());
                break;
            }

            // Build the output line; the file name prefix is only added
            // when more than one file is searched.
            let output_line = match (files.len() > 1, flags.print_line_numbers()) {
                (true, true) => format!("{}:{}:{}", file_name, line_number, line),
                (true, false) => format!("{}:{}", file_name, line),
                (false, true) => format!("{}:{}", line_number, line),
                (false, false) => line.to_string(),
            };

            results.push(output_line);
        }
    }

    Ok(results)
}

3. Complete implementation (with dependencies)

The complete solution is the Flags struct from step 1 followed by the grep function from step 2, placed in one file. It needs two dependencies:

// Add the dependencies to Cargo.toml
// [dependencies]
// anyhow = "1.0"
// regex = "1.0"

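Test case analysis

Reading the test cases gives a clearer picture of the requirements (the snippets below are abridged and leave out the file setup):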

#[test]
fn test_nonexistent_file_returns_error() {
    let pattern = "Agamemnon";
    let flags = Flags::new(&[]);
    let files = vec!["test_nonexistent_file_returns_error_iliad.txt"];
    assert!(grep(&pattern, &flags, &files).is_err());
}

A file that does not exist should produce an error.
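
The error in this test comes straight from the standard library: `std::fs::read_to_string` returns an `Err` for a path that does not exist, and the `?` operator passes it up to the caller. A small sketch using only std; the helper name `read_file` is made up for illustration.

```rust
use std::fs;
use std::io;

/// Read a whole file into a String, passing any I/O error to the caller.
fn read_file(path: &str) -> io::Result<String> {
    let content = fs::read_to_string(path)?;
    Ok(content)
}
```

In the grep function the same `?` converts the `io::Error` into an `anyhow::Error`, which is why no explicit check for a missing file is needed.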

#[test]
fn test_one_file_one_match_no_flags() {
    let pattern = "Agamemnon";
    let flags = Flags::new(&[]);
    let files = vec!["iliad.txt"];
    let expected = vec!["Of Atreus, Agamemnon, King of men."];
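    // The matching line should be returned
    assert_eq!(grep(pattern, &flags, &files).unwrap(), expected);
}

With one file, one match, and no flags, the matching line is returned unchanged.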

#[test]
fn test_one_file_one_match_print_line_numbers_flag() {
    let pattern = "Forbidden";
    let flags = Flags::new(&["-n"]);
    let files = vec!["paradise_lost.txt"];
    let expected = vec!["2:Of that Forbidden Tree, whose mortal tast"];
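    // The matching line should be returned with its line number
    assert_eq!(grep(pattern, &flags, &files).unwrap(), expected);
}

With the -n flag, each matching line is prefixed with its line number.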
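
Line numbers are 1-based while `enumerate()` counts from 0, hence the `+ 1`. A std-only sketch on an in-memory string with substring matching; `numbered_matches` is a hypothetical helper.

```rust
/// Return every line containing `pattern`, prefixed with its 1-based line number.
fn numbered_matches(content: &str, pattern: &str) -> Vec<String> {
    content
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains(pattern))
        // `index` starts at 0, grep's line numbers start at 1.
        .map(|(index, line)| format!("{}:{}", index + 1, line))
        .collect()
}
```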

#[test]
fn test_one_file_one_match_caseinsensitive_flag() {
    let pattern = "FORBIDDEN";
    let flags = Flags::new(&["-i"]);
    let files = vec!["paradise_lost.txt"];
    let expected = vec!["Of that Forbidden Tree, whose mortal tast"];
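    // Matching should ignore case
    assert_eq!(grep(pattern, &flags, &files).unwrap(), expected);
}

With the -i flag, matching is case-insensitive.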

#[test]
fn test_one_file_one_match_print_file_names_flag() {
    let pattern = "Forbidden";
    let flags = Flags::new(&["-l"]);
    let files = vec!["paradise_lost.txt"];
    let expected = vec!["paradise_lost.txt"];
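    // Only the file name should be returned
    assert_eq!(grep(pattern, &flags, &files).unwrap(), expected);
}

With the -l flag, only the names of files that contain a match are returned.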
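
For -l only one answer per file is needed, so the scan can stop at the first matching line, which is what `Iterator::any` does. The sketch below works on in-memory (name, content) pairs to stay self-contained; `files_with_matches` and the sample data in the usage are hypothetical.

```rust
/// Return the names of the files whose content has at least one matching line.
fn files_with_matches<'a>(files: &[(&'a str, &str)], pattern: &str) -> Vec<&'a str> {
    files
        .iter()
        // `any` short-circuits: it stops reading lines after the first match.
        .filter(|(_, content)| content.lines().any(|line| line.contains(pattern)))
        .map(|(name, _)| *name)
        .collect()
}
```

The names come out in the order the files were given, and each file appears at most once, so no sorting or deduplication is needed afterwards.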

#[test]
fn test_one_file_several_matches_inverted_flag() {
    let pattern = "Of";
    let flags = Flags::new(&["-v"]);
    let files = vec!["paradise_lost.txt"];
    // Lines that do not contain "Of" should be returned
}

With the -v flag, the lines that do not match are returned.

#[test]
fn test_multiple_files_one_match_no_flags() {
    let pattern = "Agamemnon";
    let flags = Flags::new(&[]);
    let files = vec!["iliad.txt", "midsummer_night.txt", "paradise_lost.txt"];
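    // With several files, each result carries a file name prefix
}

When several files are searched, each result is prefixed with the file name.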
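
The four output shapes (with or without file name, with or without line number) can be produced by one small function that takes both prefixes as options. This is a sketch; `format_match` is a hypothetical helper, and the caller is assumed to pass `Some(file_name)` only when more than one file is searched and `Some(line_number)` only when -n is set.

```rust
/// Build one output line as `[file:][line_number:]text`.
fn format_match(file_name: Option<&str>, line_number: Option<usize>, line: &str) -> String {
    let mut output = String::new();
    if let Some(name) = file_name {
        output.push_str(name);
        output.push(':');
    }
    if let Some(number) = line_number {
        output.push_str(&number.to_string());
        output.push(':');
    }
    output.push_str(line);
    output
}
```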

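Performance-oriented version

A version written with performance in mind: it iterates over the lines lazily instead of collecting them into a Vec first, and with -l it stops scanning a file at the first match. Unlike the version above, it passes the pattern through regex::escape, so the pattern is matched as literal text rather than as a regular expression: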

use anyhow::Error;
use std::fs;
use regex::Regex;

#[derive(Debug, Clone)]
pub struct Flags {
    print_line_numbers: bool,
    print_file_names_only: bool,
    case_insensitive: bool,
    match_entire_line: bool,
    invert_match: bool,
}

impl Flags {
    pub fn new(flags: &[&str]) -> Self {
        let mut result = Flags {
            print_line_numbers: false,
            print_file_names_only: false,
            case_insensitive: false,
            match_entire_line: false,
            invert_match: false,
        };
        
        for flag in flags {
            match *flag {
                "-n" => result.print_line_numbers = true,
                "-l" => result.print_file_names_only = true,
                "-i" => result.case_insensitive = true,
                "-x" => result.match_entire_line = true,
                "-v" => result.invert_match = true,
                _ => {} // ignore unknown flags
            }
        }
        
        result
    }
}

pub fn grep(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    let mut results = Vec::new();

    // Compile the regex once, before any file is read. `regex::escape` makes
    // the pattern match as literal text.
    let mut regex_pattern = regex::escape(pattern);
    if flags.match_entire_line {
        // -x: anchor the pattern so it has to cover the whole line.
        regex_pattern = format!("^{}$", regex_pattern);
    }
    if flags.case_insensitive {
        regex_pattern = format!("(?i){}", regex_pattern);
    }
    let regex = Regex::new(&regex_pattern)?;

    let is_multi_file = files.len() > 1;

    // Reserve some capacity up front (a rough guess of 10 results per file).
    results.reserve(files.len() * 10);

    for &file_name in files {
        let content = fs::read_to_string(file_name)?;

        if flags.print_file_names_only {
            // -l: `any` stops at the first selected line.
            let has_match = content
                .lines()
                .any(|line| regex.is_match(line) != flags.invert_match);
            if has_match {
                results.push(file_name.to_string());
            }
            continue;
        }

        // Walk the lines lazily instead of collecting them first.
        for (line_index, line) in content.lines().enumerate() {
            // -v inverts the match; skip lines that are not selected.
            if regex.is_match(line) == flags.invert_match {
                continue;
            }

            let line_number = line_index + 1;
            let output = match (is_multi_file, flags.print_line_numbers) {
                (true, true) => format!("{}:{}:{}", file_name, line_number, line),
                (true, false) => format!("{}:{}", file_name, line),
                (false, true) => format!("{}:{}", line_number, line),
                (false, false) => line.to_string(),
            };

            results.push(output);
        }
    }

    // With -l the names are already in argument order and each file is
    // pushed at most once, so no sorting or deduplication is needed.
    Ok(results)
}

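Error handling and edge cases

An implementation that handles more edge cases, such as an empty pattern, and attaches context to file and pattern errors: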

use anyhow::Error;
use std::fs;
use regex::Regex;

#[derive(Debug, Clone)]
pub struct Flags {
    print_line_numbers: bool,
    print_file_names_only: bool,
    case_insensitive: bool,
    match_entire_line: bool,
    invert_match: bool,
}

impl Flags {
    pub fn new(flags: &[&str]) -> Self {
        let mut result = Flags {
            print_line_numbers: false,
            print_file_names_only: false,
            case_insensitive: false,
            match_entire_line: false,
            invert_match: false,
        };
        
        for flag in flags {
            match *flag {
                "-n" => result.print_line_numbers = true,
                "-l" => result.print_file_names_only = true,
                "-i" => result.case_insensitive = true,
                "-x" => result.match_entire_line = true,
                "-v" => result.invert_match = true,
                _ => {} // ignore unknown flags
            }
        }
        
        result
    }
}

#[derive(Debug)]
pub enum GrepError {
    FileNotFound(String),
    InvalidPattern(String),
    EmptyPattern,
}

impl std::fmt::Display for GrepError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            GrepError::FileNotFound(detail) => write!(f, "Cannot read file {}", detail),
            GrepError::InvalidPattern(detail) => write!(f, "Invalid pattern {}", detail),
            GrepError::EmptyPattern => write!(f, "Pattern cannot be empty"),
        }
    }
}

// Implementing `std::error::Error` lets `?` turn a GrepError into `anyhow::Error`.
impl std::error::Error for GrepError {}

pub fn grep(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    // Validate the input. Rejecting an empty pattern is a choice made here;
    // GNU grep treats an empty pattern as matching every line.
    if pattern.is_empty() {
        return Err(GrepError::EmptyPattern.into());
    }

    let mut results = Vec::new();

    // Compile the regex; the pattern is escaped and matched as literal text.
    let mut regex_pattern = regex::escape(pattern);
    if flags.match_entire_line {
        // -x: anchor the pattern so it has to cover the whole line.
        regex_pattern = format!("^{}$", regex_pattern);
    }
    if flags.case_insensitive {
        regex_pattern = format!("(?i){}", regex_pattern);
    }

    let regex = Regex::new(&regex_pattern)
        .map_err(|e| GrepError::InvalidPattern(format!("'{}': {}", pattern, e)))?;

    let is_multi_file = files.len() > 1;

    for &file_name in files {
        // Read the file, adding the file name to any I/O error.
        let content = fs::read_to_string(file_name)
            .map_err(|e| GrepError::FileNotFound(format!("'{}': {}", file_name, e)))?;

        if flags.print_file_names_only {
            // -l: stop at the first selected line.
            let has_match = content
                .lines()
                .any(|line| regex.is_match(line) != flags.invert_match);
            if has_match {
                results.push(file_name.to_string());
            }
            continue;
        }

        // Process every line.
        for (line_index, line) in content.lines().enumerate() {
            // -v inverts the match; skip lines that are not selected.
            if regex.is_match(line) == flags.invert_match {
                continue;
            }

            let line_number = line_index + 1;
            let output = match (is_multi_file, flags.print_line_numbers) {
                (true, true) => format!("{}:{}:{}", file_name, line_number, line),
                (true, false) => format!("{}:{}", file_name, line),
                (false, true) => format!("{}:{}", line_number, line),
                (false, false) => line.to_string(),
            };

            results.push(output);
        }
    }

    // With -l the names are already in argument order and each file is
    // pushed at most once, so no sorting or deduplication is needed.
    Ok(results)
}

Extensions

Building on the basic implementation, we can add more features:

use anyhow::Error;
use std::fs;
use regex::Regex;

#[derive(Debug, Clone)]
pub struct Flags {
    print_line_numbers: bool,
    print_file_names_only: bool,
    case_insensitive: bool,
    match_entire_line: bool,
    invert_match: bool,
    quiet: bool,                // -q: quiet mode
    max_matches: Option<usize>, // -m NUM: limit on the number of matches
}

impl Flags {
    pub fn new(flags: &[&str]) -> Self {
        let mut result = Flags {
            print_line_numbers: false,
            print_file_names_only: false,
            case_insensitive: false,
            match_entire_line: false,
            invert_match: false,
            quiet: false,
            max_matches: None,
        };
        
        let mut i = 0;
        while i < flags.len() {
            match flags[i] {
                "-n" => result.print_line_numbers = true,
                "-l" => result.print_file_names_only = true,
                "-i" => result.case_insensitive = true,
                "-x" => result.match_entire_line = true,
                "-v" => result.invert_match = true,
                "-q" => result.quiet = true,
                "-m" => {
                    if i + 1 < flags.len() {
                        if let Ok(num) = flags[i + 1].parse::<usize>() {
                            result.max_matches = Some(num);
                            i += 1; // skip the next argument
                        }
                    }
                }
                _ => {} // ignore unknown flags
            }
            i += 1;
        }
        
        result
    }
}

pub struct GrepEngine;

impl GrepEngine {
    pub fn new() -> Self {
        GrepEngine
    }
    
    pub fn grep(&self, pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
        if pattern.is_empty() {
            return Err(anyhow::anyhow!("Pattern cannot be empty"));
        }

        let mut results = Vec::new();

        // Compile the regex; the pattern is escaped and matched as literal text.
        let mut regex_pattern = regex::escape(pattern);
        if flags.match_entire_line {
            // -x: anchor the pattern so it has to cover the whole line.
            regex_pattern = format!("^{}$", regex_pattern);
        }
        if flags.case_insensitive {
            regex_pattern = format!("(?i){}", regex_pattern);
        }

        let regex = Regex::new(&regex_pattern)
            .map_err(|e| anyhow::anyhow!("Invalid pattern '{}': {}", pattern, e))?;

        let is_multi_file = files.len() > 1;

        for &file_name in files {
            let content = fs::read_to_string(file_name)
                .map_err(|e| anyhow::anyhow!("Cannot read file '{}': {}", file_name, e))?;

            if flags.print_file_names_only {
                // -l: stop at the first selected line.
                let has_match = content
                    .lines()
                    .any(|line| regex.is_match(line) != flags.invert_match);
                if has_match {
                    results.push(file_name.to_string());
                }
                continue;
            }

            // As in GNU grep, the -m limit applies to each file separately,
            // so the counter starts again for every file.
            let mut match_count = 0;

            for (line_index, line) in content.lines().enumerate() {
                // Stop reading this file once the limit has been reached.
                if flags.max_matches.map_or(false, |max| match_count >= max) {
                    break;
                }

                // -v inverts the match; skip lines that are not selected.
                if regex.is_match(line) == flags.invert_match {
                    continue;
                }
                match_count += 1;

                // In quiet mode no output lines are collected.
                if flags.quiet {
                    continue;
                }

                let line_number = line_index + 1;
                let output = match (is_multi_file, flags.print_line_numbers) {
                    (true, true) => format!("{}:{}:{}", file_name, line_number, line),
                    (true, false) => format!("{}:{}", file_name, line),
                    (false, true) => format!("{}:{}", line_number, line),
                    (false, false) => line.to_string(),
                };

                results.push(output);
            }
        }

        Ok(results)
    }
    
    // Run the search and collect statistics about it
    pub fn grep_with_stats(&self, pattern: &str, flags: &Flags, files: &[&str]) 
        -> Result<(Vec<String>, GrepStats), Error> {
        let results = self.grep(pattern, flags, files)?;
        
        // Further statistics could be computed here
        let stats = GrepStats {
            files_searched: files.len(),
            matches_found: results.len(),
        };
        
        Ok((results, stats))
    }
}

pub struct GrepStats {
    pub files_searched: usize,
    pub matches_found: usize,
}

// Convenience function
pub fn grep(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    GrepEngine::new().grep(pattern, flags, files)
}
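
GNU grep's -m NUM stops reading a file after NUM matching lines, so the limit applies to each file on its own. With iterators that is a `take` placed after the filter. A std-only sketch with substring matching; `first_matches` is a hypothetical helper, not part of the exercise.

```rust
/// Return at most `max` lines of `content` that contain `pattern`.
/// `None` means no limit.
fn first_matches<'a>(content: &'a str, pattern: &str, max: Option<usize>) -> Vec<&'a str> {
    let matching = content.lines().filter(|line| line.contains(pattern));
    match max {
        // `take` stops pulling lines from the iterator after `limit` matches.
        Some(limit) => matching.take(limit).collect(),
        None => matching.collect(),
    }
}
```

Because the iterator is lazy, lines after the last needed match are never inspected.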

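Real-world applications

Grep is used in day-to-day development for:

Log analysis: searching large sets of log files for specific patterns
Code search: finding specific functions or variables in a codebase
Data processing: extracting specific records from large data files
System administration: finding specific settings in configuration files
Security auditing: searching system files for suspicious patterns
Text processing: batch processing and filtering of text files
DevOps tooling: building automation tools and scripts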

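Complexity analysis

Time complexity:
- File reading: O(n), where n is the total size of the files
- Regex matching: O(m×k), where m is the total number of lines and k is the average line length (for a fixed pattern, the regex crate matches in time linear in the length of the searched text)
- Overall: O(n + m×k), which is O(n) because m×k is roughly n
Space complexity:
- File contents: O(n) in the worst case, since read_to_string holds a whole file in memory
- Results: O(p×q), where p is the number of matching lines and q is the average line length
- Overall: O(n + p×q)

Comparison with other implementations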

// Standard-library implementation (no regular expressions)
use anyhow::Error;
use std::fs;

pub fn grep_simple(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    let mut results = Vec::new();

    // Lowercase the pattern once instead of once per line.
    let pattern_lower = pattern.to_lowercase();

    for &file_name in files {
        let content = fs::read_to_string(file_name)?;

        for (line_index, line) in content.lines().enumerate() {
            let line_number = line_index + 1;

            // Plain substring matching instead of a regular expression
            let is_match = if flags.case_insensitive {
                line.to_lowercase().contains(&pattern_lower)
            } else {
                line.contains(pattern)
            };
            
            let should_include = if flags.invert_match {
                !is_match
            } else {
                is_match
            };
            
            if should_include {
                let output = if flags.print_line_numbers {
                    format!("{}:{}", line_number, line)
                } else {
                    line.to_string()
                };
                
                results.push(output);
            }
        }
    }
    
    Ok(results)
}

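// High-performance implementation using a memory-mapped file
// (uses the maintained memmap2 crate; the original memmap crate is no longer maintained)
use anyhow::Error;
use memmap2::Mmap;
use std::fs::File;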

pub fn grep_mmap(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    let mut results = Vec::new();
    
    for &file_name in files {
        let file = File::open(file_name)?;
        // SAFETY: the mapping is only sound while no other process modifies
        // or truncates the file.
        let mmap = unsafe { Mmap::map(&file)? };
        let content = std::str::from_utf8(&mmap)?;
        
        for (line_index, line) in content.lines().enumerate() {
            let line_number = line_index + 1;
            
            let is_match = line.contains(pattern);
            let should_include = if flags.invert_match {
                !is_match
            } else {
                is_match
            };
            
            if should_include {
                let output = if flags.print_line_numbers {
                    format!("{}:{}", line_number, line)
                } else {
                    line.to_string()
                };
                
                results.push(output);
            }
        }
    }
    
    Ok(results)
}

// Streaming implementation for large files
use anyhow::Error;
use std::fs::File;
use std::io::{BufRead, BufReader};

pub fn grep_streaming(pattern: &str, flags: &Flags, files: &[&str]) -> Result<Vec<String>, Error> {
    let mut results = Vec::new();
    
    for &file_name in files {
        let file = File::open(file_name)?;
        let reader = BufReader::new(file);
        
        for (line_index, line_result) in reader.lines().enumerate() {
            let line = line_result?;
            let line_number = line_index + 1;
            
            let is_match = line.contains(pattern);
            let should_include = if flags.invert_match {
                !is_match
            } else {
                is_match
            };
            
            if should_include {
                let output = if flags.print_line_numbers {
                    format!("{}:{}", line_number, line)
                } else {
                    line
                };
                
                results.push(output);
            }
        }
    }
    
    Ok(results)
}
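
grep_streaming reads the input line by line but still gathers every result in a Vec. If the results only need to be printed, they can be written out as they are found, so memory use is bounded by the longest line rather than by the number of matches. A sketch under that assumption; `grep_to_writer` is a hypothetical helper that accepts any `BufRead` source and any `Write` sink, and it uses plain substring matching.

```rust
use std::io::{self, BufRead, Write};

/// Copy every line of `reader` that contains `pattern` to `out`.
/// Returns the number of matching lines.
fn grep_to_writer<R: BufRead, W: Write>(reader: R, pattern: &str, mut out: W) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        // Each line is read, checked, written, and then dropped.
        let line = line?;
        if line.contains(pattern) {
            writeln!(out, "{}", line)?;
            count += 1;
        }
    }
    Ok(count)
}
```

In a command-line program the reader would typically be a `BufReader` around a `File` and the writer a locked standard output handle.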