42. LINQ and Collection Operations: A Complete Guide

Originally published 2025-12-04 13:00:29

License: CC 4.0 BY-SA

Tags: #LINQ #QueryExpressions #CollectionOperations

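1. Query Expression Basics

Query expressions are a powerful tool that lets developers work with collections using SQL-like syntax. Within a query expression, the select clause can define an anonymous type. For example, it can rename IGrouping<TKey, TElement>.Key to IsContextualKeyword and name the subcollection property Items. A nested foreach loop can then use wordGroup.Items instead of using wordGroup directly.

// Sample code omitted; see the corresponding listing in the original book

You could also add the number of items in the subcollection to the anonymous type, but since that value is already available through wordGroup.Items.Count(), adding it directly offers little benefit.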
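
As a rough illustration of the projection described above, the following sketch groups a small keyword list and walks the result with a nested summary per group. The `Keywords` array and the `GroupingSketch` class are assumptions made up for this example (a trailing `*` marks a contextual keyword); they are not the book's listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupingSketch
{
    // Hypothetical sample data: a trailing '*' marks a contextual keyword.
    public static readonly string[] Keywords =
        { "class", "int", "var*", "yield*", "where*", "return" };

    // Builds one summary line per group using the anonymous type's properties.
    public static List<string> Describe()
    {
        var selection =
            from word in Keywords
            group word by word.Contains('*') into groups
            select new
            {
                IsContextualKeyword = groups.Key,   // renamed Key
                Items = groups                      // named subcollection
            };

        var lines = new List<string>();
        foreach (var wordGroup in selection)
        {
            lines.Add(string.Format("{0} ({1} items): {2}",
                wordGroup.IsContextualKeyword ? "Contextual" : "Regular",
                wordGroup.Items.Count(),
                string.Join(", ", wordGroup.Items)));
        }
        return lines;
    }

    static void Main()
    {
        foreach (string line in Describe())
        {
            Console.WriteLine(line);
        }
    }
}
```

Groups come back in the order their keys first appear in the source, so with this sample data the regular keywords are listed before the contextual ones.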

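2. Using `into` for Query Continuation

The into keyword supports query continuation: after a group by clause, you can use it to extend the query. It lets you name a range variable for each item returned by the group by clause, and that variable then acts as the generator for additional query clauses such as select.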

using System; 
using System.Collections.Generic; 
using System.Linq;
// ...
private static void GroupKeywords1()
{
    var selection =
        from word in Keywords
        // ...
        group word by word.Contains('*') 
        into groups
        select new 
        { 
            IsContextualKeyword = groups.Key, 
            Items = groups 
        };
}    
// ...

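into is not specific to the group by clause; any query expression can use it. It is a shorthand that avoids writing several separate query expressions. Much like a pipeline operator, it feeds the results of the first query into the second.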
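
To show a continuation that does not involve grouping, here is a minimal sketch in which a select clause is followed by into and then by further filtering and ordering. The `Keywords` data and the `ContinuationSketch` class name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ContinuationSketch
{
    // Hypothetical sample data; '*' marks a contextual keyword.
    static readonly string[] Keywords = { "class", "var*", "yield*", "where*", "int" };

    public static List<string> LongContextualKeywords()
    {
        IEnumerable<string> query =
            from word in Keywords
            where word.Contains('*')
            select word.TrimEnd('*')
            into name                 // 'name' ranges over the first query's results
            where name.Length > 3
            orderby name
            select name;
        return query.ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", LongContextualKeywords()));
    }
}
```

Without into, the same logic would need two query expressions, with the second one reading from a variable that holds the first.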

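3. Getting Distinct Members

Sometimes you need a collection to return only distinct items, that is, with duplicates removed. Query expressions have no dedicated syntax for this, but you can use the Distinct() query operator.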

using System; 
using System.Collections.Generic; 
using System.Linq;
// ...
public static void ListMemberNames()
{
    IEnumerable<string> enumerableMethodNames = (
        from method in typeof(Enumerable).GetMembers(
            System.Reflection.BindingFlags.Static | 
            System.Reflection.BindingFlags.Public)
        select method.Name).Distinct();
    foreach(string method in enumerableMethodNames)
    {
        Console.Write(" {0},", method);
    }
}
// ...

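In this example, typeof(Enumerable).GetMembers() returns a list of all members of System.Linq.Enumerable. Many of those members are overloaded, so calling Distinct() removes the duplicate names from the list.

4. How Query Expressions Are Compiled

Under the hood, a query expression is a series of method calls against an underlying API. CIL itself has no notion of query expressions; support for them comes mainly from changes to the C# compiler. The compiler translates a query expression into method calls; for example, a query expression may become a call to the Where() extension method of System.Linq.Enumerable.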

graph LR
    A[Query expression] --> B[Compiler]
    B --> C[Method calls]

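5. Implementing Implicit Execution

Storing the selection criteria in a variable, rather than executing the query at the moment of assignment, is made possible by delegates. The compiler translates the query expression into method calls that take delegates as arguments. A delegate captures the code to run, so it can be stored and executed later when it is needed.

For collections that implement IQueryable<T>, lambda expressions are converted into expression trees instead. An expression tree is a hierarchical data structure that decomposes recursively into subexpressions, and it is commonly used to translate a query into another language such as SQL.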

graph LR
    A[Query expression] --> B[Compiler]
    B --> C[Delegate or expression tree]
    C --> D[Execute query]
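
The sketch below tries to make both ideas observable: a counter shows that the stored delegate does not run until the query is enumerated, and an `Expression<Func<int, bool>>` shows the same lambda held as data. The `DeferredSketch` class, `BuildQuery`, and `PredicateCalls` are names invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

static class DeferredSketch
{
    public static int PredicateCalls;

    static bool IsEven(int n)
    {
        PredicateCalls++;            // count how often the stored delegate runs
        return n % 2 == 0;
    }

    // Building the query only stores the delegate; nothing is filtered yet.
    public static IEnumerable<int> BuildQuery(List<int> source)
    {
        return source.Where(n => IsEven(n));
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        IEnumerable<int> evens = BuildQuery(numbers);
        Console.WriteLine(PredicateCalls);            // 0

        numbers.Add(6);                               // change the source before enumerating
        Console.WriteLine(string.Join(",", evens));   // 2,4,6
        Console.WriteLine(PredicateCalls);            // 5

        // Typed as Expression<...>, the lambda becomes a tree that can be inspected.
        Expression<Func<int, bool>> tree = n => n % 2 == 0;
        Console.WriteLine(tree.Body.NodeType);        // Equal
        Func<int, bool> compiled = tree.Compile();    // turn the tree back into a delegate
        Console.WriteLine(compiled(10));              // True
    }
}
```

Because the element added after the query was defined still appears in the output, the example also shows that a deferred query reads its source at enumeration time, not at definition time.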

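6. Query Expressions as Method Calls

Although query expressions are powerful and relatively simple, neither the CLR nor IL requires any support for them. The C# compiler translates query expressions into method calls.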

// A simple query expression
private static void ShowContextualKeywords1()
{
    // ...
}
// ...

// The same query written with standard query operator syntax
private static void ShowContextualKeywords3()
{
    // Query expression form
    IEnumerable<string> selection = from word in Keywords
                                    where word.Contains('*')
                                    select word;
    // What the compiler translates it to (renamed so both declarations compile)
    IEnumerable<string> translated = 
        Keywords.Where(word => word.Contains('*'));      
}
// ...

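The combination of extension methods and lambda expressions is more powerful than query expressions: not every method call can be written as a query expression, but every query expression can be translated into method calls. As a rule, use query expressions where they fit and fall back on method calls when necessary. For complex queries, refactoring into multiple statements or even separate methods is often more helpful.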
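
A small side-by-side sketch may help here. It shows a query expression, an approximation of the method chain the compiler produces for it, and a query that has no query-expression form at all. The `Keywords` data and the `SyntaxComparisonSketch` class are assumptions for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SyntaxComparisonSketch
{
    // Hypothetical sample data; '*' marks a contextual keyword.
    static readonly string[] Keywords = { "class", "var*", "yield*", "int", "where*" };

    // Query expression form.
    public static IEnumerable<string> QueryForm() =>
        from word in Keywords
        where word.Contains('*')
        orderby word.Length, word
        select word.ToUpperInvariant();

    // Roughly the method calls the query above is translated into.
    public static IEnumerable<string> MethodForm() =>
        Keywords
            .Where(word => word.Contains('*'))
            .OrderBy(word => word.Length)
            .ThenBy(word => word)
            .Select(word => word.ToUpperInvariant());

    // The index-aware Where overload and Take have no query-expression keyword.
    public static IEnumerable<string> MethodOnly() =>
        Keywords.Where((word, index) => index % 2 == 0).Take(2);

    static void Main()
    {
        Console.WriteLine(string.Join(", ", QueryForm()));
        Console.WriteLine(string.Join(", ", MethodForm()));
        Console.WriteLine(string.Join(", ", MethodOnly()));
    }
}
```

The first two methods yield the same sequence, which is one way to check a hand translation of a query expression.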

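7. Collection Interfaces and Types

7.1 Overview of Collection Interfaces

.NET provides a number of collection-related interfaces, and understanding them helps you see what functionality different collections have in common. The main ones are:
- IList<T>: supports retrieving elements by index, much like an array.
- IDictionary<TKey, TValue>: supports retrieving elements by key, where the key can be of any type.
- IComparable<T>: used for sorting; a type that implements it must provide a CompareTo() method.
- ICollection<T>: includes the Count property and the CopyTo() method, for getting the element count and copying the collection into an array.

7.2 `IList<T>` versus `IDictionary<TKey, TValue>`

In a sense, a list is a special case of a dictionary in which the "key" is always an integer and the key set is the contiguous range of non-negative integers starting at 0. When choosing a collection class, use one that implements IList<T> if you need to retrieve elements by index, and one that implements IDictionary<TKey, TValue> if you need to retrieve them by key.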

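7.3 `IComparable<T>` and Sorting

The IComparable<T> interface is essential for sorting. For example, the parameterless List<T>.Sort() relies on the element type implementing IComparable<T> (or the non-generic IComparable) to determine the order of the elements; if the type implements neither, the call fails with an InvalidOperationException when it needs to compare elements.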
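
As a minimal sketch of this requirement, the hypothetical `Temperature` type below implements IComparable<T> so that a parameterless Sort() can order it. The type and the `ComparableSketch` helper are invented for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type used only for this illustration.
class Temperature : IComparable<Temperature>
{
    public double Degrees { get; }

    public Temperature(double degrees)
    {
        Degrees = degrees;
    }

    // Negative: this sorts before other; zero: equal position; positive: after.
    public int CompareTo(Temperature other)
    {
        if (other == null)
        {
            return 1;   // by convention, any instance sorts after null
        }
        return Degrees.CompareTo(other.Degrees);
    }
}

static class ComparableSketch
{
    public static List<double> SortedDegrees(params double[] values)
    {
        var list = new List<Temperature>();
        foreach (double value in values)
        {
            list.Add(new Temperature(value));
        }
        list.Sort();    // no comparer passed: Temperature.CompareTo decides the order
        return list.ConvertAll(t => t.Degrees);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" < ", SortedDegrees(21.5, -3, 10)));
    }
}
```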

7.4 Custom Sorting with `IComparer<T>`

Besides implementing IComparable<T> on the element type, you can sort in a custom order by passing an object that implements IComparer<T>. Here is an example:

class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; } 
}

using System; 
using System.Collections.Generic;
class NameComparison : IComparer<Contact> 
{
    public int Compare(Contact x, Contact y)
    {
        int result;
        if (Contact.ReferenceEquals(x, y))
        {
            result = 0;
        }
        else
        {
            if (x == null)
            {
                result = 1;
            }
            else if (y == null)
            {
                result = -1;
            }
            else
            {
                result = StringCompare(x.LastName, y.LastName);
                if (result == 0)
                {
                    result = 
                        StringCompare(x.FirstName, y.FirstName);
                }
            }
        }
        return result;
    }

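    private static int StringCompare(string x, string y)
    {
        int result;
        if (x == null)
        {
            // Nulls sort last, consistent with Compare() above.
            result = (y == null) ? 0 : 1;
        }
        else if (y == null)
        {
            // Without this branch, x.CompareTo(null) would also return a
            // positive value, so both argument orders would claim "greater".
            result = -1;
        }
        else
        {
            result = x.CompareTo(y);                 
        }
        return result;
    } 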
}

To use it, pass a NameComparison instance to List<Contact>.Sort():

List<Contact> contacts = new List<Contact>();
// Add contacts here
contacts.Sort(new NameComparison());
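
When a dedicated comparer class feels heavy, one alternative (not shown in the original text) is to build the comparer from a lambda with `Comparer<T>.Create`. The sketch below assumes a hypothetical `Person` type standing in for Contact and assumes the list contains no null elements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Contact class.
class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

static class ComparerSketch
{
    // Orders by last name, then first name. string.CompareOrdinal accepts
    // null strings and treats null as smaller than any non-null string.
    // Assumption: the Person references themselves are never null.
    public static readonly IComparer<Person> ByName =
        Comparer<Person>.Create((x, y) =>
        {
            int result = string.CompareOrdinal(x.LastName, y.LastName);
            return result != 0
                ? result
                : string.CompareOrdinal(x.FirstName, y.FirstName);
        });

    public static List<Person> SortedSample()
    {
        var people = new List<Person>
        {
            new Person { FirstName = "Zed", LastName = "Smith" },
            new Person { FirstName = "Bob", LastName = "Jones" },
            new Person { FirstName = "Amy", LastName = "Smith" }
        };
        people.Sort(ByName);
        return people;
    }

    static void Main()
    {
        foreach (Person person in SortedSample())
        {
            Console.WriteLine("{0}, {1}", person.LastName, person.FirstName);
        }
    }
}
```

Note that this sketch uses ordinal comparison and sorts null names first, which differs from the culture-sensitive, nulls-last behavior of NameComparison; which one is appropriate depends on the application.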

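7.5 The `ICollection<T>` Interface

ICollection<T> derives from IEnumerable<T> and includes the Count property and the CopyTo() method. Count returns the total number of elements in the collection, but it is not enough on its own to iterate with a for loop, because the interface does not support retrieving elements by index. CopyTo() copies the collection into an array; the caller must make sure the target array has enough capacity.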
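
The capacity requirement can be sketched as follows: size the target array from Count before calling CopyTo(). The helper name `ToArrayViaCopyTo` and the `CopyToSketch` class are made up for this example.

```csharp
using System;
using System.Collections.Generic;

static class CopyToSketch
{
    // Copies any ICollection<T> into a new array sized from Count.
    public static T[] ToArrayViaCopyTo<T>(ICollection<T> source)
    {
        var target = new T[source.Count];   // exactly enough room
        source.CopyTo(target, 0);           // start writing at index 0
        return target;
    }

    static void Main()
    {
        // HashSet<T> implements ICollection<T> but has no indexer.
        ICollection<string> names = new HashSet<string> { "Doc", "Happy" };
        string[] copy = ToArrayViaCopyTo(names);
        Console.WriteLine(copy.Length);     // 2

        try
        {
            names.CopyTo(new string[1], 0); // too small for two elements
        }
        catch (ArgumentException)
        {
            Console.WriteLine("Target array is too small.");
        }
    }
}
```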

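8. Primary Collection Classes

8.1 List Collections: `List<T>`

List<T> has properties similar to those of an array, but it grows automatically as elements are added, and it can be shrunk by calling TrimExcess() or by setting the Capacity property. The defining feature of a list collection is that each element can be accessed individually by index.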

using System; 
using System.Collections.Generic;
class Program 
{
    static void Main()
    {
        List<string> list = new List<string>();
        // The list grows automatically
        list.Add("Sneezy");
        list.Add("Happy");
        list.Add("Dopey");
        list.Add("Doc");
        list.Add("Sleepy");
        list.Add("Bashful");
        list.Add("Grumpy");
        list.Sort();
        Console.WriteLine(
            "In alphabetical order {0} is the "
            + "first dwarf while {1} is the last.", 
            list[0], list[6]);
        list.Remove("Grumpy");
    } 
}

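C# uses zero-based indexing, so in the example above index 0 refers to the first element and index 6 to the seventh. Retrieving an element by index involves no search; it goes directly to the element's memory location.

8.2 Searching a List

List<T> provides several search methods, including Contains(), IndexOf(), LastIndexOf(), and BinarySearch(). The first three scan the elements sequentially (LastIndexOf() starts from the end) until they find a match, so their running time is proportional to the number of elements examined.

BinarySearch() uses a binary search algorithm and requires the elements to be sorted. If the element is not found, it returns a negative integer whose bitwise complement is the index of the next element larger than the one being sought, or the total element count if there is no larger value. This makes it convenient to insert a new value into the list while keeping it sorted.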
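
To watch the automatic growth and the manual shrinking, a short sketch like the following can be used. The exact capacity values are an implementation detail, so the code prints them instead of assuming them; `CapacitySketch` and `BuildAndTrim` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class CapacitySketch
{
    public static List<int> BuildAndTrim()
    {
        var list = new List<int>();
        int lastCapacity = list.Capacity;
        for (int i = 0; i < 20; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // The growth steps depend on the runtime's implementation.
                Console.WriteLine("Count {0}: capacity grew to {1}",
                    list.Count, list.Capacity);
                lastCapacity = list.Capacity;
            }
        }

        list.RemoveRange(5, 15);   // keep only the first five elements
        list.TrimExcess();         // release unused slots in the backing array
        Console.WriteLine("After TrimExcess: Count {0}, Capacity {1}",
            list.Count, list.Capacity);
        return list;
    }

    static void Main()
    {
        List<int> list = BuildAndTrim();
        Console.WriteLine(list[4]);   // direct index access, no search
    }
}
```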

using System; 
using System.Collections.Generic;
class Program 
{
    static void Main()
    {
        List<string> list = new List<string>();
        int search;
        list.Add("public");
        list.Add("protected");
        list.Add("private");
        list.Sort();
        search = list.BinarySearch("protected internal");
        if (search < 0)
        {
            list.Insert(~search, "protected internal");
        }
        foreach (string accessModifier in list)
        {
            Console.WriteLine(accessModifier);
        }
    } 
}
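
The pattern in the listing above generalizes into a small helper. The sketch below is one possible way to write it; `InsertSorted` is a name chosen for this example, and the element type is assumed to be comparable by the default comparer.

```csharp
using System;
using System.Collections.Generic;

static class SortedInsertSketch
{
    // Inserts value into an already sorted list so that it stays sorted.
    public static void InsertSorted<T>(List<T> sortedList, T value)
    {
        int index = sortedList.BinarySearch(value);
        if (index < 0)
        {
            index = ~index;   // not found: the complement is the insertion point
        }
        // Found: insert at the position of an equal element (duplicates allowed).
        sortedList.Insert(index, value);
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 3, 5 };
        InsertSorted(numbers, 4);   // goes between 3 and 5
        InsertSorted(numbers, 6);   // larger than everything: appended
        InsertSorted(numbers, 0);   // smaller than everything: inserted first
        InsertSorted(numbers, 3);   // duplicate of an existing value
        Console.WriteLine(string.Join(" ", numbers));   // 0 1 3 3 4 5 6
    }
}
```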

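8.3 Finding Multiple Items with `FindAll()`

When the search criteria are more complex, you can use the FindAll() method of List<T>. It takes a parameter of type Predicate<T>, which is a delegate.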

using System; 
using System.Collections.Generic;
class Program 
{
    static void Main()
    {
        List<int> list = new List<int>();
        list.Add(1);
        list.Add(2);
        list.Add(3);
        list.Add(2);
        List<int> results = list.FindAll(Even);
        foreach(int number in results)
        {
            Console.WriteLine(number);
        }
    }

    public static bool Even(int value)
    {
        return (value % 2) == 0;
    } 
}

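In the example above, the Even() method is passed to FindAll() as a delegate and is used to find the even numbers in the list.

In short, a good grasp of query expressions and of the various collection types and interfaces lets developers process collection data more efficiently and improves both productivity and code quality.

9. Advanced Uses and Optimization of Collection Operations

9.1 Performance Considerations for Collection Operations

Performance is an important consideration when choosing among collection classes and operations. The following table summarizes the cost of some common operations:
| Operation | List<T> | Dictionary<TKey, TValue> |
| ---- | ---- | ---- |
| Add an element | Amortized O(1) when appending; O(n) when the backing array must grow | O(1) on average |
| Look up by index/key | O(1) | O(1) on average |
| Search for an element | O(n) | O(1) on average when searching by key |
| Remove an element | O(n) | O(1) on average |

As the table shows, Dictionary<TKey, TValue> usually performs better for insertion, lookup by key, and removal, while List<T> excels at index-based access. The right collection class therefore depends on the specific usage scenario.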
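
The difference between the two search rows can be sketched with a small lookup example. The `Product` type and the `LookupSketch` helpers are assumptions for illustration; no timings are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type for this illustration.
class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

static class LookupSketch
{
    public static string FindInList(List<Product> products, int id)
    {
        // Linear scan: elements are examined one by one, O(n).
        Product match = products.Find(p => p.Id == id);
        return match == null ? null : match.Name;
    }

    public static string FindInDictionary(Dictionary<int, Product> byId, int id)
    {
        // Hash lookup: O(1) on average, regardless of the element count.
        Product match;
        return byId.TryGetValue(id, out match) ? match.Name : null;
    }

    static void Main()
    {
        var products = new List<Product>
        {
            new Product { Id = 1, Name = "Keyboard" },
            new Product { Id = 2, Name = "Mouse" }
        };

        // Building the index is itself O(n), so it pays off for repeated lookups.
        Dictionary<int, Product> byId = products.ToDictionary(p => p.Id);

        Console.WriteLine(FindInList(products, 2));                 // Mouse
        Console.WriteLine(FindInDictionary(byId, 2));               // Mouse
        Console.WriteLine(FindInDictionary(byId, 99) ?? "(none)");  // (none)
    }
}
```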

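9.2 Optimizing Query Expressions

The following measures can improve the performance of query expressions:
1. Reduce unnecessary query work: avoid repeated computation and repeated filtering inside a query, and filter as early as possible so that later stages process fewer elements.
2. Take advantage of deferred execution: query expressions are usually executed lazily, meaning the query runs only when its results are actually needed. Used well, this avoids unnecessary computation; keep in mind that enumerating the same deferred query more than once runs it again each time.
3. Choose a suitable collection type: pick the collection type that matches the query's needs, such as Dictionary<TKey, TValue> for fast lookups.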
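
The first two points can be illustrated with a counter that tracks how often an expensive projection runs. This is only a sketch; `Expensive`, `ExpensiveCalls`, and `QueryOptimizationSketch` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QueryOptimizationSketch
{
    public static int ExpensiveCalls;

    // Stands in for a costly computation.
    static int Expensive(int n)
    {
        ExpensiveCalls++;
        return n * n;
    }

    public static IEnumerable<int> Build(IEnumerable<int> source) =>
        from n in source
        where n % 2 == 0          // cheap filter first: fewer elements reach Expensive()
        select Expensive(n);

    static void Main()
    {
        IEnumerable<int> query = Build(Enumerable.Range(1, 10));

        // Each enumeration re-runs the deferred query.
        int sum = query.Sum();
        int max = query.Max();
        Console.WriteLine("{0} {1}", sum, max);   // 220 100
        Console.WriteLine(ExpensiveCalls);        // 10 (5 even numbers, enumerated twice)

        // Materialize once when the results are needed several times.
        ExpensiveCalls = 0;
        List<int> cached = query.ToList();
        sum = cached.Sum();
        max = cached.Max();
        Console.WriteLine(ExpensiveCalls);        // 5
    }
}
```

Calling ToList() trades memory for repeated work, so it is worth doing mainly when the results are consumed more than once.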

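9.3 Concurrent Collection Operations

In a multithreaded environment, concurrent operations on collections require special care. Here are some ways to handle them:
- Use the concurrent collection classes: .NET provides concurrent collections such as ConcurrentDictionary<TKey, TValue> and ConcurrentBag<T>, which can be used safely from multiple threads.
- Use locking: if you need to use an ordinary collection class, a lock can ensure thread safety. For example: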

using System;
using System.Collections.Generic;
using System.Threading;

class Program
{
    private static readonly object _lock = new object();
    private static List<int> _list = new List<int>();

    static void Main()
    {
        Thread t1 = new Thread(AddItems);
        Thread t2 = new Thread(AddItems);

        t1.Start();
        t2.Start();

        t1.Join();
        t2.Join();

        foreach (int item in _list)
        {
            Console.WriteLine(item);
        }
    }

    static void AddItems()
    {
        for (int i = 0; i < 10; i++)
        {
            lock (_lock)
            {
                _list.Add(i);
            }
        }
    }
}
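
For comparison with the lock-based listing, here is a minimal sketch of the first option, using the concurrent collection classes with Parallel.For. The `ConcurrentSketch` class and the `CountByRemainder` helper are invented for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class ConcurrentSketch
{
    // Counts how many numbers in [0, total) fall into each remainder bucket.
    public static ConcurrentDictionary<int, int> CountByRemainder(int total, int buckets)
    {
        var counts = new ConcurrentDictionary<int, int>();
        Parallel.For(0, total, i =>
        {
            // Adds 1 for a new key or increments the stored value; the
            // dictionary handles the synchronization, so no explicit lock is needed.
            counts.AddOrUpdate(i % buckets, 1, (key, current) => current + 1);
        });
        return counts;
    }

    static void Main()
    {
        ConcurrentDictionary<int, int> counts = CountByRemainder(1000, 10);
        foreach (var pair in counts)
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }

        var bag = new ConcurrentBag<int>();
        Parallel.For(0, 100, i => bag.Add(i));   // unordered, thread-safe adds
        Console.WriteLine(bag.Count);            // 100
    }
}
```

The update delegate passed to AddOrUpdate may be invoked more than once under contention, so it should be free of side effects, as it is here.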

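10. Creating Custom Collections

10.0 Steps for Creating a Custom Collection

A custom collection lets you design the collection's behavior around specific requirements. The general steps are:
1. Define the collection class: create a class that represents the custom collection and implements the appropriate collection interfaces.
2. Implement the interface members: implement the members of IEnumerable<T>, ICollection<T>, IList<T>, and so on, as needed.
3. Provide the necessary properties and methods: for example, a Count property and Add() and Remove() methods.

10.1 Implementing the `IEnumerable<T>` Interface

Implementing IEnumerable<T> lets a custom collection be used in a foreach loop. Here is a simple example: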

using System;
using System.Collections;
using System.Collections.Generic;

class CustomCollection<T> : IEnumerable<T>
{
    private T[] _items;

    public CustomCollection(T[] items)
    {
        _items = items;
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (int i = 0; i < _items.Length; i++)
        {
            yield return _items[i];
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Usage example:

class Program
{
    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };
        CustomCollection<int> collection = new CustomCollection<int>(numbers);

        foreach (int number in collection)
        {
            Console.WriteLine(number);
        }
    }
}
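
Implementing IEnumerable<T> also makes the standard query operators available, because they are extension methods on that interface. The sketch below uses a separate hypothetical collection, `Countdown`, so that it stays self-contained; it is not part of the original listings.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection that generates its elements on demand.
class Countdown : IEnumerable<int>
{
    private readonly int _start;

    public Countdown(int start)
    {
        _start = start;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = _start; i >= 1; i--)
        {
            yield return i;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

static class CustomEnumerableSketch
{
    public static List<int> EvenValues(int start)
    {
        var countdown = new Countdown(start);
        // The query binds to Enumerable.Where through IEnumerable<int>.
        return (from n in countdown
                where n % 2 == 0
                select n).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", EvenValues(6)));   // 6 4 2
        Console.WriteLine(new Countdown(3).Sum());            // 6
    }
}
```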

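10.2 Implementing the `ICollection<T>` Interface

Implementing ICollection<T> gives a custom collection more collection operations, such as the Count property and the Add() method. Here is an example: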

using System;
using System.Collections;
using System.Collections.Generic;

class CustomCollection<T> : ICollection<T>
{
    private List<T> _items = new List<T>();

    public int Count
    {
        get { return _items.Count; }
    }

    public bool IsReadOnly
    {
        get { return false; }
    }

    public void Add(T item)
    {
        _items.Add(item);
    }

    public void Clear()
    {
        _items.Clear();
    }

    public bool Contains(T item)
    {
        return _items.Contains(item);
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        _items.CopyTo(array, arrayIndex);
    }

    public bool Remove(T item)
    {
        return _items.Remove(item);
    }

    public IEnumerator<T> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
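
When the custom behavior is only a small variation on a list, an alternative to implementing every member by hand (not covered in the original text) is to derive from `System.Collections.ObjectModel.Collection<T>` and override its protected hooks. The `NonNullCollection<T>` type below is a hypothetical example of that approach.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical collection that rejects null elements.
class NonNullCollection<T> : Collection<T> where T : class
{
    // Called by Add() and Insert().
    protected override void InsertItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.InsertItem(index, item);
    }

    // Called by the indexer's setter.
    protected override void SetItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.SetItem(index, item);
    }
}

static class CollectionBaseSketch
{
    static void Main()
    {
        // A collection initializer works because the type is enumerable and has Add().
        var names = new NonNullCollection<string> { "Doc", "Happy" };

        ICollection<string> asInterface = names;          // ICollection<T> comes for free
        Console.WriteLine(asInterface.Count);             // 2
        Console.WriteLine(asInterface.Contains("Doc"));   // True

        try
        {
            names.Add(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("null rejected");
        }
        Console.WriteLine(names.Count);                   // 2
    }
}
```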

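11. Summary

Studying query expressions together with the various collection types and interfaces makes it possible to process collection data more efficiently. Query expressions offer a concise syntax for working with collections, while the different collection types and interfaces cover a wide range of needs.

In practice, choose the collection type and operations that fit the scenario at hand, and keep performance and concurrency in mind. Creating a custom collection lets you design its behavior around specific requirements, which further improves the flexibility and maintainability of your code. Hopefully this article helps you get a firmer grip on LINQ and collection operations and improves your productivity and code quality.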

graph LR
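    A[Choose a suitable collection type] --> B[Improve performance]
    B --> C[Process data efficiently]
    D[Use query expressions] --> C
    E[Create custom collections] --> C

That wraps up this overview of LINQ and collection operations; I hope you find it helpful.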