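LINQ makes secondary operations on collections (List, DataTable, and so on) very convenient. This post walks through several ways to perform a Distinct operation on a collection with LINQ.
0. Prepare the data: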
using System;
using System.Collections.Generic;
using System.Linq;

public class User
{
    public string A;
    public string B;
    public string C;
    public string D;

    public override string ToString()
    {
        return string.Format("{0},{1},{2},{3}", A, B, C, D);
    }

    // Sample data: rows 1, 2 and 5 are exact duplicates.
    public static List<User> GetData()
    {
        return new List<User> {
            new User { A = "a1", B = "b1", C = "c1", D = "d1" },
            new User { A = "a1", B = "b1", C = "c1", D = "d1" },
            new User { A = "a2", B = "b1", C = "c1", D = "d1" },
            new User { A = "a1", B = "b2", C = "c1", D = "d1" },
            new User { A = "a1", B = "b1", C = "c1", D = "d1" },
            new User { A = "a1", B = "b1", C = "c2", D = "d1" },
            new User { A = "a1", B = "b1", C = "c1", D = "d2" },
        };
    }
}
1. Use GroupBy: group by the fields you want to be distinct and take the first record of each group. The result is the distinct data.
Console.WriteLine("Distinct1 By: A");
var query1 = from e in User.GetData()
             group e by new { e.A } into g
             select g.FirstOrDefault();
foreach (var u in query1)
    Console.WriteLine(u.ToString());
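The same grouping idea can also be written in method syntax. The following is a minimal sketch, not code from the original post: the trimmed-down User class and the GroupByDemo.DistinctByA helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for the User class from step 0 (assumption: only A and B are needed here).
public class User
{
    public string A;
    public string B;
}

public static class GroupByDemo
{
    // Keeps the first element of every group that shares the same A value.
    public static List<User> DistinctByA(IEnumerable<User> source)
    {
        return source
            .GroupBy(u => u.A)        // one group per distinct A, in order of first appearance
            .Select(g => g.First())   // a group is never empty, so First() cannot throw
            .ToList();
    }

    public static void Main()
    {
        var data = new List<User>
        {
            new User { A = "a1", B = "b1" },
            new User { A = "a1", B = "b2" },
            new User { A = "a2", B = "b1" },
        };

        foreach (var u in DistinctByA(data))
            Console.WriteLine(u.A + "," + u.B);
        // Prints:
        // a1,b1
        // a2,b1
    }
}
```

Because a group always contains at least one element, First() and FirstOrDefault() behave the same way here.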
2. Use the Distinct() extension method: this requires implementing the IEqualityComparer<T> interface.
class UserCompare : IEqualityComparer<User>
{
    public bool Equals(User x, User y)
    {
        return (x.A == y.A && x.B == y.B);
    }

    public int GetHashCode(User obj)
    {
        // return obj.GetHashCode();   // default per-instance hash: duplicates would not be detected
        // Note: ToString() covers A, B, C and D, so this hash depends on all four fields.
        return obj.ToString().ToLower().GetHashCode();
    }
}

Console.WriteLine("Distinct2 By: A,B");
var compare = new UserCompare();
var query2 = User.GetData().Distinct(compare);
foreach (var u in query2)
    Console.WriteLine(u.ToString());
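Note on GetHashCode() in the implementation above: if it simply returns obj.GetHashCode(), Distinct does not work as expected. User does not override GetHashCode(), so the default implementation returns an identity-based value that practically never matches between two separate instances, and Distinct only calls Equals() on elements whose hash codes match. For the same reason, hashing ToString() makes the result depend on all four fields: two users with the same A and B but a different C or D get different hash codes and are both kept, which is why the Distinct2 output below has five rows. To deduplicate strictly by A and B, GetHashCode() has to be computed from A and B only.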
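To illustrate the point about keeping GetHashCode() in line with Equals(), here is a minimal sketch of a comparer that derives both from A and B only. The name UserByABComparer and the trimmed-down User class are assumptions made for this example and do not appear in the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the User class from step 0.
public class User
{
    public string A, B, C, D;

    public override string ToString()
    {
        return string.Format("{0},{1},{2},{3}", A, B, C, D);
    }
}

// Hypothetical comparer: equality and hash code both look at A and B only.
public class UserByABComparer : IEqualityComparer<User>
{
    public bool Equals(User x, User y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.A == y.A && x.B == y.B;
    }

    public int GetHashCode(User obj)
    {
        if (obj == null) return 0;
        unchecked // integer overflow is fine when combining hash codes
        {
            int hash = 17;
            hash = hash * 31 + (obj.A == null ? 0 : obj.A.GetHashCode());
            hash = hash * 31 + (obj.B == null ? 0 : obj.B.GetHashCode());
            return hash;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var users = new List<User>
        {
            new User { A = "a1", B = "b1", C = "c1", D = "d1" },
            new User { A = "a2", B = "b1", C = "c1", D = "d1" },
            new User { A = "a1", B = "b2", C = "c1", D = "d1" },
            new User { A = "a1", B = "b1", C = "c2", D = "d1" },
            new User { A = "a1", B = "b1", C = "c1", D = "d2" },
        };

        var distinct = users.Distinct(new UserByABComparer()).ToList();
        Console.WriteLine(distinct.Count); // 3: one row per (A, B) pair
        foreach (var u in distinct)
            Console.WriteLine(u);
    }
}
```

With this comparer, rows that differ only in C or D collapse into a single row, so the sample above yields three rows instead of five.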
3. Write a custom extension method DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector):
public static class MyEnumerableExtensions
{
    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        // HashSet.Add returns false when the key has already been seen.
        HashSet<TKey> seenKeys = new HashSet<TKey>();
        foreach (TSource element in source)
        {
            if (seenKeys.Add(keySelector(element))) { yield return element; }
        }
    }
}

Console.WriteLine("Distinct3 By: A,B,C");
var query3 = User.GetData().DistinctBy(x => new { x.A, x.B, x.C });
foreach (var u in query3)
    Console.WriteLine(u.ToString());
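The key-selector approach can be extended with a comparer for the key, for example to ignore case. The sketch below is an illustration under assumptions: DistinctByKey and KeyedDistinctExtensions are hypothetical names, chosen so they do not clash with the Enumerable.DistinctBy method that ships with System.Linq from .NET 6 onward (which also has an overload taking a key comparer).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyedDistinctExtensions
{
    // Hypothetical helper: like the DistinctBy above, but with a custom comparer for the key.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        return Iterate(source, keySelector, keyComparer);
    }

    // Separate iterator so the argument checks above run immediately, not on first enumeration.
    private static IEnumerable<TSource> Iterate<TSource, TKey>(
        IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> keyComparer)
    {
        // A null comparer makes HashSet fall back to the default equality comparer.
        var seenKeys = new HashSet<TKey>(keyComparer);
        foreach (TSource element in source)
        {
            if (seenKeys.Add(keySelector(element)))
                yield return element;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var values = new List<string> { "a1", "A1", "a2" };

        foreach (var v in values.DistinctByKey(x => x, StringComparer.OrdinalIgnoreCase))
            Console.WriteLine(v);
        // Prints:
        // a1
        // a2
    }
}
```

Elements are yielded in source order, and the first element seen for each key is the one that is kept.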
Output:
A B C D
a2,b1,c1,d1
a1,b2,c1,d1
a1,b1,c1,d1
a1,b1,c2,d1
a1,b1,c1,d2
----------------
Distinct1 By: A
a1,b1,c1,d1
a2,b1,c1,d1
----------------
Distinct2 By: A,B
a1,b1,c1,d1
a2,b1,c1,d1
a1,b2,c1,d1
a1,b1,c2,d1
a1,b1,c1,d2
----------------
Distinct3 By: A,B,C
a1,b1,c1,d1
a2,b1,c1,d1
a1,b2,c1,d1
a1,b1,c2,d1
----------------
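This post covered ways to deduplicate a collection with LINQ: GroupBy, the Distinct() extension method, and a custom DistinctBy extension method. Each is shown with a concrete example so you can pick the approach that fits your needs.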