You've memorized the HashMap interview answers. What about HashSet?

Latest recommended article published on 2025-12-19 15:39:34

License: CC 4.0 BY-SA

Tags:

#java #hash-algorithm #programming-language #HashSet

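I. The Core Relationship

HashSet is a core member of the Java Collections Framework, and its implementation is built entirely on HashMap: every Set operation is delegated to an internal map. Understanding HashMap is therefore the key to understanding HashSet.

// HashSet class definition (abridged from the JDK source)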
public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable {
    
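    private transient HashMap<E,Object> map; // Key point: elements are stored as keys of a HashMap
    
    // Dummy object used as the value for every key in the backing map
    private static final Object PRESENT = new Object();
    
    // Every constructor initializes the backing HashMap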
    public HashSet() {
        map = new HashMap<>();
    }
    
    // Other constructors...
}

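II. Implementation Comparison

1. Data structure differences

Collection	Storage structure	Element layout	Value handling
HashMap	Array + linked list + red-black tree (JDK 8)	Key-value Node<K,V> entries	Both key and value hold real data
HashSet	Same as HashMap	Only the key is used	Value is always the shared PRESENT dummy object

2. Core operations

Adding an element:

// HashMap.put (simplified)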
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

// HashSet.add
public boolean add(E e) {
    return map.put(e, PRESENT)==null; // the value is always PRESENT; null means the key was not there before
}

Membership check:

// HashMap.containsKey
public boolean containsKey(Object key) {
    return getNode(hash(key), key) != null;
}

// HashSet.contains
public boolean contains(Object o) {
    return map.containsKey(o); // delegates directly to HashMap
}

Removing an element:

// HashMap.remove
public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

// HashSet.remove
public boolean remove(Object o) {
    return map.remove(o)==PRESENT; // true only if the removed value was PRESENT, i.e. the element existed
}
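
To make the return values above concrete, here is a minimal sketch that exercises add, contains, and remove on a plain HashSet. The class name SetReturnValuesDemo and the method run are made up for illustration; only the java.util calls are real API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetReturnValuesDemo {
    // Returns the results of add, add, contains, remove, remove in that order.
    static boolean[] run() {
        Set<String> set = new HashSet<>();
        boolean firstAdd = set.add("java");        // backing put() returned null, so true
        boolean secondAdd = set.add("java");       // backing put() returned PRESENT, so false
        boolean present = set.contains("java");    // key exists in the backing map
        boolean firstRemove = set.remove("java");  // backing remove() returned PRESENT
        boolean secondRemove = set.remove("java"); // backing remove() returned null
        return new boolean[] {firstAdd, secondAdd, present, firstRemove, secondRemove};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // [true, false, true, true, false]
    }
}
```

The boolean results are simply HashMap's return values reinterpreted: "was there a previous value?" becomes "was the element already in the set?".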

III. JDK 1.8 Features

1. Red-black tree conversion

HashSet benefits from HashMap's red-black tree optimization as well:

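// Treeification sketch. Integer keys such as i * 16 share bucket 0 only while the table has
// 16 slots, and each resize spreads them out again, so a key type with a constant hashCode()
// is used here instead (CollidingKey is a hypothetical class for illustration).
final class CollidingKey {
    final int id;
    CollidingKey(int id) { this.id = id; }
    @Override public int hashCode() { return 42; } // every instance collides
    @Override public boolean equals(Object o) { return o instanceof CollidingKey && ((CollidingKey) o).id == id; }
}
Set<CollidingKey> set = new HashSet<>();
for (int i = 0; i < 100; i++) {
    set.add(new CollidingKey(i)); // all 100 elements land in the same bucket
}
// The overfull bucket triggers HashMap's treeification logic internally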

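Treeification follows HashMap's rules:

A bucket is converted to a red-black tree when an insertion pushes its chain length past 8 (TREEIFY_THRESHOLD) and table.length ≥ 64 (MIN_TREEIFY_CAPACITY); if the table is smaller, HashMap resizes instead
A tree bucket reverts to a linked list when a resize leaves it with 6 or fewer nodes (UNTREEIFY_THRESHOLD); removals can also untreeify a bucket once the tree has become very small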
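
The insertion-side rule can be sketched as a small decision function. This is a simplified model, not the JDK code: the class TreeifyRuleSketch, the enum Action, and the method afterInsert are hypothetical names, and only the two constants mirror values defined in HashMap.

```java
class TreeifyRuleSketch {
    // Values copied from the constants of the same name in java.util.HashMap.
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE_TABLE, TREEIFY }

    // chainLengthAfterInsert: number of nodes in the bucket once the new node is appended.
    // tableLength: current length of the bucket array.
    static Action afterInsert(int chainLengthAfterInsert, int tableLength) {
        if (chainLengthAfterInsert <= TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;      // chain is still short enough
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE_TABLE;   // small table: growing it is preferred over a tree
        }
        return Action.TREEIFY;            // long chain in a large table
    }

    public static void main(String[] args) {
        System.out.println(afterInsert(8, 64));  // KEEP_LIST
        System.out.println(afterInsert(9, 16));  // RESIZE_TABLE
        System.out.println(afterInsert(9, 64));  // TREEIFY
    }
}
```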

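2. Hash computation

Both classes use the same hash function, because HashSet delegates to HashMap:

// HashMap.hash, which HashSet uses indirectly through its backing map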
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
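
A short worked example shows why the high 16 bits are folded into the low ones. Since HashMap.hash is not public, the sketch below copies the same expression into a hypothetical helper named spread and combines it with the (length - 1) & hash indexing that HashMap uses for power-of-two tables. HashSpreadDemo and bucketIndex are illustrative names.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(), duplicated here because the original is not public.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length is a power of two.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        // 65536 and 131072 differ only above bit 15; without spreading, both would mask to 0.
        System.out.println(bucketIndex(65536, 16));   // 1
        System.out.println(bucketIndex(131072, 16));  // 2
        System.out.println(bucketIndex(null, 16));    // 0: a null element always goes to bucket 0
    }
}
```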

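IV. Key Differences

Feature	HashMap	HashSet
Stored content	Key-value pairs	Single elements (stored as the map's keys)
Duplicate detection	equals and hashCode of the key	equals and hashCode of the element
Null support	One null key and any number of null values	One null element
Iteration order	Not guaranteed	Not guaranteed
Overhead	One node per entry, holding a real value	Same node per element; the value slot just references the shared PRESENT object, plus a thin delegation layer
Typical use	Key-value association	Deduplication and set operations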
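
The duplicate-detection row is the one that most often causes trouble in practice, so here is a minimal sketch of it. Point and RawPoint are hypothetical classes invented for this example: the first overrides equals and hashCode, the second inherits the identity-based versions from Object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class DuplicateDetectionDemo {
    // Hypothetical value class that overrides both equals and hashCode.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Hypothetical class that keeps Object's identity-based equals and hashCode.
    static final class RawPoint {
        final int x, y;
        RawPoint(int x, int y) { this.x = x; this.y = y; }
    }

    static int distinctPoints() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // equal to the first element, so it is rejected
        return set.size();
    }

    static int distinctRawPoints() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        set.add(new RawPoint(1, 2)); // a different object identity, so it is accepted
        return set.size();
    }

    static int nullElements() {
        Set<String> set = new HashSet<>();
        set.add(null);
        set.add(null); // a second null is a duplicate like any other element
        return set.size();
    }
}
```

With these definitions, distinctPoints() yields 1, distinctRawPoints() yields 2, and nullElements() yields 1.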

V. Usage Examples

1. Typical HashMap use

// Count word frequencies
Map<String, Integer> wordCount = new HashMap<>();
String text = "hello world hello java";
for (String word : text.split(" ")) {
    wordCount.put(word, wordCount.getOrDefault(word, 0) + 1);
}
// wordCount: {world=1, java=1, hello=2} (iteration order is not guaranteed)

2. Typical HashSet use

// Deduplication
Set<String> uniqueWords = new HashSet<>();
String text = "hello world hello java";
uniqueWords.addAll(Arrays.asList(text.split(" ")));
// uniqueWords: [world, java, hello] (iteration order is not guaranteed)

// Set operations
Set<String> set1 = new HashSet<>(Arrays.asList("a", "b", "c"));
Set<String> set2 = new HashSet<>(Arrays.asList("b", "c", "d"));

// Union
Set<String> union = new HashSet<>(set1);
union.addAll(set2); // [a, b, c, d]

// Intersection
Set<String> intersection = new HashSet<>(set1);
intersection.retainAll(set2); // [b, c]

// Difference
Set<String> difference = new HashSet<>(set1);
difference.removeAll(set2); // [a]

VI. Caveats

Thread safety: neither class is thread-safe. In multithreaded code, wrap them or use a concurrent collection:

Set<String> safeSet = Collections.synchronizedSet(new HashSet<>());
Map<String, String> safeMap = Collections.synchronizedMap(new HashMap<>());
// Or use a concurrent map
ConcurrentHashMap<String, String> concurrentMap = new ConcurrentHashMap<>();
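
For a concurrent Set rather than a Map, one option is ConcurrentHashMap.newKeySet(), which returns a Set view backed by a ConcurrentHashMap. The sketch below is only an illustration under simple assumptions (two writer threads, disjoint values); ConcurrentSetDemo and fillFromTwoThreads are made-up names. Note that, unlike HashSet, this set rejects null elements.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentSetDemo {
    // Two threads add disjoint halves of 0..999; returns the final size.
    static int fillFromTwoThreads() {
        Set<Integer> set = ConcurrentHashMap.newKeySet();
        Thread evens = new Thread(() -> { for (int i = 0; i < 1000; i += 2) set.add(i); });
        Thread odds = new Thread(() -> { for (int i = 1; i < 1000; i += 2) set.add(i); });
        evens.start();
        odds.start();
        try {
            evens.join(); // wait for both writers before reading the size
            odds.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads()); // 1000
    }
}
```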

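Initial capacity: when the number of elements is known in advance, pass an initial capacity to avoid repeated resizing

// Expecting 100 elements with the default load factor of 0.75
new HashSet<>(134); // 100 / 0.75 = 133.33, rounded up; HashMap then rounds the capacity up to a power of two (256)
new HashMap<>(134);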
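
The arithmetic behind that capacity can be wrapped in a small helper. This is a sketch assuming the default load factor of 0.75; CapacityDemo, initialCapacityFor, and nextPowerOfTwo are hypothetical names, and nextPowerOfTwo only imitates the rounding HashMap applies internally.

```java
import java.util.HashSet;
import java.util.Set;

class CapacityDemo {
    // Smallest initial capacity whose resize threshold (capacity * 0.75) covers expectedSize.
    static int initialCapacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Imitates HashMap's rounding of the requested capacity up to a power of two.
    static int nextPowerOfTwo(int capacity) {
        int n = 1;
        while (n < capacity) {
            n <<= 1;
        }
        return n;
    }

    public static void main(String[] args) {
        int requested = initialCapacityFor(100);       // 134
        int actual = nextPowerOfTwo(requested);        // 256
        Set<String> set = new HashSet<>(requested);    // no resize while adding 100 elements
        System.out.println(requested + " -> " + actual + ", empty set size " + set.size());
    }
}
```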

Mutability: if a mutable object is used as a key (or set element) and is modified after insertion, lookups for it will fail

Set<List<String>> set = new HashSet<>();
List<String> list = new ArrayList<>();
list.add("item");
set.add(list);
list.add("modified"); // hashCode changes after the element was stored
System.out.println(set.contains(list)); // typically prints false: the lookup uses the new hash
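
If an element really has to change, one workaround is to take it out of the set first, mutate it, and put it back so that it is stored under its new hash. The sketch below shows that pattern; MutableElementDemo and mutateSafely are illustrative names, and using immutable elements is usually the simpler choice.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutableElementDemo {
    // Returns true if the element is still findable after a remove / mutate / re-add cycle.
    static boolean mutateSafely() {
        Set<List<String>> set = new HashSet<>();
        List<String> list = new ArrayList<>();
        list.add("item");
        set.add(list);

        set.remove(list);      // remove while the stored hash still matches
        list.add("modified");  // mutate while the element is outside the set
        set.add(list);         // re-insert under the new hash

        return set.contains(list) && set.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(mutateSafely()); // true
    }
}
```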

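Inspecting internals: HashSet exposes no monitoring hooks, but for debugging the backing HashMap can be read via reflection (on recent JDKs this requires --add-opens java.base/java.util=ALL-UNNAMED)

// Read the HashMap that backs a HashSet (throws NoSuchFieldException / IllegalAccessException)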
HashSet<?> set = new HashSet<>();
Field mapField = HashSet.class.getDeclaredField("map");
mapField.setAccessible(true);
HashMap<?,?> internalMap = (HashMap<?,?>) mapField.get(set);

Read it a few times, understand it, memorize it, then bring it out in the interview and leave the interviewer speechless...