Interview Prep: A Walkthrough of the HashSet Source Code


License: CC 4.0 BY-SA

Original link: https://my.oschina.net/xlpapapa/blog/2999974


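1. Recent interviews have all asked the same question: when two objects of the same class are stored in a HashSet, how do you make the set treat them as duplicates?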

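The answer is to override both the equals method and the hashCode method. By default, whether two objects are equal is decided by the equals method of the Object class, and its source is: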

 public boolean equals(Object obj) {
        return (this == obj);
    }

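The default implementation only returns true when this == obj, that is, when both references point to the same object. Two separate instances with identical field values are therefore never equal unless equals is overridden.

Overriding equals alone is not enough for a HashSet, though. The set compares hash codes before it ever calls equals, so hashCode has to be overridden as well.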

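* The gist of the Javadoc below: as far as is reasonably practical, Object's hashCode method returns distinct integers for distinct objects, much like an id for each object (although uniqueness is not guaranteed).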

/**
     * Returns a hash code value for the object. This method is
     * supported for the benefit of hash tables such as those provided by
     * {@link java.util.HashMap}.
     * <p>
     * The general contract of {@code hashCode} is:
     * <ul>
     * <li>Whenever it is invoked on the same object more than once during
     *     an execution of a Java application, the {@code hashCode} method
     *     must consistently return the same integer, provided no information
     *     used in {@code equals} comparisons on the object is modified.
     *     This integer need not remain consistent from one execution of an
     *     application to another execution of the same application.
     * <li>If two objects are equal according to the {@code equals(Object)}
     *     method, then calling the {@code hashCode} method on each of
     *     the two objects must produce the same integer result.
     * <li>It is <em>not</em> required that if two objects are unequal
     *     according to the {@link java.lang.Object#equals(java.lang.Object)}
     *     method, then calling the {@code hashCode} method on each of the
     *     two objects must produce distinct integer results.  However, the
     *     programmer should be aware that producing distinct integer results
     *     for unequal objects may improve the performance of hash tables.
     * </ul>
     * <p>
     * As much as is reasonably practical, the hashCode method defined by
     * class {@code Object} does return distinct integers for distinct
     * objects. (This is typically implemented by converting the internal
     * address of the object into an integer, but this implementation
     * technique is not required by the
     * Java&trade; programming language.)
     *
     * @return  a hash code value for this object.
     * @see     java.lang.Object#equals(java.lang.Object)
     * @see     java.lang.System#identityHashCode
     */
    public native int hashCode();
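
To make the interview answer concrete, here is a minimal sketch. The classes Person, PlainPerson and DedupDemo are hypothetical names invented for this illustration; they are not from the JDK. Person overrides both methods from the same fields, while PlainPerson keeps the defaults inherited from Object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical example class: equals and hashCode are derived from the same fields.
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;               // same reference
        if (!(obj instanceof Person)) return false; // also rejects null
        Person other = (Person) obj;
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age); // equal objects produce equal hash codes
    }
}

// Hypothetical class that keeps Object's default equals and hashCode.
final class PlainPerson {
    final String name;

    PlainPerson(String name) {
        this.name = name;
    }
}

class DedupDemo {
    static int overriddenSize() {
        Set<Person> set = new HashSet<>();
        set.add(new Person("Tom", 20));
        set.add(new Person("Tom", 20)); // equal to the first element, so it is not stored again
        return set.size();
    }

    static int defaultSize() {
        Set<PlainPerson> set = new HashSet<>();
        set.add(new PlainPerson("Tom"));
        set.add(new PlainPerson("Tom")); // different reference, so the default equals returns false
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(overriddenSize()); // 1
        System.out.println(defaultSize());    // 2
    }
}
```

Objects.hash is only one convenient way to combine fields; a hand-written or IDE-generated hashCode works just as well, as long as it uses the same fields as equals.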

2. With the interview question understood, it is worth looking deeper into the source code.

2.1 Under the hood, HashSet is backed by a HashMap. The constructor simply creates a new HashMap, which explains the "Hash" in its name.

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
 public HashSet() {
     map = new HashMap<>();
 }

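2.2 The next thing of interest is the add method, since a set exists to have data put into it. Because the backing structure is a HashMap, every entry needs a key and a value, yet a HashSet caller only passes in a single argument.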

// Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();
/**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element <tt>e</tt> to this set if
     * this set contains no element <tt>e2</tt> such that
     * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns <tt>false</tt>.
     *
     * @param e element to be added to this set
     * @return <tt>true</tt> if this set did not already contain the specified
     * element
     */
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

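The source shows that the argument becomes the key of the map, which is what makes deduplication possible, while the value is always the same static Object constant, PRESENT.

In summary, deduplication in HashSet is simply the key deduplication performed by HashMap.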
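
The return value of add follows directly from that one-line implementation. The sketch below compares HashSet.add with the equivalent HashMap.put call; AddReturnDemo is a hypothetical class name, and its PRESENT field is a local stand-in for HashSet's private constant.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AddReturnDemo {
    // Local stand-in for HashSet's private PRESENT constant.
    private static final Object PRESENT = new Object();

    // Adds the same element twice to a HashSet and records both results.
    static boolean[] setAdds() {
        Set<String> set = new HashSet<>();
        boolean first = set.add("a");  // element is new
        boolean second = set.add("a"); // element is already present
        return new boolean[] { first, second };
    }

    // Repeats the experiment with the expression used inside HashSet.add.
    static boolean[] mapPuts() {
        Map<String, Object> map = new HashMap<>();
        boolean first = map.put("a", PRESENT) == null;  // no previous mapping, put returns null
        boolean second = map.put("a", PRESENT) == null; // put returns the previous value, PRESENT
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] s = setAdds();
        boolean[] m = mapPuts();
        System.out.println(s[0] + " " + s[1]); // true false
        System.out.println(m[0] + " " + m[1]); // true false
    }
}
```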

When HashMap executes put, it delegates to the putVal method:

    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

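Here is the key point: the hash argument is hash(key), which calls the key's hashCode method (the overriding version, if the key's class overrides it) and then XORs the upper 16 bits of the result into the lower 16 bits.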

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
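
The effect of the `h >>> 16` step is easier to see with numbers. The sketch below copies the expression into a hypothetical helper named spread, because HashMap.hash itself is not public, and combines it with the `(n - 1) & hash` index computation that appears in putVal. The class name HashSpreadDemo, the method name indexFor and the table length of 16 are assumptions made for the illustration.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(Object), copied here because that method is not public.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index as computed in putVal: (n - 1) & hash, where n is a power-of-two table length.
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        Integer a = 1 << 16; // 65536; an Integer's hashCode is its int value
        Integer b = 1 << 17; // 131072

        // Without spreading, only the low bits are used, so both keys land in bucket 0.
        System.out.println(indexFor(a.hashCode(), 16)); // 0
        System.out.println(indexFor(b.hashCode(), 16)); // 0

        // With spreading, the high bits influence the index and the keys separate.
        System.out.println(indexFor(spread(a), 16)); // 1
        System.out.println(indexFor(spread(b), 16)); // 2

        System.out.println(spread(null)); // 0, a null key always hashes to 0
    }
}
```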

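Inside putVal, an existing node is treated as holding the same key (e = p) only when the hash values are equal and the keys are either the same reference or equal according to equals. In that case the existing entry's value is overwritten and no new node is added.

This is why deduplication requires overriding both equals and hashCode. The two methods must always be overridden together: if one is overridden, the other must be too. There are two ways to do it, writing them by hand or letting the IDE generate them. With both in place, the keys of the HashMap behind the HashSet are deduplicated. When a key enters a HashSet, its hash is compared first, and only if the hashes match is equals called to confirm, so both methods are needed.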

/**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

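The hashCode method plays the same role in HashMap and HashSet: it produces the hash value, which is then used to locate the element's position in the underlying array. The equals method is what HashSet uses to detect duplicates: if objects A and B have the same hash value and equals returns true, then A and B are duplicates and only one of them is kept.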
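
Both conditions matter, which a small experiment with strings can show. The strings "Aa" and "BB" happen to have the same hashCode but are not equal, so a HashSet keeps both; two distinct String instances with the same content are equal and hash identically, so only one is kept. CollisionDemo and sizeAfterAdding are hypothetical names used only for this sketch.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Adds every value to a fresh HashSet and reports how many elements survive.
    static int sizeAfterAdding(String... values) {
        Set<String> set = new HashSet<>();
        for (String v : values) {
            set.add(v);
        }
        return set.size();
    }

    public static void main(String[] args) {
        // Same hash code (2112 for both), but equals returns false.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(sizeAfterAdding("Aa", "BB"));         // 2

        // Different references, but same hash code and equals returns true.
        System.out.println(sizeAfterAdding("Aa", new String("Aa"))); // 1
    }
}
```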
