Java基础复习-集合

Unisky大约 11 分钟

Java基础复习-集合

0.什么是Java Collection(集合类)

Java的Collection分为两个大类接口，一类是Map，另一类是Collection，这两个接口又依次往下派生了各种集合类型。最主要的四大类型是List, Set, Queue, Map。具体的派生情况如下图所示

List 以索引来存取元素，有序的，元素是允许重复的，可以插入多个null；
Set 不能存放重复元素，无序的，只允许一个null；
Map 保存键值对映射；List 底层实现有数组、链表两种方式；
Set、Map 容器有基于哈希存储和红黑树两种方式实现；
Set 基于 Map 实现，Set 里的元素值就是 Map的键值

在这些派生集合类中，我们最常用的是ArrayList, LinkedList, HashMap, PriorityQueue, Deque, HashSet

集合方法接口

（1）单元素添加、删除操作

//将对象添加给集合
boolean add(Object o)

//如果集合中有与o相匹配的对象，则删除对象o，若有多个匹配则只删第一个
boolean remove(Object o)

（2）查询操作

//返回当前集合中元素的数量
int size()

//判断结合中是否有任何元素
boolean isEmpty()

//查找集合中是否含有元素o
boolean contains(Object o)

//返回一个迭代器，用来访问集合中的各个元素
Iterator iterator()

（3）组操作：作用于元素或整个结合

//查找集合中是否含有集合c中的所有元素
boolean containsAll(Collection c)

//将集合c中的所有元素添加给该集合
boolean addAll(Collectiion c)

//删除集合中所有的元素
void clear()

//从集合中删除集合c中的所有元素
void removeAll(Collection c)

//从集合中删除集合c中不包含的元素
void retainAll(Collection c)

（4）Collection转换为Object数组

///返回一个内含集合所有元素的数组
Object[] toArray()

//返回一个内含集合所有元素的数组。运行期返回的数组和参数a的类型相同
Object[] toArray(Object[] a)

做完了对集合类的简单介绍，下面将针对这些集合类的底层源码做分析

1.ArrayList

ArrayList是我们最常用的一种数据结构，我们通常将其作为动态数组使用。

ArrayList底层通过一个elementData[]数组用于储存数据，在调用add()方法时，会进行容量判断。如果数组可以容纳新元素，则直接将元素加入数组中，无事发生。而如果数组不能够容纳新元素，那么就需要调用grow()方法进行扩容,一般来说ArrayList扩容是扩1.5倍。

public boolean add(E e) {
    //判断是否可以容纳e，若能，则直接添加在末尾；若不能，则进行扩容，然后再把e添加在末尾
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    //将e添加到数组末尾
    elementData[size++] = e;
    return true;
    }

private void grow(int minCapacity) {
        // 获取elementData数组的内存空间长度
        int oldCapacity = elementData.length;
        // 扩容至原来的1.5倍
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        //校验容量是否够
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        //若预设值大于默认的最大值，检查是否溢出
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // 调用Arrays.copyOf方法将elementData数组指向新的内存空间
         //并将elementData的数据复制到新的内存空间
        elementData = Arrays.copyOf(elementData, newCapacity);

2.Vector

与ArrayList大致相同，扩容扩2倍，多了同步机制使得线程安全，但效率较低，不推荐使用。

3.Stack

栈数据结构，但由于继承了Vector类及其所有方法，与栈操作严重冲突，目前已被废弃。在jdk8,官方推荐使用同样具有LIFO功能的双端队列代替栈，实际上Java目前不提供严格意义上的Stack数据结构

//使用Deque代替栈
Deque<E> stack = new ArrayDeque<>();

4.LinkedList

LinkedList同时继承了List和Queue两个接口，这种集合是作为双向链表实现的，线程不安全。

对于index访问的方法，链表需要移动指针在数组中逐渐寻找，效果不如ArrayList

而对于元素的增删，LinkedList则更优，因为无需考虑扩容、复制等操作，仅仅需要修改指针即可。

LinkedList的链表结点使用一个Node类实现,一个Node包括他两侧指向的Node和储存的元素

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

每个LinkedList都可以直接访问首尾的两个Node，通过调用addFirst(),addLast(),getFirst(),getLast()等方法实现

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;

    /**
     * Pointer to first node.
     */
    transient Node<E> first;

    /**
     * Pointer to last node.
     */
    transient Node<E> last;
    //下略...
}

3.5 Queue

Queue是一个接口，该接口要求提供的方法与List也有较大区别，实际提供的方法如下


插入	add(e)	offer(e)
移除	remove()	poll()
获取	element()	peek()

同时Java为了满足线程安全的需求，提供了以下七种阻塞队列(BlockingQueue)

ArrayBlockingQueue
LinkedBlockingQueue
PriorityBlockingQueue
DelayQueue
SynchronousQueue
LinkedTransferQueue
LinkedBlockingDeque

JDK使用通知模式实现阻塞队列。所谓通知模式，就是当生产者往满的队列里添加元素时会阻塞生产者，当消费者消费了一个队列中的元素后，会通知生产者当前队列可用。

4.Deque

Deque是双端队列接口，同时也继承了Queue接口，主要提供了以下特色方法


插入	addFirst(e)	offerFirst(e)	addLast(e)	offerLast(e)
删除	removeFirst()	pollFirst()	removeLast()	pollLast()
获取	getFirst()	peekFirst()	getLast()	peekLast()

5.ArrayDeque

ArrayDeque是实现了Deque的双端队列实现类，用来当作双端队列或者栈，方法如下

操作	栈顶	队首	队尾
插入	push(e)	offerFirst(e)	offer(e)
删除	pop()	poll()	pollLast()
获取	peek()	peek()	peekLast()

不推荐使用add()或者remove()方法，缺乏异常处理机制，用offer()和poll()代替 pop()也常用poll()替代

ArrayDeque使用一个Object[] elements数组来储存队列数据

public class ArrayDeque<E> extends AbstractCollection<E>
                           implements Deque<E>, Cloneable, Serializable
{
    /*
     * VMs excel at optimizing simple array loops where indices are
     * incrementing or decrementing over a valid slice, e.g.
     *
     * for (int i = start; i < end; i++) ... elements[i]
     *
     * Because in a circular array, elements are in general stored in
     * two disjoint such slices, we help the VM by writing unusual
     * nested loops for all traversals over the elements.  Having only
     * one hot inner loop body instead of two or three eases human
     * maintenance and encourages VM loop inlining into the caller.
     */

    /**
     * The array in which the elements of the deque are stored.
     * All array cells not holding deque elements are always null.
     * The array always has at least one null slot (at tail).
     */
    transient Object[] elements;

    /**
     * 指向队列头部的索引
     * 如果 0 <= head < elements.length 等于 tail 则这个双端队列为空
     */
    transient int head;

    /**
     * 指向下一个加入到队列尾部元素的位置（tail-1是当前最后一个元素的位置）
     * elements[tail] 始终 null.
     */
    transient int tail;

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
//方法略
}

6.PriorityQueue

优先队列，往往起到与小根堆、大根堆这类数据结构相同的作用，主要是在加入数据时会自动按照比较器规则排列元素。

使用二叉堆的形式实现，不能够加入不可比较的元素（如空元素）

线程不安全，但是提供了线程安全的PriorityBlockingQueue集合类

public class PriorityQueue<E> extends AbstractQueue<E>
    implements java.io.Serializable {

    @java.io.Serial
    private static final long serialVersionUID = -7720805057305804111L;

    private static final int DEFAULT_INITIAL_CAPACITY = 11;

    /**
     * Priority queue represented as a balanced binary heap: the two
     * children of queue[n] are queue[2*n+1] and queue[2*(n+1)].  The
     * priority queue is ordered by comparator, or by the elements'
     * natural ordering, if comparator is null: For each node n in the
     * heap and each descendant d of n, n <= d.  The element with the
     * lowest value is in queue[0], assuming the queue is nonempty.
     */
    transient Object[] queue; // non-private to simplify nested class access

    /**
     * The number of elements in the priority queue.
     */
    int size;

    /**
     * 比较器，默认为元素自然顺序（数字从小到大，字符按照字典顺序）
     */
    @SuppressWarnings("serial") // Conditionally serializable
    private final Comparator<? super E> comparator;

    /**
     * The number of times this priority queue has been
     * <i>structurally modified</i>.  See AbstractList for gory details.
     */
    transient int modCount;     // non-private to simplify nested class access

//后略
}

在源码中，优先队列的增加操作是如下实现的：

////add调用offer()
public boolean add(E e) {
        return offer(e);
    }

////offer调用siftUp()对堆进行调整，并在插入空元素时抛出异常
public boolean offer(E e) {
        if (e == null)
            throw new NullPointerException();
        modCount++;
        int i = size;
        if (i >= queue.length)
            grow(i + 1);
        siftUp(i, e);
        size = i + 1;
        return true;
    }

////siftUp()根据比较器选择子方法
 private void siftUp(int k, E x) {
        if (comparator != null)
            siftUpUsingComparator(k, x, queue, comparator);
        else
            siftUpComparable(k, x, queue);
    }

////关键主要的添加环节操作如下
    private static <T> void siftUpComparable(int k, T x, Object[] es) {
        Comparable<? super T> key = (Comparable<? super T>) x;
        while (k > 0) {
            //借助无符号数右移计算堆的父节点
            int parent = (k - 1) >>> 1;
            //获取父节点数据，进行比较，根据比较情况确定是否交换
            Object e = es[parent];
            if (key.compareTo((T) e) >= 0)
                break;
            es[k] = e;
            k = parent;
        }
        es[k] = key;
    }

    private static <T> void siftUpUsingComparator(
        int k, T x, Object[] es, Comparator<? super T> cmp) {
        while (k > 0) {
            int parent = (k - 1) >>> 1;
            Object e = es[parent];
            if (cmp.compare(x, (T) e) >= 0)
                break;
            es[k] = e;
            k = parent;
        }
        es[k] = x;
    }

7.HashSet

HashSet实现了Set接口，内部通过HashMap进行实现，HashSet通过成员对象计算hashCode的值

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    @java.io.Serial
    static final long serialVersionUID = -5024744406713321676L;

    transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    static final Object PRESENT = new Object();

    /**
     *创建一个新的空集，默认容量为16
     *默认加载因子是0.75(即到达0.75倍容量时会进行扩容)
     */
    public HashSet() {
        map = new HashMap<>();
    }
}

要注意HashSet只存储Value值，不存在键值对关系，但是HashSet的Value值就是底层HashMap的Key值，而map对应的Value值则是一个固定常量对象PRESENT

//HashSet添加元素
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

默认初始容量为16，默认的初始化加载因子为0.75，即容量超出75%时会触发扩容机制，每次扩到原来的两倍

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;

            return o instanceof Map.Entry<?, ?> e
                    && Objects.equals(key, e.getKey())
                    && Objects.equals(value, e.getValue());
        }
    }

在没有重写hashCode()方法的情况下，系统会根据数据类型采用相应的策略计算哈希值，例如int返回原数值，String根据字符串内容计算哈希值

在哈希表中，数据储存的index与哈希值相关，实际算法如下

    static final int hash(Object key) {
        int h;
        //通过键值的hashCode与对应hashCode右移16位进行异或得到hash值
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

12.LinkedHashMap

继承了HashMap,底层数据结构修改为了双向链表，HashMap的Node类也被继承，并且加入了分别指向前后两个Node的指针，通过这种储存方式可以实现LRU(最近最少使用)的淘汰策略

    public class LinkedHashMap<K,V>
    extends HashMap<K,V>
    implements SequencedMap<K,V>
{

    /**
     * 对原HashMap的Node类继承
     */
    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }

    @java.io.Serial
    private static final long serialVersionUID = 3801124242820219131L;

    /**
     * 旧结点
     */
    transient LinkedHashMap.Entry<K,V> head;

    /**
     * 新结点
     */
    transient LinkedHashMap.Entry<K,V> tail;
}

static final class Entry<K,V> implements Map.Entry<K,V> {
        K key;
        V value;
        Entry<K,V> left;
        Entry<K,V> right;
        Entry<K,V> parent;
        boolean color = BLACK;
}

线程安全

线性安全的集合类：

Vector：比ArrayList多了同步机制。
Hashtable。
ConcurrentHashMap：是一种高效并且线程安全的集合。
Stack：栈，也是线程安全的，继承于Vector。

线性不安全的集合类：

Hashmap
Arraylist
LinkedList
HashSet
TreeSet
TreeMap

Java基础复习-集合

Java基础复习-集合

0.什么是Java Collection(集合类)

集合方法接口

1.ArrayList

2.Vector

3.Stack

4.LinkedList

3.5 Queue

4.Deque

5.ArrayDeque

6.PriorityQueue

7.HashSet

8.LinkedHashSet

9.SortedSet

10.TreeSet

11.HashMap

12.LinkedHashMap

13.HashTable

14.Properties

15.SortedMap

16.TreeMap

线程安全