JKD源代码之HashTable

HashTable是个比较古老的实现,现在已经不推荐使用,虽然它比HashMap多了支持同步,但在同步并发方面ConcurrentHashMap更强,而不考虑同步的话,HashMap效率比它高了不少,正如官方注释一样:“If a thread-safe implementation is not needed, it is recommended to use HashMap in place of Hashtable. If a thread-safe highly-concurrent implementation is desired, then it is recommended to use java.util.concurrent.ConcurrentHashMap in place of Hashtable.”

在get()/put()/remove()等方法上,HashTable都加上了synchronized关键字,以此来实现同步,但熟悉JVM的童鞋都知道,synchronized关键字的代价很高,涉及到内核态和用户态的切换,所有HashTable的效率比较低。

在计算数组下标时,HashTable用了相对比较简单的求模运算,而不像HashMap那样做了移位优化:

1
int index = (hash & 0x7FFFFFFF) % tab.length;

获取:public synchronized V get(Object key)

1
2
3
4
5
6
7
8
9
10
11
12
13
public synchronized V get(Object key) {
Entry<?,?> tab[] = table;
int hash = key.hashCode();
// 计算出数组下标
int index = (hash & 0x7FFFFFFF) % tab.length;
// 遍历该链,找到哈希码和key都相同的元素
for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
if ((e.hash == hash) && e.key.equals(key)) {
return (V)e.value;
}
}
return null;
}

可见,HashTable内部在处理冲突时,只是单纯地用链来实现,而不像HashMap一样,在冲突过多时,使用红黑树来优化。

添加:public synchronized V put(K key, V value)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
public synchronized V put(K key, V value) {
// Make sure the value is not null
// 不允许value为空
if (value == null) {
throw new NullPointerException();
}

// Makes sure the key is not already in the hashtable.
Entry<?,?> tab[] = table;
int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;
@SuppressWarnings("unchecked")
Entry<K,V> entry = (Entry<K,V>)tab[index];
// 遍历现有元素,如果有相同的key,就替换
for(; entry != null ; entry = entry.next) {
if ((entry.hash == hash) && entry.key.equals(key)) {
V old = entry.value;
entry.value = value;
return old;
}
}

// 如果不存在相同的key,就往里边插入一个新的key
addEntry(hash, key, value, index);
return null;
}

HashTable不允许空的key,也不允许空的value。

添加一个新的键值对:private void addEntry(int hash, K key, V value, int index)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
private void addEntry(int hash, K key, V value, int index) {
modCount++;

Entry<?,?> tab[] = table;
if (count >= threshold) {
// Rehash the table if the threshold is exceeded
rehash();

tab = table;
hash = key.hashCode();
index = (hash & 0x7FFFFFFF) % tab.length;
}

// Creates the new entry.
@SuppressWarnings("unchecked")
Entry<K,V> e = (Entry<K,V>) tab[index];
// 将新的键值对插到链头
tab[index] = new Entry<>(hash, key, value, e);
count++;
}

在这里可以看出,当有新的冲突出现时,新的冲突会插在链头(也许是用了程序局部性原理)

删除:public synchronized V remove(Object key)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
public synchronized V remove(Object key) {
Entry<?,?> tab[] = table;
int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;
@SuppressWarnings("unchecked")
// 找到数组下标
Entry<K,V> e = (Entry<K,V>)tab[index];
for(Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
// 寻找待删除的键值对
if ((e.hash == hash) && e.key.equals(key)) {
modCount++;
// 如果不是链头
if (prev != null) {
prev.next = e.next;
// 处在链头
} else {
tab[index] = e.next;
}
count--;
V oldValue = e.value;
e.value = null;
return oldValue;
}
}
return null;
}

重散列算法:protected void rehash()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
protected void rehash() {
int oldCapacity = table.length;
Entry<?,?>[] oldMap = table;

// 同样,容量扩大一倍(默认容量是11)
// overflow-conscious code
int newCapacity = (oldCapacity << 1) + 1;
if (newCapacity - MAX_ARRAY_SIZE > 0) {
if (oldCapacity == MAX_ARRAY_SIZE)
// Keep running with MAX_ARRAY_SIZE buckets
return;
newCapacity = MAX_ARRAY_SIZE;
}
Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

modCount++;
threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
table = newMap;

// 从后往前遍历旧数组
for (int i = oldCapacity ; i-- > 0 ;) {
// 遍历该下标处的冲突链
for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
Entry<K,V> e = old;
old = old.next;

// 计算新下标
int index = (e.hash & 0x7FFFFFFF) % newCapacity;
e.next = (Entry<K,V>)newMap[index];
// 插入新数组
newMap[index] = e;
}
}
}