[Java] Synchronized Collection vs Concurrent Collection

Synchronized Collection

There are representative classes for synchronized collections, as follows.

Vector
Hashtable
Collections.synchronizedXXX

All these classes guarantee concurrency by using the synchronized keyword in their publicly declared methods, thereby ensuring that only one thread can access the internal values at a time.

Vector

```javascript public class Vector extends AbstractList implements List, RandomAccess, Cloneable, java.io.Serializable { ... public synchronized boolean add(E e) { modCount++; add(e, elementData, elementCount); return true; } ... }

Looking at the add() method in the Vector class for adding elements, we can see the synchronized keyword. This means that synchronization is guaranteed when the element insertion operation is performed internally within Vector.

Hashtable

```javascript public class Hashtable<K,V> extends Dictionary<K,V> implements Map<K,V>, Cloneable, java.io.Serializable { ... public synchronized boolean contains(Object value) { if (value == null) { throw new NullPointerException(); }

}

Looking at the contains() method in the Hashtable class, which checks whether the same value exists, we can see that the synchronized keyword is used, just like in the Vector class.

Collections.synchronizedXXX

Let's take a look at the SynchronizedList class created using the Collections.synchronizedList() method.

```javascript static class SynchronizedList extends SynchronizedCollection implements List {

}

Looking at the methods of the SynchronizedList class, we can see that the synchronized keyword is used in all of them. However, it uses a mutex to implement synchronization using a synchronized block. Since all methods share the mutex object, once one thread enters the synchronized block, the synchronized blocks of other methods are all locked.

Problems with Synchronized Collection

Although you may need to use synchronized collections in a multi-threaded environment, it is better to use other synchronization methods whenever possible. There are two main reasons.

When multiple operations are combined and used like a single operation

Synchronized collection classes guarantee concurrency in a multi-threaded environment. However, problems arise when multiple operations need to be combined and used as a single operation. Even using such synchronized collections may not work correctly.

```javascript final List list = Collections.synchronizedList(new ArrayList()); final int nThreads = 2; ExecutorService es = Executors.newFixedThreadPool(nThreads);

for (int i = 0; i < nThreads; i++) { es.execute(new Runnable() {

}

If you execute the above code, an error occurs when Thread A performs remove(0) and Thread B performs clear() at the same time. Therefore, it needs to be wrapped in a synchronized block as follows.

```javascript synchronized (list) { list.clear(); list.add("888"); list.remove(0); }

Performance Degradation

If all methods that want to use shared objects are made synchronized methods, or if the same synchronized block is defined inside the methods, once one thread acquires the lock, other threads cannot use all synchronized methods and become blocked. This repeated phenomenon can lead to performance degradation.

Concurrent Collection

The types of parallel collections provided in the java.util.concurrent package are as follows, and only some of them will be covered in this article.

CopyOnWriteArrayList
- This is a subclass of the List class and is a parallel collection that prioritizes performance for operations that iterate through and query object lists.
ConcurrentMap
- This is a parallel collection, and looking at its interface, operations such as put-if-absent, replace, and conditional remove, which add a new item only if the item to be added does not already exist, are defined.
ConcurrentHashMap
- This is a subclass of ConcurrentMap and is a parallel collection that ensures parallelism by replacing HashMap.
ConcurrentLinkedQueue
- This is a FIFO queue and is a parallel collection that ensures parallelism. If there are no elements to dequeue, it returns immediately and proceeds to perform other tasks.
LinkedBlockingQueue
- This is similar to ConcurrentLinkedQueue. However, if the queue is empty, the operation to dequeue an item from the queue waits until a new item is added, and conversely, if the queue has a specified size, the operation to add a new item to the queue waits until a space becomes available in the queue.
ConcurrentSkipListMap, ConcurrentSkipListSet
- These are advanced forms that improve the parallelism of the SortedMap and SortedSet classes, respectively.

Simply replacing the existing synchronized collection classes with parallel collections can significantly improve overall performance without any significant risk factors.

Let's take a closer look at synchronized collections in contrast to parallel collections.

CopyOnWriteArrayList

There are two ways to create a synchronized ArrayList.

Collections.synchronizedList()
CopyOnWriteArrayList

Collections.synchronizedList() was added in JDK 1.2. This collection is synchronized for all read and write operations, so it can be considered an inflexible design. Therefore, CopyOnWriteArrayList was introduced as an improvement.

Read Operation

SynchronizedList locks the instance itself when performing read and write operations. However, CopyOnWriteArrayList creates a new temporary array by copying the elements in the original array for every write operation, performs the write operation on this temporary array, and then updates the original array. This allows the read operation to be performed without any locking, so it performs better than SynchronizedList.

```javascript public class CopyOnWriteArrayList implements List, RandomAccess, Cloneable, java.io.Serializable {

}

The above is the get() method, which is not locked because it does not have synchronized.

Write Operation

CopyOnWriteArrayList uses an explicit lock when performing a write operation. Ultimately, both types of collections are locked during this operation. In this case, CopyOnWriteArrayList performs array copying, which is relatively expensive, so if a significant number of write operations are performed, performance issues may arise.

```javascript public class CopyOnWriteArrayList implements List, RandomAccess, Cloneable, java.io.Serializable {

}

The above is the add() method, which locks through a synchronized block and performs array copying.

Iterator

CopyOnWriteArrayList iterates based on the collection data at the time the iterator is extracted, and any data added or deleted to the collection during the iteration is reflected in the copy, which is independent of the loop, so there are no issues with concurrent use.

CopyOnWriteArraySet

There are two ways to create a synchronized Set.

Collections.synchronizedSet()
CopyOnWriteArraySet

As the method names suggest, the operation method is almost identical to CopyOnWriteArrayList except for the data structure characteristics.

Read Operation

```javascript public class CopyOnWriteArraySet extends AbstractSet implements java.io.Serializable {

}

The above is the contains() method, and we can see that CopyOnWriteArraySet defines CopyOnWriteArrayList internally and uses its methods.

```javascript public boolean contains(Object o) { return indexOf(o) >= 0; }

public int indexOf(Object o) { Object[] es = getArray(); return indexOfRange(o, es, 0, es.length); }

}

Looking at the contains() method of CopyOnWriteArrayList, we can see that it is not locked.

Write Operation

```javascript public class CopyOnWriteArraySet extends AbstractSet implements java.io.Serializable {

}

The add() method also borrows the method of CopyOnWriteArrayList.

```javascript public boolean addIfAbsent(E e) { Object[] snapshot = getArray(); return indexOfRange(e, snapshot, 0, snapshot.length) < 0 && addIfAbsent(e, snapshot); }

private boolean addIfAbsent(E e, Object[] snapshot) { synchronized (lock) { Object[] current = getArray(); int len = current.length; if (snapshot != current) { // Optimize for lost race to another addXXX operation int common = Math.min(snapshot.length, len); for (int i = 0; i < common; i++) if (current[i] != snapshot[i] && Objects.equals(e, current[i])) return false; if (indexOfRange(e, current, common, len) >= 0) return false; } Object[] newElements = Arrays.copyOf(current, len + 1); newElements[len] = e; setArray(newElements); return true; } }

Looking at the addIfAbsent() method, we can see that it locks during the write operation and performs array copying. Therefore, CopyOnWriteArraySet, like CopyOnWriteArrayList, should not be used for frequent write operations.

ConcurrentHashMap

There are two ways to create a synchronized HashMap.

Collections.synchronizedMap(new HashMap<>())
ConcurrentHashMap

ConcurrentHashMap is a Map based on Hash, just like HashMap. It ensures concurrency in a more efficient way than synchronizedMap.

Prior to Java 8, it used a segment that inherited ReentrantLock, dividing the area and locking each area.

From Java 8 onwards, it uses a method of locking each table bucket independently. If a node is inserted into an empty bucket, it uses a CAS algorithm instead of a lock, and other changes acquire a partial lock (synchronized block) based on the first node of each bucket, minimizing thread contention and ensuring concurrency.

Let's check how concurrency is guaranteed by looking at the code of the putVal() method, which inserts a new node into ConcurrentHashMap. As a reference, the following example code is based on Java 11.

The putVal() method can be broadly divided into two cases (the total number of branches is four).

When inserting a node into an empty hash bucket
When a node already exists in the hash bucket

```javascript final V putVal(K key, V value, boolean onlyIfAbsent) { if (key null || value null) throw new NullPointerException(); int hash = spread(key.hashCode()); int binCount = 0; for (Node<K,V>[] tab = table;;) { Node<K,V> f; int n, i, fh; K fk; V fv; if (tab null || (n = tab.length) 0) tab = initTable(); // (1) else if ((f = tabAt(tab, i = (n - 1) & hash)) null) { // (2) if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value))) break; } else if ((fh = f.hash) MOVED) tab = helpTransfer(tab, f); else if (onlyIfAbsent // check first node without acquiring lock && fh hash && ((fk = f.key) key || (fk != null && key.equals(fk))) && (fv = f.val) != null) return fv; // (3) else { V oldVal = null; synchronized (f) { if (tabAt(tab, i) f) { if (fh >= 0) { binCount = 1; for (Node<K,V> e = f;; ++binCount) { K ek; // (4) if (e.hash hash && ((ek = e.key) key || (ek != null && key.equals(ek)))) { oldVal = e.val; if (!onlyIfAbsent) e.val = value; break; } Node<K,V> pred = e; // (5) if ((e = e.next) null) { pred.next = new Node<K,V>(hash, key, value); break; } } } // (6) else if (f instanceof TreeBin) { Node<K,V> p; binCount = 2; if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) != null) { oldVal = p.val; if (!onlyIfAbsent) p.val = value; } } else if (f instanceof ReservationNode) throw new IllegalStateException("Recursive update"); } } ... } } ... }

When inserting a node into an empty hash bucket

(1) To insert a new node, it checks whether the bucket value (tabAt()) is empty.

```javascript static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) { return (Node<K,V>)U.getObjectAcquire(tab, ((long)i << ASHIFT) + ABASE); }

(2) It accesses the volatile variable contained in the node, compares it to the existing value (null), and if it is the same, it stores the new node. If it is not the same, it goes back to the for loop. This method is the CAS algorithm.

```javascript static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i, Node<K,V> c, Node<K,V> v) { return U.compareAndSetObject(tab, ((long)i << ASHIFT) + ABASE, c, v); }

The CAS algorithm is used to solve atomicity and visibility issues, thus ensuring concurrency.

When a node already exists in the hash bucket

(3) If a node already exists, it uses a synchronized block to control access to only one thread. In this case, since it locks the non-empty Node<K, V> type hash bucket, any thread accessing the same bucket becomes blocked.

(4) Replace with a new node.

(5) Add to Separate Chaining if a hash collision occurs.

(6) Add to a tree if a hash collision occurs.

Reference

Expected Interview Questions and Answers

What are the problems with Vector, HashTable, and Collections.SynchronziedXXX?

The Vector, HashTable, and SynchronziedXxx classes use synchronized methods or blocks and share one lock object. Therefore, when one thread acquires a lock on the collection, other threads cannot use any method and become blocked. This can lead to performance degradation in the application.

What is the difference between SynchronizedList and CopyOnArrayList?

SynchronizedList locks the instance itself when performing read and write operations. However, CopyOnArrayList locks the corresponding block when performing a write operation, creates a new temporary array by copying the elements in the original array, performs the write operation on this temporary array, and then updates the original array. This allows the read operation to be performed without any locking, so it performs better than SynchronizedList for reading. However, write operations are more expensive because they perform array copying, so they perform worse than SynchronizedList for writing.

Therefore, it is more effective to use CopyOnArrayList when there are more read operations than changes.

What is the difference between SynchronizedMap and ConcurrentHashMap?

SynchronziedMap locks the instance itself when performing read and write operations. However, ConcurrentHashMap uses a method of locking each table bucket independently. If a node is inserted into an empty bucket, it uses a CAS algorithm instead of a lock, and other changes acquire a partial lock (synchronized block) based on the bucket that was accessed, minimizing thread contention and ensuring concurrency.

Synchronized Collection

Vector

Hashtable

Collections.synchronizedXXX

Problems with Synchronized Collection

Concurrent Collection

CopyOnWriteArrayList

CopyOnWriteArraySet

ConcurrentHashMap

Reference

Expected Interview Questions and Answers

What are the problems with Vector, HashTable, and Collections.SynchronziedXXX?

What is the difference between SynchronizedList and CopyOnArrayList?

What is the difference between SynchronizedMap and ConcurrentHashMap?

Comments0