How to ensure data consistency in lock-free concurrency programming?

At a time when multi-core computing and distributed systems are popular, lock-free concurrent programming has become a key technology to improve system performance.Unlike traditional lock-based concurrent programming, lock-free concurrent programming abandons the lock mechanism to avoid the performance losses caused by lock competition.However, this also poses challenges to data consistency assurance.Next, we explore in-depth ways to ensure data consistency in lock-free concurrency programming. 1. The cornerstone function of atomic operation Atomic operations are the core means of lock-free concurrency programming to ensure data consistency.Taking Cangjie language as an example, it provides a rich atomic type, such as AtomicInt32, AtomicBool, and AtomicReference.These atomic types of operations are atomic, that is, operations are either executed intact or not at all, and will not be disturbed by other threads. When implementing a simple counter, if normal variables are used, self-increment operations in multi-threaded environments may cause data competition problems.But with the help of atomic types, it can be effectively avoided: let counter = AtomicInt32(0) // Multiple threads can operate on counter safely counter.fetchAdd(1, .relaxed) The .fetchAdd method here is an atomic operation, which increases the counter value atomically to ensure data consistency.At the same time, atomic types also support different memory order semantics, such as .relaxed (loose order), .acquireRelease (get - release semantics), and .sequentiallyConsistent (order consistency).Developers can choose the appropriate semantics based on the specific scenario to find a balance between performance and consistency.For example, in scenarios such as statistical counting, using the .relaxed semantics can achieve better performance; while when global state synchronization is synchronized, the .sequentiallyConsistent semantics need to be used to ensure strict consistency. 2. Application of Comparison and Exchange (CAS) Algorithm Comparison and exchange (CAS) algorithms are an important cornerstone of lock-free data structures and algorithms.The CAS operation contains three parameters: memory location, expected value and new value.It will first check whether the value of the memory location is consistent with the expected value. If it is consistent, the new value will be written. If it is inconsistent, the write operation will not be performed, and the result of whether the operation is successful is returned. When implementing lock-free queues, the CAS algorithm can ensure the consistency of queue operations in a multi-threaded environment.Suppose we have a lock-free queue based on linked list implementation, and the tail pointer of the linked list needs to be modified when joining the queue: // Simplified lock-free queue node definition struct QueueNode { var value: T var next: AtomicReference } // Lockless queue definition class NonBlockingQueue { private let head: AtomicReference private let tail: AtomicReference init() { let dummy = QueueNode(value: nil, next: AtomicReference(nil)) head = AtomicReference(dummy) tail = AtomicReference(dummy) } func enqueue(value: T) { let newNode = QueueNode(value: value, next: AtomicReference(nil)) while true { let currentTail = tail.load(.acquireRelease) let next = currentTail.next.load(.acquireRelease) if currentTail === tail.load(.acquireRelease) { if next == nil { if currentTail.next.compareExchange(expected: nil, desired: newNode, order:.acquireRelease) { tail.compareExchange(expected: currentTail, desired: newNode, order:.acquireRelease) return } } else { tail.compareExchange(expected: currentTail, desired: next, order:.acquireRelease) } } } } } In this enqueue operation, the CAS operation ensures that the tail pointer update is atomic, avoiding the inconsistency problem that may occur when multiple threads enqueue at the same time. 3. Design considerations for lock-free data structures A carefully designed lock-free data structure is the key to ensuring data consistency.Taking the lock-free hash table as an example, to ensure data consistency, segmented locks or similar mechanisms are usually used to reduce lock competition.In the ConcurrentHashMap of Cangjie Language, by reasonably setting the segmentation strategy, the hash table is divided into multiple segments, and each segment is independently synchronously controlled: let map = ConcurrentHashMap( concurrencyLevel: 16 // Matches the number of CPU cores ) When performing insert, delete or query operations, the thread only needs to lock the corresponding segment instead of the entire hash table, th

May 10, 2025 - 06:40

How to ensure data consistency in lock-free concurrency programming?

At a time when multi-core computing and distributed systems are popular, lock-free concurrent programming has become a key technology to improve system performance.Unlike traditional lock-based concurrent programming, lock-free concurrent programming abandons the lock mechanism to avoid the performance losses caused by lock competition.However, this also poses challenges to data consistency assurance.Next, we explore in-depth ways to ensure data consistency in lock-free concurrency programming.

1. The cornerstone function of atomic operation

Atomic operations are the core means of lock-free concurrency programming to ensure data consistency.Taking Cangjie language as an example, it provides a rich atomic type, such as AtomicInt32, AtomicBool, and AtomicReference.These atomic types of operations are atomic, that is, operations are either executed intact or not at all, and will not be disturbed by other threads.

When implementing a simple counter, if normal variables are used, self-increment operations in multi-threaded environments may cause data competition problems.But with the help of atomic types, it can be effectively avoided:

let counter = AtomicInt32(0)
// Multiple threads can operate on counter safely
counter.fetchAdd(1, .relaxed)

The .fetchAdd method here is an atomic operation, which increases the counter value atomically to ensure data consistency.At the same time, atomic types also support different memory order semantics, such as .relaxed (loose order), .acquireRelease (get - release semantics), and .sequentiallyConsistent (order consistency).Developers can choose the appropriate semantics based on the specific scenario to find a balance between performance and consistency.For example, in scenarios such as statistical counting, using the .relaxed semantics can achieve better performance; while when global state synchronization is synchronized, the .sequentiallyConsistent semantics need to be used to ensure strict consistency.

2. Application of Comparison and Exchange (CAS) Algorithm

Comparison and exchange (CAS) algorithms are an important cornerstone of lock-free data structures and algorithms.The CAS operation contains three parameters: memory location, expected value and new value.It will first check whether the value of the memory location is consistent with the expected value. If it is consistent, the new value will be written. If it is inconsistent, the write operation will not be performed, and the result of whether the operation is successful is returned.

When implementing lock-free queues, the CAS algorithm can ensure the consistency of queue operations in a multi-threaded environment.Suppose we have a lock-free queue based on linked list implementation, and the tail pointer of the linked list needs to be modified when joining the queue:

// Simplified lock-free queue node definition
struct QueueNode {
    var value: T
    var next: AtomicReference?>
}

// Lockless queue definition
class NonBlockingQueue {
    private let head: AtomicReference?>
    private let tail: AtomicReference?>

    init() {
        let dummy = QueueNode(value: nil, next: AtomicReference(nil))
        head = AtomicReference(dummy)
        tail = AtomicReference(dummy)
    }

    func enqueue(value: T) {
        let newNode = QueueNode(value: value, next: AtomicReference(nil))
        while true {
            let currentTail = tail.load(.acquireRelease)
            let next = currentTail.next.load(.acquireRelease)
            if currentTail === tail.load(.acquireRelease) {
                if next == nil {
                    if currentTail.next.compareExchange(expected: nil, desired: newNode, order:.acquireRelease) {
                        tail.compareExchange(expected: currentTail, desired: newNode, order:.acquireRelease)
                        return
                    }
                } else {
                    tail.compareExchange(expected: currentTail, desired: next, order:.acquireRelease)
                }
            }
        }
    }
}

In this enqueue operation, the CAS operation ensures that the tail pointer update is atomic, avoiding the inconsistency problem that may occur when multiple threads enqueue at the same time.

3. Design considerations for lock-free data structures

A carefully designed lock-free data structure is the key to ensuring data consistency.Taking the lock-free hash table as an example, to ensure data consistency, segmented locks or similar mechanisms are usually used to reduce lock competition.In the ConcurrentHashMap of Cangjie Language, by reasonably setting the segmentation strategy, the hash table is divided into multiple segments, and each segment is independently synchronously controlled:

let map = ConcurrentHashMap(
concurrencyLevel: 16 // Matches the number of CPU cores
)

When performing insert, delete or query operations, the thread only needs to lock the corresponding segment instead of the entire hash table, thereby reducing lock competition, improving concurrency performance while ensuring data consistency.In addition, lock-free data structures are designed to consider memory management and cache consistency issues.For example, cache line filling technology avoids pseudo-sharing, reduces performance losses caused by multiple threads accessing the same cache line at the same time, and indirectly ensures data consistency.

4. Strategies to solve ABA problems

The ABA problem is an important factor affecting data consistency in lock-free concurrency programming.When a value changes from A to B and then back to A, the CAS-based operation may mistakenly assume that the value has not changed and perform an incorrect operation.To solve this problem, a common approach is to use atomic references with version numbers.

In Cangjie language, you can define a reference type with a version number:

struct VersionedRef {
    var value: T
    var version: Int64
}

let ref = AtomicReference>(...)

Each time the referenced value is modified, the version number is incremented at the same time.When performing CAS operations, you must not only compare values, but also version numbers. You must only perform update operations when both match, thereby effectively avoiding ABA problems and ensuring data consistency.

5. The auxiliary role of memory barrier

Memory barrier is a synchronization primitive used to ensure the order and visibility of specific operations, and plays an auxiliary but important role in ensuring data consistency in lock-free concurrency programming.Different hardware platforms have different memory barrier instructions, such as ARMv9's DMB ISH, x86's MFENCE, and RISC-V's fence rw, rw, etc.

In Cangjie language, the compiler will automatically insert a suitable memory barrier based on the target platform.For example, when implementing a lock-free singleton mode, memory barriers may be used to ensure thread safety and data consistency of the initialization process:

class Singleton {
    private static var instance: AtomicReference = AtomicReference(nil)

    private init() {}

    static func getInstance() -> Singleton {
        var result = instance.load(.acquireRelease)
        if result == nil {
            let temp = Singleton()
            if instance.compareExchange(expected: nil, desired: temp, order:.acquireRelease) {
                result = temp
            } else {
                result = instance.load(.acquireRelease)
            }
        }
        return result!
    }
}

The memory barrier here ensures the order and visibility of the initialization and assignment operations of singleton objects in a multi-threaded environment, avoiding multiple threads creating singleton objects at the same time and ensuring data consistency.