Use cases and implementations of distributed locks

Posted Jun 15, 2020 · 13 min read

Why use a distributed lock?


Before discussing this issue, let's take a look at a business scenario:

Inventory oversold problem

System A is an e-commerce system, currently deployed on a single machine. The system has an interface for users to place orders, but before placing an order it must check the inventory to make sure there is enough stock.

Because the system has a certain level of concurrency, the product inventory is stored in redis in advance, and the inventory in redis is updated when a user places an order.
The system architecture is as follows:

However, this creates a problem: suppose at some moment the inventory of a product in redis is 1, and two requests arrive at the same time. One of them has executed step 3 in the figure above (updating the database inventory to 0), but has not yet executed step 4.

Meanwhile, the other request reaches step 2, sees that the inventory is still 1, and proceeds to step 3.

As a result, two units are sold when there is actually only one in stock.

Clearly something is wrong! This is the classic "inventory oversold" problem.

At this point a solution comes to mind easily: lock steps 2, 3, and 4, so that one thread must finish all of them before another thread can start step 2.

Following the diagram above, acquire a lock with synchronized or ReentrantLock (both provided by Java) before step 2, and release it after step 4 completes.

In this way, steps 2, 3, and 4 are "locked", and multiple threads can only execute them serially.
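To make this concrete, here is a minimal single-JVM sketch using ReentrantLock (the class, method, and helper names are illustrative, not from the original system):

import java.util.concurrent.locks.ReentrantLock;

public class OrderService {
    private final ReentrantLock stockLock = new ReentrantLock();

    public boolean placeOrder(String productId) {
        stockLock.lock();          // steps 2..4 run under one lock
        try {
            int stock = getStockFromRedis(productId);   // step 2: check stock
            if (stock <= 0) {
                return false;                           // sold out
            }
            updateStockInDb(productId, stock - 1);      // step 3: update the database
            updateStockInRedis(productId, stock - 1);   // step 4: update redis
            return true;
        } finally {
            stockLock.unlock();
        }
    }

    // hypothetical helpers standing in for the real data access code
    private int getStockFromRedis(String productId) { return 1; }
    private void updateStockInDb(String productId, int stock) {}
    private void updateStockInRedis(String productId, int stock) {}
}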

Service cluster deployment

But the good times did not last. The concurrency of the whole system soared and one machine could no longer handle the load, so a second machine was added, as shown below:

After adding the machine, the system becomes what is shown above. My god!

Suppose the requests of two users arrive at the same time but land on different machines. These two requests can then execute simultaneously, and the inventory oversold problem appears again.

why?

  • Because the two A systems in the figure above run in two different JVMs, the locks they acquire are only valid for threads inside their own JVM; they have no effect on threads in the other JVM.
  • So the root of the problem is that Java's native lock mechanism fails in a multi-machine deployment: the two machines are not acquiring the same lock (the two locks live in different JVMs).
  • So, if we can guarantee that the two machines acquire the same lock, wouldn't the problem be solved?

This is where the distributed lock comes on stage. The idea of the distributed lock is:

Provide a single, globally unique "thing" for acquiring locks in the entire system. Whenever any part of the system needs to take a lock, it asks this "thing" for one, so that what the different systems acquire can be considered the same lock.

As for this "thing", it can be Redis, Zookeeper, or a database.

The text description is not very intuitive, let's look at the following picture:

Through the above analysis we know that Java's native lock mechanism cannot guarantee thread safety for the inventory oversold scenario under a distributed deployment, so a distributed lock solution is needed.

So, how do we implement distributed locks? Read on!

Implementation of distributed lock based on Redis

The analysis above covered why you need distributed locks; now let's look at how to handle them. Extension: [How does Redisson implement distributed locks?](http://mp.weixin.qq.com/s?__biz=MzI4Njc5NjM1NQ==&mid=2247493252&idx=2&sn=5530b330af0e0bcb56f9cc8bd7d0a25d&chksm=ebd5d9a8dca250be07d54c3103ccc2)

The most common solution is to use Redis as the distributed lock.

The idea of using Redis as a distributed lock is roughly this: setting a value in redis means the lock has been acquired, and deleting the key releases the lock.

The specific code is this:

//Acquire the lock
//NX: the set succeeds only if the key does not exist; PX specifies the expiration time in milliseconds
SET anyLock unique_value NX PX 30000

//Release the lock: by executing a lua script
//Releasing involves two instructions (get and del), which are not atomic on their own
//Redis's lua script support is needed here: redis executes a lua script atomically
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end

There are several important points in this way:

  • Be sure to use the SET key value NX PX milliseconds command

    If not, you would set the value first and then set the expiration time separately; that is not an atomic operation. The process could crash before the expiration time is set, leaving the key there forever and causing a deadlock.

  • value must be unique

    This is so that, when unlocking, the key is deleted only after verifying that its value matches the one set by the lock holder.

    It avoids the following situation: suppose client A acquired the lock with a 30s expiration. After 35s the lock has already expired and been released automatically, and client B may have acquired it in the meantime. When A then releases "its" lock, it must not delete B's lock; checking the unique value prevents that. A minimal sketch combining both points follows this list.
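Putting the acquire and release halves together, here is a minimal sketch using the Jedis client (the class name, key names, and timeout are illustrative, and error handling is omitted):

import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLock {
    private static final String UNLOCK_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "else " +
            "  return 0 " +
            "end";

    private final Jedis jedis;

    public RedisLock(Jedis jedis) { this.jedis = jedis; }

    // returns the unique value on success, null if the lock is held by someone else
    public String tryLock(String key, long expireMillis) {
        String uniqueValue = UUID.randomUUID().toString();
        // NX: only set if absent, PX: expiration in milliseconds -- one atomic command
        String result = jedis.set(key, uniqueValue, SetParams.setParams().nx().px(expireMillis));
        return "OK".equals(result) ? uniqueValue : null;
    }

    // deletes the key only if it still holds our unique value (atomic via lua)
    public boolean unlock(String key, String uniqueValue) {
        Object result = jedis.eval(UNLOCK_SCRIPT,
                Collections.singletonList(key),
                Collections.singletonList(uniqueValue));
        return Long.valueOf(1L).equals(result);
    }
}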

Besides how the client implements the lock, you also need to consider how redis itself is deployed.

Redis has 3 deployment methods:

  • Stand-alone mode
  • master-slave + sentinel election mode
  • redis cluster mode

The disadvantage of using redis as a distributed lock: in the stand-alone deployment mode there is a single point of failure. As soon as redis goes down, locking stops working.

In master-slave mode, the lock is only written to one node (the master) when locking. Even with high availability via sentinel, if the master fails and a master-slave switch occurs before the lock is replicated, the lock may be lost.

With the above in mind, the author of redis also considered this problem and proposed the RedLock algorithm, which roughly works like this:

Suppose there are a total of 5 redis master nodes (in the RedLock proposal these are independent redis instances rather than one cluster). A lock is obtained through the following steps:

  • Get the current timestamp in milliseconds
  • Try to create the lock on each master node in turn, with a short per-node timeout (generally tens of milliseconds) so that a failed node is skipped quickly
  • Consider the lock established only when it was created on a majority of the nodes, e.g. 3 out of 5 (n/2 + 1)
  • The client then computes the total time spent establishing the lock; acquisition counts as successful only if that time is less than the lock's validity time
  • If establishing the lock fails, delete the lock from all of the nodes in turn
  • As long as someone else holds the distributed lock, keep polling to try to acquire it (see the sketch after this list)
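To make the steps concrete, here is a compressed sketch of the majority-acquisition logic, assuming five independent Jedis clients (one per master node); per-node timeouts, clock-drift allowance, and retry backoff are left out, and the rollback should really use the value-checked lua delete shown earlier:

import java.util.List;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedLockSketch {

    // try to acquire the lock on a majority of nodes; returns the lock value on success, null on failure
    public static String tryRedLock(List<Jedis> nodes, String key, long validityMillis) {
        String value = UUID.randomUUID().toString();
        long start = System.currentTimeMillis();

        int acquired = 0;
        for (Jedis node : nodes) {
            try {
                if ("OK".equals(node.set(key, value, SetParams.setParams().nx().px(validityMillis)))) {
                    acquired++;
                }
            } catch (Exception ignored) {
                // an unreachable node simply does not count toward the majority
            }
        }

        long elapsed = System.currentTimeMillis() - start;
        // success requires a majority (n/2 + 1) AND finishing within the lock's validity window
        if (acquired >= nodes.size() / 2 + 1 && elapsed < validityMillis) {
            return value;
        }

        // failed: roll back on every node (a real implementation would use the value-checked delete)
        for (Jedis node : nodes) {
            try { node.del(key); } catch (Exception ignored) {}
        }
        return null;
    }
}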

However, this algorithm remains quite controversial: there may still be problems with it, and it cannot be guaranteed that the locking process is always correct.

Another way: Redisson

Besides implementing Redis distributed locks yourself on top of a redis client's native API, you can also use an open source framework: Redisson.

Redisson is an enterprise-grade open source Redis client that also provides distributed lock support. I highly recommend it. Why?

Recall from above: if we set the value ourselves through redis, it is done with the following command.

  • SET anyLock unique_value NX PX 30000

The timeout set here is 30s. If the business logic takes more than 30s, the key expires and another thread may acquire the lock.

Then the first thread has not finished its business logic while a second thread has already come in, and we are back to thread-safety problems. So we would have to maintain this expiration time ourselves, which is too much trouble~

So how does redisson handle this? First, feel how clean using redisson is:

Config config = new Config();
config.useClusterServers()
.addNodeAddress("redis://192.168.31.101:7001")
.addNodeAddress("redis://192.168.31.101:7002")
.addNodeAddress("redis://192.168.31.101:7003")
.addNodeAddress("redis://192.168.31.102:7001")
.addNodeAddress("redis://192.168.31.102:7002")
.addNodeAddress("redis://192.168.31.102:7003");

RedissonClient redisson = Redisson.create(config);


RLock lock = redisson.getLock("anyLock");
lock.lock();
lock.unlock();

It is that simple: we only need the lock and unlock calls of its API to complete a distributed lock. And it has taken care of many details for us:

  • All of redisson's lock instructions are executed via lua scripts, and redis executes a lua script atomically

  • Redisson sets the default expiration time of the lock key to 30s. What if a client holds the lock for more than 30s?

    Redisson has a "watchdog" concept: after you acquire the lock, it resets the key's timeout back to 30s every 10 seconds for as long as the lock is held (a usage sketch follows this list).

    This way, even if the lock is held for a long time, the key does not expire and no other thread can steal the lock.

  • Redisson's "watchdog" logic also ensures that no deadlock occurs.

    (If the machine goes down, the watchdog goes down with it. The key's expiration time is then no longer extended, the key expires automatically after 30s, and other threads can acquire the lock.)
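The following sketch shows the main locking variants (the 5s and 10s values are illustrative; the behavior of variant 2, skipping the watchdog when an explicit lease time is passed, matches the tryAcquireAsync code shown below):

import java.util.concurrent.TimeUnit;
import org.redisson.api.RLock;

// assuming 'redisson' is the RedissonClient created earlier
RLock lock = redisson.getLock("anyLock");

// variant 1: no lease time given -- the watchdog keeps renewing the 30s default lease
lock.lock();
try {
    // ... business logic ...
} finally {
    lock.unlock();
}

// variant 2: explicit 10s lease time -- no watchdog, the key simply expires after 10s
lock.lock(10, TimeUnit.SECONDS);

// variant 3: wait at most 5s to acquire, hold for at most 10s (throws InterruptedException)
boolean acquired = lock.tryLock(5, 10, TimeUnit.SECONDS);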

The implementation code, slightly abridged, is posted here:

//Lock logic
private <T> RFuture<Long> tryAcquireAsync(long leaseTime, TimeUnit unit, final long threadId) {
    if(leaseTime != -1) {
        return tryLockInnerAsync(leaseTime, unit, threadId, RedisCommands.EVAL_LONG);
    }
    //Call a lua script to set the key and its expiration time
    RFuture<Long> ttlRemainingFuture = tryLockInnerAsync(commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(), TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_LONG);
    ttlRemainingFuture.addListener(new FutureListener<Long>() {
        @Override
        public void operationComplete(Future<Long> future) throws Exception {
            if(!future.isSuccess()) {
                return;
            }

            Long ttlRemaining = future.getNow();
            //lock acquired
            if(ttlRemaining == null) {
            //Watchdog logic
                scheduleExpirationRenewal(threadId);
            }
        }
    });
    return ttlRemainingFuture;
}

<T> RFuture<T> tryLockInnerAsync(long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    internalLockLeaseTime = unit.toMillis(leaseTime);

    return commandExecutor.evalWriteAsync(getName(), LongCodec.INSTANCE, command,
            "if(redis.call('exists', KEYS\[1\]) == 0) then "+
                    "redis.call('hset', KEYS\[1\], ARGV\[2\], 1); "+
                    "redis.call('pexpire', KEYS\[1\], ARGV\[1\]); "+
                    "return nil; "+
                    "end; "+
                    "if(redis.call('hexists', KEYS\[1\], ARGV\[2\]) == 1) then "+
                    "redis.call('hincrby', KEYS\[1\], ARGV\[2\], 1); "+
                    "redis.call('pexpire', KEYS\[1\], ARGV\[1\]); "+
                    "return nil; "+
                    "end; "+
                    "return redis.call('pttl', KEYS\[1\]);",
            Collections.<Object>singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
}


//The watchdog will eventually call here
private void scheduleExpirationRenewal(final long threadId) {
    if(expirationRenewalMap.containsKey(getEntryName())) {
        return;
    }

    //This task will be delayed for 10s
    Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
        @Override
        public void run(Timeout timeout) throws Exception {

            //This operation will reset the key expiration time to 30s
            RFuture<Boolean> future = renewExpirationAsync(threadId);

            future.addListener(new FutureListener<Boolean>() {
                @Override
                public void operationComplete(Future<Boolean> future) throws Exception {
                    expirationRenewalMap.remove(getEntryName());
                    if(!future.isSuccess()) {
                        log.error("Can't update lock "+ getName() +" expiration", future.cause());
                        return;
                    }

                    if(future.getNow()) {
                    //reschedule itself
                    //Call this method recursively to extend the expiration time in an infinite loop
                        scheduleExpirationRenewal(threadId);
                    }
                }
            });
        }

    }, internalLockLeaseTime/3, TimeUnit.MILLISECONDS);

    if(expirationRenewalMap.putIfAbsent(getEntryName(), new ExpirationEntry(threadId, task)) != null) {
        task.cancel();
    }
}

In addition, redisson also provides support for the redlock algorithm. Its usage is very simple:

    RedissonClient redisson = Redisson.create(config);
    // in a real redlock setup, each lock would come from a client connected to an independent redis instance
    RLock lock1 = redisson.getFairLock("lock1");
    RLock lock2 = redisson.getFairLock("lock2");
    RLock lock3 = redisson.getFairLock("lock3");
    RedissonRedLock multiLock = new RedissonRedLock(lock1, lock2, lock3);
    multiLock.lock();
    multiLock.unlock();

summary:

This section analyzed the concrete approach of using redis as a distributed lock, along with some of its limitations. It then introduced the redis client framework redisson, which is what I recommend: it takes care of many more details than code you would write yourself.

Implementation of distributed lock based on zookeeper

Among the common distributed lock implementations, besides redis, zookeeper can also be used to implement distributed locks.

Before introducing how zookeeper (abbreviated as zk below) implements distributed locks, let me briefly introduce what zk is:

Zookeeper is a centralized service that provides configuration management, distributed coordination, and naming.

The zk model is this: zk contains a tree of nodes called znodes, like a file system, where each znode represents a directory. Znodes have some useful features, illustrated in a short sketch after this list:

  • Ordered (sequential) node: if there is a parent node /lock, we can create child nodes under it.

    Zookeeper provides an optional sequential feature. For example, we can create the child node "/lock/node-" and mark it as sequential; zookeeper then automatically appends an integer sequence number, based on the current number of child nodes, when it creates the node.

    In other words, the first child node created becomes /lock/node-0000000000, the next one /lock/node-0000000001, and so on.

  • Ephemeral (temporary) node: a client can create an ephemeral node, and zookeeper automatically deletes it when the session ends or times out.

  • Event listening (watches): when reading data, we can set a watch on the node at the same time. When the node's data or structure changes, zookeeper notifies the client. Zookeeper currently supports the following four events:

  • Node creation

  • Node deletion

  • Node data modification

  • Child node changes
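Here is a short sketch of how these features look in the raw zookeeper client API (the server address and paths are illustrative, and connection handling is simplified):

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// connect (the third argument is the default watcher)
ZooKeeper zk = new ZooKeeper("192.168.31.101:2181", 30000, event -> {});

// ordered + ephemeral: zk appends the sequence number itself and deletes the node when the session ends
String path = zk.create("/lock/node-", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
// e.g. path == "/lock/node-0000000003"

// event listening: watch a node; the callback fires once when it is changed or deleted
zk.exists("/lock/node-0000000002", event -> System.out.println("event: " + event.getType()));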

Based on these characteristics of zk, we can easily come up with a concrete scheme for implementing distributed locks with zk:

  1. Using zk's ephemeral and sequential nodes, each thread acquires the lock by creating an ephemeral sequential node in zk, for example under the /lock/ directory.

  2. After the node is created successfully, fetch all child nodes under the /lock directory and check whether the node created by the current thread has the smallest sequence number of all nodes.

  3. If the node created by the current thread is the one with the smallest sequence number, the lock is considered acquired.

  4. If it is not the node with the smallest sequence number, add an event listener to the node whose sequence number immediately precedes it.

    For example, if the node created by the current thread is /lock/003 and the full node list is [/lock/001, /lock/002, /lock/003], then add an event listener to the /lock/002 node.

When a lock is released, the node with the next sequence number is woken up and re-executes step 3, checking whether its own sequence number is now the smallest.

For example, /lock/001 is released and /lock/002 receives the notification. The node set is now [/lock/002, /lock/003], so /lock/002 has the smallest sequence number and acquires the lock.

The whole process is as follows:

That is the concrete implementation idea. The code itself is not complicated; a minimal sketch follows.
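This sketch uses the raw zookeeper client directly; error handling, connection retries, and session-expiry handling are omitted, and the /lock parent node is assumed to already exist:

import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkLockSketch {
    private final ZooKeeper zk;
    private String ourPath;

    public ZkLockSketch(ZooKeeper zk) { this.zk = zk; }

    public void lock() throws Exception {
        // step 1: create an ephemeral sequential node under /lock
        ourPath = zk.create("/lock/node-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        String ourName = ourPath.substring("/lock/".length());

        while (true) {
            // step 2: list all children and sort them by sequence number
            List<String> children = zk.getChildren("/lock", false);
            Collections.sort(children);

            int ourIndex = children.indexOf(ourName);
            if (ourIndex == 0) {
                return; // step 3: smallest node -> lock acquired
            }

            // step 4: watch the node immediately before ours and wait for it to disappear
            String previous = "/lock/" + children.get(ourIndex - 1);
            CountDownLatch latch = new CountDownLatch(1);
            if (zk.exists(previous, event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    latch.countDown();
                }
            }) != null) {
                latch.await();
            }
            // previous node is gone: loop and re-check whether we are now the smallest
        }
    }

    public void unlock() throws Exception {
        zk.delete(ourPath, -1); // releasing the lock wakes up the next watcher
    }
}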

Introduction to Curator

Curator is an open source zookeeper client that also provides a distributed lock implementation.

Its usage is fairly simple:

InterProcessMutex interProcessMutex = new InterProcessMutex(client, "/anyLock");
interProcessMutex.acquire();
interProcessMutex.release();
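In practice you would wrap the acquire/release pair in try/finally; acquire also has a timed variant. A short sketch (the 5-second timeout is illustrative):

import java.util.concurrent.TimeUnit;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;

// assuming 'client' is a started CuratorFramework instance, as above
InterProcessMutex mutex = new InterProcessMutex(client, "/anyLock");
if (mutex.acquire(5, TimeUnit.SECONDS)) {   // wait at most 5s for the lock
    try {
        // ... business logic ...
    } finally {
        mutex.release();
    }
}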

The core source code for implementing distributed locks is as follows:

private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception
{
    boolean haveTheLock = false;
    boolean doDelete = false;
    try
    {
        if ( revocable.get() != null )
        {
            client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
        }

        while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock )
        {
            //Get the sorted collection of all current nodes
            List<String> children = getSortedChildren();
            //Get the name of the current node, relative to the base path
            String sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash

            //Determine whether the current node is the smallest node
            PredicateResults predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
            if ( predicateResults.getsTheLock() )
            {
                //Got the lock
                haveTheLock = true;
            }
            else
            {
                //The lock is not acquired: register a listener on the node preceding the current node
                String previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();

                synchronized(this)
                {
                    Stat stat = client.checkExists().usingWatcher(watcher).forPath(previousSequencePath);
                    if ( stat != null )
                    {
                        if ( millisToWait != null )
                        {
                            millisToWait -= (System.currentTimeMillis() - startMillis);
                            startMillis = System.currentTimeMillis();
                            if ( millisToWait <= 0 )
                            {
                                doDelete = true;    //timed out - delete our node
                                break;
                            }

                            wait(millisToWait);
                        }
                        else
                        {
                            wait();
                        }
                    }
                    //Else it may have been deleted (i.e. lock released); try to acquire again
                }
            }
        }
    }
    catch ( Exception e )
    {
        doDelete = true;
        throw e;
    }
    finally
    {
        if ( doDelete )
        {
            deleteOurPath(ourPath);
        }
    }

    return haveTheLock;
}

In fact, the underlying principle of Curator's distributed lock is similar to what we analyzed above. Here is a picture describing the principle in detail:

summary:

This section introduced zookeeper's approach to implementing distributed locks, the basic use of zk's open source client Curator, and briefly described the implementation principle. For more, see: [Hands-on ZooKeeper distributed lock implementation, with examples!](http://mp.weixin.qq.com/s?__biz=MzI4Njc5NjM1NQ==&mid=2247493195&idx=1&sn=84e2caa2f5364db02bfb82a8d9f421e5&chksm=ebd5d967dca250710d905dc9c69529c16bdd)

Comparison of advantages and disadvantages of the two schemes

After learning two implementations of distributed locks, what needs to be discussed in this section are the advantages and disadvantages of the redis and zk implementations.

Redis distributed locks have the following disadvantages:

  • The way locks are acquired is simple and crude: if the lock cannot be acquired, the client keeps retrying, which costs performance.
  • Redis's design means its data is not strongly consistent, so in some extreme cases problems can occur; the lock model is not robust enough.
  • Even an implementation of the redlock algorithm cannot be guaranteed 100% problem-free in complex scenarios. For a discussion of redlock, see Martin Kleppmann's "How to do distributed locking".

On the other hand, implementing distributed locks with redis is very common in many companies, and in most cases you will not run into the so-called "extremely complex scenarios".

So using redis as a distributed lock is also a good solution. Most importantly, redis has high performance and can support highly concurrent lock acquire and release operations.

For zk distributed locks:

  • Zookeeper is designed from the ground up for distributed coordination and strong consistency. Its lock model is robust, easy to use, and well suited to distributed locks.
  • If the lock cannot be acquired, you only need to add a listener instead of polling all the time, so the performance cost is small.

However, zk has shortcomings too: if many clients frequently request and release locks, the pressure on the zk cluster becomes considerable.

summary:

In summary, both redis and zookeeper have their advantages and disadvantages. We can use these questions as reference factors when making technical selections.

Suggestions

The previous analysis covered two common solutions for implementing distributed locks, redis and zookeeper, each with its own strengths. How should you choose?

Personally, I prefer the lock implemented with zk:

Because redis may have hidden dangers that can lead to incorrect data. However, the choice also depends on the concrete scenario in your company.

If the company already has zk clusters available, prefer implementing the lock with zk. But if the company only has redis clusters and is not in a position to operate a zk cluster, then implementing it with redis works too.

Likewise, the system designer may simply not want to introduce another external dependency when the system already uses redis, in which case redis is a reasonable choice.

In the end it is up to the system designer to weigh, based on the architecture at hand.