Surprisingly, even simple thread pool usage hides so many pitfalls!

Posted Jun 29, 2020, 8 min read

Caught in a pitfall again

We run a reconciliation system in production. Every day it downloads reconciliation files from the channel and then starts reconciliation at the end of the day. The system had been running smoothly for a long time, but two days ago I suddenly received an SMS alert: the channel reconciliation file had not been obtained.

P.S. For a detailed implementation of the reconciliation system, see: Design and implementation of reconciliation system

At first I thought the channel side was up to something again. After some investigation, I found that all the download tasks were blocked. Digging further into the source code, I realized I had been misusing a thread pool method all along.

Because creating threads is relatively expensive, real projects use thread pools to run asynchronous tasks. A thread pool uses pooling to keep thread objects around: a thread is taken from the pool when needed and returned after use.

Although a thread pool is simple to use, the simpler something looks, the easier it is to fall into a pitfall. Counting back, thread pools have caused several incidents for us over the years.

So today Xiaohei will walk through the pitfalls you can hit when using thread pools.

I hope you can steer clear of all of them after reading~

Look at it first, then develop a habit. WeChat search for "Program Tongshi", follow up and you're done!

Use the Executors utility class with caution

Java has shipped a thread pool implementation class, ThreadPoolExecutor, since JDK 1.5. We only need to pass the relevant parameters to its constructor to create a thread pool.

However, the thread pool constructor is rather complicated. Even the simplest overload takes 5 parameters, which is not friendly to beginners.

Perhaps the JDK developers considered this too, so they thoughtfully provided a utility class, Executors, for creating thread pools quickly.

This utility class is indeed convenient and saves a lot of code, but Xiaohei still recommends creating thread pools manually in production systems and using Executors with caution, especially its two methods Executors#newFixedThreadPool and Executors#newCachedThreadPool.

A thread pool created with either of these methods is a time bomb, and it may bring down your production system any day.

Let's look at two examples to see what can go wrong with these two methods.

Suppose our application has a batch endpoint and each request downloads 1,000,000 files. Here we use Executors#newFixedThreadPool to download them in batch.

In the method below, a random sleep simulates the time a real download takes.

To reproduce the problem quickly, set the JVM parameters to -Xmx128m -Xms128m.

private ExecutorService threadPool = Executors.newFixedThreadPool(10);

/**
 * Download reconciliation files in batch
 *
 * @return
 */
@RequestMapping("/batchDownload")
public String batchDownload() {

    // Simulate downloading 1,000,000 files
    for (int i = 0; i < 1000000; i++) {
        threadPool.execute(() -> {
            // Sleep for a random time to simulate the download
            Random random = new Random();
            try {
                TimeUnit.SECONDS.sleep(random.nextInt(100));
            } catch(InterruptedException e) {
                e.printStackTrace();
            }
        });
    }

    return "process";
}

Once the program is running, call this batch download endpoint a few times and it will soon run out of memory (OOM).

Looking at the source code of Executors#newFixedThreadPool, we can see that it creates a LinkedBlockingQueue with the default constructor as the task queue.

public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

The problem lies in LinkedBlockingQueue. Its default constructor is as follows:

/**
 * Creates a {@code LinkedBlockingQueue} with a capacity of
 * {@link Integer#MAX_VALUE}.
 */
public LinkedBlockingQueue() {
    this(Integer.MAX_VALUE);
}

If we do not specify a capacity when creating a LinkedBlockingQueue, the default upper limit is Integer.MAX_VALUE. With a limit that large, it can practically be regarded as an unbounded queue.
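The difference between the default and an explicit capacity can be checked directly. The following is only a minimal sketch; the class name QueueCapacitySketch and the capacity of 200 are made up for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class QueueCapacitySketch {

    // Remaining capacity of an empty queue built with the no-arg constructor.
    public static int defaultCapacity() {
        return new LinkedBlockingQueue<Runnable>().remainingCapacity();
    }

    // Remaining capacity of an empty queue built with an explicit bound.
    public static int boundedCapacity(int capacity) {
        return new LinkedBlockingQueue<Runnable>(capacity).remainingCapacity();
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity() == Integer.MAX_VALUE); // true
        System.out.println(boundedCapacity(200));                   // 200
    }
}
```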

Above we used newFixedThreadPool, so only a fixed number of threads do the downloading. If all threads are busy, the thread pool puts new tasks into the task queue.

If the pool executes tasks too slowly, tasks keep piling up in the queue. Because the queue is effectively unbounded, tasks can be added without limit, so memory usage climbs higher and higher until an OOM occurs.
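One way to avoid the unbounded backlog is to build the pool by hand with a bounded queue and an explicit rejection policy. The sketch below is an illustration, not the original project's code: the class name BoundedPoolSketch, the pool sizes, and the choice of AbortPolicy are all assumptions. It keeps the workers busy on a latch and counts how many submissions are rejected once the threads and the queue are full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolSketch {

    // Fixed number of threads plus a bounded queue; extra tasks are rejected.
    public static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits blocking tasks and returns how many of them were rejected.
    public static int countRejected(int threads, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = newBoundedPool(threads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> awaitQuietly(release));
            } catch (RejectedExecutionException e) {
                rejected++; // threads and queue are both full
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // 2 tasks run, 3 wait in the queue, the other 5 are rejected.
        System.out.println("rejected = " + countRejected(2, 3, 10)); // rejected = 5
    }
}
```

Whether rejecting, blocking the caller, or running the task in the calling thread is the right reaction depends on the business scenario. What matters is that the backlog has an upper limit.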

P.S. See also: Basic working principle of thread pools

Let's modify the example above slightly and create the thread pool with newCachedThreadPool instead.

Once the program is running, call the batch download endpoint a few times and it will again soon OOM, but this time the error message is different.

Judging from the error message, this OOM happens because no more threads can be created.

Looking at the source code of the newCachedThreadPool method this time, you can see that this method will create a thread pool with a maximum number of threads of Integer.MAX_VALUE.

This thread pool uses a SynchronousQueue, a special queue that cannot store tasks. So whenever the pool receives a task and no idle thread is available, it creates a new thread.

Once the pool receives a large number of tasks, it creates a large number of threads. Every Java thread occupies a certain amount of memory, so creating huge numbers of threads inevitably leads to OOM.
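The thread-per-task behavior is easy to observe on a small scale. The sketch below is only an illustration with a made-up class name, CachedPoolSketch. It relies on the JDK's current behavior of returning a ThreadPoolExecutor from Executors.newCachedThreadPool, which is why the cast works. Each task waits on a latch, so no thread is ever idle while the tasks are being submitted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

public class CachedPoolSketch {

    // Upper limit on threads for a cached thread pool.
    public static int maxPoolSize() {
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        try {
            return pool.getMaximumPoolSize();
        } finally {
            pool.shutdown();
        }
    }

    // Submits tasks that all block at the same time and returns the pool size.
    public static int threadsCreatedFor(int concurrentTasks) {
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newCachedThreadPool();
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < concurrentTasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep this thread busy
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize(); // one thread per pending task
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(maxPoolSize() == Integer.MAX_VALUE); // true
        System.out.println(threadsCreatedFor(20));              // 20
    }
}
```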

Reuse thread pools

Because the thread pool constructor is complicated and the pools created by Executors are full of pitfalls, one of our projects wrapped thread pool creation in a utility class.

The utility code is as follows:

public static ThreadPoolExecutor getThreadPool() {
    //In order to quickly reproduce the problem, the number of core threads and the maximum number of threads in the thread pool are set to 100
    return new ThreadPoolExecutor(100, 100, 60, TimeUnit.SECONDS, new LinkedBlockingDeque<>(200));
}

Use this tool class in the project code like this:

@RequestMapping("/batchDownload")
public String batchDownload() {
    ExecutorService threadPool = ThreadPoolUtils.getThreadPool();

    // Simulate downloading 100 files
    for (int i = 0; i < 100; i++) {
        threadPool.execute(() -> {
            // Sleep for a random time to simulate the download
            Random random = new Random();
            try {
                TimeUnit.SECONDS.sleep(random.nextInt(100));
            } catch(InterruptedException e) {
                e.printStackTrace();
            }
        });
    }

    return "process";
}

Use the wrk tool to send many concurrent requests to this endpoint, and the application soon throws an OOM.

Every request creates a new thread pool to run its tasks, and none of these pools is ever shut down. With a large number of requests in a short time, many thread pools are created, which in turn creates many threads. Eventually memory is exhausted and the OOM occurs.

The fix is simple: either have the utility class return a singleton thread pool, or have the project code reuse the pool it created.
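A singleton version of such a utility class might look like the sketch below. The class name SharedThreadPoolSketch and the sizes are assumptions chosen for illustration, not the project's real configuration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SharedThreadPoolSketch {

    // Created once when the class is loaded and shared by every caller.
    private static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
            10, 10, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(200));

    private SharedThreadPoolSketch() {
        // utility class, no instances
    }

    // Always returns the same pool instead of building a new one per call.
    public static ThreadPoolExecutor getThreadPool() {
        return POOL;
    }

    public static void main(String[] args) {
        System.out.println(getThreadPool() == getThreadPool()); // true
        getThreadPool().shutdown();
    }
}
```

With this shape, the number of threads stays capped at the pool's maximum no matter how many requests call getThreadPool().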

Spring asynchronous tasks

In the code above we always create a thread pool ourselves to run asynchronous tasks, which is a bit of a hassle. In Spring, we can simply put the @Async annotation on a method to run it asynchronously.

The code is as follows:

@Async
public void async() throws InterruptedException {
    log.info("async process");
    Random random = new Random();
    TimeUnit.SECONDS.sleep(random.nextInt(100));
}

However, when using Spring asynchronous tasks we need to configure a custom thread pool; otherwise an OOM can still occur under a large number of requests.

This is mainly because Spring asynchronous tasks use Spring's built-in SimpleAsyncTaskExecutor by default.

This executor is rather crude: it does not reuse threads. In other words, every invocation creates a new thread.
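To make the difference concrete without pulling in Spring, the sketch below uses a plain JDK stand-in: a hypothetical thread-per-task Executor, which is not Spring's actual implementation, next to a one-thread pool. It counts how many distinct threads end up running the same five sequential tasks.

```java
import java.util.Set;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadReuseSketch {

    // Runs the tasks one after another and counts the distinct threads used.
    public static int distinctThreads(Executor executor, int tasks) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            CountDownLatch done = new CountDownLatch(1);
            executor.execute(() -> {
                seen.add(Thread.currentThread());
                done.countDown();
            });
            try {
                done.await(); // wait before submitting the next task
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return seen.size();
    }

    // Stand-in for an executor that never reuses threads.
    public static int threadPerTask(int tasks) {
        Executor executor = command -> new Thread(command).start();
        return distinctThreads(executor, tasks);
    }

    // A pool with a single worker reuses that worker for every task.
    public static int pooled(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(10));
        try {
            return distinctThreads(pool, tasks);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(threadPerTask(5)); // 5
        System.out.println(pooled(5));        // 1
    }
}
```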

So if you use asynchronous tasks, be sure to replace the default executor with a custom thread pool.

If using XML configuration, we can add the following configuration:

<task:executor id="myexecutor" pool-size="5" />
<task:annotation-driven executor="myexecutor"/>

If using annotation configuration, we need to set up a Bean:

@Bean(name = "threadPoolTaskExecutor")
public Executor threadPoolTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(5);
    executor.setMaxPoolSize(10);
    // The prefix is prepended to a thread counter, so no format placeholder is needed
    executor.setThreadNamePrefix("test-");
    //other settings
    // Return the configured instance, not a new unconfigured one
    return executor;
}

Then specify the thread pool name when using annotations:

@Async("threadPoolTaskExecutor")
public void xx() {
    //Business logic
}

In a Spring Boot project, according to my tests, the default thread pool has 8 core threads, a maximum of Integer.MAX_VALUE threads, and a queue capacity of Integer.MAX_VALUE.

P.S. The behavior described here is based on Spring Boot 2.1.6.RELEASE. I am not sure whether Spring Boot 1.x uses the same strategy; readers who know are welcome to leave a comment.

Although this thread pool will not create too many threads, too many tasks can still pile up in the queue and cause an OOM. So it is still recommended to use a custom thread pool, or to change the defaults in the configuration file, for example:

spring.task.execution.pool.core-size=10
spring.task.execution.pool.max-size=20
spring.task.execution.pool.queue-capacity=200

Related Spring pitfall: Spring scheduled task suddenly stops executing

Misusing a thread pool method

Finally, let's come back to the pitfall I mentioned at the beginning of the article. The problem came mainly from my misunderstanding of the invokeAll method.

The faulty code is as follows:

//Create thread pool
ExecutorService threadPool = ...
List<Callable<String>> tasks = new ArrayList<>();
//Create tasks in batch
for(int i = 0; i <100; i++) {
    tasks.add(() -> {
        Random random = new Random();
        try {
            TimeUnit.SECONDS.sleep(random.nextInt(100));
        } catch(InterruptedException e) {
            e.printStackTrace();
        }
        return "success";
    });
}
//perform all tasks
List<Future<String>> futures = threadPool.invokeAll(tasks);
//Get the result
for(Future<String> future:futures) {
    try {
        future.get();
    } catch(ExecutionException e) {
        e.printStackTrace();
    }
}

The code above uses invokeAll to run all tasks. Because this method returns List<Future<T>>, I mistakenly assumed that it runs asynchronously like submit and does not block the calling thread.

In fact, the source code shows that after submitting the tasks this method calls Future#get on each one in turn to wait for its result, so it blocks the calling thread synchronously.
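The contrast with submit can be seen in a small experiment. This is only a sketch with a made-up class name, InvokeAllBlocksSketch, and arbitrary sizes. It checks that every future is already done when invokeAll returns, while a future returned by submit is still pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class InvokeAllBlocksSketch {

    private static ExecutorService newPool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(threads));
    }

    // True if every future is already done when invokeAll returns.
    public static boolean allDoneOnReturn(int tasks, long sleepMillis) {
        ExecutorService pool = newPool(tasks);
        try {
            List<Callable<String>> work = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                work.add(() -> {
                    Thread.sleep(sleepMillis);
                    return "success";
                });
            }
            List<Future<String>> futures = pool.invokeAll(work); // blocks here
            boolean allDone = true;
            for (Future<String> f : futures) {
                allDone &= f.isDone();
            }
            return allDone;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    // True if submit returns while its task is still running.
    public static boolean submitReturnsEarly() {
        ExecutorService pool = newPool(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            Callable<String> task = () -> {
                release.await(); // keeps the task running
                return "success";
            };
            Future<String> future = pool.submit(task);
            return !future.isDone();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(allDoneOnReturn(3, 200)); // true
        System.out.println(submitReturnsEarly());    // true
    }
}
```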

If a task blocks forever, for example on a socket connection with no timeout set, the task stays stuck on the network connection. That in turn keeps invokeAll blocked indefinitely and prevents any subsequent code from running.

If you need invokeAll, it is best to use the overload that takes a timeout.
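A rough sketch of the timed overload, invokeAll(tasks, timeout, unit), follows. The class name InvokeAllTimeoutSketch, the 60-second sleep standing in for a hung download, and the outcome strings are all made up for illustration. Tasks that have not finished when the timeout expires are cancelled, so their futures must be checked with isCancelled before calling get.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class InvokeAllTimeoutSketch {

    // Runs one fast task and one stuck task, waiting at most timeoutMillis.
    public static List<String> run(long timeoutMillis) {
        ExecutorService pool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2));
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "success");            // finishes immediately
        tasks.add(() -> {                      // simulates a hung download
            TimeUnit.SECONDS.sleep(60);
            return "success";
        });

        List<String> outcomes = new ArrayList<>();
        try {
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> future : futures) {
                if (future.isCancelled()) {
                    outcomes.add("timed out"); // cancelled when the timeout expired
                } else {
                    outcomes.add(future.get());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException e) {
            outcomes.add("failed: " + e.getCause());
        } finally {
            pool.shutdownNow();
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(run(500)); // [success, timed out]
    }
}
```

The caller gets control back after roughly the timeout instead of waiting for the stuck task, and can decide whether to retry or raise an alert for the tasks that timed out.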

Summary

Today's article walked through several examples of thread pool pitfalls. To reproduce the problems quickly, the sample code is fairly extreme, and you might never write code like it in practice.

Even so, we must not count on luck and assume these tasks will always finish quickly. We have had several production incidents where execution was normally very fast, but occasionally an external system hiccuped and response times grew, causing a large backlog of tasks in the system and eventually an OOM.

Finally, here are a few best practices for thread pools:

First, use the convenience methods of the Executors class with caution in production systems. Configure a reasonable number of threads, task queue, rejection policy, thread recycling policy, and so on according to your business scenario, and remember to give the pool's threads custom names to make troubleshooting easier.

Second, do not create thread pools repeatedly. Creating a new thread pool every time can be worse than not using one at all. If you use a thread pool utility class written by a colleague, it is best to check its implementation to avoid misuse.

Third, do not use an API method based on a partial understanding. If you are not sure, read the method's documentation and the relevant source code.
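As a rough illustration of the first point, a manually configured pool with named threads could look like the sketch below. Every concrete value here is an assumption rather than a recommendation: the class name NamedPoolSketch, the recon-download prefix, the sizes, and the choice of CallerRunsPolicy.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedPoolSketch {

    // Names threads prefix-1, prefix-2, ... so they stand out in thread dumps.
    public static ThreadFactory namedThreadFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> new Thread(runnable, prefix + "-" + counter.getAndIncrement());
    }

    // Explicit thread counts, bounded queue, named threads, rejection policy.
    public static ThreadPoolExecutor newDownloadPool() {
        return new ThreadPoolExecutor(
                5, 10,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(200),
                namedThreadFactory("recon-download"),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Runs one task and returns the name of the worker thread that ran it.
    public static String firstWorkerName() {
        ThreadPoolExecutor pool = newDownloadPool();
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstWorkerName()); // recon-download-1
    }
}
```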

You are welcome to follow my WeChat public account, "Program Tongshi", for daily article pushes. If you are interested in my other content, you can also follow my blog: studyidea.cn