The project has created hundreds of threads. How do you optimize it?

preface

Have you ever opened the Profiler in Android Studio and run into a situation like this?

Hundreds of threads? Why do they have names like 12345? Why are they all sleeping or waiting instead of being destroyed?

As a project grows, developers come and go, non-standard legacy code accumulates and more and more third-party SDKs are pulled in, so a soaring thread count is hard to avoid. Too many threads bring not only OOM risk but also many potential memory leaks. The Profiler, however, only tells you how many threads there are. Thread.getAllStackTraces() only returns each thread's current runtime stack, which bottoms out at run(); it does not tell you who called start(). As a result, you cannot tell which business logic a thread belongs to, or whether it should already have been destroyed.
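The limitation described above can be reproduced with plain JDK classes. The following is a minimal sketch, not part of the project: the class RuntimeStackDemo and the method launchFromBusinessCode are hypothetical names chosen for illustration. It samples a sleeping worker through Thread.getAllStackTraces() and checks whether the method that called start() shows up in the sampled frames.

```java
import java.util.concurrent.CountDownLatch;

class RuntimeStackDemo {
    /** Worker that parks in sleep so its stack can be sampled from outside. */
    static class Worker implements Runnable {
        final CountDownLatch started = new CountDownLatch(1);

        @Override
        public void run() {
            started.countDown();
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // interrupted by the demo once sampling is done
            }
        }
    }

    /** Hypothetical business method; this is the frame we would like to find. */
    static Thread launchFromBusinessCode(Worker worker) {
        Thread t = new Thread(worker, "demo-worker");
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Returns true if the sampled stack of the worker mentions the launching method. */
    static boolean sampledStackMentionsLauncher() {
        Worker worker = new Worker();
        Thread t = launchFromBusinessCode(worker);
        try {
            worker.started.await();
            StackTraceElement[] frames = Thread.getAllStackTraces().get(t);
            if (frames == null) {
                return false; // thread is no longer alive
            }
            for (StackTraceElement frame : frames) {
                if (frame.getMethodName().equals("launchFromBusinessCode")) {
                    return true;
                }
            }
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            t.interrupt(); // let the worker exit
        }
    }

    public static void main(String[] args) {
        System.out.println("launcher visible: " + sampledStackMentionsLauncher());
    }
}
```

The sampled frames belong to the worker's own execution (run() and whatever it is blocked in), so the launching method never appears in them.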

thinking

How can we find out the call stack of a thread's start() call? If we had a BaseThread that every thread used or inherited from, we could capture the stack in its start() method:

@Synchronized
override fun start() {
    val stackElements = Throwable().stackTrace
    super.start()
} 

This can be done by rewriting bytecode with ASM at build time. A brief introduction to ASM: through a custom Gradle plugin, it can visit every class file in the project (your own code as well as classes from third-party jars) after the code has been compiled to class files and before it is packaged into dex. During that pass you can modify the bytecode however you like, for all sorts of purposes. So we can use ASM to replace every thread in the project:

new Thread() -> new BaseThread()
extends Thread -> extends BaseThread 

For the ASM code itself, see:

https://github.com/codoon/ThreadTracker/tree/master/threadtracker-plugin/src/main/groovy/com/codoon/threadtracker/plugins

operation

Now that we have a call stack, how do we tie it to the Thread? The thread id is already assigned when the Thread is constructed, so in start() we can put the id and the captured stack into a map. We can also override run(): once super.run() returns, the thread is about to terminate, so that is the point to remove its entry from the map.
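The map bookkeeping just described can be sketched in plain Java as follows. This is an illustration under assumptions, not the project's BaseThread: TrackedThread, START_STACKS and demo() are hypothetical names, and the registry is reduced to a single ConcurrentHashMap keyed by thread id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

class TrackedThread extends Thread {
    /** thread id -> stack captured in start(); hypothetical registry. */
    static final Map<Long, StackTraceElement[]> START_STACKS = new ConcurrentHashMap<>();

    TrackedThread(Runnable target) {
        super(target);
    }

    @Override
    public synchronized void start() {
        // The id is assigned in the Thread constructor, so it is valid before start().
        START_STACKS.put(getId(), new Throwable().getStackTrace());
        try {
            super.start();
        } catch (RuntimeException | Error e) {
            START_STACKS.remove(getId()); // start failed, nothing to track
            throw e;
        }
    }

    @Override
    public void run() {
        try {
            super.run();
        } finally {
            // run() returning means the thread is about to terminate.
            START_STACKS.remove(getId());
        }
    }

    /** Demo helper: returns {callerVisible, trackedWhileRunning, trackedAfterExit}. */
    static boolean[] demo() {
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        TrackedThread t = new TrackedThread(() -> {
            entered.countDown();
            try {
                release.await();
            } catch (InterruptedException ignored) {
                // fall through and exit
            }
        });
        t.start();
        try {
            entered.await();
            StackTraceElement[] stack = START_STACKS.get(t.getId());
            boolean callerVisible = false;
            if (stack != null) {
                for (StackTraceElement frame : stack) {
                    if (frame.getMethodName().equals("demo")) {
                        callerVisible = true;
                    }
                }
            }
            release.countDown();
            t.join();
            return new boolean[] { callerVisible, stack != null, START_STACKS.containsKey(t.getId()) };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            release.countDown();
            return new boolean[] { false, false, false };
        }
    }
}
```

Unlike the runtime stack, the stack recorded in start() contains the caller (here demo()), and the entry disappears once the thread has finished.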

OK, that covers plain threads. Now let's look at thread pools. The threads inside a thread pool cannot be replaced in the way just described, because they are created by framework-layer code that ASM does not process. Besides, for a thread pool we care less about where a thread was started than about where the task it is running was submitted, so that we know which business logic is keeping the thread busy.

There are generally two ways to create a thread pool: calling a constructor directly with new, or calling one of the Executors.xxx() factory methods. Let's look at new first. Following the same routine, we write a BaseThreadPool and use ASM to substitute it everywhere. In its constructor we capture the pool's creation stack, and we override the task-submitting methods (execute, submit, invokeAny and so on) to capture the stack at the point where each task is added. But how do we link that stack to a thread? Submitted tasks are always Runnable or Callable, so we can wrap each one in a PoolRunnable that carries the pool name and the task-submission stack. When the wrapper runs, it calls Thread.currentThread() to get the current thread and can then associate the pool name, pool creation stack, thread id and task-submission stack.

class PoolRunnable constructor(
    private val any: Any,
    private val callStack: String,
    private val poolName: String? = null
) : Runnable, Callable<Any>, Comparable<Any> {

    override fun run() {
        val threadId = Thread.currentThread().id
        //At this point, callStack, poolName and thread can all be associated
        //poolName and poolCreateStack can create associations outside in advance

        (any as Runnable).run()

        //The task has finished. callStack holds the task-submission stack, so clear it now to indicate that no task is running on this thread
        info.callStack = ""
    }

    override fun call(): Any {
        //Similar to the run() method
    }

    override fun compareTo(other: Any): Int {
        //Omit code
    }
} 

One point to note: a task may implement Runnable, Callable and even Comparable at the same time. If it is wrapped only as a Runnable, the app will crash when one of the other interface methods is called, so the wrapper should implement every interface a task is known to implement. (If coverage later turns out to be incomplete, more interfaces can be added. There is no need to worry about custom Runnables that implement many other interfaces: after the replacement it is system code that holds the PoolRunnable, so what matters is only whether system code calls call() or compareTo(). Upper-layer code can still call any custom methods on the wrapped Runnable freely.)
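To make the multi-interface point concrete, here is a self-contained Java sketch of such a wrapper. It is a simplified stand-in, not the project's PoolRunnable: the class name TrackedTask and the field runningOnThreadId are assumptions, and the bookkeeping is reduced to remembering which thread is currently running the task.

```java
import java.util.concurrent.Callable;

/** Wrapper that forwards to whichever interface the original task implements. */
class TrackedTask implements Runnable, Callable<Object>, Comparable<Object> {
    private final Object task;             // original Runnable and/or Callable
    final String submitStack;              // where the task was submitted
    volatile long runningOnThreadId = -1;  // -1 while the task is not running

    TrackedTask(Object task, String submitStack) {
        this.task = task;
        this.submitStack = submitStack;
    }

    @Override
    public void run() {
        runningOnThreadId = Thread.currentThread().getId();
        try {
            ((Runnable) task).run();
        } finally {
            runningOnThreadId = -1; // task finished, thread is idle again
        }
    }

    @Override
    public Object call() throws Exception {
        runningOnThreadId = Thread.currentThread().getId();
        try {
            return ((Callable<?>) task).call();
        } finally {
            runningOnThreadId = -1;
        }
    }

    @Override
    @SuppressWarnings({"unchecked", "rawtypes"})
    public int compareTo(Object other) {
        // Unwrap the other side so two wrapped tasks compare by their originals.
        Object unwrapped = (other instanceof TrackedTask) ? ((TrackedTask) other).task : other;
        return ((Comparable) task).compareTo(unwrapped);
    }
}
```

The unwrapping in compareTo() matters when the pool uses a priority queue: the queue only ever sees wrappers, so comparing a wrapper against the raw original would hand the original task an object of the wrong type.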

However, waiting until run() executes feels a little late for establishing the thread-to-pool relationship. We can go one step further and replace the pool's ThreadFactory with a wrapper of our own, so that newThread() sees every thread the pool creates as soon as it is created. That way the thread is associated with its pool first, and with the task stack later when run() executes. That completes the pools created with new. What about Executors.xxx()? Could we write our own ProxyExecutors, use ASM to replace every Executors.xxx() call with ProxyExecutors.xxx(), and inside it replace new ThreadPoolExecutor with new BaseThreadPoolExecutor?

object ProxyExecutors {

    @JvmStatic
    fun newFixedThreadPool(nThreads: Int): ExecutorService {
        return BaseThreadPoolExecutor(
            nThreads, nThreads,
            0L, TimeUnit.MILLISECONDS,
            LinkedBlockingQueue()
        )
    }

    @JvmStatic
    fun newCachedThreadPool(): ExecutorService {
        return BaseThreadPoolExecutor(
            0, Int.MAX_VALUE,
            60L, TimeUnit.SECONDS,
            SynchronousQueue()
        )
    }

    @JvmStatic
    fun newScheduledThreadPool(
        corePoolSize: Int,
        threadFactory: ThreadFactory?
    ): ScheduledExecutorService {
        return BaseScheduledThreadPoolExecutor(corePoolSize, threadFactory)
    }
} 
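Going back to the ThreadFactory wrapping mentioned before the ProxyExecutors listing, the idea can be sketched in Java roughly as below. The names TrackingThreadFactory, THREAD_TO_POOL and poolOfWorker() are hypothetical, and the sketch never removes entries, which a real tracker would have to do when threads die.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class TrackingThreadFactory implements ThreadFactory {
    /** thread id -> pool name; hypothetical registry, never cleaned up in this sketch. */
    static final Map<Long, String> THREAD_TO_POOL = new ConcurrentHashMap<>();

    private final ThreadFactory delegate;
    private final String poolName;

    TrackingThreadFactory(ThreadFactory delegate, String poolName) {
        this.delegate = (delegate != null) ? delegate : Executors.defaultThreadFactory();
        this.poolName = poolName;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Let the original factory build the thread, then record it immediately,
        // before any task has run on it.
        Thread t = delegate.newThread(r);
        if (t != null) {
            THREAD_TO_POOL.put(t.getId(), poolName);
        }
        return t;
    }

    /** Demo helper: returns the pool name recorded for the worker that ran a task. */
    static String poolOfWorker() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.setThreadFactory(new TrackingThreadFactory(pool.getThreadFactory(), "demo-pool"));
        try {
            Callable<Long> whoAmI = () -> Thread.currentThread().getId();
            return THREAD_TO_POOL.get(pool.submit(whoAmI).get());
        } catch (InterruptedException | ExecutionException e) {
            return null;
        } finally {
            pool.shutdown();
        }
    }
}
```

Wrapping the existing factory instead of replacing it keeps whatever naming, priority or daemon settings the original factory applied.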

This works for some thread pools, but problems appear as you keep going, for example:

@JvmStatic
    fun newSingleThreadScheduledExecutor(): ScheduledExecutorService {
        return BaseDelegatedScheduledExecutorService(
            ScheduledThreadPoolExecutor(1)
        )
    } 

DelegatedScheduledExecutorService is a private nested class of Executors, so there is no convenient way to write a BaseDelegatedScheduledExecutorService. Looking further, though, every pool-creating method returns either ExecutorService or ScheduledExecutorService. Since the return types are that uniform and both are interfaces, let's use a dynamic proxy! Through the proxy we can intercept every interface method together with its arguments and do whatever we need. Taking the ExecutorService interface as an example:

object ProxyExecutors {
    @JvmStatic
    fun newFixedThreadPool(nThreads: Int): ExecutorService {
        return proxy(Executors.newFixedThreadPool(nThreads))
    }
} 

private fun proxy(executorService: ExecutorService): ExecutorService {
        if (executorService is ThreadPoolExecutor) {
            //Like BaseThreadPoolExecutor, ThreadFactory is set to obtain thread information and establish contact with thread pool as soon as possible instead of waiting for run
            executorService.threadFactory = BaseThreadFactory(
                executorService.threadFactory,
                toObjectString(executorService)
            )
        }
        val handler = ProxyExecutorService(executorService)
        return Proxy.newProxyInstance(
            executorService.javaClass.classLoader,
            AbstractExecutorService::class.java.interfaces,
            handler
        ) as ExecutorService
    }

//Java is used here because Kotlin has a pitfall when calling Java varargs methods
public class ProxyExecutorService implements InvocationHandler {
    private ExecutorService executor;
    private String poolName = null;

    ProxyExecutorService(ExecutorService executor) {
        this.executor = executor;
        poolName = TrackerUtils.toObjectString(executor);
        //Get thread pool information during initialization
        String createStack = TrackerUtils.getStackString(false);
        //Omit some codes
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        //There are many methods, and they differ between the proxied types
        //So whenever the called method has a parameter of type Runnable or Callable, that argument is replaced with a PoolRunnable wrapper
        if (args != null) {
            String callStack = TrackerUtils.getStackString(true);
            for (int i = 0; i < args.length; i++) {
                Object arg = args[i];
                if ((arg instanceof Runnable || arg instanceof Callable) && !(arg instanceof PoolRunnable)) {
                    //execute , submit , etc
                    PoolRunnable any = new PoolRunnable(arg, callStack, poolName);
                    //Replace method parameters
                    args[i] = any;
                } else if (arg instanceof Collection && !((Collection) arg).isEmpty()) {
                    //invokeAny , invokeAll , etc
                    Iterator iter = ((Collection) arg).iterator();
                    ArrayList<PoolRunnable> taskList = new ArrayList<>();
                    boolean allOk = iter.hasNext();
                    while (iter.hasNext()) {
                        Object it = iter.next();
                        if (it instanceof Runnable || it instanceof Callable) {
                            if (it instanceof PoolRunnable) {
                                taskList.add((PoolRunnable) it);
                            } else {
                                taskList.add(new PoolRunnable(it, callStack, poolName));
                            }
                        } else {
                            allOk = false;
                            break;
                        }
                    }
                    if (allOk) {
                        //Replace method parameters
                        args[i] = taskList;
                    }
                }
            }
        }

        if (method.getName().equals("shutdown") || method.getName().equals("shutdownNow")) {
            ThreadInfoManager.getINSTANCE().shutDownPool(poolName);
        }
        return method.invoke(executor, args);
    }
} 

In invoke(), every argument that is a Runnable or Callable is replaced with our own PoolRunnable. After that, just as with pools created via new, the thread is associated with the stack inside PoolRunnable's run() or call(). Perfect! But consider this: what happens if someone unfortunately writes code like the following?

val pool = Executors.newFixedThreadPool(3) as ThreadPoolExecutor

That's right, it crashes: after proxying, the object is only an ExecutorService and cannot be downcast to ThreadPoolExecutor. This is a limitation of dynamic proxies, which can only proxy interfaces. If a class A has many methods of its own beyond the interfaces it implements, the dynamic proxy can do nothing about those methods; the proxy object is merely an instance of the interfaces and cannot be cast to A. Libraries such as cglib or javassist can solve this, which we won't go into here. To be safe, we substitute a Base class wherever that is possible, and fall back to a dynamic proxy only where it is not, such as newSingleThreadScheduledExecutor():

object ProxyExecutors {

    @JvmStatic
    fun newFixedThreadPool(nThreads: Int): ExecutorService {
        //BaseThreadPoolExecutor is used here mainly so that upper-layer code can still cast the ExecutorService to ThreadPoolExecutor; with the dynamic proxy from proxy(), such code would crash
        return BaseThreadPoolExecutor(
            nThreads, nThreads,
            0L, TimeUnit.MILLISECONDS,
            LinkedBlockingQueue()
        )
        // return proxy(Executors.newFixedThreadPool(nThreads))
    }
}
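The downcast failure discussed above can be reproduced with JDK classes alone. The sketch below uses a hypothetical ProxyCastDemo class and a pass-through handler that does no tracking at all; it only shows that a java.lang.reflect.Proxy instance implements the interface but is not a ThreadPoolExecutor.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

class ProxyCastDemo {
    /** Wraps a real pool in a JDK dynamic proxy that simply forwards every call. */
    static ExecutorService proxied(ExecutorService real) {
        InvocationHandler forward = (proxy, method, args) -> method.invoke(real, args);
        return (ExecutorService) Proxy.newProxyInstance(
                ProxyCastDemo.class.getClassLoader(),
                new Class<?>[] { ExecutorService.class },
                forward);
    }

    /** Returns true if casting the proxy to ThreadPoolExecutor throws ClassCastException. */
    static boolean castFails() {
        ExecutorService real = Executors.newFixedThreadPool(1);
        try {
            ExecutorService viaProxy = proxied(real);
            try {
                ThreadPoolExecutor pool = (ThreadPoolExecutor) viaProxy;
                return pool == null; // not reached: the cast above throws
            } catch (ClassCastException expected) {
                return true;
            }
        } finally {
            real.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cast fails: " + castFails());
    }
}
```

The real pool returned by Executors.newFixedThreadPool() is a ThreadPoolExecutor, so the same cast succeeds on the unwrapped object; it is only the proxy layer that breaks it.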

OK, thread pools are done. But besides threads and thread pools, Java and Android have many classes that encapsulate a thread or a thread pool, such as HandlerThread, Timer and AsyncTask. Let's deal with these common classes first.

HandlerThread is essentially a Thread, so it can be handled in the same way as Thread.

Timer holds a private TimerThread member, which is also essentially a Thread. The source code shows that Timer starts this thread in its constructor, so the relevant call stack is the one where the Timer is constructed. We can create a BaseTimer, use ASM to replace Timer in the code, capture the call stack in the constructor, then obtain the internal Thread member through reflection, read its id and other information, and associate it with the call stack.

AsyncTask is harder. It has two executors, THREAD_POOL_EXECUTOR and SERIAL_EXECUTOR. The real work is done by THREAD_POOL_EXECUTOR; SERIAL_EXECUTOR is not a thread pool in the true sense and merely implements the Executor interface. SERIAL_EXECUTOR's execute() method adds tasks to a queue, then takes them out one at a time and hands them to THREAD_POOL_EXECUTOR for execution. So we need the thread pool information of THREAD_POOL_EXECUTOR, associated with the task-submission stack from SERIAL_EXECUTOR's execute().
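The queue-then-forward behaviour described above is why the submission stack has to be captured at the serial executor rather than at the backing pool. The following Java sketch models that behaviour without any Android dependency; SerialForwarder, activeSubmitter() and runThree() are hypothetical names, and the class is a simplified stand-in rather than AsyncTask's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SerialForwarder implements Executor {
    private final ArrayDeque<Runnable> queue = new ArrayDeque<>();
    private final Executor backingPool;
    private Runnable active;
    private volatile StackTraceElement[] activeSubmitStack;

    SerialForwarder(Executor backingPool) {
        this.backingPool = backingPool;
    }

    @Override
    public synchronized void execute(Runnable task) {
        // Capture the stack here: by the time the backing pool sees the task,
        // the original caller's frames are no longer on any stack.
        final StackTraceElement[] submitStack = new Throwable().getStackTrace();
        queue.offer(() -> {
            activeSubmitStack = submitStack;
            try {
                task.run();
            } finally {
                activeSubmitStack = null;
                scheduleNext(); // hand the next queued task to the pool
            }
        });
        if (active == null) {
            scheduleNext();
        }
    }

    private synchronized void scheduleNext() {
        active = queue.poll();
        if (active != null) {
            backingPool.execute(active);
        }
    }

    /** Name of the method that submitted the task currently running, or null when idle. */
    String activeSubmitter() {
        StackTraceElement[] stack = activeSubmitStack;
        return (stack != null && stack.length > 1) ? stack[1].getMethodName() : null;
    }

    /** Demo helper: submits three tasks and reports execution order and submitter. */
    static List<String> runThree() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        SerialForwarder serial = new SerialForwarder(pool);
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 1; i <= 3; i++) {
            final int n = i;
            serial.execute(() -> {
                seen.add(n + ":" + serial.activeSubmitter());
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return seen;
    }
}
```

Even with a four-thread backing pool the tasks run strictly one after another, and each one can be traced back to the method that called execute() on the serial executor.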

Outside code can also submit tasks directly to THREAD_POOL_EXECUTOR, in which case we need the task-submission stack from THREAD_POOL_EXECUTOR itself, so some special handling is required here. The general idea is still to create dynamic proxies for SERIAL_EXECUTOR and THREAD_POOL_EXECUTOR and set them back into AsyncTask when the app starts. Because both executor fields in AsyncTask are final, the final bit has to be cleared by changing the field's modifiers through reflection. This works on Android 5.0 and above, but in the 4.x source code the modifiers are obtained through a native method:

/**
     * Returns the modifiers for this field. The {@link Modifier} class should
     * be used to decode the result.
     *
     * @return the modifiers for this field
     * @see Modifier
     */
    public int getModifiers() {
        return getFieldModifiers(declaringClass, slot);
    }

    private native int getFieldModifiers(Class<?> declaringClass, int slot); 

No way to modify them has been found for this case, so if you want to track AsyncTask, please use a phone running Android 5.0 or above. At this point, tracing of threads and thread pools is basically complete. Going forward, we will continue adding support for IntentService, ForkJoinPool and other less commonly used thread or thread-pool wrapper classes in the Java/Android source code. The first version looks like this:

As you can see, user code is highlighted in the UI, so the root cause stands out at a glance in an otherwise confusing call stack. The principle: while ASM scans the classes, it records which code was written by the user, based on the project information it has obtained. (This cannot be decided simply by whether a class comes from a jar: if the project has multiple modules, those modules may reach ASM in the form of jars.) The package names of those classes are recorded, longer package names that already contain a recorded shorter package name are filtered out, and the resulting list is written by ASM into a designated ArrayList member of a designated Java class. Once a call stack has been captured, it is compared against this package-name list for highlighting.

summary

Finally, this article offers one approach to thread tracing, in the hope that it prompts better ideas. ASM is clearly very powerful; what it can do is limited mainly by your imagination. New features will be added gradually, such as recording thread running time, filtering and sorting threads by state along various dimensions, counting threads per package name or jar file to see what those third-party libraries (especially some advertising SDKs) are up to, and even giving warnings and optimization suggestions for suspicious thread or thread-pool states.

As an ASM beginner, I know many of the usages here are still fairly basic. If you have a more elegant implementation, pull requests are very welcome. Also, although the project runs normally in Gudong's relatively large and complex app, some cases have inevitably been overlooked. Replacing bytecode is, after all, a risky operation. If you hit a crash after integrating it, please open an issue. Project address:

https://github.com/codoon/ThreadTracker

Stars are welcome!

Android core knowledge notes github: https://github.com/AndroidCot/Android

Tags: Android Design Pattern

Posted by uberpolak on Sat, 21 May 2022 18:01:49 +0300