Arthas realizes a CPU troubleshooting and code hot update

Author he Bo

[the official community of Arthas is holding an essay solicitation activity. You will get a prize if you participate~ Click Submit]

1. Preface

background

Online code often takes up too much CPU. According to past experience, I will use the top instruction and further use jstack to check the specific information for problem troubleshooting, but basically, I can't escape the situation of re contracting. It is an incremental package in time, and the application also needs to be stopped for a short time. Later, the big brother of operation and maintenance asked me to try Arthas. He said that he could perform the hot update operation of the code. He just came to have a try.

For the installation and basic use of Arthas, please refer to the following two articles:

environment

JDK1.8
SPringBoot 2.2.2
Arthas
Linux

Test code:

@RequestMapping(value = "/bigThread")
@ResponseBody
public String bigThread(int id) {
    ArthasService.test();
    while (true) {
        Thread t2 = new Thread();
        t2.start();
        id ++;
        if(100000 == id) {
            return String.valueOf(id);
        }
    }
}

thinking

2.thread -b check whether there are blocked threads

thread -b: find out the thread that is currently blocking other threads. If it is not found after execution, it indicates that the thread is not blocked and executed all the time.

3.thread view the thread with the highest occupancy

When the thread is not followed by parameters, all the current thread information will be displayed. I think thread -n 10 and the first 10 should be enough, which can be determined according to the actual needs.

It can be seen intuitively in the figure below that our application occupies 77% of the CPU at the moment (here I view it through thread at the moment of initiating the request, so it is more intuitive. The production environment should only be blocked and deadlock).

4.thread id view specific information

Based on the previous step, let's take a further look at thread 15 (because the above ID=15).

His general meaning is: the thread is waiting for a condition to continue execution. You can see that the method is executing linkedblockingqueue When using the take method, check the API prompt of this method as follows:

public E take() throws InterruptedException {
    E x;
    int c = -1;
    final AtomicInteger count = this.count;
    final ReentrantLock takeLock = this.takeLock;
    takeLock.lockInterruptibly();
    try {
        while (count.get() == 0) {
            notEmpty.await();
        }
        x = dequeue();
        c = count.getAndDecrement();
        if (c > 1)
            notEmpty.signal();
    } finally {
        takeLock.unlock();
    }
    if (c == capacity)
        signalNotFull();
    return x;
}

Among them: AtomicInteger is used to ensure atomicity in the case of high concurrency. ReentrantLock marks reentrant locks, which need to be understood under the JUC package. It is not repeated here, but Baidu.

The key point of this code is: notempty Await() consumes data from the queue. When the queue is empty, the thread is blocked, so we generally know that the problem now is thread blocking, but we still don't know which line of code.

If you can clearly know which code has been changed this time, you can directly perform step 6. If you don't know, you can locate the problem through step 5.

5.watch to see which Controller executes the code

watch org.springframework.web.servlet.DispatcherServlet getHandler returnObj

This script can detect all methods that match the Handler through the dispatcher servlet, that is, the request to enter the Controller, as follows:

After finding the corresponding code, let's further observe the exception information. There may be a problem here: why do I need such a cumbersome operation when I can view the error information through the log. My business scenario is: the log is still very large, and it is brushed away as soon as it is retrieved. At this time, it is not easy to locate the log. Of course, it is also possible and intuitive to retrieve the log. I usually check the log for problem location. Here is also an idea.

6.watch the exception information of this method

watch Class full path method name "{params[0],throwExp}" -e -x 2

As mentioned above, the error is intuitively prompted, and the following can be repaired and solved. Here, we can also check the execution time through the trace instruction:

trace Class full path method name "{params[0],throwExp}" -e -x 2

The returned information is as follows. You can also see the error information and the execution time of each method.

[arthas@10999]$ trace com.arthas.controller.OrderController bigThread
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 53 ms, listenerId: 10
`---ts=2020-08-19 14:45:57;thread_name=http-nio-0.0.0.0-8080-exec-10;id=16;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@1f1c7bf6
    `---[1452.684406ms] com.arthas.controller.OrderController:bigThread() [throws Exception]
        +---[0.168814ms] com.arthas.service.ArthasService:test() #20
        `---throw:java.lang.OutOfMemoryError #-2 [unable to create new native thread]

7.jad decompile hot update

After knowing the problem above, let's locate the problem.

Command: jad class full path method name

[arthas@13190]$ jad com.arthas.controller.OrderController

ClassLoader:                                                                                                                                                                                
+-org.springframework.boot.loader.LaunchedURLClassLoader@17f052a3                                                                                                                           
  +-sun.misc.Launcher$AppClassLoader@3d4eac69                                                                                                                                               
    +-sun.misc.Launcher$ExtClassLoader@45f45fa1                                                                                                                                             

Location:                                                                                                                                                                                   
file:/opt/software/arthas/Arthas.jar!/BOOT-INF/classes!/                                                                                                                                    

/*
 * Decompiled with CFR.
 * 
 * Could not load the following classes:
 *  com.arthas.service.ArthasService
 *  org.springframework.stereotype.Controller
 *  org.springframework.web.bind.annotation.RequestMapping
 *  org.springframework.web.bind.annotation.ResponseBody
 */
package com.arthas.controller;

import com.arthas.service.ArthasService;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;

@Controller
public class OrderController {
    @RequestMapping(value={"/bigThread"})
    @ResponseBody
    public String bigThread(int id) {
        ArthasService.test();
        do {
            Thread t2 = new Thread();
            t2.start();
        } while (100000 != ++id);
        return String.valueOf(id);
    }
}

Affect(row-cnt:1) cost in 1513 ms.

At this point, the code is decompiled. In order to change it, we need to output it as a java file.

Instruction: JAD com arthas. controller. OrderController > /tmp/OrderController. java

That is: jad class full path method name > storage path / storage name

Then go to the tmp path to vi and modify the java file. After the modification, check the corresponding classloader to prepare for compilation.

sc -d *OrderController | grep classLoaderHash
mc -c 17f052a3 /tmp/OrderController.java -d /tmp

But there is a compilation error here. The official prompt is:

So we compile the class file locally and upload it the same.

Call before Compilation

[arthas@13190]$ trace com.arthas.controller.OrderController bigThread
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 77 ms, listenerId: 2
`---ts=2020-08-19 15:51:46;thread_name=http-nio-0.0.0.0-8080-exec-1;id=d;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@1f1c7bf6
    `---[6734.666529ms] com.arthas.controller.OrderController:bigThread() [throws Exception]
        +---[0.786517ms] com.arthas.service.ArthasService:test() #20
        `---throw:java.lang.OutOfMemoryError #-2 [unable to create new native thread]

Pre update code

@RequestMapping(value = "/bigThread")
@ResponseBody
public String bigThread(int id) {
    ArthasService.test();
    while (true) {
        Thread t2 = new Thread();
        t2.start();
        id ++;
        if(100000 == id) {
            return String.valueOf(id);
        }
    }
}

Updated code

@RequestMapping(value = "/bigThread")
@ResponseBody
public String bigThread(int id) {
    ArthasService.test();
    Thread t2 = new Thread();
    t2.start();
    return "success";
}

Compile instruction

[arthas@13190]$ redefine /tmp/OrderController.class
redefine success, size: 1, classes:
com.arthas.controller.OrderController

Call three times after compilation

`---ts=2020-08-19 15:52:02;thread_name=http-nio-0.0.0.0-8080-exec-3;id=f;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@1f1c7bf6
    `---[5.609405ms] com.arthas.controller.OrderController:bigThread()
        `---[0.204675ms] com.arthas.service.ArthasService:test() #20

`---ts=2020-08-19 15:52:04;thread_name=http-nio-0.0.0.0-8080-exec-4;id=10;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@1f1c7bf6
    `---[3.900149ms] com.arthas.controller.OrderController:bigThread()
        `---[0.14636ms] com.arthas.service.ArthasService:test() #20

`---ts=2020-08-19 15:52:04;thread_name=http-nio-0.0.0.0-8080-exec-5;id=11;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@1f1c7bf6
    `---[1.90945ms] com.arthas.controller.OrderController:bigThread()
        `---[0.147353ms] com.arthas.service.ArthasService:test() #20

It can be found that the time has changed from 6734.666529ms to about 3ms, indicating that the hot update code has taken effect.

8.profile draw flame diagram for subsequent analysis

As shown in the figure below:

Arthas essay solicitation activity is in progress

Arthas officials are holding an essay solicitation activity. If you have:

  • Problems checked with Arthas
  • Interpret the source code of Arthas
  • Make recommendations to Arthas
  • Unlimited, other contents related to Arthas

Welcome to the essay solicitation activity and take the prize~ Click Submit

"Alibaba cloud native Pay attention to the technical fields such as microservices, Serverless, containers, Service Mesh, the popular technology trend of focusing cloud native, and the large-scale implementation practice of cloud native, and be a official account that knows the cloud native developers best. "

Tags: Java Web Development string thread cpu test service

Posted by tech0925 on Thu, 19 May 2022 22:04:54 +0300