banner
田野放空

田野放空

认真扮演一个擅长白日做梦的普通人

Arthas Sharing

Introduction#

Arthas is an online monitoring and diagnostic product that allows real-time viewing of application memory, GC, and thread status information from a global perspective. It can diagnose business issues without modifying application code, including viewing method call parameters, exceptions, monitoring method execution time, class loading information, etc., significantly improving the efficiency of online issue troubleshooting.

Problem Solving:

  1. Local development environment cannot access the production environment, making debugging difficult when issues arise.
  2. Certain issues cannot be easily reproduced in different environments.
  3. Adding some logs to help resolve issues involves deployment.

Basic Commands#

Download and Install:#

Machine installation:
curl -O https://arthas.aliyun.com/arthas-boot.jar

Command List:#

image

  • dashboard - Real-time data dashboard of the current system, refreshed every 5 seconds
ID: Java-level thread ID, note that this ID does not correspond one-to-one with the nativeID in jstack.
NAME: Thread name
GROUP: Thread group name
PRIORITY: Thread priority, a number between 1 and 10, with larger numbers indicating higher priority. setPriority(1-10)
STATE: Thread state
CPU%: Thread CPU usage. For example, with a sampling interval of 1000ms, if a thread's incremental CPU time is 100ms, then CPU usage = 100/1000 = 10%
DELTA_TIME: Incremental CPU time the thread has run since the last sample, data format is seconds
TIME: Total CPU time the thread has run, data format is minutes:seconds
INTERRUPTED: Current interrupt status of the thread
DAEMON: Whether it is a daemon thread

C1 C2 CompilerThread:
Garbage collection count gc.ps_scavenge.count, garbage collection time gc.ps_scavenge.time, mark-sweep algorithm count gc.ps_marksweep.count, mark-sweep algorithm time
  • heapdump - Dump Java heap, similar to the heap dump function of the jmap command

    heapdump xx-output/dump.hprof
    The generated file is in the arthas-output directory and can be opened in a browser: http://localhost:8563/xx-output/

  • jvm - View current JVM information

    Thread:
    COUNT: Current number of active threads in the JVM
    DAEMON-COUNT: Current number of active daemon threads in the JVM
    PEAK-COUNT: Maximum number of threads that have ever been alive since the JVM started
    STARTED-COUNT: Total number of threads started since the JVM started
    DEADLOCK-COUNT: Current number of deadlocked threads in the JVM
    File descriptors:
    MAX-FILE-DESCRIPTOR-COUNT: Maximum number of file descriptors that the JVM process can open
    OPEN-FILE-DESCRIPTOR-COUNT: Current number of open file descriptors in the JVM

  • logger - View and modify logger

    Update logger level
    logger --name ROOT --level debug

  • memory - View JVM memory information

  • thread - View the current JVM thread stack information

    thread: Default sorted in descending order by CPU incremental time
    thread -n 3 : Display the top 3 busiest threads and print the stack
    thread id : Display the running stack of the specified thread
    thread -b : Find the thread currently blocking other threads: Currently only supports finding threads blocked by the synchronized keyword
    thread -i : Specify sampling time interval
    thread --state : View threads in the specified state

  • vmoption - View and modify JVM diagnostic-related options

    Update specified option
    vmoption PrintGC true

  • vmtool - Query objects from the JVM, execute forceGc

    vmtool --action forceGc

  • classloader - View the inheritance tree of class loaders, URLs, class loading information, use class loader to getResource

    View statistics by class loading type: classloader
    View statistics by class loading instance: classloader -l
    View ClassLoader inheritance tree: classloader -t

  • dump - Dump the bytecode of loaded classes to a specific directory

    dump com.xx.xx.controller.ArthasController Generates ArthasController.class file

  • jad - Decompile the source code of the specified loaded class

    By default, the decompiled result will include ClassLoader information. Using the --source-only option, only the source code will be printed
    jad --source-only com.xx.xx.controller.ArthasController
    jad --source-only com.xx.xx.controller.ArthasController deadLock

  • redefine - Load external .class files and redefine them in the JVM

    The redefine command conflicts with commands like jad/watch/trace/monitor/tt. After executing redefine, if any of the aforementioned commands are executed again, the bytecode of redefine will be reset.

  • retransform - Load external .class files and retransform them in the JVM

    1. Load external .class file
      retransform /Users/xx/IdeaProjects/ArthasController.class
    2. View retransform entries
      retransform -l
    3. Delete retransform entries:
      retransform -d 1
      retransform --deleteAll
      Note:
    4. If all retransform entries are not cleared and retransform is triggered again, then when arthas stops, the classes that have been retransform will still be effective.
    5. If retransform is not cleared and Arthas is stopped, it can be restarted, and the jad-mc-retransform will still decompile to the code before modification.
    6. Retransform does not allow adding new fields or methods.
  • sc - View information about classes loaded in the JVM

    sc -d com.xx.xx.controller.*
    sc -d com.xx.xx.controller.ArthasController

  • sm - View method information of loaded classes

    sm com.xx.xx.controller.ArthasController *

Please note that these commands are implemented through bytecode enhancement technology, which inserts aspects into the specified class methods to achieve data statistics and observation. Therefore, when using them online or in pre-release, please clearly define the classes, methods, and conditions to be observed. After diagnosis, execute stop or reset the enhanced classes.
Always execute commands with -n; otherwise, when encountering methods with a very high call volume, it can instantly overwhelm your JVM memory.

  • monitor - Method execution monitoring

    Monitor the flag method in ArthasController, printing once every 5 seconds, for a total of 10 times
    monitor com.xx.xx.controller.ArthasController flag -n 10 --cycle 5
    Count requests where parameter param[0] <= 2
    monitor -c 5 demo.MathGame primeFactors "params[0] <= 2"

  • stack - Output the call path of the current method
    Many times we know that a method is being executed, but the path of execution can be very numerous, or we may not know where the method is being executed from.

stack com.xx.xx.service.impl.WjVideoServiceImpl preprocessingVideo -n 5

  • trace - Internal call path of the method, outputting the time spent at each node along the method path

    • Trace can conveniently help you locate and discover performance issues caused by high RT, but it can only trace the first-level method call chain at a time.
    • Many times we only want to see the trace results of a certain method when its RT exceeds a certain time. You can use '#cost>100' to indicate that trace results will only be output when execution time exceeds 100ms.
    • Currently, tracing of java.lang.Thread getName is not supported.

    --exclude-class-pattern Exclude specified classes
    --skipJDKMethod true Trace functions in the JDK
    '#cost > 10' Trace results will only be output when execution time exceeds 10ms.
    Trace multi-level method implementation chain calls
    trace -E com.xx.xx.service.impl.VideoServiceImpl|com.xx.xx.controller.AutoCutController|com.xx.xx.service.impl.WjVideoServiceImpl preprocessing|preprocessingVideo -n 5 '#cost>10'

  • tt(TimeTunnel) - Time-space tunnel for method execution data, records the input parameters and return information of each call to the specified method, and allows observation of these calls at different times

    tt -t com.kscs.capcut.service.impl.WjVideoServiceImpl preprocessingVideo -n 5
    tt -l Retrieve call records tt -s 'method.name=="preprocessingVideo"'
    tt -i 1000 View call information
    tt -i 1004 -p Re-execute a call

  • watch - Observe the calling situation of the specified function. The observable range includes: return value, thrown exceptions, input parameters.

    • -f Observe after the function returns
    • -x Output result property traversal depth

    watch com.xx.xx.service.impl.WjVideoServiceImpl preprocessingVideo '{params,returnObj,throwExp}' -n 5 -x 3

Problem Solving:#

1. Use Arthas to Accurately Locate High CPU Load Issues in Java Applications#

View the threads with the highest CPU usage  
1. thread / thread -n 3
2. thread id

1. The method is called in a loop
2. There is an internal infinite loop

We use the tt command to listen to the call situation of this method
tt -t com.xx.xx.controller.ArthasController deadAdd -n 5 
When executing this command online, be sure to remember to add the -n parameter; otherwise, the huge traffic online may instantly overwhelm your JVM memory.

Decompile the method to see
jad --source-only com.xx.xx.controller.ArthasController deadAdd


top --> top -Hp pid --> jstack pid 16 hexadecimal

2. Use Arthas to Analyze Memory Issues#

dashboard View GC data
 gc.ps_scavenge.count      gc.ps_marksweep.count  

Use the vmoption command to dynamically enable GC logging
vmoption PrintGC true
vmoption PrintGCDetails true

Use vmtool to force GC
vmtool --action forceGc
https://www.v2ex.com/t/792635

3. Dynamically Modify Log Print Levels#

logger
By default, the logger command will execute under SystemClassloader. If the application is a traditional war application or a Spring Boot fat jar application, then the classloader needs to be specified.
 logger -c 685f4c2e  --name ROOT --level debug

4. Decompile Source Code to Check if the JVM Loaded the Expected File Content.#

jad --source-only com.xx.xx.controller.AutoCutController preprocessing

5. Method Internal Call Paths and Each Method Call Duration.#

    monitor: This command is used to monitor the execution of methods, including the number of calls, successful calls, failed calls, maximum duration, minimum duration, and average duration. It is typically used to locate performance issues.
[arthas@19702]$ monitor com.xx.xx.service.impl.WjVideoServiceImpl preprocessingVideo  -n 10  --cycle 10
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 49 ms, listenerId: 3
 timestamp         class                     method                    total    success  fail    avg-rt(  fail-ra
                                                                                                 ms)      te
------------------------------------------------------------------------------------------------------------------
 2023-07-17 16:21  com.xx.xx.service.  preprocessingVideo        1        1        0       3755.98  0.00%
 :14               impl.WjVideoServiceImpl

 
 
    trace: This command is used to trace the call paths and durations of methods. It records the input parameters and return results of methods, as well as other methods called internally and their execution times.
 [arthas@19702]$  trace -E com.xx.xx.service.impl.VideoServiceImpl|com.xx.xx.controller.AutoCutController|com.xx.xx.service.impl.WjVideoServiceImpl preprocessing|preprocessingVideo -n 5  --skipJDKMethod false '1==1'
Press Q or Ctrl+C to abort.
Affect(class count: 4 , method count: 4) cost in 157 ms, listenerId: 2
`---ts=2023-07-17 16:20:42;thread_name=http-nio-8183-exec-8;id=2b;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@3576ddc2
    `---[1.069325ms] com.xx.xx.controller.AutoCutController$$EnhancerBySpringCGLIB$$333516af:preprocessing()
        `---[83.17% 0.889312ms ] org.springframework.cglib.proxy.MethodInterceptor:intercept()
            `---[43.36% 0.385619ms ] com.xx.xx.controller.AutoCutController:preprocessing()
                +---[6.45% 0.024868ms ] org.slf4j.MDC:get() #31
                +---[1.90% 0.007333ms ] com.xx.xx.common.vo.ProcessingVO:setTaskId() #31
                +---[70.24% 0.270861ms ] com.xx.xx.service.VideoService:preprocessingVideo() #32
                `---[2.93% 0.011295ms ] com.xx.xx.base.model.Result:success() #33


    tt: This command is used to record the historical call situation of methods, including input parameters, return results, and exception information. It can view both the current and historical call situations of methods and retry historical calls.
[arthas@19702]$ tt -t com.kscs.capcut.controller.AutoCutController preprocessing -n 5
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 2) cost in 39 ms, listenerId: 4
 INDEX  TIMESTAMP         COST(ms  IS-RE  IS-EX  OBJECT       CLASS                     METHOD
                          )        T      P
------------------------------------------------------------------------------------------------------------------
 1000   2023-07-17 16:23  0.20207  true   false  0x5ec8b24c   AutoCutController         preprocessing
        :17               4
 1001   2023-07-17 16:23  1.94166  true   false  0xe637be2a   AutoCutController$$Enhan  preprocessing
        :17               4                                   cerBySpringCGLIB$$333516

6. Code Hot Update#

1. Decompile the source code of the loaded class
jad --source-only com.kscs.capcut.controller.ArthasController 

2. Modify local code and compile, test. After passing, find the compiled .class file in the target directory
3. Hot update
retransform /Users/edy/IdeaProjects/ArthasController.class
2. View retransform entries
    retransform -l 
3. Delete retransform entries:
    retransform -d 1
    retransform --deleteAll

Impact on Service Performance#

When executing commands like trace/watch/tt/monitor, Arthas essentially inserts code before and after the method, which invalidates the code originally generated by JIT compilation in the JVM, requiring a return to the original code and re-JIT compilation. Therefore, high concurrency function calls can cause jitter.
Thus, it is advisable to avoid tracing/watching a large number of classes at once.
It is recommended to select a low-load machine in the cluster for analysis in online environments; if there is only one machine, use it cautiously.

Demo Used#

public class ArthasController {
    public static boolean flag = true;
    private List<ProcessingVO> stringList = new ArrayList<>();

    @RequestMapping(method = RequestMethod.POST, value = "flag")
    public Result<Map<String, String>> flag(Integer param1, Integer param2) {
        flag = !flag;
        return Result.success();
    }

    @RequestMapping(method = RequestMethod.POST, value = "oom")
    public Result<List<ProcessingVO>> oom() {
        stringList.clear();
        for (int i = 0; i < 1000000; i++) {
            if (stringList.size() > 500000) {
                continue;
            }
            ProcessingVO vo = new ProcessingVO();
            vo.setVideoName("videoName" + i);
            vo.setTaskId("taskId" + i);
            stringList.add(vo);
        }
        return Result.success();
    }

    @RequestMapping(method = RequestMethod.POST, value = "deadLock")
    public Result<Map<String, String>> deadLock() {
        Thread thread1 = new Thread(() -> {
            synchronized (String.class) {
                log.info("thread1 get String lock");
                try {
                    Thread.sleep(10000);
                } catch (Exception e) {
                    e.printStackTrace();
                }
                synchronized (Integer.class) {
                    log.info("thread1 get Integer lock");
                }
            }
        });
        Thread thread2 = new Thread(() -> {
            synchronized (Integer.class) {
                log.info("thread2 get Integer lock");
                try {
                    Thread.sleep(100);
                } catch (Exception e) {
                    e.printStackTrace();
                }
                synchronized (String.class) {
                    log.info("thread2 get String lock");
                }
            }
        });
        thread1.setName("deadlock-thread1");
        thread2.setName("deadlock-thread2");
        thread1.start();
        thread2.start();
        return Result.success();
    }

    @RequestMapping(method = RequestMethod.POST, value = "highCPU")
    public Result<Map<String, String>> highCPU() {
        Thread highCpuThread = new Thread(this::deadAdd);
        highCpuThread.setName("HighCpu");
        highCpuThread.start();
        return Result.success();
    }
    //0<-1<-2<-0
    private void deadAdd() {
        ListNode node2 = new ListNode(2, null); // Create node (data + reference)
        ListNode node1 = new ListNode(1, node2);
        ListNode node = new ListNode(0, node1);
        node2.next = node;
        while (node != null) {
            if (flag) {
                node = node.next;
            } else {
                break;
            }
        }
        System.out.println("stop deadAdd");

    }

    class ListNode {        // Class name: Java class is a custom data structure
        int val;            // Data: Node data
        ListNode next;      // Object: Reference to the next node object. In Java, there is no concept of pointers; references in Java are similar to pointers in C language.

        ListNode(int val, ListNode next) {  // Constructor: The constructor has the same name as the class
            this.val = val;   // Assign the received parameter to the current class's val variable
            this.next = next;
        }
    }
}
Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.