We often run into difficult problems in our daily work, and a number of tools have played a big part in solving them. I'm writing them down here: first as a note, so they can be looked up quickly before we forget them; second to share them, in the hope that readers will also bring out the tools they find most helpful in daily work so we can all improve together.
Linux commands
tail
The most commonly used form is tail -f.
tail -300f shopbase.log #Show the last 300 lines, then keep following the file as new lines are written
grep
grep forest f.txt #Search within a file
grep forest f.txt cpf.txt #Search across multiple files
grep 'log' /home/admin -r -n #Find all files under a directory that match the keyword
cat f.txt | grep -i shopbase #Case-insensitive match
grep 'shopbase' /home/admin -r -n --include *.{vm,java} #Restrict the search to the given file suffixes
grep 'shopbase' /home/admin -r -n --exclude *.{vm,java} #Exclude the given file suffixes
seq 10 | grep 5 -A 3 #Also print the 3 lines after the match
seq 10 | grep 5 -B 3 #Also print the 3 lines before the match
seq 10 | grep 5 -C 3 #Print 3 lines of context around the match; usually the most convenient
cat f.txt | grep -c 'SHOPBASE' #Count the matching lines
awk
1. Basic commands
awk '{print $4,$6}' f.txt
awk '{print NR,$0}' f.txt cpf.txt
awk '{print FNR,$0}' f.txt cpf.txt
awk '{print FNR,FILENAME,$0}' f.txt cpf.txt
awk '{print FILENAME,"NR="NR,"FNR="FNR,"$"NF"="$NF}' f.txt cpf.txt
echo 1:2:3:4 | awk -F: '{print $1,$2,$3,$4}'
2. Matching
awk '/ldb/ {print}' f.txt #Lines matching ldb
awk '!/ldb/ {print}' f.txt #Lines not matching ldb
awk '/ldb/ && /LISTEN/ {print}' f.txt #Lines matching both ldb and LISTEN
awk '$5 ~ /ldb/ {print}' f.txt #Lines whose fifth column matches ldb
3. Built-in variables
- NR: the number of records awk has read so far, counted by the record separator (newline by default), so by default it is the number of lines read. Think of it as "Number of Records".
- FNR: when awk processes multiple input files, NR keeps accumulating after the first file is finished rather than restarting from 1; FNR was introduced for this, and restarts from 1 for each new file. Think of it as "File Number of Records".
- NF: the number of fields the current record was split into. Think of it as "Number of Fields".
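A minimal sketch to make the difference concrete; the temporary files and their contents are invented for illustration:
printf 'a b\nc d\n' > /tmp/one.txt
printf 'e f\n' > /tmp/two.txt
awk '{print FILENAME, "NR="NR, "FNR="FNR, "NF="NF}' /tmp/one.txt /tmp/two.txt
#/tmp/one.txt NR=1 FNR=1 NF=2
#/tmp/one.txt NR=2 FNR=2 NF=2
#/tmp/two.txt NR=3 FNR=1 NF=2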
find
sudo -u admin find /home/admin /tmp /usr -name \*.log #Search several directories at once
find . -iname \*.txt #Case-insensitive name match
find . -type d #All subdirectories under the current directory
find /usr -type l #All symbolic links under /usr
find /usr -type l -name "z*" -ls #Details of the symbolic links, e.g. inode and directory
find /home/admin -size +250000k #Larger than 250000k; change + to - for "smaller than"
find /home/admin -type f -perm 777 -exec ls -l {} \; #Find files by permission
find /home/admin -atime -1 #Files accessed within the last day
find /home/admin -ctime -1 #Files whose status changed within the last day
find /home/admin -mtime -1 #Files modified within the last day
find /home/admin -amin -1 #Files accessed within the last minute
find /home/admin -cmin -1 #Files whose status changed within the last minute
find /home/admin -mmin -1 #Files modified within the last minute
pgm
Batch-query the shopbase logs that match a condition across the vm-shopbase machines
pgm -A -f vm-shopbase 'cat /home/admin/shopbase/logs/shopbase.log.2017-01-17|grep 2069861630'
tsar
tsar is our company's own metrics collection tool. It is easy to use: the collected data is persisted to disk, so historical system data can be queried quickly, and real-time data can of course be queried as well. It is installed on most machines.
tsar ###View the metrics of the last day
tsar --live ###View real-time metrics, refreshed every five seconds by default
tsar -d 20161218 ###View the data of a given day; it seems you can only go back about four months
tsar --mem
tsar --load
tsar --cpu
###These can also be combined with the -d parameter to look at a single metric on a given day
top
Besides showing some basic information, top is mostly used together with other tools to investigate JVM problems, for example:
ps -ef | grep java
top -H -p pid
After finding the busy thread ID, convert it from decimal to hexadecimal, then use jstack to capture a thread dump and see what that thread is doing.
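A minimal sketch of that workflow; the pid 2815 matches the examples later in this article, while the thread id 28158 is invented:
top -H -p 2815 #Find the busiest thread; the PID column shows its decimal thread id
printf '%x\n' 28158 #Convert the decimal thread id to hex, here 6dfe
jstack 2815 | grep -A 20 'nid=0x6dfe' #Locate that thread in the stack dump (jstack tags threads with nid=0x...)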
other
netstat -nat|awk '{print $6}'|sort|uniq -c|sort -rn #Check current connections grouped by state; watch out for a high CLOSE_WAIT count
Troubleshooting power tools
btrace
The first one to mention is btrace. It really is a problem killer in the production environment. I'll skip the introduction and go straight to the code.
1. Find out who calls ArrayList.add, printing the calling thread's stack only when the ArrayList in question is large enough (greater than 479 in the code below)
@OnMethod(clazz = "java.util.ArrayList", method="add", location = @Location(value = Kind.CALL, clazz = "/.*/", method = "/.*/")) public static void m(@ProbeClassName String probeClass, @ProbeMethodName String probeMethod, @TargetInstance Object instance, @TargetMethodOrField String method) { if(getInt(field("java.util.ArrayList", "size"), instance) > 479){ println("check who ArrayList.add method:" + probeClass + "#" + probeMethod + ", method:" + method + ", size:" + getInt(field("java.util.ArrayList", "size"), instance)); jstack(); println(); println("==========================="); println(); } }
2. Monitor the parameters a service method is called with and the value it returns
@OnMethod(clazz = "com.taobao.sellerhome.transfer.biz.impl.C2CApplyerServiceImpl", method="nav", location = @Location(value = Kind.RETURN)) public static void mt(long userId, int current, int relation, String check, String redirectUrl, @Return AnyType result) { println("parameter# userId:" + userId + ", current:" + current + ", relation:" + relation + ", check:" + check + ", redirectUrl:" + redirectUrl + ", result:" + result); }
Its other functions overlap more or less with the other tools in this group, so I won't go over them. If you are interested, see https://github.com/btraceio/btrace
Caveats:
- In my experience, the output of release 1.3.9 is unstable; you may need to trigger the probe several times before you see the correct result.
- When using a regular expression to match the classes to trace, keep the scope tight, otherwise the application may hang because btrace eats all the CPU.
- Because it works by bytecode injection, you need to restart the application to bring it back to its original state.
Greys
Greys is @Du Kun's masterpiece. A few of its great features (some overlap with btrace):
- sc -df xxx: print the details of a class, including its source location and classloader hierarchy
- trace: I really like this one! JProfiler had this feature a long time ago. It prints how long the current method call takes, broken down into each sub-method, which is very helpful for investigating method performance. For example, this article uses the trace command: http://www.atatech.org/articles/52947 . A short usage sketch follows below.
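A rough sketch of a Greys session, assuming a target pid of 2815 and a made-up service class; the attach script name and command syntax may differ between Greys versions:
./greys.sh 2815 #Attach the Greys console to the target JVM
trace com.example.OrderService submitOrder #Time each invocation of submitOrder, broken down by sub-call
sc -df com.example.OrderService #Show class details, source location and classloader hierarchy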
Its other functions overlap with btrace; pick whichever you prefer. If you are interested, see http://www.atatech.org/articles/26247
Related to this is Arthas, which is built on top of Greys. If you are interested, see http://mw.alibaba-inc.com/products/arthas/docs/middleware-container/arthas.wiki/home.html?spm=a1z9z.8109794.header.32.1lsoMc
javOSize
One feature worth mentioning is classes: by modifying bytecode, it changes the content of a class and the change takes effect immediately, so you can quickly add a log statement somewhere and look at the output. The downside is that it is very intrusive to the code, but if you know what you are doing it is a good thing.
Its other functions can easily be done with Greys and btrace, so I won't repeat them.
For an introduction to javOSize, see http://www.atatech.org/articles/38546 ; the official site is http://www.javosize.com/
Arthas
A powerful troubleshooting tool recently open-sourced by Alibaba. It is very convenient to use. For more operations and steps, see https://github.com/alibaba/arthas
JProfiler
Many problems used to require JProfiler to diagnose, but nowadays Greys and btrace can basically handle them. Besides, the problems are mostly in the production environment (network-isolated), so it doesn't get used much any more, but it is still worth noting. Official site: https://www.ej-technologies.com/products/jprofiler/overview.html
The big guns
eclipseMAT
It can be used either as an Eclipse plug-in or as a standalone program. For details see http://www.eclipse.org/mat/
zprofiler
Developed inside the group, everyone should know it. In a nutshell: with zprofiler, who still needs MAT? For details head over to zprofiler.alibaba-inc.com
Java's three trusty axes; oh no, make that seven
jps
I only use one command:
sudo -u admin /opt/taobao/java/bin/jps -mlvV
jstack
Common usage:
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jstack 2815
native+java stack:
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jstack -m 2815
jinfo
You can see the system startup parameters as follows:
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jinfo -flags 2815
jmap
Three uses
1. Check the heap
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jmap -heap 2815
2.dump
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jmap -dump:live,format=b,file=/tmp/heap2.bin 2815
or
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jmap -dump:format=b,file=/tmp/heap3.bin 2815
3. See who is occupying the heap; combined with zprofiler and btrace, it makes troubleshooting like a tiger with wings
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jmap -histo 2815 | head -10
jstat
jstat has many options, but this one is enough (it prints GC utilization for the process every 1000 ms):
sudo -u admin /opt/taobao/install/ajdk-8_1_1_fp1-b52/bin/jstat -gcutil 2815 1000
jdb
jdb comes in handy from time to time these days. jdb can be used to debug the staging (pre-release) environment. Assuming the staging JAVA_HOME is /opt/taobao/java/ and the remote debugging port is 8000, then:
sudo -u admin /opt/taobao/java/bin/jdb -attach 8000
You can then set breakpoints and debug. For the specific parameters, see Oracle's official documentation.
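A minimal sketch of a jdb session once attached; the class name and line number are hypothetical:
stop in com.example.FooService.doWork #Break when the method is entered
stop at com.example.FooService:42 #Break at a specific source line
cont #Resume execution until a breakpoint is hit
where #Print the current thread's call stack
locals #Show the local variables of the current frame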
CLHSDB
CLHSDB can often reveal more interesting things; I won't describe it in detail here. It is said that tools such as jstack and jmap are built on top of it.
sudo -u admin /opt/taobao/java/bin/java -classpath /opt/taobao/java/lib/sa-jdi.jar sun.jvm.hotspot.CLHSDB
For more detail, see R大's post on the subject.
VM options
Which file is your class loaded from?
-XX:+TraceClassLoading
The output looks like: [Loaded java.lang.invoke.MethodHandleImpl$Lazy from D:\programme\jdk\jdk8U74\jre\lib\rt.jar]
Write a heap dump when the application dies with an OutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/admin/logs/java.hprof
These options are basically always present in the group's standard VM parameters.
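For illustration, this is roughly where the flags sit on a java command line; the heap size and jar name are made up:
java -Xmx4g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/admin/logs/java.hprof -jar app.jar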
jar package conflict
Does this deserve its own heading? Everyone has dealt with this annoying problem more or less. I have this many approaches listed below; hard to believe, right?
mvn dependency:tree > ~/dependency.txt
Print the whole dependency tree
mvn dependency:tree -Dverbose -Dincludes=groupId:artifactId
Print only the dependency paths that lead to the specified groupId and artifactId
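A hypothetical example; the artifact is chosen only for illustration. In the -Dverbose output, losing versions are marked "omitted for conflict with ...", which is exactly what you scan for:
mvn dependency:tree -Dverbose -Dincludes=org.apache.httpcomponents:httpclient > ~/httpclient-deps.txt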
-XX:+TraceClassLoading
Add this to the VM startup parameters (e.g. in the Tomcat startup script); it prints details of where each class is loaded from.
-verbose
Add this to the VM startup parameters (e.g. in the Tomcat startup script); it prints details of where each class is loaded from.
greys:sc
The sc command of Greys also clearly shows where a class is loaded from.
tomcat-classloader-locate
The following URL tells you where a given class was loaded from:
curl http://localhost:8006/classloader/locate?class=org.apache.xerces.xs.XSObject
A nice surprise from ALI-TOMCAT (thanks @Wu Guan)
List the jars loaded by the container
curl http://localhost:8006/classloader/jars
List the actual jar location a class is currently loaded from, which is useful for resolving class conflicts
curl http://localhost:8006/classloader/locate?class=org.apache.xerces.xs.XSObject
other
gpref
http://www.atatech.org/articles/33317
dmesg
If you find that your java process has quietly disappeared without leaving any clues, then dmesg is likely to have what you want.
sudo dmesg|grep -i kill|less
Look for the keyword oom-killer; the results look something like this:
[6710782.021013] java invoked oom-killer: gfp_mask=0xd0, order=0, oom_adj=0, oom_scoe_adj=0
[6710782.070639] [<ffffffff81118898>] ? oom_kill_process+0x68/0x140
[6710782.257588] Task in /LXC011175068174 killed as a result of limit of /LXC011175068174
[6710784.698347] Memory cgroup out of memory: Kill process 215701 (java) score 854 or sacrifice child
[6710784.707978] Killed process 215701, UID 679, (java) total-vm:11017300kB, anon-rss:7152432kB, file-rss:1232kB
The above shows that the corresponding java process was killed by the system's OOM Killer, with a score of 854.
A word about the out-of-memory killer: it monitors the machine's memory consumption. Before the machine runs out of memory entirely, it scans all processes, scores them according to certain rules (memory usage, runtime, etc.), picks the process with the highest score, and kills it to protect the machine.
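If you want to see how a particular process scores, the kernel exposes the numbers under /proc; the pid is a placeholder:
cat /proc/2815/oom_score #Current score the OOM killer uses to rank this process
cat /proc/2815/oom_score_adj #Adjustment value; -1000 effectively exempts the process from OOM killing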
Formula for converting a dmesg timestamp to wall-clock time: actual log time = 1970-01-01 UTC + (current time in seconds - seconds since boot + the timestamp printed by dmesg) seconds:
date -d "1970-01-01 UTC `echo "$(date +%s)-$(cat /proc/uptime|cut -f 1 -d' ')+12288812.926194"|bc ` seconds"