Linux server performance: viewing, analysis, and tuning

I. Viewing Linux server performance

1.1 Viewing CPU information

1. View the number of physical CPUs:

cat /proc/cpuinfo |grep "physical id"|sort|uniq|wc -l

2. View the number of cores in each physical CPU:

cat /proc/cpuinfo |grep "cpu cores"|uniq

3. Number of logical CPUs:

cat /proc/cpuinfo |grep "processor"|wc -l

Number of physical CPUs * number of cores per CPU = number of logical CPUs (when hyper-threading is not supported or is disabled)
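The three counts above can also be collected in one pass. The following is a minimal sketch; cpu_summary is a hypothetical helper name, and it assumes the x86-style "physical id", "cpu cores" and "processor" fields are present (some virtual machines and non-x86 systems omit them).

```shell
# Summarize CPU topology from /proc/cpuinfo-style text on stdin.
# cpu_summary is a hypothetical helper name, not a standard tool.
cpu_summary() {
    awk -F: '
        /^physical id/ { sockets[$2] = 1 }   # one entry per distinct physical CPU
        /^cpu cores/   { cores = $2 + 0 }    # cores per physical CPU
        /^processor/   { logical++ }         # one line per logical CPU
        END {
            n = 0
            for (s in sockets) n++
            printf "physical=%d cores_per_cpu=%d logical=%d\n", n, cores, logical
        }'
}
```

On a live system it would be run as: cpu_summary < /proc/cpuinfo. If logical is larger than physical * cores_per_cpu, hyper-threading is most likely enabled.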

1.2 Viewing memory

1. View memory usage:

# free -m
             total       used       free     shared    buffers     cached
Mem:          3949       2519       1430          0        189       1619
-/+ buffers/cache:        710       3239
Swap:         3576          0       3576

total: Total physical memory
used: Memory in use
free: Free memory
shared: Total memory shared by multiple processes
- buffers/cache: Memory actually used by applications, i.e. used - buffers - cached
+ buffers/cache: Memory actually available to applications, i.e. free + buffers + cached

The buffer cache is used for reading and writing disk blocks;
the page cache is used for reading and writing file contents (inodes). Both caches shorten the time spent in I/O system calls.

From the operating system's point of view, free/used are the available/occupied memory.
From an application's point of view, the -/+ buffers/cache line gives the available/occupied memory, because memory held in buffers/cache can be reclaimed quickly when applications need it.

We should take the application's point of view.

1.3 Viewing hard disks

1. View hard disk and partition information:

fdisk -l

2. View disk space usage:

df -h

3. Check the I/O performance of the hard disk (once every second, 5 times):

iostat -x 1 5

iostat is included in the sysstat package, which can be installed using yum -y install sysstat.

Parameters to watch:

If %util is close to 100%, too many I/O requests are being generated, the I/O system is saturated, and the disk may be a bottleneck.
If idle is below 70%, I/O pressure is relatively high, and processes are spending considerable time waiting on reads.

4. Check the size of a directory:

du -sh /root

If a partition is nearly full, change to its mount point and use the following command to list the 10 largest files or directories there, sorted from largest to smallest. Sizes are printed in KB so that sort -rn can order them numerically, and the first line is the grand total added by -c:

du -cks *|sort -rn|head -n 10

1.4 Viewing the load average

Sometimes the system responds very slowly and the cause is not obvious. In that case, check the load average to see whether a large number of processes are waiting in the run queue.

The simplest command:

uptime: shows the average number of processes in the run queue over the past 1, 5, and 15 minutes.

There is also the dynamic command top.
We only care about the following part:

top - 21:33:09 up  1:00,  1 user,  load average: 0.00, 0.01, 0.05

If each logical CPU has no more than 3 active processes, system performance is good.
If each logical CPU has no more than 4 active processes, performance is acceptable.
If each logical CPU has more than 5 active processes, the system has a serious performance problem.

General calculation method: load value / number of logical CPUs
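The division above is easy to script. This is a small sketch rather than a standard tool: load_per_cpu is a hypothetical helper name, and awk is used because plain shell arithmetic cannot handle fractional values.

```shell
# load_per_cpu LOAD NCPU: print the load average divided by the
# number of logical CPUs, rounded to two decimal places.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Example on a live Linux system (first field of /proc/loadavg is the
# 1-minute load average; nproc prints the number of available CPUs):
#   load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

For instance, load_per_cpu 1.22 8 prints 0.15, which is far below the thresholds listed above.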

You can also use the vmstat command to judge whether the system is busy. Its fields are:

procs
r: Number of processes waiting to run.
b: Number of processes in uninterruptible sleep.
w: Number of runnable processes that have been swapped out.

memory
swpd: Virtual memory in use, in KB.
free: Free memory, in KB.
buff: Memory used as buffers, in KB.

swap
si: Memory swapped in from disk, in KB per second.
so: Memory swapped out to disk, in KB per second.

io
bi: Blocks received from a block device, per second.
bo: Blocks sent to a block device, per second.

system
in: Number of interrupts per second, including clock interrupts.
cs: Number of context switches per second.

cpu
Shown as percentages of total CPU time.
us: Time spent running user code.
sy: Time spent running kernel (system) code.
id: Idle time.

1.5 Other information

View the kernel version:
uname -a

Short form: uname -r

Check whether the system is 32-bit or 64-bit:
file /sbin/init

View the distribution release:
cat /etc/issue
or lsb_release -a

View the kernel modules loaded in the system:
lsmod

View PCI devices:
lspci

II. Performance evaluation of Linux server

2.1.1 Factors affecting Linux server performance

1. Operating system level

CPU
Memory
Disk I/O bandwidth
Network I/O bandwidth

2. Application level

2.1.2 System performance evaluation criteria

Factor   Good                                 Bad                     Very bad
CPU      user% + sys% < 70%                   user% + sys% = 85%      user% + sys% >= 90%
Memory   Swap In (si) = 0, Swap Out (so) = 0  Per CPU with 10 page/s  More Swap In & Swap Out
Disk     iowait% < 20%                        iowait% = 35%           iowait% >= 50%

Where:

%user: percentage of time the CPU spends in user mode.
%sys: percentage of time the CPU spends in system mode.
%iowait: percentage of time the CPU spends waiting for I/O to complete.
swap in: si, pages of virtual memory brought in, i.e. swapped from the swap disk into RAM.
swap out: so, pages of virtual memory written out, i.e. swapped from RAM to the swap disk.

2.1.3 System performance analysis tools

1. Common system commands

vmstat, sar, iostat, netstat, free, ps, top, etc.

2. Common combinations

vmstat, sar, iostat: check whether the CPU is the bottleneck
free, vmstat: check whether memory is the bottleneck
iostat: check whether disk I/O is the bottleneck
netstat: check whether network bandwidth is the bottleneck

2.1.4 Linux performance evaluation and optimization

Overall system performance evaluation (uptime command)
uptime

16:38:00 up 118 days, 3:01, 5 users, load average: 1.22, 1.02, 0.91

Note:

  • Generally, none of the three load average values should exceed the number of CPUs in the system.

The system in this example has 8 CPUs. If the three load average values stay above 8 for a long time, the CPUs are very busy and the load is very high, which may affect system performance.

  • If a value only occasionally exceeds 8, system performance is not affected.

  • If the load average is below the number of CPUs, the CPUs still have idle time slices. In the output above, for example, the CPUs are largely idle.

2.2.1 CPU performance evaluation

1. Use the vmstat command to monitor the CPU

vmstat displays brief performance information about the system's main resources; here it is used mainly to look at CPU load.

The following is the output of the vmstat command on a system:

[root@node1 ~]# vmstat 2 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs  us  sy  id  wa  st
 0  0      0 162240   8304  67032    0    0    13    21 1007   23   0   1  98   0   0
 0  0      0 162240   8304  67032    0    0     1     0 1010   20   0   1 100   0   0
 0  0      0 162240   8304  67032    0    0     1     1 1009   18   0   1  99   0   0

procs

r: the number of processes running or waiting for a CPU time slice. If this value stays above the number of CPUs for a long time, CPU capacity is insufficient and more CPUs are needed.

b: the number of processes waiting for resources, such as I/O or memory swapping.

cpu

us

Percentage of CPU time consumed by user processes.
A high us value means user processes are consuming a lot of CPU time; if it stays above 50% for a long time, consider optimizing the program or its algorithms.

sy

Percentage of CPU time consumed by the kernel. A high sy value means the kernel is consuming a lot of CPU resources.

As a rule of thumb, the reference value for us + sy is 80%. If us + sy exceeds 80%, CPU resources may be insufficient.
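The two rules of thumb (r compared with the CPU count, and us + sy compared with 80%) can be applied to vmstat output mechanically. The sketch below assumes the 17-column layout shown above, where r is column 1, us is column 13 and sy is column 14; check_vmstat_cpu is a hypothetical helper name.

```shell
# check_vmstat_cpu NCPU: read vmstat output on stdin and flag samples
# that exceed the rules of thumb above.
check_vmstat_cpu() {
    awk -v ncpu="$1" '
        $1 ~ /^[0-9]+$/ {                    # skip the two header lines
            busy = $13 + $14                 # us + sy
            flags = ""
            if (busy > 80) flags = flags " cpu-busy"
            if ($1 > ncpu) flags = flags " run-queue-long"
            if (flags == "") flags = " ok"
            printf "r=%d us+sy=%d%s\n", $1, busy, flags
        }'
}
```

A typical invocation would be: vmstat 2 3 | check_vmstat_cpu 8. A single flagged sample means little; only values that persist over many samples indicate a CPU shortage.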

2. Use the sar command to monitor the CPU

sar collects separate statistics on each aspect of the system. Running it adds some system overhead, but that overhead is measurable and does not significantly affect the statistics.

The following is the CPU statistics output of the sar command on a system:

[root@webserver ~]# sar -u 3 5
Linux 2.6.9-42.ELsmp (webserver)        11/28/2008      _i686_  (8 CPU)

11:41:24 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:41:27 AM     all      0.88      0.00      0.29      0.00      0.00     98.83
11:41:30 AM     all      0.13      0.00      0.17      0.21      0.00     99.50
11:41:33 AM     all      0.04      0.00      0.04      0.00      0.00     99.92
11:41:36 AM     all     90.08      0.00      0.13      0.16      0.00      9.63
11:41:39 AM     all      0.38      0.00      0.17      0.04      0.00     99.41
Average:        all      0.34      0.00      0.16      0.05      0.00     99.45

The output is explained as follows:

%user: percentage of CPU time consumed by user processes.
%nice: percentage of CPU time consumed by user processes running with a modified nice priority.
%system: percentage of CPU time consumed by the kernel (system mode).
%iowait: percentage of time the CPU was idle while waiting for I/O.
%steal: percentage of time a virtual CPU waited involuntarily while the hypervisor serviced another virtual processor.
%idle: percentage of time the CPU was idle.

Question

Have you ever seen a system whose overall CPU utilization is low while the application is slow?

On a multi-CPU system this can happen when the program is single-threaded. A single thread runs on only one CPU, so that CPU reaches 100% utilization and cannot handle further requests while the other CPUs sit idle. Overall CPU utilization therefore stays low even though the application responds slowly.

2.3.1 Memory performance evaluation

1. Use the free command to monitor memory

free is the most commonly used command for monitoring memory usage on Linux. See the following output:

[root@webserver ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          8111       7185        926          0        243       6299
-/+ buffers/cache:        643       7468
Swap:         8189          0       8189

Rules of thumb:

Application-available memory / physical memory > 70%: memory resources are ample and do not affect system performance.
Application-available memory / physical memory < 20%: the system is short of memory and more memory should be added.
20% < application-available memory / physical memory < 70%: memory resources basically meet the application's needs and do not affect system performance for now.
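As a worked example of these thresholds, the sketch below computes the ratio from the Mem: line of the free output, taking application-available memory as free + buffers + cached. Both function names are hypothetical, and the 70% and 20% cut-offs are simply the ones stated above.

```shell
# app_mem_ratio TOTAL FREE BUFFERS CACHED: percentage of physical memory
# available to applications (integer part only).
app_mem_ratio() {
    echo $(( ($2 + $3 + $4) * 100 / $1 ))
}

# mem_status PERCENT: classify the ratio using the thresholds above.
mem_status() {
    if [ "$1" -gt 70 ]; then
        echo "sufficient"
    elif [ "$1" -lt 20 ]; then
        echo "short"
    else
        echo "adequate"
    fi
}
```

With the values from the output above, app_mem_ratio 8111 926 243 6299 prints 92, and mem_status 92 prints sufficient.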

2. Use the vmstat command to monitor memory

[root@node1 ~]# vmstat 2 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs  us  sy  id  wa  st
 0  0      0 162240   8304  67032    0    0    13    21 1007   23   0   1  98   0   0
 0  0      0 162240   8304  67032    0    0     1     0 1010   20   0   1 100   0   0
 0  0      0 162240   8304  67032    0    0     1     1 1009   18   0   1  99   0   0

memory

swpd: amount of memory swapped out to the swap area (in KB). An occasional non-zero swpd value does not affect system performance.
free: amount of physical memory currently free (in KB).
buff: amount of memory in the buffer cache. Reads and writes to block devices generally need buffering.
cache: amount of memory in the page cache.

The page cache generally serves as the file system cache, so frequently accessed files are cached. A large cache value means many files are cached. If bi under io is also small, the file system is working efficiently.

swap

si: amount of memory swapped in from disk (the swap area) to memory.
so: amount of memory swapped out from memory to disk (the swap area).

If si and so stay non-zero for a long time, the system is short of memory and more memory should be added.
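One way to check whether swapping is sustained rather than occasional is to count the vmstat samples in which si or so is non-zero. This is a minimal sketch that assumes si and so are columns 7 and 8, as in the output above; count_swap_samples is a hypothetical helper name.

```shell
# count_swap_samples: read vmstat output on stdin and report how many
# samples show swap-in (column 7) or swap-out (column 8) activity.
count_swap_samples() {
    awk '
        $1 ~ /^[0-9]+$/ {                    # data lines only, headers skipped
            total++
            if ($7 > 0 || $8 > 0) swapping++
        }
        END { printf "%d of %d samples show swapping\n", swapping, total }'
}
```

It could be run as: vmstat 5 60 | count_swap_samples. Note that the first vmstat sample reports averages since boot, so it is worth collecting enough samples that this one does not dominate the result.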

2.4.1 Disk I/O performance evaluation

1. Disk storage basics

For frequently accessed files or data, read and write them in memory where possible instead of going directly to disk; memory access is roughly a thousand times more efficient.

Separate files that are read and written often from files that rarely change, and place them on different disk devices.

For data with frequent write operations, consider using raw devices instead of file systems.

Advantages of raw devices:

Data is read and written directly, bypassing the operating system cache, which saves memory and avoids memory contention;
File system maintenance overhead, such as maintaining superblocks and inodes, is avoided;
The operating system's cache read-ahead is bypassed, which reduces I/O requests.

Disadvantages of raw devices:

Data management and space management are inflexible and require a highly skilled operator.

2. Evaluate disk performance with iostat

[root@webserver ~]# iostat -d 2 3
Linux 2.6.9-42.ELsmp (webserver)        12/01/2008      _i686_  (8 CPU)

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.87         2.58       114.12    6479462  286537372

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               1.00         0.00        12.00          0         24

The explanation is as follows:

Blk_read/s: number of blocks read per second
Blk_wrtn/s: number of blocks written per second
Blk_read: total number of blocks read
Blk_wrtn: total number of blocks written

The Blk_read/s and Blk_wrtn/s values give a basic picture of disk read and write activity.
A large Blk_wrtn/s value means the disk is written to frequently; consider optimizing the disk or the program.
A large Blk_read/s value means the disk is read directly very often; consider keeping the data being read in memory.

Rule of thumb:

Sustained, very heavy reading and writing is abnormal and will affect system performance.

3. Evaluate disk performance with sar

The sar -d command gives basic statistics on the system's disk I/O. See the following output:

[root@webserver ~]# sar -d 2 3
Linux 2.6.9-42.ELsmp (webserver)        11/30/2008      _i686_  (8 CPU)

11:09:33 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
11:09:35 PM    dev8-0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00

11:09:35 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
11:09:37 PM    dev8-0      1.00      0.00     12.00     12.00      0.00      0.00      0.00      0.00

11:09:37 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
11:09:39 PM    dev8-0      1.99      0.00     47.76     24.00      0.00      0.50      0.25      0.05

Average:          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:       dev8-0      1.00      0.00     19.97     20.00      0.00      0.33      0.17      0.02

Parameter meanings:

await: average wait time of I/O operations on the device (ms)
svctm: average service time of I/O operations on the device (ms)
%util: percentage of each second spent on I/O operations

Evaluation criteria for disk I/O performance:

Normally svctm is smaller than await. svctm depends on disk performance; CPU and memory load also affect it, and too many requests indirectly increase it.

await depends on svctm, the I/O queue length, and the I/O request pattern.
If svctm is very close to await, there is almost no I/O wait and disk performance is very good.
If await is much higher than svctm, the I/O queue is too long and applications running on the system will slow down.
In that case, replacing the disk with a faster one can solve the problem.

%util is an important indicator of disk I/O.

If %util is close to 100%, the disk is receiving too many I/O requests and the I/O system is working at full capacity, so the disk may be a bottleneck.

The remedy is to optimize the program or to replace the disk with a faster, higher-performance one.
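Since await covers both queueing and service, the time a request spends waiting in the queue can be estimated as await - svctm. The helper below is a hypothetical sketch of that subtraction; newer sysstat releases mark svctm as unreliable, so treat the result as a rough indication only.

```shell
# queue_wait AWAIT SVCTM: estimated time (ms) a request spends queued
# before it is serviced, rounded to two decimal places.
queue_wait() {
    awk -v await="$1" -v svctm="$2" 'BEGIN { printf "%.2f\n", await - svctm }'
}
```

For the 11:09:39 sample above, queue_wait 0.50 0.25 prints 0.25, so queueing is negligible on that disk.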

2.5.1 Network performance evaluation

(1) Use the ping command to check network connectivity
(2) Use netstat -i to check network interface status
(3) Use netstat -r to check the system's routing table
(4) Use sar -n to display the system's network activity

III. Linux server performance tuning

1. Adjust the Linux kernel's elevator algorithm for disk I/O

Once the file system has been chosen, the elevator (I/O scheduler) algorithm balances the need for low latency against collecting enough requests to organize reads and writes to the disk efficiently.
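On Linux the scheduler for each block device is exposed under /sys/block/DEVICE/queue/scheduler, with the active one shown in square brackets. The following sketch extracts that name; active_scheduler is a hypothetical helper, sda is only an example device, and the schedulers on offer depend on the kernel version.

```shell
# active_scheduler: read the contents of a queue/scheduler file on stdin
# and print the scheduler shown in square brackets (the active one).
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example on a live system (sda is an assumed device name):
#   active_scheduler < /sys/block/sda/queue/scheduler
# Switching scheduler as root, using a name listed in that file:
#   echo deadline > /sys/block/sda/queue/scheduler
```

A change made through /sys lasts only until the next reboot, so it is a convenient way to compare schedulers under a real workload before making the setting permanent.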

2. Disable unnecessary daemons to save memory and CPU resources

Many daemons or services are usually unnecessary; they consume valuable memory and CPU time and can put the server at risk.
Disabling them speeds up startup and frees memory.

It also reduces the number of processes the CPU has to handle.

Some Linux daemons that run automatically by default and should be disabled:

No.  Daemon    Description
1    apmd      Advanced power management daemon
2    nfslock   NFS file locking
3    isdn      ISDN modem support
4    autofs    Automatically mounts file systems in the background (such as CD-ROM)
5    sendmail  Mail transfer agent
6    xfs       X Window font server

3. Turn off the GUI

4. Clean up unwanted modules or functions

Many of the functions or modules enabled in server software packages are not actually needed (many Apache modules, for example). Disabling them frees memory and other resources so that the software that really needs them runs faster.

5. Disable control panels

Linux has many popular control panels, such as Cpanel, Plesk, Webmin and phpMyAdmin. Disabling them frees about 120 MB of memory, reducing memory usage by about 30-40%.

6. Improve Linux Exim server performance

Using a DNS cache daemon reduces the bandwidth and CPU time needed to resolve DNS records. A DNS cache improves network performance by removing the need to look up DNS records from the root servers every time.

Djbdns is a very powerful DNS server with DNS caching. Djbdns is more secure and performs better than the BIND DNS server. It can be downloaded directly from http://cr.yp.to/ or obtained through the package provided by Red Hat.

7. Use AES256 to strengthen gpg file encryption

To better protect backup files or sensitive information, many Linux system administrators encrypt them with gpg. When using gpg, it is best to tell it to use the AES256 cipher. AES256 uses a 256-bit key. It is an open encryption algorithm, and the National Security Agency (NSA) uses it to protect top secret information.

8. Remote backup service security

Security is the most important factor when choosing a remote backup service. Most system administrators fear two things: that hackers can delete the backup files, and that the system cannot be restored from backup.

To keep backup files secure, backup service companies provide a remote backup server and use scp scripts or rsync to transfer data over SSH. That way nobody can log in to the remote system directly, and so nobody can delete data from the backup service. When choosing a remote backup provider, look into the robustness of its service from several angles, and test it yourself if you can.

9. Update default kernel parameter settings

To run enterprise applications such as database servers smoothly, some default kernel parameter settings may need to be updated. For example, in 2.4.x kernels the message queue parameter msgmni has a low default value, and shared memory (shmmax) is only 33554432 bytes by default on Red Hat systems, which allows only a limited number of concurrent database connections. The following are some recommended values for running a database server (from the IBM DB2 support website):

kernel.shmmax=268435456 (32-bit)
kernel.shmmax=1073741824 (64-bit)
kernel.msgmni=1024
fs.file-max=8192
kernel.sem="250 32000 32 1024"
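Before changing anything, it helps to compare the running values with the recommended ones. Each sysctl key maps to a file under /proc/sys with the dots replaced by slashes; the sketch below relies on that mapping. Both function names are hypothetical, and the recommended values passed in are simply those listed above.

```shell
# sysctl_path KEY: map a key such as kernel.shmmax to its /proc/sys file.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

# show_param KEY RECOMMENDED: print the running value next to the
# recommended one, or "unavailable" if the kernel lacks the parameter.
show_param() {
    file=$(sysctl_path "$1")
    if [ -r "$file" ]; then
        echo "$1 current=$(cat "$file") recommended=$2"
    else
        echo "$1 current=unavailable recommended=$2"
    fi
}
```

For example, show_param kernel.msgmni 1024 prints the running value beside 1024. Settings are normally made permanent by adding them to /etc/sysctl.conf and loading them with sysctl -p.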

10. Optimize TCP

Tuning TCP helps improve network throughput. When traffic crosses a WAN with high bandwidth and long delay, a larger TCP window size is recommended to raise the data transfer rate. The TCP window size determines how much data the sending host can send to the receiving host before it receives an acknowledgement.
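A common way to estimate how much unacknowledged data a link can hold is the bandwidth-delay product: bandwidth multiplied by round-trip time. The helper below is a hypothetical sketch of that arithmetic, using decimal megabits and an integer round-trip time in milliseconds.

```shell
# bdp_bytes BANDWIDTH_MBIT RTT_MS: bandwidth-delay product in bytes.
# bytes = (Mbit/s * 1000000 / 8) * (ms / 1000) = Mbit/s * ms * 125
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}
```

For a 100 Mbit/s link with a 100 ms round trip, bdp_bytes 100 100 prints 1250000, i.e. about 1.25 MB in flight. On Linux the buffer limits that bound the window are typically controlled through net.ipv4.tcp_rmem, net.ipv4.tcp_wmem and net.core.rmem_max; which values suit a given server is left as an assumption to verify by measurement.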

11. Select the correct file system

Replace ext3 with the ext4 file system:

● Ext4 is an enhanced version of the ext3 file system that raises the storage limits

● Journaling ensures a high level of data integrity (after an abnormal shutdown)

● After an abnormal shutdown and restart, the disk does not need a full check (a very time-consuming operation)

● Faster writes: the ext4 journal optimizes disk head movement

12. Use the noatime mount option

Add the noatime option to the file system's entry in fstab, the configuration file read at startup. If external storage is used, this mount option can noticeably improve performance.
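To see which file systems are currently mounted without the option, the mount table in /proc/mounts can be filtered. This is a minimal sketch limited to ext3 and ext4; mounts_without_noatime is a hypothetical helper name, and the fstab entry in the comment uses made-up device and mount point names.

```shell
# mounts_without_noatime: read /proc/mounts-style lines on stdin and print
# the mount points of ext3/ext4 file systems whose options lack noatime.
# Fields: device, mount point, type, options, dump, pass.
mounts_without_noatime() {
    awk '($3 == "ext3" || $3 == "ext4") && $4 !~ /(^|,)noatime(,|$)/ { print $2 }'
}

# Example fstab entry with the option added (hypothetical device and path):
#   /dev/sdb1  /data  ext4  defaults,noatime  0  2
```

On a live system it would be run as: mounts_without_noatime < /proc/mounts. An existing mount can usually be switched without a reboot using mount -o remount,noatime followed by the mount point.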

13. Adjust Linux file descriptor limits

Linux limits the number of file descriptors any process can open; the default limit is 1024 per process. These limits can keep benchmark clients (such as httperf and ApacheBench) and the web server itself from reaching their best performance. Apache uses one process per connection, so it is not affected. Single-process web servers such as Zeus, however, use one file descriptor per connection and are therefore easily affected by the default limit.

The open file limit can be adjusted with the ulimit command. ulimit -aS displays the current soft limits and ulimit -aH displays the hard limits (you may need to adjust the kernel parameters in /proc before you can raise the limit).
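For the open-file limit specifically, the -n option selects just that resource. The sketch below reads both limits and raises the soft limit as far as the hard limit allows; it affects only the current shell and the processes started from it.

```shell
# Show the soft and hard limits on open file descriptors for this shell.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "soft=$soft hard=$hard"

# Raise the soft limit up to the hard limit. No special privileges are
# needed because the new value does not exceed the hard limit.
ulimit -Sn "$hard"
```

Raising the hard limit itself requires root, and making the change persistent across logins is usually done through the system's limits configuration rather than in a shell session.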

Performance tips for third-party applications on Linux

There are also many performance optimization techniques for third-party applications running on Linux. These techniques can help you improve Linux server performance and reduce running costs.

14. Configure MySQL correctly

To give MySQL more memory, you can set the MySQL cache size. If the MySQL server instance uses too much memory, reduce the cache size. If MySQL stalls as requests increase, increase the MySQL cache.

15. Configure Apache correctly

Check how much memory Apache uses and adjust the StartServers and MinSpareServers parameters to free memory; this can save 30-40% of memory.

16. Analyze Linux server performance

The best way to improve system efficiency is to find and fix the bottlenecks that slow the whole system down. Here are some basic tips for finding the key bottlenecks:

● When large applications such as OpenOffice and Firefox run at the same time, the computer may start to slow down, and running out of memory becomes more likely.

● If startup is really slow, the application may simply take a long time to load the first time and then run normally once started; otherwise, the hard disk is probably too slow.

● If the CPU load stays high while memory is sufficient and CPU utilization is very low, use a CPU load analysis tool to monitor the load over time.

17. Learn five Linux performance commands

Several commands help manage the performance of a Linux system. The five most commonly used are listed below: top, vmstat, iostat, free and sar. They help system administrators resolve performance problems quickly.

(1)top

Shows the tasks the kernel is currently servicing, together with many host statistics such as uptime, system load, number of processes and memory utilization. By default it refreshes automatically every 5 seconds.

It also shows the processes that use the most CPU time, with details for each process such as the owning user and the command being executed.

(2)vmstat

The vmstat command provides a snapshot of current CPU, I/O, process and memory utilization. Like top, it updates its data automatically, for example:

$ vmstat 10

(3)iostat

iostat provides three reports: CPU utilization, device utilization and network file system utilization. They can be displayed independently with the -c, -d and -n options (the -n report is available only in older sysstat releases).

(4)free

Displays statistics for main memory and swap space. The -t option adds a line of totals. The -b option reports in bytes and the -m option in megabytes; the default unit is kilobytes.

The free command can also run continuously with the -s option followed by a delay in seconds, for example:

$ free -s 5

(5)sar

Collects, views and records performance data. This command has a longer history than the others above and can collect and display data over long periods.

Other

Here are some performance tips that fall under other:

18. Move log files to memory

While a machine is running, it is best to keep the system log in memory and copy it to the hard disk when the system shuts down. On a laptop or mobile device with syslog enabled, ramlog can help extend battery life or the life of the device's flash drive. One advantage of ramlog is that you no longer need to worry about a daemon sending a message to syslog every 30 seconds; without it, the hard disk has to keep running all the time, which is bad for both the disk and the battery.

19. Package before write

A fixed-size area of memory is set aside for log files, so the laptop's hard disk does not have to run all the time and only spins up when a daemon needs to write logs. Note that the memory used by ramlog is fixed; otherwise system memory would be used up quickly. If the laptop uses a solid-state disk, 50-80 MB of memory can be allocated to ramlog, which saves many write cycles and greatly extends the life of the solid-state disk.

20. General tuning tips

Use static content instead of dynamic content as much as possible. If you are generating weather forecasts or other data that must be updated every hour, it is better to write a program that generates a static file every hour than to let users run a CGI program that generates the report dynamically.

Choose the fastest and most appropriate API for dynamic applications. CGI may be the easiest to program, but it spawns a process for each request, which is usually costly and unnecessary. FastCGI is a better choice and, like Apache's mod_perl, can greatly improve application performance.

Tags: Linux server kernel

Posted by NTM on Mon, 09 May 2022 14:03:23 +0300