Previously, we have shared the database optimization method on wechat official account. The link is https://mp.weixin.qq.com/s/6Atzk9UKPJRxxAs0nsKBXg . The operating system part introduces the optimization of IO scheduling algorithm. This paper will compare the performance of disk IO under different scheduling algorithms through stress test.
1 Preparation
1.1} installing sysbench
This time, sysbench is used for pressure measurement. Install sysbench first. The steps are as follows:
curl -s https://packagecloud.io/install/repositories/akopytov/sysbench/script.rpm.sh | sudo bash yum -y install sysbench sysbench --version
1.2 preparation of test documents
Generate test files that need to be used in the follow-up. The block size is 16k (known to MySQL DBA s, ha ha). Create four files, totaling 20G
[root@mha1 ~]# sysbench fileio --file-num=4 --file-block-size=16384 --file-total-size=20G prepare sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) 4 files, 5242880Kb each, 20480Mb total Creating files for the test... Extra file open flags: (none) Creating file test_file.0 Creating file test_file.1 Creating file test_file.2 Creating file test_file.3 21474836480 bytes written in 47.94 seconds (427.24 MiB/sec).
1.3 preparation of test sheet
Because we also need to test the database reading and writing, we need to create relevant tables and data first
[root@mha1 ~]# sysbench --db-driver=mysql --time=300 --threads=10 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=20 --table_size=1000000 oltp_insert --db-ps-mode=disable prepare sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Initializing worker threads... Creating table 'sbtest3'... Creating table 'sbtest2'... Creating table 'sbtest5'... Creating table 'sbtest6'... Creating table 'sbtest8'... Creating table 'sbtest7'... Creating table 'sbtest4'... Creating table 'sbtest1'... Creating table 'sbtest10'... Creating table 'sbtest9'... Inserting 1000000 records into 'sbtest4' Inserting 1000000 records into 'sbtest7' Inserting 1000000 records into 'sbtest5' Inserting 1000000 records into 'sbtest3' Inserting 1000000 records into 'sbtest1' Inserting 1000000 records into 'sbtest2' Inserting 1000000 records into 'sbtest8' Inserting 1000000 records into 'sbtest6' Inserting 1000000 records into 'sbtest10' Inserting 1000000 records into 'sbtest9' Creating a secondary index on 'sbtest7'... Creating a secondary index on 'sbtest10'... Creating a secondary index on 'sbtest8'... Creating a secondary index on 'sbtest5'... Creating a secondary index on 'sbtest2'... Creating a secondary index on 'sbtest9'... Creating a secondary index on 'sbtest1'... Creating table 'sbtest17'... Creating a secondary index on 'sbtest3'... Inserting 1000000 records into 'sbtest17' Creating a secondary index on 'sbtest4'... Creating a secondary index on 'sbtest6'... Creating table 'sbtest20'... Inserting 1000000 records into 'sbtest20' Creating table 'sbtest18'... Inserting 1000000 records into 'sbtest18' Creating table 'sbtest15'... Inserting 1000000 records into 'sbtest15' Creating table 'sbtest19'... Inserting 1000000 records into 'sbtest19' Creating table 'sbtest14'... Inserting 1000000 records into 'sbtest14' Creating table 'sbtest11'... Inserting 1000000 records into 'sbtest11' Creating table 'sbtest13'... Creating table 'sbtest12'... Inserting 1000000 records into 'sbtest13' Inserting 1000000 records into 'sbtest12' Creating table 'sbtest16'... Inserting 1000000 records into 'sbtest16' Creating a secondary index on 'sbtest17'... Creating a secondary index on 'sbtest20'... Creating a secondary index on 'sbtest18'... Creating a secondary index on 'sbtest19'... Creating a secondary index on 'sbtest15'... Creating a secondary index on 'sbtest11'... Creating a secondary index on 'sbtest12'... Creating a secondary index on 'sbtest13'... Creating a secondary index on 'sbtest14'... Creating a secondary index on 'sbtest16'...
2. View the supported scheduling algorithms
This disk is SSD hard disk, and the operating system version is centos7 8 . Next, the scheduling algorithm is modified to three different values for random read and random write stress testing
This system is centos7 8. You need to check the supported IO scheduling algorithm, and then conduct modification test.
[root@mha1 ~]# dmesg | grep -i scheduler [ 4.885816] io scheduler noop registered [ 4.885820] io scheduler deadline registered (default) [ 4.885867] io scheduler cfq registered [ 4.885870] io scheduler mq-deadline registered [ 4.885872] io scheduler kyber registered
It can be seen that in this system, the default scheduling algorithm is {deadline.
You can also view the current scheduling algorithm through the following command, where the brackets represent the currently used scheduling algorithm.
[root@mha1 ~]# cat /sys/block/sda/queue/scheduler noop [deadline] cfq
3. deadline algorithm
Deadline is the best choice for database environment (Oracle, RAC, mysql, etc.) in the case of mechanical disk. The following is a stress test of random write and random read
3.1 random writing
[root@mha1 ~]# sysbench fileio \ > --time=180 \ > --threads=24 \ > --file-total-size=20G \ > --file-test-mode=rndwr \ > --file-num=4 \ > --file-extra-flags=direct \ > --file-fsync-freq=0 \ > --file-block-size=16384 \ > run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 24 Initializing random number generator from current time Extra file open flags: directio 4 files, 5GiB each 20GiB total file size Block size 16KiB Number of IO requests: 0 Read/Write ratio for combined random IO test: 1.50 Calling fsync() at the end of test, Enabled. Using synchronous I/O mode Doing random write test Initializing worker threads... Threads started! File operations: reads/s: 0.00 writes/s: 6935.37 fsyncs/s: 0.53 Throughput: read, MiB/s: 0.00 written, MiB/s: 108.37 General statistics: total time: 180.0138s total number of events: 1248484 Latency (ms): min: 0.10 avg: 3.46 max: 107.39 95th percentile: 14.73 sum: 4317610.93 Threads fairness: events (avg/stddev): 52020.1667/426.95 execution time (avg/stddev): 179.9005/0.01
The iops of random write is 6935.37, and the disk write speed is 108.37MiB/s
3.2 random reading
[root@mha1 ~]# sysbench fileio \ > --time=180 \ > --threads=24 \ > --file-total-size=20G \ > --file-test-mode=rndrd \ > --file-num=4 \ > --file-extra-flags=direct \ > --file-fsync-freq=0 \ > --file-block-size=16384 \ > run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 24 Initializing random number generator from current time Extra file open flags: directio 4 files, 5GiB each 20GiB total file size Block size 16KiB Number of IO requests: 0 Read/Write ratio for combined random IO test: 1.50 Calling fsync() at the end of test, Enabled. Using synchronous I/O mode Doing random read test Initializing worker threads... Threads started! File operations: reads/s: 7956.88 writes/s: 0.00 fsyncs/s: 0.00 Throughput: read, MiB/s: 124.33 written, MiB/s: 0.00 General statistics: total time: 180.0075s total number of events: 1432313 Latency (ms): min: 0.10 avg: 3.01 max: 322.24 95th percentile: 5.47 sum: 4309094.67 Threads fairness: events (avg/stddev): 59679.7083/2688.56 execution time (avg/stddev): 179.5456/0.18
The iops of random reading is 7956.88, and the disk reading speed is 124.33MiB/s
3.3 test database write
sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_write_only --db-ps-mode=disable run SQL statistics: queries performed: read: 0 write: 589934 other: 294968 total: 884902 transactions: 147483 (491.43 per sec.) queries: 884902 (2948.62 per sec.) ignored errors: 2 (0.01 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 300.1050s total number of events: 147483 Latency (ms): min: 2.58 avg: 16.27 max: 2608.34 95th percentile: 35.59 sum: 2399415.58 Threads fairness: events (avg/stddev): 18435.3750/90.33 execution time (avg/stddev): 299.9269/0.04
It can be seen that the TPS written randomly is 491.43 and the number of queries is 2948.62
3.4 test database reading
sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_read_only --db-ps-mode=disable run SQL statistics: queries performed: read: 1651692 write: 0 other: 235956 total: 1887648 transactions: 117978 (393.13 per sec.) queries: 1887648 (6290.13 per sec.) ignored errors: 0 (0.00 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 300.0949s total number of events: 117978 Latency (ms): min: 3.08 avg: 20.34 max: 170.48 95th percentile: 29.19 sum: 2399636.31 Threads fairness: events (avg/stddev): 14747.2500/1513.84 execution time (avg/stddev): 299.9545/0.04
It can be seen that the TPS during random reading is 393.13 and the number of queries is 6290.13
4. noop algorithm
4.1} modified to noop algorithm
noop, called elevator scheduling algorithm, is based on FIFO queue. All requests are first in first out. Because SSD has fast random read and random write speed, this algorithm is suitable for SSD hard disk.
[root@mha1 ~]# echo 'noop' >/sys/block/sda/queue/scheduler [root@mha1 ~]# cat /sys/block/sda/queue/scheduler [noop] deadline cfq
4.2 random writing
[root@mha1 ~]# sysbench fileio \ > --time=180 \ > --threads=24 \ > --file-total-size=20G \ > --file-test-mode=rndwr \ > --file-num=4 \ > --file-extra-flags=direct \ > --file-fsync-freq=0 \ > --file-block-size=16384 \ > run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 24 Initializing random number generator from current time Extra file open flags: directio 4 files, 5GiB each 20GiB total file size Block size 16KiB Number of IO requests: 0 Read/Write ratio for combined random IO test: 1.50 Calling fsync() at the end of test, Enabled. Using synchronous I/O mode Doing random write test Initializing worker threads... Threads started! File operations: reads/s: 0.00 writes/s: 7057.60 fsyncs/s: 0.53 Throughput: read, MiB/s: 0.00 written, MiB/s: 110.27 General statistics: total time: 180.0136s total number of events: 1270481 Latency (ms): min: 0.10 avg: 3.40 max: 240.39 95th percentile: 14.46 sum: 4317435.99 Threads fairness: events (avg/stddev): 52936.7083/487.57 execution time (avg/stddev): 179.8932/0.02
The iops of random write is 7057.60, and the disk write speed is 110.27MiB/s
3.4 random reading
[root@mha1 ~]# sysbench fileio \ > --time=180 \ > --threads=24 \ > --file-total-size=20G \ > --file-test-mode=rndrd \ > --file-num=4 \ > --file-extra-flags=direct \ > --file-fsync-freq=0 \ > --file-block-size=16384 \ > run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 24 Initializing random number generator from current time Extra file open flags: directio 4 files, 5GiB each 20GiB total file size Block size 16KiB Number of IO requests: 0 Read/Write ratio for combined random IO test: 1.50 Calling fsync() at the end of test, Enabled. Using synchronous I/O mode Doing random read test Initializing worker threads... Threads started! File operations: reads/s: 8399.89 writes/s: 0.00 fsyncs/s: 0.00 Throughput: read, MiB/s: 131.25 written, MiB/s: 0.00 General statistics: total time: 180.0100s total number of events: 1512081 Latency (ms): min: 0.10 avg: 2.85 max: 315.77 95th percentile: 5.00 sum: 4312384.33 Threads fairness: events (avg/stddev): 63003.3750/10086.77 execution time (avg/stddev): 179.6827/0.12
The iops of random reading is 8399.89, and the disk reading speed is 131.25MiB/s
4.4 database writing
sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_write_only --db-ps-mode=disable run SQL statistics: queries performed: read: 0 write: 653457 other: 326730 total: 980187 transactions: 163364 (544.38 per sec.) queries: 980187 (3266.28 per sec.) ignored errors: 2 (0.01 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 300.0903s total number of events: 163364 Latency (ms): min: 2.62 avg: 14.69 max: 220.12 95th percentile: 32.53 sum: 2399040.57 Threads fairness: events (avg/stddev): 20420.5000/112.69 execution time (avg/stddev): 299.8801/0.04
It can be seen that the TPS written randomly is 544.38 and the number of queries is 3266.28
4.5 database read only
sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_read_only --db-ps-mode=disable run SQL statistics: queries performed: read: 1596364 write: 0 other: 228052 total: 1824416 transactions: 114026 (379.97 per sec.) queries: 1824416 (6079.59 per sec.) ignored errors: 0 (0.00 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 300.0869s total number of events: 114026 Latency (ms): min: 3.08 avg: 21.04 max: 321.03 95th percentile: 31.37 sum: 2399600.56 Threads fairness: events (avg/stddev): 14253.2500/1475.71 execution time (avg/stddev): 299.9501/0.02
It can be seen that the TPS for read-only is 379.97 and the number of queries is 6079.59
5. cfq algorithm
5.1 modified to cfq algorithm
cfq is called absolute fair scheduling algorithm. It creates a queue for each process and thread to manage IO requests, which has the effect of evenly distributing io for each process and thread. This algorithm is applicable to general servers, and centos6 is the default IO scheduling algorithm.
[root@mha1 ~]# echo 'cfq' >/sys/block/sda/queue/scheduler [root@mha1 ~]# cat /sys/block/sda/queue/scheduler noop deadline [cfq]
5.2 random writing
[root@mha1 ~]# sysbench fileio \ > --time=180 \ > --threads=24 \ > --file-total-size=20G \ > --file-test-mode=rndwr \ > --file-num=4 \ > --file-extra-flags=direct \ > --file-fsync-freq=0 \ > --file-block-size=16384 \ > run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 24 Initializing random number generator from current time Extra file open flags: directio 4 files, 5GiB each 20GiB total file size Block size 16KiB Number of IO requests: 0 Read/Write ratio for combined random IO test: 1.50 Calling fsync() at the end of test, Enabled. Using synchronous I/O mode Doing random write test Initializing worker threads... Threads started! File operations: reads/s: 0.00 writes/s: 6614.37 fsyncs/s: 0.53 Throughput: read, MiB/s: 0.00 written, MiB/s: 103.35 General statistics: total time: 180.0118s total number of events: 1190677 Latency (ms): min: 0.10 avg: 3.63 max: 348.78 95th percentile: 15.27 sum: 4317092.54 Threads fairness: events (avg/stddev): 49611.5417/517.80 execution time (avg/stddev): 179.8789/0.03
The iops of random write is 6614.37, and the disk write speed is 103.35MiB/s
5.3 random reading
[root@mha1 ~]# sysbench fileio \ > --time=180 \ > --threads=24 \ > --file-total-size=20G \ > --file-test-mode=rndrd \ > --file-num=4 \ > --file-extra-flags=direct \ > --file-fsync-freq=0 \ > --file-block-size=16384 \ > run sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2) Running the test with following options: Number of threads: 24 Initializing random number generator from current time Extra file open flags: directio 4 files, 5GiB each 20GiB total file size Block size 16KiB Number of IO requests: 0 Read/Write ratio for combined random IO test: 1.50 Calling fsync() at the end of test, Enabled. Using synchronous I/O mode Doing random read test Initializing worker threads... Threads started! File operations: reads/s: 7481.39 writes/s: 0.00 fsyncs/s: 0.00 Throughput: read, MiB/s: 116.90 written, MiB/s: 0.00 General statistics: total time: 180.0086s total number of events: 1346731 Latency (ms): min: 0.10 avg: 3.20 max: 374.49 95th percentile: 5.77 sum: 4312382.07 Threads fairness: events (avg/stddev): 56113.7917/3058.00 execution time (avg/stddev): 179.6826/0.17
The iops of random reading is 7481.39, and the disk reading speed is 116.90MiB/s
5.4 database write
sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_write_only --db-ps-mode=disable run SQL statistics: queries performed: read: 0 write: 598765 other: 299384 total: 898149 transactions: 149691 (498.54 per sec.) queries: 898149 (2991.25 per sec.) ignored errors: 2 (0.01 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 300.2552s total number of events: 149691 Latency (ms): min: 2.55 avg: 16.02 max: 779.62 95th percentile: 35.59 sum: 2397311.08 Threads fairness: events (avg/stddev): 18711.3750/132.24 execution time (avg/stddev): 299.6639/0.38
It can be seen that the TPS written randomly is 498.54 and the number of queries is 2991.25
5.5 database reading
sysbench --db-driver=mysql --time=300 --threads=8 --report-interval=1 --mysql-host=127.0.0.1 --mysql-port=3306 --mysql-user=root --mysql-password=123456 --mysql-db=testdb --tables=10 --table_size=1000000 oltp_read_only --db-ps-mode=disable run SQL statistics: queries performed: read: 1448342 write: 0 other: 206906 total: 1655248 transactions: 103453 (344.66 per sec.) queries: 1655248 (5514.58 per sec.) ignored errors: 0 (0.00 per sec.) reconnects: 0 (0.00 per sec.) General statistics: total time: 300.1562s total number of events: 103453 Latency (ms): min: 3.11 avg: 23.19 max: 222.31 95th percentile: 38.25 sum: 2399486.55 Threads fairness: events (avg/stddev): 12931.6250/1278.72 execution time (avg/stddev): 299.9358/0.01
It can be seen that the TPS in read - only mode is 344.66 and the number of queries is 5514.58
6 Summary
According to the test results, compare the reading and writing speeds of the three scheduling algorithms
algorithm | IOPS | Disk write speed | IOPS | Disk read speed | oltp_write_only | oltp_read_only |
deadline | 6935.37 | 118.37MiB/s | 7956.88 | 124.33MiB/s | The TPS is 491.43 and the number of queries is 2948.62 | The TPS is 393.13 and the number of queries is 6290.13 |
noop | 7057.60 | 110.27MiB/s | 8399.89 | 131.25MiB/s | The TPS is 544.38 and the number of queries is 3266.28 | The TPS is 379.97 and the number of queries is 6079.59 |
cfq | 6614.37 | 103.35MiB/s | 7481.39 | 116.90MiB/s | The TPS is 498.54 and the number of queries is 2991.25 | The TPS is 344.66 and the number of queries is 5514.58 |
Because the test environment is SSD hard disk, it is recommended to choose noop disk IO scheduling algorithm in this case. This conclusion is also in line with our expectations.
Special note: the scheduling algorithm of disk IO also needs to be distinguished according to various scenarios such as disk conditions, database type, database architecture and business scenarios (OLTP, OLAP, etc.), and the scheduling algorithms of different scenarios also need to be adjusted, which can not be generalized. If it is uncertain, it is recommended to conduct pressure measurement to judge and select the most appropriate algorithm in line with the corresponding scene.
If you want to learn more or participate in technical exchanges, you can follow the wechat official account [Database dry shop] or enter the technical exchange group for communication.