Eight - SQL statement analysis

Eight - SQL statement analysis

8.1 Several aspects that affect server performance

Several aspects that affect performance
1. Server Hardware
2. The operating system of the server
3. The choice of database storage engine
4. Database parameter configuration
5. Data Structure Design and SQL

Reasons for SQL Performance Degradation

  • The query statement is not well written
  • index invalidation
  • Too many join queries
  • Server tuning and various parameter settings

SQL load order

  • Sequence of handwritten SQL
select distinct
    <select _list>
 join <right_table> on <join_codition>
 group by
  order by
 limit <limit number>

  • Machine Readable SQL Sequence
1.from <left_table>
2.on <join_condition>
3.<join_type> join <right_table>
4.where <where_condition>
5.group by <group_by_list>
6.having <having_condition>
8.distinct <select_list>
9.order by <order_by_condition>
10.limit <limit_number>

Common bottlenecks in MySQL

  • CPU: When the CPU is firing, it generally occurs when data is loaded into memory or read from disk
  • IO: Disk I/O bottlenecks occur when the loaded data is much larger than the memory capacity
  • Server hardware performance bottleneck

8.2 explain to analyze SQL statements


what is explain?

Use the explain keyword to simulate optimizing a qualitative SQL query to know how MySQL handles your SQL.

What can explain do?

  • table read order
  • The operation type of the data read operation
  • those indexes can be used
  • those indexes are actually used
  • references between tables

How does explain play?

explain + SQL statement

explain read order

read order of id table

The sequence number of the select query, which contains a set of numbers, indicating the order in which the selec clause or operation table is executed in the query

Two cases:

  1. The id is the same, the execution order is from top to bottom
  2. If the id is different, if it is a subquery, the serial number of the id will increase. The larger the id value, the higher the priority, and the earlier it is executed.

Data read operation type

About that table when displaying data for this row

Query access partition


From best to worst are

system > const > eq_ref > ref > range > index > ALL
  • The system table has only one record (equal to the system table), which is a special case of the const type
  • const means that the index is found once, and const is used to compare the primary key
  • eq_ref, unique index scan, for the key given to suoyin, only one record in the table matches it
  • ref, non-unique index scan, returns all rows matching a single value, which is essentially an index access, it returns all rows matching a single value
  • range, which retrieves only the given range of rows, uses an index to select rows.
  • The difference between idex, Full, index Scan, index and ALL is that the index type only traverses the index tree.
  • ALL will traverse the entire table to find a matching row.

Displays one or more indexes that may be applied to this table.

The actual index used. If null, no index is used.
If a covering index is used in the query, the index only appears in the key list.
Indicates the number of bytes used in the index by which the index length can be used to query the query. The shorter the length, the better, without sacrificing accuracy. The displayed value of key_len is the maximum possible length of the index field, not the actual length used. That is, key_len is calculated according to the table definition, not by

Index length calculation

varchar(24)  Variable length fields and allow NULL
24*(Character Set: Set   utf8=3,gbk=2,lation1=1)+1(NULL) +2(variable length)

varchar(10) Variable length field and not allowed NULL
10*(Character Set: utf8=3, gbk=3,latin1=1)+2(variable length field)

char(10)Fixed field and allowed NULL
10*(Character Set: utf8=3,gbk=2,latin1=1)+1(NULL)

char(10)Fixed field and not allowed NULL
10*(Character Set: utf=3,gbk=2,latin1=1)

Show which column of the index is used

Based on table statistics and index selection, roughly estimate the number of rows that need to be read to find the desired record

Contains important extra information that doesn't fit in other columns

  • Using filesort, it means that mysql will use an external index to sort the data, instead of reading it according to the index order in the table. The sorting operation that cannot be done by the index in MySQL is called 'file sort'
  • Using temporary, use temporary table to save intermediate results, MySQL uses temporary table when sorting query results
  • Using index, using index, avoids full table scan
  • Using where, where filtering is used
  • Using join buffer, using the connection cache
  • impossible where, impossible condition, the value of where clause is always false.

8.3 Show Profile for SQL analysis

Show Profile for SQL analysis

MySQL provides a measure of resource consumption that can be used to analyze the execution of statements in the current session, and can be used for SQL tuning.

By default, the parameter is off and keeps the results of the last 15 runs.

Show Profile analysis steps

  • 1. Whether it is supported, see if the current MySQL version supports it
  • 2. Turn on the function, the default is off, it needs to be turned on before use


converting HEAP to MyISAM The query result is too large, and the memory is not enough to move it to the disk

Creating tmp table Create a temporary table

Coping to tmp table on disk Copy in-memory temporary table to disk, dangerous


Global query log

open command

set global  log_output='TABLE';

Write SQL statement to table

set global log_output='TABLE';

The SQL statements you write will be recorded in the genral_log table in the MySQL library

select * from mysql.general_log;

8.4 trace analysis SQL optimizer

build table

CREATE TABLE `test_trace`(
   `id` int(11) not null auto_increment,
   `a` int(11) default null,
   `b` int(11) default null,
   `create_time` datetime not null default current_timestamp comment 'record creation time',
   ·update_time· datetime not null default current_timestamp on update current_timestamp comment 'record update time'
   primary key(`id`),
   key `idx_a` (`a`),
   key `idx_b` (`b`),
   engine=innodb default charset=utf8mb4;

trace Analysis SQL Optimizer Introduction
I learned from the previous explanation that I can view the SQL execution plan, but I can't know why it made this decision. If I want to determine how to choose between multiple indexing schemes or which sorting mode is selected when sorting, is there a good way? ?

Starting with MySQL 5.6, you can use trace to see how the optimizer chooses an execution plan.

Through trace, we can further understand why the optimizer chooses the A execution plan instead of the B plan, or know the ordering mode used by a certain ordering, which helps us better understand the optimizer behavior.

If you need to use it, first enable trace, set the format to JSON, execute the SQL to be analyzed, and finally view the trace analysis result (in information_schema.OPTIMIZER_TRACE).
Enabling this function will affect the performance of MySQL, so it is only recommended to temporarily enable it when there are some problems.

Let's take a look at how to use trace. Use the table test_trace created when explaining explain to do the experiment. First construct the following SQL (referring to take out the data with the value of a greater than 900 and the value of b greater than 910 in table t1, and then sort by the a field):

select * from test_trace where a>900 and b>910 order by a;

Analyze with trace

set session optimizer_trace='enabled=on',end_markers_in_json=on;
  • opitmizer_trace = 'enabled=on' means to enable trace
  • end_markers_in_json=on indicates that the JSON output turns on the end marker

View trace analysis results

select * from information_schema.OPTIMIZER_TRACE\G

QUERY: select * from t1 where a >900 and b > 910 order  by a    --SQL statement
  "steps": [
      "join_preparation": {       --SQL preparation stage
        "select#": 1,
        "steps": [
            "expanded_query": "/* select#1 */ select `t1`.`id` AS `id`,`t1`.`a` AS `a`,`t1`.`b` AS `b`,`t1`.`create_time` AS `create_time`,`t1`.`update_time` AS `update_time` from `t1` where ((`t1`.`a` > 900) and (`t1`.`b` > 910)) order by `t1`.`a`"
        ] /* steps */
      } /* join_preparation */
      "join_optimization": {      --SQL optimization phase
        "select#": 1,
        "steps": [
            "condition_processing": {    --Condition processing
              "condition": "WHERE",
              "original_condition": "((`t1`.`a` > 900) and (`t1`.`b` > 910))",        --original condition
              "steps": [
                  "transformation": "equality_propagation",
                  "resulting_condition": "((`t1`.`a` > 900) and (`t1`.`b` > 910))"    --pass-by-value conversion
                  "transformation": "constant_propagation",
                  "resulting_condition": "((`t1`.`a` > 900) and (`t1`.`b` > 910))"       --Constant pass conversion
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "((`t1`.`a` > 900) and (`t1`.`b` > 910))"        --The structure after removing the conditions that do not exist
              ] /* steps */
            } /* condition_processing */
            "substitute_generated_columns": {
            } /* substitute_generated_columns */   --Replace dummy generated columns
            "table_dependencies": [   --Table Dependency Details
                "table": "`t1`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ] /* depends_on_map_bits */
            ] /* table_dependencies */
            "ref_optimizer_key_uses": [
            ] /* ref_optimizer_key_uses */
            "rows_estimation": [  --Estimated table access cost
                "table": "`t1`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 1000,       --Scan lines
                    "cost": 207.1       --cost
                  } /* table_scan */,
                  "potential_range_indexes": [    --Analyze possible indexes
                      "index": "PRIMARY",
                      "usable": false,       --for false,Indicates that the primary key index is unavailable
                      "cause": "not_applicable"
                      "index": "idx_a",      --maybe use index idx_a
                      "usable": true,
                      "key_parts": [
                      ] /* key_parts */
                      "index": "idx_b",      --maybe use index idx_b
                      "usable": true,
                      "key_parts": [
                      ] /* key_parts */
                  ] /* potential_range_indexes */,
                  "setup_range_conditions": [
                  ] /* setup_range_conditions */,
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  } /* group_index_range */,
                  "analyzing_range_alternatives": { --Analyze the cost of each index
                    "range_scan_alternatives": [
                        "index": "idx_a", --use index idx_a the cost of
                        "ranges": [
                          "900 < a"     --use index idx_a range
                        ] /* ranges */,
                        "index_dives_for_eq_ranges": true, --use or not index dive(Please see the knowledge extension below for a detailed description)
                        "rowid_ordered": false, --Whether the records obtained using this index are sorted by the primary key
                        "using_mrr": false,   --use or not mrr
                        "index_only": false,    --whether to use a covering index
                        "rows": 100,            --The number of records fetched using this index
                        "cost": 121.01,         --the cost of using the index
                        "chosen": true          --may choose this index
                        "index": "idx_b",       --use index idx_b the cost of
                        "ranges": [
                          "910 < b"
                        ] /* ranges */,
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 90,
                        "cost": 109.01,
                        "chosen": true             --It is also possible to select the index
                    ] /* range_scan_alternatives */,
                    "analyzing_roworder_intersect": { --Analyze the cost of using index merging
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    } /* analyzing_roworder_intersect */
                  } /* analyzing_range_alternatives */,
                  "chosen_range_access_summary": {  --Confirm the best method
                    "range_access_plan": {
                      "type": "range_scan",
                      "index": "idx_b",
                      "rows": 90,
                      "ranges": [
                        "910 < b"
                      ] /* ranges */
                    } /* range_access_plan */,
                    "rows_for_plan": 90,
                    "cost_for_plan": 109.01,
                    "chosen": true
                  } /* chosen_range_access_summary */
                } /* range_analysis */
            ] /* rows_estimation */
            "considered_execution_plans": [  --Execution plan under consideration
                "plan_prefix": [
                ] /* plan_prefix */,
                "table": "`t1`",
                "best_access_path": {          --optimal access path
                  "considered_access_paths": [ --determined access path
                      "rows_to_scan": 90,      --number of lines scanned
                      "access_type": "range",  --access type: as range
                      "range_details": {
                        "used_index": "idx_b"  --The index used is: idx_b
                      } /* range_details */,
                      "resulting_rows": 90,    --number of result rows
                      "cost": 127.01,          --cost
                      "chosen": true,      --confirm choice
                      "use_tmp_table": true
                  ] /* considered_access_paths */
                } /* best_access_path */,
                "condition_filtering_pct": 100,
                "rows_for_plan": 90,
                "cost_for_plan": 127.01,
                "sort_cost": 90,
                "new_cost_for_plan": 217.01,
                "chosen": true
            ] /* considered_execution_plans */
            "attaching_conditions_to_tables": {  --Try adding some other query conditions
              "original_condition": "((`t1`.`a` > 900) and (`t1`.`b` > 910))",
              "attached_conditions_computation": [
              ] /* attached_conditions_computation */,
              "attached_conditions_summary": [
                  "table": "`t1`",
                  "attached": "((`t1`.`a` > 900) and (`t1`.`b` > 910))"
              ] /* attached_conditions_summary */
            } /* attaching_conditions_to_tables */
            "clause_processing": {
              "clause": "ORDER BY",
              "original_clause": "`t1`.`a`",
              "items": [
                  "item": "`t1`.`a`"
              ] /* items */,
              "resulting_clause_is_simple": true,
              "resulting_clause": "`t1`.`a`"
            } /* clause_processing */
            "reconsidering_access_paths_for_index_ordering": {
              "clause": "ORDER BY",
              "index_order_summary": {
                "table": "`t1`",
                "index_provides_order": false,
                "order_direction": "undefined",
                "index": "idx_b",
                "plan_changed": false
              } /* index_order_summary */
            } /* reconsidering_access_paths_for_index_ordering */
            "refine_plan": [          --Improved execution plan
                "table": "`t1`",
                "pushed_index_condition": "(`t1`.`b` > 910)",
                "table_condition_attached": "(`t1`.`a` > 900)"
            ] /* refine_plan */
        ] /* steps */
      } /* join_optimization */
      "join_execution": {             --SQL execution phase
        "select#": 1,
        "steps": [
            "filesort_information": [
                "direction": "asc",
                "table": "`t1`",
                "field": "a"
            ] /* filesort_information */,
            "filesort_priority_queue_optimization": {
              "usable": false,             --Sorting is not optimized using priority queues
              "cause": "not applicable (no LIMIT)"     --The reason why priority queue sorting is not used is that there is no limit
            } /* filesort_priority_queue_optimization */,
            "filesort_execution": [
            ] /* filesort_execution */,
            "filesort_summary": {           --Sort details
              "rows": 90,
              "examined_rows": 90,          --The number of rows involved in sorting
              "number_of_tmp_files": 0,     --Number of temporary files used during sorting
              "sort_buffer_size": 115056,
              "sort_mode": "<sort_key, additional_fields>"   --Sorting mode (for details, please see the knowledge extension below)
            } /* filesort_summary */
        ] /* steps */
      } /* join_execution */
  ] /* steps */
MISSING_BYTES_BEYOND_MAX_MEM_SIZE: 0  --This field indicates the size in bytes of text discarded by the analysis process. In this case, it is 0, indicating that no text is discarded.
          INSUFFICIENT_PRIVILEGES: 0    --Check trace Whether the permission is insufficient, 0 means permission to view trace Details
1 row in set (0.00 sec)

close trace

set session optimizer_trace='enabled=off';

The entire text in the TRACE field is roughly divided into three processes.

  • Preparation stage: join_preparation in the corresponding text
  • Optimization stage: join_optimization in the corresponding text
  • Execution stage: corresponds to join_execution in the text

When used, focus on the optimization phase and the execution phase.

In the analyzing_range_alternatives item of the trace results, you can see that the cost of using the index idx_a is 121.01, and the cost of using the index idx_b is 109.01. Obviously, the cost of using the index idx_b is lower, so the optimizer chooses the idx_b index;


  • explain: Get the execution plan of the SQL statement in MySQL, such as whether the statement uses an associated query, whether an index is used, the number of rows scanned, etc.;
  • trace: See how the optimizer chooses the execution plan, and get the cost of each possible index choice.

Posted by SithLordKyle on Mon, 23 May 2022 20:51:26 +0300