Following is an introduction to the previous article, SQL Optimization Case (1): Implicit Conversion, which focuses on OR optimization.
In MySQL, the same query condition, if the OR s are transformed in the SQL statement, the results of the query will be different. In many complex cases, poor index selection may be a potential performance hazard. To avoid a significant reduction in execution efficiency, we can properly consider using uniform all SQL to separate complex query logic.
Common OR usage scenarios, please read the following cases.
Case 1: Use OR Conditional Query for Different Columns
1. Scene to be optimized
SELECT .. .. FROM`t1` a WHERE a.token= '16149684' AND a.store_id= '242950' AND(a.registrationId IS NOT NULL AND a.registrationId<> '') OR a.uid= 308475 AND a.registrationId IS NOT NULL AND a.registrationId<> ''
Execution Plan
+--------------+-----------------------+-----------------+----------------+-------------------+-------------------+---------------+----------------+---------------------------------------------+ | id | select_type | table | type | key | key_len | ref | rows | Extra | +--------------+-----------------------+-----------------+----------------+-------------------+-------------------+---------------+----------------+---------------------------------------------+ | 1 | SIMPLE | a | range |idx_registrationid | 99 | | 100445 | Using index condition; Using where | +--------------+-----------------------+-----------------+----------------+-------------------+-------------------+---------------+----------------+---------------------------------------------+
A total of 1 row of records was returned, which took 5 ms.
2. Scene Analysis
Tokens and uid filtering can be ground very well from query conditions, but because of use, or because of the need for index merging, better performance can be achieved. However, the MySQL optimizer chose to use the registrationId index instead during actual execution, resulting in poor SQL performance.
3. Scene optimization
We'll rewrite SQL to union all.
SELECT ... ... FROM`t1` a WHERE a.token = '16054473' AND a.store_id = '138343' AND b.is_refund = 1 AND (a.registrationId IS NOT NULL AND a.registrationId <> '') union all SELECT ... ... FROM`t1` a where a.uid = 181579 AND a.registrationId IS NOT NULL AND a.registrationId <> ''
+--------------+-----------------------+-----------------+----------------+------------------------------+---------------+-------------------+------------------------------+----------------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +--------------+-----------------------+-----------------+----------------+------------------------------+---------------+-------------------+------------------------------+----------------+------------------------------------+ | 1 | PRIMARY | a | ref | IDX_TOKEN,IDX_STORE_ID_TOKEN | IDX_TOKEN | 63 | const | 1 | Using index condition; Using where | | 1 | PRIMARY | b | eq_ref | PRIMARY | PRIMARY | 4 | youdian_life_sewsq.a.role_id | 1 | Using where | | 2 | UNION | a | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 2 | UNION | b | const | PRIMARY | PRIMARY | 4 | const | 0 | unique row not found | | | UNION RESULT | <union1,2> | ALL | | | | | | Using temporary | +--------------+-----------------------+-----------------+----------------+------------------------------+---------------+-------------------+------------------------------+----------------+------------------------------------+
Five rows of records were returned, which took 5 ms.
By comparing the execution plans before and after the optimization, you can prune them significantly, merge the SQL into two subqueries, and then use union to merge the results, which is more stable, safe and efficient.
Case 2: Use OR query criteria for the same column
1. Scene to be optimized
select .... .... from t1 as mci left join t1 as ccv2_1 on ccv2_1.unique_no = mci=category_no1 left join t1 as ccv2_2 on ccv2_2.unique_no = mci=category_no2 left join t1 as ccv2_3 on ccv2_3.unique_no = mci=category_no3 left join( select product_id, count(0) count from t2 pprod inner join t3 pinfo on pinfo.promotion_id = pprod.promotion_id and pprod.is_enable =1 and ppinfo.is_enable=1 and pinfo.belong_t0 =1 and pinfo.end_time >=now() and not ( pinfo.onshelv_time>'2019-06-30 00:00:00' or pinfo.end_time>'2018-12-05 00:00:00' )group by pprod.product_id )as pc on pc.product_id = mci.product_id where mci.is_enable =0 and mci.comodifty_type in ('1', '5', '6') and (pc.count =0 or pc.count isnull ) limit 0,5;
Execution Plan
2. Scene Analysis
In this example, there is a subquery in the SQL query, which is treated as a drive table and produces auto_key, tested by SQL, verifies that (pc.count = 0 or pc.count is null) can affect the performance of the entire SQL and needs to be overridden.
3. Scene optimization
First, let's think about how to optimize (pc.count = 0 or pc.count is null)? Write a similar SQL first
Select col from test where col =100 or col is null; +--------+ | col | +--------+ | 100 | | NULL | +--------+ 2 rows in set (0.00 sec)
At this point we see the same column, but corresponding to different values, which can be converted using case when.
Select col From test where case when col is null then 100 else col =100 end; +--------+ | col | +--------+ | 100 | | NULL | +--------+ 2 rows in set (0.00 sec)
Return to the original SQL for overwriting.
select .... .... from t1 as mci left join t1 as ccv2_1 on ccv2_1.unique_no = mci=category_no1 left join t1 as ccv2_2 on ccv2_2.unique_no = mci=category_no2 left join t1 as ccv2_3 on ccv2_3.unique_no = mci=category_no3 left join( select product_id, count(0) count from t2 pprod inner join t3 pinfo on pinfo.promotion_id = pprod.promotion_id and pprod.is_enable =1 and ppinfo.is_enable=1 and pinfo.belong_t0 =1 and pinfo.end_time >=now() and not ( pinfo.onshelv_time>'2019-06-30 00:00:00' or pinfo.end_time>'2018-12-05 00:00:00' )group by pprod.product_id )as pc on pc.product_id = mci.product_id where mci.is_enable =0 and mci.comodifty_type in ('1', '5', '6') and case when pc.count is null then 0 else pc.count end=0 limit 0,5;
Optimized SQL can be polished 30 seconds faster than the original SQL, and the execution efficiency is about 50 times higher.
Case 3: Optimizing associated SQL OR conditions
1. Scene to be optimized
SELECT user_msg.msg_id AS 'msg_id', user_msg.content AS 'msg_content', ... FROM user_msg LEFT JOIN user ON user_msg.user_id = user.user_id LEFT JOIN group ON user_msg.group_id = group.group_id WHERE user_msg.gmt_modified >= date_sub('2018-03-29 09:31:44', INTERVAL30SECOND) OR user.gmt_modified >= date_sub('2018-03-29 09:31:44', INTERVAL 30 SECOND) OR group.gmt_modified >= date_sub('2018-03-29 09:31:44', INTERVAL 30 SECOND)
2. Scene Analysis
We carefully analyzed the above query statements and found that although business logic only needs to query data modified in half a minute, the execution process must associate all the data, resulting in a loss of performance.
3. Scene optimization
We decompose the original SQL, the first part sql-01 is as follows:
SELECT user_msg.msg_id AS 'msg_id', user_msg.content AS 'msg_content', ... FROM user_msg LEFT JOIN user ON user_msg.user_id = user.user_id LEFT JOIN group ON user_msg.group_id = group.group_id WHERE user_msg.gmt_modified >= date_sub('2018-03-29 09:31:44', INTERVAL 30 SECOND)
sql-01 to user_ The MSG table is driven, using gmt_ The modified index filters the latest data.
The second part sql-02 is as follows:
SELECT user_msg.msg_id AS 'msg_id', user_msg.content AS 'msg_content', ... FROM user_msg LEFT JOIN user ON user_msg.user_id = user.user_id LEFT JOIN group ON user_msg.group_id = group.group_id WHERE user.gmt_modified >= date_sub('2018-03-29 09:31:44', INTERVAL 30 SECOND)
sql-02 user-driven table, msg user_id has a good index filter row.
The third part sql-03 is as follows:
SELECT user_msg.msg_id AS 'msg_id', user_msg.content AS 'msg_content', ... FROM user_msg LEFT JOIN user ON user_msg.user_id = user.user_id LEFT JOIN group ON user_msg.group_id = group.group_id WHERE group.gmt_modified >= date_sub('2018-03-29 09:31:44', INTERVAL 30 SECOND)
sql-03 takes group as the drive table and uses gmt_ The modified index filters the latest data.
summary
Common scenarios for MySQL OR condition optimization include the following:
1, the same column can be replaced with IN
2. union all can be used to separate columns and complex cases
3, Associate SQL OR Conditions
We need to analyze and optimize the scenarios.
More technologies to view on the official website [ https://www.dtstack.com/dtsmart/
](https://www.dtstack.com/dtsma...