MySQL 知识-- 相关子查询优化
本文最后更新于:10 天前
⚡相关子查询的 MySQL 错误优化(MySQL 5.7 以后不适用)
⚡问题
对于不相关子查询来说,我们会认为先执行子查询,在执行外层查询,但是实际不会: MySQL 会将相关的外层表压缩到子查询中,它认为这样可以更高效地查找到数据行
例如:
select select * from test.tabname where id in(select id from test.tabname2 where name='name');
# MySQL 不会这样执行:
select group_concat(id) from test.tabname2 where name='name';
# 内层查询结果:1,3,5,7,9,11,13,15,17,1
select * from test.tabname where id in(1,3,5,7,9,11,13,15,17,19);
# MySQL 实际执行:
select * from test.tabname
where exists(
select * from test.tabname2
where tabname.id=tabname2.id
and tabname2.name='name');
# 这时,子查询先根据 id 来关联外部表 tabname ,通过 explain 我们可以看到子查询是一个相关子查询(DEPENDENT SUBQUERY)
explain select * from test.tabname where id in(select id from test.tabname2 where name='name');
# +----+--------------------+----------+-----------------+---------------+---------+---------+------+------+-------------+
# | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
# +----+--------------------+----------+-----------------+---------------+---------+---------+------+------+-------------+
# | 1 | PRIMARY | tabname | ALL | NULL | NULL | NULL | NULL | 30 | Using where |
# | 2 | DEPENDENT SUBQUERY | tabname2 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where |
# +----+--------------------+----------+-----------------+---------------+---------+---------+------+------+-------------+
explain select * from test.tabname
where exists(
select * from test.tabname2
where tabname.id=tabname2.id and tabname2.name='name'
);
# +----+--------------------+----------+--------+---------------+---------+---------+-----------------+------+-------------+
# | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
# +----+--------------------+----------+--------+---------------+---------+---------+-----------------+------+-------------+
# | 1 | PRIMARY | tabname | ALL | NULL | NULL | NULL | NULL | 30 | Using where |
# | 2 | DEPENDENT SUBQUERY | tabname2 | eq_ref | PRIMARY | PRIMARY | 4 | test.tabname.id | 1 | Using where |
# +----+--------------------+----------+--------+---------------+---------+---------+-----------------+------+-------------+
# 然而,MySQL 的优化把不相关子查询变成了相关子查询,效率更低: 先扫描外层表,再逐个匹配内层表,如果外层表很大,这样效率降低更加明显
⚡解决方案:使用连接查询
explain select tabname. * from test.tabname inner join test.tabname2 using(id) where tabname2.name='name';
# +----+-------------+----------+--------+---------------+---------+---------+------------------+------+-------------+
# | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
# +----+-------------+----------+--------+---------------+---------+---------+------------------+------+-------------+
# | 1 | SIMPLE | tabname2 | ALL | PRIMARY | NULL | NULL | NULL | 20 | Using where |
# | 1 | SIMPLE | tabname | eq_ref | PRIMARY | PRIMARY | 4 | test.tabname2.id | 1 | |
# +----+-------------+----------+--------+---------------+---------+---------+------------------+------+-------------+
# 这次可以看到,select_type 变为两个简单查询simple
# 首先访问的是 tabname2,因为表 tabname2 的记录比较少,只需该表全表扫描,再查询子查询
本博客所有文章除特别声明外,均采用 知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议 。转载请注明出处!