子查询, 是指在SQL语句中内嵌了一个SELECT查询, 该SELECT称为内层查询, 包含SELECT的SQL称为外层查询. 其按照内层查询是否依赖于外层查询, 可以分为独立子查询和相关子查询.
为了演示方便, 有测试表tb1 和tb2, 数据如下:
mysql> SELECT * FROM tb1;
+------+
| col1 |
+------+
| 3 |
| 9 |
+------+
2 rows in set (0.00 sec)
mysql> SELECT * FROM tb2;
+------+
| col1 |
+------+
| 2 |
| 3 |
| 8 |
+------+
3 rows in set (0.00 sec)
独立子查询, 是指内层和外层查询没有关联,不用进行联合查询. 和其相关的有三组关键字: ANY/ IN/ SOME, ALL和EXISTS.
operand comparison_operator ANY (subquery)
operand IN (subquery)
operand comparison_operator SOME (subquery)
a. ANY关键字的含义是, 对于在子查询返回的列中的任一值, 如果表达式结果为TRUE的话, 则返回TRUE.
对于如下SQL, 表tb1包含(3, 9), tb2包含(2, 3, 8), 表达式结果为TRUE.
mysql> SELECT col1 FROM tb1 WHERE col1> ANY (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 3 |
| 9 |
+------+
2 rows in set (0.00 sec)
b. 在子查询中, = ANY的别名是IN, 下面两个SQL返回是相同的.
mysql> SELECT col1 FROM tb1WHERE col1 = ANY (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 3 |
+------+
1 row in set (0.00 sec)
mysql> SELECT col1 FROM tb1WHERE col1 IN (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 3 |
+------+
1 row in set (0.00 sec)
c. ANY的别名是SOME, 下面两个SQL返回是相同的.
mysql> SELECT col1 FROM tb1WHERE col1 <> ANY (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 3 |
| 9 |
+------+
2 rows in set (0.00 sec)
mysql> SELECT col1 FROM tb1WHERE col1 <> SOME (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 3 |
| 9 |
+------+
2 rows in set (0.00 sec)
ALL的语法是:
operand comparison_operator ALL (subquery)
a. ALL关键字的含义是, 对于在子查询返回的列中的所有值, 如果表达式结果为TRUE的话, 则返回TRUE.
mysql> SELECT col1 FROM tb1WHERE col1 > ALL (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 9 |
+------+
1 row in set (0.00 sec)
b. <> ALL的别名是NOT IN, 下面两个SQL返回是相同的.
mysql> SELECT col1 FROM tb1 WHERE col1 <> ALL (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 9 |
+------+
1 row in set (0.00 sec)
mysql> SELECT col1 FROM tb1WHERE col1 NOT IN (SELECT col1 FROM tb2);
+------+
| col1 |
+------+
| 9 |
+------+
1 row in set (0.00 sec)
a. EXISTS的含义是, 若子查询返回非空集, 则EXISTS为TRUE, NOT EXISTS为FALSE. 下面的SQL只是为了便于理解EXISTS,一般不这样用.
mysql>SELECT col1 FROM tb1 WHERE EXISTS (SELECT * FROMtb2);
+------+
| col1 |
+------+
| 3 |
| 9 |
+------+
rows in set (0.00 sec)
b. 如下EXISTS子查询比较接近实际情况, 获取表tb1和tb2中相同的记录. 可以看到其内层关联了外层表,这也就是下面说的相关子查询.
mysql> SELECTcol1 FROM tb1 WHERE EXISTS (SELECT * FROM tb2 WHERE tb2.col1 = tb1.col1);
+------+
| col1 |
+------+
| 3 |
+------+
1 row in set (0.00 sec)
相关子查询, 是指内层查询需要和外层查询的表相关联, 进行联合查询. 在上面已经看到了相关子查询的例子, 仔细体会下其和独立子查询语法上的差异.
另外, 如上面的SELECT col1 FROM tb1 WHERE EXISTS (SELECT * FROM tb2WHERE tb2.col1 = tb1.col1), 是和再上面的SELECT col1 FROM tb1 WHERE col1 IN (SELECT col1 FROM tb2)等价的,即可将IN的独立子查询和EXISTS的相关子查询相互改写, 那么两者有什么不同呢 …
在表tb2中加入一条记录(NULL), 其数据如下, tb1的不变:
mysql> SELECT * FROM tb2;
+------+
| col1 |
+------+
| 2 |
| 3 |
| 8 |
| NULL |
+------+
4 rows in set (0.00 sec)
找出在表tb1, 不在tb2中的记录, 目测结果应为9, 但返回却为空.
mysql> SELECT * FROM tb1 WHERE col1 NOTIN (SELECT col1 FROM tb2);
Empty set (0.00 sec)
为什么呢, 测试如下, 原来在有NULL值的情况下, NOT IN只返回NOT TRUE和NULL, 即FALSE. 就是在NULL的情况下, NOT IN永远不会返回结果.
mysql> SELECT ‘a‘ NOT IN (‘a‘, ‘b‘,NULL);
+-----------------------------+
| ‘a‘ NOT IN (‘a‘, ‘b‘, NULL) |
+-----------------------------+
| 0 |
+-----------------------------+
1 row in set (0.00 sec)
mysql> SELECT ‘c‘ NOT IN (‘a‘, ‘b‘,NULL);
+-----------------------------+
| ‘c‘ NOT IN (‘a‘, ‘b‘, NULL) |
+-----------------------------+
| NULL |
+-----------------------------+
1 row in set (0.00 sec)
若想NOT IN返回结果, 要先过滤掉NULL值, 这里也说明了不建议把数据存为NULL的原因.
mysql> SELECT * FROM tb1 WHERE col1 NOTIN (SELECT col1 FROM tb2 WHERE col1 IS NOT NULL);
+------+
| col1 |
+------+
| 9 |
+------+
1 row in set (0.00 sec)
使用EXISTS来写的话, 由于其是相关子查询,不用特殊考虑NULL的情况.
mysql> SELECT * FROM tb1 WHERE NOTEXISTS (SELECT col1 FROM tb2 WHERE tb2.col1 = tb1.col1);
+------+
| col1 |
+------+
| 9 |
+------+
1 row in set (0.00 sec)
对于子查询的性能优化, 以及改写为JOIN等, 稍后会进行整理, 感兴趣可关注订阅号”数据库最佳实践”(DBBestPractice).
本文出自 “感动自己” 博客,请务必保留此出处http://coveringindex.blog.51cto.com/12533373/1956396
原文地址:http://coveringindex.blog.51cto.com/12533373/1956396