This post's content
I recently stumbled upon a very interesting post by Lukas Eder, where he describes 10 query transformations which do not depend on the database’s cost model. He posted it a couple of years ago, though when I read it, I assumed some portions of it may still be relevant today.
In the original post, several databases were tested to see if their internal optimizer will be able to automatically re-write the SQL queries and optimize them. In those tests, MySQL under-performed in several of the use cases (the tested version was MySQL 8.0.2, which was released on 2017-07-17).
Seeing those results, and given the previous evaluation was done almost two years ago, I thought that now can be a good chance to re-evaluate a few of those tests with the latest MySQL 8.0.16 (released on 2019-04-25), and demonstrate EverSQL Query Optimizer's capabilities while we are at it.
We'll use the Sakila sample database for the demonstrations, and some of the original queries from Lukas's post.
The following two tables will be used for these demonstrations:
CREATE TABLE address ( address_id INT NOT NULL, address VARCHAR(50) NOT NULL, CONSTRAINT pk_address PRIMARY KEY (address_id) ); CREATE TABLE customer ( customer_id INT NOT NULL, first_name VARCHAR(45) NOT NULL, last_name VARCHAR(45) NOT NULL, address_id INT NOT NULL, CONSTRAINT pk_customer PRIMARY KEY (customer_id), CONSTRAINT fk_customer_address FOREIGN KEY (address_id) REFERENCES address(address_id) ); CREATE TABLE actor ( actor_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT, first_name VARCHAR(45) NOT NULL, last_name VARCHAR(45) NOT NULL, last_update TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, PRIMARY KEY (actor_id), KEY idx_actor_last_name (last_name) )ENGINE=InnoDB DEFAULT CHARSET=utf8;
JOIN Elimination
Looking at the query below, you'll probably notice that the table address isn't really used anywhere in the query (other than in the join's ON clause), and has no actual contribution to the query. The columns are selected from the customer table, and there are no predicates for the address table (or any predicates for that matter). The existence of a PRIMARY KEY for address_id in the table address and the FOREIGN KEY for the same column in the table customer should provide the database's optimizer the confidence to conclude that the join to the table address can be spared.
SELECT c.first_name, c.last_name FROM customer c JOIN address a ON c.address_id = a.address_id
Yet, as you can see from the EXPLAIN, MySQL executes the join, without applying the expected query transformation.
When submitting the same query and schema structure to EverSQL, it will output the following query and recommendation:
In a similar manner, MySQL 8.0.16 will execute the join for all other examples in the Join Elimination section of the original post, so I saw no reason to repeat them here.
Unneeded Self JOIN
In the following query, the table actor is joined to itself. It can be proven that a1 = a2, because the tables are joined using the primary key actor_id. Therefore, anything we can do with a2, can actually be done with a1 as well. Therefore, we can can modify the references to a2 in the SELECT clause to the same columns in a1, and remove the redundant join to a2.
SELECT a1.first_name, a2.last_name FROM actor a1 JOIN actor a2 ON a1.actor_id = a2.actor_id;
MySQL 8.0.16 will execute the join in this case as well, without applying the expected query transformation.
When submitting the same query and schema structure to EverSQL, it will output the following query and recommendation:
Original queries and information for this section can be found here.
Predicate Pushdown
We should always strive to have our SQL queries process as less data as possible, especially if the filtering is done using indexes. In the following query, we expect MySQL to push the condition from the outer query to both parts of the UNION ALL, to make sure we filter out as much data as we can, as early as possible.
SELECT * FROM (SELECT first_name, last_name, 'actor' type FROM actor UNION ALL SELECT first_name, last_name, 'customer' type FROM customer) people WHERE people.last_name = 'DAVIS';
As you can see below, the transformation isn't applied by MySQL, and both tables, actor and customer are scanned in full.
Submitting the query and schema structure to EverSQL will result in the following query and recommendation:
When looking at the execution plan of the optimized query, you can see that the indexes are used, and significantly less data is scanned:
Wrapping up
Query transformations can be very powerful, and it's important to understand which of them will be applied automatically by the database and which won't. In this post, we listed three examples (originally posted by Lukas Eder), in which MySQL 8.0.16 didn't apply the expected transformations, which eventually resulted in non-optimal execution plans.