Need To Improve Performance In MySQL? Try Prepared Statements
There are two main ways to execute MySQL queries against databases, through simple statements and through prepared statements. This article will cover the benefits of using prepared statements.
What is a normal Statement?
Simple SQL queries need no detailed explanation, since anyone who has ever sent an SQL query to a database server has most likely used a simple statement. So the question is, why wouldn't this be optimal when everyone uses it and it's the standard?
First, let's look at how a query is administered by MySQL. I don't want to go into every detail but only give a rough insight. We take a statement that we want to use to query all invoices from a table that are still open. How this data is created is not interesting, let's just assume that the statement fits our database schema.
SELECT * FROM invoice WHERE paid = FALSE;
Now when we send this statement to the server, it has to check a few things. For example, it must check whether all the statements it contains are syntactically correct. Since we have only sent one statement, which is also extremely short, the effort should be very small, but it is there and also grows with the length of the statement. If the server has come to the conclusion that the statement is correct, it must be checked whether the table and the column to which we refer exist at all. In addition, the optimizer tries to find a way to solve the statement as quickly as possible. For example, if there is an index that contains paid as the first column, this index can be used to answer the statement. If there is no matching index, MySQL has to go through the whole table. As you can see, before actually looking at data, there are some questions to be answered. If the index can be used to find out that there are no matching rows, then no data will be considered at all.
If we now send this statement to the server again, it has to answer all these questions again. Since most queries are executed by applications while they are being used by many users, this can quickly become a few thousand queries per second. That's exactly the situation where prepared statements become interesting. We can use them to tell the server: I need you to figure something out for me, and please make a note of how you figured it out, because I'll be asking the question again soon.
How do Prepared Statements differ?
If you use a prepared statement instead of a simple statement, this is usually already communicated to the server when it is created. Regardless of whether the statement is to be executed at that moment or not, the server can already perform the basic checks. In addition, the optimizer can already create a plan, how the statement can be solved best. If you want to execute the statement, the server will be informed and it can use the information it has calculated before. If a statement is now executed very often, a considerable part of the runtime can be saved. In case you do not want to execute the statement again, you can still tell the server not to store the information for the statement any further.
Our previous example is simplified in the sense that it is completely static. If you want to have the paid or unpaid invoices for certain accounts, for example, the whole thing could look like this:
SELECT * FROM invoice WHERE paid = ? AND account_id = ?;
The placeholders ? can be substituted by any input data, depending on the account and status for which you want to query the information. The advantage of this is that you can send this statement with the placeholders to the server as a prepared statement. As soon as you request concrete information, you only have to tell the server, please execute the prepared statement with the following parameters. Otherwise, not much changes and the complete process is relatively clear.
mysql> PREPARE stmt FROM 'SELECT * FROM invoice WHERE paid = ? AND account_id = ?;';
mysql> SET @paid = FALSE;
mysql> SET @accountId = 12;
mysql> EXECUTE stmt USING @paid, @account_id;
...
mysql> DEALLOCATE PREPARE stmt;
When to use Prepared Statements?
However, as the number of different statements increases, it quickly becomes impossible to prepare all statements by hand. In addition, you have to consider that preparing the information for the statements on the server side also takes up memory, so the question arises: when is it worth it?
For example, if you use frameworks like Hibernate, then by default everything is a prepared statement and for almost all applications that use Hibernate, this makes sense, because practically all statements are executed over and over again with different input data. If you use PyMySQL, however, prepared statements are not supported there (see discussion). In such cases it is therefore questionable whether it is worthwhile to rebuild the whole thing in your own application.
An interesting plan, which is pursued with PyMySQL is the implementation of the X Plugin of MySQL, which was introduced with version 8.0. Reqeuests sent to the server via the new protocol benefit from server-side prepared statements, which in effect means that you send simple statements to the server and it takes care of the rest. This possibility is available as of MySQL 8.0.16 (MySQL Docs).
In conclusion, prepared statements are worthwhile much more often than simple statements, not without reason this feature became the default behavior in the new X protocol.