After my graduation, I knew how to write queries, create tables and use basic SQL commands. I thought I knew MySQL very well, but after a year of working in the industry and rewriting all my queries again and again, I realized that there is more to writing queries than just getting accurate results. Therefore, I decided to share some of these small but essential hints on MySQL with you all.
- Choose column data types carefully. Do not use
datetime columns (believe me, I have seen people using it) or
varchar for saving IP addresses. Best way to store IP address is using unsigned int data type and functions
INET_NTOA() to convert between two notations.
- Using timestamp is a good practice, but bear in mind that different versions of MySQL treat timestamp differently. Read your MySQL version specific documentation to understand more.
- Try not to set default values to
null while creating a table. In other words, it is better to include
NOT NULL in column definition to achieve faster index look up and smaller column size.
- Foreign key is a good choice to maintain data integrity across multiple tables. However, they slow down insertions. Similar functionality with a better performance can be achieved with a well written software/application.
- Using auto increment columns might cause table level locks which is not desirable, if you have high concurrency on the table. I personally use it, but it’s something you need to be aware of. The workaround would be redesigning table or application to use an alternative or configuring. innodb_autoinc_lock_mode which is more advance.
- It is not ideal to have a table with too many columns. I would say 10 to 20 columns per table are normal. Normalization is a technique to break down bulky tables to small set of tables with relevant data. This technique can be applied to create smaller tables with less redundant data. However, too normalized table design slows down query throughput. Finding common ground is a little bit of a challenge.
- If you don’t know what engine to use or do not have any specific needs, use Innodb as your default engine.
- utf8 is good for storing mix encoded characters in the database e.i Arabic and Chinese. The downside is that utf8 can occupy disc and memory space up to 3 times higher than ASCII character set. So, if stored data are plain, English and numbers go on with ASCII characters.
Things to keep in mind before creating indexes:
- Benchmark and document the performance before and after creating the index. While benchmarking your queries, make sure to add
SQL_NO_CACHE after the
SELECT keyword to avoid cache hits while benchmarking. mysqlslap is a useful tool for benchmarking.
- Index the columns that MySQL uses to locate rows. For example, columns in
- Columns with lower redundant values are good candidates for indexing. Highly selective index is desirable because it lets MySQL filter out more rows when it looks for matches.
- Data type plays an important role on indexes. Two main elements in selecting a data type are performance and memory space.
Date are among the best and
text among the worst data types for indexing. If
varchar is your only option for indexing, you may index only the first few characters of the
varchar column to achieve a good performance while preserving space.
- B-Tree indexes (most of the indexes in MySQL) work well for lookups by the full key value (e.i a = 10), a key range (e.i a > 10), or a key prefix (e.i b LIKE 'jo%').
- Using Multiple-Column Indexes is more efficient than creating separate indexes on each column. If you index more than one column, the column order is very important. Index is used only if the lookup uses the leftmost prefix of the index and equations are not range.
- Using Multiple-Column Indexes is desirable for creating covering index.
- In order to utilize indexes, you need to do the look up on indexed columns. Try using indexed columns in your
- Do not modify the indexed column during look up (Column Isolation). Modifying the value will disable the index on that column. e.g
DATE(datetimeClm) = NOW() will not use the index on
- Always keep the indexed column on the left side of expression and the variable on the right side.
- When limit and offset are used in a query, MySQL retrieves limit + offset number of rows and sends back the number of rows specified in offset. Therefore, you will save time on data transmission between the server and the client, but a lot of rows are retrieved and discarded right after. This is fairly inefficient activity with lots of overhead. If your table is filled in a particular order or your table has an auto increment primary key, use them to specify the range, e.g.:
SELECT * FROM tbl_name WHERE id BETWEEN 100 AND 200
SELECT * FROM tbl_name LIMIT 100,100
EXPLAIN keyword in front of a query displays the server execution plan. Information provided can be used to tweak the query performance. This page illustrates the meanings of the output.
join is more optimized when one of the two columns in the
join expression is indexed. Do remember to name the column with index on the left side of
join expression so the MySQL server can use the index, e.g.:
SELECT * FROM tblA a INNER JOIN tblB b ON b.indexedClm = a.notIndexedClm;
SQL_CALC_FOUND_ROWS runs the query again to calculate the number of rows returned, so it is not the optimal option.
- MySQL general log is very handy. I personally use it to get the queries and users connecting to MySQL Server in a certain period of time. By default, the result of
general_log is saved in a file but you can change it to save it on table which is handier.
SET GLOBAL log_output='TABLE';
SET GLOBAL general_log=1;
After query executed, run the following to find the list of activities and queries that took place while general log was enabled.
SELECT * FROM mysql.general_log;
Do remember to turn off the logging process when you done using it. You don’t want extra overhead on your server.
SET GLOBAL general_log=0;
- MySQL Query Browser, workbench and Heidi are very useful tools that allow you to view query results in grid view, modify tables, and view running processes and much more. MySQL Query Browser is deprecated and it is very unstable, but I still like it.
- MySQL has a configuration file named my.ini (on Windows). By default, it is located in the installation directory. MySQL Server can be optimized by tweaking the variables. But do not change values without completely understanding the impact. If you have lots of RAM on your server and you want MySQL to use some of it, you may increase the
innodb_buffer_pool_size up to a third of physical memory (just to be on the safe side). This will reserve more memory for MySQL (only if tables are using innodb engine) but make sure you read about it before making any changes.
- Values in my.ini will take effect after a MySQL server restart. If the server did not start after changes are made, do not freak out and uninstall MySQL Server. Probably you have given a wrong value to a variable or it is in a wrong format. Note that values are key sensitive and keep a backup of the original my.ini before changing it.
- If the application connected to MySQL has a different character set than the table it is accessing, all data transmitted between MySQL Server and the application has to be converted to the receiver character set. This is a very costly activity. The application character set can be configured per session by SQL command
SET NAMES 'charset_name' COLLATE 'collation_name' or set the default application character set from my.ini. You can learn more from this article.
- Designing a database and table structure for efficient queries is much easier if you know what the queries are before you start. In other words, design the reports / webpages before the database. Otherwise, be prepared to restructure in production databases.