In this blog post, we are going to discuss the top 10 SQL query optimization techniques to get precise data from the database.
Structured Query Language (SQL) performance tuning can be an incredibly challenging task, especially when working with large-scale data where even the smallest change can have a dramatic performance effect, positively or negatively. In order to get the exact data we’re looking for we need to provide the appropriate query.
1) Define the requirements
Frame the optimal requirements before starting to write the query. This will help refine the query to avoid fetching
unwanted data from the table.
2) SELECT fields, rather than using SELECT *
Use the SELECT statement optimally, instead of always fetching all data from the table. Fetch only the necessary data from the table, thereby avoiding the costs of transferring unwanted data and processing it.
Inefficient:
1 2 3 |
SELECT * FROM Business |
Efficient:
1 2 3 |
SELECT Name, Phone, Address, CompanyZip FROM BusinessTop 10 SQL Query Opti |
This query is much simpler and only pulls the required details from the table.
3) Avoid DISTINCT in SELECT query
SELECT DISTINCT is a simple way of removing duplicates from a database. SELECT DISTINCT works to generate distinct outcomes by using the GROUP BY clause, which groups all the fields in the query. However, a large amount of processing power is required to do this. So, avoid DISTINCT in SELECT queries.
Inefficient:
1 2 3 |
SELECT DISTINCT FName, LName, Country FROM Customers |
Multiple people in the same country might have the same first and last name.
Efficient:
1 2 3 |
SELECT ID, FName, LName, Country, State, City, Zip FROM Customers |
Unduplicated records are returned without using SELECT DISTINCT by adding more fields.
4) Indexing
Indexing in SQL Server helps retrieve data more quickly from a table, thereby giving a tremendous boost to SQL query performance. Allow effective use of clustered and non-clustered indexes. Understand the query’s intent and choose the right form for your scenario.
Use a covering index to reduce the time needed for the execution of commonly used statements. Indexes occupy disk space. The more indexes you have, the greater the space used on the disk. In SQL Server
, a clustered index requires no additional disk space, but any non-clustered index needs additional disk space as it is stored separately from the list.
5) To check the existence of records, use EXISTS() rather than COUNT()
Both EXISTS() and COUNT() methods can be used to check the existence of a record entry in the table. The EXISTS() method is more effective as it exits processing as soon as it finds the first entry of the record in the table. The COUNT() method would scan the entire table to return the number of records in the table that match the provided constraint.
Inefficient:
1 2 3 4 |
IF (SELECT COUNT(Id) FROM Business WHERE Name like ‘ABC%’) > 0 PRINT ‘YES’ |
Efficient:
1 2 3 4 |
IF EXISTS (SELECT Id, Name FROM Business WHERE Name like ‘ABC%’) PRINT ‘YES’ |
6) Limit your working data set size
The fewer data retrieved, the faster the query will run. Instead of adding too many client-side filters, filter the data at the server as much as possible. This limits the data sent on the wire, and you will be able to see the results much more quickly.
7) Use WHERE instead of HAVING
The HAVING clause filters the rows after all the rows are selected. It works just like a filter. Do not apply the HAVING clause for any other purpose. HAVING statements are determined in the SQL operating order after WHERE statements. Therefore, it is quicker to execute the WHERE query.
Inefficient:
1 2 3 4 5 6 |
SELECT c.ID, c.CompanyName, b.CreatedDate FROM Business b JOIN Company c ON b.CompanyID = c.ID GROUP BY c.ID, c.CompanyName, b.CreatedDate HAVING b.CreatedDate BETWEEN ‘2020-01-01’ AND ‘2020-12-31’ |
Efficient:
1 2 3 4 5 6 |
SELECT c.ID, c.CompanyName, b.CreatedDate FROM Business b JOIN Company c ON b.CompanyID = c.ID WHERE b.CreatedDate BETWEEN ‘2020-01-01’ AND ‘2020-12-31’ GROUP BY c.ID, c.CompanyName, b.CreatedDate |
8) Ignore linked subqueries
A linked subquery depends on the query from the parent or from an external source. It runs row by row, so the average cycle speed is greatly affected.
Inefficient:
1 2 3 |
SELECT b.Name, b.Phone, b.Address, b.Zip, (SELECT CompanyName FROM Company WHERE ID = b.CompanyID) AS CompanyName FROM Business b |
For each row returned by the external query, the inner query is run every time. Alternatively, JOIN can be used to solve these problems for SQL database optimization.
Efficient:
1 2 3 4 |
SELECT b.Name, b.Phone, b.Address, b.Zip, c. CompanyName FROM Business b Join Company c ON b.CompanyID = c.ID |
9) Use of temp table
This is yet another issue that is very difficult to solve. In many cases, we use a temp table to stop double-dipping into large tables. A temp table can also be used to significantly reduce the mandatory computing power when dealing with large volumes of data. When linking data from a table to a large table, add a large subset to reduce the efficiency hindrance.
10) Don’t run queries in a loop
Coding SQL queries in loops slow the entire sequence. Instead of writing a question and running it in a loop, bulk insert and update can be used depending on the situation.
Inefficient:
1 2 3 4 5 6 |
for (int i = 0; i < 10; i++) { $query = “INSERT INTO Business (X,Y,Z) VALUES . . . .”; printf (“New Record has been inserted”); } |
Efficient:
1 2 3 |
INSERT INTO Business (X,Y,Z) VALUES (1,2,3), (4,5,6). . . . |
Summary
In this blog, we’ve covered the top 10 tips for optimizing SQL queries. The most important part is learning the rules of how to use and understanding the nuances of working with the main objects in a database, such as tables and indexes. With these skills, optimizing and analyzing SQL should be fun and simple.
So, try these techniques and let us know how well they work for you in the comments section below!
Leave a Comment