Database Optimization
Table of Contents
- An Overview
- SQL vs NoSQL
- Technique 1 : Indexing
- Technique 2 : Identifying & tuning slow queries
- Technique 3 : Joins
An Overview
What is database optimization?
Database optimization refers to a variety of strategies for reducing database system response time.
Why do we need database optimization?
As the volume of data increases, the load on the system increases. This degrades system performance and often leads to slower response times.
The process of tuning the system to avoid delays and reduce response time is known as database optimization.
Database optimization aims to make the best use of available resources while minimizing idle time.
How to optimize database performance?
Several key practices are important in any database optimization process:
- Indexing
- Query optimization
- Limit data retrieval
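As an illustration of "limit data retrieval", here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its columns are hypothetical and not part of the original material. It contrasts fetching everything with fetching only the columns and rows that are needed.

```python
import sqlite3

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL, notes TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, total, notes) VALUES (?, ?, ?)",
    [(f"customer_{i % 50}", i * 1.5, "x" * 200) for i in range(1000)],
)

# Fetches every column of every row, including the bulky notes column.
everything = conn.execute("SELECT * FROM orders").fetchall()

# Fetches only the needed columns and caps the number of rows returned.
top_ten = conn.execute(
    "SELECT id, total FROM orders ORDER BY total DESC LIMIT 10"
).fetchall()

print(len(everything), len(everything[0]))  # 1000 rows, 4 columns each
print(len(top_ten), len(top_ten[0]))        # 10 rows, 2 columns each
```

The second query moves far less data between the database and the application, which is the point of this practice.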
SQL vs NoSQL
High-Level Differences
SQL | NoSQL |
---|---|
Relational databases (RDBMS) | Non-relational or distributed databases |
Table-based databases | Document-based, key-value pairs |
Predefined schema | Dynamic schema for unstructured data |
Vertically scalable: scaled by increasing the horsepower of the hardware | Horizontally scalable: scaled by increasing the number of database servers |
Best fit for heavy-duty transactional applications, as they are more stable and promise the atomicity and integrity of the data | Highly preferred for large data sets, e.g. HBase (big data) |
Technique 1
Indexing
What is a Database Index?
- A data structure built on one or more columns of a table.
- It speeds up data retrieval.
- It helps locate data without going through each and every row every time the database is accessed.
- By using the indexed columns, it minimizes the number of disk accesses needed for each query.
- This makes indexing a powerful technique for database optimization, improving the overall performance of the database.
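The effect described above can be observed with a small sketch, assuming SQLite through Python's built-in sqlite3 module. The `users` table, the index name, and the helper `plan` are hypothetical. `EXPLAIN QUERY PLAN` reports whether SQLite scans the whole table or searches via an index; the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(10000)],
)

def plan(sql, params=()):
    """Return SQLite's query plan as one string (the detail text is the last column)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT name FROM users WHERE email = ?"
args = ("user4242@example.com",)

plan_before = plan(query, args)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query, args)   # with the index: a search using idx_users_email

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists the plan is a scan of `users`; afterwards it names `idx_users_email`, meaning only the matching entries are visited.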
What is the structure of a Database Index?
An index consists of two columns
- Search Key
  - mainly a copy of the primary key of the table, stored in sorted order for quick access to the data.
- Data Reference
  - the set of pointers that hold the address of the disk block where a particular key value can be found.
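The two-column structure can be mimicked in plain Python. This is a toy sketch, not how any real database lays out its index: the `blocks` dictionary stands in for disk blocks, and all names and rows are hypothetical.

```python
from bisect import bisect_left

# Hypothetical "disk blocks": block number -> rows (key, name) stored in that block.
blocks = {
    0: [(17, "Mei"), (4, "Arif")],
    1: [(42, "Siti"), (8, "Joan")],
    2: [(23, "Ravi"), (15, "Lina")],
}

# The index: (search key, data reference) pairs kept in sorted key order.
index = sorted((key, block_no) for block_no, rows in blocks.items() for key, _ in rows)
keys = [key for key, _ in index]

def lookup(key):
    """Binary-search the sorted keys, then read only the block the pointer names."""
    pos = bisect_left(keys, key)
    if pos == len(keys) or keys[pos] != key:
        return None  # key is not in the index
    block_no = index[pos][1]
    for row in blocks[block_no]:
        if row[0] == key:
            return row
    return None

print(lookup(23))  # (23, 'Ravi')
print(lookup(99))  # None
```

Because the search keys are sorted, the lookup touches one block instead of all three.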
Indexes can be of two types
- Ordered Index
  - Based on the sorted ordering of the primary key values.
- Hash Index
  - Based on the uniform distribution of values across a range of buckets.
  - The bucket for a value is determined by a hash function.
  - A hash index is not sorted.
  - For queries that require sorted data, hash indexes might not be helpful.
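The difference between the two types can be sketched in Python, using a dict as a stand-in for a hash index and a sorted list as a stand-in for an ordered index. This is an analogy under those assumptions, not a database implementation, and the data is made up.

```python
from bisect import bisect_left, bisect_right

rows = [(30, "c"), (10, "a"), (50, "e"), (20, "b"), (40, "d")]

hash_index = {key: value for key, value in rows}  # buckets chosen by a hash function
ordered_index = sorted(rows)                      # entries kept in key order
ordered_keys = [key for key, _ in ordered_index]

def hash_lookup(key):
    """Exact-match lookup: one hash computation, no ordering involved."""
    return hash_index.get(key)

def range_ordered(low, high):
    """Range query on the ordered index: two binary searches, then one slice."""
    start = bisect_left(ordered_keys, low)
    stop = bisect_right(ordered_keys, high)
    return ordered_index[start:stop]

def range_hash(low, high):
    """Range query on the hash index: every entry must be examined, then sorted."""
    return sorted((k, v) for k, v in hash_index.items() if low <= k <= high)

print(hash_lookup(40))        # d
print(range_ordered(20, 40))  # [(20, 'b'), (30, 'c'), (40, 'd')]
print(range_hash(20, 40))     # same rows, but found by a full pass over the index
```

Both range functions return the same rows; only the ordered index can find them without visiting every entry.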
How does Indexing help in Database Optimization?
It minimizes the number of data accesses needed to process a query.
When to use it?
Database indexing is especially effective when the data is at scale, with a large number of records.
When not to use it?
What is the catch?
Every index requires extra memory and storage.
Too many indexes can slow the database down, because of the extra space they take and because each index must be updated whenever the indexed data changes.
Use indexing only when required.
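The extra space taken by indexes can be measured. A minimal sketch, assuming SQLite through Python's sqlite3 module; the `events` table and the index names are hypothetical. `PRAGMA page_count` reports how many pages the database currently occupies.

```python
import sqlite3

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, kind) VALUES (?, ?)",
    [(i % 500, f"kind_{i % 7}") for i in range(20000)],
)
conn.commit()

def page_count():
    """Number of pages the database currently occupies."""
    return conn.execute("PRAGMA page_count").fetchone()[0]

pages_before = page_count()
conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
conn.execute("CREATE INDEX idx_events_kind ON events (kind)")
conn.commit()
pages_after = page_count()

print("pages without indexes:", pages_before)
print("pages with two indexes:", pages_after)
```

The page count grows once the indexes exist, which is the cost to weigh against faster reads.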
Technique 2
Identifying And Tuning
Queries
What are Slow Queries?
Queries that take a long time to return the required data.
These queries tend to process far more data than is actually required, causing unwanted delays.
What can cause Slow Queries?
Some of the major causes of slow queries are:
- Queries are overworking: they process more data than the query actually requires.
- Queries might be waiting for their turn to execute. Queueing often causes unexpected delays.
How to identify Slow Queries?
Here are a few methods to find slow queries in different databases:
- Using SQL Server DMVs (Dynamic Management Views) to find slow queries
- Using the MySQL slow query log analyzer
- Using Oracle slow query log analyzer
- Enabling the MariaDB slow query log
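Where none of the database-side tools above is available, slow queries can also be spotted from the application. The sketch below is an assumption-laden stand-in, not any database's own slow query log: `timed_query` and its `threshold_s` parameter are hypothetical names, and SQLite is used only to keep the example self-contained.

```python
import sqlite3
import time

def timed_query(conn, sql, params=(), threshold_s=0.5):
    """Run a query, measure its wall-clock time, and flag it if it reaches the threshold."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed = time.perf_counter() - start
    is_slow = elapsed >= threshold_s
    if is_slow:
        print(f"SLOW ({elapsed:.4f}s): {sql}")
    return rows, elapsed, is_slow

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

# A threshold of 0 flags every query, which makes the logging path visible.
rows, elapsed, flagged = timed_query(conn, "SELECT COUNT(*) FROM t", threshold_s=0.0)

# A generous threshold leaves fast queries unflagged.
_, _, flagged_generous = timed_query(conn, "SELECT COUNT(*) FROM t", threshold_s=3600.0)
```

In practice the threshold would be set to whatever response time the application considers too slow.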
How to tune Slow Queries for database optimization?
It can be done by choosing the most efficient means of executing a SQL statement.
- Identify slow queries using slow query logs
- Analyse the slow query in detail to find alternative efficient means
- Rewrite complex subqueries using temporary tables to improve turnaround time
- When working on joins, use an inner join instead of an outer join wherever the non-matching rows are not needed
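The temporary-table suggestion can be sketched as follows, assuming SQLite through Python's sqlite3 module. The `customers` and `orders` tables, the `order_totals` temporary table, and the sample rows are all hypothetical. Whether the rewrite is actually faster depends on the database and the data, so it should be measured rather than assumed.

```python
import sqlite3

# Hypothetical tables and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers (id, name) VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders (customer_id, total) VALUES (1, 10.0), (1, 15.0), (2, 7.5);
""")

# Version 1: a correlated subquery, written twice and evaluated per customer row.
with_subquery = conn.execute("""
    SELECT c.name,
           (SELECT SUM(o.total) FROM orders o WHERE o.customer_id = c.id) AS spent
    FROM customers c
    WHERE (SELECT SUM(o.total) FROM orders o WHERE o.customer_id = c.id) > 8
    ORDER BY c.name
""").fetchall()

# Version 2: aggregate once into a temporary table, then inner-join against it.
conn.execute("""
    CREATE TEMP TABLE order_totals AS
    SELECT customer_id, SUM(total) AS spent
    FROM orders
    GROUP BY customer_id
""")
with_temp_table = conn.execute("""
    SELECT c.name, t.spent
    FROM customers c
    INNER JOIN order_totals t ON t.customer_id = c.id
    WHERE t.spent > 8
    ORDER BY c.name
""").fetchall()

print(with_subquery)    # [('Ana', 25.0)]
print(with_temp_table)  # [('Ana', 25.0)]
```

Both versions return the same rows; the second computes the per-customer totals once instead of once per row.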
Technique 3
Joins
What are Database Joins?
Joins are SQL operations used to combine rows from two or more tables based on a logical relationship between their columns. The related columns hold matching values in both tables.
What are Inner and Outer Joins?
Inner and outer joins both retrieve combined data from two or more database tables, but they operate differently.
Inner Joins | Outer Joins |
---|---|
non-matching values are excluded | non-matching values are included |
In set theory terms: intersection | In set theory terms: union |
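The difference in the table can be seen on a small data set. A minimal sketch, assuming SQLite through Python's sqlite3 module; the `authors` and `books` tables and their rows are hypothetical. A left outer join is used as the outer-join example.

```python
import sqlite3

# Hypothetical tables and rows, used only for illustration. Cy has no books.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO authors (id, name) VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO books (author_id, title) VALUES (1, 'Indexes'), (1, 'Joins'), (2, 'Queries');
""")

# Inner join: only authors that have at least one matching book.
inner_rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors a
    INNER JOIN books b ON b.author_id = a.id
    ORDER BY a.name, b.title
""").fetchall()

# Left outer join: every author, with NULL (None) where no book matches.
outer_rows = conn.execute("""
    SELECT a.name, b.title
    FROM authors a
    LEFT OUTER JOIN books b ON b.author_id = a.id
    ORDER BY a.name, b.title
""").fetchall()

print(inner_rows)  # [('Ana', 'Indexes'), ('Ana', 'Joins'), ('Ben', 'Queries')]
print(outer_rows)  # the same three rows plus ('Cy', None)
```

The outer join returns one extra row for the author with no match, so the two joins are interchangeable only when such rows are not wanted.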
How are Joins important for Database Optimization?
Joins are unavoidable in complex queries.
Joins can also be a reason for slow query processing, causing slower data retrieval and poorer performance.
A query using outer joins generally has more data to process than one using inner joins.
How to optimize Database while using Joins?
It is therefore advisable to prefer inner joins over outer joins whenever the non-matching rows are not needed in the result.
Database Optimization
By Syafiq bin abdul rahman