Index optimization
from B-tree to ICP
YAPC::EU 2016
Follow this talk
Who I am
Pavel Scherbinin (Dzirtik)
- Leader of Moscow.PM
- Organisator YAPC::Russia
- Perl programmer (8+ years)
- Team Lead
- DBA
- Lecturer

Technopark @ Mail.Ru
Bauman State University + Mail.Ru
- 2 years
- 14 courses
- 60 students every half-year
- additional courses:
Perl!!!
...
https://park.mail.ru/

Indexes
Phonebook
B-Tree

Multiple-Column Indexes
CREATE TABLE People (
id int(11) autoincrement,
last_name varchar(50) not null,
first_name varchar(50) not null,
dob date not null,
gender enum(m, f) not null,
primary key (id),
key(last_name, first_name, dob)
);Multiple-Column Indexes

Multiple-Column Indexes

Text
Multiple-Column Indexes
-
Akroyd
-
Christian
- 1958-12-07
-
Debbie
- 1978-11-02
- ...
-
Christian
-
Allen
-
Cuba
- 1960-01-01
-
Kim
- 1930-07-12
- ...
-
Cuba
- ...
-
Basinger
-
Viven
- 1976-12-08
- 1979-01-24
-
Viven
What we can do
- Search by full equal
- Search by left prefix
- Search by column prefix
- Search by range
- Search by full equal plus by range
- Queries only by index (covered indexes)
Search by full equal
SELECT
COUNT(*)
FROM
topics
WHERE
forum_id = 2;
-- index(forum_id)Search by full equal
SELECT
COUNT(*)
FROM
topics
WHERE
forum_id = 2
AND author = "ex@ex.com";
-- index(forum_id, author)
-- || index(author, forum_id)Search by left prefix
-- index(forum_id, author)
SELECT COUNT(*) FROM topics
WHERE forum_id = 2;
-- using index
SELECT COUNT(*) FROM topics
WHERE author = "ex@ex.com";
-- not using indexSearch by column prefix
-- index(title)
SELECT COUNT(*) FROM topics
WHERE title LIKE 'help%';
-- using index
SELECT COUNT(*) FROM topics
WHERE title LIKE '%help%';
-- not using indexSearch by range
-- index(forum_id)
SELECT COUNT(*) FROM topics
WHERE forum_id > 500;
SELECT COUNT(*) FROM topics
WHERE forum_id BETWEEN 500 AND 750; Search by full equal plus by range
-- index(forum_id, replies_cnt)
SELECT COUNT(*) FROM topics
WHERE forum_id = 500
AND replies_cnt > 1000;
-- using index
SELECT COUNT(*) FROM topics
WHERE replies_cnt > 100;
-- not using indexWhat we can't do
- Search without left part of index
- Skip colums
- Search after range seek
Search without left part of index
-- index(
-- forum_id, replies_cnt, author)
SELECT COUNT(*) FROM topics
WHERE replies_cnt > 1000;
-- not using index
SELECT COUNT(*) FROM topics
WHERE author = "ex@ex.com";
-- not using indexSkip colums
-- index(
-- forum_id, replies_cnt, author)
SELECT COUNT(*) FROM topics
WHERE forum_id = 500
AND author = "ex@ex.com";
-- using only forum_id partSearch after range seek
-- index(
-- forum_id, replies_cnt, author)
SELECT COUNT(*) FROM topics
WHERE forum_id > 500
AND replies_cnt > 1000;
-- using only forum_id partIndex Condition Pushdown

Index Condition Pushdown
-- index(
-- forum_id, replies_cnt, author)
SELECT COUNT(*) FROM topics
WHERE forum_id > 500
AND replies_cnt > 1000;
-- since MySQL 5.6
-- take forum_id rows and after then
-- scan info about replies_cnt
-- in indexIndex Condition Pushdown
-- index(forum_id, title)
SELECT COUNT(*) FROM topics
WHERE forum_id > 500
AND title LIKE '%help%';When we use indexes
-
Search by WHERE
-
JOIN tables
-
Sorting and grouping tables (ORDER BY & GROUP BY)
-
MAX and MIN for key fields
-
To retrieve data not from the table data, and from only the index file (covering indexes)
Search by WHERE

Search by WHERE
-- index(???)
SELECT COUNT(*) FROM topics
WHERE title LIKE '%help%'
AND replies_cnt > 100;Search by WHERE
-- index(replies_cnt, title)
SELECT COUNT(*) FROM topics
WHERE title LIKE '%help%'
AND replies_cnt > 100;Search by WHERE
-- index(???)
SELECT COUNT(*) FROM topics
WHERE forum_id > 500
AND replies_cnt > 100;Search by WHERE
-- index(forum_id, replies_cnt)
-- index(replies_cnt, forum_id) too
SELECT COUNT(*) FROM topics
WHERE forum_id > 500
AND replies_cnt > 100;JOIN tables

JOIN tables

JOIN tables
SELECT
tbl1.col1, tbl2.col2
FROM tbl1
INNER JOIN tbl2 USING(col3)
WHERE
tbl1.col1 IN (5, 6);JOIN tables
outer_iter = iterator over tbl1 where col1 IN(5,6)
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tbl2 where col3 = outer_row.col3
inner_row = inner_iter.next
while inner_row
output [ outer_row.col1, inner_row.col2 ]
inner_row = inner_iter.next
end
outer_row = outer_iter.next
endJOIN tables
outer_iter = iterator over tbl1 where col1 IN(5,6)
outer_row = outer_iter.next
while outer_row
inner_iter = iterator over tbl2 where col3 = outer_row.col3
inner_row = inner_iter.next
if inner_row
while inner_row
output [ outer_row.col1, inner_row.col2 ]
inner_row = inner_iter.next
end
else
output [ outer_row.col1, NULL ]
end
outer_row = outer_iter.next
endJOIN tables

JOIN tables

JOIN tables
SELECT COUNT(*)
FROM forums f
INNER JOIN topics t ON (f.id = t.fid)
WHERE f.id = 5 AND t.replies_cnt > 500;
-- indexes:
-- forums (???)
-- topics (???)JOIN tables
SELECT COUNT(*)
FROM forums f
INNER JOIN topics t ON (f.id = t.fid)
WHERE f.id = 5 AND t.replies_cnt > 500;
-- indexes:
-- forums (id)
-- topics (fid, replies_cnt)JOIN tables
SELECT COUNT(*)
FROM forums f
LEFT JOIN topics t ON (f.id = t.fid)
WHERE f.id = 5;
-- indexes:
-- forums (???)
-- topics (???)JOIN tables
SELECT COUNT(*)
FROM forums f
LEFT JOIN topics t
ON (f.id = t.fid)
AND t.replies_cnt > 500
WHERE f.id = 5;
-- indexes:
-- forums (???)
-- topics (???)JOIN tables
SELECT COUNT(*)
FROM forums f
LEFT JOIN topics t
ON (f.id = t.fid)
WHERE f.id = 5 AND t.replies_cnt > 500;
-- indexes:
-- forums (???)
-- topics (???)Sorting and grouping tables
SELECT COUNT(*)
FROM topics
WHERE forum_id = 42
ORDER BY created_at DESC;
-- index (forum_id, created_st)
-- the same for ASCSorting and grouping tables
SELECT COUNT(*)
FROM topics
WHERE forum_id = 42
GROUP BY created_at;
-- index (forum_id, created_st)Sorting and grouping tables
SELECT COUNT(*)
FROM topics
WHERE forum_id = 42
GROUP BY created_at;
-- index (forum_id, created_st)MAX and MIN for key fields
SELECT MAX(replies_cnt)
FROM topics
WHERE forum_id = 42
-- index (forum_id, replies_cnt)Covering Indexes
- Compact
- Sorted
- Cached
Covering Indexes
SELECT author
FROM topics
WHERE forum_id = 42
AND replies_cnt > 300;
-- index (???)Clustered index

Clustered index


Clustered index


Clustered index


Clustered index

Clustered index

Clustered index


The Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news
story
The Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news
story
| news_id | story_id |
|---|
sn
The Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news
story
| news_id | story_id |
|---|
sn
SELECT *
FROM story
INNER JOIN sn ON story.id = sn.story_id
INNER JOIN news ON sn.news_id = news.id
WHERE story.id = ?
ORDER BY news.time;The Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news
story
| news_id | story_id |
|---|
sn
SELECT *
FROM sn
INNER JOIN news ON sn.news_id = news.id
WHERE sn.story_id = ?
ORDER BY news.time;The Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news
story
| news_id | story_id |
|---|
sn
SELECT news_id FROM sn
WHERE sn.story_id = ?;
FOR EACH ROW:
SELECT * FROM news
WHERE news.id = ?;
SORT BY timeThe Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news (id)
story
| news_id | story_id |
|---|
sn(story_id, news_id)
SELECT news_id FROM sn
WHERE sn.story_id = ?;
FOR EACH ROW:
SELECT * FROM news
WHERE news.id = ?;
SORT BY timeThe Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news (id)
story
| news_id | story_id | time |
|---|
sn(story_id, time, news_id)
SELECT news_id FROM sn
WHERE sn.story_id = ?
ORDER BY time;
FOR EACH ROW:
SELECT * FROM news
WHERE news.id = ?;The Story about News
| id | time | body | ... |
|---|
| id | time | title | ... |
|---|
news (id)
story
| news_id | story_id | time |
|---|
sn(story_id, time, news_id) + FK ON news(id, time)
SELECT news_id FROM sn
WHERE sn.story_id = ?
ORDER BY time;
FOR EACH ROW:
SELECT * FROM news
WHERE news.id = ?;Follow me:
dzirtik
dzirtik
dzirtik



QUESTIONS?

YE2016
By dzirtik
YE2016
- 1,822