High performance MySQL

By James Harvey

Database performance is an issue that creeps up on you

Early-stage startup problems

- Finding customers

- Rapid feature development

- Iterating to find product-market fit

- Fixing bugs and bad product decisions

 

Scaling startup problems

- All that stuff from before

- Heaps of customers

- A bunch of really big customers

- Even bigger customers coming down the pipeline

- Your database servers go to 100% CPU randomly

- You start getting weird errors from the database

 

How did this emerge for us?

Errors, slow queries and app performance

And helpful informative internal bug reports

Okay let's take a step back

You will make mistakes in your database design

You will think you know how MySQL will behave at scale and you will be wrong

It is a good thing when you start hitting scaling issues in your app because it means people are actually using it!

What we'll cover

  • Optimization vs scaling
  • Optimizing a query
  • Transaction tips
  • Database design tips

Optimization

Making maximum use of each MySQL node

  • Avoiding N+1 queries
  • Fast queries
  • Quick transactions
  • Avoid deadlocks

Scaling

Adding more MySQL nodes
 

  • Adding read replicas
  • Splitting out databases across clusters
  • Sharding

Example time!

  • Multiple employees can be allocated to a given shift
  • ~5 million shifts in total

Requirement 1
Grab the shifts that start on a given date

Please explain!
MySQL is conducting a full table scan to find the shifts

Let's add an index

Getting better
But we can still take this further

The culprit

MySQL can't make full use of the index when a function is used.

Let's get rid of the function

Requirement 2
Grab the shifts end time as well

Why did we go from nothing to 140ms?

InnoDB Indexes

What are our options?

  1. Wear the 140ms (will be less in production)
  2. Add end_time to the index

Requirement 3
Search on the end time

The order of the index matters!
Composite indexes are basically stored as a concatenated string.

Does join order matter?
No, but MySQL can only pick one index to use.
 

Let's try another index

The downsides of indexes

  • Indexes can grow very large over time
  • It's much easier to add an index than it is to remove it, so have a plan
  • Indexes slow down write speed
  • The primary key is stored with the index, so keep your primary key size reasonable

Key takeaways

  • It's far easier to add an index than to remove them
  • Add indexes only once you know the read patterns
  • Select only the columns you need as it may result only reading the index
  • Sometimes adding more filters can give MySQL more options to use better indexes
  • Sometimes MySQL will think a table scan is more effective if your indexes are too broad

Transactions and Deadlocks

  • Used to ensure a change completely works or completely fails
  • All inserts, updates and deletes use transactions
  • InnoDB has a solid transaction system

Lock timeouts

  • A transaction was held open too long

Deadlocks

  • Two transactions competed for changes where InnoDB cannot resolve them both

Transaction tips

  • Keep transactions as fast as possible
  • Ensure update/delete queries are optimized
  • Don't mix storage engines and transactions

Keeping your database lean

  • Smaller fields are usually better
  • Move text fields to alternative storage (eg. AWS S3)
  • Move out optional data into sub tables to reduce storage waste
  • Checksum data as a key
  • Unsign your integer fields unless you need negative numbers
  • Consider storing dollars as cents in an integer field
  • Keep the maximum joins in queries under ~15
  • If you have flat reporting tables, limit the number of hops to get display information

N+1 query tips

  • Hard to spot in production because they don't show in slow query logs
  • Use query logging to spot them
  • Some packages can highlight them
  • If using PDO, create a PDO wrapper object with query logging to find them

Getting started with optimising

  1. Activate slow query logs in production and check for new entries daily
  2. Set CPU alarms and run `show full processlist` when they are triggered
  3. Write a query logging middleware to show:
    • Frequent queries (the same query was run > 3 times)
    • Slow queries (query is slower than 50ms)
  4. Look at the size of the tables in production and see where the space is going
  5. Explore APM tools like New Relic to give you better insight into production

References

About me

James Harvey

Chief Software Architect - foundU

High performance MySQL

By James Harvey

High performance MySQL

  • 391