the n+1 problem

example

We want to report the last 100 matches duration and the game played.

 

If we are using Rails and ActiveRecord models that could look like this.

 

Although Game information lives in another table Active Record is capable of retrieving that information for us through the association.

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord
  # attrs...
end


Match.last(100).each do |match|
  Report.call(
    duration: match.duration,
    game_name: match.game.name,
    #  ...
  )
end

LAZY LOADING

Active Record will make use of lazy loading to take care of it and load the required data for us.

 

In our case, we make the call to retrieve the game's name at line 13 when we need that to pass down to the Report.

 

Lazy loading is handy because we can easily query the desired data on demand.

 

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord
  # attrs...
end

Match.last(100).each do |match|
  Report.call(
    duration: match.duration,
    game_name: match.game.name,
    #  ...
  )
end

N+1

We will fire up one query to load the last 100 matches, but we will also loop 100 times to call the report.

 

The resulting number of queries would be 1 for the matches + 100 for loading games = 101 queries.

 

—boom that's our N+1.

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord
  # attrs...
end

Match.last(100).each do |match|
  Report.call(
    duration: match.duration,
    game_name: match.game.name,
    #  ...
  )
end
SELECT * FROM matches
ORDER BY matches.id DESC
LIMIT 100;

SELECT * FROM games WHERE games.match_id = 100;
SELECT * FROM games WHERE games.match_id = 99;
# 98 queries later...
SELECT * FROM games WHERE games.match_id = 1;

THE PROBLEM

The N+1 problem is when we execute N additional queries to fetch the same data that we could get from running a single query.

the n+1 problem

The N+1 problem becomes more evident as our applications grow because its impact is multiplied by the new scale.

 

ORMs, like ActiveRecord, abstract how the queries are executed that makes it easier for us to fire many queries without noticing.

 

However, it's not limited to them:

  • We can always manually write code that fires multiple SQLs with or without using an ORM.
  • We can always fetch data from other data sources (API, cache, ...) in a similar manner.

EAGER LOADING

Eager loading is a way to solve the N+1.

 

The strategy consists in loading upfront any data of interest alongside your main query.

 

So whenever you need to access that data it would already be available in memory.

 

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord
  # attrs...
end

Match.includes(:game)
     .limit(100)
     .each do |match|
  Report.call(
    duration: match.duration,
    game_name: match.game.name,
    #  ...
  )
end

EAGER LOADING

The code uses the includes query method to indicate what relationships we need to query alongside the Match one, it does so by leveraging the relationship between Match and Game.

 

Active Record will ensure that all of the specified associations are loaded using the minimum possible number of queries.

 

No lazy load anymore.

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord; end

Match.includes(:game)
     .last(100)
     .each do |match|
  Report.call(
    duration: match.duration,
    game_name: match.game.name,
    #  ...
  )
end
SELECT * FROM matches
ORDER BY matches.id DESC
LIMIT 100;

SELECT * FROM games WHERE games.match_id IN (100..1);

n+1 on cache

Suppose we have the game's name cached by the match id.

 

We can similarly use the Rails cache here to retrieve that information.

 

However, we still have an N+1 we are just targeting another dependency of our system.

 

For example, it could be Redis or Memcached.

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord
  # attrs...
end

Match.last(100).each do |match|
  Report.call(
    duration: match.duration,
    game_name: Rails.cache.read(match.id),
    #  ...
  )
end

CACHE multi

The solution here to use the cache read_multi function to batch load all the values based on the match ids at once.

 

Similar to what we have done before with the includes we have now 2 queries instead of 101.

class Match < ApplicationRecord
  # attrs...
  has_one: :game
end

class Game < ApplicationRecord; end

matches = Match.includes(:game).limit(100)
keys = matches.map(&:id)
game_names = Rails.cache.read_multi(keys)

matches.each do |match|
  Report.call(
    duration: match.duration,
    game_name: game_names[match.id],
    #  ...
  )
end

batch loading

There are cases that we do not have associations to eager load from, like unassociated tables, APIs calls, different databases...

 

It's always possible to implement the batch loading ourselves or to use other solutions to mitigate the N+1 like data-loaders.

 

THE IMPACT

You would be surprised about the performance gains!

  • Healthier dependencies
    • Fewer queries to our data sources.
  • Fewer roundtrips.
    • Our apps perform fewer roundtrips to fetch data.

 

Of course, that comes with a memory cost to load everything upfront. but if you need to load everything at some point, then eager loading should be adequate.

 

🚀 More room to grow!

Preventing lazy loading 🙅‍♂️

Manu alternatives

  • Transforming Models into hashes or classes
  • Auditing features and  code
  • ...

 

 

👍 We can safely work with them without worrying about dispatching  new queries

👎 We lose the rich API we need to re-expose any Model's  functionality

 

Preventing lazy loading 🙅‍♂️

Automated alternatives

  • Monitoring production env
  • Tracking queries during development, e.g. bullet gem
  • ...

 

 

👍 We let the tools alert us and so we can act fast

👎 We are still reactive to alerts (prod or dev)

Preventing lazy loading 🙅‍♂️

Rails 6.1 was released with strict loading and that introduces an optional strict mode for models to prevent lazy load!

 

👍 We are proactive - lazy queries cannot be created

👍 We can change it on demand

👎 Limited to our models and associations

class Match < ApplicationRecord
  # attrs...
  has_one: :game, strict_loading: true
end

class Game < ApplicationRecord
  # attrs...
end

Match.first.game.name
# => ActiveRecord::StrictLoadingViolationError Exception: Match is marked as strict_loading and Game cannot be lazily loaded.

quick Summary

  • The N+1 problem is not always noticed, but it starts to become an issue as our applications grow and scale.
  • Often related to ORM but not limited to it.
  • Batch loading strategies like eager loading and data loaders help us to mitigate the issue.
  • Strict load in Rails can help us to prevent the issue!
  • There is no silver bullet! A mix of tools and strategies contributes to the overall quality of our system.
  • Mitigating it helps our systems to stay healthy as they grow.
Made with Slides.com