We want to report the last 100 matches duration and the game played.
If we are using Rails and ActiveRecord models that could look like this.
Although Game information lives in another table Active Record is capable of retrieving that information for us through the association.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord
# attrs...
end
Match.last(100).each do |match|
Report.call(
duration: match.duration,
game_name: match.game.name,
# ...
)
end
Active Record will make use of lazy loading to take care of it and load the required data for us.
In our case, we make the call to retrieve the game's name at line 13 when we need that to pass down to the Report.
Lazy loading is handy because we can easily query the desired data on demand.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord
# attrs...
end
Match.last(100).each do |match|
Report.call(
duration: match.duration,
game_name: match.game.name,
# ...
)
end
We will fire up one query to load the last 100 matches, but we will also loop 100 times to call the report.
The resulting number of queries would be 1 for the matches + 100 for loading games = 101 queries.
—boom that's our N+1.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord
# attrs...
end
Match.last(100).each do |match|
Report.call(
duration: match.duration,
game_name: match.game.name,
# ...
)
end
SELECT * FROM matches
ORDER BY matches.id DESC
LIMIT 100;
SELECT * FROM games WHERE games.match_id = 100;
SELECT * FROM games WHERE games.match_id = 99;
# 98 queries later...
SELECT * FROM games WHERE games.match_id = 1;
The N+1 problem is when we execute N additional queries to fetch the same data that we could get from running a single query.
The N+1 problem becomes more evident as our applications grow because its impact is multiplied by the new scale.
ORMs, like ActiveRecord, abstract how the queries are executed that makes it easier for us to fire many queries without noticing.
However, it's not limited to them:
Eager loading is a way to solve the N+1.
The strategy consists in loading upfront any data of interest alongside your main query.
So whenever you need to access that data it would already be available in memory.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord
# attrs...
end
Match.includes(:game)
.limit(100)
.each do |match|
Report.call(
duration: match.duration,
game_name: match.game.name,
# ...
)
end
The code uses the includes query method to indicate what relationships we need to query alongside the Match one, it does so by leveraging the relationship between Match and Game.
Active Record will ensure that all of the specified associations are loaded using the minimum possible number of queries.
No lazy load anymore.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord; end
Match.includes(:game)
.last(100)
.each do |match|
Report.call(
duration: match.duration,
game_name: match.game.name,
# ...
)
end
SELECT * FROM matches
ORDER BY matches.id DESC
LIMIT 100;
SELECT * FROM games WHERE games.match_id IN (100..1);
Suppose we have the game's name cached by the match id.
We can similarly use the Rails cache here to retrieve that information.
However, we still have an N+1 we are just targeting another dependency of our system.
For example, it could be Redis or Memcached.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord
# attrs...
end
Match.last(100).each do |match|
Report.call(
duration: match.duration,
game_name: Rails.cache.read(match.id),
# ...
)
end
The solution here to use the cache read_multi function to batch load all the values based on the match ids at once.
Similar to what we have done before with the includes we have now 2 queries instead of 101.
class Match < ApplicationRecord
# attrs...
has_one: :game
end
class Game < ApplicationRecord; end
matches = Match.includes(:game).limit(100)
keys = matches.map(&:id)
game_names = Rails.cache.read_multi(keys)
matches.each do |match|
Report.call(
duration: match.duration,
game_name: game_names[match.id],
# ...
)
end
There are cases that we do not have associations to eager load from, like unassociated tables, APIs calls, different databases...
It's always possible to implement the batch loading ourselves or to use other solutions to mitigate the N+1 like data-loaders.
You would be surprised about the performance gains!
Of course, that comes with a memory cost to load everything upfront. but if you need to load everything at some point, then eager loading should be adequate.
🚀 More room to grow!
Manu alternatives
👍 We can safely work with them without worrying about dispatching new queries
👎 We lose the rich API we need to re-expose any Model's functionality
Automated alternatives
👍 We let the tools alert us and so we can act fast
👎 We are still reactive to alerts (prod or dev)
Rails 6.1 was released with strict loading and that introduces an optional strict mode for models to prevent lazy load!
👍 We are proactive - lazy queries cannot be created
👍 We can change it on demand
👎 Limited to our models and associations
class Match < ApplicationRecord
# attrs...
has_one: :game, strict_loading: true
end
class Game < ApplicationRecord
# attrs...
end
Match.first.game.name
# => ActiveRecord::StrictLoadingViolationError Exception: Match is marked as strict_loading and Game cannot be lazily loaded.