Rails ORM
踩坑經驗談
新手向 COSCUP 2023
- 小名文月,簡稱 fumi
- ❤: 喝淺焙咖啡、唱日卡、嚐甜食
一年 365 天歡迎餵食,
請多指教 ヽ(●´∀`●)ノ
Meng-Ying Tsai
Rails ORM
踩坑經驗談
新手向 COSCUP 2023
Object Relational Mapping
-
連接物件導向程式 & (關聯式)資料庫之間的技巧
-
ORM tools:幫助開發者與(關聯式)資料庫互動的工具
Active Record
兩種意思:
- 一種架構模式
- 透過這個 pattern 實作的 ORM 框架
eg. Default ORM tool in Rails
為什麼會踩坑?
不了解工具
導致問題
工具的設計有缺點
導致問題
為什麼會踩坑?
SQL Query
Rails ORM
為什麼會踩坑?
SQL Query
Rails ORM
實際如何執行?
額外的程式邏輯?
案例:scope 回 nil 時會回傳所有結果?
https://sagarjunnarkar.github.io/blogs/2019/09/15/activerecord-scope/
- first 回傳一筆 record:拿回目前最新 publish 的 record
- first 回傳 nil:回傳所有 record
class Post < ApplicationRecord
scope :recent_published,
-> { where(published: true).order('published_at DESC').first }
end
Post.recent_published
Post Load (0.3ms) SELECT "posts".* FROM "posts" WHERE "posts"."published" = ? ORDER BY published_at DESC LIMIT ? [["published", 1], ["LIMIT", 1]]
Post Load (0.1ms) SELECT "posts".* FROM "posts" LIMIT ? [["LIMIT", 11]]
=> <ActiveRecord::Relation ...>
案例:scope 回 nil 時會回傳所有結果?
https://guides.rubyonrails.org/active_record_querying.html#scopes
Scoping allows you to specify commonly-used queries which can be referenced as method calls on the association objects or models. With these scopes, you can use every method previously covered such as where
, joins
and includes
. All scope bodies should return an ActiveRecord::Relation
or nil
to allow for further methods (such as other scopes) to be called on it.
Model.scope_a.scope_b.scope_c
案例:scope 回 nil 時會回傳所有結果?
class Order < ApplicationRecord
scope :created_before,
->(time) { where(created_at: ...time) if time.present? }
end
回傳 nil 的使用情境:
案例:Criteria.first 會變成 Criteria.last?
在用 mongoid 7.x 時遇到這樣的狀況
criteria = Model.my_scope # no sorting
criteria.first.id
# => 64b63169f8a9b80348a14bc7
criteria.last.id
# => 64c366a23932ad003b9c5e46
criteria.first.id
# => 64c366a23932ad003b9c5e46
criteria.last.id
# => 64c366a23932ad003b9c5e46
但只要有指定 sorting 就不會有事 👻
案例:Criteria.first 會變成 Criteria.last?
在沒有 cache 的情況下重下 query 發現:
Mongoid::QueryCache.uncached do
test_criteria = Model.my_scope
puts test_criteria.first.id
puts test_criteria.last.id
puts test_criteria.first.id
end
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>1} ...
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>-1} ...
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>-1} ...
?!!!
案例:Criteria.first 會變成 Criteria.last?
經過一番研究跟翻 code,
發現 #last 會讓 Mongo::Collection::View 被改動
test_criteria = Model.my_scope
test_criteria.view.sort
# => nil
test_criteria.last
# ...
test_criteria.view.sort
# => {"_id"=>-1}
案例:Criteria.first 會變成 Criteria.last?
案例:Criteria.first 會變成 Criteria.last?
原來是踩到 bug (鬆一口氣)
如何改善?
- 多看文件
- 多看 log 了解 ORM 組出什麼 DB query
這次想多講講 ActiveRecord 有趣的機制:
- QueryCache
- Prepared Statement
以及初步可能會遇到的 Performance Issue:
- N+1 Queries
SQL Query Cache
- 同一個請求中多次執行相同的查詢,只會有一次實際的 DB 查詢
- 生命週期只在一個 controller action 中
Bank.first
Bank Load (0.5ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
Bank.first
CACHE Bank Load (0.0ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
Rails Console 裡有 QueryCache?
Default 是 false,可用 enable_query_cache! 打開設定來做測試
Q
ActiveRecord::Base.connection.query_cache_enabled
# => false
ActiveRecord::Base.connection.enable_query_cache!
ActiveRecord::Base.connection.query_cache_enabled
# => true
既然有 QueryCache,memoization 會顯得沒必要嗎?
Query Cache 少了多次對 DB 查詢的開銷,但並未減少 ActiveRecord Result 轉換成 ActiveRecord Relation 的開支
Memoization 仍然是有意義的
Q
def published_post
@posts ||= Post.published
end
哪裡可以看我有哪些 QueryCache?
Q
ActiveRecord::Base.connection.query_cache
# => {"SELECT \"users\".* FROM \"users\" ORDER BY \"users\".\"id\" DESC LIMIT 1"=> {[]=> #<ActiveRecord::Result... > },
# "SELECT ... "=> {[]=> #<ActiveRecord::Result... > }
ActiveRecord::Base.connection.query_cache['SELECT .... '][[]]
#=> #<ActiveRecord::Result:0x0000000280c3f140
# @column_types= { ... },
# @columns= [ ... ],
# @hash_rows=nil,
# @rows=
# [["column_1_result",
# "column2_result"],
# ["column_1_result",
# "column2_result"]]]
# >
Problem
拿到的資料有可能不是最新的,但有些操作需要確保資料是即時的
- 👉 手動清除 QueryCache Model#reload
- 👉 disable QueryCache ActiveRecord::Base.uncached
Bank.first
# Bank Load (0.5ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
Bank.first
# CACHE Bank Load (0.0ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
ActiveRecord::Base.uncached { Bank.first }
# Bank Load (0.5ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
Problem
Query Cache 越長越大造成 memory 負擔過重
卻沒有清除 Cache 的機制 🤢
做個實驗:
# 不使用 prepare statement 的情況下
(1..100000).each {|i| puts TestModel.limit(i) }
# => ...
ActiveRecord::Base.connection.query_cache.keys.count
# => 100000
# 使用 prepare statement 的情況下
(1..100000).each {|i| puts TestModel.limit(i) }
# => ...
ActiveRecord::Base.connection.query_cache['SELECT ...'].keys.count
# => 100000
Problem
QueryCache
SQL Query & Result
QueryCache
Problem
https://code.jjb.cc/turning-off-activerecord-query-cache-to-improve-memory-consumption-in-background-jobs
Problem
在未來的 Rails 版本,可望 QueryCache 會改為 LRU(Least Recently Used Cache)
https://github.com/rails/rails/pull/48110
Problem
將來可以設定 QueryCache Max Size 或關閉 QueryCache
https://github.com/rails/rails/pull/48110
development:
adapter: mysql2
query_cache: 200 # default: 100
development:
adapter: mysql2
query_cache: false
Prepared Statement
- 很多資料庫都有 Prepared Statement 的設計
PREPARE test_prepare (INT) AS
SELECT id, name FROM stores LIMIT $1;
-- PREPARE
EXECUTE test_prepare(3);
-- id | name
-- ----+---------
-- 1 | Store A
-- 2 | Store B
-- 3 | Store C
SELECT * FROM pg_prepared_statements;
-- name | statement | ...
-- --------------+--------------------------------------+--------
-- test_prepare | PREPARE test_prepare (INT) AS +| 20...
-- | SELECT id, name FROM banks LIMIT $1; |
Prepared Statement
https://www.interdb.jp/pg/pgsql03.html
statement 會多次重複使用的情況下:
- 避免重複的 parse, analyze, rewrite
- 不需要傳完整 sql,減少 network 流量
Simple Query:
Prepared Statement
Rails:
- postgres adapter: by default 開啟
- mysql2 adapter: 需設定 prepared_statements: true
development:
adapter: mysql2
prepared_statements: true
Prepared Statement
# when prepared_statements is set to true (default in postgresql adapter)
ActiveRecord::Base.connection.prepared_statements
# => true
User.last
# => User Load (2.6ms) SELECT "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT $1 [["LIMIT", 1]]
# when prepared_statements is set to false
ActiveRecord::Base.connection.prepared_statements
# => false
User.last
# => User Load (2.7ms) SELECT "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT 1
ActiveRecord::PreparedStatementCacheExpired
使用 Prepared Statement 有可能會遇到這個錯誤:
- 當 schema 變動時,原先的 prepared statement 會過期
- 如果不在 transaction 中,Rails 會主動 deallocate 並重試
- 如果在 transaction 中, Rails 會 deallocate 並 raise error
ActiveRecord::PreparedStatementCacheExpired
ActiveRecord::Base.transaction do
retried ||= false
# ...
rescue ActiveRecord::PreparedStatementCacheExpired => e
raise e if retried
retried = true
end
如果 transaction 內沒有不可重試的操作,可以加上 retry
enumerate_columns_in_select_statements
- Rails 7 新增的 config
- Rails 自動產生的 SELECT * Query 會導致每次有任何 schema change,prepared statement 都會過期
- 當 enumerate_columns_in_select_statements 為 true,Rails 會列清楚 SELECT 的欄位
https://github.com/rails/rails/pull/41718
User.limit(10)
# enumerate_columns_in_select_statements is off
# => SELECT * FROM users LIMIT 10
# enumerate_columns_in_select_statements is on
# => SELECT "first_name,last_name,email ..." FROM users LIMIT 10
Performance Issue
N+1 Queries
Started GET "/" for ::1 at 2023-07-29 20:44:10 +0800
Processing by PagesController#index as HTML
[1m[36mOrder Load (6.3ms)[0m [1m[34mSELECT "orders".* FROM "orders" LIMIT $1[0m [["LIMIT", 10]]
↳ app/controllers/pages_controller.rb:4
[1m[36mDistributor Load (1.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.8ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (1.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (3.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.0ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.7ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (6.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (4.4ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.5ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (5.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
Rendering pages/index.html.erb
Rendered pages/index.html.erb (17.5ms)
Completed 200 OK in 164ms (Views: 19.2ms | ActiveRecord: 73.4ms)
N+1 Queries
在取得 main query 相關的 relation 的 data 時,額外執行了 N 次查詢的情況
@user = User.find(params[:id])
@posts = user.posts # 1
@posts.each do |post|
puts post.comments.count # n
end
N+1 Queries
多次 DB Query 導致增加 DB loading & 拉長 response 時間
Started GET "/" for ::1 at 2023-07-29 20:44:10 +0800
Processing by PagesController#index as HTML
[1m[36mOrder Load (6.3ms)[0m [1m[34mSELECT "orders".* FROM "orders" LIMIT $1[0m [["LIMIT", 10]]
↳ app/controllers/pages_controller.rb:4
[1m[36mMerchant Load (1.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mMerchant Load (2.8ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mMerchant Load (1.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mMerchant Load (3.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
Rendering pages/index.html.erb
Rendered pages/index.html.erb (17.5ms)
Completed 200 OK in 164ms (Views: 19.2ms | ActiveRecord: 73.4ms)
N+1 Queries 很難避免?
There's nothing wrong
with my code
老生常談卻依舊不停發生(?
N+1 Queries 很難避免?
controller
model
view
def index
@posts = Post.limit(20)
end
<% @posts.each do |post| %>
<%= post.category.name %>
<%= post.post_liked %> Likes
<% end %>
def post_liked
liked_users.count
end
在這裡 preload
在這裡注意該 preload 的 association
Kill N+1 Queries
preload
With preload, Active Record loads each specified association using one query per association.
Payment.preload(:wallet).limit(10)
# Payment Load (1.3ms) SELECT `payments`.* FROM `payments`
# Wallet Load (0.8ms) SELECT `wallets`.* FROM `wallets` WHERE `wallets`.`id` IN (1, 2, 3, 5, 6)
https://guides.rubyonrails.org/active_record_querying.html#preload
Kill N+1 Queries
eager_load
With eager_load, Active Record loads all specified associations using a LEFT OUTER JOIN associations.
Payment.eager_load(:wallet).limit(10)
# SQL (1.8ms) SELECT `payments`.`id` AS t0_r0, ...
# `wallets`.`id` AS t1_r0, `wallets`.`type` AS t1_r1, ...
# FROM `payments`
# LEFT OUTER JOIN `wallets` ON `wallets`.`id` = `payments`.`wallet_id`
# LIMIT 10
https://guides.rubyonrails.org/active_record_querying.html#eager-load
Kill N+1 Queries
includes
Specify associations args to be eager loaded to prevent N + 1 queries. A separate query is performed for each association, unless a join is required by conditions.
Payment.preload(:wallet)
# Payment Load (1.3ms) SELECT `payments`.* FROM `payments`
# Wallet Load (0.8ms) SELECT `wallets`.* FROM `wallets` WHERE `wallets`.`id` IN (1, 2, 3, 5, 6)
Payment.includes(:wallet).where(wallet: { type: 'activated' }).limit(10)
# SQL (0.8ms) SELECT `payments`.`id` AS t0_r0, `payments`.`type` AS t0_r1,...
# `wallets`.`id` AS t1_r0, `wallets`.`type` AS t1_r1,...
# FROM `payments`
# LEFT OUTER JOIN `wallets` ON `wallets`.`id` = `payments`.`wallet_id`
# WHERE `wallets`.`type` = 'aaa'
# LIMIT 10
https://github.com/rails/rails/pull/47253/files
Bullet gem
輔助檢查:
-
N + 1 Queries
-
Unused eager loading
-
Unnecessary COUNT queries (改用 counter_cache)
https://github.com/flyerhzm/bullet
Bullet gem
Bullet.enable = true
Bullet.alert = true
Bullet.bullet_logger = true
Bullet.console = true
Bullet.rails_logger = true
Bullet.add_footer = true
config/environments/development.rb
https://github.com/flyerhzm/bullet
GET /
USE eager loading detected
Order => [:merchant]
Add to your finder: :includes => [:merchant]
Call stack
.../app/controllers/pages_controller.rb:5:in `index'
...
log/bullet.log
Bullet.bullet_logger = true
Bullet gem
GET /
USE eager loading detected
Order => [:merchant]
Add to your finder: :includes => [:merchant]
Call stack
.../app/controllers/pages_controller.rb:5:in `index'
...
log/development.log
Bullet.rails_logger = true
Bullet gem
alert messege box
Bullet.alert = true
Bullet gem
html div
Bullet.add_footer = true
Bullet gem
browser console
Bullet.console = true
Bullet gem
N+1 Queries Detected!!!
(uniform notifier)
alert
slack
console
rollbar
Bullet gem
總結
- 字詞釋義
- Object Relational Mapping
- Active Record
- 為什麼會踩坑?
- 兩個案例:
- scope nil 的行為
- Mongoid Bug
- ActiveRecord Query Cache
- ActiveRecord Prepared Statment
- Performance Issue - N+1 Queries
Thanks for listening
Rails ORM 踩坑經驗談
By Meng-Ying Tsai
Rails ORM 踩坑經驗談
- 356