Rails ORM
踩坑經驗談

新手向 COSCUP 2023

- 小名文月,簡稱 fumi
- ❤: 喝淺焙咖啡、唱日卡、嚐甜食
一年 365 天歡迎餵食,
請多指教 ヽ(●´∀`●)ノ
Meng-Ying Tsai

Rails ORM
踩坑經驗談

新手向 COSCUP 2023

Object Relational Mapping
-
連接物件導向程式 & (關聯式)資料庫之間的技巧
-
ORM tools:幫助開發者與(關聯式)資料庫互動的工具
Active Record
兩種意思:
- 一種架構模式
- 透過這個 pattern 實作的 ORM 框架
eg. Default ORM tool in Rails
為什麼會踩坑?
不了解工具
導致問題
工具的設計有缺點
導致問題
為什麼會踩坑?
SQL Query
Rails ORM



為什麼會踩坑?
SQL Query
Rails ORM
實際如何執行?
額外的程式邏輯?

案例:scope 回 nil 時會回傳所有結果?
https://sagarjunnarkar.github.io/blogs/2019/09/15/activerecord-scope/
- first 回傳一筆 record:拿回目前最新 publish 的 record
- first 回傳 nil:回傳所有 record
class Post < ApplicationRecord
scope :recent_published,
-> { where(published: true).order('published_at DESC').first }
endPost.recent_published
Post Load (0.3ms) SELECT "posts".* FROM "posts" WHERE "posts"."published" = ? ORDER BY published_at DESC LIMIT ? [["published", 1], ["LIMIT", 1]]
Post Load (0.1ms) SELECT "posts".* FROM "posts" LIMIT ? [["LIMIT", 11]]
=> <ActiveRecord::Relation ...>案例:scope 回 nil 時會回傳所有結果?
https://guides.rubyonrails.org/active_record_querying.html#scopes
Scoping allows you to specify commonly-used queries which can be referenced as method calls on the association objects or models. With these scopes, you can use every method previously covered such as where, joins and includes. All scope bodies should return an ActiveRecord::Relation or nil to allow for further methods (such as other scopes) to be called on it.

Model.scope_a.scope_b.scope_c
案例:scope 回 nil 時會回傳所有結果?
class Order < ApplicationRecord
scope :created_before,
->(time) { where(created_at: ...time) if time.present? }
end回傳 nil 的使用情境:
案例:Criteria.first 會變成 Criteria.last?
在用 mongoid 7.x 時遇到這樣的狀況
criteria = Model.my_scope # no sorting
criteria.first.id
# => 64b63169f8a9b80348a14bc7
criteria.last.id
# => 64c366a23932ad003b9c5e46
criteria.first.id
# => 64c366a23932ad003b9c5e46
criteria.last.id
# => 64c366a23932ad003b9c5e46但只要有指定 sorting 就不會有事 👻
案例:Criteria.first 會變成 Criteria.last?
在沒有 cache 的情況下重下 query 發現:
Mongoid::QueryCache.uncached do
test_criteria = Model.my_scope
puts test_criteria.first.id
puts test_criteria.last.id
puts test_criteria.first.id
endDEBUG -- : MONGODB | ... "sort"=>{"_id"=>1} ...
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>-1} ...
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>-1} ...?!!!
案例:Criteria.first 會變成 Criteria.last?
經過一番研究跟翻 code,
發現 #last 會讓 Mongo::Collection::View 被改動
test_criteria = Model.my_scope
test_criteria.view.sort
# => nil
test_criteria.last
# ...
test_criteria.view.sort
# => {"_id"=>-1}案例:Criteria.first 會變成 Criteria.last?

案例:Criteria.first 會變成 Criteria.last?

原來是踩到 bug (鬆一口氣)
如何改善?
- 多看文件
- 多看 log 了解 ORM 組出什麼 DB query
這次想多講講 ActiveRecord 有趣的機制:
- QueryCache
- Prepared Statement
以及初步可能會遇到的 Performance Issue:
- N+1 Queries
SQL Query Cache
- 同一個請求中多次執行相同的查詢,只會有一次實際的 DB 查詢
- 生命週期只在一個 controller action 中
Bank.first
Bank Load (0.5ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
Bank.first
CACHE Bank Load (0.0ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1Rails Console 裡有 QueryCache?
Default 是 false,可用 enable_query_cache! 打開設定來做測試
Q
ActiveRecord::Base.connection.query_cache_enabled
# => false
ActiveRecord::Base.connection.enable_query_cache!
ActiveRecord::Base.connection.query_cache_enabled
# => true既然有 QueryCache,memoization 會顯得沒必要嗎?
Query Cache 少了多次對 DB 查詢的開銷,但並未減少 ActiveRecord Result 轉換成 ActiveRecord Relation 的開支
Memoization 仍然是有意義的
Q
def published_post
@posts ||= Post.published
end哪裡可以看我有哪些 QueryCache?
Q
ActiveRecord::Base.connection.query_cache
# => {"SELECT \"users\".* FROM \"users\" ORDER BY \"users\".\"id\" DESC LIMIT 1"=> {[]=> #<ActiveRecord::Result... > },
# "SELECT ... "=> {[]=> #<ActiveRecord::Result... > }
ActiveRecord::Base.connection.query_cache['SELECT .... '][[]]
#=> #<ActiveRecord::Result:0x0000000280c3f140
# @column_types= { ... },
# @columns= [ ... ],
# @hash_rows=nil,
# @rows=
# [["column_1_result",
# "column2_result"],
# ["column_1_result",
# "column2_result"]]]
# >Problem
拿到的資料有可能不是最新的,但有些操作需要確保資料是即時的
- 👉 手動清除 QueryCache Model#reload
- 👉 disable QueryCache ActiveRecord::Base.uncached
Bank.first
# Bank Load (0.5ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
Bank.first
# CACHE Bank Load (0.0ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1
ActiveRecord::Base.uncached { Bank.first }
# Bank Load (0.5ms) SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1Problem
Query Cache 越長越大造成 memory 負擔過重
卻沒有清除 Cache 的機制 🤢
做個實驗:
# 不使用 prepare statement 的情況下
(1..100000).each {|i| puts TestModel.limit(i) }
# => ...
ActiveRecord::Base.connection.query_cache.keys.count
# => 100000# 使用 prepare statement 的情況下
(1..100000).each {|i| puts TestModel.limit(i) }
# => ...
ActiveRecord::Base.connection.query_cache['SELECT ...'].keys.count
# => 100000Problem


QueryCache
SQL Query & Result
QueryCache
Problem
https://code.jjb.cc/turning-off-activerecord-query-cache-to-improve-memory-consumption-in-background-jobs

Problem
在未來的 Rails 版本,可望 QueryCache 會改為 LRU(Least Recently Used Cache)

https://github.com/rails/rails/pull/48110
Problem
將來可以設定 QueryCache Max Size 或關閉 QueryCache
https://github.com/rails/rails/pull/48110
development:
adapter: mysql2
query_cache: 200 # default: 100
development:
adapter: mysql2
query_cache: false
Prepared Statement
- 很多資料庫都有 Prepared Statement 的設計
PREPARE test_prepare (INT) AS
SELECT id, name FROM stores LIMIT $1;
-- PREPARE
EXECUTE test_prepare(3);
-- id | name
-- ----+---------
-- 1 | Store A
-- 2 | Store B
-- 3 | Store C
SELECT * FROM pg_prepared_statements;
-- name | statement | ...
-- --------------+--------------------------------------+--------
-- test_prepare | PREPARE test_prepare (INT) AS +| 20...
-- | SELECT id, name FROM banks LIMIT $1; | Prepared Statement

https://www.interdb.jp/pg/pgsql03.html
statement 會多次重複使用的情況下:
- 避免重複的 parse, analyze, rewrite
- 不需要傳完整 sql,減少 network 流量
Simple Query:
Prepared Statement
Rails:
- postgres adapter: by default 開啟
- mysql2 adapter: 需設定 prepared_statements: true
development:
adapter: mysql2
prepared_statements: true
Prepared Statement
# when prepared_statements is set to true (default in postgresql adapter)
ActiveRecord::Base.connection.prepared_statements
# => true
User.last
# => User Load (2.6ms) SELECT "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT $1 [["LIMIT", 1]]
# when prepared_statements is set to false
ActiveRecord::Base.connection.prepared_statements
# => false
User.last
# => User Load (2.7ms) SELECT "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT 1ActiveRecord::PreparedStatementCacheExpired
使用 Prepared Statement 有可能會遇到這個錯誤:
- 當 schema 變動時,原先的 prepared statement 會過期
- 如果不在 transaction 中,Rails 會主動 deallocate 並重試
- 如果在 transaction 中, Rails 會 deallocate 並 raise error
ActiveRecord::PreparedStatementCacheExpired
ActiveRecord::Base.transaction do
retried ||= false
# ...
rescue ActiveRecord::PreparedStatementCacheExpired => e
raise e if retried
retried = true
end如果 transaction 內沒有不可重試的操作,可以加上 retry
enumerate_columns_in_select_statements
- Rails 7 新增的 config
- Rails 自動產生的 SELECT * Query 會導致每次有任何 schema change,prepared statement 都會過期
- 當 enumerate_columns_in_select_statements 為 true,Rails 會列清楚 SELECT 的欄位
https://github.com/rails/rails/pull/41718
User.limit(10)
# enumerate_columns_in_select_statements is off
# => SELECT * FROM users LIMIT 10
# enumerate_columns_in_select_statements is on
# => SELECT "first_name,last_name,email ..." FROM users LIMIT 10Performance Issue
N+1 Queries
Started GET "/" for ::1 at 2023-07-29 20:44:10 +0800
Processing by PagesController#index as HTML
[1m[36mOrder Load (6.3ms)[0m [1m[34mSELECT "orders".* FROM "orders" LIMIT $1[0m [["LIMIT", 10]]
↳ app/controllers/pages_controller.rb:4
[1m[36mDistributor Load (1.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.8ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (1.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (3.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.0ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.7ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (6.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (4.4ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (2.5ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mDistributor Load (5.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
Rendering pages/index.html.erb
Rendered pages/index.html.erb (17.5ms)
Completed 200 OK in 164ms (Views: 19.2ms | ActiveRecord: 73.4ms)N+1 Queries
在取得 main query 相關的 relation 的 data 時,額外執行了 N 次查詢的情況
@user = User.find(params[:id])
@posts = user.posts # 1
@posts.each do |post|
puts post.comments.count # n
endN+1 Queries
多次 DB Query 導致增加 DB loading & 拉長 response 時間
Started GET "/" for ::1 at 2023-07-29 20:44:10 +0800
Processing by PagesController#index as HTML
[1m[36mOrder Load (6.3ms)[0m [1m[34mSELECT "orders".* FROM "orders" LIMIT $1[0m [["LIMIT", 10]]
↳ app/controllers/pages_controller.rb:4
[1m[36mMerchant Load (1.1ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mMerchant Load (2.8ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mMerchant Load (1.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
[1m[36mMerchant Load (3.6ms)[0m [1m[34mSELECT "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m [["id", "..."], ["LIMIT", 1]]
↳ app/controllers/pages_controller.rb:5
Rendering pages/index.html.erb
Rendered pages/index.html.erb (17.5ms)
Completed 200 OK in 164ms (Views: 19.2ms | ActiveRecord: 73.4ms)N+1 Queries 很難避免?


There's nothing wrong
with my code
老生常談卻依舊不停發生(?
N+1 Queries 很難避免?
controller
model
view
def index
@posts = Post.limit(20)
end<% @posts.each do |post| %>
<%= post.category.name %>
<%= post.post_liked %> Likes
<% end %>def post_liked
liked_users.count
end在這裡 preload
在這裡注意該 preload 的 association
Kill N+1 Queries
preload
With preload, Active Record loads each specified association using one query per association.
Payment.preload(:wallet).limit(10)
# Payment Load (1.3ms) SELECT `payments`.* FROM `payments`
# Wallet Load (0.8ms) SELECT `wallets`.* FROM `wallets` WHERE `wallets`.`id` IN (1, 2, 3, 5, 6) https://guides.rubyonrails.org/active_record_querying.html#preload
Kill N+1 Queries
eager_load
With eager_load, Active Record loads all specified associations using a LEFT OUTER JOIN associations.
Payment.eager_load(:wallet).limit(10)
# SQL (1.8ms) SELECT `payments`.`id` AS t0_r0, ...
# `wallets`.`id` AS t1_r0, `wallets`.`type` AS t1_r1, ...
# FROM `payments`
# LEFT OUTER JOIN `wallets` ON `wallets`.`id` = `payments`.`wallet_id`
# LIMIT 10https://guides.rubyonrails.org/active_record_querying.html#eager-load
Kill N+1 Queries
includes
Specify associations args to be eager loaded to prevent N + 1 queries. A separate query is performed for each association, unless a join is required by conditions.
Payment.preload(:wallet)
# Payment Load (1.3ms) SELECT `payments`.* FROM `payments`
# Wallet Load (0.8ms) SELECT `wallets`.* FROM `wallets` WHERE `wallets`.`id` IN (1, 2, 3, 5, 6)
Payment.includes(:wallet).where(wallet: { type: 'activated' }).limit(10)
# SQL (0.8ms) SELECT `payments`.`id` AS t0_r0, `payments`.`type` AS t0_r1,...
# `wallets`.`id` AS t1_r0, `wallets`.`type` AS t1_r1,...
# FROM `payments`
# LEFT OUTER JOIN `wallets` ON `wallets`.`id` = `payments`.`wallet_id`
# WHERE `wallets`.`type` = 'aaa'
# LIMIT 10https://github.com/rails/rails/pull/47253/files
Bullet gem
輔助檢查:
-
N + 1 Queries
-
Unused eager loading
-
Unnecessary COUNT queries (改用 counter_cache)
https://github.com/flyerhzm/bullet
Bullet gem
Bullet.enable = true
Bullet.alert = true
Bullet.bullet_logger = true
Bullet.console = true
Bullet.rails_logger = true
Bullet.add_footer = trueconfig/environments/development.rb
https://github.com/flyerhzm/bullet
GET /
USE eager loading detected
Order => [:merchant]
Add to your finder: :includes => [:merchant]
Call stack
.../app/controllers/pages_controller.rb:5:in `index'
...log/bullet.log
Bullet.bullet_logger = trueBullet gem
GET /
USE eager loading detected
Order => [:merchant]
Add to your finder: :includes => [:merchant]
Call stack
.../app/controllers/pages_controller.rb:5:in `index'
...log/development.log
Bullet.rails_logger = trueBullet gem

alert messege box
Bullet.alert = trueBullet gem
html div

Bullet.add_footer = trueBullet gem
browser console
Bullet.console = true
Bullet gem

N+1 Queries Detected!!!
(uniform notifier)
alert
slack
console
rollbar
Bullet gem
總結
- 字詞釋義
- Object Relational Mapping
- Active Record
- 為什麼會踩坑?
- 兩個案例:
- scope nil 的行為
- Mongoid Bug
- ActiveRecord Query Cache
- ActiveRecord Prepared Statment
- Performance Issue - N+1 Queries
Thanks for listening

Rails ORM 踩坑經驗談
By Meng-Ying Tsai
Rails ORM 踩坑經驗談
- 712