Rails ORM

踩坑經驗談

新手向  COSCUP 2023

  • 小名文月,簡稱 fumi
  • ❤: 喝淺焙咖啡、唱日卡、嚐甜食

 

 

一年 365 天歡迎餵食,

請多指教 ヽ(●´∀`●)ノ

Meng-Ying Tsai

Rails ORM

踩坑經驗談

新手向  COSCUP 2023

Object Relational Mapping 

  • 連接物件導向程式 & (關聯式)資料庫之間的技巧

  • ORM tools:幫助開發者與(關聯式)資料庫互動的工具

 

Active Record

兩種意思:

  • 一種架構模式
  • 透過這個 pattern 實作的 ORM 框架

      eg. Default ORM tool in Rails

為什麼會踩坑?

不了解工具

導致問題

工具的設計有缺點

導致問題

為什麼會踩坑?

SQL Query

Rails ORM

為什麼會踩坑?

SQL Query

Rails ORM

實際如何執行?

額外的程式邏輯?

案例:scope 回 nil 時會回傳所有結果?

https://sagarjunnarkar.github.io/blogs/2019/09/15/activerecord-scope/

  • first 回傳一筆 record:拿回目前最新 publish 的 record
  • first 回傳 nil:回傳所有 record
class Post < ApplicationRecord
  scope :recent_published,
        -> { where(published: true).order('published_at DESC').first }
end
Post.recent_published

Post Load (0.3ms)  SELECT  "posts".* FROM "posts" WHERE "posts"."published" = ? ORDER BY published_at DESC LIMIT ?  [["published", 1], ["LIMIT", 1]]
Post Load (0.1ms)  SELECT  "posts".* FROM "posts" LIMIT ?  [["LIMIT", 11]]
=> <ActiveRecord::Relation ...>

案例:scope 回 nil 時會回傳所有結果?

https://guides.rubyonrails.org/active_record_querying.html#scopes

Scoping allows you to specify commonly-used queries which can be referenced as method calls on the association objects or models. With these scopes, you can use every method previously covered such as where, joins and includes. All scope bodies should return an ActiveRecord::Relation or nil to allow for further methods (such as other scopes) to be called on it.

Model.scope_a.scope_b.scope_c

案例:scope 回 nil 時會回傳所有結果?

class Order < ApplicationRecord
  scope :created_before, 
        ->(time) { where(created_at: ...time) if time.present? }
end

回傳 nil 的使用情境:

案例:Criteria.first 會變成 Criteria.last?

在用 mongoid 7.x 時遇到這樣的狀況

criteria = Model.my_scope # no sorting
criteria.first.id 
# => 64b63169f8a9b80348a14bc7
criteria.last.id 
# => 64c366a23932ad003b9c5e46
criteria.first.id 
# => 64c366a23932ad003b9c5e46
criteria.last.id 
# => 64c366a23932ad003b9c5e46

但只要有指定 sorting 就不會有事 👻

案例:Criteria.first 會變成 Criteria.last?

在沒有 cache 的情況下重下 query 發現:

Mongoid::QueryCache.uncached do  
  test_criteria = Model.my_scope
  puts test_criteria.first.id
  puts test_criteria.last.id
  puts test_criteria.first.id 
end
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>1} ...
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>-1} ...
DEBUG -- : MONGODB | ... "sort"=>{"_id"=>-1} ...

?!!!

案例:Criteria.first 會變成 Criteria.last?

經過一番研究跟翻 code,

發現 #last 會讓 Mongo::Collection::View 被改動

test_criteria = Model.my_scope
test_criteria.view.sort
# => nil
test_criteria.last
# ...
test_criteria.view.sort
# => {"_id"=>-1}

案例:Criteria.first 會變成 Criteria.last?

案例:Criteria.first 會變成 Criteria.last?

原來是踩到 bug (鬆一口氣)

如何改善?

  • 多看文件
  • 多看 log 了解 ORM 組出什麼 DB query

這次想多講講 ActiveRecord 有趣的機制:

  • QueryCache
  • Prepared Statement

以及初步可能會遇到的 Performance Issue:

  • N+1 Queries

SQL Query Cache

  • 同一個請求中多次執行相同的查詢,只會有一次實際的 DB 查詢
  • 生命週期只在一個 controller action 中
Bank.first
Bank Load (0.5ms)  SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1

Bank.first
CACHE Bank Load (0.0ms)  SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1

Rails Console 裡有 QueryCache?

Default 是 false,可用 enable_query_cache! 打開設定來做測試

Q

ActiveRecord::Base.connection.query_cache_enabled
# => false

ActiveRecord::Base.connection.enable_query_cache!

ActiveRecord::Base.connection.query_cache_enabled
# => true

既然有 QueryCache,memoization 會顯得沒必要嗎?

Query Cache 少了多次對 DB 查詢的開銷,但並未減少 ActiveRecord Result 轉換成 ActiveRecord Relation 的開支

 

Memoization 仍然是有意義的

Q

def published_post
  @posts ||= Post.published
end

哪裡可以看我有哪些 QueryCache?

Q

ActiveRecord::Base.connection.query_cache
# => {"SELECT  \"users\".* FROM \"users\" ORDER BY \"users\".\"id\" DESC LIMIT 1"=> {[]=> #<ActiveRecord::Result... > },
#     "SELECT  ... "=> {[]=> #<ActiveRecord::Result... > }
  

ActiveRecord::Base.connection.query_cache['SELECT .... '][[]]
#=> #<ActiveRecord::Result:0x0000000280c3f140
#     @column_types= { ... },
#     @columns= [ ... ],
#     @hash_rows=nil,
#     @rows=
#     [["column_1_result",
#       "column2_result"],
#      ["column_1_result",
#       "column2_result"]]]
#    >

Problem

拿到的資料有可能不是最新的,但有些操作需要確保資料是即時的

  • 👉 手動清除 QueryCache Model#reload
  • 👉 disable QueryCache ActiveRecord::Base.uncached
Bank.first
#  Bank Load (0.5ms)  SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1

Bank.first
# CACHE Bank Load (0.0ms)  SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1

ActiveRecord::Base.uncached { Bank.first }
#  Bank Load (0.5ms)  SELECT `banks`.* FROM `banks` ORDER BY `banks`.`id` ASC LIMIT 1

Problem

Query Cache 越長越大造成 memory 負擔過重

卻沒有清除 Cache 的機制 🤢

做個實驗:

# 不使用 prepare statement 的情況下
(1..100000).each {|i| puts TestModel.limit(i) }
# => ...
ActiveRecord::Base.connection.query_cache.keys.count
# => 100000
# 使用 prepare statement 的情況下
(1..100000).each {|i| puts TestModel.limit(i) }
# => ...
ActiveRecord::Base.connection.query_cache['SELECT ...'].keys.count
# => 100000

Problem

QueryCache

SQL Query & Result

QueryCache

Problem

https://code.jjb.cc/turning-off-activerecord-query-cache-to-improve-memory-consumption-in-background-jobs

Problem

在未來的 Rails 版本,可望 QueryCache 會改為 LRU(Least Recently Used Cache)

https://github.com/rails/rails/pull/48110

Problem

將來可以設定 QueryCache Max Size 或關閉 QueryCache

https://github.com/rails/rails/pull/48110

development:
  adapter: mysql2
  query_cache: 200  # default: 100
development:
  adapter: mysql2
  query_cache: false

Prepared Statement

  • 很多資料庫都有 Prepared Statement 的設計
PREPARE test_prepare (INT) AS
SELECT id, name FROM stores LIMIT $1;
-- PREPARE

EXECUTE test_prepare(3);
--  id |  name           
-- ----+---------
--  1  | Store A
--  2  | Store B
--  3  | Store C

SELECT * FROM pg_prepared_statements;
--      name     |              statement               |   ...
-- --------------+--------------------------------------+--------
--  test_prepare | PREPARE test_prepare (INT) AS       +| 20...
--               | SELECT id, name FROM banks LIMIT $1; |   

Prepared Statement

https://www.interdb.jp/pg/pgsql03.html

statement 會多次重複使用的情況下:

  • 避免重複的 parse, analyze, rewrite
  • 不需要傳完整 sql,減少 network 流量

Simple Query:

Prepared Statement

Rails:

  • postgres adapter: by default 開啟
  • mysql2 adapter: 需設定 prepared_statements: true
development:
  adapter: mysql2
  prepared_statements: true

Prepared Statement

# when prepared_statements is set to true (default in postgresql adapter)
ActiveRecord::Base.connection.prepared_statements
# => true
User.last
# => User Load (2.6ms)  SELECT  "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT $1  [["LIMIT", 1]]


# when prepared_statements is set to false
ActiveRecord::Base.connection.prepared_statements
# => false
User.last
# => User Load (2.7ms)  SELECT  "users".* FROM "users" ORDER BY "users"."id" DESC LIMIT 1

ActiveRecord::PreparedStatementCacheExpired 

使用 Prepared Statement 有可能會遇到這個錯誤:

 

  • 當 schema 變動時,原先的 prepared statement 會過期
  • 如果不在 transaction 中,Rails 會主動 deallocate 並重試
  • 如果在 transaction 中, Rails 會 deallocate 並 raise error

ActiveRecord::PreparedStatementCacheExpired 

ActiveRecord::Base.transaction do
  retried ||= false
  # ...
rescue ActiveRecord::PreparedStatementCacheExpired => e
  raise e if retried
  retried = true
end

如果 transaction 內沒有不可重試的操作,可以加上 retry

enumerate_columns_in_select_statements

  • Rails 7 新增的 config
  • Rails 自動產生的 SELECT * Query 會導致每次有任何 schema change,prepared statement 都會過期
  • 當 enumerate_columns_in_select_statements 為 true,Rails 會列清楚 SELECT 的欄位

https://github.com/rails/rails/pull/41718

User.limit(10)

# enumerate_columns_in_select_statements is off
# => SELECT * FROM users LIMIT 10

# enumerate_columns_in_select_statements is on
# => SELECT "first_name,last_name,email ..." FROM users LIMIT 10

Performance Issue

N+1 Queries

Started GET "/" for ::1 at 2023-07-29 20:44:10 +0800
Processing by PagesController#index as HTML
  [1m[36mOrder Load (6.3ms)[0m  [1m[34mSELECT  "orders".* FROM "orders" LIMIT $1[0m  [["LIMIT", 10]]
  ↳ app/controllers/pages_controller.rb:4
  [1m[36mDistributor Load (1.1ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (2.8ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (1.6ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (3.6ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (2.0ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (2.7ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (6.1ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (4.4ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (2.5ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mDistributor Load (5.1ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  Rendering pages/index.html.erb
  Rendered pages/index.html.erb (17.5ms)
Completed 200 OK in 164ms (Views: 19.2ms | ActiveRecord: 73.4ms)

N+1 Queries

在取得 main query 相關的 relation 的 data 時,額外執行了 N 次查詢的情況

@user = User.find(params[:id])
@posts = user.posts # 1
@posts.each do |post|
  puts post.comments.count # n
end

N+1 Queries

多次 DB Query 導致增加 DB loading & 拉長 response 時間

Started GET "/" for ::1 at 2023-07-29 20:44:10 +0800
Processing by PagesController#index as HTML
  [1m[36mOrder Load (6.3ms)[0m  [1m[34mSELECT  "orders".* FROM "orders" LIMIT $1[0m  [["LIMIT", 10]]
  ↳ app/controllers/pages_controller.rb:4
  [1m[36mMerchant Load (1.1ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mMerchant Load (2.8ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mMerchant Load (1.6ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  [1m[36mMerchant Load (3.6ms)[0m  [1m[34mSELECT  "merchants".* FROM "merchants" WHERE "merchants"."id" = $1 LIMIT $2[0m  [["id", "..."], ["LIMIT", 1]]
  ↳ app/controllers/pages_controller.rb:5
  Rendering pages/index.html.erb
  Rendered pages/index.html.erb (17.5ms)
Completed 200 OK in 164ms (Views: 19.2ms | ActiveRecord: 73.4ms)

N+1 Queries 很難避免?

There's nothing wrong

with my code

老生常談卻依舊不停發生(?

N+1 Queries 很難避免?

controller

model

view

def index
  @posts = Post.limit(20)
end
<% @posts.each do |post| %>
  <%= post.category.name %>
  <%= post.post_liked %> Likes
<% end %>
def post_liked
  liked_users.count
end

在這裡 preload

在這裡注意該 preload 的 association

Kill N+1 Queries

preload

With preload, Active Record loads each specified association using one query per association.

Payment.preload(:wallet).limit(10)
#  Payment Load (1.3ms)  SELECT `payments`.* FROM `payments`
#  Wallet Load (0.8ms)  SELECT `wallets`.* FROM `wallets` WHERE `wallets`.`id` IN (1, 2, 3, 5, 6)  

https://guides.rubyonrails.org/active_record_querying.html#preload

Kill N+1 Queries

eager_load

With eager_load, Active Record loads all specified associations using a LEFT OUTER JOIN associations.

Payment.eager_load(:wallet).limit(10)
#  SQL (1.8ms)  SELECT `payments`.`id` AS t0_r0, ...
#  `wallets`.`id` AS t1_r0, `wallets`.`type` AS t1_r1, ...
#  FROM `payments` 
#  LEFT OUTER JOIN `wallets` ON `wallets`.`id` = `payments`.`wallet_id` 
#  LIMIT 10

https://guides.rubyonrails.org/active_record_querying.html#eager-load

Kill N+1 Queries

includes

Specify associations args to be eager loaded to prevent N + 1 queries. A separate query is performed for each association, unless a join is required by conditions.

Payment.preload(:wallet)
#  Payment Load (1.3ms)  SELECT `payments`.* FROM `payments`
#  Wallet Load (0.8ms)  SELECT `wallets`.* FROM `wallets` WHERE `wallets`.`id` IN (1, 2, 3, 5, 6)  

Payment.includes(:wallet).where(wallet: { type: 'activated' }).limit(10)
#  SQL (0.8ms)  SELECT `payments`.`id` AS t0_r0, `payments`.`type` AS t0_r1,...
#  `wallets`.`id` AS t1_r0, `wallets`.`type` AS t1_r1,...
#  FROM `payments` 
#  LEFT OUTER JOIN `wallets` ON `wallets`.`id` = `payments`.`wallet_id` 
#  WHERE `wallets`.`type` = 'aaa' 
#  LIMIT 10

https://github.com/rails/rails/pull/47253/files

Bullet gem

輔助檢查:

  • N + 1 Queries

  • Unused eager loading

  • Unnecessary COUNT queries (改用 counter_cache)

https://github.com/flyerhzm/bullet

Bullet gem

      Bullet.enable        = true
      Bullet.alert         = true
      Bullet.bullet_logger = true
      Bullet.console       = true
      Bullet.rails_logger  = true
      Bullet.add_footer    = true

config/environments/development.rb

https://github.com/flyerhzm/bullet

GET /
USE eager loading detected
  Order => [:merchant]
  Add to your finder: :includes => [:merchant]
Call stack
  .../app/controllers/pages_controller.rb:5:in `index'
  ...

log/bullet.log

Bullet.bullet_logger = true

Bullet gem

GET /
USE eager loading detected
  Order => [:merchant]
  Add to your finder: :includes => [:merchant]
Call stack
  .../app/controllers/pages_controller.rb:5:in `index'
  ...

log/development.log

Bullet.rails_logger = true

Bullet gem

alert messege box 

Bullet.alert = true

Bullet gem

html div

Bullet.add_footer = true

Bullet gem

browser console

Bullet.console = true

Bullet gem

N+1 Queries Detected!!!

(uniform notifier)

alert

slack

console

rollbar

Bullet gem

總結

  • 字詞釋義
    • Object Relational Mapping
    • Active Record
  • 為什麼會踩坑?
  • 兩個案例:
    • scope nil 的行為
    • Mongoid Bug
  • ActiveRecord Query Cache
  • ActiveRecord Prepared Statment
  • Performance Issue - N+1 Queries

Thanks for listening

Rails ORM 踩坑經驗談

By Meng-Ying Tsai

Rails ORM 踩坑經驗談

  • 401