Wroclaw, Poland
April 13, 2024
meedan.com/check
github.com/meedan/check-api
github.com/meedan/check
The focus of this talk is on...
GraphQL
Ruby on Rails
... but it doesn't need to be!
Many of these concepts and architectures apply to other frameworks and technical stacks.
More disclaimers!
GraphQL
Media
Comment
Tag
1
*
User
*
*
*
1
GET /api/v1/medias/1
GET /api/v1/medias/1/comments
GET /api/v1/medias/1/tags
GET /api/v1/medias/1/comments/1
GET /api/v1/users/1?fields=avatar,name
GET /api/v1/users/2?fields=avatar,name
GET /api/v1/users/3?fields=avatar,name
...
GET /api/v1/medias/1?include=comments&count=5
GET /api/v1/medias/1?include=comments,tags
&comments_count=5&tags_count=5
GET /api/v1/medias/1?fields=comments(text,date)
&tags(tag)
...
GET /api/v1/media_and_comments/1
GET /api/v1/media_comments_and_tags/1
GET /api/v1/media_comments_tags_and_users/1
Too many requests!
POST /graphql
POST /api/graphql?query=
{
  media(id: 1) {
    title
    embed
    tags(first: 3) {
      tag
    }
    comments(first: 5) {
      created_at
      text
      user {
        name
        avatar
      }
    }
  }
}
[The shape of the GraphQL query mirrors the data model: Media 1—* Comment, Media 1—* Tag, Comment *—1 User]
POST /api/graphql?query=
{
  media(id: 1) {
    title
    embed
    tags(first: 3) {
      tag
    }
    comments(first: 5) {
      created_at
      text
      user {
        name
        avatar
      }
    }
  }
}
{
  "media": {
    "title": "Avengers Hulk Smash",
    "embed": "<iframe src=\"...\"></iframe>",
    "tags": [
      { "tag": "avengers" },
      { "tag": "operation" }
    ],
    "comments": [
      {
        "text": "This is true",
        "created_at": "2016-09-18 15:04:39",
        "user": {
          "name": "Ironman",
          "avatar": "http://[...].png"
        }
      },
      ...
    ]
  }
}
GraphQL
Ruby on Rails
mutation {
  createMedia(
    input: {
      url: "http://youtu.be/7a_insd29fk"
      clientMutationId: "1"
    }
  )
  {
    media {
      id
    }
  }
}
Mutations make changes on the server side.
In CRUD terms:
Queries: Read
Mutations: Create, Update, Delete
mutation {
  createMedia(                             # ← Mutation name
    input: {
      url: "http://youtu.be/7a_insd29fk"   # ← Input parameters
      clientMutationId: "1"
    }
  )
  {
    media {                                # ← Desired output
      id
    }
  }
}
😊
😔
query {
  teams(first: 1000) {
    name
    profile_image
    users(first: 1000) {
      name
      email
      posts(first: 1000) {
        title
        body
        tags(first: 1000) {
          tag_name
        }
      }
    }
  }
}
Nested queries can become a real problem.
The actual complexity of a query, and the cost of some fields, can be hidden by the expressiveness of the language.
Let's see some strategies to handle this.
# Some controller test
gql_query = 'query { posts(first: 10) { title, user { name } } }'
assert_queries 5 do
  post :create, params: { query: gql_query }
end
Keep track of whether a refactoring or code change introduces regressions in how your GraphQL queries are executed.
# Some test helper
def assert_queries(max, &block)
  query_cache_enabled = ApplicationRecord.connection.query_cache_enabled
  ApplicationRecord.connection.enable_query_cache!
  queries = []
  callback = lambda { |_name, _start, _finish, _id, payload|
    if payload[:sql] =~ /\A(SELECT|UPDATE|INSERT)/ && !payload[:cached]
      queries << payload[:sql]
    end
  }
  ActiveSupport::Notifications.subscribed(callback, "sql.active_record", &block)
  queries
ensure
  ApplicationRecord.connection.disable_query_cache! unless query_cache_enabled
  message = "#{queries.size} expected to be less than or equal to #{max}."
  assert queries.size <= max, message
end
Under the hood, here is one way this can go wrong (the classic N+1 problem):
query {
  posts(first: 5) {
    id
    author {
      name
    }
  }
}
Post Load (0.9ms) SELECT "posts".* FROM "posts"
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 1], ["LIMIT", 1]]
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 2], ["LIMIT", 1]]
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 3], ["LIMIT", 1]]
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 4], ["LIMIT", 1]]
User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2 [["id", 5], ["LIMIT", 1]]
class User < ApplicationRecord
  has_many :posts
end

class Post < ApplicationRecord
  belongs_to :user
  has_many :tags
end
In a REST endpoint, you'd typically predict returning both posts and their authors, prompting eager-loading in your query. However, since we can't anticipate what the client will request here, we can't always preload the owner.
# app/graphql/types/post_type.rb
field :author, Types::UserType do
  resolve -> (post, _args, _context) {
    RecordLoader.for(User).load(post.user_id)
  }
end
Post Load (0.5ms) SELECT "posts".* FROM "posts" ORDER BY "posts"."id" DESC LIMIT $1 [["LIMIT", 5]]
User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."id" IN (1, 2, 3, 4, 5)
BatchLoader could be used as well.
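The batching idea behind both RecordLoader and BatchLoader can be sketched in a few lines of plain Ruby. This is illustrative only (the real gems integrate lazily with GraphQL execution and are far more robust): queue up keys while resolvers run, then fulfill them all with a single lookup.

```ruby
# Minimal sketch of the batching idea behind RecordLoader / BatchLoader:
# collect keys first, then resolve them all with one lookup.
class TinyBatchLoader
  def initialize(&batch_fn)
    @batch_fn = batch_fn # receives all queued keys, returns { key => value }
    @queue = []
  end

  def load(key)
    @queue << key
    -> { results[key] } # lazy: nothing is fetched until the first call
  end

  private

  def results
    @results ||= @batch_fn.call(@queue.uniq)
  end
end

# Usage: five posts referencing two users, but only ONE "query" for users.
db_users = { 1 => "Alice", 2 => "Bob" }
query_count = 0

loader = TinyBatchLoader.new do |ids|
  query_count += 1
  db_users.slice(*ids) # stands in for SELECT ... WHERE id IN (ids)
end

lazies = [1, 2, 1, 2, 1].map { |user_id| loader.load(user_id) }
names = lazies.map(&:call)
# names == ["Alice", "Bob", "Alice", "Bob", "Alice"]; query_count == 1
```

The key design point is the two-phase split: `load` only records intent, and the expensive lookup is deferred until a value is actually needed, by which time every key is known.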
# app/graphql/types/post_type.rb
field :tags, !types[Types::TagType] do
  preload :tags
  resolve -> (post, _args, _ctx) { post.tags }
end
But this can still suffer with more complex queries... What if we could predict the queried data precisely and build a dynamic hash to preload associations?
field :users, [Types::UserType], null: false, extras: [:lookahead]

def users(lookahead:)
  # Do something with lookahead
end
query
└── users
├── id
├── name
└── posts
├── id
└── title
The lookahead object is like a tree structure that represents the information you need in order to optimize your query. In practice, it's way more complicated than this.
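As a rough illustration of the idea, a selection tree can be walked to build a nested hash for ActiveRecord's `includes`. Everything here is hypothetical (`preload_hash`, the plain-hash selection tree, the `ASSOCIATIONS` map); the real `Lookahead` API exposes methods like `selects?` and `selection` rather than a plain hash.

```ruby
# Hypothetical: walk a selection tree and keep only the entries that are
# known model associations, producing a hash usable with .includes(...).
ASSOCIATIONS = {
  "users" => { "posts" => { "tags" => {} } } # known associations per level
}.freeze

def preload_hash(selections, known)
  selections.each_with_object({}) do |(name, children), acc|
    next unless known.key?(name) # skip plain fields like "id" or "name"
    acc[name.to_sym] = preload_hash(children, known[name])
  end
end

selection_tree = {
  "id" => {}, "name" => {},
  "posts" => { "id" => {}, "title" => {} }
}
preload_hash(selection_tree, ASSOCIATIONS.fetch("users"))
# => { posts: {} }  — i.e., User.includes(posts: {})
```

With this shape in hand, a resolver could call `.includes` only for the associations the client actually selected, instead of eager-loading everything unconditionally.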
query {
  teams(first: 1000) {
    users(first: 1000) {
      name
      posts(first: 1000) {
        tags(first: 1000) {
          tag_name
        }
        author {
          posts(first: 1000) {
            title
          }
        }
      }
    }
  }
}
# app/graphql/your_schema.rb
YourSchema = GraphQL::Schema.define do
  max_depth 4 # adjust as required
  use GraphQL::Batch
  enable_preloading
  mutation(Types::MutationType)
  query(Types::QueryType)
end
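Note that `GraphQL::Schema.define` is the legacy graphql-ruby 1.x DSL, and `enable_preloading` comes from the separate graphql-preload gem. With the newer class-based API, an equivalent setup would look roughly like this sketch:

```ruby
# Rough equivalent with graphql-ruby's class-based schema API
# (graphql-preload's enable_preloading predates this API, so it is omitted):
class YourSchema < GraphQL::Schema
  query Types::QueryType
  mutation Types::MutationType
  max_depth 4          # adjust as required
  max_complexity 300   # optionally cap computed query complexity too
  use GraphQL::Batch
end
```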
# Added to the bottom of app/graphql/your_schema.rb
YourSchema.middleware <<
  GraphQL::Schema::TimeoutMiddleware.new(max_seconds: 5) do |e, q|
    Rails.logger.info("GraphQL Timeout: #{q.query_string}")
  end
class PostType < BaseObject
  field :id, ID, null: false
  field :title, String, null: false, cache_fragment: true
end

class QueryType < BaseObject
  field :post, PostType, null: true do
    argument :id, ID, required: true
  end

  def post(id:)
    last_updated_at = Post.select(:updated_at).find_by(id: id)&.updated_at
    cache_fragment(last_updated_at, expires_in: 5.minutes) { Post.find(id) }
  end
end
Our own approach to caching high-demand fields (event-based, metaprogramming)
# app/models/media.rb
cached_field :last_seen,
  start_as: proc { |media| media.created_at },
  update_es: true,
  expires_in: 1.hour,
  update_on: [
    {
      model: TiplineRequest,
      if: proc { |request| request.associated_type == 'Media' },
      affected_ids: proc { |request| request.associated.medias },
      events: {
        create: proc { |request| request.created_at },
      }
    },
    {
      model: Relationship,
      if: proc { |relationship| relationship.is_confirmed? },
      affected_ids: proc { |relationship| relationship.parent.medias },
      events: {
        save: proc { |relationship| relationship.created_at },
        destroy: :recalculate
      }
    }
  ]
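A stripped-down sketch of how such a `cached_field` class macro can be built with Ruby metaprogramming. All names here are illustrative, not Check's actual code; the real implementation is event-driven, invalidates via the `update_on` rules, and also updates ElasticSearch.

```ruby
# Illustrative sketch of a cached_field-style class macro: define a reader
# that computes the value once and caches it, plus a writer that invalidation
# events can use to push fresh values in.
class MemoryCache
  def initialize
    @store = {}
  end

  def fetch(key)
    @store.key?(key) ? @store[key] : (@store[key] = yield)
  end

  def write(key, value)
    @store[key] = value
  end
end

module CachedField
  def cache
    @cache ||= MemoryCache.new
  end

  def cached_field(name, start_as:)
    # Reader: serve from cache, computing the initial value on first access.
    define_method(name) do
      self.class.cache.fetch([self.class.name, id, name]) { start_as.call(self) }
    end
    # Writer: events push a freshly computed value into the cache.
    define_method("update_#{name}") do |value|
      self.class.cache.write([self.class.name, id, name], value)
    end
  end
end

class Media
  extend CachedField
  attr_reader :id, :created_at

  def initialize(id, created_at)
    @id = id
    @created_at = created_at
  end

  cached_field :last_seen, start_as: proc { |media| media.created_at }
end

media = Media.new(1, "2024-01-01")
media.last_seen                       # computed from start_as, then cached
media.update_last_seen("2024-04-13")  # an event pushes a fresh value
media.last_seen                       # now served from the cache
```

The payoff is that expensive fields are never recomputed per request: reads always hit the cache, and only the events listed in `update_on` pay the cost of recomputation.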
Then, the backend is able to execute multiple queries concurrently, using the graphql gem's multiplex feature.
# Prepare the context for each query:
context = {
  current_user: current_user,
}

# Prepare the query options:
queries = [
  {
    query: "query postsList { posts { title } }",
    variables: {},
    operation_name: 'postsList',
    context: context,
  },
  {
    query: "query latestPosts ($num: Int) { posts(last: $num) { title } }",
    variables: { num: 3 },
    operation_name: 'latestPosts',
    context: context,
  }
]

# Execute them concurrently:
results = YourSchema.multiplex(queries)
"With great systems comes great responsibility"
Prioritize performance optimization to enhance user experience and application scalability.
Identify and Address Bottlenecks: Regularly monitor and profile your application to identify performance bottlenecks, focusing on database queries, Ruby code execution, and GraphQL query optimization.
Optimization is an Ongoing Process: Performance optimization is not a one-time task; it's an ongoing process that requires continuous monitoring, analysis, and improvement.
https://ca.ios.ba
@caiosba