Our solution
# Returns true if this host has caught up to the given transaction
# write location.
#
# location - The transaction write location as reported by a primary.
def caught_up?(location)
string = connection.quote(location)
# In case the host is a primary pg_last_wal_replay_lsn/pg_last_xlog_replay_location() returns
# NULL. The recovery check ensures we treat the host as up-to-date in
# such a case.
query = <<-SQL.squish
SELECT NOT pg_is_in_recovery()
OR pg_wal_lsn_diff(pg_last_wal_replay_lsn(), #{string}) >= 0
AS result
SQL
row = query_and_release(query)
::Gitlab::Utils.to_boolean(row['result'])
rescue *CONNECTION_ERRORS
false
endChecking if a secondary has caught up:
We can't reliably use the same sticking mechanism as we have no way of knowing whether a job should use the primary or not as many jobs are not directly tied to a user.
In order to utilize Sidekiq read-only database replicas capabilities, jobs can have a data_consistency attribute set, which can be:
To set a job’s data consistency, we can use the data_consistency class method:
class DelayedWorker
include ApplicationWorker
data_consistency :delayed, feature_flag: :load_balancing_for_delayed_worker
# ...
end
If the current replica has a WAL pointer that exceeds the provided location, we know they are in sync and we can safely use the replica for our read-only queries