Remote calls are a crucial part of our systems whenever we need to interact with external services.
They also become points of failure in our applications.
The more critical the supplier is, the more significant the impact its failure will generate and propagate through our apps.
While an error here and there is tolerated, what would happen to your system if the supplier service is degraded?
Imagine something went really wrong and all requests to that service are timing out.
Your consumer application will also end up degraded, taking at least the timeout duration to return an error.
Circuit Breaker is a software design pattern that proposes monitoring the failures of remote calls.
If a certain error rate is reached, the circuit opens and skips all subsequent requests for the duration of a sleep window.
When the sleep window expires, it requests the external service again, but if the response is an error the circuit remains open.
When the downstream API starts responding successfully again, keeping the error rate under the same thresholds, the circuit closes and re-enables the main flow.
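To make the open/closed cycle described above concrete, here is a minimal sketch of a circuit breaker in plain Ruby. It is not how Circuitbox is implemented: the class name `SimpleCircuitBreaker`, its `OpenError`, and the injectable `clock` are hypothetical names chosen for illustration, and for brevity the counters never expire (there is no time window). After the sleep window, the next call is treated as a trial: a success closes the circuit, a failure opens it again.

```ruby
# Illustrative circuit breaker sketch (hypothetical, not Circuitbox's code).
class SimpleCircuitBreaker
  class OpenError < StandardError; end

  # error_threshold:  failure ratio (0.0..1.0) at which the circuit opens
  # volume_threshold: minimum number of calls before the ratio is evaluated
  # sleep_window:     seconds the circuit stays open before a trial call
  # clock:            callable returning the current time in seconds
  def initialize(error_threshold: 0.5, volume_threshold: 5, sleep_window: 90,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @error_threshold = error_threshold
    @volume_threshold = volume_threshold
    @sleep_window = sleep_window
    @clock = clock
    @successes = 0
    @failures = 0
    @opened_at = nil
  end

  # The circuit is open while we are still inside the sleep window.
  def open?
    !@opened_at.nil? && (@clock.call - @opened_at) < @sleep_window
  end

  def run
    raise OpenError, "circuit is open" if open?

    # If the circuit was opened earlier and the sleep window has expired,
    # this call is a trial request.
    trial = !@opened_at.nil?
    begin
      result = yield
    rescue StandardError
      record_failure(trial)
      raise
    end
    record_success(trial)
    result
  end

  private

  def record_success(trial)
    if trial
      # The service recovered: close the circuit and start counting afresh.
      @opened_at = nil
      @successes = 0
      @failures = 0
    end
    @successes += 1
  end

  def record_failure(trial)
    @failures += 1
    total = @successes + @failures
    rate_reached = total >= @volume_threshold &&
                   @failures.to_f / total >= @error_threshold
    # A failed trial re-opens the circuit for another sleep window.
    @opened_at = @clock.call if trial || rate_reached
  end
end
```

A caller would wrap each remote call in `breaker.run { ... }` and rescue `SimpleCircuitBreaker::OpenError` to return quickly while the supplier is unhealthy.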
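const express = require("express");
const app = express();

// Respond after 50 ms in "fast" mode and after 200 ms otherwise.
app.get("/", function (req, res) {
  const delay = req.query.speed === "fast" ? 50 : 200;
  setTimeout(function () {
    res.sendStatus(200);
  }, delay);
});

app.listen(4000);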
def index
  speed = params[:speed]
  answers = Array(0..9).map do |i|
    sleep(1)
    puts "Request number #{i + 1}"
    HTTP.get("http://localhost:4000/?speed=#{speed}")
        .status
  end
  render json: answers
end
def without_timeout
  speed = params[:speed]
  answers = Array(0..9).map do |i|
    sleep(1)
    puts "Request number #{i + 1}"
    HTTP.get("http://localhost:4000/?speed=#{speed}")
        .status
  end
  render json: answers
end
def with_timeout
  speed = params[:speed]
  answers = Array(0..9).map do |i|
    sleep(1)
    puts "Request number #{i + 1}"
    # Give up on any request that takes longer than 100 ms.
    HTTP.timeout(0.1)
        .get("http://localhost:4000/?speed=#{speed}")
        .status
  end
  render json: answers
end
class DefaultConfigClient
  def initialize(speed:)
    @speed = speed
    @circuit = ::Circuitbox.circuit(:node_client, exceptions: [HTTP::TimeoutError])
  end

  def call
    circuit.run do
      HTTP.timeout(0.1).get("http://localhost:4000/?speed=#{speed}").status
    end
  end

  private

  attr_reader :speed, :circuit
end
sleep_window: seconds the circuit stays open once it has passed the error threshold. Defaults to 90 sec.
time_window: length of the interval (in seconds) over which it calculates the error rate. Defaults to 60 sec.
volume_threshold: number of requests within time_window seconds before it calculates error rates. Defaults to 5 requests.
error_threshold: exceeding this rate will open the circuit (checked on failures). Defaults to 50%.
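As a rough illustration of how time_window, volume_threshold and error_threshold combine, the sketch below keeps a sliding window of call outcomes and decides whether the circuit should trip. The class `WindowedErrorRate` and its methods are hypothetical, and the sliding window and the `>=` comparison are simplifying assumptions; Circuitbox's own bookkeeping may differ. The default values mirror the ones listed above.

```ruby
# Hypothetical sketch of the error-rate decision (not Circuitbox's code).
class WindowedErrorRate
  def initialize(time_window: 60, volume_threshold: 5, error_threshold: 50)
    @time_window = time_window           # seconds of history considered
    @volume_threshold = volume_threshold # minimum calls before judging
    @error_threshold = error_threshold   # percentage of failures that trips
    @events = []                         # pairs of [timestamp, ok]
  end

  # Record the outcome of one call at the given time (in seconds).
  def record(ok, at:)
    @events << [at, ok]
  end

  # True when the calls seen within the last time_window seconds
  # justify opening the circuit.
  def trip?(now:)
    recent = @events.select { |at, _ok| now - at < @time_window }
    return false if recent.size < @volume_threshold # too little traffic

    failures = recent.count { |_at, ok| !ok }
    failures * 100.0 / recent.size >= @error_threshold
  end
end

window = WindowedErrorRate.new
4.times { |second| window.record(false, at: second) }
window.trip?(now: 4)  # false: only 4 calls, below volume_threshold
window.record(true, at: 4)
window.trip?(now: 5)  # true: 5 calls, 4 of them failed
```

Once the circuit trips, sleep_window determines how long requests are skipped before the service is tried again.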
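In the call to Circuitbox.circuit, the first argument is the circuit name and exceptions lists the exceptions to monitor; every option that is not passed keeps its default value.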
class CustomConfigClient
  def initialize(speed:)
    @speed = speed
    @circuit = ::Circuitbox.circuit(
      :node_client_custom_config,
      exceptions: [HTTP::TimeoutError],
      time_window: 5,
      volume_threshold: 2,
      sleep_window: 2
    )
  end

  def call
    circuit.run do
      HTTP.timeout(0.1).get("http://localhost:4000/?speed=#{speed}").status
    end
  end
  # ...
We are setting time_window (the interval observed) as 5 seconds.
We are setting volume_threshold to 2, so we need to get at least 2 requests to calculate the error rate.
Finally, we override sleep_window to be 2 seconds.
# ...
def recover_cb
  speed = params[:speed]
  answers = Array(0..9).map do |i|
    sleep(1)
    puts "Request number #{i + 1}"
    speed = 'fast' if i > 3
    CustomConfigClient.new(speed: speed).call
  end
  render json: answers
end
# ...
After 4 requests, the loop switches to fast mode.
speed is slow, circuit closed
speed is fast, circuit open
speed is fast, circuit closed
The app depends on several queries to answer with an optimal response, and it has a time constraint: it cannot wait too long for any of the APIs.
If any of the APIs is down, the circuit opens, but the system keeps working in a partial state.
places
gov
shows
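The partial-state behaviour described above could be sketched as follows. The `aggregate` helper and the lambdas are hypothetical stand-ins: each lambda plays the role of a client wrapped in its own circuit breaker, returning data or raising when the call fails or its circuit is open. The sample data is made up for illustration.

```ruby
# Build a response from several independent sources, leaving out
# (as nil) any source that fails instead of failing the whole request.
def aggregate(fetchers)
  fetchers.each_with_object({}) do |(name, fetcher), response|
    response[name] = begin
      fetcher.call
    rescue StandardError
      nil # degrade gracefully: this section is simply missing
    end
  end
end

# Hypothetical fetchers; in a real app each would call a remote API
# through its circuit breaker.
fetchers = {
  places: -> { ["museum", "park"] },
  gov:    -> { raise "circuit open" },
  shows:  -> { ["concert"] }
}

response = aggregate(fetchers)
# response[:gov] is nil, while :places and :shows carry their data,
# so the app can still render a partial answer.
```

Rescuing per source keeps one unhealthy supplier from blocking or breaking the answers from the healthy ones.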
A circuit breaker is a pattern that helps us keep our systems operating, fully or partially.
The idea is to stop calling unresponsive services and to resume querying them once they are healthy again.
We make our applications more resilient when we fail gracefully.
There are implementations available for us to plug in and start using.
Fine-tune their configuration to better suit your application's needs.