Circuit breaker pattern
Radim Štěpaník
⚡️Circuit breaker
When you should consider it
- distributed environment
- to fail fast
- fault tolerance
Failures are everywhere
- application level errors
- network errors
- service outages
- datacenter outages
Typical scenario
[Diagram: gateway (GW) and services A, B and C]
Something starts burning
[Diagram: the same GW, A, B, C topology, now with a fire (🔥) in it]
Retry
- Naive approach: just retry, regardless of the conditions
- Improvement: spread the retries over time, wait a while between attempts
- Works well only when a retry eventually succeeds
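The "wait for a while" idea can be sketched as a retry helper with exponential backoff. This is a minimal illustration, not part of the original slides; `backoffDelay`, `retry` and all parameter names and defaults are assumptions made for the example.

```javascript
// Delay before retry number `attempt` (0-based): doubles each time, capped at maxMs.
// Hypothetical helper, values are illustrative.
function backoffDelay(attempt, baseMs = 100, maxMs = 5000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `operation` until it succeeds or `retries` extra attempts are used up.
async function retry(
  operation,
  { retries = 3, baseMs = 100, maxMs = 5000, sleep = defaultSleep } = {},
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < retries) await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
  throw lastError;
}
```

Note that every caller still keeps sending requests to the failing service; backoff only slows them down, which is why retries alone do not help during a longer outage.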
[Diagram: GW and services A, B, C; markers 👉, 💥 and 🔥]
How will it turn out?
[Diagram sequence: GW and services A, B, C; the fire (🔥) spreads with every step until it is everywhere]
Complete disaster
What could CB do for you?
[Diagram sequence: GW and services A, B, C with one fire (🔥); the CB goes Open, then Half-open, then Closed again]
Open
We are waiting for recovery
After a while the CB goes half-open; if a trial call succeeds, it closes again
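The Open, Half-open and Closed states can be sketched as a small state machine. This is a simplified, synchronous illustration under stated assumptions: the class `SimpleBreaker`, its options and the consecutive-failure counting are hypothetical and do not reproduce any particular library.

```javascript
// Minimal circuit breaker state machine (synchronous for brevity).
// `now` is injectable so the behaviour can be exercised without real waiting.
class SimpleBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
    this.state = 'closed';
    this.failures = 0; // consecutive failures while closed
    this.openedAt = 0;
  }

  call(operation) {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('circuit is open'); // fail fast, the service is not called
      }
      this.state = 'half-open'; // reset timeout elapsed: allow one trial call
    }
    try {
      const result = operation();
      this.state = 'closed'; // success closes the circuit again
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open'; // trip (or re-trip) the breaker
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

While the breaker is open, the failing service receives no traffic at all, which gives it room to recover instead of being hit by retries.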
Circuit breaker
- State of the CB
  - errors, timeouts
- Operation (the wrapped function)
- Fallback
- Settings
  - error threshold
  - timeout
  - other protection parameters: time, number of errors, etc.
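How the state (errors, timeouts) and the error threshold setting fit together can be illustrated with a small decision function. The function name `shouldTrip`, the `minimumCalls` setting and the shape of `stats` are assumptions for this sketch; real libraries differ in details such as the counting window and whether the comparison is strict.

```javascript
// Decides whether the breaker should open, based on calls seen in the current window.
// Hypothetical settings: trip when at least `errorThresholdPercentage` percent of
// the calls failed or timed out, but only once `minimumCalls` calls were observed.
function shouldTrip(stats, { errorThresholdPercentage = 50, minimumCalls = 5 } = {}) {
  const failed = stats.failures + stats.timeouts; // timeouts count as failures
  const total = stats.successes + failed;
  if (total < minimumCalls) return false; // too little data to judge
  const errorPercentage = (failed / total) * 100;
  return errorPercentage >= errorThresholdPercentage;
}
```

The minimum call count is one example of a protection parameter: without it, a single failed request right after startup would be a 100% error rate and would open the circuit immediately.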
Support
- your own implementation
- libraries
  - Node.js - https://www.npmjs.com/package/opossum
  - Java - https://github.com/resilience4j/resilience4j
  - PHP - https://github.com/upwork/phystrix
  - ....
Code example
import CircuitBreaker from 'opossum';

function asyncFunctionThatCouldFail(x, y) {
  return new Promise((resolve, reject) => {
    // Do something, maybe on the network or a disk
  });
}

const options = {
  timeout: 3000, // If our function takes longer than 3 seconds, trigger a failure
  errorThresholdPercentage: 50, // When 50% of requests fail, trip the circuit
  resetTimeout: 30000, // After 30 seconds, try again.
};
const breaker = new CircuitBreaker(asyncFunctionThatCouldFail, options);

// x and y stand for whatever arguments the wrapped function needs
breaker.fire(x, y).then(console.log).catch(console.error);
Code example
import CircuitBreaker from 'opossum';

function asyncFunctionThatCouldFail(x, y) {
  return new Promise((resolve, reject) => {
    // Do something, maybe on the network or a disk
  });
}

const options = {
  timeout: 3000, // If our function takes longer than 3 seconds, trigger a failure
  errorThresholdPercentage: 50, // When 50% of requests fail, trip the circuit
  resetTimeout: 30000, // After 30 seconds, try again.
};
// The fallback result is returned instead of an error when the call fails
const breaker = new CircuitBreaker(asyncFunctionThatCouldFail, options)
  .fallback(() => 'do something else');

// x and y stand for whatever arguments the wrapped function needs
breaker.fire(x, y).then(console.log).catch(console.error);
Use cases
- I/O operations
  - HTTP clients
  - database operations, e.g. Redis
  - reading from a shared file
- High-level logic
  - setting up fallback mechanisms
💡Usage with GraphQL?
Things you can't rely on
Use case
[Diagram: Products, Categories, Banners; GQL GW; 50+ services: search, customer, cart, bestsellers, banners, menu, category, products, filters, content]
# Calls category service
getCategory(categoryUrl: $categoryUrl) {
  id
  howto
  # Calls content service
  ... on ContentCategory {
    content {
      id
      title
      body
    }
  }
  # Calls product service
  ... on ProductCategory {
    productCollection {
      items {
        ... on Product {
          ...productForList
        }
        # Calls estimated delivery service
        ... on BonusSet {
          estimatedDeliveries {
            ...productEstimatedDeliveryFragment
          }
        }
        # Calls banners service
        ... on SectionBannerSlideImage {
          ...sectionBannerSlide
        }
      }
    }
  }
}
// 👉 every resolver has a unique name - Query:category, Category:productCollection
const resolvers = {
  Query: {
    category: resolveCategory,
  },
  Category: {
    productCollection: resolveProductCollection,
  },
};

// 🌍 global configuration of specific resolvers
export const resolverConfig: Map<string, ResolverConfig> = new Map<string, ResolverConfig>()
  .set('Query:category', { timeout: 3000 })
  .set('Category:productCollection', { timeout: 3000, fallback: () => [] });

// 👷 Usage of a higher-order function: every resolver is wrapped by a circuit breaker
// (mapResolvers, createCircuitBreaker and the related types are project helpers not shown here)
const resolversWithCircuitBreakers = mapResolvers(
  resolvers,
  ({ resolver, name }: CreateResolverInput): IFieldResolver<unknown, unknown, unknown> => {
    const breakerDefaultConfig = resolverConfig.get(name) ?? {};
    const config = { name, ...breakerDefaultConfig };
    const breaker = createCircuitBreaker(config, resolver);
    return async (...params: ResolverParams): Promise<unknown> => breaker.fire(...params);
  },
);
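Since `mapResolvers` and `createCircuitBreaker` are not shown, the following plain JavaScript sketch illustrates what such helpers might look like. Both functions here are hypothetical stand-ins written for this example; `createGuardedResolver` only covers the timeout and fallback parts and has no open or half-open state.

```javascript
// Hypothetical stand-in for mapResolvers: walks a { Type: { field: resolver } }
// map and replaces each resolver with the result of wrap({ resolver, name }).
function mapResolvers(resolvers, wrap) {
  const mapped = {};
  for (const [typeName, fields] of Object.entries(resolvers)) {
    mapped[typeName] = {};
    for (const [fieldName, resolver] of Object.entries(fields)) {
      const name = `${typeName}:${fieldName}`; // e.g. "Category:productCollection"
      mapped[typeName][fieldName] = wrap({ resolver, name });
    }
  }
  return mapped;
}

// Hypothetical, reduced stand-in for createCircuitBreaker: timeout and fallback only.
function createGuardedResolver({ timeout = 3000, fallback } = {}, resolver) {
  return async (...params) => {
    let timer;
    const timedOut = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('timeout')), timeout);
    });
    try {
      // Whichever settles first wins: the resolver or the timeout.
      return await Promise.race([resolver(...params), timedOut]);
    } catch (err) {
      if (fallback) return fallback(err); // degrade gracefully, e.g. an empty list
      throw err;
    } finally {
      clearTimeout(timer);
    }
  };
}

// Usage: a failing product resolver degrades to an empty list.
const resolverConfig = new Map([
  ['Category:productCollection', { timeout: 3000, fallback: () => [] }],
]);
const guarded = mapResolvers(
  { Category: { productCollection: async () => { throw new Error('product service down'); } } },
  ({ resolver, name }) => createGuardedResolver(resolverConfig.get(name) ?? {}, resolver),
);
```

Keying the configuration by resolver name keeps the per-field policy in one place: a category page can still render when the product service is down, while resolvers without a fallback keep propagating the error.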
Failsafe solution❓
Monitoring with Hystrix
Monitoring with Prometheus/Grafana
- SMS notifications
- Teams notifications
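One way to feed breaker activity into Prometheus is to count events per breaker and expose them in the Prometheus text exposition format. The sketch below is illustrative only: the class `BreakerMetrics`, the metric name `circuit_breaker_events_total` and its labels are assumptions, and alerting (SMS, Teams) would be configured on top of these metrics and is not shown.

```javascript
// Counts breaker events per breaker name and renders them as Prometheus text.
// Metric and label names are illustrative, not taken from the original deck.
class BreakerMetrics {
  constructor() {
    this.counts = new Map(); // breaker name -> Map(event -> count)
  }

  record(name, event) {
    if (!this.counts.has(name)) this.counts.set(name, new Map());
    const events = this.counts.get(name);
    events.set(event, (events.get(event) ?? 0) + 1);
  }

  render() {
    const lines = ['# TYPE circuit_breaker_events_total counter'];
    for (const [name, events] of this.counts) {
      for (const [event, count] of events) {
        lines.push(`circuit_breaker_events_total{name="${name}",event="${event}"} ${count}`);
      }
    }
    return lines.join('\n');
  }
}
```

A breaker wrapper would call `record(name, 'open')`, `record(name, 'timeout')` and so on at the corresponding points, and an HTTP endpoint would return `render()` for scraping.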
Is it the answer for everything?
- increases application complexity
- timeouts must be set correctly
- must be defined in the code
- increases processing requirements
👋 Thanks for your attention
Allegro Circuit breaker