Basic System Design 🌚
For Frontend Developer
About Me
莫力全 Kyle Mo
Software Engineer @OneDegree
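An omnivorous software engineer. Life isn't long, but it isn't short either, so I want to learn every technology that interests me.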
Starbugs Weekly
BESG
Why should I learn system design (SD) as a frontend engineer? 🧐
Interview
Frontend Engineer
Backend Engineer
DevOps Engineer
Data Engineer
Software Engineer
Cloud Engineer
Once, I also went through a system design interview... 🙈
I think it's not just for interviews. 🎃
🌚 - Ability to work with complex systems
☺️ - Junior Developer -> Senior Developer
😀 - Communicate with different roles
🧐 - FE tends to need BE skills today
From a LINE developer's conference talk
Distributed System 👌
Assume that...
How about...
And how about...
And if...
???
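Think of the people in the previous slides as network traffic and the vehicles as servers. Vertical scaling of a single machine has its limits, and it also exposes the risk of a single point of failure.
A distributed system is a group of computers connected over a network that pass messages to communicate with each other and coordinate their behavior so that they act as one system.
Computation
Service traffic splitting
Storage
Whenever a system is distributed, pay special attention to data consistency.
Distributed systems are an extremely complex topic, and today we will only scratch the surface. If you are interested, take a look at the MIT 6.824 Distributed Systems course.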
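What properties do we want the systems we design to have? 👀
"Performance growth" is proportional to "resources invested"
As the traffic hitting the system gradually grows, we want the system's servers or storage to scale along with it, so that the system does not become overloaded.
Scalability
Reliability is the probability that a system runs normally from the moment it starts until a given point in time, in other words the probability that it runs without failure.
Reliability
Availability
Availability is a metric that is easy to confuse with reliability. It is defined as the ability of a system to keep providing correct service in the face of various failures. More rigorously, it is "the proportion of the actual running time during which the service runs without interruption." As a formula: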
Availability % = (available time / total time) *100
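As a rough illustration of the formula above, here is a minimal Python sketch. The function names and the one-year period are assumptions made for this example; they are not from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year as the measurement period


def availability_percent(available_time: float, total_time: float) -> float:
    """Availability % = (available time / total time) * 100."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return available_time / total_time * 100


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget implied by an availability target over a period."""
    return period_hours * (1 - availability_pct / 100)


# A service that was down for 8.76 hours in a year.
print(availability_percent(HOURS_PER_YEAR - 8.76, HOURS_PER_YEAR))
print(allowed_downtime_hours(99.9))
```

Turning the formula around this way shows how little downtime a target such as "three nines" leaves over a year.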
Availability
Efficiency
Latency: the length of time it takes to perform one operation.
Throughput: the number of operations or computations that can be performed per unit of time.
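To make the difference between the two metrics concrete, here is a small Python sketch that measures both for an arbitrary operation. The `measure` helper and its return keys are hypothetical names chosen for this example.

```python
import time


def measure(operation, runs: int = 1000) -> dict:
    """Run `operation` sequentially and report average latency and throughput."""
    latencies = []
    start = time.perf_counter()
    for _ in range(runs):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "avg_latency_s": sum(latencies) / runs,   # time per single operation
        "throughput_ops_per_s": runs / elapsed,   # operations per unit of time
    }


result = measure(lambda: sum(range(1000)), runs=200)
print(result)
```

In this single-threaded sketch throughput is roughly the inverse of latency. With concurrency the two diverge: a system can keep latency constant while raising throughput by handling more operations in parallel.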
High Concurrency
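As the name suggests, this is about whether a system is easy to manage. Can new features be iterated on quickly? Can bugs be tracked down quickly? Can the infrastructure be abstracted away so that application engineers can focus on developing program logic?
Manageability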
IaC (Infrastructure as Code)
# CloudFormation example
Parameters:
  ExistingSecurityGroup:
    Description: An existing security group ID (optional).
    Default: NONE
    Type: String
    AllowedValues:
      - default
      - NONE
Conditions:
  CreateNewSecurityGroup: !Equals [!Ref ExistingSecurityGroup, NONE]
Resources:
  MyInstance:
    Type: "AWS::EC2::Instance"
    Properties:
      ImageId: "ami-0ff8a91507f77f867"
      SecurityGroups: !If [CreateNewSecurityGroup, !Ref NewSecurityGroup, !Ref ExistingSecurityGroup]
  NewSecurityGroup:
    Type: "AWS::EC2::SecurityGroup"
    Condition: CreateNewSecurityGroup
    Properties:
      GroupDescription: Enable HTTP access via port 80
      SecurityGroupIngress:
        -
          IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
Outputs:
  SecurityGroupId:
    Description: Group ID of the security group used.
    Value: !If [CreateNewSecurityGroup, !Ref NewSecurityGroup, !Ref ExistingSecurityGroup]
Key Components in SD 🧠
Cache
Proxy
Replication & Redundancy
CDN
Load Balancing
Message Queue
Schema Design
Database (SQL vs NoSQL)
Partition & Sharding
DB Master-Slave
CAP Theorem
😩
Cache
Replication & Redundancy
CDN
Load Balancing
Schema Design
Database (SQL vs NoSQL)
Partition & Sharding
CAP Theorem
😊
DB Master-Slave
Message Queue
Proxy
Caching 👻
Cache
Application
DB
1. Cache hit: return the data directly
2. Cache miss: fetch the data from the DB
3. Write the data into the cache
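The three steps above can be sketched in Python as follows. This is a minimal illustration that assumes plain dictionaries stand in for both the cache and the DB. The class name `CacheAside` and the hit/miss counters are hypothetical additions for the example.

```python
class CacheAside:
    """Read path: check the cache first, fall back to the DB, then fill the cache."""

    def __init__(self, db: dict):
        self.db = db        # stand-in for a real database
        self.cache = {}     # stand-in for a real cache such as an in-memory store
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # 1. Cache hit: return the data directly.
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        # 2. Cache miss: fetch the data from the DB (raises KeyError if absent).
        self.misses += 1
        value = self.db[key]
        # 3. Write the data into the cache for later reads.
        self.cache[key] = value
        return value


store = CacheAside({"video:1": "metadata for video 1"})
store.get("video:1")  # miss, reads the DB
store.get("video:1")  # hit, served from the cache
print(store.hits, store.misses)
```

The sketch leaves out eviction and invalidation, which a real cache needs so that stale entries do not outlive updates in the DB.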
CDN 🌍
Load Balancing 🧜‍♀️
Round-Robin
Least Connected
IP-Hash
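The three strategies listed above can be sketched in Python as follows. All class and method names here are hypothetical, and the server list is a made-up example. Real load balancers also handle health checks and weights, which this sketch omits.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand requests to servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self, client_ip=None):
        return next(self._cycle)


class LeastConnected:
    """Hand each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self, client_ip=None):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection finishes.
        self.active[server] -= 1


class IpHash:
    """Map a client IP to a server by hashing, so a client keeps hitting the same server."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip):
        digest = hashlib.sha256(client_ip.encode()).digest()
        return self.servers[int.from_bytes(digest[:8], "big") % len(self.servers)]


servers = ["app-1", "app-2", "app-3"]
rr = RoundRobin(servers)
print([rr.pick() for _ in range(4)])
print(IpHash(servers).pick("203.0.113.7"))
```

Round-robin needs no state about the servers. Least-connected adapts to uneven request durations. IP-hash trades even distribution for session stickiness.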
How to Think Through a System Design Problem 🐝
Step 1. Clarify the System Requirements
🌚 - Non-Functional Requirements
🌞 - Functional Requirements
Step 2. Rough Estimates of Traffic, Capacity, Network Bandwidth, and Other Metrics
DAU? Network traffic? Storage capacity? Memory for caching? Read : Write = {} : {}? ...
Step 3. Define the System Interface
uploadVideo(user_id, video_content, video_location, user_location, ……)
addVideoToFavorite(user_id, video_id, timestamp, …….)
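One way to pin such an interface down is a typed sketch like the one below. It covers only the parameters named above (the elided ones are left out). The snake_case names, parameter types, return values, and the `InMemoryVideoService` stand-in are all assumptions made for illustration.

```python
import itertools
from typing import Protocol


class VideoService(Protocol):
    """Interface sketch; types and return values are assumptions."""

    def upload_video(self, user_id: str, video_content: bytes,
                     video_location: str, user_location: str) -> str: ...

    def add_video_to_favorite(self, user_id: str, video_id: str,
                              timestamp: float) -> None: ...


class InMemoryVideoService:
    """Hypothetical in-memory stand-in, useful only for exercising the interface."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.videos = {}      # video_id -> stored upload details
        self.favorites = {}   # user_id -> list of (video_id, timestamp)

    def upload_video(self, user_id, video_content, video_location, user_location):
        video_id = f"v{next(self._ids)}"
        self.videos[video_id] = {
            "user_id": user_id,
            "size_bytes": len(video_content),
            "video_location": video_location,
            "user_location": user_location,
        }
        return video_id

    def add_video_to_favorite(self, user_id, video_id, timestamp):
        if video_id not in self.videos:
            raise KeyError(video_id)
        self.favorites.setdefault(user_id, []).append((video_id, timestamp))


service: VideoService = InMemoryVideoService()
vid = service.upload_video("u1", b"raw bytes", "Taipei", "Taipei")
service.add_video_to_favorite("u2", vid, 1700000000.0)
```

Writing the interface first makes the later steps easier to discuss, because the data model and the high-level design both have to serve exactly these calls.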
Step 4. Define the Data Model | DB Schema
User: UserID, Name, Email, DoB, CreationDate, LastLogin, ….
Video: VideoID, VideoLink, VideoLocation, NumberOfLikes, TimeStamp, …
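The two entities above could be sketched as Python dataclasses. Only the fields listed are included. The snake_case names and every field type are assumptions, since the slides name the fields but not their types.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class User:
    user_id: str
    name: str
    email: str
    dob: date                 # DoB
    creation_date: datetime
    last_login: datetime


@dataclass
class Video:
    video_id: str
    video_link: str           # assumed to be a URL or storage path
    video_location: str
    number_of_likes: int
    timestamp: datetime


now = datetime(2024, 1, 1, 12, 0)
user = User("u1", "Kyle", "kyle@example.com", date(1995, 1, 1), now, now)
video = Video("v1", "https://example.com/v1", "Taipei", 0, now)
print(user, video)
```

In a relational schema each dataclass would map to a table with the ID field as the primary key.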
Step 5. High Level Design
Step 6. System Detailed Design
Step 7. Find Trade-offs and Try to Resolve Them
Never feel that your design is perfect
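Let's design a Netflix-like streaming system 🙈
Step 1. Clarify the System Requirements
🌞 - Functional Requirements
Step 1. Clarify the System Requirements
🌚 - Non-Functional Requirements
Step 2. Rough Estimates of Traffic, Capacity, Network Bandwidth, and Other Metrics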
Total users: 1.5 Billion
DAU: 800 Million
A user views 5 videos per day on average
So the total video views per second of our system will be...
800M * 5 / 86400 sec => 46K videos/sec
Step 2. Rough Estimates of Traffic, Capacity, Network Bandwidth, and Other Metrics
upload : view ratio => 1 : 200
46K / 200 => 230 videos/sec
230 videos uploaded per second
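The traffic estimate above can be reproduced with a few lines of Python. The constant names are made up for this sketch; the numbers are the ones from the slides.

```python
DAU = 800_000_000                 # daily active users
VIEWS_PER_USER_PER_DAY = 5
SECONDS_PER_DAY = 86_400
UPLOAD_TO_VIEW_RATIO = 200        # upload : view = 1 : 200

views_per_sec = DAU * VIEWS_PER_USER_PER_DAY / SECONDS_PER_DAY
uploads_per_sec = views_per_sec / UPLOAD_TO_VIEW_RATIO

print(f"views/sec   ~ {views_per_sec:,.0f}")
print(f"uploads/sec ~ {uploads_per_sec:,.0f}")
```

Without intermediate rounding the upload rate comes out at about 231 per second. The 230 above results from rounding the view rate to 46K first, which is fine for a back-of-the-envelope estimate.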
What about storage?
Every minute, 500 hours' worth of video is uploaded to our system
One minute of video needs 50MB of storage
Total storage needed for videos uploaded in a minute would be ...
500 hours * 60 min * 50MB => 1500 GB/min (25 GB/sec)
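The same storage arithmetic as a small Python sketch. The helper name is hypothetical, and it assumes decimal units (1 GB = 1000 MB), which is what the figures above imply.

```python
MB_PER_GB = 1000  # assumption: decimal units, matching the figures above


def upload_volume_gb_per_min(hours_uploaded_per_min: float,
                             mb_per_video_minute: float) -> float:
    """Data volume produced per wall-clock minute of uploads, in GB."""
    video_minutes = hours_uploaded_per_min * 60
    return video_minutes * mb_per_video_minute / MB_PER_GB


storage_gb_per_min = upload_volume_gb_per_min(500, 50)
storage_gb_per_sec = storage_gb_per_min / 60

print(storage_gb_per_min, "GB/min")
print(storage_gb_per_sec, "GB/sec")
```

The estimate covers a single stored copy per video. Replication and multiple encodings, if the design uses them, would multiply it.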
Step 2. Rough Estimates of Traffic, Capacity, Network Bandwidth, and Other Metrics
What about bandwidth?
With 500 hours of video uploaded per minute, and assuming each minute of uploaded video takes 10MB of bandwidth, we would be receiving 300GB of uploads every minute.
500 hours * 60 mins * 10MB => 300GB/min (5GB/sec)
Assuming an upload:view ratio of 1:200, we would need 1TB/s outgoing bandwidth.
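A short Python sketch of the bandwidth estimate, again assuming decimal units (1 GB = 1000 MB, 1 TB = 1000 GB). The variable names are illustrative only.

```python
UPLOAD_HOURS_PER_MINUTE = 500
MB_PER_VIDEO_MINUTE = 10          # bandwidth used per uploaded video minute
UPLOAD_TO_VIEW_RATIO = 200        # upload : view = 1 : 200

# Incoming (upload) bandwidth.
ingress_gb_per_min = UPLOAD_HOURS_PER_MINUTE * 60 * MB_PER_VIDEO_MINUTE / 1000
ingress_gb_per_sec = ingress_gb_per_min / 60

# Outgoing (view) bandwidth scales with the upload:view ratio.
egress_tb_per_sec = ingress_gb_per_sec * UPLOAD_TO_VIEW_RATIO / 1000

print(ingress_gb_per_min, "GB/min in")
print(ingress_gb_per_sec, "GB/sec in")
print(egress_tb_per_sec, "TB/sec out")
```

The outgoing figure is what makes caching and a CDN essential for this design: serving all of it from the origin servers would be impractical.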
Step 3. Define the System Interface
Step 4. Define the Data Model | DB Schema
Step 4. Define the Data Model | DB Schema
Video metadata storage - MySQL
Video metadata can be stored in a SQL database. The following information should be stored with each video
Step 5. High Level Design
Step 6. System Detailed Design
🌚 - Load Balancing
🎃 - Caching
🌍 - CDN
So... what do you think now... how did I do in that interview back then? 🙈
Promo Time
Thank You 🙏
Reference
https://www.educative.io/courses/grokking-the-system-design-interview/xV26VjZ7yMl
https://reurl.cc/3aQnr0
https://www.educative.io/courses/grokking-adv-system-design-intvw