Unique ID Generators in Distributed Systems
Backend Study Group
@Universe Tech
Unique IDs
What are the requirements for unique IDs?
Unique IDs
What are the considerations for unique IDs?
Unique IDs
- Length of ID
- Numeric?
- Sortable?
- Able to tell which machine generated it?
- Collision Rate
- Number of IDs generated per second
- Scalability
Unique IDs
Requirements in this case:
- IDs must be unique.
- IDs are numerical values only.
- IDs fit into 64-bit.
- IDs are ordered by date.
- Ability to generate over 10,000 unique IDs per second.
Unique IDs
Multi-Master Replication
Unique IDs
Multi-Master Replication
- Uses the database auto_increment feature
- Each server increases its next ID by k, where k is the number of database servers
- Hard to scale with multiple data centers
- IDs do not go up with time across multiple servers
- Does not scale well when a server is added or removed
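A minimal sketch of the "increase by k" scheme, assuming two servers (k = 2) simulated in one process. The function name and the arrival order are hypothetical; the point is that IDs stay unique but do not follow time across servers.

```python
from itertools import count


def make_server_sequence(server_index: int, k: int):
    """IDs for one server: start at server_index + 1, then step by k."""
    return count(start=server_index + 1, step=k)


k = 2  # assumed number of database servers
server_a = make_server_sequence(0, k)  # yields 1, 3, 5, ...
server_b = make_server_sequence(1, k)  # yields 2, 4, 6, ...

# Assumed traffic: server A handles three writes before server B handles one.
ids_in_arrival_order = [
    next(server_a),
    next(server_a),
    next(server_a),
    next(server_b),
]
print(ids_in_arrival_order)
```

The last write receives a smaller ID than the two writes before it, which is why this scheme fails the "ordered by date" requirement.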
Unique IDs
- Universally Unique Identifier (UUID)
- 128-bit long
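As a quick illustration, Python's standard `uuid` module can generate a random (version 4) UUID. Choosing version 4 is an assumption here; the slides do not say which UUID version is meant.

```python
import uuid

new_id = uuid.uuid4()   # random (version 4) UUID
as_int = new_id.int     # the same value viewed as an integer

print(new_id)                  # canonical form: 36 characters, hex digits and dashes
print(as_int.bit_length())     # never more than 128
print(as_int >= 2**64)         # too large for a 64-bit integer
```

Such an ID needs no coordination between servers, but it is not purely numeric in its canonical form, does not fit into 64 bits, and a version 4 UUID is not ordered by time.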
Unique IDs
Ticket Server
Unique IDs
Ticket Server
- Centralized auto_increment
- Numeric
- Single point of failure
- Poor scalability
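A rough sketch of the ticket server idea, assuming a single in-process counter stands in for the centralized auto_increment database. `TicketServer` and `worker` are hypothetical names used only for illustration.

```python
import threading


class TicketServer:
    """Hypothetical stand-in for one centralized auto_increment source."""

    def __init__(self, start: int = 0):
        self._last = start
        self._lock = threading.Lock()

    def next_id(self) -> int:
        # Every caller goes through the same lock, so IDs are unique and
        # increasing, but this single instance is also the bottleneck.
        with self._lock:
            self._last += 1
            return self._last


ticket_server = TicketServer()
ids = []


def worker():
    for _ in range(1000):
        ids.append(ticket_server.next_id())


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(ids), len(set(ids)))
```

All clients depend on this one object: if it goes down, nobody gets an ID, which is the single point of failure listed above.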
Unique IDs
Twitter's snowflake
- 64 bits long
- Numeric
- A 41-bit timestamp in milliseconds lasts about 69 years (2^41 ms)
- Datacenter IDs and machine IDs are chosen at startup time
- A 12-bit sequence number lets one machine generate at most 4096 new IDs per millisecond
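A minimal sketch of a snowflake-style generator, assuming the usual layout of 1 sign bit, 41 timestamp bits, 5 datacenter bits, 5 machine bits and 12 sequence bits. The class name, the epoch constant and the busy-wait on sequence overflow are illustrative choices, not Twitter's actual code.

```python
import threading
import time

EPOCH_MS = 1288834974657  # assumed custom epoch (a fixed point in late 2010)
DATACENTER_BITS = 5
MACHINE_BITS = 5
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1  # 4095, so 4096 IDs per millisecond


class SnowflakeGenerator:
    def __init__(self, datacenter_id: int, machine_id: int):
        if not 0 <= datacenter_id < (1 << DATACENTER_BITS):
            raise ValueError("datacenter_id out of range")
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self.datacenter_id = datacenter_id  # fixed at startup
        self.machine_id = machine_id        # fixed at startup
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    @staticmethod
    def _now_ms() -> int:
        return int(time.time() * 1000)

    def next_id(self) -> int:
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
                | (self.datacenter_id << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )


generator = SnowflakeGenerator(datacenter_id=1, machine_id=2)
ids = [generator.next_id() for _ in range(10_000)]
```

Because the timestamp occupies the high bits, IDs from one generator sort by creation time, and the datacenter and machine bits keep generators on different machines from colliding.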
Unique IDs
Twitter's snowflake
- Clock synchronization across machines is crucial
- Reading the system clock costs more than generating a random number
Unique IDs
UIDGenerator by Baidu
Leaf-Segment by Meituan (美团)
Leaf-Snowflake by Meituan (美团)
Seqsvr by Tencent
Design a URL Shortener
URL Shortening
URL Redirecting
High Availability
Fault Tolerance
Design a URL Shortener
- Write operations: 100 million URLs are generated per day.
- Write operations per second: 100 million / 24 / 3600 ≈ 1,160
- Read operations: assuming a 10:1 ratio of reads to writes, reads per second: 1,160 * 10 = 11,600
- Assuming the URL shortener service runs for 10 years, it must support 100 million * 365 * 10 = 365 billion records
- Assume the average URL length is 100 bytes.
- Storage requirement over 10 years: 365 billion * 100 bytes = 36.5 TB (the 365 billion records already cover all 10 years)
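The estimate can be checked with a few lines of arithmetic. The variable names are illustrative, and 1 TB is taken as 10^12 bytes. Since the 365 billion records already span the full 10 years, the storage figure is records times average size, with no further factor of 10.

```python
urls_per_day = 100_000_000
seconds_per_day = 24 * 3600
read_write_ratio = 10
years = 10
avg_url_bytes = 100

writes_per_second = urls_per_day / seconds_per_day       # about 1,157
reads_per_second = writes_per_second * read_write_ratio  # about 11,574
total_records = urls_per_day * 365 * years               # 365 billion
storage_tb = total_records * avg_url_bytes / 10**12      # 36.5 TB

print(round(writes_per_second), round(reads_per_second))
print(total_records, storage_tb)
```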
Design a URL Shortener
API Endpoint
URL shortening. To create a new short URL, a client sends a POST request, which contains one parameter: the original long URL. The API looks like this:
POST api/v1/data/shorten
request parameter: {longUrl: longURLString}
Return shortURL
URL redirecting. To redirect a short URL to the corresponding long URL, a client sends a GET request. The API looks like this:
GET api/v1/shortUrl
Return longURL for HTTP redirection
Design a URL Shortener
URL Redirecting
Design a URL Shortener
URL Shortening
- Each longURL must be hashed to one hashValue
- Each hashValue can be mapped back to the longURL
Design a URL Shortener
Design for URL Shortening
- Data model: <shortURL, longURL> mapping in an RDB
- Hash function
  - Hash + collision resolution
  - Base 62 conversion
- Hash value length
  - [0-9, a-z, A-Z] contains 10 + 26 + 26 = 62 possible characters
  - Find the smallest n such that 62^n ≥ 365 billion
  - When n = 7, 62^n ≈ 3.5 trillion
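The length calculation can be reproduced with a small loop. `min_hash_length` is a hypothetical helper name.

```python
def min_hash_length(total_records: int, alphabet_size: int = 62) -> int:
    """Smallest n with alphabet_size ** n >= total_records."""
    n = 1
    while alphabet_size ** n < total_records:
        n += 1
    return n


n = min_hash_length(365_000_000_000)
print(n, 62 ** n)
```

Six characters give only about 56.8 billion combinations, which is below 365 billion, so seven is the first length that is large enough.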
Design a URL Shortener
Hash + Collision Resolution
Design a URL Shortener
Hash + Collision Resolution
- On a collision, recursively append a new predefined string to the longURL and hash again until no collision is found
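A minimal sketch of hash + collision resolution, assuming SHA-256 truncated to 7 hex characters and an in-memory dict in place of the database. The hash function, the suffix string and the function name are all assumptions; the slides do not specify them.

```python
import hashlib

PREDEFINED_SUFFIX = "#retry"  # hypothetical predefined string


def shorten_with_hash(long_url: str, table: dict, length: int = 7) -> str:
    """Return a short key for long_url, re-hashing with a suffix on collision."""
    candidate = long_url
    while True:
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        short = digest[:length]
        existing = table.get(short)
        if existing is None:
            table[short] = long_url   # free slot: store the mapping
            return short
        if existing == long_url:
            return short              # this URL was already shortened
        candidate += PREDEFINED_SUFFIX  # collision: append and hash again


table = {}
short = shorten_with_hash("https://en.wikipedia.org/wiki/Systems_design", table)
print(short, table[short])
```

Each attempt costs a lookup in the table, so frequent collisions make this approach expensive. Note that hex output uses only 16 symbols, so a real design would map the hash into the 62-character alphabet instead.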
Design a URL Shortener
Base 62 conversion
- commonly used for URL shorteners
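A short sketch of base 62 conversion, assuming the digit order 0-9, a-z, A-Z (so 'a' is 10 and 'A' is 36). The function names and the sample number 11157 are illustrative.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_encode(number: int) -> str:
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most significant digit first


def base62_decode(text: str) -> int:
    number = 0
    for ch in text:
        number = number * 62 + ALPHABET.index(ch)
    return number


encoded = base62_encode(11157)   # 11157 = 2 * 62^2 + 55 * 62 + 59
print(encoded, base62_decode(encoded))
```

Since each distinct number maps to a distinct string, collisions are impossible as long as the input IDs are unique.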
Design a URL Shortener
Comparison of 2 Approaches
Design a URL Shortener
Chosen Flow
Design a URL Shortener
Chosen Flow
- Assume the input longURL is: https://en.wikipedia.org/wiki/Systems_design
- The unique ID generator returns ID: 2009215674938
- Convert the ID to a shortURL using base 62 conversion: ID 2009215674938 becomes "zn9edcu"
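The flow can be sketched end to end, assuming in-memory dicts instead of a database and an ID generator stubbed to return the ID from the example. `UrlShortener`, `to_base62` and the "return the existing short URL if the long URL is already stored" step are assumptions for illustration.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base62(number: int) -> str:
    """Base 62 conversion with digit order 0-9, a-z, A-Z."""
    text = ""
    while True:
        number, remainder = divmod(number, 62)
        text = ALPHABET[remainder] + text
        if number == 0:
            return text


class UrlShortener:
    def __init__(self, id_generator):
        self._next_id = id_generator   # any callable returning a unique int
        self._long_to_short = {}
        self._short_to_long = {}

    def shorten(self, long_url: str) -> str:
        if long_url in self._long_to_short:       # already shortened
            return self._long_to_short[long_url]
        short_url = to_base62(self._next_id())    # ID -> base 62 string
        self._long_to_short[long_url] = short_url
        self._short_to_long[short_url] = long_url
        return short_url


# Stub generator that returns the ID used in the example above.
shortener = UrlShortener(id_generator=lambda: 2009215674938)
short_url = shortener.shorten("https://en.wikipedia.org/wiki/Systems_design")
print(short_url)  # zn9edcu
```

In a distributed deployment the stub would be replaced by a real unique ID generator, such as a snowflake-style one.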
Design a URL Shortener
Design for URL Redirecting
- Can be stored in a cache to improve performance
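A minimal sketch of the redirect lookup with a cache in front of the database, assuming plain dicts for both. `Redirector` and the `db_reads` counter are hypothetical and exist only to show when the database is hit.

```python
from typing import Optional


class Redirector:
    def __init__(self, database: dict):
        self._database = database   # stand-in for the <shortURL, longURL> table
        self._cache = {}            # stand-in for a cache
        self.db_reads = 0

    def resolve(self, short_url: str) -> Optional[str]:
        """Return the longURL for redirection, or None if it is unknown."""
        if short_url in self._cache:
            return self._cache[short_url]         # cache hit
        self.db_reads += 1
        long_url = self._database.get(short_url)  # cache miss: read the database
        if long_url is not None:
            self._cache[short_url] = long_url
        return long_url


redirector = Redirector({"zn9edcu": "https://en.wikipedia.org/wiki/Systems_design"})
first = redirector.resolve("zn9edcu")
second = redirector.resolve("zn9edcu")
print(first, redirector.db_reads)
```

Because reads outnumber writes, repeated lookups of popular short URLs are served from the cache and only the first one reaches the database.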
Design a URL Shortener
- Databases: RDB or NoSQL?
- Scalability: Database and We
- How to check if a URL exists in DB efficiently?
- How to avoid malicious scanning?
- Short URL validation
- Rate Limiting
- How to cache effectively?
Unique ID Generator in Distributed Systems - Backend Study Group
By Albert Chen