Unique ID Generators in Distributed Systems

Backend Study Group

11/01

15:00~16:30

@Universe Tech

Unique IDs

What are the requirements for unique IDs?

Unique IDs

What are the considerations for unique IDs?

Unique IDs

  • Considerations

    • Length of ID
    • Numeric?
    • Sortable?
    • Distinguishable across machines?
    • Collision rate
    • IDs generated per second
    • Scalability

Unique IDs

  • Requirements in this case:

    • IDs must be unique.
    • IDs are numerical values only.
    • IDs fit into 64-bit.
    • IDs are ordered by date.
    • Ability to generate over 10,000 unique IDs per second.

Unique IDs

  • Multi-Master Replication

    • auto_increment
    • Each server increments its IDs by k, where k is the number of database servers

    • Hard to scale with multiple data centers

    • IDs do not go up with time across multiple servers

    • It does not scale well when a server is added or removed
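
A minimal sketch of the increase-by-k scheme, assuming k = 3 servers simulated in a single process. The class name `AutoIncrementServer` and the traffic pattern are hypothetical. It illustrates why the IDs stay unique but stop tracking time once servers receive uneven load.

```python
from itertools import count


class AutoIncrementServer:
    """Simulates one database server whose auto_increment steps by k."""

    def __init__(self, offset: int, k: int) -> None:
        # A server with offset i hands out i, i + k, i + 2k, ...
        self._ids = count(start=offset, step=k)

    def next_id(self) -> int:
        return next(self._ids)


k = 3  # hypothetical number of servers
servers = [AutoIncrementServer(offset=i + 1, k=k) for i in range(k)]

# Uneven traffic: server 0 handles three writes before server 1 handles one.
issued = [servers[0].next_id() for _ in range(3)] + [servers[1].next_id()]
print(issued)  # [1, 4, 7, 2]
```

The last ID issued (2) is smaller than the ones issued before it, and changing k later would make the existing sequences overlap.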

Unique IDs

  • UUID

    • Universally Unique Identifier
    • 128-bit long

Unique IDs

  • UUID v1

Unique IDs

  • UUID v4
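
Both versions can be tried with Python's standard `uuid` module. This is only an illustration of the two variants named on the slides: version 1 is derived from a timestamp and a node identifier, version 4 from random bits.

```python
import uuid

v1 = uuid.uuid1()  # timestamp plus node identifier
v4 = uuid.uuid4()  # random bits

print(v1, v1.version)
print(v4, v4.version)

# Either way the value is 16 bytes (128 bits), so it cannot satisfy
# a "fits into 64-bit" requirement.
print(len(v4.bytes) * 8)  # 128
```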

Unique IDs

  • Ticket Server

Unique IDs

  • Ticket Server

    • Centralized auto_increment

    • Numeric
    • Single Point of Failure
    • Poor Scalability
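
A minimal sketch of the ticket-server idea, assuming an in-process counter stands in for the centralized database. `TicketServer` and the worker setup are hypothetical names used only for illustration.

```python
import threading


class TicketServer:
    """A single counter standing in for a centralized auto_increment table."""

    def __init__(self, start: int = 1) -> None:
        self._next = start
        self._lock = threading.Lock()

    def next_id(self) -> int:
        # Every caller serializes on this one lock: simple and numeric,
        # but also a bottleneck and a single point of failure.
        with self._lock:
            ticket = self._next
            self._next += 1
            return ticket


server = TicketServer()
tickets = []


def worker() -> None:
    for _ in range(1000):
        tickets.append(server.next_id())


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(tickets), len(set(tickets)))  # 4000 4000
```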

Unique IDs

  • Twitter's snowflake

    • 64 bits long

    • Numeric
    • A 41-bit millisecond timestamp (2^41 ms) covers about 69 years
    • Datacenter IDs and machine IDs are chosen at the startup time

    • A machine can support a maximum of 4096 new IDs per ms

Unique IDs

  • Twitter's snowflake

    • Clock synchronization across machines is crucial

    • Reading the system clock costs more than generating a random number
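
A sketch of a Snowflake-style generator, under stated assumptions: the slides only give the 41-bit timestamp, the datacenter and machine IDs, and the 4096 IDs per millisecond limit (a 12-bit sequence). The 5 + 5 bit split for datacenter and machine, the epoch constant, and every name below are assumptions made for illustration, not a reference implementation.

```python
import threading
import time

EPOCH_MS = 1288834974657  # assumed custom epoch in milliseconds
DATACENTER_BITS = 5       # assumed split: 5 bits datacenter, 5 bits machine
MACHINE_BITS = 5
SEQUENCE_BITS = 12        # 2**12 = 4096 IDs per millisecond per machine
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

MACHINE_SHIFT = SEQUENCE_BITS
DATACENTER_SHIFT = SEQUENCE_BITS + MACHINE_BITS
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS + DATACENTER_BITS


class SnowflakeGenerator:
    def __init__(self, datacenter_id: int, machine_id: int) -> None:
        # Both IDs are fixed at startup and must fit in their bit fields.
        if not 0 <= datacenter_id < (1 << DATACENTER_BITS):
            raise ValueError("datacenter_id out of range")
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id out of range")
        self._datacenter_id = datacenter_id
        self._machine_id = machine_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    @staticmethod
    def _now_ms() -> int:
        return time.time_ns() // 1_000_000

    def next_id(self) -> int:
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                # Clock went backwards: refusing is the simplest safe reaction.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << TIMESTAMP_SHIFT)
                | (self._datacenter_id << DATACENTER_SHIFT)
                | (self._machine_id << MACHINE_SHIFT)
                | self._sequence
            )


generator = SnowflakeGenerator(datacenter_id=1, machine_id=7)
ids = [generator.next_id() for _ in range(10_000)]
print(ids[0], ids[-1])

# Lifetime of a 41-bit millisecond counter, in years (about 69.7).
print(2**41 / (1000 * 60 * 60 * 24 * 365))
```

Because the timestamp occupies the high bits, IDs from one generator sort by creation time, which is what makes the scheme depend on a well-behaved clock.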

Unique IDs

  • Alternatives

    • UIDGenerator by Baidu

    • Leaf-Segment by Meituan

    • Leaf-Snowflake by Meituan

    • Seqsvr by Tencent

Design a URL Shortener

  • Requirements

    • URL Shortening

    • URL Redirecting

    • High Availability

    • Scalability

    • Fault Tolerance

Design a URL Shortener

  • Estimation

    • Write operation: 100 million URLs are generated per day.
    • Write operations per second: 100 million / 24 / 3600 ≈ 1,160

    • Read operations: assuming a 10:1 ratio of reads to writes, reads per second: 1,160 * 10 = 11,600

    • Assuming the URL shortener service will run for 10 years, we must support 100 million * 365 * 10 = 365 billion records

    • Assume the average URL length is 100 bytes.

    • Storage requirement over 10 years: 365 billion * 100 bytes = 36.5 TB
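
The back-of-the-envelope numbers can be reproduced with a few lines of Python. The variable names are illustrative. Note that the 365 billion record count already spans the full 10 years, so the storage figure is that count times 100 bytes.

```python
SECONDS_PER_DAY = 24 * 3600

writes_per_day = 100_000_000
read_write_ratio = 10
years = 10
avg_url_bytes = 100

writes_per_second = writes_per_day / SECONDS_PER_DAY
reads_per_second = writes_per_second * read_write_ratio
total_records = writes_per_day * 365 * years
storage_bytes = total_records * avg_url_bytes

print(f"writes/s: {writes_per_second:,.0f}")        # 1,157 (rounded up to 1,160 on the slide)
print(f"reads/s:  {reads_per_second:,.0f}")         # 11,574
print(f"records:  {total_records:,}")               # 365,000,000,000
print(f"storage:  {storage_bytes / 1e12:.1f} TB")   # 36.5 TB
```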

Design a URL Shortener

  • API Endpoint

    • URL shortening. To create a new short URL, a client sends a POST request, which contains one parameter: the original long URL. The API looks like this:

      • POST api/v1/data/shorten

        • request parameter: {longUrl: longURLString}

        • return shortURL

    • URL redirecting. To redirect a short URL to the corresponding long URL, a client sends a GET request. The API looks like this:

      • GET api/v1/shortUrl

        • Return longURL for HTTP redirection

Design a URL Shortener

  • URL Redirecting

Design a URL Shortener

  • URL Shortening

    • Each longURL must be hashed to one hashValue
    • Each hashValue can be mapped back to the longURL

Design a URL Shortener

  • Design for URL Shortening

    • Data Model: <shortURL, longURL> mapping in RDB
    • Hash function

      • hash + collision resolution

      • base 62 conversion

    • Hash value length

      • [0-9, a-z, A-Z], containing 10 + 26 + 26 = 62 possible characters

      • 62^n ≥ 365 billion

      • When n = 7, 62 ^ n = ~3.5 trillion
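
The choice of n = 7 can be checked by searching for the smallest length whose keyspace covers the 365 billion records estimated earlier. The constant names are illustrative.

```python
ALPHABET_SIZE = 10 + 26 + 26  # digits, lowercase letters, uppercase letters
TARGET_RECORDS = 365_000_000_000

n = 1
while ALPHABET_SIZE ** n < TARGET_RECORDS:
    n += 1

print(n, ALPHABET_SIZE ** n)  # 7 3521614606208
```

Six characters give only about 56.8 billion combinations, which is why seven is the minimum.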

Design a URL Shortener

  • Hash + Collision Resolution

    • On a collision, recursively append a predefined string to the longURL and rehash until the result is unused
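
A minimal sketch of hash plus collision resolution, assuming SHA-256 truncated to 7 hex characters and an in-memory dict in place of the database. The hash function, the `SALT` string, and the `shorten` helper are all assumptions chosen for illustration; the retry is written as a loop rather than literal recursion.

```python
import hashlib

SHORT_LENGTH = 7
SALT = "#"  # hypothetical predefined string appended on collision

store = {}  # shortURL -> longURL, standing in for the database


def shorten(long_url: str) -> str:
    candidate = long_url
    while True:
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        short = digest[:SHORT_LENGTH]
        existing = store.get(short)
        if existing is None:
            store[short] = long_url
            return short
        if existing == long_url:
            return short  # this URL was shortened before
        # Collision with a different URL: append the predefined string, retry.
        candidate += SALT


print(shorten("https://en.wikipedia.org/wiki/Systems_design"))
```

Each attempt costs a database lookup, which is the main expense of this approach.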

Design a URL Shortener

  • Base 62 conversion

    • commonly used for URL shorteners
    • https://tinyurl.com/2TX
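
A short sketch of base 62 conversion, assuming the alphabet order 0-9, a-z, A-Z. Under that assumed ordering, the path `2TX` in the example URL above corresponds to the number 11157 (2 * 62^2 + 55 * 62 + 59). The function names are illustrative.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def base62_encode(number: int) -> str:
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_decode(text: str) -> int:
    number = 0
    for char in text:
        number = number * BASE + ALPHABET.index(char)
    return number


print(base62_encode(11157))  # 2TX
print(base62_decode("2TX"))  # 11157
```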

Design a URL Shortener

  • Comparison of 2 Approaches

Design a URL Shortener

  • Chosen Flow

    • Assuming the input longURL is: https://en.wikipedia.org/wiki/Systems_design
    • Unique ID generator returns ID: 2009215674938

    • Convert the ID to shortURL using the base 62 conversion. ID (2009215674938) is converted to "zn9edcu"
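
The flow above can be sketched end to end, under some assumptions: a plain counter stands in for the unique ID generator, two dicts stand in for the database, and returning the existing short URL for a repeated longURL is an assumed behavior. `Shortener` and `to_base62` are hypothetical names.

```python
import string
from itertools import count

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def to_base62(number: int) -> str:
    if number < 0:
        raise ValueError("number must be non-negative")
    digits = []
    while True:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
        if number == 0:
            return "".join(reversed(digits))


class Shortener:
    def __init__(self, id_source) -> None:
        self._id_source = id_source  # stand-in for the unique ID generator
        self._short_by_long = {}     # longURL -> shortURL
        self._long_by_short = {}     # shortURL -> longURL

    def shorten(self, long_url: str) -> str:
        # Assumed behavior: reuse the short URL if this longURL was seen before.
        if long_url in self._short_by_long:
            return self._short_by_long[long_url]
        short = to_base62(next(self._id_source))
        self._short_by_long[long_url] = short
        self._long_by_short[short] = long_url
        return short

    def resolve(self, short: str):
        return self._long_by_short.get(short)


service = Shortener(id_source=count(start=2009215674938))
short = service.shorten("https://en.wikipedia.org/wiki/Systems_design")
print(short)  # zn9edcu
```

Since every ID is unique, the base 62 string is unique too, so no collision check is needed on the write path.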

Design a URL Shortener

  • Design for URL Redirecting

    • <shortURL, longURL> mappings can be stored in a cache to improve read performance
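
A minimal read-through cache sketch for the redirect path. The LRU eviction policy, the capacity, and the dict standing in for the database are assumptions; the slides only say that the mapping can be cached.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used


database = {"zn9edcu": "https://en.wikipedia.org/wiki/Systems_design"}
cache = LRUCache(capacity=1024)
stats = {"db_reads": 0}


def lookup(short: str):
    long_url = cache.get(short)
    if long_url is None:
        # Cache miss: fall back to the database and remember the answer.
        stats["db_reads"] += 1
        long_url = database.get(short)
        if long_url is not None:
            cache.put(short, long_url)
    return long_url


lookup("zn9edcu")
lookup("zn9edcu")
print(stats["db_reads"])  # 1
```

With a read-heavy workload such as the 10:1 ratio estimated earlier, most redirects would be served without touching the database.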

Design a URL Shortener

  • Discussions

    • Databases: RDB or NoSQL?
    • Scalability: Database and We
    • How to check if a URL exists in DB efficiently?
    • How to avoid malicious scanning?
      • Short URL validation
      • Rate Limiting
    • How to cache effectively?

Discussion

Unique ID Generator in Distributed Systems - Backend Study Group

By Albert Chen
