Copyright © 直通硅谷
http://www.zhitongguigu.com
What is the key point for this question?
Database Design
What do we need?
First: Extract basic elements
User, Post, Profile, Reply
Design endpoints to fulfill these requirements
Backend: how to join the tables to produce the results
Apache
Nginx
Lighttpd
IIS
Tomcat
Jetty
Netty
Database Sharding
Scaling up
Load Balancer
Server & Database Replication
Caching
Fetching from DB is slow
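Because database reads are slow, a common pattern is to check a cache first and fall back to the database only on a miss. The sketch below is a minimal illustration under assumptions: `CachedReader` and `load_from_db` are hypothetical names, and a plain dictionary stands in for a real cache.

```python
class CachedReader:
    """Cache-aside read path: look in the cache, then fall back to the DB."""

    def __init__(self, load_from_db):
        # load_from_db is any callable key -> value (or None if missing).
        self._load_from_db = load_from_db
        self._cache = {}
        self.db_reads = 0  # counts how often the slow path was taken

    def get(self, key):
        if key in self._cache:
            return self._cache[key]  # fast path: served from memory
        self.db_reads += 1
        value = self._load_from_db(key)  # slow path: hit the database
        if value is not None:
            self._cache[key] = value  # remember it for the next request
        return value


# Usage: a dict plays the role of the database here.
fake_db = {"RGKGR3T": "http://example.com/some/long/path"}
reader = CachedReader(fake_db.get)
reader.get("RGKGR3T")
reader.get("RGKGR3T")  # second call does not touch the database
```

A real deployment would also need an eviction and invalidation policy, which this sketch leaves out.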
CDN
Content Delivery Network
On Twitter and Weibo, posts have a character limit. Also, an ordinary URL may contain characters that cannot be placed in the content.
So many companies run their own short URL service
Example: http://t.cn/RGKGR3T
Design a system that can generate such short URLs
UUID
MD5
Are these too long? Could we find a better way?
Think about the requirement
A short URL can be released for reuse after some number of days
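One possible answer to the length question is to encode an integer ID in base 62 instead of using a UUID or an MD5 digest. The slides do not name this scheme, so treat it as an assumption: the sketch supposes each long URL is assigned an auto-incrementing ID, and `encode_base62`, `decode_base62`, and the alphabet order are illustrative choices.

```python
import string

# Assumed alphabet: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number: int) -> str:
    """Turn a non-negative integer ID into a short base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Recover the integer ID from a base-62 string."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number
```

Seven base-62 characters cover 62^7 = 3,521,614,606,208 distinct IDs, whereas an MD5 digest takes 32 hexadecimal characters.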
We do not generate a short URL every time; we just look it up in the table
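The lookup-first behavior can be sketched as a get-or-create operation. This is a minimal in-memory illustration, not the slides' design: `ShortUrlTable`, `shorten`, and `resolve` are hypothetical names, and the short code is simply the hexadecimal form of a counter as a placeholder for whatever generation scheme is chosen.

```python
import itertools


class ShortUrlTable:
    """In-memory stand-in for the short URL lookup table."""

    def __init__(self):
        self._long_to_short = {}
        self._short_to_long = {}
        self._ids = itertools.count(1)  # placeholder ID source

    def shorten(self, long_url):
        # Look up first: a known long URL keeps its existing short code.
        existing = self._long_to_short.get(long_url)
        if existing is not None:
            return existing
        # Only on a miss do we generate and store a new code.
        short = format(next(self._ids), "x")
        self._long_to_short[long_url] = short
        self._short_to_long[short] = long_url
        return short

    def resolve(self, short):
        """Return the long URL for a short code, or None if unknown."""
        return self._short_to_long.get(short)
```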
Apache Tomcat
Short URL -> Long URL, time
Pretty easy? What if we have a high QPS?
Relational Database: Oracle, MySQL
Non-Relational Database (NoSQL): MongoDB, CouchDB, Couchbase, Redis
RDBMS(Relational Database Management System)
NoSQL (Not Only SQL)
We could just use NoSQL
Reason: we do not have complicated queries to fetch data; we only need the original URL and the short URL (and maybe a timestamp)
How to update the Store?
How to remove them?
short -> Long
short -> time
time -> short
Is it okay?
If we can be flexible about the exact expiry time
short -> Long
short -> time
Daily data purge job
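The two mappings plus a daily purge job can be sketched as follows. This is a minimal illustration under assumptions: `ExpiringStore` is a hypothetical name, dictionaries stand in for the real store, and the 30-day lifetime is an arbitrary default. The `now` parameter exists only to make the behavior easy to demonstrate.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


class ExpiringStore:
    """Keeps short -> long and short -> time; no time -> short index."""

    def __init__(self, ttl_days=30):  # 30 days is an assumed lifetime
        self.ttl_seconds = ttl_days * SECONDS_PER_DAY
        self.short_to_long = {}
        self.short_to_time = {}

    def put(self, short, long_url, now=None):
        now = time.time() if now is None else now
        self.short_to_long[short] = long_url
        self.short_to_time[short] = now

    def purge(self, now=None):
        """Daily job: scan short -> time and drop expired entries."""
        now = time.time() if now is None else now
        expired = [
            short
            for short, created in self.short_to_time.items()
            if now - created >= self.ttl_seconds
        ]
        for short in expired:
            del self.short_to_long[short]
            del self.short_to_time[short]
        return len(expired)  # number of entries removed
```

Because the job runs once a day, an entry may live up to a day past its nominal expiry. That is the trade-off accepted in exchange for dropping the time -> short index.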
Do we really need to be accurate to seconds to purge the data?
Since we have a daily job, it does not seem necessary
Messaging System
Apache Kafka
A relational database on its own can struggle with very high read QPS
NoSQL has many limitations in modeling relations and in maintenance
So sometimes we combine the two in a hybrid design
For common get queries, we store the data in NoSQL
For specific queries and write operations, we still use the RDBMS
How to update NoSQL? Event listener
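The event-listener idea can be sketched as below: writes go to the relational store, which publishes an event, and a listener applies that event to the NoSQL read store. Everything here is an in-memory stand-in under assumptions; `EventBus`, `UrlService`, `NoSqlReadStore`, and the event fields are hypothetical names, and in practice the bus role would be played by a messaging system.

```python
class EventBus:
    """Tiny synchronous publish/subscribe stand-in for a message queue."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def publish(self, event):
        for listener in self._listeners:
            listener(event)


class NoSqlReadStore:
    """Read-optimized copy, updated only through events."""

    def __init__(self):
        self.data = {}

    def on_event(self, event):
        if event["type"] == "url_saved":
            self.data[event["short"]] = event["long"]
        elif event["type"] == "url_deleted":
            self.data.pop(event["short"], None)

    def get(self, short):
        return self.data.get(short)


class UrlService:
    """Write path: the relational table is the source of truth."""

    def __init__(self, bus):
        self.rdbms = {}  # dict standing in for an RDBMS table
        self.bus = bus

    def save(self, short, long_url):
        self.rdbms[short] = long_url
        self.bus.publish({"type": "url_saved", "short": short, "long": long_url})

    def delete(self, short):
        self.rdbms.pop(short, None)
        self.bus.publish({"type": "url_deleted", "short": short})


# Usage: wire the listener to the bus, then write through the service.
bus = EventBus()
read_store = NoSqlReadStore()
bus.subscribe(read_store.on_event)
service = UrlService(bus)
service.save("abc", "http://example.com/page")
```

With a real asynchronous queue the read store lags slightly behind the relational store, so reads are eventually consistent rather than immediately consistent.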
Database: Relational and NoSQL -> consider which one to use
Network: Apache, Tomcat, etc. -> REST API, JSON
Distributed System: Load balancing, Sticky Session
Security: HTTPS, Encryption/Decryption