Distributed File System
Topics
1. DFS
2. GFS
3. HDFS
Philosophy
Distribute the storage while keeping local-like performance and experience.
A distributed file system (DFS) is a file system that spans multiple file servers or multiple locations.
DFS
Img Url: https://scaleyourapp.com/wp-content/uploads/2022/01/distributed-client-server-min-1200x675.png
Why DFS?
- Access to the same data from multiple locations
- Transparent local access
- Location independence
- Scale-out capabilities
- Fault tolerance
Different DFS
- Network File System (NFS)
- Google File System (GFS)
- Hadoop Distributed File System (HDFS)
- Colossus
How it works?
- Distribution: A DFS first distributes datasets across multiple clusters or nodes. Each node contributes its own computing power, so the DFS can process the datasets in parallel.
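The distribution step can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DFS is implemented: the node names, the 128-byte block size, and the round-robin placement are assumptions chosen for the example, and a thread pool stands in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def assign_round_robin(blocks: list[bytes], nodes: list[str]) -> dict[str, list[bytes]]:
    """Place block i on node i mod N (hypothetical placement policy)."""
    placement: dict[str, list[bytes]] = {node: [] for node in nodes}
    for index, block in enumerate(blocks):
        placement[nodes[index % len(nodes)]].append(block)
    return placement


def count_newlines(blocks: list[bytes]) -> int:
    """Work that one node does on the blocks it holds locally."""
    return sum(block.count(b"\n") for block in blocks)


data = b"line\n" * 200                      # 1000 bytes, 200 lines
nodes = ["node-a", "node-b", "node-c"]      # hypothetical node names
blocks = split_into_blocks(data, block_size=128)
placement = assign_round_robin(blocks, nodes)

# Each "node" processes only its own blocks; partial results are then combined.
with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
    partial_counts = list(pool.map(count_newlines, placement.values()))

total_lines = sum(partial_counts)
```

Counting newline bytes is used here because the result is the same no matter where the block boundaries fall, so the per-node partial counts can simply be summed.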
How it works?
- Replication: A DFS also replicates datasets by copying the same pieces of data onto multiple clusters or nodes. This gives the distributed file system fault tolerance: if one copy is lost, another is still available.
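How replication buys fault tolerance can be shown with a small simulation. The sketch below is illustrative only: the node names, the replication factor of 3, and the "next nodes in the ring" placement rule are assumptions for the example, not the policy of GFS, HDFS, or any other system named in this deck.

```python
def place_replicas(num_blocks: int, nodes: list[str], replication_factor: int) -> dict[int, list[str]]:
    """Store each block on `replication_factor` distinct nodes (hypothetical ring placement)."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication_factor)]
        for block_id in range(num_blocks)
    }


def readable_blocks(placement: dict[int, list[str]], failed_nodes: set[str]) -> set[int]:
    """A block stays readable while at least one of its replicas is on a live node."""
    return {
        block_id
        for block_id, replicas in placement.items()
        if any(node not in failed_nodes for node in replicas)
    }


nodes = ["node-a", "node-b", "node-c", "node-d"]   # hypothetical node names
placement = place_replicas(num_blocks=6, nodes=nodes, replication_factor=3)

# With 3 copies per block, any 2 node failures leave every block readable.
after_two_failures = readable_blocks(placement, {"node-a", "node-b"})

# A third failure can wipe out all copies of some blocks.
after_three_failures = readable_blocks(placement, {"node-a", "node-b", "node-c"})
```

In this run all six blocks survive the loss of two nodes, while losing a third node makes the blocks whose replicas all sat on the failed nodes unreadable. A replication factor of k tolerates up to k - 1 simultaneous node failures per block.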
NFS
Img Url: https://ars.els-cdn.com/content/image/3-s2.0-B9780124201583000186-f18-01-9780124201583.jpg
GFS
Stanford Deck: Link
HDFS
U Waterloo Deck: Link
HDFS Architecture
Img Url: https://www.interviewbit.com/blog/wp-content/uploads/2022/06/HDFS-Architecture-1024x550.png
Applications of HDFS
Industry | Use Cases |
---|---|
Finance | Stock predictions |
E-Commerce | Product recommendations |
Social Media | Social graph analysis |
Ads | Conversions |
Healthcare | Patient data analysis |
Thank You!
Questions?
By arvind ram