Module 3: Storage, Databases & Caching

Three Kinds of Storage

When engineers say "storage," they might mean three fundamentally different things. Block storage gives you raw disk volumes. File storage gives you a hierarchical file system. Object storage gives you a flat namespace of immutable blobs with metadata. Each model serves different workloads, and choosing wrong means paying too much, building unnecessary complexity, or hitting walls that should not exist.

| Dimension | Block Storage | File Storage | Object Storage |
| --- | --- | --- | --- |
| Abstraction | Raw disk (sectors, volumes) | Hierarchical (directories, files) | Flat (bucket + key + metadata) |
| Access pattern | Random read/write, low latency | POSIX file operations | HTTP GET/PUT, whole-object |
| Mutability | Mutable (overwrite bytes in place) | Mutable (edit files) | Immutable (replace entire object) |
| Scalability | Limited by volume size (16 TB typical) | Limited by file system (PB range) | Virtually unlimited |
| Durability | 99.999% (replicated volumes) | Depends on implementation | 99.999999999% (11 nines, S3) |
| Cost | $$$ (EBS gp3: $0.08/GB/mo) | $$ (EFS: $0.30/GB/mo) | $ (S3 Standard: $0.023/GB/mo) |
| Typical use | Databases, boot volumes, OLTP | Shared file access, CMS, legacy apps | Media, backups, data lakes, static assets |

Object Storage: S3-Style

Amazon S3 is the canonical object store, and its API has become a de facto standard (MinIO, Cloudflare R2, and Backblaze B2 all implement S3-compatible APIs). The model is simple: you have buckets (containers) and objects (files with metadata). Every object has a unique key within its bucket. You interact with it over HTTP.

Object storage treats data as immutable blobs accessed by key. You cannot edit byte 500 of a file. You replace the entire object. This constraint enables massive durability (11 nines), global replication, and near-infinite scale at low cost.

Objects in S3 are replicated across at least 3 availability zones automatically. The durability guarantee of 99.999999999% means that if you store 10 million objects, you can statistically expect to lose one every 10,000 years. No database offers this durability at this cost.
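The "one object every 10,000 years" figure follows from simple expectation arithmetic, which a few lines can make concrete:

```python
# Expected object loss under S3's 11-nines durability design target.
annual_loss_probability = 1e-11      # 1 - 0.99999999999
objects_stored = 10_000_000

expected_losses_per_year = objects_stored * annual_loss_probability
years_per_expected_loss = 1 / expected_losses_per_year

print(expected_losses_per_year)   # about 0.0001 objects per year
print(years_per_expected_loss)    # about 10,000 years per expected loss
```

The expectation scales linearly: at 10 billion objects, the same durability target implies roughly one lost object every 10 years.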

Storage Cost Comparison

The price difference between storage tiers is dramatic. Choosing the right tier based on access frequency saves substantial money at scale.

EBS block storage costs 80x more per GB than Glacier Deep Archive. Even S3 Standard is 3.5x cheaper than EBS. The tradeoff is access latency: EBS gives you sub-millisecond random reads, while Glacier Deep Archive can take 12 hours to retrieve data. Choose your tier based on how often you access the data, not how important it is.
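A quick calculation makes the gap tangible. The sketch below uses the per-GB prices quoted in the table above; the Glacier Deep Archive rate ($0.00099/GB/mo) is the published price at the time of writing and varies by region:

```python
# Monthly cost of storing 10 TB at each tier's per-GB price.
# Glacier Deep Archive rate is an assumption (region-dependent, subject to change).
PRICES_PER_GB_MONTH = {
    "EBS gp3": 0.08,
    "EFS": 0.30,
    "S3 Standard": 0.023,
    "Glacier Deep Archive": 0.00099,
}

gb = 10 * 1024  # 10 TB expressed in GB
for tier, price in PRICES_PER_GB_MONTH.items():
    print(f"{tier}: ${gb * price:,.2f}/month")

# Ratio between the hottest and coldest tiers: roughly 80x.
ratio = PRICES_PER_GB_MONTH["EBS gp3"] / PRICES_PER_GB_MONTH["Glacier Deep Archive"]
print(f"EBS vs. Deep Archive: {ratio:.0f}x")
```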

Pre-Signed URLs

A pre-signed URL is a time-limited URL that grants temporary access to a private S3 object. Your backend generates the URL using its AWS credentials, and the client uses it to upload or download directly from S3. The backend never touches the file bytes.

```mermaid
sequenceDiagram
    participant Client
    participant Backend
    participant S3
    Client->>Backend: POST /upload-request (filename, content-type)
    Backend->>Backend: Generate pre-signed PUT URL (expires in 5 min)
    Backend-->>Client: Return pre-signed URL + object key
    Client->>S3: PUT file directly to pre-signed URL
    S3-->>Client: 200 OK
    Client->>Backend: POST /upload-complete (object key)
    Backend->>Backend: Save object key in database
```

This pattern has three advantages. First, the file upload goes directly from the client to S3, so your backend servers do not handle large file transfers. This saves bandwidth, memory, and CPU. Second, the pre-signed URL expires, so even if it leaks, the exposure window is short. Third, S3 handles multipart uploads, retries, and storage durability. Your backend stays simple.
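The core mechanism is a URL whose query string carries an expiry time plus a signature over the method, key, and expiry. The sketch below is a simplified illustration using a plain HMAC, not AWS Signature Version 4; in practice you would call your SDK's helper (e.g. boto3's `generate_presigned_url`). The key, endpoint, and query-parameter names here are hypothetical:

```python
import hashlib
import hmac
import time

# Hypothetical server-side signing key; in AWS this role is played by
# your IAM credentials and the SigV4 signing process.
SECRET_KEY = b"server-side-secret"
BASE_URL = "https://bucket.example.com"   # hypothetical bucket endpoint

def presign(method: str, key: str, expires_in: int = 300) -> str:
    """Build a time-limited signed URL (simplified stand-in for SigV4)."""
    expires_at = int(time.time()) + expires_in
    payload = f"{method}:{key}:{expires_at}".encode()
    sig = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{BASE_URL}/{key}?expires={expires_at}&sig={sig}"

def verify(method: str, key: str, expires_at: int, sig: str) -> bool:
    """What the storage service checks before honoring the request."""
    if time.time() > expires_at:
        return False   # expired: the leak-exposure window has closed
    payload = f"{method}:{key}:{expires_at}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

url = presign("PUT", "uploads/photo-123.jpg")
expires_at = int(url.split("expires=")[1].split("&")[0])
sig = url.split("sig=")[1]
print(url)
```

Because the signature covers the method and key, a leaked PUT URL cannot be reused for a GET, a different object, or a later expiry, which is exactly the property the real SigV4 scheme provides.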

For downloads, the same pattern applies. The backend generates a pre-signed GET URL, and the client downloads directly from S3. To serve files even faster, put a CDN (CloudFront, Cloudflare) in front of the S3 bucket. The CDN caches popular objects at edge locations close to users, reducing latency from hundreds of milliseconds to single digits.

CDN Integration

The standard architecture for serving static assets at scale combines object storage with a CDN:

```mermaid
graph LR
    U["User (Tokyo)"] --> E1["CDN Edge (Tokyo)"]
    E1 -->|cache miss| O["S3 Origin (us-east-1)"]
    E1 -->|cache hit| U
    U2["User (London)"] --> E2["CDN Edge (London)"]
    E2 -->|cache miss| O
    E2 -->|cache hit| U2
    style O fill:#222221,stroke:#c8a882,color:#ede9e3
    style E1 fill:#222221,stroke:#6b8f71,color:#ede9e3
    style E2 fill:#222221,stroke:#6b8f71,color:#ede9e3
```

On the first request, the CDN edge fetches the object from S3 and caches it. Subsequent requests for the same object are served from the edge, which is geographically close to the user. For popular content (profile photos, product images, CSS files), the cache hit rate typically exceeds 95%. You pay S3 retrieval costs only on cache misses.
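A 95% hit rate transforms both latency and origin load. The sketch below blends the two paths; the latency figures are illustrative assumptions, not measurements:

```python
# Blended read latency and origin load at a given CDN cache hit rate.
# Latency numbers are illustrative assumptions, not benchmarks.
hit_rate = 0.95
edge_latency_ms = 10      # assumed edge response time (nearby PoP)
origin_latency_ms = 200   # assumed cross-region fetch from S3

blended_ms = hit_rate * edge_latency_ms + (1 - hit_rate) * origin_latency_ms
print(blended_ms)         # 19.5 ms average, dominated by the fast path

requests_per_day = 1_000_000
origin_fetches = requests_per_day * (1 - hit_rate)
print(round(origin_fetches))  # 50,000 S3 GETs instead of 1,000,000
```

The same arithmetic explains the cost claim: S3 GET charges apply only to the 5% of requests that miss.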

Distributed File Systems: HDFS and GFS

For big data workloads (log processing, analytics, machine learning training), the access pattern is different. You need to read and write very large files (gigabytes to terabytes) sequentially, and you want computation to move to the data rather than the other way around.

HDFS (Hadoop Distributed File System) splits large files into fixed-size blocks (128 MB default), replicates each block across 3 nodes, and tracks block locations in a central NameNode. Computation frameworks like MapReduce and Spark process data on the nodes where it lives, avoiding network transfer.
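The block math is worth internalizing: a file's footprint on the cluster is its block count times the replication factor. A quick sketch using the defaults above:

```python
import math

BLOCK_SIZE = 128 * 1024**2   # HDFS default block size: 128 MB
REPLICATION = 3              # default replication factor

file_size = 10 * 1024**3     # e.g. a 10 GB log file
blocks = math.ceil(file_size / BLOCK_SIZE)
raw_storage = file_size * REPLICATION

print(blocks)                    # 80 blocks, spread across the cluster
print(raw_storage / 1024**3)     # 30.0 GB of raw cluster capacity consumed
```

Each of those 80 blocks can be processed by a different worker in parallel, which is the whole point of the design.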

HDFS was inspired by Google's GFS paper (2003). The design optimizes for throughput over latency: sequential reads of entire blocks are fast, but random access to individual bytes is not supported. The NameNode is a single point of failure (mitigated by standby NameNodes in production), and the system is designed for append-only writes. You do not edit files in place.

HDFS and object storage serve different needs. Object storage is for serving individual files to many users (a profile photo, a PDF). HDFS is for processing massive datasets across a compute cluster. In modern architectures, the line is blurring. Systems like Apache Iceberg and Delta Lake store data as Parquet files in S3, combining the durability of object storage with the query patterns of big data.

Systems Thinking Lens

Storage is a system with competing feedback loops. The cost loop pushes data toward colder, cheaper tiers. The latency loop pulls frequently accessed data toward hotter, faster tiers. The durability loop demands replication, which multiplies cost. A well-designed storage strategy does not pick one tier. It classifies data by access pattern and moves it through tiers automatically (S3 Intelligent-Tiering does this for $0.0025 per 1,000 objects per month).

The leverage point is lifecycle policy, not storage selection. The decision of where data starts is less important than the rules that govern where it moves over time.
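As a sketch, an S3 lifecycle rule encoding that movement might look like the following (the rule ID, prefix, and day thresholds are hypothetical choices, not recommendations):

```json
{
  "Rules": [
    {
      "ID": "age-out-uploads",
      "Filter": { "Prefix": "uploads/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
```

Once attached to the bucket, the policy runs without application involvement: objects drift toward cheaper tiers on schedule, and no deploy is needed to change the rules.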

Assignment

Users of your application upload profile photos. Currently, photos are uploaded to your backend server, which saves them to local disk. The application has 500,000 users, and about 10,000 photos are uploaded per day. Average photo size is 2 MB.

Design a new upload flow using object storage. Answer these questions:

  1. Where does the image go? Which storage service and tier do you choose, and why?
  2. How does the client get the upload URL? Walk through the full request flow from "user clicks upload" to "photo is stored."
  3. How do you serve the photo fast to users around the world? Draw the read path.
  4. Estimate the monthly storage cost after one year, assuming no deletions and 2 MB average size with 10,000 uploads per day.