Authors: Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson Hsieh, et al.
Published: 2006, OSDI
Link: research.google/pubs/pub27898
What's the Problem?
Google needed a scalable, highly available storage system that could handle petabytes of structured data across thousands of machines, serving diverse workloads like web indexing, Google Earth, and personalization data.
Core Idea
Bigtable introduced a sparse, distributed, persistent multidimensional sorted map, where:
(row key, column key, timestamp) → value
Rows are lexicographically ordered. Columns are grouped into families. Versions are timestamped.
This flexible model allows efficient range scans, time-versioned data, and horizontally scalable reads/writes across commodity hardware.
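To make the model concrete, here is a toy, single-process sketch of that map in Python. The class `ToyBigtable`, its method names, and the sort-at-scan shortcut are illustrative assumptions on my part, not the paper's API (the real system is distributed and persistent, and its client API is C++); the sample rows mirror the paper's Webtable example, with reversed URL hostnames as row keys:

```python
class ToyBigtable:
    """Toy in-memory model of Bigtable's logical map:
    (row, column, timestamp) -> value, with rows kept in sorted order."""

    def __init__(self):
        # cells[row][column] is a list of (timestamp, value) pairs,
        # kept newest-first, mirroring Bigtable's per-cell versions.
        self.cells = {}

    def put(self, row, column, timestamp, value):
        versions = self.cells.setdefault(row, {}).setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(reverse=True)  # newest version first

    def get(self, row, column, timestamp=None):
        # Return the newest version at or before `timestamp`
        # (or the newest overall when no timestamp is given).
        versions = self.cells.get(row, {}).get(column, [])
        for ts, value in versions:
            if timestamp is None or ts <= timestamp:
                return ts, value
        return None

    def scan(self, start_row, end_row):
        # Lexicographic row order makes range scans over related
        # rows (e.g., reversed URLs sharing a domain) cheap.
        for row in sorted(self.cells):
            if start_row <= row < end_row:
                yield row, self.cells[row]


t = ToyBigtable()
t.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
t.put("com.cnn.www", "contents:", 5, "<html>...v5")
t.put("com.cnn.www", "contents:", 6, "<html>...v6")
print(t.get("com.cnn.www", "contents:"))     # newest: (6, '<html>...v6')
print(t.get("com.cnn.www", "contents:", 5))  # as of ts 5: (5, '<html>...v5')
```

Because rows are sorted, clients can choose row keys so that related data clusters together, which is why the paper stores URLs with their hostname components reversed.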
How It Works (Short Version)
- Data Model: schema-light rather than schema-free; column families are declared up front, but columns within a family are created on the fly, which fits semi-structured data.
- Storage: SSTables (Sorted String Tables) stored on GFS.
- Write Path: mutations are logged to a commit log and applied to an in-memory memtable, which is periodically frozen and flushed to an immutable SSTable; reads merge the memtable with the SSTables (see the sketch after this list).
- Tablet Splits: Rows are grouped into tablets (contiguous row ranges), which are dynamically split and migrated across tablet servers as they grow.
- Coordination: Uses Chubby (Google's Paxos-based lock service) to guarantee a single active master, track live tablet servers, and store bootstrap metadata.
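The memtable/SSTable interplay above is essentially a log-structured merge design. Below is a minimal single-node sketch under stated assumptions: `TabletServerSketch`, `flush_threshold`, and the method names are hypothetical, and the commit log, GFS storage, block indexes, and the paper's merging/major compactions are all omitted:

```python
import bisect

class SSTable:
    """Immutable, sorted run of key/value pairs; a stand-in for
    Bigtable's on-disk SSTable format (real SSTables live on GFS
    and carry a block index for binary-search lookups)."""

    def __init__(self, items):          # items: sorted (key, value) pairs
        self.keys = [k for k, _ in items]
        self.values = [v for _, v in items]

    def get(self, key):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None

class TabletServerSketch:
    """Writes hit the memtable; when it grows past a threshold it is
    frozen and flushed to a new immutable SSTable (the paper's 'minor
    compaction'). Reads merge the memtable and SSTables, newest first."""

    def __init__(self, flush_threshold=4):
        self.memtable = {}      # mutable, in-memory
        self.sstables = []      # immutable, newest last
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self._minor_compaction()

    def _minor_compaction(self):
        self.sstables.append(SSTable(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        if key in self.memtable:            # newest data wins
            return self.memtable[key]
        for sst in reversed(self.sstables): # then newest SSTable first
            v = sst.get(key)
            if v is not None:
                return v
        return None
```

Because SSTables are immutable, flushes never block readers, and the paper's periodic major compactions merge SSTables to bound how many files a read must consult.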
What I Liked
- The column-family model anticipates today's needs for schema evolution and sparse data.
- The use of timestamped cells makes versioned datasets first-class.
- Clear modularity: storage (SSTable), coordination (Chubby), and serving (tablet servers) are separated for scaling.
Mental Models or Lessons
- Think of Bigtable as a distributed, time-aware sorted map (not a hash map: sorted rows are what make range scans cheap) that serves both latency-sensitive lookups and throughput-oriented analytical scans.
- This paper shows that a simple, purpose-built design (e.g., immutable sorted string tables) can scale further than the general-purpose DBMSs of its era.
- Separating responsibilities along clean boundaries (storage, coordination, serving) is a design that persists in modern data systems; HBase, for example, mirrors it with HDFS, ZooKeeper, and region servers.