Why Netflix Doesn’t Trust Auto-Increment IDs: The Untold Power of UUIDs in a Distributed World


May 12, 2025 - 16:29

At first glance, an ID might seem like the most boring part of your application. It's just a unique identifier, right?

But if you're building systems that scale - across regions, across teams, across microservices - your ID generation strategy can be the silent hero or the hidden landmine. And that’s exactly why companies like Netflix, Twitter, Stripe, and Shopify have ditched traditional auto-incrementing IDs in favor of UUIDs and Snowflake-like systems.

Let’s explore why UUIDs are not just random gibberish, but a critical architectural decision in high-scale systems - and what lessons we can steal from the giants.

The Problem With Auto-Increment IDs

Auto-incrementing integers are deceptively simple and convenient. They work great when:

  • You have a single database.
  • You can guarantee a single source of truth.
  • You're not worried about collisions across systems.

But modern systems don’t live in that world anymore. The problems start to show when:

  • You scale horizontally (e.g., microservices writing to different DBs).
  • You have geo-redundant deployments.
  • You ingest millions of concurrent events (e.g., Netflix's stream logs, Stripe’s transactions, Shopify’s orders).

What breaks?

❌ Collisions and Race Conditions

Multiple databases can't safely share an auto-increment counter without introducing locking or orchestration.

❌ Poor Mergeability

Data from separate systems (say, multiple regions or services) becomes a nightmare to merge.
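The collision and merge problems above can be reproduced in a few lines. This is a minimal sketch, not a real database: `Shard`, `auto_increment`, `merge`, and `demo` are hypothetical names invented for illustration, and each shard is just an in-memory dict with its own ID source.

```python
import itertools
import uuid


class Shard:
    """Hypothetical stand-in for one database with its own ID source."""

    def __init__(self, name, id_factory):
        self.name = name
        self._next_id = id_factory
        self.rows = {}

    def insert(self, payload):
        row_id = self._next_id()
        self.rows[row_id] = payload
        return row_id


def auto_increment():
    """Return a fresh counter starting at 1, like a per-database sequence."""
    counter = itertools.count(1)
    return lambda: next(counter)


def merge(*shards):
    """Merge rows from all shards, recording IDs that are already taken."""
    merged, collisions = {}, []
    for shard in shards:
        for row_id, payload in shard.rows.items():
            if row_id in merged:
                collisions.append(row_id)
            else:
                merged[row_id] = payload
    return merged, collisions


def demo(make_id_factory):
    """Write two rows to each of two independent shards, then merge them."""
    us = Shard("us-east", make_id_factory())
    eu = Shard("eu-west", make_id_factory())
    for name in ("alice", "bob"):
        us.insert(name)
    for name in ("chloe", "dmitri"):
        eu.insert(name)
    return merge(us, eu)


merged_int, collisions_int = demo(auto_increment)
merged_uuid, collisions_uuid = demo(lambda: uuid.uuid4)

print(collisions_int)    # [1, 2]
print(len(merged_uuid))  # 4
```

With per-shard counters, both regions hand out IDs 1 and 2, so half the rows are lost or need re-keying at merge time. With independently generated UUIDs, all four rows survive.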

❌ Predictability

Auto-increment IDs can leak sensitive information:

  • How many users have signed up
  • Volume of orders or transactions
  • Sequence of operations

In fact, I once worked with a system where simply knowing the current user ID could let you enumerate every customer in the database with /users/{id}.

Enter UUIDs: Globally Unique by Design

A UUID (Universally Unique Identifier) is a 128-bit value used to uniquely identify information in distributed systems - no central authority needed.

It looks like this:

123e4567-e89b-12d3-a456-426614174000

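That apparent randomness is not for show. A random (v4) UUID carries 122 random bits, so the chance of two independently generated IDs colliding is negligible in practice. It’s your ticket to generating globally unique IDs without coordination.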
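To put a rough number on "without coordination", the sketch below generates a batch of random UUIDs with Python's standard `uuid` module and estimates the collision probability with the birthday approximation. The helper name `approx_collision_probability` and the one-trillion figure are illustrative assumptions, not something from a real system.

```python
import uuid

# A v4 UUID has 128 bits; 6 of them are fixed version/variant bits.
RANDOM_BITS = 122


def approx_collision_probability(n_ids, bits=RANDOM_BITS):
    """Birthday approximation p ~ n^2 / (2 * 2^bits), valid while p is small."""
    return n_ids ** 2 / (2 * 2 ** bits)


# Generate IDs with no shared state and check that they are all distinct.
sample = {uuid.uuid4() for _ in range(100_000)}
print(len(sample))

# Estimated chance of any collision among one trillion random UUIDs.
print(approx_collision_probability(10 ** 12))
```

Even at a trillion IDs, the estimate stays far below one in a trillion, which is why independent generators can skip the "ask for the next ID" round trip.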

There are different versions of UUIDs - each designed with a specific use-case in mind.

UUID Version | Description                    | Use Case
v1           | Timestamp + MAC address        | Embeds creation time, but leaks host info and does not sort by time byte-wise
v4           | Randomly generated             | Most common, but unordered
v5           | SHA-1 hash of namespace + name | Deterministic UUIDs
v7 (new)     | Timestamp-first, random suffix | Time-sortable; well suited to databases & logs
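The versions in the table can be inspected with Python's standard `uuid` module. This is a small sketch: the name "example.com" is an arbitrary illustration, and v7 is left out here because standard-library support for it depends on the Python release.

```python
import uuid

# The sample UUID shown earlier carries its version in the third group ("12d3").
example = uuid.UUID("123e4567-e89b-12d3-a456-426614174000")
print(example.version)  # 1

v1 = uuid.uuid1()  # timestamp + node identifier
v4 = uuid.uuid4()  # random
# v5 is a SHA-1 hash of a namespace UUID and a name, so it is repeatable.
v5_a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
v5_b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")

print(v1.version, v4.version, v5_a.version)  # 1 4 5
print(v5_a == v5_b)                          # True
```

The determinism of v5 is the useful part: two services can derive the same ID for the same name without talking to each other.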

Real-World Case Study: How Netflix Handles IDs

Netflix’s backend is a polyglot architecture - microservices, Kafka streams, data lakes, global deployment zones, all generating billions of events per day.

They don’t use UUID v4s directly.

Instead, they use a Snowflake-inspired ID generation system. Snowflake was pioneered by Twitter and is now a widely copied pattern for distributed ID generation.

A typical Netflix ID might be composed of:

  • 41 bits of timestamp (milliseconds since a custom epoch, roughly 69 years of range)
  • 10 bits for machine ID
  • 12 bits for sequence (to avoid collision in the same millisecond)

This format:

  • Keeps IDs unique across all instances.
  • Makes them time-sortable for logs, metrics, and debugging.
  • Avoids the database indexing problem that plagues UUID v4.

⚠️ Fun fact: With 12 sequence bits, the original Snowflake design could generate 2^12 = 4096 unique IDs per machine per millisecond.
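The 41/10/12 layout described above can be sketched as a small generator. This is an illustrative sketch of the Snowflake pattern, not Netflix's or Twitter's actual code: the class name, the `clock` parameter, and the epoch (2020-01-01 UTC, chosen arbitrarily) are all assumptions. The 63 bits used leave the top bit free, so the result fits in a signed 64-bit integer.

```python
import threading
import time

EPOCH_MS = 1_577_836_800_000  # assumed custom epoch: 2020-01-01T00:00:00Z
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1  # 4095


class SnowflakeGenerator:
    """Hypothetical Snowflake-style generator: timestamp | machine | sequence."""

    def __init__(self, machine_id, clock=lambda: time.time_ns() // 1_000_000):
        if not 0 <= machine_id < (1 << MACHINE_BITS):
            raise ValueError("machine_id must fit in 10 bits")
        self._machine_id = machine_id
        self._clock = clock  # returns Unix time in milliseconds
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # All 4096 slots used: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self._machine_id << SEQUENCE_BITS)
                | self._sequence
            )


def decode(snowflake_id):
    """Split an ID back into its three fields."""
    return {
        "timestamp_ms": (snowflake_id >> (MACHINE_BITS + SEQUENCE_BITS)) + EPOCH_MS,
        "machine_id": (snowflake_id >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1),
        "sequence": snowflake_id & MAX_SEQUENCE,
    }


gen = SnowflakeGenerator(machine_id=1)
first, second = gen.next_id(), gen.next_id()
print(decode(first))
print(second > first)  # True
```

Uniqueness here rests entirely on every running instance having a distinct `machine_id`; assigning those IDs is the one piece of coordination the pattern still needs, and it happens at deploy time rather than on every insert.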

Why They Don’t Use Auto-Increment

Auto-incremented IDs simply don’t work in an architecture with:

  • Dozens of microservices writing independently
  • Global regions that can go into active-active mode
  • Systems that must work even when partially disconnected

Netflix has no central place to "ask for the next ID." Doing so would create latency, single points of failure, and tight coupling.

Instead, every instance can independently generate its own IDs - and as long as each instance has a distinct machine ID, they're still unique and roughly time-ordered (strictly ordered only within a single instance).

UUID v7: The Future Is Timestamped

UUID v7 is gaining popularity because it solves many of the long-standing issues with UUID v4:

  • Ordered generation = better database index locality
  • Encodes time = more useful for logs, debugging, analytics
  • Still decentralized = no coordination needed

If you're designing a new system today, strongly consider UUID v7 or a Snowflake-style ID generator over v4 or auto-increment.

Tip: If you're using PostgreSQL, UUID v7 can outperform v4 for large-scale insert-heavy workloads, because time-ordered keys land on the rightmost pages of the B-tree index, which reduces page splits and index bloat.
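To make the "timestamp-first, random suffix" idea concrete, here is a minimal sketch that assembles a v7-style UUID by hand, following the layout in RFC 9562: a 48-bit Unix timestamp in milliseconds, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The function name `uuid7_sketch` is an assumption for illustration; in production, prefer a maintained library or your database's built-in generator.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUID with the v7 bit layout (illustrative, not a library API)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits
    rand_b = rand & ((1 << 62) - 1)               # 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)


older = uuid7_sketch(unix_ms=1_000)
newer = uuid7_sketch(unix_ms=2_000)
print(older.version)             # 7
print(older < newer)             # True
print(str(older) < str(newer))   # True
```

Because the timestamp occupies the most significant bits, IDs created in different milliseconds sort by creation time both as integers and as text. IDs created within the same millisecond are ordered only by their random bits in this sketch.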

Bonus: When UUIDs Go Wrong

Here’s a story from a startup I consulted with:

They used UUID v4s for every table. Three years in, as the database grew, query performance dropped off a cliff. The culprit? Random insertion pattern caused by v4 UUIDs.

Every insert was touching a different part of the index tree, resulting in write amplification and cache inefficiency. They ended up migrating to ULIDs (which, like UUID v7, put a timestamp first) to regain performance.
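The difference in insertion pattern can be simulated roughly. In the sketch below, a sorted Python list is a crude stand-in for a B-tree index, and the key generators are assumptions: 128 random bits imitate v4, and a millisecond counter followed by 80 random bits imitates a timestamp-first ID. It measures how often a new key lands at the right-hand edge of the index. It does not measure real page splits or cache behavior.

```python
import bisect
import random


def fraction_appended(keys):
    """Insert keys into a sorted list; return the share that land at the end."""
    index = []
    appended = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            appended += 1
        index.insert(pos, key)
    return appended / len(keys)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5_000

# Stand-in for v4: every key is 128 random bits.
random_keys = [rng.getrandbits(128) for _ in range(n)]

# Stand-in for timestamp-first IDs: one key per millisecond, random low bits.
time_ordered_keys = [(ms << 80) | rng.getrandbits(80) for ms in range(n)]

print(fraction_appended(time_ordered_keys))  # 1.0
print(fraction_appended(random_keys))
```

Time-ordered keys always append, so writes stay concentrated on the newest part of the index. Random keys almost never append, which spreads writes across the whole structure.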

Lesson: Just because a UUID is “unique” doesn’t mean it’s “smart.”

Final Thoughts

IDs are often an afterthought in design - but they shouldn’t be. Netflix, Twitter, and others teach us that at scale, even your identifiers must be thoughtfully engineered.

If you’re building distributed systems, event-driven pipelines, or global-scale SaaS platforms, ditch auto-increment and random v4s. Embrace timestamped, sortable, decentralized IDs.

Your future self (and your infrastructure team) will thank you.

Footnote: Netflix doesn’t publish the full internals of their ID generation, but numerous engineering talks, job postings, and system diagrams indicate that they use a variant of the Snowflake pattern for large-scale event tracking.

Further Reading: