$100K/day cloud bill isn't a Bug – it's by Design


May 8, 2025 - 21:09

Cloud platforms are built to scale. That’s their core feature — and their hidden risk. Every request to a cloud function, database, or storage API has a cost. If enough requests arrive, even legitimate-looking ones, the backend will scale automatically and incur that cost — and the account owner will receive the bill.

This is not an exception. It is the intended behavior.

Real Incidents of Cost-Based Abuse

Several public cases illustrate how cloud billing can be exploited or spiral out of control:

$100K in 24 hours via Firebase – A WebGL hosting app saw a sudden traffic spike and was billed over $100,000. The cloud service scaled perfectly. No failure occurred — other than financial.
One public file in Firebase = $98K – A single shared file led to massive egress usage and a near six-figure bill.
GCP DDoS → $100K+ projected bill – Valid-looking requests during a DDoS ran up charges with no way to stop them quickly.

These examples — and many others — follow the same pattern: no security breach, just usage that scaled and billed exactly as designed.

Why Protections Often Fail

Rate limits are global and imprecise

Most limits apply per service, not per client. For example, a database may be capped at 100 queries per second. If 100 legitimate clients and 1,000,000 automated attackers compete for that shared cap, legitimate users may not be served at all.
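To put a number on the example above, here is a rough sketch of how little of a shared cap reaches legitimate clients. It assumes a deliberately simple model that is not from the original article: every client sends at the same rate and the service picks requests uniformly at random. The function name `legit_share` is hypothetical.

```python
def legit_share(cap_qps: float, legit_clients: int, attackers: int) -> float:
    """Queries/second served to all legitimate clients combined.

    Illustrative model (an assumption): all clients send at the same rate
    and the capped service serves requests uniformly at random.
    """
    total_clients = legit_clients + attackers
    return cap_qps * legit_clients / total_clients


share = legit_share(cap_qps=100, legit_clients=100, attackers=1_000_000)
print(f"{share:.4f} queries/second shared by all legitimate clients")
```

Under these assumptions, the 100 legitimate clients together get roughly one query every 100 seconds, which is why a global cap protects the bill but not the users.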

Limits are hard to balance across services

Every backend (DB, API, cache) needs separate tuning. Limits that are too tight cause outages; limits that are too loose allow runaway costs. In distributed systems, striking this balance is nearly impossible.

Budget alerts are too late

Billing data can lag by 15 minutes to several hours. By the time an alert arrives, thousands of dollars may already have been spent.
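As a back-of-the-envelope check on that claim, the sketch below estimates how much is spent before a delayed alert can fire. It assumes spending accrues evenly at the $100K/day rate from the title. The lag values beyond 15 minutes are illustrative picks for "several hours", and `spend_during_lag` is a hypothetical helper.

```python
def spend_during_lag(daily_rate_usd: float, lag_minutes: float) -> float:
    """Money spent while billing data is still catching up.

    Assumes a constant spend rate over the day (a simplification).
    """
    per_minute = daily_rate_usd / (24 * 60)
    return per_minute * lag_minutes


for lag in (15, 60, 240):  # minutes; illustrative values
    print(f"{lag:>3} min lag: ${spend_during_lag(100_000, lag):,.2f}")
```

At that rate, even the best-case 15-minute lag costs over a thousand dollars before anyone is notified, and a four-hour lag costs well over ten thousand.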

Attackers look like users

Tokens can be pulled from apps or frontends. Even time-limited credentials, such as AWS pre-signed S3 URLs, can simply be requested again by any client the attacker controls.

Becoming a “legitimate client” is often as simple as making an HTTPS request.

What Could Help?

To protect against cost-based abuse, three mechanisms can be combined:

1. Per-client real-time quota enforcement

Each client gets a monetary quota. Every request (log, DB op, message) deducts its cost from that quota. Clients near their limit are automatically slowed or paused, without affecting others.
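A minimal in-memory sketch of this idea follows. It is not the author's implementation: the class name, the 80% slow-down threshold, and the use of integer micro-dollars (to avoid floating-point drift) are all assumptions made for illustration. A production version would need shared, persistent state.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    SLOW = "slow"    # client is near its limit: throttle it
    PAUSE = "pause"  # client would exceed its limit: reject


class QuotaEnforcer:
    """Tracks spend per client in integer micro-USD (hypothetical design)."""

    def __init__(self, limit_micro_usd: int, slow_at_percent: int = 80):
        self.limit = limit_micro_usd
        self.slow_at_percent = slow_at_percent
        self.spent: dict[str, int] = {}

    def charge(self, client_id: str, cost_micro_usd: int) -> Decision:
        used = self.spent.get(client_id, 0)
        if used + cost_micro_usd > self.limit:
            return Decision.PAUSE  # nothing is deducted for rejected requests
        used += cost_micro_usd
        self.spent[client_id] = used
        if used * 100 >= self.limit * self.slow_at_percent:
            return Decision.SLOW
        return Decision.ALLOW


enforcer = QuotaEnforcer(limit_micro_usd=1_000_000)  # $1.00 per client
print(enforcer.charge("client-a", 500_000))  # half the quota used
print(enforcer.charge("client-a", 300_000))  # reaches the 80% threshold
print(enforcer.charge("client-a", 300_000))  # would exceed the limit
print(enforcer.charge("client-b", 300_000))  # other clients are unaffected
```

The key property is that each decision depends only on that client's own spend, so one exhausted client never consumes another client's allowance.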

2. Proof-of-work before provisioning

New clients must solve a computational puzzle before they get access. The cost of solving it is:

Negligible (milliseconds) under normal use — for both real users and attackers
Increased during abuse — e.g., if mass registrations occur

The mechanism uses a pool of bcrypt hashes with a dynamic seed, difficulty, and verification target. More details here
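The article's bcrypt-pool scheme is not specified in enough detail to reproduce here, so the sketch below swaps in a different, well-known technique: a hashcash-style puzzle over SHA-256. It only illustrates the general shape (server-issued seed, adjustable difficulty, cheap verification). All function names, the base difficulty of 8 bits, and the rule "one extra bit each time the registration rate doubles" are assumptions.

```python
import hashlib
import os
from itertools import count


def make_seed() -> bytes:
    """Server side: a fresh random seed per registration attempt."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def verify(seed: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Server side: a single hash, regardless of difficulty."""
    digest = hashlib.sha256(seed + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


def solve(seed: bytes, difficulty_bits: int) -> int:
    """Client side: brute-force search, about 2**difficulty_bits hashes."""
    for nonce in count():
        if verify(seed, nonce, difficulty_bits):
            return nonce


def difficulty_for(registrations_last_minute: int, base_bits: int = 8,
                   normal_rate: int = 10, max_bits: int = 24) -> int:
    """One extra bit (double the expected work) per doubling of the rate."""
    if registrations_last_minute <= normal_rate:
        return base_bits
    extra = (registrations_last_minute // normal_rate).bit_length() - 1
    return min(base_bits + extra, max_bits)


seed = make_seed()
bits = difficulty_for(registrations_last_minute=5)
nonce = solve(seed, bits)
print(bits, verify(seed, nonce, bits))
```

The asymmetry is the point of the design: verification always costs the server a single hash, while the expected client cost doubles with every added difficulty bit.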

3. Optional cleanup and usage-aware control

Inactive clients can be dropped. Clients near quota can trigger backend checks (how fast was the quota used, is the usage organic, etc.). Note: this is app-specific and may require custom business logic.