Optimized Email lookup for millions of users

Apr 12, 2025 - 19:54

Fast, Scalable, and Reliable: how to solve the email registration problem at scale for millions of users without compromising on performance or data integrity.

So, here’s the deal. When you’re dealing with millions of users, one tiny problem starts eating your system alive: checking if an email already exists. It sounds small, but if you’re doing it 10 million times? Boom, memory gone. Performance drops. Not good.

But don’t worry. We got this. Let’s mix a bit of RedisBloom and PostgreSQL, and build a system that’s fast, scalable, and doesn't chew up all your memory.

RedisBloom: The probabilistic memory-saver

Memory efficiency: It doesn’t store the full emails. It just uses a bit array and a few hash functions (Bloom filter logic). So, 10 million emails fit in roughly 30MB of memory. Yup. Not kidding.
Crazy fast: Checking an email takes constant time, O(1), no matter how many emails are already in the filter. Like blink-of-an-eye fast.
Lightweight: No need to store actual emails in Redis. Less load, more speed.
Downside? Well... false positives are possible (~1%, depending on how the filter is sized). False negatives are not: if the filter says “no”, the email is definitely not in there.
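To make the “bit array and some math magic” a bit more concrete, here is a toy in-memory Bloom filter in TypeScript. Treat it as a sketch only: the class name TinyBloom, the double-hashing trick, and the sizes are assumptions picked for illustration, not how RedisBloom is implemented internally.

```typescript
import { createHash } from "node:crypto";

// Toy Bloom filter for illustration; RedisBloom's real implementation differs.
class TinyBloom {
  private bits: Uint8Array;
  private sizeBits: number;
  private numHashes: number;

  constructor(sizeBits: number, numHashes: number) {
    this.sizeBits = sizeBits;
    this.numHashes = numHashes;
    this.bits = new Uint8Array(Math.ceil(sizeBits / 8));
  }

  // Derive numHashes bit positions from one SHA-256 digest (double hashing).
  private positions(item: string): number[] {
    const digest = createHash("sha256").update(item).digest();
    const h1 = digest.readUInt32BE(0);
    const h2 = (digest.readUInt32BE(4) | 1) >>> 0; // force an odd, unsigned step
    const out: number[] = [];
    for (let i = 0; i < this.numHashes; i++) {
      out.push((h1 + i * h2) % this.sizeBits);
    }
    return out;
  }

  // Set every bit position derived from the item.
  add(item: string): void {
    for (const pos of this.positions(item)) {
      this.bits[pos >>> 3] |= 1 << (pos & 7);
    }
  }

  // false means "definitely not added"; true means "maybe added".
  mightContain(item: string): boolean {
    return this.positions(item).every(
      (pos) => (this.bits[pos >>> 3] & (1 << (pos & 7))) !== 0
    );
  }
}

const filter = new TinyBloom(1024, 7);
filter.add("alice@example.com");
console.log(filter.mightContain("alice@example.com")); // true: added items are never missed
```

Notice that only bits are stored, never the email itself. An item that was added always comes back as “maybe”, while an item that was never added usually comes back as “no” but can occasionally collide with bits set by other items. That collision is the false positive.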

RedisBloom’s cool, but to be honest, a 1% false-positive rate is still a problem: one in a hundred new users would be told their email is taken. So, when RedisBloom thinks the email is already there, we just double-check in PostgreSQL.

PostgreSQL: The Trusty Double Checker

Accurate: It’s our final source of truth.
Fast checks: We use SELECT 1 instead of pulling the whole email to keep things lean. Keep email_hash indexed (a UNIQUE constraint does the job) so the lookup stays quick.

The Workflow: How it all comes together

Check RedisBloom: See if the email might exist. If RedisBloom says no, we’re good: the email is definitely new. Move forward and register.
Check PostgreSQL: If RedisBloom says “maybe”, we run a simple SELECT 1 in Postgres. Just to be safe.
Insert into Both: If the email doesn’t exist in Postgres, insert it into both Postgres and RedisBloom to keep them in sync.
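The three steps above can be sketched without any infrastructure. In this TypeScript example, in-memory stand-ins play the roles of RedisBloom and Postgres; every name here (Filter, ExactFilter, AlwaysMaybeFilter, Registry) is hypothetical and exists only to show the control flow, including what happens when the filter gives a false positive. Hashing is left out to keep it short.

```typescript
type Outcome = "registered" | "duplicate";

// Minimal contract a Bloom-style filter has to satisfy.
interface Filter {
  mightContain(key: string): boolean;
  add(key: string): void;
}

// Exact filter backed by a Set: never a false positive, handy as a baseline.
class ExactFilter implements Filter {
  private seen = new Set<string>();
  mightContain(key: string): boolean { return this.seen.has(key); }
  add(key: string): void { this.seen.add(key); }
}

// Worst-case filter: answers "maybe" for everything (all false positives).
class AlwaysMaybeFilter implements Filter {
  mightContain(_key: string): boolean { return true; }
  add(_key: string): void {}
}

class Registry {
  dbLookups = 0;                  // how often the "database" was consulted
  private db = new Set<string>(); // stands in for the Postgres table
  private filter: Filter;

  constructor(filter: Filter) {
    this.filter = filter;
  }

  register(email: string): Outcome {
    // Step 1: a "no" from the filter is definitive, so skip the database.
    if (this.filter.mightContain(email)) {
      // Step 2: "maybe" means confirm against the source of truth.
      this.dbLookups++;
      if (this.db.has(email)) return "duplicate";
    }
    // Step 3: insert into both stores to keep them in sync.
    this.db.add(email);
    this.filter.add(email);
    return "registered";
  }
}

const registry = new Registry(new ExactFilter());
console.log(registry.register("a@example.com")); // "registered"
console.log(registry.register("a@example.com")); // "duplicate"
```

Swap in AlwaysMaybeFilter and every registration still gets the right answer. It just costs one database lookup each time, which is exactly the price of a false positive.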

Why it will work

Efficient: RedisBloom handles most checks. Saves RAM. Super snappy.
Reliable: PostgreSQL catches the rare false positive. You get 100% accuracy.
Scalable: Works smoothly with 100 users or 100 million. Future-proof.
Secure: We hash emails (SHA-256) so we’re not storing plain-text emails anywhere.

Real World Impact

If you’re dealing with around 10 million emails, using RedisBloom can literally save you hundreds of megabytes of memory. Instead of storing every email directly, RedisBloom squeezes all that down into around 30MB. That’s crazy efficient.
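If you want to sanity-check the memory figure, the textbook Bloom filter formulas give the minimum size for a target false-positive rate. The TypeScript sketch below uses those standard formulas; the function name bloomSizing is made up for this example, and the real footprint in RedisBloom depends on how the filter is reserved and whether it has had to grow, so read the result as a lower bound rather than a measurement.

```typescript
interface BloomSizing {
  bits: number;      // size of the bit array
  megabytes: number; // same size in decimal megabytes
  hashes: number;    // number of hash functions
}

// Classic Bloom filter sizing:
//   bits   = -n * ln(p) / (ln 2)^2
//   hashes = (bits / n) * ln 2
function bloomSizing(items: number, falsePositiveRate: number): BloomSizing {
  const ln2 = Math.LN2;
  const bits = Math.ceil((-items * Math.log(falsePositiveRate)) / (ln2 * ln2));
  const hashes = Math.max(1, Math.round((bits / items) * ln2));
  return { bits, megabytes: bits / 8 / 1_000_000, hashes };
}

const sizing = bloomSizing(10_000_000, 0.01);
console.log(`${sizing.megabytes.toFixed(1)} MB with ${sizing.hashes} hash functions`);
```

For 10 million items at a 1% false-positive rate this works out to about 12 MB and 7 hash functions, so a budget of around 30MB leaves comfortable headroom.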

On the other hand, PostgreSQL still does its job reliably, but only when we really need it. We’re not hammering the database for every check—only when RedisBloom gives us a “maybe.” And even then, we’re just doing a lightweight SELECT 1, which is super fast and doesn’t pull any unnecessary data.

For performance: each RedisBloom check is almost instant, like around 0.001 to 0.005 milliseconds. PostgreSQL might take a tad longer, maybe 0.005 to 0.02 milliseconds, but again, that’s only for a small chunk of the checks. These are rough, best-case figures for the server-side work; network round trips add more. Combine them and even 10 million checks finish in under a few minutes, as long as you pipeline or batch requests so round trips don’t dominate. So:

You’re saving a lot of memory.
You’re not burning your database with unnecessary queries.
Your system stays fast and clean even at a huge scale.

Boom. That’s what good architecture looks like.
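As a rough check on the timing claim, here is the back-of-the-envelope arithmetic using the slow end of each range quoted above. It assumes the checks run one after another and that 1% of them fall through to Postgres; that 1% is an illustrative assumption, since the real share also includes genuine duplicates.

```latex
\begin{align*}
T_{\text{bloom}} &= 10^{7} \times 0.005\ \text{ms} = 5 \times 10^{4}\ \text{ms} = 50\ \text{s} \\
T_{\text{pg}}    &= (0.01 \times 10^{7}) \times 0.02\ \text{ms} = 2 \times 10^{3}\ \text{ms} = 2\ \text{s} \\
T_{\text{total}} &= T_{\text{bloom}} + T_{\text{pg}} \approx 52\ \text{s}
\end{align*}
```

So even at the pessimistic end of those numbers, the whole batch stays under a minute of server-side work.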

SHA-256: Keeping emails private

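Don’t save raw emails. Before storing anything, we hash each email using SHA-256, so even if someone breaks into your DB, they’ll find hashes instead of readable email addresses. Keep in mind that unsalted hashes of guessable emails can still be matched against a known list, so treat this as a layer of protection, not a guarantee. Here’s the code that ties it all together.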

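import { createHash } from 'node:crypto';
import { createClient } from 'redis';
import { Pool } from 'pg';

// Assumed setup: a node-redis client (call redis.connect() at startup) pointing at a
// Redis server with the RedisBloom module loaded, and a pg connection pool.
const redis = createClient();
const pg = new Pool();

// SHA-256 hex digest of the email, so raw addresses are never stored.
function hashEmail(email: string): string {
    return createHash('sha256').update(email).digest('hex');
}

async function emailExists(email: string): Promise<boolean> {
    const hash = hashEmail(email);
    // A "no" from the Bloom filter is definitive, so we can skip Postgres.
    const mightExist = await redis.bf.exists('email_bloom', hash);
    if (!mightExist) return false;
    // "Maybe" could be a false positive: confirm against the source of truth.
    const res = await pg.query(
        'SELECT 1 FROM registered_emails WHERE email_hash = $1', [hash]
    );
    return (res.rowCount ?? 0) > 0;
}

async function registerEmail(email: string) {
    if (await emailExists(email)) {
      console.log('Email already exists.');
      return;
    }
    const hash = hashEmail(email);
    // Write to Postgres (the source of truth), then mark the hash in the Bloom filter.
    await pg.query(
       'INSERT INTO registered_emails (email_hash) VALUES ($1)', [hash]
    );
    await redis.bf.add('email_bloom', hash);
    console.log('Email registered!');
}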

Summary

Use RedisBloom for fast, memory-efficient lookups.
Use PostgreSQL for accurate, final checks.
Hash your emails with SHA-256 for privacy.
Get the best of both worlds: speed and integrity.

This setup is battle-tested. It’s like the perfect combo of fast-n-loose and strict-n-safe. It works at scale, it respects your resources, and your users are protected.

Give it a try. Scale with confidence.