Designing a Scalable Caching Layer for User and Tenant Metadata in a Messaging System

I'm developing a microservice-based application that processes a high volume of messages. Each message must be handled according to the user’s personal settings and some tenant-specific (customer) properties — for example, formatting rules or footer content.

In our B2B environment, every user belongs to a tenant, which we refer to as a “customer” in the application.

Current Setup

Each microservice handles messages independently and does the following (a minimal sketch of the pattern follows this list):

  1. Lazily fetches user and customer objects the first time they're needed.
  2. Caches them in memory (using a HashMap) for future use.
  3. When a user or customer updates their data, the main API sends update notifications to the microservices, which then update their in-memory cache accordingly.
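Here is a minimal sketch of that pattern, assuming hypothetical `User`/`Customer` types and a loader function standing in for the real API client (all names are illustrative, not our actual code):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: lazy-loading in-memory cache with push-based updates.
// The same structure applies to customer objects.
public class UserCache {
    private final Map<String, User> cache = new ConcurrentHashMap<>();
    private final Function<String, User> loader; // e.g. a call to the central API

    public UserCache(Function<String, User> loader) {
        this.loader = loader;
    }

    // Lazily fetch on first use, then serve from memory.
    public User get(String userId) {
        return cache.computeIfAbsent(userId, loader);
    }

    // Called when the main API publishes an update notification.
    public void onUserUpdated(User updated) {
        cache.put(updated.id(), updated);
    }

    // Alternative: drop the entry and let the next access re-fetch it.
    public void onUserInvalidated(String userId) {
        cache.remove(userId);
    }
}

record User(String id, String settingsJson) {}
```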

Why Not Just Call the API Every Time?

Messages arrive very frequently, and fetching user/customer info from the API on every message would:

  • Significantly increase network traffic
  • Introduce latency for each message due to the API call
  • Potentially overload the central user/customer API service

That’s why I opted for this local caching approach.

Concerns

While this strategy has improved performance and reduced load on the central API, I'm beginning to worry about scalability:

  • The in-memory HashMap grows as we add customers and users, driving up RAM usage in each service instance.
  • I'm also questioning whether the notification/invalidation mechanism adds unnecessary complexity.

What I’m Looking For

I’m trying to find a better balance between:

  • Low latency: Message handling should remain fast.
  • Low memory usage: I want to avoid unbounded growth in memory consumption.
  • Simplicity and reliability: The solution shouldn’t introduce unnecessary operational complexity.
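One standard way to reconcile these goals is a size-bounded cache with TTL expiry, so memory is capped and stale entries age out even if an update notification is lost. A minimal sketch using the Caffeine library (the size cap, TTL, and loader are illustrative assumptions, and `User` is the placeholder record from the earlier sketch):

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

import java.time.Duration;

public class BoundedUserCache {
    // maximumSize prevents unbounded growth; expireAfterWrite acts as a
    // safety net in case an update notification is missed.
    private final LoadingCache<String, User> cache = Caffeine.newBuilder()
            .maximumSize(10_000)                      // illustrative cap
            .expireAfterWrite(Duration.ofMinutes(15)) // illustrative TTL
            .build(this::fetchFromApi);

    public User get(String userId) {
        return cache.get(userId); // loads via fetchFromApi on a miss
    }

    public void onUserUpdated(User updated) {
        cache.put(updated.id(), updated);
    }

    // Placeholder for the real call to the central user/customer API.
    private User fetchFromApi(String userId) {
        throw new UnsupportedOperationException("call central user API here");
    }
}
```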

Questions

  1. Is this cache + notification approach reasonable for a high-throughput environment?
  2. Are there standard patterns or best practices for managing this type of per-user and per-tenant data in microservice architectures?

Edit

  • Customer objects: about 100–200
  • Users per customer: 10–30
  • Size of a customer object: ~2 KB
  • Size of a user object: ~1.5 KB
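For scale, a rough worst-case estimate from these figures: 200 customers × 2 KB ≈ 0.4 MB, plus 200 × 30 = 6,000 users × 1.5 KB ≈ 9 MB, so even a fully populated cache should stay around 10 MB per service instance.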