Designing a Scalable Caching Layer for User and Tenant Metadata in a Messaging System
I'm developing a microservice-based application that processes a high volume of messages. Each message must be handled according to the user’s personal settings and some tenant-specific (customer) properties — for example, formatting rules or footer content.
In our B2B environment, every user belongs to a tenant, which we refer to as a “customer” in the application.
Current Setup
Each microservice handles messages independently and does the following:
- Lazily fetches user and customer objects the first time they're needed.
- Caches them in memory (using a HashMap) for future use.
- When a user or customer updates their data, the main API sends update notifications to the microservices, which then update their in-memory cache accordingly (a sketch of this flow follows after this list).
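For concreteness, here is a minimal sketch of what this setup looks like, assuming Java since we use a HashMap; `ApiClient`, `User`, and `Customer` are simplified placeholders, not our real types:

```java
import java.util.concurrent.ConcurrentHashMap;

// Simplified placeholder types standing in for our real domain objects.
record User(String id, String settings) {}
record Customer(String id, String footer) {}

interface ApiClient {
    User fetchUser(String userId);
    Customer fetchCustomer(String customerId);
}

public class MetadataCache {
    private final ConcurrentHashMap<String, User> users = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Customer> customers = new ConcurrentHashMap<>();
    private final ApiClient api;

    public MetadataCache(ApiClient api) {
        this.api = api;
    }

    // Lazy fetch on first use; computeIfAbsent keeps the load atomic per key.
    public User getUser(String userId) {
        return users.computeIfAbsent(userId, api::fetchUser);
    }

    public Customer getCustomer(String customerId) {
        return customers.computeIfAbsent(customerId, api::fetchCustomer);
    }

    // Called when an update notification arrives from the main API.
    public void onUserUpdated(User updated) {
        users.put(updated.id(), updated);
    }

    public void onCustomerUpdated(Customer updated) {
        customers.put(updated.id(), updated);
    }
}
```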
Why Not Just Call the API Every Time?
Messages arrive very frequently, and fetching user/customer info from the API on every message would:
- Significantly increase network traffic
- Introduce latency for each message due to the API call
- Potentially overload the central user/customer API service
That’s why I opted for this local caching approach.
Concerns
While this strategy has improved performance and reduced load on the central API, I'm beginning to worry about scalability:
- The in-memory HashMap keeps growing as we support more customers and users, which means higher RAM usage in each service instance (a bounded-cache sketch follows below).
- I'm also questioning whether the notification/invalidation mechanism adds unnecessary complexity.
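For the memory concern specifically, one option I'm aware of is replacing the unbounded map with a size-bounded cache that also expires entries after a TTL. A sketch using Caffeine, assuming we stay on the JVM; the cap and TTL below are illustrative, not measured values:

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.time.Duration;

// Reuses the placeholder ApiClient/User types from the sketch above.
public class BoundedUserCache {
    private final LoadingCache<String, User> users;

    public BoundedUserCache(ApiClient api) {
        this.users = Caffeine.newBuilder()
                .maximumSize(10_000)                      // illustrative cap, not a measured value
                .expireAfterWrite(Duration.ofMinutes(10)) // bounds staleness even without notifications
                .build(api::fetchUser);                   // lazy load on cache miss
    }

    public User getUser(String userId) {
        return users.get(userId); // fetched on first access, served from memory afterwards
    }
}
```

With TTL-based expiry the worst-case staleness is bounded by the TTL, which in principle could replace the notification mechanism entirely; whether a few minutes of potential staleness is acceptable for formatting rules and footer content is an open question for us.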
What I’m Looking For
I’m trying to find a better balance between:
- Low latency: Message handling should remain fast.
- Low memory usage: I want to avoid unbounded growth in memory consumption.
- Simplicity and reliability: The solution shouldn’t introduce unnecessary operational complexity.
Questions
- Is this cache + notification approach reasonable for a high-throughput environment?
- Are there standard patterns or best practices for managing this type of per-user and per-tenant data in microservice architectures?
Edit
- Number of customer objects: ~100–200
- Users per customer: 10–30
- Size of a customer object: ~2 KB
- Size of a user object: ~1.5 KB
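Back-of-the-envelope worst case from these numbers: 200 customers × 2 KB ≈ 0.4 MB, plus 200 × 30 users × 1.5 KB ≈ 9 MB, so roughly 10 MB of raw metadata per service instance if everything were cached at once (before any per-object runtime overhead).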