Serverless RAG Chat with AppSync Events and Bedrock Knowledge Bases


May 9, 2025 - 16:09

Photo by Jason Leung on Unsplash

When it comes to building serverless WebSocket APIs on AWS, there’s no shortage of options: API Gateway, IoT Core, AppSync GraphQL subscriptions, and now AppSync Events. Each option comes with its own level of control and complexity. I’ve found AppSync Events to be the simplest to work with.

One of the interesting features of AppSync Events is its data sources capability. It lets you integrate directly with resources like DynamoDB, OpenSearch, Bedrock and Lambda. You can interact with these data sources using AppSyncJS (AppSync’s own flavor of JavaScript). But to be totally fair, I lean toward direct Lambda integration, as it gives more control and makes the development and testing workflow more familiar, standard and manageable.

Currently, the Bedrock data source supports only the InvokeModel and Converse APIs. So, if you want to integrate with Knowledge Bases, the viable approach is to create a custom data source using Lambda.

And that’s exactly what this blog post is about: we’ll walk through how to build a RAG-based chat application with AppSync Events and Bedrock Knowledge Bases using Node.js, TypeScript and Terraform.

Solution overview

Let’s take a look at how the whole setup fits together:

Figure: Architecture overview

The Knowledge Base is configured to use PostgreSQL as its vector store, where we store the embeddings as well as the associated metadata of the documents we want to index. Using Postgres gives us control over the schema, the indexing strategy, and the embedding format, all of which come in handy when fine-tuning a vector-based RAG setup.

We’ve got the handleAppSyncEvents function directly integrated as a data source for the AppSync Events API. Its role is to process incoming events from AppSync and to call retrieveAndGenerate against the Knowledge Base. The function is invoked asynchronously (with the EVENT invocation type), which means AppSync doesn’t wait for it to complete before returning a response to the client. Once we receive a result from Bedrock, the function publishes a response back to the client’s response channel.
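To make that flow concrete, here is a minimal sketch of what such a handler could look like. It is not the implementation from the repository: the event type is a reduced subset of what AppSync passes to a Lambda data source, the `/requests/` to `/responses/` channel convention is hypothetical, and both the Bedrock call and the publish call are injected as plain functions (`retrieveAndGenerate`, `publish`) so the logic can be followed and tested without AWS credentials.

```typescript
// Reduced subset of the event AppSync Events hands to a Lambda data source.
// Assumption: the payload fields (question, sessionId) are illustrative only.
interface AppSyncEventsPublishEvent {
  info: { channel: { path: string }; operation: "PUBLISH" | "SUBSCRIBE" };
  events: { id: string; payload: { question: string; sessionId?: string } }[];
}

// Request shape of the Bedrock Agent Runtime RetrieveAndGenerate API.
interface RetrieveAndGenerateInput {
  input: { text: string };
  sessionId?: string;
  retrieveAndGenerateConfiguration: {
    type: "KNOWLEDGE_BASE";
    knowledgeBaseConfiguration: { knowledgeBaseId: string; modelArn: string };
  };
}

// Everything the handler needs from the outside world, injected for testability.
interface Deps {
  knowledgeBaseId: string;
  modelArn: string;
  retrieveAndGenerate(
    input: RetrieveAndGenerateInput
  ): Promise<{ output: { text: string }; sessionId?: string }>;
  publish(channel: string, payloads: unknown[]): Promise<void>;
}

// Hypothetical convention: /chat/requests/<id> maps to /chat/responses/<id>.
function toResponseChannel(requestPath: string): string {
  return requestPath.replace("/requests/", "/responses/");
}

// Builds the RetrieveAndGenerate request; sessionId is only set when continuing a conversation.
function buildRequest(
  question: string,
  sessionId: string | undefined,
  cfg: { knowledgeBaseId: string; modelArn: string }
): RetrieveAndGenerateInput {
  return {
    input: { text: question },
    ...(sessionId ? { sessionId } : {}),
    retrieveAndGenerateConfiguration: {
      type: "KNOWLEDGE_BASE",
      knowledgeBaseConfiguration: {
        knowledgeBaseId: cfg.knowledgeBaseId,
        modelArn: cfg.modelArn,
      },
    },
  };
}

// Returns the Lambda handler. With the EVENT invocation type the return value is ignored,
// so the answer travels back through publish() instead.
function createHandler(deps: Deps) {
  return async (event: AppSyncEventsPublishEvent): Promise<void> => {
    if (event.info.operation !== "PUBLISH") return;
    const channel = toResponseChannel(event.info.channel.path);
    for (const { id, payload } of event.events) {
      const result = await deps.retrieveAndGenerate(
        buildRequest(payload.question, payload.sessionId, deps)
      );
      await deps.publish(channel, [
        { requestId: id, answer: result.output.text, sessionId: result.sessionId },
      ]);
    }
  };
}
```

In a real deployment, `retrieveAndGenerate` would presumably wrap `RetrieveAndGenerateCommand` from `@aws-sdk/client-bedrock-agent-runtime`, and `publish` would send a signed request to the Events API. Keeping them behind an interface is a design choice of this sketch, not something the architecture requires.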

AppSync Events supports multiple authorization methods to secure Event APIs, including API keys, Lambda authorizers, IAM, OpenID Connect, and Amazon Cognito user pools. In this setup, I’m using both Cognito user pools and IAM:

  • Web clients use Cognito for authentication.
  • I chose IAM over an API key for publishing events from the handleAppSyncEvents function to AppSync, as it offers a better security posture.
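As a rough illustration of the publishing side, the sketch below builds the HTTP requests the function would send to the Events API's `/event` endpoint, where each event is passed as a stringified JSON value. The helper name `buildPublishRequests` is hypothetical, and the batch size of five reflects the per-request limit as I understand it at the time of writing, so treat it as an assumption. SigV4 signing is deliberately left out: with IAM authorization, each request would still need to be signed for the `appsync` service before being sent.

```typescript
interface PublishRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assumption: AppSync Events accepts at most five events per publish request.
const MAX_EVENTS_PER_PUBLISH = 5;

// Splits payloads into batches and builds one unsigned HTTP request per batch.
function buildPublishRequests(
  httpDomain: string,
  channel: string,
  payloads: unknown[]
): PublishRequest[] {
  const requests: PublishRequest[] = [];
  for (let i = 0; i < payloads.length; i += MAX_EVENTS_PER_PUBLISH) {
    const batch = payloads.slice(i, i + MAX_EVENTS_PER_PUBLISH);
    requests.push({
      url: `https://${httpDomain}/event`,
      method: "POST",
      headers: { "content-type": "application/json", host: httpDomain },
      // Each event must itself be a JSON string inside the outer JSON body.
      body: JSON.stringify({
        channel,
        events: batch.map((payload) => JSON.stringify(payload)),
      }),
    });
  }
  return requests;
}
```

One way to sign these requests is the `SignatureV4` class from `@smithy/signature-v4` together with the Lambda's execution role credentials. The upside compared to an API key is that there is no long-lived secret to store or rotate.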

One thing I appreciate in this setup: AppSync Events supports AWS WAF web ACLs. That means you can easily layer in protections like rate limiting and IP filtering. It’s a nice edge over API Gateway WebSocket APIs, which still don’t offer native WAF support.

And tying it all together, the browser connects to AppSync via WebSocket, giving us a real-time, bidirectional channel that is ideal for sending the model’s responses back to users in conversational interfaces.
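For a feel of what the browser side involves, here is a small sketch of the connection details. It assumes Cognito authorization, where the auth object carries the API's HTTP host and the user's JWT, and it follows the AppSync Events WebSocket protocol as I understand it: the auth object is base64url-encoded and passed as a `header-...` subprotocol next to `aws-appsync-event-ws`. The function names and the channel are illustrative.

```typescript
// Authorization object for Cognito user pools: HTTP host of the API plus the JWT.
interface CognitoAuth {
  host: string;
  Authorization: string;
}

// Base64url-encodes a JSON value (no padding), as expected in the subprotocol header.
function base64UrlEncode(value: object): string {
  return btoa(JSON.stringify(value))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Returns the URL and subprotocols to pass to `new WebSocket(url, protocols)`.
function buildConnection(realtimeDomain: string, auth: CognitoAuth) {
  return {
    url: `wss://${realtimeDomain}/event/realtime`,
    protocols: ["aws-appsync-event-ws", `header-${base64UrlEncode(auth)}`],
  };
}

// Builds the message that subscribes this connection to a channel.
function buildSubscribeMessage(id: string, channel: string, auth: CognitoAuth): string {
  return JSON.stringify({ type: "subscribe", id, channel, authorization: auth });
}
```

In the browser, this would be used roughly as `const { url, protocols } = buildConnection(...)`, then `new WebSocket(url, protocols)`, followed by sending the subscribe message for the user's response channel once the socket is open. Client libraries such as Amplify hide these details, so hand-rolling them is only worthwhile if you want to avoid the dependency.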

Let’s dive into the details of the solution; but if you’d like to jump straight to the complete implementation, you can find it here