MCP Server with AWS Lambda and HTTP API Gateway

Photo by Mika Baumeister on Unsplash

For a "long" time (long in the AI terms) the serverless approach didn't really fit the Model Context Protocol idea. MCP was designed with statefulness in mind, and the ephemeral computing didn't match the initial requirements.

The situation changed with the latest protocol update, which replaced the SSE + HTTP transport with the Streamable HTTP transport. This is a significant update that opens up new possibilities. Now an MCP server can be stateless, or it can handle state using the sessionId from the request header. It means that taking a serverless approach doesn't require any tricks on our side.

Goal

I build an MCP server and deploy it as an AWS Lambda function behind an HTTP API Gateway. At the moment I am writing this blog post, the Streamable HTTP transport is available out of the box only in the TypeScript SDK. For other languages, work on implementing the latest changes is already in progress, so there is a good chance that by the time you read this, you can pick another SDK.

I create an MCP server that acts as a prompt library. It will use S3 as storage for prompt templates and allow dynamic prompt creation.

Architecture


I use AWS CDK to define infrastructure.

Code for this blog post is available HERE

Server function

The TypeScript SDK for MCP has great documentation, including examples. I will use one of them as a starting point.

After creating a directory and installing dependencies, I create a function for the MCP server initialization

// server/server.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { GetPromptResult } from '@modelcontextprotocol/sdk/types.js';
import { z } from 'zod';

export const initializeServer = (): McpServer => {
  // Create an MCP server with implementation details
  const server = new McpServer({
    name: 'prompt-gallery-server',
    version: '1.0.0',
  }, { capabilities: { logging: {} } });

  // Register a simple prompt
  server.prompt(
    'greeting-template',
    'A simple greeting prompt template',
    {
      name: z.string().describe('Name to include in greeting'),
    },
    async ({ name }): Promise<GetPromptResult> => {
      return {
        messages: [
          {
            role: 'user',
            content: {
              type: 'text',
              text: `Please greet ${name} in a friendly manner.`,
            },
          },
        ],
      };
    }
  );

  return server;
};

And I use it for handling requests in the Express app

// main.ts
import express, { Request, Response } from "express";
import { initializeServer } from "./server/server";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

const app = express();
app.use(express.json());

app.post("/mcp", async (req: Request, res: Response) => {
  const server = initializeServer();
  try {
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: undefined,
    });

    // Clean up the transport and server once the response is closed
    res.on("close", () => {
      console.log("Request closed");
      transport.close();
      server.close();
    });

    await server.connect(transport);
    await transport.handleRequest(req, res, req.body);
  } catch (e) {
    console.error("Error handling MCP request:", e);
    if (!res.headersSent) {
      res.status(500).json({
        jsonrpc: "2.0",
        error: {
          code: -32603,
          message: "Internal server error",
        },
        id: null,
      });
    }
  }
});

app.get("/mcp", async (req: Request, res: Response) => {
  console.log("Received GET MCP request");
  res.writeHead(405).end(JSON.stringify({
    jsonrpc: "2.0",
    error: {
      code: -32000,
      message: "Method not allowed.",
    },
    id: null,
  }));
});

app.delete("/mcp", async (req: Request, res: Response) => {
  console.log("Received DELETE MCP request");
  res.writeHead(405).end(JSON.stringify({
    jsonrpc: "2.0",
    error: {
      code: -32000,
      message: "Method not allowed.",
    },
    id: null,
  }));
});

const PORT = 3000;

app.listen(PORT, () => {
  console.log(`MCP server running on port ${PORT}`);
});

Ok, now let's run

npx ts-node main.ts

Local testing

Once it is running, I want to test it. The most convenient way is to use MCP Inspector, which will test our server against specific rules expected by the protocol.

By the time you read this blog post, you can probably run the Inspector directly using npx, but as of today I need to clone the repo and run npm run dev to be able to pick the Streamable HTTP connection.
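For reference, the Inspector is published on npm as @modelcontextprotocol/inspector, so once Streamable HTTP support lands in a release, starting it should be as simple as

npx @modelcontextprotocol/inspector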


Great, I can connect to my server, and I see that it offers prompts. I can list the available prompts and call the only one available with the name parameter.


Cool. So far, so good!

Cloud infrastructure

We need some extra steps before deploying the Express app as a Lambda function. AWS Lambda has its own mechanism for passing requests to the function, as it can be invoked in many different ways. Luckily, there is an awesome AWS Lambda Web Adapter that allows deploying web applications as Lambda functions and provides a proxy that translates requests from the HTTP/REST API Gateway into the shape expected by a web application.

The Lambda Web Adapter is an extension written in Rust, which is added to the function as a layer. In the infrastructure definition, the Lambda function is set as the default integration of the HTTP API:

// ...
    const mcpHttpHandler = new cdk.aws_lambda_nodejs.NodejsFunction(
      this,
      "httpHandler",
      {
        entry: "../server/main.ts",
        runtime: cdk.aws_lambda.Runtime.NODEJS_20_X,
        timeout: cdk.Duration.minutes(3),
        handler: "run.sh",
        architecture: cdk.aws_lambda.Architecture.X86_64,
        environment: {
          AWS_LAMBDA_EXEC_WRAPPER: "/opt/bootstrap",
          RUST_LOG: "info",
        },
        bundling: {
          minify: true,
          commandHooks: {
            beforeInstall: () => [],
            beforeBundling: () => [],
            afterBundling: (inputDir: string, outputDir: string) => {
              return [`cp ${inputDir}/../server/run.sh ${outputDir}`];
            },
          },
          target: "node20",
          externalModules: ["@aws-sdk/*"],
          sourceMap: true,
          forceDockerBundling: false,
        },
        layers: [
          LayerVersion.fromLayerVersionArn(
            this,
            "layer",
            `arn:aws:lambda:us-east-1:753240598075:layer:LambdaAdapterLayerX86:25`
          ),
        ],
      }
    );

    const mcpHttpApi = new cdk.aws_apigatewayv2.HttpApi(this, "mcpAPI", {
      defaultIntegration: new cdk.aws_apigatewayv2_integrations.HttpLambdaIntegration(
        "httpLambdaIntegration",
        mcpHttpHandler
      ),
    })
// ...
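To know which URL to call after deployment, one option (my assumption; the repository may expose the endpoint differently) is to add a stack output with the HTTP API endpoint, which gets printed after cdk deploy:

// ...
    // Print the HTTP API endpoint after `cdk deploy`
    new cdk.CfnOutput(this, "mcpApiUrl", {
      value: mcpHttpApi.apiEndpoint,
    });
// ...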

After deployment, I can call the MCP server running as a Lambda function. Note the URL used in the Inspector.


Authorization

Security is one of the critical aspects that needs to be covered as MCP matures. We will see how MCP clients handle authentication in the future. I expect it will look similar to the standard application flow that allows the use of external providers like Cognito, Okta, etc.

I secure my MCP server with a Cognito JWT authorizer.
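A sketch of how this can be wired up in CDK (the user pool and construct names below are placeholders I made up; the actual setup in the repository may differ). The HttpApi definition from the earlier snippet gains a defaultAuthorizer:

// ...
    // Hypothetical Cognito setup - adjust to your existing user pool
    const userPool = new cdk.aws_cognito.UserPool(this, "mcpUserPool");
    const userPoolClient = userPool.addClient("mcpClient");

    const jwtAuthorizer = new cdk.aws_apigatewayv2_authorizers.HttpJwtAuthorizer(
      "mcpJwtAuthorizer",
      `https://cognito-idp.${this.region}.amazonaws.com/${userPool.userPoolId}`,
      { jwtAudience: [userPoolClient.userPoolClientId] }
    );

    // Require a valid Cognito JWT on every route of the HTTP API
    const mcpHttpApi = new cdk.aws_apigatewayv2.HttpApi(this, "mcpAPI", {
      defaultIntegration: new cdk.aws_apigatewayv2_integrations.HttpLambdaIntegration(
        "httpLambdaIntegration",
        mcpHttpHandler
      ),
      defaultAuthorizer: jwtAuthorizer,
    });
// ...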

The thing is that, currently, MCP Inspector doesn't fully support authorization for the Streamable HTTP transport. At least I wasn't able to make it work, but there is active development in this area in the repository.

To test functionality, I use Bruno and call the server directly.
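For reference, the initialize call looks roughly like this (a hedged reconstruction, not copied from the repo): a POST to the /mcp endpoint with Content-Type: application/json, an Accept header listing both application/json and text/event-stream, and the Cognito JWT in the Authorization header, carrying a body such as:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": { "name": "bruno-client", "version": "1.0.0" }
  }
}

Listing prompts is then a prompts/list call, and generating one is prompts/get with the prompt name and arguments.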

Without the authorization header


Connect with the token


Get the list of prompts


And finally, generate the prompt


Integrate with S3

Having an MCP server running as an AWS Lambda function opens up unlimited possibilities to integrate with serverless infrastructure (and with any other services).

In my case, I will get the prompt template from an S3 bucket. Let's add an S3 connection to the function.

Web applications behind the Lambda Web Adapter are similar to regular Lambdas - we want to initialize all services before the handler, so that the time needed for this affects only the cold start. To achieve this, it is enough to initialize the services outside the Express handlers and inject them into the created server:

import { S3Client } from "@aws-sdk/client-s3";
// ...

// Initialize the S3 service once, outside the request handlers
const client = new S3Client();
const bucketName = process.env.BUCKET_NAME || "bucket name missing";

const s3Service = new S3Service(client, bucketName);

const app = express();
app.use(express.json());

app.post("/mcp", async (req, res) => {
    console.log(`req body: ${JSON.stringify(req.body)}`);

    const server = initializeServer(s3Service);

// ...
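The S3Service itself is not shown above; a minimal sketch of its shape (my assumption - the real implementation lives in the repository) could look like this, assuming prompt templates are stored as plain-text objects:

// server/services/s3Service.ts (hypothetical shape)
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

export class S3Service {
  constructor(
    private readonly client: S3Client,
    private readonly bucketName: string
  ) {}

  // Read a prompt template stored as a plain-text object in the bucket
  async getPromptTemplate(key: string): Promise<string> {
    const response = await this.client.send(
      new GetObjectCommand({ Bucket: this.bucketName, Key: key })
    );
    return (await response.Body?.transformToString()) ?? "";
  }
}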

Now the Lambda function gets the prompt template from S3.


The full code is available HERE

Summary

The Model Context Protocol is evolving pretty fast. The recent changes allowing a stateless approach to building MCP servers are an example of how rapidly the ecosystem changes.

The Model Context Protocol comes with a set of SDKs for different languages. In Lambda functions, they can be used out of the box for MCP server creation when put behind the Lambda Web Adapter.

In this blog post, I created a simple stateless server, but thanks to the sessionId passed with the request by the MCP client, state can be persisted, e.g., in DynamoDB. This allows more complex interactions without giving up the flexibility of the serverless setup.
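As a rough illustration only (an in-memory sketch; a Lambda deployment would look sessions up in DynamoDB instead of a local map), the stateful variant of the POST handler could look like this:

import { randomUUID } from "node:crypto";

// In-memory map of active sessions - replace with a DynamoDB-backed store on Lambda
const transports: Record<string, StreamableHTTPServerTransport> = {};

app.post("/mcp", async (req, res) => {
  const sessionId = req.headers["mcp-session-id"] as string | undefined;

  // Reuse the transport of an existing session
  if (sessionId && transports[sessionId]) {
    await transports[sessionId].handleRequest(req, res, req.body);
    return;
  }

  // New session: generate a session id and remember the transport
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: () => randomUUID(),
    onsessioninitialized: (id) => {
      transports[id] = transport;
    },
  });

  const server = initializeServer();
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
});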

I believe that opening the MCP specification on the stateless servers will have a significant impact on the whole MCP ecosystem.