Implementing Nova Act MCP Server on ECS Fargate
Browser Automation as a Service
This technical blog post outlines how to implement a Model Context Protocol (MCP) server for Amazon Nova Act as a container on Amazon ECS with Fargate, providing browser automation capabilities as a service.
Introduction
Amazon Nova Act is an early research preview AI model and SDK designed to enable developers to build reliable web agents that can perform actions within a web browser. By integrating Nova Act with the Model Context Protocol (MCP), its browser automation capabilities can be standardized and exposed to diverse clients (e.g., AI assistants, web UIs, IDE extensions) through multiple communication channels:
- Standard I/O (stdio) for local scripting and CLI tools like Amazon Q Developer CLI & Cline VS Code Extension (open-source)
- Server-Sent Events (SSE) over HTTP for real-time streaming of server messages to the client, with client requests sent as HTTP POSTs
- Streamable HTTP transport (introduced in a recent revision of the MCP specification) for scalable web-based interactions
This implementation deploys Nova Act as a containerized service on Amazon ECS, fronted by an Application Load Balancer (ALB), and uses Server-Sent Events (SSE) over HTTP to enable real-time communication.
Note: Nova Act API keys can currently be requested only from within the US, so the solution is deployed in the AWS us-east-1 region to ensure compliance.
Architecture Overview
The solution consists of the following components:
- MCP Server: A FastAPI application that implements the Model Context Protocol and exposes Nova Act functionality by using FastAPI-MCP
- Streamlit Client: A web UI and MCP Client for interacting with the MCP server
- AWS Infrastructure: ECS Fargate tasks, ALB, and supporting resources
Deployment Guide
Prerequisites
Before deploying the Nova Act MCP server on ECS, ensure you have:
- AWS CLI configured with appropriate credentials
- Node.js and npm installed
- AWS CDK installed (npm install -g aws-cdk)
- Docker installed and running
- A valid Nova Act API key
Step 1: Clone the Repository
git clone https://github.com/awsdataarchitect/nova-act-ecs.git
cd nova-act-ecs
Step 2: Export your API Key as an environment variable
export NOVA_ACT_API_KEY="your-api-key-here"
Step 3: Deploy the CDK Stack
npm install
cdk bootstrap
cdk deploy
Step 4: Access the Application
After deployment completes, the CDK will output the ALB DNS name. You can access:
- MCP Server: http://<ALB-DNS-name>/mcp
- Streamlit UI: http://<ALB-DNS-name>:8501
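The same URLs can be derived in code from the DNS name that the CDK prints. This is only a minimal sketch: the `service_urls` helper and the example hostname are hypothetical, and it assumes the /health endpoint is reachable on the same ALB listener as /mcp.

```python
from typing import Dict


def service_urls(alb_dns_name: str) -> Dict[str, str]:
    """Build the service URLs from the ALB DNS name printed by `cdk deploy`."""
    host = alb_dns_name.strip().rstrip("/")
    # Expect a bare hostname, not a full URL or a path
    if not host or "/" in host:
        raise ValueError("expected a bare DNS name such as 'my-alb.example.com'")
    base = f"http://{host}"
    return {
        "mcp": f"{base}/mcp",          # MCP server (SSE endpoint)
        "health": f"{base}/health",    # assumed to share the MCP listener
        "streamlit": f"{base}:8501",   # Streamlit UI port from the post
    }


# Hypothetical hostname, for illustration only
urls = service_urls("my-alb.example.com")
print(urls["mcp"])
```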
MCP Server Implementation
The server implements the MCP protocol using fastapi_mcp with the following components:
- MCP Endpoints (automatically handled by fastapi_mcp):
  - /mcp - SSE endpoint for event streaming
  - /mcp/schema - Schema endpoint for method discovery
  - /mcp/jsonrpc - JSON-RPC endpoint for method calls
- Core API Endpoint:
  - /browse - Combined endpoint for all browser automation tasks
- Additional Endpoints:
  - /health - Used by ALB for health checks
  - /logs - Endpoint to retrieve recent server logs
The "browse" Method
The implementation uses a simplified approach with a single "browse" method that combines browser control and instruction execution:
@app.post("/browse", operation_id="browse")
async def browse(request: BrowseRequest) -> BrowseResponse:
    """
    Execute a browsing task with Nova Act.
    This method handles browser initialization, navigation, and instruction execution.
    """
    # Implementation details excluded here for brevity...
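To make the MCP side concrete, here is a rough sketch of the JSON-RPC message an MCP client might send to invoke this method. It assumes the tool is exposed under the name "browse" (taken from the operation_id above) and uses the standard MCP `tools/call` method; the `tool_call_message` helper, the request id, and the example URL and instruction are hypothetical.

```python
import json
from typing import Any, Dict


def tool_call_message(request_id: int, tool_name: str,
                      arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical arguments; the field names follow the BrowseRequest model
message = tool_call_message(
    1,
    "browse",
    {
        "starting_url": "https://example.com",
        "instructions": ["Find the page title"],
    },
)
print(json.dumps(message, indent=2))
```

In practice an MCP client library builds this envelope for you; the sketch only shows what travels over the wire.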
Request Schema
The browse method accepts a flexible request schema that can handle various browsing scenarios:
class BrowseRequest(BaseModel):
    starting_url: str
    instructions: List[str] = Field(..., description="List of instructions to execute sequentially")
    max_steps_per_instruction: int = 30
    timeout_per_instruction: Optional[int] = None
    schema: Optional[Dict[str, Any]] = None
    headless: bool = True
Response Schema
The response includes detailed information about the browsing session:
class BrowseResponse(BaseModel):
    status: str
    results: List[Dict[str, Any]]
    errors: List[Dict[str, Any]] = []
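As an illustration of how a caller might consume this response, the sketch below summarizes a decoded response dictionary. Only the top-level keys (status, results, errors) come from the model above; the `summarize_browse_response` helper, the status string, and the contents of the sample entries are made up for the example.

```python
from typing import Any, Dict


def summarize_browse_response(response: Dict[str, Any]) -> str:
    """Return a one-line summary of a decoded BrowseResponse payload."""
    status = response.get("status", "unknown")
    results = response.get("results") or []
    errors = response.get("errors") or []  # the server defaults this to []
    return f"status={status}, results={len(results)}, errors={len(errors)}"


# Hypothetical payload: the inner keys and the status value are illustrative
sample = {
    "status": "completed",
    "results": [{"instruction": "Find the page title", "response": "Example Domain"}],
    "errors": [],
}
print(summarize_browse_response(sample))
```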
Server Features
- Single Global Browser Instance: The server maintains a single global Nova Act instance
- Headless Mode: Browser always runs in headless mode for ECS compatibility
- API Key Management: Retrieves API key from environment variables or AWS Secrets Manager
- Structured Data Extraction: Supports schema-based data extraction
- Error Handling: Comprehensive error handling and logging
- Thread Pool Execution: Runs synchronous Nova Act code in a thread pool to avoid asyncio conflicts
- Resource Monitoring: Monitors system resources (CPU, memory) for debugging
- Log Buffering: Maintains a circular buffer of recent logs for client display
- Console Output Capture: Captures all stdout/stderr output including Nova Act's thinking steps
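The API key lookup described above could look roughly like the following. This is a sketch under assumptions, not the repository's code: the `resolve_api_key` function and the secret identifier are hypothetical, and the Secrets Manager call relies on boto3's `get_secret_value`, imported lazily so the environment-variable path works without boto3 installed.

```python
import os
from typing import Callable, Optional


def fetch_from_secrets_manager(secret_id: str) -> str:
    """Read a plain-text secret from AWS Secrets Manager (requires boto3)."""
    import boto3  # lazy import: only needed when the fallback is used

    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_id)["SecretString"]


def resolve_api_key(
    env_var: str = "NOVA_ACT_API_KEY",
    secret_id: Optional[str] = None,
    fetch_secret: Callable[[str], str] = fetch_from_secrets_manager,
) -> str:
    """Prefer the environment variable, then fall back to Secrets Manager."""
    key = os.environ.get(env_var)
    if key:
        return key
    if secret_id:
        return fetch_secret(secret_id)
    raise RuntimeError(f"{env_var} is not set and no secret_id was provided")
```

Passing `fetch_secret` as a parameter keeps the fallback easy to stub out in tests or to swap for another secret store.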
Key Implementation Details
The server uses a thread pool to run synchronous Nova Act code without blocking the FastAPI event loop:
# Execute the browse sequence in a thread pool
logger.info("Running browse sequence in thread pool")
browse_result = await asyncio.get_event_loop().run_in_executor(
    thread_pool, run_browse_sequence
)
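To see why this matters, the self-contained sketch below runs a blocking function in a thread pool while a second coroutine keeps ticking on the event loop. The `run_browse_sequence` body here is only a `time.sleep` stand-in for the real Nova Act work, and the `heartbeat` coroutine is hypothetical.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

thread_pool = ThreadPoolExecutor(max_workers=1)


def run_browse_sequence() -> str:
    """Stand-in for the blocking Nova Act calls."""
    time.sleep(0.2)
    return "done"


async def heartbeat(ticks: List[float]) -> None:
    """Record a timestamp every 20 ms for as long as the loop is responsive."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main() -> Tuple[str, int]:
    ticks: List[float] = []
    hb = asyncio.create_task(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    # The blocking call runs in a worker thread; the loop stays free
    result = await loop.run_in_executor(thread_pool, run_browse_sequence)
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return result, len(ticks)
```

If `run_browse_sequence()` were called directly inside the coroutine instead, the heartbeat would not get a chance to run until the blocking call returned.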
The server also implements a log capture mechanism to provide real-time logs to clients, including stdout/stderr interception to capture Nova Act's thinking process:
import collections
import threading
from io import StringIO

# Log buffer implementation
class LogBuffer:
    def __init__(self, max_size=1000):
        # deque with maxlen drops the oldest entries automatically
        self.logs = collections.deque(maxlen=max_size)
        self.lock = threading.Lock()

    def add(self, log_entry):
        with self.lock:
            self.logs.append(log_entry)

    def get_logs(self, limit=100):
        with self.lock:
            return list(self.logs)[-limit:]

# Custom stdout/stderr interceptor to capture Nova Act outputs
class OutputInterceptor(StringIO):
    def __init__(self, log_buffer, stream_name, original_stream):
        super().__init__()
        self.log_buffer = log_buffer
        self.stream_name = stream_name
        self.original_stream = original_stream

    def write(self, text):
        # Write to the original stream
        self.original_stream.write(text)
        # Add to log buffer if not empty
        if text.strip():
            self.log_buffer.add(text.rstrip())
        # File-like write() is expected to return the number of characters written
        return len(text)

    def flush(self):
        self.original_stream.flush()

# Log endpoint
@app.get("/logs")
async def get_logs(limit: int = 100):
    return {"logs": log_buffer.get_logs(limit)}
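The excerpt does not show how the interceptor is installed. One way to illustrate the idea is with `contextlib.redirect_stdout`, as in the compact sketch below. The `TeeStream` class is a simplified, hypothetical stand-in for `OutputInterceptor`, and a bare `deque` replaces `LogBuffer`; the actual server may simply reassign `sys.stdout` and `sys.stderr` at startup.

```python
import collections
import contextlib
import io
import sys


class TeeStream(io.TextIOBase):
    """Forward writes to the original stream and keep non-empty lines."""

    def __init__(self, lines, original):
        self.lines = lines
        self.original = original

    def write(self, text):
        self.original.write(text)
        if text.strip():
            self.lines.append(text.rstrip())
        return len(text)

    def flush(self):
        self.original.flush()


# Circular buffer that keeps only the three most recent lines
recent = collections.deque(maxlen=3)

with contextlib.redirect_stdout(TeeStream(recent, sys.stdout)):
    for i in range(5):
        print(f"step {i}")

# Only the last three lines survive in the buffer
print(list(recent))
```

The output still reaches the real console, so container logs are unaffected while the buffer feeds the /logs endpoint.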
For the full server implementation, see the GitHub repository.
MCP Client Implementation
The client implementation provides a Python interface to the Nova Act MCP server. I've implemented a synchronous version (using requests), as it's more stable in the Streamlit environment.
Client Features
- Connection Management:
  - Connects to the server's health endpoint to verify availability
  - Manages an HTTP session for all requests
  - Handles connection errors gracefully
- API Method:
  - browse(starting_url, instructions, max_steps_per_instruction, timeout_per_instruction, schema, headless) - Execute a browsing task
- Error Handling:
  - Proper error propagation
  - Detailed error messages
  - Connection retry logic
- Log Retrieval:
  - get_logs(limit) - Retrieve recent server logs
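The connection retry logic is not part of the client excerpt in this post, so here is one possible shape for it: a small helper that retries an operation with exponential backoff. The `with_retries` name, the defaults, and the backoff policy are assumptions. It catches `OSError` by default, which also covers the exceptions raised by requests, since those derive from `IOError`.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on failure with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait 0.5s, 1s, 2s, ... between attempts
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A client could wrap its health check this way, for example `with_retries(lambda: session.get(url, timeout=5))`, where `session` and `url` are whatever the caller already has.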
Synchronous Client Implementation
import logging
from typing import Any, Dict, List, Optional, Union

import requests

# Module-level logger used by the client methods below
logger = logging.getLogger(__name__)


class MCPClient:
    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip('/')
        self.session = requests.Session()
        self.connected = False
        self._current_url = None

    def connect(self) -> bool:
        """Initialize connection to MCP server"""
        try:
            response = self.session.get(f"{self.base_url}/health")
            if response.status_code == 200:
                self.connected = True
                logger.info("Connected to MCP server")
                return True
            return False
        except Exception as e:
            logger.error(f"Connection error: {str(e)}")
            return False

    def browse(self, starting_url: str, instructions: Union[str, List[str]],
               max_steps_per_instruction: int = 30,
               timeout_per_instruction: Optional[int] = None,
               schema: Optional[Dict[str, Any]] = None,
               headless: bool = True) -> Dict[str, Any]:
        """Execute a sequence of instructions in a single browser session."""
        # Convert single instruction to list
        if isinstance(instructions, str):
            instructions = [instructions]
        if not self.connected:
            self.connect()
        try:
            data = {
                "starting_url": starting_url,
                "instructions": instructions,
                "max_steps_per_instruction": max_steps_per_instruction,
                "headless": headless
            }
            # Optional fields are sent only when provided
            if timeout_per_instruction:
                data["timeout_per_instruction"] = timeout_per_instruction
            if schema:
                data["schema"] = schema
            logger.info(f"Sending browse request with {len(instructions)} instructions to {starting_url}")
            response = self.session.post(
                f"{self.base_url}/browse",
                json=data
            )
            if response.status_code != 200:
                raise Exception(f"Server returned {response.status_code}: {response.text}")
            result = response.json()
            logger.info(f"Browse request completed with status: {result.get('status')}")
            # Update current URL
            self._current_url = starting_url
            return result
        except Exception as e:
            # Chain the original exception so the root cause stays visible
            raise Exception(f"Error in browse operation: {str(e)}") from e

    def get_logs(self, limit: int = 100) -> List[str]:
        """Get recent logs from the server"""
        if not self.connected:
            self.connect()
        try:
            response = self.session.get(
                f"{self.base_url}/logs?limit={limit}"
            )
            if response.status_code != 200:
                raise Exception(f"Server returned {response.status_code}: {response.text}")
            result = response.json()
            return result.get("logs", [])
        except Exception as e:
            logger.error(f"Error getting logs: {str(e)}")
            return []
For the full client implementation, see the GitHub repository.
Streamlit UI Implementation
The Streamlit UI provides a user-friendly interface to the Nova Act MCP server:
UI Features
- Single Form Interface: Combines URL and instruction inputs in one form
- Schema Builder: UI for creating extraction schemas (Boolean, Text, Product Info, List Items, Custom)
- Execution Options: Configure max steps and timeout
- Result Display: Formatted display of execution results and parsed responses
- History Tracking: Maintains a record of previous operations and results
- Live Logs Display: Shows real-time server logs in a scrollable window
- Amazon-Specific Examples: Pre-configured examples for common Amazon shopping tasks
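As a rough idea of what the Schema Builder could produce, the sketch below maps the preset names listed above to JSON Schema fragments suitable for the request's schema field. The `build_schema` helper and the exact preset contents (in particular the Product Info fields) are hypothetical and may differ from what the UI in the repository generates.

```python
import copy
import json
from typing import Any, Dict, Optional

# Illustrative presets; the field names under "Product Info" are made up
PRESETS: Dict[str, Dict[str, Any]] = {
    "Boolean": {"type": "boolean"},
    "Text": {"type": "string"},
    "List Items": {"type": "array", "items": {"type": "string"}},
    "Product Info": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "price": {"type": "string"},
        },
        "required": ["name"],
    },
}


def build_schema(kind: str, custom_json: Optional[str] = None) -> Dict[str, Any]:
    """Return a JSON Schema dict for a preset, or parse a custom one."""
    if kind == "Custom":
        if not custom_json:
            raise ValueError("a Custom schema needs a JSON string")
        schema = json.loads(custom_json)
        if not isinstance(schema, dict):
            raise ValueError("a Custom schema must be a JSON object")
        return schema
    if kind not in PRESETS:
        raise ValueError(f"unknown schema type: {kind}")
    # Copy so callers cannot mutate the shared presets
    return copy.deepcopy(PRESETS[kind])
```

The returned dictionary can be passed straight through as the schema argument of the client's browse call.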
Live Logs Display
A key feature of the UI is the live logs display, which shows the server's output in real-time, including Nova Act's thinking process:
# In the Streamlit UI
with st.expander("Server Logs", expanded=True):
    # Add a refresh button and auto-refresh toggle
    col1, col2 = st.columns([1, 5])
    with col1:
        if st.button("