Building a Local MongoDB Sharded Cluster with Docker

Apr 5, 2025 - 16:59

This tutorial guides you through setting up a local MongoDB sharded cluster using Docker. This setup is ideal for development and testing, providing an environment that closely resembles a production sharded deployment.

Target Audience: Developers needing a local, sharded MongoDB instance.

Prerequisites:

* Docker installed (https://docs.docker.com/get-docker/)
* Docker Compose installed (usually included with Docker Desktop)
* Basic understanding of Docker and MongoDB concepts (sharding, replica sets)
* A text editor
* A terminal or command prompt

Why a Local Sharded Cluster?

* Transaction Support: This is the main reason. Multi-document transactions require a replica set or a sharded cluster, so a basic MongoDB container running a standalone mongod won't work when you develop or test them.
* Realistic Development: Test application logic against a sharded environment.
* Performance Insights: Understand query behavior across shards.
* Isolation: Avoid conflicts with shared development databases.
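
To make the transaction point concrete, here is a minimal sketch of a shell function that asks mongosh to run one multi-document transaction. It assumes mongosh is installed on the host. The function name check_transactions, the txcheck database, and the probe collection are hypothetical names chosen for this example.

```shell
# Sketch only: try one multi-document transaction through mongosh.
# check_transactions, txcheck, and probe are hypothetical names for this example.
check_transactions() {
  uri="${1:-mongodb://localhost:27017}"
  mongosh "$uri" --quiet --eval '
    const probe = db.getSiblingDB("txcheck").getCollection("probe");
    probe.insertOne({ setup: true });            // create the collection up front
    const session = db.getMongo().startSession();
    const coll = session.getDatabase("txcheck").getCollection("probe");
    session.startTransaction();
    try {
      coll.insertOne({ inTransaction: true });
      session.commitTransaction();
      print("transactions: ok");
    } catch (e) {
      try { session.abortTransaction(); } catch (ignored) {}
      throw e;                                   // surface the original error
    } finally {
      session.endSession();
    }
  '
}
```

Called as `check_transactions mongodb://localhost:27017` against the router built in this tutorial, it should report success. Against a standalone mongod, the insert inside the transaction is expected to be rejected.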

Cluster Architecture

We will build a minimal sharded cluster consisting of:

* Config Server Replica Set (1 Node): Stores cluster metadata. (Port: 27018)
* Shard Replica Set (1 Node): Stores a portion of the data. (Port: 27019)
* Mongos Router (1 Instance): The entry point for applications. (Port: 27017)

All components will run within a single Docker container, orchestrated by a startup script.

Step 1: Project Setup

Create a dedicated directory for this setup. Let's call it mongodb-dev.
```
mkdir mongodb-dev
cd mongodb-dev
```
Inside mongodb-dev, we will create the necessary configuration and script files.

Step 2: Create Initialization Scripts

These JavaScript files contain commands run by mongosh to initialize the replica sets and register the shard.

init-replica.js (Initialize Config Server Replica Set)

Create a file named init-replica.js with the following content:

```
// init-replica.js
rs.initiate({
  _id: "configReplSet",
  configsvr: true,
  members: [
    { _id: 0, host: "localhost:27018" }
  ]
});
```

Explanation: This initiates a replica set named configReplSet, marks it as a config server (configsvr: true), and adds the mongod instance running on localhost:27018 as its only member.
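
Once a replica set has been initiated, its configuration can be read back to confirm these settings. The function below is a small sketch with a hypothetical name, replset_summary. It assumes it runs where mongosh can reach the given port on localhost, for example inside the container.

```shell
# replset_summary PORT
# Prints the replica set name, whether it is a config server set,
# and the member count, read from rs.conf().
replset_summary() {
  mongosh --port "$1" --quiet --eval '
    const conf = rs.conf();
    print(conf._id + " configsvr=" + (conf.configsvr === true) +
          " members=" + conf.members.length);
  '
}
```

For the config server in this tutorial, `replset_summary 27018` should print a line naming configReplSet with configsvr=true and a single member.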

init-shard.js (Initialize Shard Replica Set)

Create a file named init-shard.js with the following content:

```
// init-shard.js
rs.initiate({
  _id: "shard1",
  members: [
    { _id: 0, host: "localhost:27019" }
  ]
});
```

Explanation: This initiates a regular replica set named shard1 with the mongod instance running on localhost:27019 as its only member.

init-router.js (Add Shard to Cluster)

Create a file named init-router.js with the following content:

```
// init-router.js
sh.addShard("shard1/localhost:27019");
```

Explanation: This command, run against the mongos router, tells the cluster about the shard1 replica set located at localhost:27019.
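
To check that the registration took effect, the router can be asked for its shard list with the listShards admin command. The sketch below wraps that in a shell function; shard_registered is a hypothetical helper name, and mongosh is assumed to be able to reach the router on localhost.

```shell
# shard_registered NAME [PORT]
# Succeeds (exit status 0) if the mongos on PORT lists a shard whose _id is NAME.
shard_registered() {
  mongosh --port "${2:-27017}" --quiet --eval '
    db.adminCommand({ listShards: 1 }).shards.forEach(s => print(s._id));
  ' | grep -qx "$1"
}
```

A typical use would be `shard_registered shard1 && echo "shard1 is registered"`.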

Step 3: Create the Dockerfile

This file defines the Docker image for our cluster.

Create a file named Dockerfile (no extension) with the following content:

```
# Dockerfile
FROM mongo:latest

# Create necessary directories
RUN mkdir -p /data/configdb /data/shard1 /scripts

# Copy initialization scripts
COPY init-replica.js /scripts/
COPY init-shard.js /scripts/
COPY init-router.js /scripts/

# Copy the startup script
COPY start-cluster.sh /scripts/
RUN chmod +x /scripts/start-cluster.sh

# Expose ports (Router, Config Server, Shard)
EXPOSE 27017 27018 27019

# Set the working directory (optional)
WORKDIR /

# Command to run when the container starts
CMD ["/scripts/start-cluster.sh"]
```

Explanation:
* Uses the official mongo image as a base.
* Creates directories for database files (/data/*) and scripts (/scripts).
* Copies our .js initialization scripts and the (yet to be created) startup script into /scripts.
* Makes the startup script executable.
* Exposes the ports for the router, config server, and shard.
* Sets the command to run the start-cluster.sh script when the container launches.
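
Because every COPY instruction fails the build when its source file is missing, it can help to check the directory first. The following is a minimal sketch; check_build_files is a hypothetical helper, and the file list is simply the five files this tutorial creates.

```shell
# check_build_files [DIR]
# Reports any of the tutorial's files that are missing from DIR (default: .).
# Returns 0 if all are present, 1 otherwise.
check_build_files() {
  dir="${1:-.}"
  missing=0
  for f in Dockerfile init-replica.js init-shard.js init-router.js start-cluster.sh; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_build_files mongodb-dev` before building lists anything that still needs to be created.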

Step 4: Create the Startup Script

This script orchestrates the starting and initialization of all MongoDB components within the container.

Create a file named start-cluster.sh with the following content:

```
#!/bin/bash
# start-cluster.sh

echo "Starting config server (configReplSet)..."
mongod --configsvr --replSet configReplSet --port 27018 --dbpath /data/configdb --fork --logpath /data/configdb/config.log

# Wait for config server to be ready
echo "Waiting for config server..."
until mongosh --port 27018 --eval "db.adminCommand('ping')" &> /dev/null; do
  sleep 2
done
echo "Config server is ready."

echo "Initializing config server replica set..."
mongosh --port 27018 < /scripts/init-replica.js
echo "Config server replica set initialized."

echo "Starting shard server (shard1)..."
mongod --shardsvr --replSet shard1 --port 27019 --dbpath /data/shard1 --fork --logpath /data/shard1/shard1.log

# Wait for shard server to be ready
echo "Waiting for shard server..."
until mongosh --port 27019 --eval "db.adminCommand('ping')" &> /dev/null; do
  sleep 2
done
echo "Shard server is ready."

echo "Initializing shard replica set..."
mongosh --port 27019 < /scripts/init-shard.js
echo "Shard replica set initialized."

echo "Starting router (mongos)..."
mongos --configdb configReplSet/localhost:27018 --port 27017 --bind_ip_all --fork --logpath /data/mongos.log

# Wait for router to be ready
echo "Waiting for router..."
until mongosh --port 27017 --eval "db.adminCommand('ping')" &> /dev/null; do
  sleep 2
done
echo "Router is ready."

echo "Adding shard to the cluster via router..."
mongosh --port 27017 < /scripts/init-router.js
echo "Shard added."

echo "Cluster setup complete. Tailing mongos log..."
# Keep container running by tailing a log file
tail -f /data/mongos.log
```

Explanation:
* Starts mongod as a config server (--configsvr), assigns it to configReplSet, listens on port 27018, stores data in /data/configdb, and forks to the background (--fork).
* Waits until the server responds to a ping.
* Initializes the config server replica set using mongosh and init-replica.js.
* Starts mongod as a shard server (--shardsvr), assigns it to shard1, listens on port 27019, stores data in /data/shard1, and forks.
* Waits for the shard server.
* Initializes the shard replica set using mongosh and init-shard.js.
* Starts mongos (the router), connects it to the config replica set (--configdb configReplSet/localhost:27018), listens on port 27017, binds to all interfaces (--bind_ip_all) so it's accessible from outside the container, and forks.
* Waits for the router.
* Adds the shard to the cluster using mongosh connected to the router and init-router.js.
* Uses tail -f on the mongos log file as the main process to keep the container running.
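
A ping only shows that a process is accepting connections. After rs.initiate, a replica set still needs a moment to elect its primary. If you want the script to wait for that explicitly, one possible approach is sketched below. wait_for_primary is a hypothetical helper, not part of the script above, and it is meant for the two mongod instances rather than for mongos.

```shell
# wait_for_primary PORT [TIMEOUT_SECONDS]
# Polls db.hello() on the mongod at PORT until it reports itself as the
# writable primary, or gives up after TIMEOUT_SECONDS (default 60).
wait_for_primary() {
  port="$1"
  deadline=$(( $(date +%s) + ${2:-60} ))
  until [ "$(mongosh --port "$port" --quiet --eval 'db.hello().isWritablePrimary' 2>/dev/null)" = "true" ]; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "Timed out waiting for a primary on port $port" >&2
      return 1
    fi
    sleep 1
  done
}
```

It could be called as `wait_for_primary 27018` after the config server is initiated and as `wait_for_primary 27019` after the shard is initiated.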

Step 5: Build and Run the Cluster

Now we can build the Docker image and run the container.

Build the Image:
- Make sure you are in the mongodb-dev directory in your terminal.
- Run:
```
docker build -t local-mongo-cluster .
```
  This builds the image using our Dockerfile and tags it as local-mongo-cluster.

Run the Container:

Run the image as a container:

```
docker run -d --name mongo-cluster-dev \
  -p 27017:27017 \
  -p 27018:27018 \
  -p 27019:27019 \
  local-mongo-cluster
```

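Explanation:
* `-d`: Run in detached (background) mode.
* `--name mongo-cluster-dev`: Give the container a convenient name.
* `-p <host-port>:<container-port>`: Map ports from your host machine to the container. We map all three for potential direct access/debugging, but only 27017 (mongos) is essential for application connections. Note that the startup script passes --bind_ip_all only to mongos. mongod listens on localhost by default, so 27018 and 27019 are reachable from the host only if you also add --bind_ip_all to the two mongod commands.
* `local-mongo-cluster`: The name of the image to run.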

Check Logs (Optional):
- You can view the startup logs:
```
docker logs mongo-cluster-dev -f
```

*   Press `Ctrl+C` to stop following the logs. You should see messages indicating the successful start and initialization of all components.
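
If you would rather wait for startup in a script than watch the logs, the same output can be polled. The sketch below looks for the "Cluster setup complete" line that start-cluster.sh prints; wait_for_cluster is a hypothetical helper name.

```shell
# wait_for_cluster [CONTAINER] [TRIES]
# Polls `docker logs` every 2 seconds until the startup script's final
# message appears, giving up after TRIES attempts (default 30).
wait_for_cluster() {
  name="${1:-mongo-cluster-dev}"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    if docker logs "$name" 2>&1 | grep -q "Cluster setup complete"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 2
  done
  echo "Cluster did not report completion" >&2
  return 1
}
```

For example, `wait_for_cluster mongo-cluster-dev && mongosh mongodb://localhost:27017` connects only after the setup has finished.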

Step 6: Connect and Verify

Connect using mongosh:
- If you have mongosh installed locally:
```
mongosh mongodb://localhost:27017
```

*   This connects you to the `mongos` router.

Verify Cluster Status:
- Once connected via mongosh, run:
```
sh.status()
```

*   This command should show information about the sharded cluster, including the `configReplSet`, the `shard1` shard, and databases.
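
With the cluster verified, a natural next step is to shard a collection so that sh.status() has something to show. The sketch below is one way to do it from the shell. shard_collection is a hypothetical helper, the hashed shard key is just an assumption for the example, and the explicit sh.enableSharding call is included for older server versions that require it.

```shell
# shard_collection NAMESPACE [KEY_FIELD]
# Shards NAMESPACE (written as "db.collection") on a hashed KEY_FIELD (default: _id)
# through the router on port 27017.
shard_collection() {
  ns="$1"
  key="${2:-_id}"
  mongosh --port 27017 --quiet --eval "
    sh.enableSharding('${ns%%.*}');
    printjson(sh.shardCollection('$ns', { '$key': 'hashed' }));
  "
}
```

For instance, `shard_collection myappdb.orders` would shard a hypothetical orders collection in the myappdb database used later in the Compose example.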

Using with Docker Compose (Integration)

While the docker run command works, integrating this into a docker-compose.yaml file is often more practical for multi-service applications.

Consider a dev.docker-compose.yaml in your project's root directory:

```
# dev.docker-compose.yaml (Example Snippet)
volumes:
  mongodb_data:
    name: mongodb_cluster_data # Use a named volume for persistence

services:
  mongodb-cluster:
    build:
      context: ./mongodb-dev # Path to the directory containing the Dockerfile
      dockerfile: Dockerfile
    container_name: mongodb-cluster
    ports:
      - "27017:27017"  # Expose router port to host
      # - "27018:27018" # Optional: Expose config server
      # - "27019:27019" # Optional: Expose shard
    volumes:
      - mongodb_data:/data # Mount named volume to persist data
    # Add healthcheck, restart policy etc. as needed
    # healthcheck:
    #   test: ["CMD", "mongosh", "--port", "27017", "--eval", "db.adminCommand('ping')"]
    #   interval: 20s
    #   timeout: 10s
    #   retries: 5
    #   start_period: 30s
    # restart: unless-stopped

  # Example application service
  my-app:
    build: . # Or specify app context/dockerfile
    ports:
      - "8080:8080"
    environment:
      # Connect using the service name 'mongodb-cluster'
      - DATABASE_URL=mongodb://mongodb-cluster:27017/myappdb
    depends_on:
      mongodb-cluster:
        # condition: service_healthy # Use if healthcheck is defined
        condition: service_started # Basic dependency
```

To run with Compose:

```
docker compose -f dev.docker-compose.yaml up -d
```

To stop:
```
docker compose -f dev.docker-compose.yaml down
```
(Add -v to the down command to also remove the mongodb_data volume.)
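
During development it is common to want a completely fresh cluster. The two commands above can be combined into a small helper, sketched here under the assumption that losing all data in the volume is acceptable. reset_cluster is a hypothetical name.

```shell
# reset_cluster [COMPOSE_FILE]
# Stops the stack, removes its volumes (all cluster data is lost),
# then rebuilds the image and starts the stack again.
reset_cluster() {
  file="${1:-dev.docker-compose.yaml}"
  docker compose -f "$file" down -v &&
    docker compose -f "$file" up -d --build
}
```

The `&&` means the stack is only started again if the teardown succeeded.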

Conclusion

You have successfully built and run a local MongoDB sharded cluster using Docker. This setup provides a valuable tool for developing and testing applications designed for sharded environments. Remember to adapt port mappings, volumes, and Compose configurations to your specific project needs.