Building a Local MongoDB Sharded Cluster with Docker


Apr 5, 2025 - 16:59

This tutorial guides you through setting up a local MongoDB sharded cluster using Docker. This setup is ideal for development and testing, providing an environment that closely resembles a production sharded deployment.

Target Audience: Developers needing a local, sharded MongoDB instance.
Prerequisites:

  • Docker installed (https://docs.docker.com/get-docker/)
  • Docker Compose installed (usually included with Docker Desktop)
  • Basic understanding of Docker and MongoDB concepts (sharding, replica sets).
  • A text editor.
  • A terminal or command prompt.

Why a Local Sharded Cluster?

  • Transaction Support: The main reason. Multi-document transactions require a replica set or a sharded cluster; a basic single-node (standalone) MongoDB container does not support them, so developing and testing transactional code needs a cluster.
  • Realistic Development: Test application logic against a sharded environment.
  • Performance Insights: Understand query behavior across shards.
  • Isolation: Avoid conflicts with shared development databases.

Cluster Architecture

We will build a minimal sharded cluster consisting of:

  1. Config Server Replica Set (1 Node): Stores cluster metadata. (Port: 27018)
  2. Shard Replica Set (1 Node): Stores a portion of the data. (Port: 27019)
  3. Mongos Router (1 Instance): The entry point for applications. (Port: 27017)

All components will run within a single Docker container, orchestrated by a startup script.

Step 1: Project Setup

  1. Create a dedicated directory for this setup. Let's call it mongodb-dev.

    mkdir mongodb-dev
    cd mongodb-dev
    
  2. Inside mongodb-dev, we will create the necessary configuration and script files.

Step 2: Create Initialization Scripts

These JavaScript files contain commands run by mongosh to initialize the replica sets and register the shard.

  1. init-replica.js (Initialize Config Server Replica Set)

    • Create a file named init-replica.js with the following content:

      // init-replica.js
      rs.initiate({
        _id: "configReplSet",
        configsvr: true,
        members: [
          { _id: 0, host: "localhost:27018" }
        ]
      });
      

Explanation: This initiates a replica set named configReplSet, marks it as a config server (configsvr: true), and adds the mongod instance running on localhost:27018 as its only member.

  2. init-shard.js (Initialize Shard Replica Set)

    • Create a file named init-shard.js with the following content:

      // init-shard.js
      rs.initiate({
        _id: "shard1",
        members: [
          { _id: 0, host: "localhost:27019" }
        ]
      });
      

Explanation: This initiates a regular replica set named shard1 with the mongod instance running on localhost:27019 as its only member.

  3. init-router.js (Add Shard to Cluster)

    • Create a file named init-router.js with the following content:

      // init-router.js
      sh.addShard("shard1/localhost:27019");
      

Explanation: This command, run against the mongos router, tells the cluster about the shard1 replica set located at localhost:27019.
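
rs.initiate() returns once the configuration is accepted, and the single member is typically elected primary a moment later. If a later step depends on a primary being available, a check like the following can be polled first. This is a minimal sketch, not part of the original scripts: the function name is_primary is hypothetical, and it assumes mongosh is on the PATH and that the server supports the hello command (MongoDB 5.0 or newer).

```shell
# is_primary (hypothetical helper): succeeds when the mongod listening on the
# given port reports itself as the writable primary of its replica set.
is_primary() {
  local port="$1"
  local answer
  # --quiet suppresses the connection banner so only the printed value remains.
  answer="$(mongosh --port "$port" --quiet \
    --eval 'print(db.hello().isWritablePrimary)' 2>/dev/null)"
  [ "$answer" = "true" ]
}
```

In a startup script this could follow each rs.initiate call, for example: until is_primary 27019; do sleep 1; done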

Step 3: Create the Dockerfile

This file defines the Docker image for our cluster.

  • Create a file named Dockerfile (no extension) with the following content:

    # Dockerfile
    FROM mongo:latest
    
    # Create necessary directories
    RUN mkdir -p /data/configdb /data/shard1 /scripts
    
    # Copy initialization scripts
    COPY init-replica.js /scripts/
    COPY init-shard.js /scripts/
    COPY init-router.js /scripts/
    
    # Copy the startup script
    COPY start-cluster.sh /scripts/
    RUN chmod +x /scripts/start-cluster.sh
    
    # Expose ports (Router, Config Server, Shard)
    EXPOSE 27017 27018 27019
    
    # Set the working directory (optional)
    WORKDIR /
    
    # Command to run when the container starts
    CMD ["/scripts/start-cluster.sh"]
    

Explanation:
* Uses the official mongo image as a base.
* Creates directories for database files (/data/*) and scripts (/scripts).
* Copies our .js initialization scripts and the (yet to be created) startup script into /scripts.
* Makes the startup script executable.
* Exposes the ports for the router, config server, and shard.
* Sets the command to run the start-cluster.sh script when the container launches.
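
Because the Dockerfile copies four files by name, docker build fails if any of them is missing from the build context. A small pre-build check can list whatever is absent. The sketch below uses a hypothetical function name, check_build_context, and is not part of the original setup.

```shell
# check_build_context (hypothetical helper): verifies that the Dockerfile and
# every file it COPYs are present in the given directory (default: current).
check_build_context() {
  local dir="${1:-.}"
  local missing=0
  local file
  for file in Dockerfile init-replica.js init-shard.js init-router.js start-cluster.sh; do
    if [ ! -f "$dir/$file" ]; then
      echo "Missing: $dir/$file" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It can gate the build once all files exist: check_build_context . && docker build -t local-mongo-cluster .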

Step 4: Create the Startup Script

This script orchestrates the starting and initialization of all MongoDB components within the container.

  • Create a file named start-cluster.sh with the following content:

    #!/bin/bash
    # start-cluster.sh
    
    echo "Starting config server (configReplSet)..."
    mongod --configsvr --replSet configReplSet --port 27018 --dbpath /data/configdb --fork --logpath /data/configdb/config.log
    
    # Wait for config server to be ready
    echo "Waiting for config server..."
    until mongosh --port 27018 --eval "db.adminCommand('ping')" &> /dev/null; do
      sleep 2
    done
    echo "Config server is ready."
    
    echo "Initializing config server replica set..."
    mongosh --port 27018 < /scripts/init-replica.js
    echo "Config server replica set initialized."
    
    echo "Starting shard server (shard1)..."
    mongod --shardsvr --replSet shard1 --port 27019 --dbpath /data/shard1 --fork --logpath /data/shard1/shard1.log
    
    # Wait for shard server to be ready
    echo "Waiting for shard server..."
    until mongosh --port 27019 --eval "db.adminCommand('ping')" &> /dev/null; do
      sleep 2
    done
    echo "Shard server is ready."
    
    echo "Initializing shard replica set..."
    mongosh --port 27019 < /scripts/init-shard.js
    echo "Shard replica set initialized."
    
    echo "Starting router (mongos)..."
    mongos --configdb configReplSet/localhost:27018 --port 27017 --bind_ip_all --fork --logpath /data/mongos.log
    
    # Wait for router to be ready
    echo "Waiting for router..."
    until mongosh --port 27017 --eval "db.adminCommand('ping')" &> /dev/null; do
      sleep 2
    done
    echo "Router is ready."
    
    echo "Adding shard to the cluster via router..."
    mongosh --port 27017 < /scripts/init-router.js
    echo "Shard added."
    
    echo "Cluster setup complete. Tailing mongos log..."
    # Keep container running by tailing a log file
    tail -f /data/mongos.log
    

Explanation:
* Starts mongod as a config server (--configsvr), assigns it to configReplSet, listens on port 27018, stores data in /data/configdb, and forks to the background (--fork).
* Waits until the server responds to a ping.
* Initializes the config server replica set using mongosh and init-replica.js.
* Starts mongod as a shard server (--shardsvr), assigns it to shard1, listens on port 27019, stores data in /data/shard1, and forks.
* Waits for the shard server.
* Initializes the shard replica set using mongosh and init-shard.js.
* Starts mongos (the router), connects it to the config replica set (--configdb configReplSet/localhost:27018), listens on port 27017, binds to all interfaces (--bind_ip_all) so it's accessible from outside the container, and forks.
* Waits for the router.
* Adds the shard to the cluster using mongosh connected to the router and init-router.js.
* Uses tail -f on the mongos log file as the main process to keep the container running.
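
The until loops in the script retry forever, so a component that fails to start leaves the container waiting with no error. One possible variation bounds the number of attempts. The helper name wait_for and the optional WAIT_INTERVAL variable are assumptions introduced for this sketch; neither appears in the original script.

```shell
# wait_for (hypothetical helper): runs a command repeatedly until it succeeds
# or the attempt limit is reached.
# Usage: wait_for <max_attempts> <command> [args...]
wait_for() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@" > /dev/null 2>&1; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    # Pause between attempts; defaults to 2 seconds like the original loops.
    sleep "${WAIT_INTERVAL:-2}"
  done
}
```

Each wait in start-cluster.sh could then become a single line that stops the script on failure, such as: wait_for 30 mongosh --port 27018 --quiet --eval "db.adminCommand('ping')" || exit 1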

Step 5: Build and Run the Cluster

Now we can build the Docker image and run the container.

  1. Build the Image:

    • Make sure you are in the mongodb-dev directory in your terminal.
    • Run:

      docker build -t local-mongo-cluster .
      

      This builds the image using our Dockerfile and tags it as local-mongo-cluster.

  2. Run the Container:

    • Run the image as a container:

      docker run -d --name mongo-cluster-dev \
        -p 27017:27017 \
        -p 27018:27018 \
        -p 27019:27019 \
        local-mongo-cluster
      
      Explanation:

      • -d: Run in detached (background) mode.
      • --name mongo-cluster-dev: Give the container a convenient name.
      • -p <host_port>:<container_port>: Map ports from your host machine to the container. All three are mapped for direct access and debugging, but only 27017 (mongos) is essential for application connections.
      • local-mongo-cluster: The name of the image to run.

  3. Check Logs (Optional):

    • You can view the startup logs:

      docker logs mongo-cluster-dev -f
      
    • Press Ctrl+C to stop following the logs. You should see messages indicating the successful start and initialization of all components.
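
Instead of watching the logs by hand, a script can look for the final message that start-cluster.sh prints. The following is a rough sketch with a hypothetical function name, cluster_ready; it assumes the container name used in the docker run command.

```shell
# cluster_ready (hypothetical helper): succeeds once the container's log
# contains the last message echoed by start-cluster.sh.
cluster_ready() {
  local container="${1:-mongo-cluster-dev}"
  local logs
  # docker logs writes to both stdout and stderr, so capture both streams.
  logs="$(docker logs "$container" 2>&1)"
  case "$logs" in
    *"Cluster setup complete"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller could poll it before connecting: until cluster_ready; do sleep 2; done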

Step 6: Connect and Verify

  1. Connect using mongosh:

    • If you have mongosh installed locally:

      mongosh mongodb://localhost:27017
      
    • This connects you to the mongos router.

  2. Verify Cluster Status:

    • Once connected via mongosh, run:

      sh.status()
      
    • This command should show information about the sharded cluster, including the configReplSet, the shard1 shard, and databases.
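
The same verification can be scripted, and since transaction support is the main motivation for this setup, a quick transaction smoke test is a natural companion. Both helpers below are hypothetical sketches rather than part of the original tutorial. They assume mongosh is installed on the host, and the MONGOS_URI variable and the throwaway database name txdemo are chosen purely for illustration.

```shell
# Connection string for the mongos router; override by exporting MONGOS_URI.
MONGOS_URI="${MONGOS_URI:-mongodb://localhost:27017}"

# shard_registered (hypothetical helper): succeeds when the named shard
# appears in the output of the listShards admin command.
shard_registered() {
  local shard="$1"
  mongosh "$MONGOS_URI" --quiet \
    --eval 'print(db.adminCommand({ listShards: 1 }).shards.map(s => s._id).join("\n"))' \
    2>/dev/null | grep -qx "$shard"
}

# transaction_smoke_test (hypothetical helper): recreates a scratch
# collection, then inserts one document inside a committed transaction.
transaction_smoke_test() {
  local script='
    const demo = db.getSiblingDB("txdemo");
    demo.getCollection("items").drop();
    demo.createCollection("items");
    const session = db.getMongo().startSession();
    const items = session.getDatabase("txdemo").getCollection("items");
    session.startTransaction();
    items.insertOne({ createdBy: "smoke-test" });
    session.commitTransaction();
    session.endSession();
    print("transaction committed");
  '
  mongosh "$MONGOS_URI" --quiet --eval "$script"
}
```

For example: shard_registered shard1 && transaction_smoke_test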

Using with Docker Compose (Integration)

While the docker run command works, integrating this into a docker-compose.yaml file is often more practical for multi-service applications.

Consider a dev.docker-compose.yaml in your project's root directory:

# dev.docker-compose.yaml (Example Snippet)
volumes:
  mongodb_data:
    name: mongodb_cluster_data # Use a named volume for persistence

services:
  mongodb-cluster:
    build:
      context: ./mongodb-dev # Path to the directory containing the Dockerfile
      dockerfile: Dockerfile
    container_name: mongodb-cluster
    ports:
      - "27017:27017"  # Expose router port to host
      # - "27018:27018" # Optional: Expose config server
      # - "27019:27019" # Optional: Expose shard
    volumes:
      - mongodb_data:/data # Mount named volume to persist data
    # Add healthcheck, restart policy etc. as needed
    # healthcheck:
    #   test: ["CMD", "mongosh", "--port", "27017", "--eval", "db.adminCommand('ping')"]
    #   interval: 20s
    #   timeout: 10s
    #   retries: 5
    #   start_period: 30s
    # restart: unless-stopped

  # Example application service
  my-app:
    build: . # Or specify app context/dockerfile
    ports:
      - "8080:8080"
    environment:
      # Connect using the service name 'mongodb-cluster'
      - DATABASE_URL=mongodb://mongodb-cluster:27017/myappdb
    depends_on:
      mongodb-cluster:
        # condition: service_healthy # Use if healthcheck is defined
        condition: service_started # Basic dependency

  • To run with Compose:

    docker compose -f dev.docker-compose.yaml up -d
    
  • To stop:

    docker compose -f dev.docker-compose.yaml down
    

    (Add -v to down to also remove the mongodb_data volume)
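
If the commented-out healthcheck in the Compose file is enabled, Docker records a health status for the container, and scripts can poll it before starting dependent work. The sketch below uses a hypothetical function name, wait_healthy. Without a healthcheck the status never reads healthy, so the function gives up after the attempt limit.

```shell
# wait_healthy (hypothetical helper): polls Docker until the container's
# healthcheck reports "healthy", or gives up after max_attempts polls.
# Usage: wait_healthy <container> [max_attempts]
wait_healthy() {
  local container="$1"
  local max_attempts="${2:-30}"
  local attempt=1
  local status=""
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "${WAIT_INTERVAL:-2}"
  done
  echo "$container did not become healthy (last status: ${status:-unknown})" >&2
  return 1
}
```

With the container name from the Compose example: wait_healthy mongodb-cluster 30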

Conclusion

You have successfully built and run a local MongoDB sharded cluster using Docker. This setup provides a valuable tool for developing and testing applications designed for sharded environments. Remember to adapt port mappings, volumes, and Compose configurations to your specific project needs.