Hands-On with P2P Networks: Building a Messaging System

Imagine a world where communication happens directly between devices: no central server, no middleman, just peer-to-peer. That’s the essence of pure peer-to-peer (P2P) networks. These networks are resilient, decentralized, and increasingly relevant in systems like IoT device communication, file sharing, and secure messaging.
In this article, we’ll break down how a pure peer-to-peer system works by building a simple messaging system from scratch using Golang. We'll explore core P2P concepts like node identification, message routing, and decentralized lookup, and implement them step by step in a way that’s easy to follow, even if you’re not a Go expert.
By the end, you’ll not only understand the theory behind P2P communication but also have a working messaging network where nodes can discover and message each other without relying on any central server.
Let’s dive into the mechanics behind this powerful network design.
My Journey: From BitTorrent to DHTs
While writing my own mini BitTorrent client to download Linux distributions, I stumbled upon the term pure peer-to-peer network, and more specifically, Kademlia. At first, these terms felt overwhelming, but I kept exploring and things started to click.
That’s when I came across something fundamental: the concept of a Distributed Hash Table (DHT). So first, I’ll break down the core terms used in DHTs to build a solid foundation. Then, we’ll move on to the prototype project I built, which will make everything much clearer through practical demonstration.
What is a DHT?
A Distributed Hash Table (DHT) is basically like a hash table, but spread across many peers in a network. A regular hash table stores (key, value) pairs and offers fast lookups, often in O(1) time complexity, because the data is kept in one place, usually in memory.
But a DHT works differently. Here, the key-value pairs aren’t stored in one table; they’re distributed across multiple nodes in the network. Each node (often identified by its IP address and a unique Node ID) holds a portion of the data. So, when you want to find a key, you don’t search a local table; you need to find the right node that holds it.
That’s where routing algorithms come in: they help direct the query to the correct node. Among several DHT routing algorithms, Kademlia stands out as one of the most efficient, because it uses a clever XOR-based distance metric for routing, and it can locate a key in around O(log N) time, where N is the number of nodes in the network.
What is Kademlia?
Now you might be wondering, what is Kademlia?
Kademlia is a routing algorithm used in DHT-based peer-to-peer systems. It’s built on a clever idea: instead of using geographical or IP-based distances, it uses something called the XOR distance metric.
Here’s how it works:
Each peer (or node) has a Node ID, typically a 160-bit (20-byte) value.
When you want to find a node or a key, you XOR the target ID with your own ID.
The result gives you a distance: the smaller the value (the closer the XOR is to 0), the closer the node is to you in the DHT.
The most significant bit (from the left) where two IDs differ determines the "distance" in the network; this forms the basis of the routing.
I get that all of this may sound a bit maze-like right now, but don’t worry; once I show you an example, it’ll start making a lot more sense.
Right now, I’m just laying the foundation, explaining the basic concepts you’ll need to understand Kademlia and how it helps structure a pure peer-to-peer network.
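For instance, take two 4-bit IDs, 1011 and 1001 (these short IDs are just for illustration; our prototype below uses 16-bit IDs). Their XOR is 0010, which is 2 in decimal, so the two nodes are very close in the ID space. The highest bit position where they differ is bit 1 (counting from 0 at the least significant bit), and that position is what later decides which K-bucket a contact lands in.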
A Guide to the Routing Table
As we dive deeper into Kademlia, another important concept comes up: the routing table. So, let’s break that down.
A routing table is a data structure each node maintains to keep track of other nodes in the network. It is made up of K-buckets. Now you might be wondering, what are K-buckets?
Think of K-buckets as containers that store information about the closest nodes to a given node, based on XOR distance. Each K-bucket typically holds information about up to K nodes; in many implementations, K is 20.
Now, the number of K-buckets a node has depends on the bit-length of the Node ID. For example, if the Node ID is 160 bits long (as in many real-world systems like BitTorrent), then the routing table will have 160 K-buckets, one for each possible bit-prefix length.
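In the prototype we’ll build below, the IDs are only 16 bits, so each node’s routing table holds 16 K-buckets, and each bucket is capped at just 2 contacts to keep the output easy to trace.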
Messaging System
Now that you’re familiar with the terms used in a pure peer-to-peer network, let’s look at our example: a pure peer-to-peer messaging system.
Let’s start with the NodeID. In our example, each peer is assigned a unique NodeID, which for simplicity is just 16 bits (2 bytes). This keeps things easy to follow, though in real systems like Kademlia, the IDs are usually 160 bits. The NodeID is simply an array of bytes and can be represented as a hexadecimal string for readability.
const IdLength = 8 // bits per byte
const IdBits = 16  // total ID length in bits

type NodeID [IdBits / IdLength]byte // 16 / 8 = 2 bytes

func (id NodeID) String() string {
	return hex.EncodeToString(id[:])
}
To measure how "far apart" two nodes are in the network, we use the XOR distance metric, a fundamental concept in DHT (Distributed Hash Table) systems. The XOR of two node IDs gives a new value that represents how different they are bit-by-bit. A smaller XOR result means the nodes are closer in the ID space.
func (id NodeID) XOR(otherId NodeID) *big.Int {
	var result NodeID
	for i := 0; i < len(id); i++ {
		result[i] = id[i] ^ otherId[i]
	}
	return new(big.Int).SetBytes(result[:])
}
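To make the distance metric concrete, here is a tiny, purely illustrative check of the two pieces above. It assumes the NodeID type and its String and XOR methods live in the same package, with encoding/hex, fmt, and math/big imported.
func main() {
	a := NodeID{0x12, 0x34}
	b := NodeID{0x12, 0x3c}
	fmt.Println("a =", a, "b =", b)     // a = 1234 b = 123c
	fmt.Println("distance =", a.XOR(b)) // distance = 8, since they differ only in the low bits
}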
Once we can uniquely identify and compare distances between peers, the next step is to maintain a list of known peers. This is where K-Buckets come into play. A K-Bucket is simply a container that stores a small list of peer contacts, typically capped at a fixed number like 20 in real systems, but we’ll use 2 here for demonstration purposes.
Each contact holds essential info about a peer: its NodeID, network address (e.g., IP and port), and the last time it was seen online.
const contactSize = 2

type Contacts struct {
	Id           NodeID
	Address      string
	last_seen_at time.Time
}

type KBucket struct {
	contacts []Contacts
}
We can search the bucket for a specific peer by ID using the Find method, which returns the address if the node is found.
func (kb *KBucket) Find(id NodeID) string {
	for _, c := range kb.contacts {
		if c.Id == id {
			return c.Address
		}
	}
	return ""
}
To maintain the K-Bucket, we use the Add method. It checks if the node already exists. If there's space, it adds the new contact. If the bucket is full, it tries to evict the least recently seen contact using the Evict method.
func (kb *KBucket) Add(contact Contacts) bool {
	for _, c := range kb.contacts {
		if c.Id == contact.Id {
			return true
		}
	}
	if len(kb.contacts) < contactSize {
		kb.contacts = append(kb.contacts, contact)
		return true
	} else {
		evicted := kb.Evict()
		if evicted {
			kb.contacts = append(kb.contacts, contact)
			return true
		} else {
			fmt.Println("There is no space in this bucket")
			return false
		}
	}
}
Eviction is a key part of a robust peer list. When the bucket is full, we sort the contacts by their last_seen_at timestamp (oldest first) and attempt to ping the oldest contact. If it's offline, we safely remove it and make room for a new one.
func (kb *KBucket) Evict() bool {
	sort.Slice(kb.contacts, func(i, j int) bool {
		return kb.contacts[i].last_seen_at.Before(kb.contacts[j].last_seen_at)
	})
	contact := kb.contacts[0]
	conn, err := net.Dial("tcp", contact.Address)
	if err != nil {
		// The oldest contact is unreachable: drop it to make room.
		kb.contacts = append(kb.contacts[:0], kb.contacts[1:]...)
		return true
	}
	conn.Close() // the oldest contact is still alive, so keep it
	return false
}
This strategy aligns with Kademlia's Least Recently Seen (LRS) eviction policy, ensuring that more active and reliable peers remain in the routing table while stale or unreachable peers are replaced. This is essential for maintaining a responsive and scalable peer-to-peer network.
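As a quick, illustrative sanity check (assuming it runs in the same package as the definitions above, so the unexported last_seen_at field is accessible):
kb := &KBucket{}
kb.Add(Contacts{Id: NodeID{0x00, 0x01}, Address: "localhost:8001", last_seen_at: time.Now()})
kb.Add(Contacts{Id: NodeID{0x00, 0x02}, Address: "localhost:8002", last_seen_at: time.Now()})
fmt.Println(kb.Find(NodeID{0x00, 0x02})) // localhost:8002
fmt.Println(kb.Find(NodeID{0x00, 0x03})) // "" (unknown peer)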
Once the structure of the K-Buckets is in place, we move on to the core component of peer discovery: the Routing Table. The routing table acts as a high-level container for all buckets and is responsible for managing peer lookups and updates across the network. It consists of a slice of K-Buckets and the current node's ID.
type RoutingTable struct {
	buckets []KBucket
	selfId  NodeID
}
Each peer maintains its own routing table, which contains buckets representing various distance ranges from the current node. When a new contact is introduced, the appropriate bucket is determined based on the Most Significant Bit (MSB) index of the XOR distance between the current node ID and the contact's ID. This is handled by the GetMSBIndex utility function:
func GetMSBIndex(distance *big.Int) int {
	if distance.Sign() == 0 {
		return -1
	}
	return distance.BitLen() - 1
}
This function ensures that a contact is placed in the correct bucket based on its logical distance from the current node. If the calculated distance is zero (i.e., the node is attempting to add itself), the function returns -1, and the contact is ignored.
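To tie this back to the earlier example: if the XOR distance between two of our 16-bit IDs came out to 0x0208 (binary 0000001000001000), its BitLen would be 10, so GetMSBIndex returns 9 and the contact is stored in bucket 9.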
The Add method of the routing table adds a given contact to the correct bucket, if possible:
func (rt *RoutingTable) Add(contact Contacts) {
	distance := rt.selfId.XOR(contact.Id)
	index := GetMSBIndex(distance)
	if index >= 0 {
		isAdded := rt.buckets[index].Add(contact)
		if isAdded {
			fmt.Printf("The contact is added to bucket: %d \n", index)
		} else {
			fmt.Println("Not added")
		}
	} else {
		fmt.Println("A peer can't add itself to its own routing table")
	}
}
Another critical feature is the ability to find the closest peers to a target node ID. This is particularly useful during message forwarding and when bootstrapping the network. The FindClosestContacts method performs a global scan across all buckets, sorts the contacts by XOR distance to the target ID, and returns the closest k contacts:
func (rt *RoutingTable) FindClosestContacts(targetID NodeID, k int) []Contacts {
	var allContacts []Contacts
	for _, bucket := range rt.buckets {
		allContacts = append(allContacts, bucket.contacts...)
	}
	sort.Slice(allContacts, func(i, j int) bool {
		d1 := targetID.XOR(allContacts[i].Id)
		d2 := targetID.XOR(allContacts[j].Id)
		return d1.Cmp(d2) == -1
	})
	if len(allContacts) > k {
		return allContacts[:k]
	}
	return allContacts
}
To support the initialization of nodes and routing tables, we define utility constructors. NewNodeId generates a hashed node ID using SHA-1, and NewRoutingTable initializes the routing table with empty K-Buckets.
func NewNodeId(data []byte) NodeID {
	hash := sha1.Sum(data[:])
	var newId NodeID
	copy(newId[:], hash[:len(newId)])
	return newId
}

func NewRoutingTable(selfID NodeID) *RoutingTable {
	buckets := make([]KBucket, IdBits)
	for i := 0; i < len(buckets); i++ {
		buckets[i] = KBucket{contacts: make([]Contacts, 0, contactSize)}
	}
	return &RoutingTable{buckets: buckets, selfId: selfID}
}
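Putting the constructors and the routing table's Add method together, a quick illustrative sketch (the addresses here are placeholders I made up):
selfID := NewNodeId([]byte("localhost:8000"))
rt := NewRoutingTable(selfID)

peerID := NewNodeId([]byte("localhost:8001"))
rt.Add(Contacts{Id: peerID, Address: "localhost:8001"}) // prints the bucket index it lands in (assuming the two IDs differ)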
Adding Messaging Capability to Peers
With the routing infrastructure in place, we can now add real messaging capabilities between nodes in the network. This is crucial for simulating peer-to-peer interactions like sending commands, exchanging data, or simulating real-world communication in a distributed system.
We introduce a MessagingPeer structure, which encapsulates the logic to start a server, send and receive messages, and use the routing table to locate other peers in the network.
1. Message Structure
We define a Messages struct to encapsulate message-related data:
type Messages struct {
	SenderId       NodeID
	ReceiverId     NodeID
	MessageContent string
	MessageId      string
}
SenderId and ReceiverId identify the message endpoints.
MessageContent holds the actual message text.
MessageId can be used for tracking, deduplication, or threading.
2. MessagingPeer Struct
We now define a MessagingPeer to wrap the node's messaging functionality:
type MessagingPeer struct {
	ID           NodeID
	Messages     []Messages
	Port         int
	Address      string
	RoutingTable *RoutingTable
}
Messages keeps a record of received messages.
RoutingTable is used to find other nodes (reusing our earlier routing logic).
Each peer listens on its own Port.
3. Starting the Server
Each peer runs a TCP server to receive messages:
func (mp *MessagingPeer) StartServer() {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%d", mp.Port))
	if err != nil {
		log.Fatalf("Failed to start server: %v", err)
	}
	defer ln.Close()
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go mp.handleConnection(conn)
	}
}
The server listens on its configured port.
Incoming connections are handled concurrently.
4. Receiving and Storing Messages
When a peer receives a message, it is deserialized and stored:
func (mp *MessagingPeer) handleConnection(conn net.Conn) {
	defer conn.Close()
	buf := make([]byte, 1024) // messages larger than 1 KB would be truncated in this simple prototype
	n, err := conn.Read(buf)
	if err != nil {
		return
	}
	var msg Messages
	err = json.Unmarshal(buf[:n], &msg)
	if err != nil {
		return
	}
	mp.Messages = append(mp.Messages, msg)
}
This allows each peer to act as a receiver in the network.
5. Local Node Lookup
We add a helper function that wraps our existing routing logic to find nodes by ID:
func (mp *MessagingPeer) FindNode(targetID NodeID) []Contacts {
	return mp.RoutingTable.FindClosestContacts(targetID, contactSize)
}
This uses our bucket-based routing table to get nearby contacts.
6. Iterative Lookup Across Peers
If a node is not found locally, we query other known peers:
func (mp *MessagingPeer) IterativeFindNode(targetId NodeID, knownPeers map[string]*MessagingPeer) []Contacts {
	visited := make(map[string]bool)
	shortList := mp.RoutingTable.FindClosestContacts(targetId, contactSize)
	closest := shortList
	for {
		newShortList := []Contacts{}
		for _, contact := range shortList {
			idStr := contact.Id.String()
			if visited[idStr] {
				continue
			}
			visited[idStr] = true
			peer := knownPeers[idStr]
			if peer == nil {
				continue
			}
			closerContacts := peer.FindNode(targetId)
			for _, c := range closerContacts {
				if !visited[c.Id.String()] {
					newShortList = append(newShortList, c)
				}
			}
		}
		all := append(closest, newShortList...)
		sort.Slice(all, func(i, j int) bool {
			return targetId.XOR(all[i].Id).Cmp(targetId.XOR(all[j].Id)) < 0
		})
		if len(all) > contactSize {
			all = all[:contactSize]
		}
		if EqualContacts(closest, all) {
			break
		}
		closest = all
		shortList = newShortList
	}
	return closest
}
Works similarly to Kademlia’s node lookup.
Stops when no closer peers are found.
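One detail to note: the loop above relies on an EqualContacts helper that isn't shown in this article (it lives in the full repository). Based on how it's used here, a minimal sketch of it could look like this:
func EqualContacts(a, b []Contacts) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i].Id != b[i].Id {
			return false
		}
	}
	return true
}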
7. Sending a Message to Another Peer
We now allow one peer to send a message to another, using either the routing table or iterative search:
func (mp *MessagingPeer) SendMessage(content string, peerId NodeID, network map[string]*MessagingPeer) (string, error) {
	if mp.ID == peerId {
		return "", fmt.Errorf("same peer can't send message to itself")
	}
	distance := mp.ID.XOR(peerId)
	index := GetMSBIndex(distance)
	peerAddress := mp.RoutingTable.buckets[index].Find(peerId)
	if peerAddress == "" {
		closest := mp.IterativeFindNode(peerId, network)
		for _, con := range closest {
			if con.Id == peerId {
				peerAddress = con.Address
				break
			}
		}
	}
	if peerAddress == "" {
		return "", fmt.Errorf("could not find peer address")
	}
	conn, err := net.Dial("tcp", peerAddress)
	if err != nil {
		return "", fmt.Errorf("failed to connect to peer: %w", err)
	}
	defer conn.Close()
	msg := &Messages{
		SenderId:       mp.ID,
		ReceiverId:     peerId,
		MessageContent: content,
		MessageId:      "Random",
	}
	msgBytes, err := json.Marshal(msg)
	if err != nil {
		return "", fmt.Errorf("failed to serialize message: %w", err)
	}
	_, err = conn.Write(msgBytes)
	if err != nil {
		return "", fmt.Errorf("failed to send message: %w", err)
	}
	return "message sent successfully", nil
}
Attempts direct routing first.
Falls back to iterative peer discovery if needed.
Message is serialized as JSON and sent via TCP.
8. Peer Initialization
Finally, we define a constructor for MessagingPeer:
func NewMessagingPeer(port int, address string) *MessagingPeer {
	add := fmt.Sprintf("%s:%d", address, port)
	selfID := NewNodeId([]byte(add))
	return &MessagingPeer{
		ID:           selfID,
		Messages:     make([]Messages, 0),
		Port:         port,
		Address:      add,
		RoutingTable: NewRoutingTable(selfID),
	}
}
This sets the NodeID using the address + port as a seed, ensuring uniqueness in the network.
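For example, peer A below is created with NewMessagingPeer(8000, "localhost"), so its ID is simply the first two bytes of the SHA-1 hash of the string "localhost:8000".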
Main Function: Bringing It All Together
Now that all the components are in place for our simple peer-to-peer messaging system, we move on to our main function, where we use everything we’ve built so far: peer creation, server start-up, routing table population, and the actual message sending mechanism.
package main

import (
	"fmt"
	"time"

	"github.com/HadeedTariq/p-to-p-msg-system-kademlia/algo"
)
We import the necessary packages, including our algo package where all the peer and routing logic lives.
Creating and Starting Peers
peerA := algo.NewMessagingPeer(8000, "localhost")
go peerA.StartServer()
peerB := algo.NewMessagingPeer(8001, "localhost")
go peerB.StartServer()
peerC := algo.NewMessagingPeer(8002, "localhost")
go peerC.StartServer()
peerD := algo.NewMessagingPeer(8003, "localhost")
go peerD.StartServer()
peerE := algo.NewMessagingPeer(8004, "localhost")
go peerE.StartServer()
time.Sleep(1 * time.Second)
We create five peers (A through E), each listening on a unique port.
Each MessagingPeer starts its own server in a goroutine to handle incoming messages concurrently.
We add a small delay using time.Sleep to ensure all servers are up before proceeding.
Manually Populating Routing Tables
peerB.RoutingTable.Add(algo.Contacts{
	Id:      peerA.ID,
	Address: peerA.Address,
})
peerB.RoutingTable.Add(algo.Contacts{
	Id:      peerC.ID,
	Address: peerC.Address,
})
peerC.RoutingTable.Add(algo.Contacts{
	Id:      peerD.ID,
	Address: peerD.Address,
})
peerD.RoutingTable.Add(algo.Contacts{
	Id:      peerE.ID,
	Address: peerE.Address,
})
In a real distributed network, peers discover each other via ping or bootstrap mechanisms.
Here, we manually add entries to some routing tables to simulate an imperfect but realistic partial network view.
Peer B knows A and C.
C knows D.
D knows E.
This mimics limited knowledge in a real-world scenario.
Simulating the Network Map
network := map[string]*algo.MessagingPeer{}
network[peerA.ID.String()] = peerA
network[peerB.ID.String()] = peerB
network[peerC.ID.String()] = peerC
network[peerD.ID.String()] = peerD
network[peerE.ID.String()] = peerE
We construct a network map to simulate our distributed environment.
This acts as a directory the sending peer can use during IterativeFindNode().
Note: This is only for simulation. In a real P2P system, peers wouldn’t have access to this map directly.
Sending the Message
msg, err := peerB.SendMessage("Hello from B to E", peerE.ID, network)
if err != nil {
	fmt.Println(err)
}
fmt.Println(msg)
Here’s where the actual communication happens:
Peer B tries to send a message to Peer E.
Since B doesn't know E directly, it uses the iterative lookup we implemented earlier to find E via the known path B → C → D → E.
Once found, it establishes a TCP connection and sends the message.
If successful, we see "message sent successfully" printed.
Keeping the Program Alive
select {}
We block the main goroutine using select {} so all servers stay alive and continue listening for incoming messages.
Conclusion
So as you can see, we’ve built our own small peer-to-peer messaging system from scratch! I’ve tried to explain each concept in a simple and easy-to-understand way, so you don’t need to copy and paste everything blindly; just understand the flow and play around with the code.
You can find the complete source code here: GitHub Repository
I also write about database-related topics and other backend concepts. Feel free to check out more of my work here: @HadeedTariq on Dev.to
Before I end this post, I want to take a moment to express my deep love and unwavering support for our Palestinian Muslim brothers and sisters