Loss Functions and NN Parameter Accumulation in micrograd

Hi there! I'm Shrijith Venkatrama, founder of Hexmos. Right now, I’m building LiveAPI, a tool that makes generating API docs from your code ridiculously easy.

A Sample Dataset and Comparing With Predictions

In the following dataset, we have 4 example input/output pairs, and each input consists of 3 numbers.

We use the neural network (the MLP object) defined in the last post to make predictions for each input.
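
For reference, n here is the MLP object from the previous post. A typical construction (assuming the 3-input, two-hidden-layer shape used in Karpathy's lecture; adjust if your network differs) looks like this:

n = MLP(3, [4, 4, 1])  # 3 inputs -> two hidden layers of 4 neurons -> 1 output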

xs = [
    [2.0, 3.0, -1.0],
    [3.0, -1.0, 0.5],
    [0.5, 1.0, 1.0],
    [1.0, 1.0, -1.0],
]

ys = [1.0, -1.0, -1.0, 1.0]  # desired targets
ypred = [n(x) for x in xs]  # forward pass on each of the 4 examples
ypred

When I run this, I get predictions that are a bit above or below the desired targets, since the weights are still just their random initial values.
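
Each prediction comes back as a Value object from the autograd engine. To peek at just the raw numbers, you can pull out the .data fields (a small inspection sketch, assuming Value exposes .data as in the earlier posts):

[yp.data for yp in ypred]  # 4 floats between -1 and 1, since the output neuron uses tanh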

A Skeleton to Build a Loss Function

We define a loss variable in the following way:

  1. Find the difference between the predicted output and the actual (target) output
  2. Square the difference (so that positive and negative errors do not cancel out when summed)
  3. Sum these squared differences over the whole input/output set

For example, if the target is 1.0 and the prediction is 0.8, that example contributes (0.8 - 1.0)**2 = 0.04 to the total loss.

loss = sum((yout - ygt)**2 for ygt, yout in zip(ys, ypred))  # sum of squared errors over all 4 examples

loss.grad = 1    # seed the gradient at the loss node

loss.backward()  # backpropagate, filling in .grad on every node in the graph

n.layers[0].neurons[0].w[0].grad  # gradient of one particular weight with respect to the loss

Once we have the loss node, we can run backpropagation to get a gradient for every node in the graph.

When we run draw_dot(loss), we get a huge graph consisting of 4 forward passes, one for each of the examples above, with the loss node sitting on top of them.

Gathering All Neural Network Parameters And Operating On Them

Pay attention to the parameters() method in each of the classes below, where we collect the parameters at every level (neuron, layer, MLP) for convenience:

import random

# Value is the scalar autograd class built in the earlier posts of this series

class Neuron:

    def __init__(self, nin):
        # one randomly initialized weight per input, plus a bias
        self.w = [Value(random.uniform(-1,1)) for _ in range(nin)]
        self.b = Value(random.uniform(-1,1))

    def __call__(self, x):
        # w * x + b, squashed through tanh
        act = sum((wi * xi for wi, xi in zip(self.w, x)), self.b)
        out = act.tanh()
        return out

    def parameters(self):
        # all the weights plus the bias of this neuron
        return self.w + [self.b]


class Layer:

    def __init__(self, nin, nout):
        self.neurons = [Neuron(nin) for _ in range(nout)]

    def __call__(self, x):
        outs = [n(x) for n in self.neurons]
        return outs[0] if len(outs) == 1 else outs

    def parameters(self):
        # flatten the parameters of every neuron in this layer
        params = []
        for neuron in self.neurons:
            params.extend(neuron.parameters())
        return params


class MLP:

    def __init__(self, nin, nouts):
        # e.g. MLP(3, [4, 4, 1]) builds layers of sizes 3->4, 4->4 and 4->1
        sz = [nin] + nouts
        self.layers = [Layer(sz[i], sz[i+1]) for i in range(len(nouts))]

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

    def parameters(self):
        # flatten the parameters of every layer in the network
        params = []
        for layer in self.layers:
            params.extend(layer.parameters())
        return params
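
As a quick sanity check on how the parameters get flattened at each level, here is a small sketch. The counts assume 3 inputs and a layer of 4 neurons, and are just for illustration:

neuron = Neuron(3)
len(neuron.parameters())  # 4  -> 3 weights + 1 bias

layer = Layer(3, 4)
len(layer.parameters())   # 16 -> 4 neurons with 4 parameters each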

Now when I call n.parameters(), I get a flat list of every weight and bias in the whole neural network.
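
With that flat list in hand, the natural way to operate on the parameters is a simple gradient-descent update. This is only a sketch of the usual next step (the learning rate value here is just a hypothetical choice), nudging each parameter's data against its gradient:

# one gradient-descent step over every weight and bias in the network
learning_rate = 0.01
for p in n.parameters():
    p.data += -learning_rate * p.grad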

Reference

Andrej Karpathy, "The spelled-out intro to neural networks and backpropagation: building micrograd" (YouTube)