Building a Real-Time Voice Assistant with Local LLMs on a Raspberry Pi

Introduction
In this document, I’m sharing my journey of turning a Raspberry Pi into a powerful, real-time voice assistant. The goal was to:
- Capture voice input through a web interface.
- Process the text using a local LLM (like Mistral) running on the Pi.
- Generate voice responses using Piper for text-to-speech (TTS).
- Stream everything in real-time via WebSockets.
All of this runs offline on the Raspberry Pi — no cloud services involved. Let’s dive into how I built it step by step!
1. Setting up the Raspberry Pi
First, I set up my Raspberry Pi with the latest Raspberry Pi OS. It’s important to enable hardware interfaces and connect a USB microphone and speaker.
Steps:
- Update the system:
sudo apt-get update
sudo apt-get upgrade
- Enable the audio interface:
sudo raspi-config
Navigate to System Options > Audio and select the correct output/input device.
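To double-check that the USB microphone and speaker are actually visible to ALSA before going further, the standard device-listing commands help:
arecord -l   # capture devices: the USB microphone should show up here
aplay -l     # playback devices: the speaker or headphone output should show up here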
2. Installing Ollama for Local LLMs
Ollama makes it easy to run local LLMs like Mistral on your Raspberry Pi. I installed it using:
curl -fsSL https://ollama.com/install.sh | sh
Once installed, I pulled the Mistral model:
ollama pull mistral
To confirm it works, I ran a quick test:
ollama run mistral
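Running ollama run mistral opens an interactive chat session; the CLI also accepts a one-shot prompt as an argument, which is exactly how the backend uses it later:
ollama run mistral "Reply with one short sentence."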
The model was ready to process text right on the Pi!
3. Setting up Piper for Text-to-Speech (TTS)
For offline voice generation, I chose Piper — a fantastic open-source TTS engine.
- Install dependencies:
sudo apt-get install wget build-essential libsndfile1
- Download Piper for ARM64 (Raspberry Pi):
wget https://github.com/rhasspy/piper/releases/download/v1.0.0/piper_arm64.tar.gz
tar -xvzf piper_arm64.tar.gz
chmod +x piper
sudo mv piper /usr/local/bin/
- Test if Piper works:
echo "Hello, world!" | piper --model en_US --output_file output.wav
aplay output.wav
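One caveat: depending on the Piper build, the --model flag may expect the path to a downloaded voice file (an .onnx model plus its .json config from Piper's voices releases) rather than a bare locale name. In that case the test would look something like this (the voice filename here is just an example):
echo "Hello, world!" | piper --model ./en_US-lessac-medium.onnx --output_file output.wav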
Now the Pi could "talk" back!
4. Creating the Backend (Node.js)
I built a simple Node.js server to:
- Accept text from the client (voice input from a web app).
- Process it using Mistral (via Ollama).
- Convert the LLM response to speech with Piper.
- Stream the audio back to the client.
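Before the code below will run, install the two npm dependencies it uses:
npm install express ws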
server.js:
const express = require('express');
const { exec } = require('child_process');
const WebSocket = require('ws');

const app = express();
const PORT = 3001;

// Serve the generated audio file and allow the React dev server (port 3000)
// to fetch it cross-origin
app.use((req, res, next) => {
  res.setHeader('Access-Control-Allow-Origin', '*');
  next();
});
app.use(express.static(__dirname));

// WebSocket setup
const wss = new WebSocket.Server({ port: 3002 });

wss.on('connection', (ws) => {
  console.log('Client connected');

  ws.on('message', (message) => {
    const prompt = message.toString(); // ws delivers a Buffer, so convert it
    console.log('Received:', prompt);

    // Run Mistral LLM (interpolating user input into a shell command is only
    // acceptable on a trusted local setup like this one)
    exec(`ollama run mistral "${prompt}"`, (err, stdout) => {
      if (err) {
        console.error('LLM error:', err);
        ws.send('Error processing your request.');
        return;
      }

      // Convert the LLM response to speech using Piper
      exec(`echo "${stdout}" | piper --model en_US --output_file output.wav`, (ttsErr) => {
        if (ttsErr) {
          console.error('Piper error:', ttsErr);
          ws.send('Error generating speech.');
          return;
        }

        // Send the response text and the audio file name back to the client
        ws.send(JSON.stringify({ text: stdout, audio: 'output.wav' }));
      });
    });
  });
});

app.listen(PORT, () => {
  console.log(`Server running at http://localhost:${PORT}`);
});
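As a side note, shelling out to ollama run works, but Ollama also listens on a local REST API (port 11434 by default). A minimal sketch of querying it from Node instead of exec, assuming Node 18+ for the built-in fetch:
// Minimal sketch: call Ollama's local HTTP API instead of spawning the CLI.
// Assumes Ollama is running on its default port 11434 and Node 18+ (built-in fetch).
async function askMistral(prompt) {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'mistral', prompt, stream: false })
  });
  const data = await res.json();
  return data.response; // the generated text
}
This sidesteps shell-quoting issues with user input and would make it easier to stream tokens later.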
5. Building the Real-Time Web Interface (React)
For the frontend, I created a simple React app to:
- Record voice input (see the dictation sketch after the component code below).
- Display real-time text responses.
- Play the generated speech audio.
App.js:
import React, { useState, useEffect, useRef } from 'react';

function App() {
  const [text, setText] = useState('');
  const [response, setResponse] = useState('');
  const [audio, setAudio] = useState(null);
  const ws = useRef(null);

  useEffect(() => {
    // Open the WebSocket once on mount instead of on every render
    ws.current = new WebSocket('ws://localhost:3002');

    ws.current.onmessage = (event) => {
      const data = JSON.parse(event.data);
      setResponse(data.text);
      // Fetch the generated audio file from the backend so it can be played
      fetch(`http://localhost:3001/${data.audio}`)
        .then(res => res.blob())
        .then(blob => setAudio(URL.createObjectURL(blob)));
    };

    return () => ws.current.close();
  }, []);

  const handleSend = () => {
    if (ws.current && ws.current.readyState === WebSocket.OPEN) {
      ws.current.send(text);
    }
  };

  return (
    <div>
      <h1>Voice Assistant</h1>
      <textarea value={text} onChange={(e) => setText(e.target.value)} />
      <button onClick={handleSend}>Send</button>
      <h2>Response:</h2>
      <p>{response}</p>
      {audio && <audio controls src={audio} />}
    </div>
  );
}

export default App;
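The component above takes typed text; to actually capture speech in the browser I experimented with the Web Speech API. The sketch below is only an assumption about how that hookup could look: Chromium-based browsers expose the API as webkitSpeechRecognition, and browser recognition may itself call out to an online service, so a strictly offline build would need a local speech-to-text engine feeding the same setText path.
// Rough sketch: fill the textarea from the microphone via the Web Speech API.
// Chromium-based browsers expose this as webkitSpeechRecognition.
function startDictation(onText) {
  const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (!SpeechRecognition) return; // API not available in this browser
  const recognition = new SpeechRecognition();
  recognition.lang = 'en-US';
  recognition.onresult = (event) => {
    // Hand the first transcript to the caller, e.g. startDictation(setText)
    onText(event.results[0][0].transcript);
  };
  recognition.start();
}
Wired to a "Record" button, this fills the textarea and the existing handleSend flow takes over.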
6. Running the Project
Once the backend and frontend were ready, I launched both:
- Start the backend:
node server.js
- Run the React app:
npm start
I opened the web app at my Raspberry Pi's IP on port 3000 (when browsing from another device, the localhost addresses in App.js need to point at the Pi's IP instead), spoke into the mic, and voilà! The assistant responded in real time, all processed locally.
Conclusion
Building a real-time, fully offline voice assistant on a Raspberry Pi was an exciting challenge. With:
- Ollama for running local LLMs (like Mistral)
- Piper for high-quality text-to-speech
- WebSockets for real-time communication
- React for a smooth web interface
... I now have a personalized voice AI that works without relying on the cloud.