MoodStream

Introducing MoodStream - Enrico Zanetti

Discover MoodStream here: MoodStream Repository on GitHub

Important: MoodStream was designed and built for the sole purpose of benefiting humans: not for surveillance, profiling, or behavioral control, nor for replacing people’s roles. Emotion recognition is a powerful and sensitive technology, and responsible use means consent, transparency, and purpose limitation.

Most people view language as the primary means of communication. It’s undoubtedly the most controlled and widely used form, yet we often underestimate the power of non-verbal communication. All of us rely on it constantly, yet we’re not as skilled at conveying it as we are with words. Non-verbal communication includes facial expressions, gestures, body posture, eye contact, proxemics, tone of voice, and micro-expressions: we send all of this simultaneously, mostly without thinking about it.

Now think about how we communicate with machines.

Machines are everywhere around us, on our desks and in our pockets, and they are becoming ever more naturally integrated into our lives. There’s going to be a day, and I don’t think it’s far off, when robots, humanoid or otherwise, are part of our everyday lives. They’ll help with household chores, do small repairs, and maybe even become someone we can talk to without feeling judged, as many people are already doing with AI chatbots. Whatever the task, you’ll have to interact with these machines. Their brains will most likely be powered by LLMs that receive your speech as transcribed text.

However, your future robot won’t know if you’re angry, sad, anxious, or just exhausted, because all it gets is the words you chose to say, words you can carefully select to hide what you actually feel. The robot will be incredibly easy to fool, and as a result, it won’t be able to adapt its behavior, tone, or responses to what you actually need.

MoodStream is built to close that gap.

It’s an open-source module that classifies your facial expression in real time and streams your emotional state through a pipeline so other systems can use it. Your robot, application, dashboard, or research tool gets a more honest version of you, one that communicates not just what you’re saying, but what your face reveals about your emotional state while you say it.

The detected emotion can be visualized on a live dashboard, and the data is stored in a database for later analysis. MoodStream is designed to run on resource-constrained hardware, like a tiny camera mounted on a robot, but it also works on your laptop using a regular webcam.

How it works

The system is built as a pipeline of small, independent stages, each doing one job and passing its output to the next.

Here’s the end-to-end journey of a single video frame:

Capture. A frame is acquired from a video source: a webcam via OpenCV, an embedded camera module (OpenMV Cam H7+) connected over UART, or a synthetic source for testing. The active source is selected at startup via CLI flags.
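Selecting a source at startup could look roughly like the sketch below. The flag names (--source, --camera-index, --serial-port) are invented for illustration and are not MoodStream's actual options; only the three kinds of source come from the description above.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical startup flags that choose the video source."""
    parser = argparse.ArgumentParser(description="Pick a frame source (illustrative only)")
    # The three choices mirror the sources named in the text; the flag itself is hypothetical.
    parser.add_argument("--source", choices=["webcam", "openmv", "synthetic"], default="webcam")
    # Hypothetical: which local webcam to open when --source webcam is used.
    parser.add_argument("--camera-index", type=int, default=0)
    # Hypothetical: serial device for a camera module connected over UART.
    parser.add_argument("--serial-port", default=None)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Using source: {args.source}")
```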

Face detection. The frame is converted to grayscale and passed to a Haar Cascade classifier, which locates one or more faces and returns their bounding boxes. If no face is found, the loop simply advances to the next frame.

Cropping & preprocessing. Each detected face is cropped out, resized to 48×48 pixels, and normalized to [0, 1] as float32. Because the input is already grayscale at this stage, the final reshape just adds the batch and channel dimensions expected by the model (1, 48, 48, 1).
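As a rough illustration of this step, the sketch below crops a bounding box from a grayscale frame, resizes it to 48×48, scales it to [0, 1], and reshapes it to (1, 48, 48, 1). It is NumPy-only and uses a nearest-neighbour resize to stay self-contained; that choice, the function name, and the (x, y, w, h) box layout are assumptions, not necessarily what MoodStream does.

```python
import numpy as np


def preprocess_face(gray_frame, box, size=48):
    """Crop a face from a 2-D uint8 grayscale frame and shape it for the model.

    box is (x, y, w, h) in pixels (an assumed layout).
    """
    x, y, w, h = box
    face = gray_frame[y:y + h, x:x + w]
    # Nearest-neighbour resize: map each output row/column to a source index.
    # With OpenCV available, cv2.resize(face, (size, size)) is the usual choice.
    rows = np.arange(size) * face.shape[0] // size
    cols = np.arange(size) * face.shape[1] // size
    resized = face[rows[:, None], cols]
    # Scale 0..255 to [0, 1] as float32, then add batch and channel axes.
    scaled = resized.astype(np.float32) / 255.0
    return scaled.reshape(1, size, size, 1)
```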

Emotion classification. The preprocessed face is fed into a quantized TFLite CNN, which outputs a softmax probability distribution over 6 emotion classes (happy, sad, angry, neutral, surprised, fearful). The top class and its confidence score are returned; the rest of the distribution is not passed downstream.
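Reducing the softmax output to a single label and confidence can be sketched as follows. The order of the labels in EMOTIONS is an assumption (the real order depends on how the model was trained), and top_emotion is a hypothetical helper name.

```python
# Assumed label order; the actual order must match the trained model's outputs.
EMOTIONS = ["angry", "fearful", "happy", "neutral", "sad", "surprised"]


def top_emotion(probabilities, labels=EMOTIONS):
    """Return (label, confidence) for the highest-probability class."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per emotion class")
    # Index of the largest probability; the rest of the distribution is discarded.
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[index], float(probabilities[index])
```

For example, top_emotion([0.05, 0.05, 0.7, 0.1, 0.05, 0.05]) picks the third class, which is "happy" under the assumed order.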

Publishing. The emotion label is published as an MQTT message to a Mosquitto broker. From there, Node-RED picks it up, enriches it with a timestamp and emoji, and fans it out to two destinations: InfluxDB for time-series persistence, and Grafana for live visualization.
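A minimal sketch of the publishing side, under stated assumptions: the topic name moodstream/emotion and the JSON payload shape are hypothetical, and including the confidence alongside the label is a guess. The publish argument is any callable taking (topic, payload); the publish method of a paho-mqtt client has that shape, but nothing here requires that library.

```python
import json

TOPIC = "moodstream/emotion"  # hypothetical topic name


def build_message(label, confidence):
    """Serialize the classification result as a small JSON payload (assumed format)."""
    payload = json.dumps({"emotion": label, "confidence": round(confidence, 3)})
    return TOPIC, payload


def publish_emotion(publish, label, confidence):
    """Send one result through any publish(topic, payload) callable."""
    topic, payload = build_message(label, confidence)
    publish(topic, payload)
```

Timestamps and emoji are left out on purpose, since the text says Node-RED adds them after the message reaches the broker.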

The broker, Node-RED, InfluxDB, and Grafana all run as Docker containers, so the only dependency you need on the host is Python. A single docker compose up brings the entire stack online. Moreover, because each stage is decoupled, you can swap any of them without rewriting the rest. For example, if you want a different face detector you can plug it in, or if you want to send the output to a robot’s behavior controller instead of a dashboard, you can add another consumer at the publishing stage.
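The decoupling described above can be sketched as a loop over interchangeable callables. This is an illustration of the idea rather than MoodStream's actual code: run_pipeline, the stage signature, and the convention that a stage returns None to skip a frame (for example, when no face is found) are all assumptions.

```python
from typing import Any, Callable, Iterable, Optional

# A stage takes the previous stage's output and returns its own, or None to skip the frame.
Stage = Callable[[Any], Optional[Any]]


def run_pipeline(frames: Iterable[Any], stages: list[Stage],
                 consumers: list[Callable[[Any], None]]) -> int:
    """Push each frame through the stages, then fan the result out to every consumer."""
    delivered = 0
    for frame in frames:
        item = frame
        for stage in stages:
            item = stage(item)
            if item is None:
                break  # e.g. no face detected: move on to the next frame
        else:
            # Only reached when no stage skipped the frame.
            for consume in consumers:
                consume(item)
            delivered += 1
    return delivered
```

Swapping the face detector means replacing one entry in stages, and sending results to a robot's behavior controller means appending one more callable to consumers.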

Model and training data

The current model is a lightweight convolutional neural network trained on FER2013, a public dataset of around 35,000 grayscale 48×48 face images. The Disgust class from the original dataset is excluded, leaving six classes: anger, fear, happiness, sadness, surprise, and neutral. The model is exported to TensorFlow Lite (float16 quantized),...
