A look into Ubuntu Core 26: Building a local AI inference appliance in a virtual machine
Welcome to this blog series exploring innovative uses of Ubuntu Core. Throughout the series, Canonical’s engineers will show what you can build with the Ubuntu Core 26 release, highlighting the features and tools available to you.
In this first blog, Farshid Tavakolizadeh, Engineering Manager for Canonical’s Industrial team, shows you how to try Ubuntu Core 26 inside a virtual machine and turn it into a local AI inference appliance using Multipass and the gemma4 snap. Running Ubuntu Core in a VM is a useful starting point for developers who want to experiment before moving to dedicated hardware. You can explore the Ubuntu Core environment, install snaps, expose services to your host machine, and test how an appliance-style experience could work in production.
By the end of this blog, you’ll know how to launch Ubuntu Core 26 with Multipass, install a local AI inference snap, access its WebUI from your host machine, and understand how this workflow maps to a production Ubuntu Core image.
Why start with Ubuntu Core in a VM?
Ubuntu Core is designed for production devices: appliances, gateways, robots, kiosks, industrial systems, and edge AI products. In the field, you would normally build a custom Ubuntu Core image that includes the snaps, configuration, permissions, and update policy your product needs.
A virtual machine gives you a fast way to explore the system. You can launch Ubuntu Core from your laptop, install application snaps, test services, and understand how the pieces fit together before committing to a board or production image.
For this, Multipass provides a simple path. It has integrated support for Ubuntu Core images and can launch an Ubuntu Core VM with a single command. That makes it ideal for experimentation, demos, and local development workflows.
Turning the VM into a local AI appliance
We will use Ubuntu Core to create a local AI inference appliance. The idea is simple: Ubuntu Core provides the secure, minimal, appliance-like operating system, while the AI workload is delivered as a snap.
For this example, we’ll use the gemma4 inference snap.
Because AI inference needs more resources than a minimal shell test, launch a VM with additional CPU, memory, and disk:
multipass launch core26 -n aibox --cpus 4 --memory 10GB --disk 16GB
Then enter the instance:
multipass shell aibox
The Ubuntu Core instance may update itself after first boot, and it may restart automatically. This is part of the experience you should expect from Ubuntu Core: the base system and snapd are managed, updated, and kept reliable.
Now install the AI inference snap:
sudo snap install gemma4
This installs the most suitable runtime and model for the machine.
Checking the inference endpoint
Once installed, gemma4 runs as a managed snap service. You can check its status with:
gemma4 status
The output includes the active engine, services, and endpoints:
engine: cpu
services:
  server: active
  server-webui: active
endpoints:
  openai: http://localhost:8336/v1
  webui: http://localhost:8337/
At this point, the inference server and WebUI are running inside the Ubuntu Core instance.
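If you want to use an endpoint from a script rather than copying it by hand, the status output above can be filtered with standard tools. The following is a minimal sketch, not part of the gemma4 snap: `status_endpoint` is a hypothetical helper, and it assumes the status text keeps the `<name>: <url>` line layout shown above, which may change between snap revisions.

```shell
#!/bin/sh
# Hypothetical helper (not provided by the gemma4 snap): print the value
# for a named entry from status text supplied on stdin.
# Assumption: entries appear as "<name>: <value>" lines, as in the sample output.
status_endpoint() {
    awk -v name="$1" '
        # Strip leading whitespace so indentation does not matter.
        { sub(/^[ \t]+/, "") }
        # Print everything after "<name>: " for the first matching line.
        index($0, name ": ") == 1 { print substr($0, length(name) + 3); exit }
    '
}

# Usage inside the VM (assumes the output format shown above):
#   gemma4 status | status_endpoint openai
```

The match is anchored at the start of the line, so asking for `webui` does not accidentally pick up the `server-webui` service entry.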
There is one important detail: localhost here refers to the Ubuntu Core VM, not your host machine. So although the services are active, a browser on your host cannot reach them yet.
To make the inference server and WebUI available from the host, configure the service to listen on the VM’s network interface:
sudo gemma4 set http.host=0.0.0.0 webui.http.host=0.0.0.0 --assume-yes
Then, from your host machine, find the VM’s IP address:
multipass info aibox
The output includes an IPv4 address:
Name: aibox
State: Running
Snapshots: 0
IPv4: 10.100.120.150
Release: Ubuntu Core 26
Use the IPv4 address to access the inference server and WebUI, in this case: 10.100.120.150.
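To avoid copying the address by hand, you can pull it out of the `multipass info` output on the host. This is a rough sketch under stated assumptions: `vm_ipv4` and `api_base_url` are hypothetical helper names, the parsing relies on the `IPv4:` line layout shown above, and the port 8336 is taken from the status output earlier in this post.

```shell
#!/bin/sh
# Hypothetical helper: read `multipass info` text on stdin and print the
# first IPv4 address. Prints nothing if no dotted-quad address is listed.
vm_ipv4() {
    awk '$1 == "IPv4:" && $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $2; exit }'
}

# Hypothetical helper: build the API base URL for a given address.
# Assumption: the inference server listens on port 8336, as reported by `gemma4 status`.
api_base_url() {
    printf 'http://%s:8336/v1\n' "$1"
}

# Usage on the host:
#   ip=$(multipass info aibox | vm_ipv4)
#   api_base_url "$ip"
```

Keeping the parsing in a function that reads standard input makes it easy to check against saved output before pointing it at a live VM.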
The inference server’s API is accessible at http://<VM-IP>:8336/v1, where <VM-IP> is the IPv4 address of the VM. This is an OpenAI-compatible API that can be used with a wide range of clients. You can use an HTTP client like cURL to send a prompt:
curl http://10.100.120.150:8336/v1/chat/completions -H "Content-Type: application/json" -d '{
  "messages": [{"role": "user", "content": "What is the meaning of ubuntu?"}],
  "max_completion_tokens": 100
}'
Of course, experimenting with an OpenAI API over the terminal is no fun. The WebUI provided by the gemma4 snap is a better entry point. Open in...