Podman Quadlets

Introduction

In one of my previous posts about Podman I wrote that in newer versions of Podman (4.4 and later), Quadlets are recommended instead of podman-compose (both being alternatives to docker-compose). Debian 13 now ships a Podman version recent enough to play with Quadlets. Following current homelab trends, I will show how to install local AI using Podman and Quadlets, specifically Ollama and Open WebUI. I want to keep this post short, so I will focus on installation on a bare-metal desktop computer. In such a setup you don't need to configure PCI passthrough, which you would need with virtual machines.

OS used: Debian 13
Software used: Podman 5.4.2, Nvidia Container Toolkit 1.19.1, Ollama 0.30.8, Open WebUI 0.9.6

Source

Official Podman 5.4.2 Quadlets Documentation

Introduction

Podman Quadlets are user-written, systemd-style unit files from which systemd services are generated for managing containers, pods, volumes, networks, and images. They also provide an easy way to ensure containers start automatically at system boot.

Install Podman

Install podman:

$ sudo apt install podman

If you will be using this setup on a bare-metal server with a GPU, use the following command to ensure that a user session is spawned at boot and kept active even after logging out of GUI or tty sessions:

$ sudo loginctl enable-linger <username>

<username> - the name of the user who runs Podman

Reboot the system for the above command to take effect.
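
A quick way to confirm that lingering is active is to read the Linger property back from loginctl. The following is a minimal sketch: the helper name linger_enabled is made up for this example, and it assumes loginctl prints the property as Linger=yes or Linger=no.

```shell
# Succeeds when the given `loginctl show-user` output reports lingering as enabled.
linger_enabled() {
    [ "$1" = "Linger=yes" ]
}

# Query the current user and report the result (skipped if loginctl is missing).
if command -v loginctl >/dev/null 2>&1; then
    if linger_enabled "$(loginctl show-user "$(id -un)" --property=Linger 2>/dev/null)"; then
        echo "lingering is enabled"
    else
        echo "lingering is NOT enabled"
    fi
fi
```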

Install GPU drivers

Now you should install GPU drivers (if you haven't already). I have an Nvidia card, so I will show how to install Nvidia drivers.

Nvidia drivers from Debian repository

Install the following packages:

$ sudo apt install linux-headers-$(dpkg --print-architecture) \
  nvidia-kernel-dkms \
  nvidia-driver \
  firmware-misc-nonfree \
  libnvoptix1

Official Nvidia drivers

If you prefer newer drivers than those in the Debian repositories, you can download them from the Nvidia website. Save the installer to your ~/Downloads/ directory.

To install the proprietary Nvidia drivers you need to block the nouveau kernel driver by creating a file in the /etc/modprobe.d/ directory:

$ sudo vim /etc/modprobe.d/nvidia-disable-nouveau.conf

/etc/modprobe.d/nvidia-disable-nouveau.conf

blacklist nouveau
options nouveau modeset=0

After that, update the initramfs:

$ sudo update-initramfs -u

Reboot the system.
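
After the reboot, it is worth checking that nouveau is really gone before running the installer. The sketch below looks for the module in /proc/modules, which is the same data lsmod displays; the function name module_loaded and its optional second argument (a file to read instead of /proc/modules) are assumptions made for this example.

```shell
# Succeeds if kernel module $1 appears in the module list file $2
# (default: /proc/modules, where each line starts with the module name).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nouveau; then
    echo "nouveau is still loaded; check the blacklist file and rerun update-initramfs"
else
    echo "nouveau is not loaded"
fi
```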

After the reboot, install the packages needed by the Nvidia driver installer:

$ sudo apt install build-essential \
  linux-headers-$(dpkg --print-architecture) \
  dkms

Now run the Nvidia driver installer:

$ sudo bash ~/Downloads/NVIDIA-Linux-x86_64-{version}.run

Note

INFO 1:
The Nvidia installer temporarily extracts its data to the /tmp/ directory. In Debian 13 this directory is kept in memory and its maximum size depends on the available RAM. With 8 GB of RAM or more, the drivers extract without problems.

INFO 2:
During installation you may see the following message: WARNING: Unable to determine the path to install the libglvnd EGL vendor library config files. Check that you have pkg-config and the libglvnd development libraries installed, or specify a path with --glvnd-egl-config-path. You can ignore it. The drivers work fine if you continue.

INFO 3:
During installation there may also be a message that the X server is in use. On Wayland you can ignore it. On X11, switch to a text console with Ctrl+Alt+F1 and stop the display manager with the following command:

For Gnome
```
$ sudo systemctl stop gdm
```
For KDE
```
$ sudo systemctl stop sddm
```

After that, continue the installation in the terminal.

Install Nvidia Container Toolkit

For containers to have access to the Nvidia GPU, you need to install the Nvidia Container Toolkit:

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list && \
  sudo apt-get update && \
  sudo apt-get install -y nvidia-container-toolkit

Check the list of generated CDI (Container Device Interface) devices:

$ nvidia-ctk cdi list

nvidia.com/gpu=0
nvidia.com/gpu=GPU-...
nvidia.com/gpu=all

If the list contains no nvidia.com/gpu= entries for your Nvidia GPU, regenerate the CDI specification manually:

$ sudo systemctl restart nvidia-cdi-refresh.service

Reboot the system and check that the CDI devices are still present:

$ nvidia-ctk cdi list

Tip

If the CDI devices are absent (I had this problem on Debian 13 with Nvidia drivers installed from the Debian repository), use this command:

$ sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

IMPORTANT: In this case, regenerate the CDI specification after every Nvidia driver update.

Verify that containers have access to the GPU:

$ podman run --rm --device nvidia.com/gpu=all \
  docker.io/nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

Create Podman Quadlets

Now you can start creating Podman Quadlets.

Quadlet files for user containers should be created in the $HOME/.config/containers/systemd/ directory. Create this directory:

$ mkdir -p ~/.config/containers/systemd

Containers network

$ vim ~/.config/containers/systemd/ollama.network

~/.config/containers/systemd/ollama.network

[Network]
NetworkName=ollama

Ollama container

With Nvidia support

$ vim ~/.config/containers/systemd/ollama.container

~/.config/containers/systemd/ollama.container

[Container]
ContainerName=ollama
Image=docker.io/ollama/ollama
AutoUpdate=registry
Volume=ollama:/root/.ollama
Timezone=Europe/Warsaw
Environment=OLLAMA_MAX_LOADED_MODELS=1
Network=ollama.network
AddDevice=nvidia.com/gpu=all

[Service]
Restart=on-failure

[Install]
# Start by default on boot
# If you don’t want a unit to start at boot, remove the [Install] section
WantedBy=multi-user.target default.target

With Vulkan support

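If you have an AMD or Intel GPU, you can use the following configuration for Ollama with Vulkan support. Note that /dev/kfd is provided by the AMD driver, so on Intel-only systems that line will likely need to be removed: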

$ vim ~/.config/containers/systemd/ollama.container

~/.config/containers/systemd/ollama.container

[Container]
ContainerName=ollama
Image=docker.io/ollama/ollama
AutoUpdate=registry
Volume=ollama:/root/.ollama
Timezone=Europe/Warsaw
Environment=OLLAMA_MAX_LOADED_MODELS=1
Environment=OLLAMA_VULKAN=1
Network=ollama.network
AddDevice=/dev/kfd
AddDevice=/dev/dri

[Service]
Restart=on-failure

[Install]
# Start by default on boot
# If you don’t want a unit to start at boot, remove the [Install] section
WantedBy=multi-user.target default.target

Open WebUI container

This is a simple Quadlet that creates an Open WebUI container without user authentication (Environment=WEBUI_AUTH=False).

$ vim ~/.config/containers/systemd/open-webui.container

~/.config/containers/systemd/open-webui.container

[Container]
ContainerName=open-webui
Image=ghcr.io/open-webui/open-webui:main
AutoUpdate=registry
Volume=open-webui:/app/backend/data
Timezone=Europe/Warsaw
Environment=WEBUI_AUTH=False
Environment=OLLAMA_BASE_URL="http://ollama:11434"
Network=ollama.network
PublishPort=3000:8080

[Service]
Restart=on-failure

[Install]
# Start by default on boot
# If you don’t want a unit to start at boot, remove the [Install] section
WantedBy=multi-user.target default.target

Apply Podman Quadlets

Reload the systemd daemon:

$ systemctl --user daemon-reload
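
If a unit does not show up after the reload, the Quadlet generator can be run by hand in dry-run mode, which prints the generated units together with any parse errors. The following is a sketch that assumes the generator lives at the path given in the Podman documentation (/usr/lib/systemd/system-generators/podman-system-generator); the function name quadlet_dryrun and its optional path argument are only illustrative.

```shell
# Run the Quadlet generator in dry-run mode for user units.
# $1 optionally overrides the generator path (assumed default shown below).
quadlet_dryrun() {
    gen="${1:-/usr/lib/systemd/system-generators/podman-system-generator}"
    if [ ! -x "$gen" ]; then
        echo "Quadlet generator not found at $gen" >&2
        return 1
    fi
    # --user reads the per-user Quadlet directories, --dryrun only prints the result
    "$gen" --user --dryrun
}

# Usage: quadlet_dryrun
```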

Start the services generated from the Quadlets:

$ systemctl --user start ollama.service && \
  systemctl --user start open-webui.service
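
The service names used above are derived from the Quadlet file names: a .container file becomes <name>.service, while a .network file becomes <name>-network.service. The mapping can be sketched as a small helper; quadlet_unit_name is a hypothetical name, and only the file types used in this post plus .volume are covered.

```shell
# Print the systemd unit name that Quadlet generates for a given Quadlet file.
quadlet_unit_name() {
    file=$(basename "$1")
    case "$file" in
        *.container) echo "${file%.container}.service" ;;
        *.network)   echo "${file%.network}-network.service" ;;
        *.volume)    echo "${file%.volume}-volume.service" ;;
        *)           echo "not covered by this sketch: $file" >&2; return 1 ;;
    esac
}

for f in ollama.container ollama.network open-webui.container; do
    quadlet_unit_name "$f"
done
# prints:
#   ollama.service
#   ollama-network.service
#   open-webui.service
```

This also tells you which unit to inspect if the network fails to come up: systemctl --user status ollama-network.service.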

Auto update containers

Because we used the latest images from the container registry and the AutoUpdate option in the .container files, we can update the containers with the following command:

$ podman auto-update
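
Before updating, podman auto-update --dry-run shows which units have a newer image available without changing anything. The sketch below filters that output down to the units marked as pending; it assumes the default table layout, where the last column (UPDATED) reads "pending" for such units, and the helper name pending_units is made up for this example.

```shell
# Read `podman auto-update --dry-run` output on stdin and print the units
# whose last column (UPDATED) is "pending". The header line is skipped.
pending_units() {
    awk 'NR > 1 && $NF == "pending" { print $1 }'
}

if command -v podman >/dev/null 2>&1; then
    podman auto-update --dry-run | pending_units
fi

# To run updates on a schedule instead of by hand, Podman ships a timer unit:
#   systemctl --user enable --now podman-auto-update.timer
```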

Delete unused images

You can delete the old, unused images left behind after an update with the following command:

$ podman image prune -a

Start Open WebUI

Enter the address http://localhost:3000 in your browser to use Open WebUI and Ollama. Because this installation runs on your own desktop computer, we don't use HTTPS or user authentication in Open WebUI. Protect access to Open WebUI by blocking port 3000 in your desktop computer's firewall.
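
An alternative to a firewall rule is to publish the port on the loopback interface only, so that nothing outside the machine can reach it. The following is a sketch, assuming the open-webui.container file from above still contains the line PublishPort=3000:8080 and that GNU sed (the Debian default) is available; the function name bind_publish_to_loopback is illustrative.

```shell
# Rewrite "PublishPort=3000:8080" to listen on 127.0.0.1 only, editing file $1 in place.
bind_publish_to_loopback() {
    sed -i 's/^PublishPort=3000:8080$/PublishPort=127.0.0.1:3000:8080/' "$1"
}

quadlet="$HOME/.config/containers/systemd/open-webui.container"
if [ -f "$quadlet" ]; then
    bind_publish_to_loopback "$quadlet"
    # Regenerate the unit and restart the container so the change takes effect
    systemctl --user daemon-reload
    systemctl --user restart open-webui.service
fi
```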