This chapter is for contributors and maintainers.
Introduction
Welcome to the NeuralDrive Developer Guide. This documentation provides a deep technical look at the internals of NeuralDrive, a headless Large Language Model (LLM) appliance built on Debian 12.
NeuralDrive is designed to transform standard hardware into a high-performance AI server with minimal configuration. It combines modern LLM runtimes with a robust, immutable system architecture to ensure stability and ease of deployment.
Target Audience
This guide is intended for:
- Contributors: Developers looking to improve the core system, add features, or fix bugs.
- Maintainers: Individuals responsible for managing the build pipeline and release process.
- Image Builders: Users who need to create custom ISO images with specific hardware drivers, pre-loaded models, or modified security policies.
Technology Stack
NeuralDrive leverages several key technologies to provide a seamless experience:
- Base System: Debian 12 (Bookworm) managed via the `live-build` framework.
- Inference Engine: Ollama for efficient local LLM execution.
- User Interface: Open WebUI for a modern, feature-rich chat interface.
- Edge Proxy: Caddy server for TLS termination, routing, and authentication.
- System Management: A custom FastAPI-based System API and a Textual-based TUI for console interactions.
The project prioritizes security through systemd hardening, dedicated service users, and an automated TLS certificate management system.
Project Vision
NeuralDrive aims to bridge the gap between complex AI research environments and production-ready appliances. By treating the entire OS as a single, reproducible unit, we ensure that the environment remains consistent across different hardware configurations.
For end-user documentation covering installation and basic usage, refer to the User Guide.
This chapter is for contributors and maintainers.
Development Environment Setup
Setting up a reliable development environment is the first step toward contributing to NeuralDrive. Because the project relies on live-build to generate a bootable Debian image, the host environment must support several low-level system tools.
Supported Environments
There are three primary ways to set up your development environment:
Option A: Debian 12 Native (Recommended)
Developing on a native Debian 12 (Bookworm) system is the most reliable method. It avoids potential issues with loop device mounting and filesystem permissions that can occur in containerized environments.
Option B: Docker (Any OS)
If you are on macOS, Windows, or a non-Debian Linux distribution, you can use the provided Docker environment. This container encapsulates all necessary build dependencies. Note that building requires privileged mode to manage loop devices for SquashFS and ISO generation.
Option C: Virtual Machine
Running Debian 12 inside a VM (via VirtualBox, Proxmox, or VMware) provides the benefits of a native environment while keeping the build system isolated from your primary OS.
Prerequisites
Regardless of your environment, you must install the core build dependencies.
Core Build Tools
Install the following packages on a Debian-based host:
sudo apt update
sudo apt install -y \
live-build \
debootstrap \
squashfs-tools \
xorriso \
grub-pc-bin \
grub-efi-amd64-bin \
mtools \
yq \
git \
curl
Python Environment
The System API and TUI are developed in Python. It is recommended to use a virtual environment for local development:
python3 -m venv venv
source venv/bin/activate
pip install textual psutil httpx rich # TUI dependencies
pip install fastapi uvicorn # API dependencies
Repository Structure
After cloning the repository, familiarize yourself with the layout:
- `config/`: The core of the `live-build` configuration.
- `config/hooks/`: Scripts executed inside the chroot during the build process.
- `config/includes.chroot/`: Files that are copied directly onto the final system filesystem.
- `scripts/`: Helper scripts for building, flashing, and testing.
- `docs/`: Markdown source for this documentation and the user guide.
Tooling and Editors
Any text editor can be used, but VS Code or Neovim are recommended for their robust support for Shell and Python.
Tip: Install the ShellCheck extension to catch common errors in hook scripts and helper utilities.
QEMU for Testing
To test the generated ISO images without flashing a physical drive, install QEMU:
sudo apt install qemu-system-x86 qemu-utils
This allows you to run the tests/test-boot.sh utility to verify that the image boots correctly in a virtualized environment.
This chapter is for contributors and maintainers.
Building from Source
NeuralDrive uses a customized live-build workflow to generate its bootable ISO images. The build process can be initiated either natively on a Debian system or through a Docker container.
Standard Build Process
The primary entry point for building is the build.sh script located in the project root.
Native Build
To start a build on a native Debian host:
sudo ./build.sh
Docker Build
If you prefer using Docker, use the provided compose configuration:
docker compose up builder
The Docker method uses privileged: true and mounts the current directory into the container to allow the build system to interact with kernel loop devices.
Build Stages
The build.sh script coordinates several distinct phases:
- Validation: Checks that the host environment has all necessary tools and that configuration files are valid.
- Configuration: Runs `lb config` to set up the live-build environment based on parameters in the `config/` directory.
- Branding: Applies NeuralDrive-specific themes, splash screens, and versioning info.
- Model Staging: Downloads base models defined in `neuraldrive-models.yaml` so they can be baked into the image (if configured).
- Chroot Construction: Downloads the Debian base and installs packages listed in `config/package-lists/`.
- Hook Execution: Runs the scripts in `config/hooks/live/` to configure services and user accounts.
- Binary Stage: Packs the filesystem into a SquashFS image and generates the final ISO.
Incremental Builds and Cleanup
Building the entire system from scratch can take between 30 and 90 minutes depending on your internet connection and CPU speed.
To reset the build environment and start fresh:
sudo lb clean --all
Warning: Avoid manually deleting files in the `chroot/` directory, as this can leave stale mount points on your host system. Always use `lb clean`.
Common Build Errors
- Loop Device Exhaustion: If the build fails during the binary stage, you may have run out of available loop devices. Run `losetup -a` to check and reboot the host if necessary (see the sketch below).
- GPG Errors: Failures during the archive staging usually indicate a missing or expired repository key in `config/archives/`.
- Space Requirements: Ensure you have at least 40 GB of free space before starting a build, as the chroot and temporary SquashFS files are large.
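Stale loop devices left over from an interrupted build can often be released without rebooting. A minimal sketch (the device name is illustrative; detach only devices that belong to the failed build):

```bash
# List all loop devices and the files they are attached to
losetup -a

# Detach a specific stale device left over from a failed build
sudo losetup -d /dev/loop7

# Or, on a dedicated build host, detach every used loop device (use with care)
sudo losetup -D
```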
The final output will be located in the build/ directory as a .iso file.
This chapter is for contributors and maintainers.
Running Tests
NeuralDrive includes a suite of automated and manual tests to ensure system stability across different hardware targets.
Boot Testing with QEMU
Before flashing to physical hardware, use QEMU to verify that the ISO image boots to the TUI login screen.
./tests/test-boot.sh build/neuraldrive-dev.iso
This script launches a virtual machine with:
- 8GB of RAM
- UEFI boot support
- A virtual disk for testing persistence
- Port forwarding for the System API (3001) and WebUI (443)
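Under the hood, the script wraps a QEMU invocation roughly like the following. This is a simplified sketch; the exact flags, disk image name, OVMF path, and forwarded ports used by `test-boot.sh` may differ.

```bash
# Create a small scratch disk for persistence testing (name is illustrative)
qemu-img create -f qcow2 persistence-test.qcow2 20G

qemu-system-x86_64 \
    -enable-kvm \
    -m 8192 \
    -bios /usr/share/ovmf/OVMF.fd \
    -cdrom build/neuraldrive-dev.iso \
    -drive file=persistence-test.qcow2,format=qcow2,if=virtio \
    -netdev user,id=net0,hostfwd=tcp::3001-:3001,hostfwd=tcp::8443-:443 \
    -device virtio-net-pci,netdev=net0
```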
GPU Validation
Because QEMU does not easily simulate a physical GPU, vendor-specific detection and inference tests must be run on real hardware.
Automatic Detection Test
sudo /usr/lib/neuraldrive/gpu-detect.sh
This should correctly identify the active GPU and generate /run/neuraldrive/gpu.conf with the appropriate vendor tags.
Inference Verification
The tests/test-gpu.sh utility performs a small inference task to verify that the Ollama service can communicate with the GPU drivers and load a model into VRAM.
API and Integration Testing
The System API is tested using pytest. These tests verify that the FastAPI endpoints correctly interact with systemd and the underlying config files.
# From the project root with the venv active
pytest tests/test_api.py
These tests cover:
- Authentication token verification
- Service status reporting
- Log retrieval
- Network configuration changes
CI Integration
Every Pull Request triggers a subset of these tests via GitHub Actions:
- Linting: ShellCheck for scripts and Ruff for Python code.
- Unit Tests: Running the API test suite against a mock system.
- Build Test: Attempting to run `lb config` and a partial `lb build` to catch configuration errors.
Note: Full ISO builds and QEMU boot tests are typically reserved for merges into the `main` branch due to their long execution time.
This chapter is for contributors and maintainers.
How to Contribute
NeuralDrive is a community-driven project. We welcome contributions of all kinds, from core system improvements to documentation updates and bug reports.
Finding an Issue
If you are looking for a place to start, check the GitHub Issues page for labels like good first issue or help wanted. These are specifically curated for new contributors.
For more complex features, it is recommended to search the existing issues or start a new Discussion thread to ensure your proposed approach aligns with the project's long-term architecture.
Types of Contributions
Core Code
Contributions to the build system (live-build configs), service units, or system scripts (gpu-detect.sh, first-boot.sh). This requires familiarity with Debian and shell scripting.
Applications
Development of the custom Python applications, including the FastAPI System API and the Textual-based TUI.
Documentation
Improving this Developer Guide or the User Guide. Clear documentation is as important as working code.
Testing and QA
Testing the latest snapshots on a variety of hardware (NVIDIA, AMD, Intel GPUs) and reporting the results.
Communication Channels
- GitHub Discussions: The primary place for architectural debate and general questions.
- Discord/Matrix: For real-time coordination and quick troubleshooting (links available in the README).
Contribution Workflow
- Fork the repository.
- Create a new branch for your work.
- Implement your changes and add tests where appropriate.
- Ensure your code follows the Code Style Guidelines.
- Submit a Pull Request.
Note: All contributors must adhere to the project's Code of Conduct to ensure a welcoming and inclusive environment for everyone.
This chapter is for contributors and maintainers.
Code Style and Standards
To maintain a consistent and maintainable codebase, all contributions must adhere to the following style guidelines.
Shell Scripting
NeuralDrive relies heavily on shell scripts for system configuration and build hooks.
- Interpreter: Use `#!/bin/bash` or `#!/bin/sh` as appropriate.
- Safety: Start all scripts with `set -euo pipefail`.
- Indentation: Use 4 spaces for indentation.
- Linting: All scripts must pass `shellcheck` without warnings.
- Functions: Define logic in functions rather than a flat script structure (a short skeleton follows this list).
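A minimal skeleton that satisfies these rules (the function names and log prefix are illustrative, not project conventions):

```bash
#!/bin/bash
set -euo pipefail

log() {
    echo "neuraldrive: $*" >&2
}

configure_example() {
    # 4-space indentation, logic wrapped in functions
    log "applying example configuration"
}

main() {
    configure_example
}

main "$@"
```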
Python
The System API and TUI are written in Python.
- Style: Adhere to PEP 8.
- Indentation: Use 4 spaces.
- Formatting: Use `ruff` for both linting and formatting.
- Types: Use Python type hints for all function signatures and complex variables.
- Dependencies: New dependencies must be added to the appropriate `requirements.txt` file and justified in the PR.
Configuration Files (YAML/JSON)
- YAML: Use 2-space indentation.
- JSON: Use 4-space indentation and ensure it is valid via `jq`.
Commit Messages
We follow the Conventional Commits specification. This allows for automated changelog generation and versioning.
Format: `<type>(<scope>): <description>`
Common types:
- `feat`: A new feature
- `fix`: A bug fix
- `docs`: Documentation changes
- `style`: Changes that do not affect the meaning of the code (formatting, etc.)
- `refactor`: A code change that neither fixes a bug nor adds a feature
- `test`: Adding missing tests or correcting existing tests
- `chore`: Changes to the build process or auxiliary tools
Documentation
- Use Markdown for all documentation.
- Avoid AI slop phrases and maintain a professional, technical tone.
- Ensure all relative links between chapters are correct.
- Code blocks must have the appropriate language tag (e.g., `bash`, `python`, `yaml`).
This chapter is for contributors and maintainers.
Pull Request Process
This chapter outlines the steps and requirements for submitting a Pull Request (PR) to the NeuralDrive repository.
Branching Strategy
All development should occur on branches derived from the main branch. Use descriptive names for your branches:
- `feat/description-of-feature`
- `fix/description-of-bug`
- `docs/description-of-docs-change`
Preparation
Before submitting your PR:
- Sync with Main: Rebase your branch on the latest `main` to ensure there are no merge conflicts.
- Run Tests: Ensure all automated tests pass locally.
- Linting: Run `shellcheck` on shell scripts and `ruff` on Python files.
- Documentation: Update any relevant documentation files if your changes affect the system architecture or user experience.
Submission
When creating the PR on GitHub:
- Provide a clear and concise title using Conventional Commits format.
- Use the PR template to describe the changes, the motivation behind them, and how they were tested.
- Reference any related issues (e.g., `Closes #123`).
Review and Feedback
- At least one maintainer must review and approve the PR before it can be merged.
- Be prepared to address feedback and make requested changes.
- If you make updates, push them to the same branch; the PR will update automatically.
CI Requirements
The following checks must pass for a PR to be considered for merging:
- Build Validation: The `live-build` configuration must be valid.
- Linting: All linters must report zero issues.
- API Tests: The `pytest` suite for the System API must pass.
Merging Policy
NeuralDrive uses a Squash and Merge policy. This keeps the main branch history clean and ensures that each feature or fix is represented by a single, well-documented commit.
Note: Only maintainers have permission to merge PRs into the `main` branch.
This chapter is for contributors and maintainers.
Issue Guidelines
We use GitHub Issues to track bugs, feature requests, and tasks. Effective issue reporting helps maintainers understand and resolve problems faster.
Bug Reports
Before filing a bug report, search the existing issues to see if it has already been reported. If not, use the "Bug Report" template and include:
- System Version: The version of NeuralDrive you are using (found in `/etc/neuraldrive/version`).
- Hardware Specs: CPU, RAM, and specifically your GPU model and driver version.
- Steps to Reproduce: A clear, numbered list of steps that lead to the issue.
- Expected vs Actual Behavior: What you expected to happen and what actually happened.
- Logs: Relevant logs from `journalctl` or the System API.
Feature Requests
Feature requests should be submitted using the "Feature Request" template. Good requests include:
- Problem Statement: What problem does this feature solve?
- Proposed Solution: A description of how the feature should work.
- Alternatives Considered: Any other ways you thought about solving the problem.
- Context: Why this feature is important for the NeuralDrive appliance.
Issue Labels
Maintainers use several labels to organize the backlog:
- `bug`: Something is broken.
- `enhancement`: New feature or improvement.
- `documentation`: Changes to docs.
- `help wanted`: Tasks that are ready for community contribution.
- `good first issue`: Simple tasks for new contributors.
- `triage`: New issues that need further investigation.
Issue Lifecycle
- New: The issue has been created and is waiting for triage.
- Accepted: A maintainer has verified the bug or approved the feature request.
- In Progress: Someone is actively working on the issue.
- Resolved: The fix or feature has been merged into `main`.
Tip: If you find an issue you want to work on, please leave a comment so others know it is being handled.
This chapter is for contributors and maintainers.
System Overview
NeuralDrive is a specialized Linux distribution designed to function as a headless LLM appliance. It prioritizes reliability, security, and ease of use by abstracting the complexities of GPU drivers and model orchestration.
Runtime Stack
The system follows a layered architecture that moves from low-level hardware management to high-level user interfaces.
+-------------------------------------------------------+
| Web Browser (UI) |
+-------------------------------------------------------+
| (HTTPS)
+-------------------------------------------------------+
| Caddy Proxy |
| (TLS, Routing, Authentication, Rate Limiting) |
+-----------+---------------+-----------+---------------+
| | |
+-----------v-----------+ | +-------v-------+ +---|---+
| Open WebUI | | | System API | | TUI |
| (Frontend Application)| | | (FastAPI) | | (TTY) |
+-----------+-----------+ | +-------+-------+ +---|---+
| | | |
+-----------v---------------v-----------v---------------v-------+
| Ollama |
| (Inference Engine & Model Manager) |
+-------------------------------+-------------------------------+
|
+-------------------------------v-------------------------------+
| GPU Hardware / Drivers |
| (NVIDIA CUDA, AMD ROCm, Intel OneAPI) |
+---------------------------------------------------------------+
| Debian 12 Base |
+---------------------------------------------------------------+
Component Roles
Caddy Proxy
Acts as the secure gateway for the entire appliance. It handles TLS termination using self-signed or ACME-provided certificates. Caddy routes traffic to the appropriate backend service based on the URL path and enforces Bearer token authentication for API requests.
Ollama
The core inference engine. It manages model lifecycles (downloading, loading, unloading) and provides an OpenAI-compatible API. It is isolated within its own systemd service with restricted device access.
Open WebUI
A self-contained web interface that communicates with Ollama. It provides a user-friendly environment for chatting with models, managing documents (RAG), and configuring user profiles.
System API
A custom FastAPI application that provides programmatic control over the appliance. It handles tasks like restarting services, retrieving logs, and updating network configurations.
Textual TUI
A terminal-based user interface that appears on the physical console. It allows administrators to view system status, networking info, and perform the initial setup wizard without needing a network connection.
Data Flow
- User Request: An HTTPS request arrives at Caddy on port 443.
- Routing: Caddy determines if the request is for the WebUI (`/`), the inference API (`/v1/`), or the System API (`/system/`).
- Authentication: If the request is for an API endpoint, Caddy verifies the Bearer token.
- Backend Processing: The request is proxied to the relevant local service (e.g., localhost:11434 for Ollama).
- Response: The backend service returns data to Caddy, which then passes it back to the user over the encrypted connection.
This chapter is for contributors and maintainers.
Boot Sequence
The NeuralDrive boot sequence is designed to move from a cold start to a fully functional LLM appliance as quickly as possible. It uses systemd to manage parallelization and service ordering.
Timeline of Events
- Firmware (UEFI/BIOS): The system initializes hardware and locates the EFI system partition on the boot media.
- GRUB Bootloader: Loads the kernel and the initial RAM disk (initrd).
- Kernel & initramfs: The kernel boots and the `live-boot` scripts mount the compressed SquashFS filesystem. If persistence is detected, the overlayfs layer is established.
- systemd Init: The systemd process (PID 1) starts and begins processing unit files.
Service Ordering
The following list details the startup order of NeuralDrive-specific services:
1. Initialization Phase
- `neuraldrive-setup.service`: A oneshot service that runs `/usr/lib/neuraldrive/first-boot.sh`. It checks for the existence of a sentinel file (`/etc/neuraldrive/.setup-complete`). If missing, it blocks the TTY and runs the setup wizard.
- `neuraldrive-zram.service`: Configures swap space in RAM to handle memory-intensive model loading.
2. Hardware and Security Phase
- `neuraldrive-gpu-detect.service`: Runs `/usr/lib/neuraldrive/gpu-detect.sh` to identify the GPU vendor and load the appropriate kernel modules (NVIDIA, AMD, or Intel).
- `neuraldrive-certs.service`: Checks if TLS certificates exist in `/etc/neuraldrive/tls/`. If not, it generates a new self-signed CA and server certificate.
3. Application Phase
- `neuraldrive-ollama.service`: Starts the inference engine. This service `Requires=neuraldrive-gpu-detect` to ensure drivers are loaded first.
- `neuraldrive-webui.service`: Launches the Open WebUI container/process. It `Wants=neuraldrive-ollama` but can start independently.
- `neuraldrive-system-api.service`: Starts the FastAPI backend.
4. Gateway Phase
- `neuraldrive-caddy.service`: Starts the Caddy proxy. It `Requires=neuraldrive-certs` to ensure it has valid TLS material for binding to port 443.
5. Console Phase
- `neuraldrive-show-ip.service`: A simple oneshot that prints the current IP address and mDNS hostname to the console.
- TUI (`getty@tty1`): The Textual TUI is launched on the main console, providing a dashboard for local administration.
Dependency Visualization
[Hardware Detect] -> [GPU Detect] -> [Ollama] -> [WebUI]
                                                     \
[ZRAM Setup]                                          \
                                                       +-> [Caddy]
[Certs Gen] ------------------------------------------/
Note: Failures in the `gpu-detect` service will prevent `ollama` from starting, effectively putting the appliance into a "degraded" mode where only the System API and TUI are fully functional for troubleshooting.
This chapter is for contributors and maintainers.
Service Dependency Graph
NeuralDrive uses systemd to manage a complex tree of service dependencies. Understanding this graph is essential for troubleshooting startup issues and adding new components.
Dependency Types
We primarily use three types of systemd dependencies:
- `Requires=`: Strong dependency. If the required unit fails, this unit will not start.
- `Wants=`: Weak dependency. This unit will attempt to start the wanted unit, but will proceed even if it fails.
- `After=`/`Before=`: Controls ordering. Does not imply a requirement, only the sequence in which units are started.
Core Dependency Tree
The following diagram illustrates the relationship between the primary NeuralDrive services.
multi-user.target
├── neuraldrive-caddy.service
│ ├── After: network-online.target, neuraldrive-certs.service
│ ├── Requires: neuraldrive-certs.service
│ └── Wants: network-online.target
├── neuraldrive-webui.service
│ ├── After: network.target, neuraldrive-ollama.service
│ └── Wants: neuraldrive-ollama.service
├── neuraldrive-ollama.service
│ ├── After: network.target, neuraldrive-gpu-detect.service
│ └── Requires: neuraldrive-gpu-detect.service
├── neuraldrive-system-api.service
│ └── After: network.target
├── neuraldrive-gpu-detect.service
│ └── Before: neuraldrive-ollama.service
└── neuraldrive-certs.service
├── After: local-fs.target, network-online.target
└── Before: neuraldrive-caddy.service
Service Breakdown
neuraldrive-ollama
This is the most critical service in the application stack. It requires neuraldrive-gpu-detect to ensure that kernel modules for NVIDIA, AMD, or Intel GPUs are loaded before the Ollama binary attempts to initialize its compute provider.
neuraldrive-caddy
As the edge proxy, Caddy is the final piece of the puzzle. It requires `neuraldrive-certs` because it cannot bind to port 443 without valid certificate files in `/etc/neuraldrive/tls/`. It also wants and is ordered after `network-online.target` so that network interfaces are available before it starts.
neuraldrive-gpu-monitor
This service runs independently of Ollama. It Wants=neuraldrive-gpu-detect but can run in a fallback mode using CPU-only monitoring if no GPU is found.
Failure Cascades
- GPU Detection Failure: If `gpu-detect` fails, `ollama` will not start. Consequently, the WebUI will show connection errors, though the System API will remain available for logs.
- Certificate Failure: If `certs.service` fails to generate or find certificates, `caddy` will fail to start. This makes the appliance unreachable over the network via HTTPS.
Tip: Use `systemctl list-dependencies neuraldrive-caddy.service` on a running system to see a live representation of the current dependency tree.
This chapter is for contributors and maintainers.
Storage Architecture
NeuralDrive uses a hybrid storage model that combines an immutable base system with a persistent writable layer. This design ensures that the appliance remains stable over time while allowing for model storage and user data persistence.
Partition Layout
The standard NeuralDrive image expects a 4-partition layout on the boot media (typically a USB drive or internal SSD):
- Partition 1 (EFI): FAT32. Contains the GRUB bootloader and EFI binaries.
- Partition 2 (Boot): Ext4. Contains the Linux kernel and initrd.
- Partition 3 (Live System): ISO9660 or SquashFS. Contains the compressed, read-only root filesystem.
- Partition 4 (Persistence): Ext4 or LUKS2-encrypted. Contains the `persistence` layer used by `live-boot`.
Immutable Root (SquashFS)
The core operating system is stored in a highly compressed SquashFS image. During the boot process, this image is mounted as the read-only root (/). This ensures that:
- Accidental changes to system binaries are impossible.
- The system always boots into a known-good state.
- The disk footprint of the OS remains small.
Persistence and OverlayFS
To allow for data persistence, NeuralDrive uses overlayfs. This kernel feature merges the read-only SquashFS layer with a writable directory on the persistence partition.
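Conceptually, the merge performed by live-boot looks like the following. This is an illustrative sketch only; the actual mount points, option strings, and target directory used by live-boot differ.

```bash
# Read-only SquashFS mount as the lower layer, writable persistence
# partition providing the upper and work directories, merged at a
# hypothetical target before the system pivots into it.
mount -t overlay overlay \
    -o lowerdir=/run/live/rootfs/filesystem.squashfs,upperdir=/run/live/persistence/rw,workdir=/run/live/persistence/work \
    /root-merged
```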
Key Persistent Paths
While the entire filesystem can be made persistent, NeuralDrive is configured to prioritize specific directories:
- `/var/lib/neuraldrive/models/`: Stores the large LLM weights for Ollama.
- `/var/lib/neuraldrive/webui/`: Stores the Open WebUI database and user-uploaded documents.
- `/etc/neuraldrive/`: Stores system configuration, API keys, and TLS certificates.
- `/var/log/`: Persists system logs across reboots for troubleshooting.
Model Storage Management
Because LLM models can be dozens of gigabytes in size, NeuralDrive handles model storage separately from the main OS updates. When a user downloads a model via the WebUI or System API, it is saved directly to the persistence partition. This means models survive system updates (re-flashing the SquashFS partition).
Encryption (LUKS2)
For deployments requiring high security, the persistence partition can be encrypted using LUKS2. This is handled during the first-boot setup wizard. If encryption is enabled:
- The user provides a passphrase.
- The persistence partition is formatted as a LUKS2 volume.
- The system adds the necessary `crypttab` entries to the initramfs to prompt for the password at boot (see the command sketch below).
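The wizard's encryption step boils down to something like the following. This is illustrative only; the actual device path, mapper name, and filesystem label are determined at runtime.

```bash
# Format the persistence partition as LUKS2 and open it
sudo cryptsetup luksFormat --type luks2 /dev/sdX4
sudo cryptsetup open /dev/sdX4 neuraldrive-persistence

# Create the writable filesystem inside the encrypted container
sudo mkfs.ext4 -L persistence /dev/mapper/neuraldrive-persistence
```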
Warning: If the persistence partition is lost or corrupted, all downloaded models and user configurations will be deleted. Always ensure the system is shut down cleanly to prevent filesystem corruption on the writable layer.
This chapter is for contributors and maintainers.
Network Architecture
NeuralDrive is designed to operate as a secure network appliance. It uses a combination of Caddy for edge routing, Avahi for service discovery, and nftables for firewalling.
Edge Proxy (Caddy)
Caddy serves as the single point of entry for all network traffic. It listens on two ports with distinct responsibilities:
Port 443 — Web Dashboard
| External Path | Internal Destination | Purpose |
|---|---|---|
| `/*` | localhost:3000 | Open WebUI |
Port 8443 — API Gateway
| External Path | Internal Destination | Purpose |
|---|---|---|
| `/v1/*` | localhost:11434 | Ollama OpenAI-compatible API (authenticated) |
| `/api/*` | localhost:11434 | Ollama Native API (authenticated) |
| `/system/*` | localhost:3001 | System API (FastAPI) |
| `/monitor/*` | localhost:1312 | GPU Hot Dashboard |
| `/health` | 200 OK | Liveness Probe |
This dual-port architecture separates browser traffic from programmatic API access, allowing each to be managed and monitored independently.
Service Discovery (mDNS)
To simplify headless access, NeuralDrive runs avahi-daemon. By default, the appliance advertises itself as neuraldrive.local. This allows users to access the WebUI at https://neuraldrive.local without needing to know the IP address.
The mDNS name can be changed via the System API or the first-boot wizard.
Firewall (nftables)
The system uses nftables with a "default drop" policy. The firewall configuration is managed via /etc/neuraldrive/nftables.conf, loaded through a systemd drop-in at /etc/systemd/system/nftables.service.d/neuraldrive.conf.
Permitted Traffic
- Inbound TCP 443/8443: WebUI and API access.
- Inbound TCP 22: SSH (rate-limited to 3 attempts per minute).
- Inbound UDP 5353: mDNS for service discovery.
- Inbound ICMP: Echo requests (rate-limited to 5 per second).
- Outbound: All traffic permitted (required for downloading models and system updates).
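On a running appliance, the active policy can be inspected with `nft`. The inbound allowances above correspond roughly to rules like the following (a simplified sketch; the shipped `/etc/neuraldrive/nftables.conf` uses its own table and chain names):

```bash
# Dump the active firewall rules on a running appliance
sudo nft list ruleset

# Simplified equivalents of the permitted inbound traffic
sudo nft add rule inet filter input tcp dport '{ 443, 8443 }' accept
sudo nft add rule inet filter input tcp dport 22 ct state new limit rate 3/minute accept
sudo nft add rule inet filter input udp dport 5353 accept
sudo nft add rule inet filter input icmp type echo-request limit rate 5/second accept
```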
Internal Port Assignments
Services are bound to localhost whenever possible to ensure they are only accessible via the Caddy proxy or the local TUI.
- 3000: Open WebUI
- 3001: System API (FastAPI)
- 11434: Ollama
- 1312: GPU Monitor
Note: The System API and Ollama services enforce their own authentication (API keys), but Caddy provides the first layer of defense by requiring valid TLS and potentially enforcing IP-based allowlists.
This chapter is for contributors and maintainers.
Security Model
The NeuralDrive security model is built on the principle of defense-in-depth. As an appliance that often handles sensitive or private data, the system must protect against both external network attacks and local privilege escalation.
Threat Model
The primary threats NeuralDrive is designed to mitigate include:
- Unauthorized Inference: Using the LLM without a valid API key.
- System Tampering: Unauthorized changes to the system configuration or service units.
- Data Exfiltration: Accessing stored model weights or user chat history.
- Denial of Service: Exhausting system resources (GPU VRAM or system memory) through malicious requests.
Defense Layers
1. Service Isolation
Each major component runs as a dedicated, non-root user. This limits the "blast radius" if a single service is compromised.
| Service | User | UID |
|---|---|---|
| Ollama | neuraldrive-ollama | 901 |
| WebUI | neuraldrive-webui | 902 |
| Caddy | neuraldrive-caddy | 903 |
| Monitor | neuraldrive-monitor | 904 |
| System API | neuraldrive-api | 905 |
2. systemd Hardening
Every service unit employs advanced systemd hardening directives:
- `ProtectSystem=strict`: The root filesystem is read-only for the service.
- `ProtectHome=yes`: Access to `/home` is denied.
- `PrivateTmp=yes`: A private `/tmp` directory is created.
- `NoNewPrivileges=yes`: Prevents the service and its children from gaining new privileges via `setuid` binaries.
- `PrivateDevices=no`: Explicitly disabled for the Ollama service to allow access to GPU device nodes (`/dev/nvidia*`, `/dev/dri/*`) required for accelerated inference.
- DeviceAllow removal: All `DeviceAllow` lines were removed from the Ollama service unit. On cgroup v2 systems, `DeviceAllow` uses eBPF device filters that blocked CUDA access even with explicit allow rules for GPU devices. Removing these rules was necessary to enable reliable GPU acceleration.
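The effect of these directives can be audited on a running appliance with systemd's built-in tooling:

```bash
# Print the exposure/hardening report for the Ollama unit
systemd-analyze security neuraldrive-ollama.service

# Show the sandbox-related settings that are actually in effect
systemctl show neuraldrive-ollama.service \
    -p ProtectSystem -p ProtectHome -p PrivateTmp -p NoNewPrivileges -p PrivateDevices
```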
3. Authentication and Authorization
NeuralDrive uses a dual-key system for authentication:
- Admin Password: Used for local TUI access and the initial WebUI account creation.
- API Key: A 32-character token (prefixed with `nd-`) used for all programmatic access to the inference and system APIs.
The System API key is stored in /etc/neuraldrive/api.key with 0600 permissions, owned by the neuraldrive-api user.
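A typical authenticated inference request through the API gateway (port 8443, as described in the Network Architecture chapter) looks roughly like this. The key value and model name are placeholders; `-k` skips verification of the self-signed certificate, or pass the appliance CA with `--cacert` instead.

```bash
curl -sk https://neuraldrive.local:8443/v1/chat/completions \
    -H "Authorization: Bearer nd-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
    -H "Content-Type: application/json" \
    -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```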
4. Transport Layer Security (TLS)
All network communication is encrypted via TLS 1.3. The system generates a unique CA and server certificate during the first-boot process. Caddy enforces HTTPS for all routes.
SSH Security
SSH is disabled by default. When enabled via the System API:
- Only key-based authentication is permitted.
- Only the `neuraldrive-admin` user is allowed to log in.
- `fail2ban` monitors logs and bans IPs after three failed attempts.
Immutable OS
The read-only SquashFS root filesystem prevents persistent malware from being installed on the system. Any changes made to the system directories (outside of the persistence layer) are discarded on reboot.
Warning: Security is a shared responsibility. While NeuralDrive provides a hardened base, users must ensure their API keys are kept secret and that physical access to the appliance is restricted.
This chapter is for contributors and maintainers.
live-build Overview
live-build is a set of scripts used to create Debian Live system images. It is the core framework used by NeuralDrive to generate its bootable ISO files.
How live-build Works
The live-build process is divided into several stages, each responsible for a different part of the system creation:
- Bootstrap: Downloads a minimal Debian base system using `debootstrap`.
- Chroot: Enters the base system and installs additional packages, executes hooks, and applies configurations.
- Binary: Packages the chroot into a SquashFS image and creates the final bootable medium (ISO or HDD image).
- Source: (Optional) Creates a source image containing the source code for all packages used.
Configuration Logic
The behavior of live-build is controlled by the contents of the config/ directory. When you run lb config, these files are read to generate a master configuration for the build.
Key Directories
- `config/package-lists/`: Defines which packages are installed from the Debian repositories.
- `config/includes.chroot/`: Files placed here are copied directly into the chroot filesystem before it is packed.
- `config/hooks/`: Executable scripts that run inside the chroot to perform complex setup tasks.
- `config/archives/`: Custom repository definitions and GPG keys.
NeuralDrive Implementation
NeuralDrive extends the standard live-build workflow with a custom wrapper (build.sh). This wrapper handles pre-build tasks like validating the environment and post-build tasks like branding the ISO.
Build Stages in NeuralDrive
- Pre-Configuration: Setting version strings and updating model metadata.
- Standard live-build Workflow: Running `lb clean`, `lb config`, and `lb build`.
- Artifact Management: Moving the finished ISO to the `output/` directory and generating checksums.
Benefits of live-build
- Reproducibility: The same configuration produces the same image every time.
- Flexibility: Easily switch between different Debian branches (Stable, Testing, Sid).
- Automation: The entire process can be run in a CI/CD environment without manual intervention.
For official documentation on live-build, refer to the Debian Live Manual.
This chapter is for contributors and maintainers.
Directory Structure
This chapter describes the purpose of the primary directories and files in the NeuralDrive repository.
Root Directory
- `build.sh`: The main entry point for starting a build.
- `Dockerfile`: Defines the containerized build environment.
- `docker-compose.yml`: Orchestrates the builder container and volume mounts.
- `neuraldrive-build.yaml.example`: Template for CI/CD or local build configurations.
config/
The config/ directory is the heart of the live-build setup.
- `archives/`: Contains `.list` and `.key` files for external repositories (e.g., ROCm, Intel, Debian Backports).
- `hooks/`:
  - `live/`: Scripts that run inside the chroot during the build process. They must be named with a numeric prefix (e.g., `01-setup-system.chroot`).
- `includes.chroot/`: This directory mirrors the root filesystem of the final appliance.
  - `etc/neuraldrive/`: Configuration files for Ollama, WebUI, and the System API.
  - `etc/systemd/system/`: systemd unit files for all NeuralDrive services.
  - `usr/lib/neuraldrive/`: Location for custom scripts, Python applications, and their virtual environments.
- `package-lists/`:
  - `neuraldrive.list.chroot`: List of standard Debian packages to install.
  - `nvidia.list.chroot`: Packages required for NVIDIA GPU support.
- `preseed/`: (Empty) NeuralDrive uses a live system approach rather than a traditional Debian installer, so preseed files are not used.
scripts/
Contains utility scripts for developers and maintainers:
- `neuraldrive-flash.sh`: Writes a generated ISO to a physical USB drive.
- `download-models.sh`: Downloads model weights from the Ollama registry for pre-loading.
- `seed-models.sh`: Stages downloaded models into the build filesystem.
- `apply-branding.sh`: Applies NeuralDrive branding to the Open WebUI interface.
- `validate-config.sh`: Validates the build configuration before starting.
- `post-build.sh`: Post-build cleanup and image finalization.
docs/
Source files for the documentation.
- `user-guide/`: End-user documentation.
- `dev-guide/`: This developer guide.
tests/
Integration and unit tests.
- `test_api.py`: Pytest suite for the System API.
- `test-boot.sh`: Launches the ISO in QEMU for boot verification.
- `test-gpu.sh`: Shell script for on-target GPU validation.
plan/
Internal design documents and implementation plans used during the development of NeuralDrive.
This chapter is for contributors and maintainers.
Build Hooks
Hooks are scripts that live-build executes inside the chroot environment during the build process. They are used to perform configuration tasks that cannot be handled by simple file inclusion or package installation.
Hook Execution Order
Hooks are executed in alphabetical order. In NeuralDrive, we use a numeric prefix (e.g., 01-, 02-) to ensure a predictable sequence. All hooks are located in config/hooks/live/ and use the .chroot suffix.
Current Hooks Breakdown
01-setup-system.chroot
Performs base system configuration, including:
- Setting the default locale and timezone.
- Configuring the hostname.
- Creating the `neuraldrive-admin` user.
- Setting up the `sudoers` file.
02-setup-autologin.chroot
Configures the system to automatically log into the TTY1 console and launch the NeuralDrive TUI. This involves modifying getty service overrides.
03-install-extras.chroot
Installs components that are not available via standard APT repositories, such as the Ollama binary. It also handles the installation of GPU-specific firmware.
04-install-python-apps.chroot
Sets up the Python virtual environments for the System API, WebUI, and TUI. It runs pip install for all requirements and ensures the environments are correctly owned by their respective service users.
05-generate-configs.chroot
Generates default configuration files and ensures correct permissions for sensitive files (like API keys and TLS directories). It also enables all NeuralDrive systemd services.
Writing New Hooks
When adding a new hook:
- Naming: Use the `.chroot` suffix and place the file in `config/hooks/live/`.
- Interpreter: Always start with `#!/bin/sh` or `#!/bin/bash`.
- Safety: Include `set -e` to ensure the build fails if the hook encounters an error.
- Permissions: Ensure the script is executable (`chmod +x`). A minimal example follows this list.
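A minimal hook that follows these rules might look like this (the file name and its contents are purely illustrative, not an existing project hook):

```bash
#!/bin/sh
# config/hooks/live/06-example.chroot
set -e

echo "I: example hook running inside the chroot"

# Create a drop-in directory with standard permissions
install -d -m 0755 /etc/neuraldrive/example.d
```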
Note: Hooks run as the root user inside the chroot. Be careful when modifying system files and always verify that the changes will be persistent in the final SquashFS image.
This chapter is for contributors and maintainers.
Package Lists
Package lists define which software is retrieved from APT repositories and installed into the NeuralDrive image.
Core List (neuraldrive.list.chroot)
This list contains the essential packages for the appliance:
- Base System: `systemd`, `udev`, `kmod`, `ca-certificates`.
- Networking: `caddy`, `avahi-daemon`, `nftables`, `curl`, `wget`.
- Python Stack: `python3`, `python3-venv`, `python3-pip`.
- Utilities: `vim`, `htop`, `pciutils`, `usbutils`, `p7zip-full`.
GPU-Specific Lists
To support different hardware configurations, we use specialized package lists:
NVIDIA (nvidia.list.chroot)
- `nvidia-driver`: The core proprietary driver.
- `nvidia-smi`: System management interface for monitoring.
- `nvidia-cuda-toolkit`: Required for compute tasks.
- `libnvidia-encode1`: For video encoding/decoding if needed by secondary apps.
AMD (ROCm)
Packages for ROCm support are typically pulled from the official Radeon repositories defined in the archives/ directory. These include rocm-hip-sdk and amdgpu-dkms.
Intel (OneAPI)
Similar to AMD, Intel packages like intel-oneapi-runtime-libs and intel-opencl-icd are sourced from the Intel OneAPI repository.
How live-build Handles Lists
During the chroot stage, live-build reads every file with the .list.chroot extension and passes the package names to apt-get install.
Dependencies
live-build handles dependency resolution automatically. However, to keep the image size small, we explicitly use --no-install-recommends in the global build configuration.
Customizing Package Lists
If you need to add a package to your custom build:
- Create a new file in `config/package-lists/` (e.g., `custom.list.chroot`), as shown below.
- Add the package names, one per line.
- Run a new build.
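For example (the package names are illustrative):

```bash
# Create a custom package list and rebuild
cat > config/package-lists/custom.list.chroot <<'EOF'
tmux
iotop
EOF

sudo ./build.sh
```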
Tip: For temporary testing, you can add packages to `neuraldrive.list.chroot`, but it is better to keep custom additions in a separate file for maintainability.
This chapter is for contributors and maintainers.
Archive Sources
NeuralDrive supplements the standard Debian repositories with third-party archives to provide the latest GPU drivers and specialized software. These are managed via the config/archives/ directory.
Repository Configuration
Each third-party archive requires two files:
- `.list` file: Defines the repository URL and components (e.g., `deb https://repo.radeon.com/rocm/apt/latest focal main`).
- `.key` file: The GPG public key used to verify the packages in the repository.
Currently Configured Archives
Debian Backports
Used to pull newer versions of certain packages (like the Linux kernel) while remaining on the Stable (Bookworm) base.
NVIDIA Repository
Provides the latest proprietary drivers and CUDA toolkit directly from NVIDIA.
ROCm (AMD)
Provides the Radeon Open Compute stack. We pin this to specific versions to ensure compatibility with Ollama's build requirements.
Intel OneAPI
Provides the necessary libraries for Intel Arc and Data Center GPUs.
Repository Pinning
To prevent third-party repositories from accidentally upgrading core Debian packages, we use APT pinning. This is configured in config/archives/*.pref.chroot files.
Example pin for the NVIDIA repository:
Package: *
Pin: origin developer.download.nvidia.com
Pin-Priority: 600
Adding a New Archive
To add a new repository:
- Download the GPG key and place it in `config/archives/repo-name.key`.
- Create a list file at `config/archives/repo-name.list.chroot`.
- (Optional) Create a preferences file at `config/archives/repo-name.pref.chroot` if pinning is required. A command-line sketch of these steps follows.
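A quick sketch of those steps on the command line (the URL, suite, origin, and priority are placeholders for the real repository details):

```bash
# 1. Fetch and store the signing key
curl -fsSL https://repo.example.com/archive-key.asc -o config/archives/repo-name.key

# 2. Define the repository
echo "deb https://repo.example.com/debian bookworm main" \
    > config/archives/repo-name.list.chroot

# 3. (Optional) Pin the repository below the default Debian priority
cat > config/archives/repo-name.pref.chroot <<'EOF'
Package: *
Pin: origin repo.example.com
Pin-Priority: 400
EOF
```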
Warning: Be cautious when adding third-party archives. Every new source increases the risk of package conflicts and can significantly increase the size of the final ISO image. Always verify the authenticity of GPG keys before adding them to the project.
This chapter is for contributors and maintainers.
Docker Build Environment
The Docker-based build environment provides a consistent, isolated workspace for generating NeuralDrive images regardless of the host operating system.
Dockerfile Walkthrough
The Dockerfile in the project root defines the builder image. It is based on debian:bookworm to match the target OS.
Key components of the Dockerfile:
- Base Layer: Installs `live-build`, `debootstrap`, and other core utilities.
- Workdir: Sets `/build` as the working directory.
- Volume: Declares `/output` as a destination for the finished ISO.
- Entrypoint: A script that runs `lb clean`, `lb config`, and `lb build` in sequence.
Docker Compose Configuration
The docker-compose.yml simplifies the process of launching the builder with the correct permissions and mounts.
services:
  builder:
    build: .
    privileged: true
    volumes:
      - .:/build
      - ./output:/output
    environment:
      - BUILD_VARIANT=full
Privileged Mode
The privileged: true flag is mandatory. live-build uses chroot, mount, and mknod, all of which require elevated privileges. Additionally, generating SquashFS and ISO images requires access to the host's loop devices.
Building with Docker
To start a build:
docker compose run --rm builder
The finished ISO will appear in the ./output/ directory on your host machine.
Benefits and Limitations
Benefits
- No Host Contamination: Build dependencies are not installed on your primary OS.
- Cross-Platform: Build from macOS or Windows (using Docker Desktop).
- CI Readiness: The same Docker image used for local development is used in GitHub Actions.
Limitations
- Performance: Building inside a container can be slightly slower due to I/O overhead on non-Linux hosts.
- Loop Device Contention: If multiple builds are run simultaneously on the same host, they may compete for the same loop devices, leading to failures.
Tip: If you encounter "Permission Denied" errors when accessing the `./output/` directory on Linux, ensure that your user has permission to write to that folder, as files created by the root user inside the container may have restricted permissions on the host.
This chapter is for contributors and maintainers.
CI/CD Pipeline
NeuralDrive uses GitHub Actions to automate the testing, building, and distribution of system images.
Workflow Structure
The primary workflow is defined in .github/workflows/build.yml. It consists of several jobs that run in sequence.
1. Lint and Test
- Runs `shellcheck` on all scripts in `config/hooks/` and `scripts/`.
- Runs `ruff` on the Python codebase.
- Executes the `pytest` suite for the System API.
- This job runs on every push and pull request.
2. Build Matrix
When a change is merged into main or a tag is created, the build job is triggered. It uses a matrix to build multiple variants of NeuralDrive simultaneously:
- Full: Includes drivers for NVIDIA, AMD, and Intel.
- NVIDIA-Only: Optimized for NVIDIA hardware.
- Minimal: CPU-only, intended for testing and low-power hardware.
3. Artifact Publishing
Finished ISO images are uploaded as GitHub Action artifacts. For tagged releases, the workflow also:
- Generates SHA256 checksums.
- Signs the checksums using the project's GPG key.
- Creates a new GitHub Release and uploads the ISOs and signatures.
Configuration (neuraldrive-build.yaml)
The CI pipeline can be configured via a neuraldrive-build.yaml file in the repository root. This allows maintainers to:
- Toggle specific build variants.
- Define which models are pre-loaded into the images.
- Set custom version strings for nightlies.
Runner Requirements
Building ISO images requires a Linux runner with support for nested virtualization or privileged containers. We use large GitHub-hosted runners to ensure there is enough disk space and CPU power to complete builds within the 60-minute timeout.
Automated Testing in CI
In addition to static analysis, the CI pipeline attempts a "Dry Run" build:
- It runs `lb config` to verify the configuration is valid.
- It performs the bootstrap stage to ensure the Debian repositories are accessible.
- Full binary builds are only performed on the `main` branch to conserve resources.
Note: Because CI runners do not have physical GPUs, we cannot perform full GPU validation in the cloud. These tests remain a manual requirement for the release checklist.
This chapter is for contributors and maintainers.
GPU Auto-Detection
The gpu-detect.sh script is a critical component of the NeuralDrive boot sequence. It is responsible for identifying the installed hardware and ensuring the correct compute stack is initialized.
Logic Overview
The script runs during the neuraldrive-gpu-detect.service phase. It performs the following steps:
- PCI Enumeration: Uses `lspci` to scan for VGA and 3D controllers.
- Vendor Identification: Matches the PCI IDs against known vendor strings (NVIDIA, AMD, Intel).
- Module Loading: Calls `modprobe` to load the appropriate kernel modules (e.g., `nvidia`, `amdgpu`, or `i915`).
- Configuration Generation: Writes the detected state to `/run/neuraldrive/gpu.conf`. A simplified sketch of this flow follows.
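The sketch below captures the shape of the logic; the real script adds error handling, Secure Boot checks, and the NVIDIA-specific steps described in the next sections.

```bash
#!/bin/bash
set -euo pipefail

# Identify the first VGA/3D controller and extract its PCI vendor ID
gpu_line=$(lspci -nn | grep -Ei 'vga|3d controller' | head -n 1 || true)

case "$gpu_line" in
    *10de*) vendor=NVIDIA; modprobe -a nvidia nvidia-current-uvm nvidia-drm ;;
    *1002*) vendor=AMD;    modprobe amdgpu ;;
    *8086*) vendor=INTEL;  modprobe i915 ;;
    *)      vendor=CPU ;;
esac

# Publish the result for later services
mkdir -p /run/neuraldrive
echo "VENDOR=${vendor}" > /run/neuraldrive/gpu.conf
```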
Vendor Detection Details
NVIDIA
If an NVIDIA card is detected (PCI vendor ID 10de), the script:
- Loads the `nvidia`, `nvidia-current-uvm`, and `nvidia-drm` modules via `modprobe`. Note that on Debian systems, the CUDA Unified Virtual Memory module is named `nvidia-current-uvm`, not `nvidia-uvm`.
- Executes `nvidia-modprobe -u` to create the `/dev/nvidia-uvm` and `/dev/nvidia-uvm-tools` device nodes. Without these nodes, CUDA memory allocation fails silently, and Ollama falls back to CPU.
- Enables persistence mode with `nvidia-smi -pm 1`.
- Sets `VENDOR=NVIDIA` in the config file.
- If module loading fails, records `NVIDIA_MODULE_MISSING=true`.
Boot-Time Module Loading
In addition to the detection script, the system includes /etc/modules-load.d/nvidia-uvm.conf. This file contains nvidia-current-uvm to ensure the module is automatically loaded at boot.
Ollama Service Integration
As a safety net, the Ollama systemd unit also includes ExecStartPre commands for both modprobe nvidia-current-uvm and nvidia-modprobe -u. This ensures the necessary drivers and device nodes are present even if the primary detection service is delayed.
cgroup v2 and Device Access
On systems using cgroup v2, standard DeviceAllow rules in systemd units utilize eBPF filters that can inadvertently block CUDA access, even when explicit allow rules are defined. NeuralDrive avoids this by removing all DeviceAllow directives from the Ollama service and relying on PrivateDevices=no instead.
AMD
If an AMD card is detected (PCI vendor ID 1002), the script:
- Loads the `amdgpu` module.
- Sets `VENDOR=AMD`.
- If module loading fails, records `AMD_MODULE_MISSING=true`.
Intel
If an Intel GPU is detected (PCI vendor ID 8086), the script:
- Loads the `i915` module.
- Sets `VENDOR=INTEL`.
- If module loading fails, records `INTEL_MODULE_MISSING=true`.
The gpu.conf File
The output of the detection process is stored in a runtime environment file:
# /run/neuraldrive/gpu.conf
VENDOR=NVIDIA
Additional keys may be present for error conditions (e.g., NVIDIA_MODULE_MISSING=true) or Secure Boot detection (SECURE_BOOT=true). This file is available to subsequent services for determining the active compute provider.
Troubleshooting and Fallbacks
If no GPU is detected, or if module loading fails:
- The script sets `VENDOR=CPU`.
- A message is logged to standard output.
- Ollama will start in CPU-only mode, which is significantly slower but allows the appliance to remain functional.
Modifying Detection Logic
To add support for new hardware or refine the detection process, modify /usr/lib/neuraldrive/gpu-detect.sh in the repository.
Note: Changes to the detection script require a re-build of the ISO or a manual update to the file on the persistence layer for testing.
This chapter is for contributors and maintainers.
Ollama Integration
Ollama serves as the core inference engine for NeuralDrive. It is managed as a systemd service and configured to optimize resource usage on the appliance.
Installation
The Ollama binary is installed to /usr/local/bin/ollama during the build process via the 03-install-extras hook. We use the official static binary to ensure compatibility across different Debian versions.
Service Configuration
The neuraldrive-ollama.service manages the lifecycle of the inference engine.
Service Unit Highlights
- User: Runs as `neuraldrive-ollama` (UID 901).
- Dependencies: `Requires=neuraldrive-gpu-detect.service`.
- Security: The service uses `PrivateDevices=no` to allow GPU access. Note that all `DeviceAllow` directives were removed because cgroup v2's eBPF device filter blocked CUDA access even with explicit allow rules.
- Resource Limits:
  - `MemoryHigh=90%`: Triggers aggressive swapping/GC when system memory is nearly full.
  - `MemoryMax=95%`: The hard limit before the OOM killer intervenes.
- GPU Initialization: The unit includes `ExecStartPre` commands to ensure CUDA is ready:
  - `ExecStartPre=-/sbin/modprobe nvidia-current-uvm`: Loads the CUDA Unified Virtual Memory module (named `nvidia-current-uvm` in the Debian package).
  - `ExecStartPre=-/usr/bin/nvidia-modprobe -u`: Creates the `/dev/nvidia-uvm` and `/dev/nvidia-uvm-tools` device nodes.
Persistent Config Overrides
The service unit includes two EnvironmentFile directives to manage configuration:
- `EnvironmentFile=/etc/neuraldrive/ollama.conf`: Contains baked-in system defaults.
- `EnvironmentFile=-/var/lib/neuraldrive/config/ollama.conf`: Allows persistent user-defined overrides. The `-` prefix ensures the service starts even if this file is missing.
Configuration (ollama.conf)
System-wide settings are defined in the environment files:
- `OLLAMA_HOST=127.0.0.1:11434`: Ensures the API is only accessible locally (proxied by Caddy).
- `OLLAMA_MODELS=/var/lib/neuraldrive/models/`: Directs model weights to the persistence layer.
- `OLLAMA_KEEP_ALIVE=5m`: Models are unloaded from VRAM after 5 minutes of inactivity.
- `OLLAMA_MAX_LOADED_MODELS=0`: Set to `0` for auto mode. Ollama manages multiple models based on available VRAM using LRU (Least Recently Used) eviction.
- `OLLAMA_NUM_PARALLEL=1`: Processes one request at a time to maintain deterministic performance.
API Usage Details
Loading Models
To load a model, send a POST request to /api/generate with keep_alive set to -1. Note that keep_alive must be an integer; passing it as a string ("-1") will result in a rejection.
Unloading Models
To unload a model, send a POST request to /api/generate with keep_alive set to 0. To verify the eviction, poll /api/ps until the model no longer appears. A race condition exists where the 200 OK response may return before the eviction process is fully complete.
Monitoring
GET /api/ps returns a list of running models, including the size_vram utilized by each.
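The corresponding requests against the local Ollama socket look like this (the model name is a placeholder):

```bash
# Load a model and keep it resident indefinitely (keep_alive must be a JSON number)
curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.2", "keep_alive": -1}'

# Request an unload, then poll until the model disappears from /api/ps
curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.2", "keep_alive": 0}'
curl -s http://localhost:11434/api/ps
```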
GPU Support
Ollama automatically detects the compute provider based on the drivers loaded by gpu-detect.sh.
- NVIDIA: Uses the CUDA runner.
- AMD: Uses the ROCm/HIP runner.
- Intel: Uses the OneAPI runner.
- CPU: Falls back to the AVX/AVX2 optimized CPU runner.
Model Management
Models can be managed via the Open WebUI or the ollama CLI. When a model is "pulled," it is stored as a series of blobs in the persistent /var/lib/neuraldrive/models/ directory.
Tip: To interact with Ollama manually for troubleshooting, run the CLI as the service user from the `neuraldrive-admin` account: `sudo -u neuraldrive-ollama ollama list`
This chapter is for contributors and maintainers.
Open WebUI Integration
Open WebUI provides the primary user interface for NeuralDrive. It is a feature-rich chat environment that communicates with the Ollama backend.
Installation and Environment
Open WebUI is installed into a Python virtual environment at /usr/lib/neuraldrive/webui/venv/. This isolation prevents dependency conflicts with the system Python or the System API.
The service is managed by neuraldrive-webui.service and runs as the neuraldrive-webui user (UID 902).
Configuration (webui.env)
Key configuration parameters are stored in /etc/neuraldrive/webui.env:
- `OLLAMA_BASE_URL=http://localhost:11434`: The internal address of the Ollama service.
- `DATA_DIR=/var/lib/neuraldrive/webui`: Persistent location for the SQLite database and user uploads.
- `ENABLE_SIGNUP=false`: Disables public account creation for security.
- `WEBUI_AUTH=true`: Enforces login for all users.
- `WEBUI_NAME=NeuralDrive`: Customizes the branding of the interface.
Service Lifecycle
The WebUI service Wants=neuraldrive-ollama. This means systemd will attempt to start Ollama whenever the WebUI is started. However, the WebUI is capable of running even if Ollama is temporarily unavailable, showing a "Connection Error" in the settings.
Data Persistence
The /var/lib/neuraldrive/webui directory contains:
- `webui.db`: The SQLite database containing user accounts, chat history, and settings.
- `uploads/`: Documents uploaded for RAG (Retrieval-Augmented Generation).
- `cache/`: Temporary files and model templates.
Customization
To modify the default behavior of Open WebUI on NeuralDrive:
- Update the environment variables in `config/includes.chroot/etc/neuraldrive/webui.env`.
- For UI changes, the CSS or frontend assets can be modified in the source before building.
Note: Major updates to Open WebUI often require database migrations. These are handled automatically by the application on startup, but it is recommended to back up the `webui.db` file before performing a system upgrade.
This chapter is for contributors and maintainers.
Caddy Reverse Proxy
Caddy is the edge proxy and security gateway for the NeuralDrive appliance. It handles TLS termination, URL routing, and authentication for API endpoints.
The Caddyfile
The routing logic is defined in /etc/neuraldrive/Caddyfile. This file is loaded by the neuraldrive-caddy.service.
Key Routing Rules
NeuralDrive uses two separate server blocks — one for the web dashboard and one for the API gateway:
:443 {
    # TLS termination for the Web UI
    tls /etc/neuraldrive/tls/server.crt /etc/neuraldrive/tls/server.key

    # All requests proxy to Open WebUI
    reverse_proxy localhost:3000
}

:8443 {
    # TLS termination for the API gateway
    tls /etc/neuraldrive/tls/server.crt /etc/neuraldrive/tls/server.key

    # Ollama Inference API (/v1/* and /api/*)
    # Requires Bearer token matching NEURALDRIVE_API_KEY
    @api_authenticated {
        path /v1/* /api/*
        header Authorization "Bearer {env.NEURALDRIVE_API_KEY}"
    }
    handle @api_authenticated {
        reverse_proxy localhost:11434
    }

    # Unauthenticated API requests get 401
    @api_routes {
        path /v1/* /api/*
    }
    handle @api_routes {
        respond 401
    }

    # GPU Monitoring
    handle /monitor/* {
        reverse_proxy localhost:1312
    }

    # System Management API
    handle /system/* {
        reverse_proxy localhost:3001
    }

    # Health Check (public)
    handle /health {
        respond "OK" 200
    }
}
The dual-port architecture keeps the user-facing web UI separate from the machine-to-machine API gateway, allowing each to be managed independently.
Security Features
TLS Management
Caddy uses the certificates generated by neuraldrive-certs.service. By default, these are self-signed RSA 4096-bit certificates. Caddy is configured to only allow modern TLS protocols (1.2 and 1.3).
API Authentication
For requests to /v1/* and /api/*, Caddy can be configured to enforce Bearer token authentication. The valid API key is sourced from the NEURALDRIVE_API_KEY environment variable, which is populated from /etc/neuraldrive/caddy.env.
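As a quick illustration (the hostname, key value, model, and CA path are placeholders), an authenticated request to the API gateway might look like this:

import httpx

API_KEY = "nd-..."  # value of NEURALDRIVE_API_KEY (placeholder)
resp = httpx.post(
    "https://neuraldrive.local:8443/api/generate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "tinyllama", "prompt": "Hello", "stream": False},
    verify="/path/to/neuraldrive-ca.crt",  # trust the appliance's self-signed CA
)
print(resp.status_code)  # 401 if the token does not match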
Capabilities
The neuraldrive-caddy.service uses AmbientCapabilities=CAP_NET_BIND_SERVICE. This allows the non-root neuraldrive-caddy user to bind to privileged ports like 443.
Environment Variables (caddy.env)
- NEURALDRIVE_API_KEY: The master 32-character key for the appliance.
- DOMAIN_NAME: (Optional) Used for ACME/Let's Encrypt integration if the user provides a public domain.
Customizing Routes
To add a new service to the NeuralDrive stack:
- Assign it a local port (e.g., 8080).
- Add a handle_path block to the Caddyfile.
- Re-build the image or restart the neuraldrive-caddy service.
Tip: Use caddy validate --config /etc/neuraldrive/Caddyfile to check for syntax errors before restarting the service.
This chapter is for contributors and maintainers.
System Management API
The NeuralDrive System API is a custom FastAPI application that provides programmatic control over the appliance's hardware and software configuration.
Application Structure
The source code is located at /usr/lib/neuraldrive/api/neuraldrive_api/. The application is consolidated in a single entry point:
main.py: Route definitions, token verification, and all endpoint logic.
Authentication
All endpoints (except /system/ca-cert) require a Bearer token. The API verifies this token against the master key stored in /etc/neuraldrive/api.key.
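A minimal sketch of how such a check could be expressed as a FastAPI dependency (the function name and example route are illustrative, not the actual implementation):

from pathlib import Path

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()

def verify_token(credentials: HTTPAuthorizationCredentials = Depends(bearer)) -> None:
    # Compare the presented token against the master key on disk
    master_key = Path("/etc/neuraldrive/api.key").read_text().strip()
    if credentials.credentials != master_key:
        raise HTTPException(status_code=401, detail="Invalid API key")

@app.get("/system/status", dependencies=[Depends(verify_token)])
def system_status() -> dict:
    return {"status": "ok"}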
Primary Endpoints
System Status
- GET /system/status: Returns CPU/RAM usage and system uptime.
- GET /system/gpu: Returns detailed GPU metrics (temp, VRAM, utilization).
Service Management
- GET /system/services: Lists the status of all NeuralDrive services.
- POST /system/services/{name}/{action}: Allows starting, stopping, or restarting services.
- GET /system/logs: Retrieves the last N lines of the system journal for a specific service.
Configuration
- POST /system/network/hostname: Updates the system hostname and mDNS name.
- GET /system/security: Returns the current firewall status and SSH settings.
- POST /system/api-keys/rotate: Generates a new master API key.
systemd Integration
The API interacts with systemd via the systemctl CLI or the dbus Python bindings. It is limited to a whitelist of NeuralDrive-specific services to prevent unauthorized modification of core OS components.
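To illustrate the whitelist idea (the set contents and helper name are assumptions, not the shipped code), a guarded systemctl call might look like:

import subprocess

from fastapi import HTTPException

# Only NeuralDrive's own units may be managed through the API
ALLOWED_SERVICES = {
    "neuraldrive-ollama",
    "neuraldrive-webui",
    "neuraldrive-caddy",
    "neuraldrive-gpu-monitor",
}

def control_service(name: str, action: str) -> None:
    if name not in ALLOWED_SERVICES or action not in {"start", "stop", "restart"}:
        raise HTTPException(status_code=400, detail="Service or action not permitted")
    subprocess.run(["systemctl", action, f"{name}.service"], check=True)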
Development and Testing
The API can be run locally for development:
# With venv active
uvicorn main:app --host 0.0.0.0 --port 3001
Testing is handled via pytest in the tests/test_api.py file. These tests mock the system calls to ensure the API logic is correct without needing a full NeuralDrive environment.
Warning: The System API runs as a privileged user (neuraldrive-api) with specific sudo permissions to manage services. Never expose port 3001 directly to the internet; always route traffic through the Caddy proxy.
This chapter is for contributors and maintainers.
Terminal User Interface (TUI)
The NeuralDrive TUI provides a local management console for administrators. It is designed to be usable directly from a physical keyboard and monitor without requiring a network connection.
Technology Stack
The TUI is built using the Textual framework, a modern Python library for building sophisticated terminal applications. It uses async I/O to maintain a responsive interface even while performing long-running system tasks.
Interface Structure
The TUI is divided into several screens:
Dashboard
The default screen showing:
- System hostname and version.
- Current IP addresses (IPv4 and IPv6).
- mDNS address (neuraldrive.local).
- CPU, Memory, and Disk usage gauges.
- GPU status overview.
The dashboard supports manual refresh via the R key and displays a live clock.
Models
Lists all LLM models currently stored in the persistence layer. Shows model name and metadata columns (params, quantization, disk size, VRAM usage, and status). Users can Load, Unload, or Delete models. This screen refreshes automatically on user action.
Services
Provides a list of all NeuralDrive systemd units with their current status (active, inactive, failed). Users can select a service to view its recent logs or trigger a restart. This screen auto-polls every 5 seconds.
Logs
System-wide log viewer for NeuralDrive services and kernel messages.
Chat
A lightweight chat interface allowing users to test models locally. It includes a model selector dropdown and supports streaming responses via @work(exclusive=True). Model selection persists across screen switches.
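The sketch below shows one way the streaming pattern could be wired up with Textual's worker API and httpx; the widget IDs, model name, and minimal layout are illustrative assumptions rather than the shipped screen.

import json

import httpx
from textual import work
from textual.app import App, ComposeResult
from textual.widgets import Input, RichLog

class ChatSketch(App):
    """Minimal chat loop: submit a prompt, stream tokens into a log widget."""

    def compose(self) -> ComposeResult:
        yield RichLog(id="chat-log", wrap=True)
        yield Input(placeholder="Ask the model...")

    def on_input_submitted(self, event: Input.Submitted) -> None:
        self.stream_reply(event.value)

    @work(exclusive=True)  # a new prompt cancels any in-flight generation
    async def stream_reply(self, prompt: str) -> None:
        log = self.query_one("#chat-log", RichLog)
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream(
                "POST",
                "http://localhost:11434/api/generate",
                json={"model": "tinyllama", "prompt": prompt, "stream": True},
            ) as response:
                async for line in response.aiter_lines():
                    if line:
                        # Append each chunk as it arrives (a real screen would join them)
                        log.write(json.loads(line).get("response", ""))

if __name__ == "__main__":
    ChatSketch().run()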
Hotkeys
- F1: Dashboard
- F2: Models
- F3: Services
- F4: Logs
- F5: Chat
- Q: Quit
Navigation Model
The TUI uses a zone-based focus system.
- Tab / Shift+Tab: Cycle focus between different zones within a screen.
- Arrow Keys: Navigate within the currently focused zone.
- Enter: Activate the selected item or button.
Custom Widgets
Several custom composite widgets are used to build the interface:
- SafeHeader: A subclass of Textual's Header that catches NoMatches exceptions during _on_mount, working around Textual bug #4258.
- ServiceItem: Displays service name, status label, and control buttons (Start, Stop, Restart).
- ModelItem: Displays model name, metadata, and action buttons (Load, Unload, Delete).
Crash Dump Logging
The TUI overrides App._handle_exception to write crash dumps to /var/lib/neuraldrive/logs/tui-crash-*.log with a full traceback. The entire main() function is also wrapped in a try/except block to catch crashes occurring outside the Textual event loop. Screenshots are saved to /var/lib/neuraldrive/screenshots/.
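A minimal sketch of the outer crash-dump wrapper described above (the import path is hypothetical; the log directory and filename pattern follow the description):

import sys
import traceback
from datetime import datetime
from pathlib import Path

CRASH_DIR = Path("/var/lib/neuraldrive/logs")

def main() -> None:
    from neuraldrive_tui.main import NeuralDriveTUI  # hypothetical import path
    NeuralDriveTUI().run()

if __name__ == "__main__":
    try:
        main()
    except Exception:
        # Catch failures outside the Textual event loop and write a crash dump
        CRASH_DIR.mkdir(parents=True, exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        (CRASH_DIR / f"tui-crash-{stamp}.log").write_text(traceback.format_exc())
        sys.exit(1)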
CLI Flags
--wizard: Removes the sentinel file (/etc/neuraldrive/first-boot-complete) and forces the first-boot wizard to re-run on the next launch.
Command Palette
The Textual command palette is explicitly disabled (ENABLE_COMMAND_PALETTE = False).
Auto-Login and Startup
The TUI is launched automatically on TTY1 via a getty@tty1 service override created by the 02-setup-autologin.chroot build hook. This override configures autologin for the neuraldrive-admin user, and a .bashrc snippet detects TTY1 and runs /usr/local/bin/neuraldrive-tui — a launcher script that activates the Python virtual environment and starts the application.
Code Location
The source code for the TUI is located at /usr/lib/neuraldrive/tui/.
- main.py: The main NeuralDriveTUI application class and screen orchestration.
- styles.tcss: Textual CSS stylesheet for the interface.
- widgets/: Custom UI components (gauges, log viewers).
- screens/: Individual screen definitions (dashboard, models, services, network, logs, chat, wizard).
Refresh Intervals
- Dashboard: Manual refresh (R key) with live clock.
- Services: Auto-polls every 5 seconds.
- Models: Refreshes on user action.
- System Metrics: Refreshed every 2 seconds.
Modifying the TUI
To add a new screen or widget:
- Define the component in the widgets/ or screens/ directory.
- Register the new screen in main.py.
- Test locally by running python main.py (ensure you have the textual library installed in your venv).
Tip: Use the textual console tool during development to see live debug output and CSS reload notifications.
This chapter is for contributors and maintainers.
First-Boot Wizard
The First-Boot Wizard is a specialized mode of the TUI that guides the user through the initial configuration of the appliance.
Execution Trigger
The wizard is not a separate service. It is an integrated component of the TUI application (main.py). Upon startup, the TUI checks for the existence of the sentinel file /etc/neuraldrive/first-boot-complete. If this file is missing, the TUI presents the wizard interface before allowing access to the main dashboard.
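A minimal sketch of that startup check (the helper name and return values are illustrative):

from pathlib import Path

SENTINEL = Path("/etc/neuraldrive/first-boot-complete")

def initial_screen() -> str:
    # If the sentinel is missing, the wizard runs before the dashboard is shown
    return "dashboard" if SENTINEL.exists() else "wizard"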
Wizard Flow
The wizard consists of the following steps:
- Welcome: Introduction and hardware verification.
- Storage/Persistence: Detects the boot device, creates the persistence partition, and initializes the directory structure:
  - /var/lib/neuraldrive/ollama
  - /var/lib/neuraldrive/models
  - /var/lib/neuraldrive/config
  - /var/lib/neuraldrive/webui
  - /var/lib/neuraldrive/logs
- Security: Prompts for the neuraldrive-admin password and generates initial credentials.
- Network: Configuration of Ethernet or Wi-Fi.
- Models: Selection of initial models for download.
- Done: Finalizes configuration and generates the sentinel file.
Credential Generation
- Admin Password: The user is prompted to set the password for the neuraldrive-admin account.
- API Key: The system automatically generates a 32-character random string, prefixed with nd-. This key is displayed once and then stored in the persistence layer (a generation sketch follows below).
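A minimal sketch of how such a key could be generated with Python's standard library (illustrative only; the shipped generator may differ in length handling and character set):

import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def generate_api_key(length: int = 32) -> str:
    # 32 random characters, prefixed with "nd-" as described above
    return "nd-" + "".join(secrets.choice(ALPHABET) for _ in range(length))

print(generate_api_key())  # e.g. nd-XXXXXXXX...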
Sentinel File
Completion of the wizard creates the sentinel file at /etc/neuraldrive/first-boot-complete. This ensures that subsequent reboots bypass the wizard and proceed directly to the standard dashboard.
CLI Re-run
To re-run the wizard on a configured system, use the following command:
neuraldrive-tui --wizard
This command removes the sentinel file, forcing the wizard to launch on the next application start.
Customizing the Wizard
The wizard logic is integrated into the TUI application. To add a new step:
- Create a new Screen class in the screens/ directory.
- Add the screen to the wizard orchestration loop in main.py.
Note: For development, you can re-trigger the wizard by using the --wizard flag. Warning: This may overwrite existing credentials and configuration.
This chapter is for contributors and maintainers.
Certificate Generation
NeuralDrive includes an automated system for managing TLS certificates, ensuring that all network communication is encrypted from the moment the appliance first boots.
The generate-certs.sh Script
The generate-certs.sh script is located at /usr/lib/neuraldrive/generate-certs.sh. It is executed by the neuraldrive-certs.service.
Certificate Parameters
The script uses openssl to generate a self-signed Root CA and a Server Certificate with the following parameters:
- Algorithm: RSA 4096-bit.
- Digest: SHA-256.
- Validity: 365 days.
- Subject Alternative Names (SAN):
  - DNS:neuraldrive.local
  - DNS:<hostname>.local
  - IP:<eth0_ip>
  - IP:127.0.0.1
Certificate Storage
All certificate material is stored in the persistent directory /etc/neuraldrive/tls/:
- neuraldrive-ca.crt: The public Root CA certificate. Users should install this on their client machines to trust the appliance.
- server.crt: The certificate presented by Caddy to clients.
- server.key: The private key for the server certificate (permission 0600).
- ca.key: The private key for the Root CA (permission 0600).
Persistence and Regeneration
The certificates are generated once during the first-boot process. Because they are stored on the persistence partition, they survive system updates.
Regeneration Triggers
The neuraldrive-certs.service uses an ExecCondition that checks for the existence of /etc/neuraldrive/tls/server.crt. If the file is present, the service exits without action. A new certificate is generated only if:
- The server certificate file has been manually deleted.
- The system is performing its first boot and no certificates exist yet.
Exporting the CA
To allow client browsers to connect without security warnings, the neuraldrive-ca.crt can be downloaded via the System API at GET /system/ca-cert.
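For example (the hostname and output path are placeholders), the CA can be fetched and saved with a few lines of Python; certificate verification is disabled here only because the CA is not yet trusted:

import httpx

# /system/ca-cert is the one unauthenticated System API endpoint
resp = httpx.get("https://neuraldrive.local:8443/system/ca-cert", verify=False)
resp.raise_for_status()

with open("neuraldrive-ca.crt", "wb") as fh:
    fh.write(resp.content)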
Warning: Never share or export the .key files. If the private keys are compromised, the security of the appliance's network communication is invalidated.
This chapter is for contributors and maintainers.
Test Strategy
NeuralDrive uses a multi-layered testing strategy to ensure that the appliance is stable across a wide variety of hardware and software configurations.
Test Philosophy
We prioritize integration testing over unit testing. Since NeuralDrive is a system appliance, its stability depends on the interaction between the kernel, drivers, systemd, and the application stack.
Key Principles
- Reproducibility: Tests should yield the same results given the same ISO image and virtualized environment.
- Automation: Wherever possible, tests should run in the CI pipeline without manual intervention.
- Hardware Diversity: While CI handles basic logic, manual "Target Hardware" testing is mandatory for every release.
Test Categories
1. Boot Testing
Ensures the ISO image is correctly formatted and can boot to a functional state. This is primarily handled via QEMU.
2. Hardware and GPU Testing
Validates that gpu-detect.sh correctly identifies hardware and that the appropriate compute stack (CUDA, ROCm, OneAPI) is loaded. This must be done on physical hardware.
3. API Testing
Verifies the endpoints of the System API and the Ollama inference API. These tests ensure that the core logic of the appliance is working as expected.
4. Security Auditing
Periodic checks of service isolation, systemd hardening, and firewall rules. This involves running automated security scanners against the running appliance.
The Test Life Cycle
- Local Development: Developers run unit tests and QEMU boot tests.
- Pull Request: CI runs linting, API tests, and build validation.
- Pre-Release: Maintainers perform full ISO builds and verify them on a variety of target hardware.
- Post-Release: Community feedback and bug reports are triaged and integrated back into the test suite.
Note: For detailed instructions on running specific tests, refer to the subsequent chapters in this section.
This chapter is for contributors and maintainers.
QEMU Boot Tests
QEMU is the primary tool for verifying that the generated ISO images boot correctly and that the initial system services are initialized.
The test-boot.sh Script
Located at scripts/test-boot.sh, this script automates the process of launching an ISO in a virtual machine.
Usage
./scripts/test-boot.sh build/neuraldrive-dev.iso
VM Configuration
The script configures QEMU with the following parameters:
- CPU: host (if available) or qemu64.
- Memory: 8GB.
- Boot Mode: UEFI (via OVMF firmware).
- Networking: User-mode networking with port forwarding:
  - 4443 -> 443 (WebUI)
  - 3001 -> 3001 (System API)
- Persistence: A virtual 20GB disk is created to simulate the persistence partition.
What is Validated?
A successful QEMU boot test confirms:
- GRUB Integrity: The bootloader loads and displays the menu.
- Kernel/Initrd: The system successfully transitions from the initramfs to the SquashFS root.
- systemd Startup: Core services reach the multi-user.target.
- TUI Initialization: The console on TTY1 displays either the dashboard or the setup wizard.
- Network Connectivity: The virtual machine receives an IP address and the forwarded ports respond to requests (a quick way to automate this check is sketched below).
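One way to script the last check (port numbers come from the forwarding table above; the retry loop and timeout values are illustrative) is to poll the forwarded System API port until the unauthenticated CA endpoint responds:

import time
import httpx

# Port 3001 is forwarded straight to the System API; /system/ca-cert needs no token
for _ in range(60):
    try:
        if httpx.get("http://localhost:3001/system/ca-cert", timeout=2).status_code == 200:
            print("System API is up")
            break
    except httpx.TransportError:
        pass
    time.sleep(5)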
Adding New Boot Tests
To test specific scenarios (like multiple disks or specific network configurations), you can pass additional flags to test-boot.sh.
# Example: Testing with a simulated secondary disk
./scripts/test-boot.sh --extra-drive /path/to/disk.img build/neuraldrive-dev.iso
Tip: Use the -nographic flag if you are testing on a headless server. You can then connect to the TUI via a virtual serial console or SSH if enabled.
This chapter is for contributors and maintainers.
GPU Testing
GPU testing is the most critical part of the NeuralDrive validation process. Because the appliance's value depends on its ability to utilize hardware acceleration, every release must be verified on real hardware.
On-Target Validation
The test-gpu.sh script is included in the ISO image at /usr/lib/neuraldrive/test-gpu.sh.
1. Detection Verification
The first step is verifying that gpu-detect.sh has identified the hardware correctly.
cat /run/neuraldrive/gpu.conf
Check that NEURALDRIVE_GPU_VENDOR matches your hardware and that the appropriate kernel modules are loaded (lsmod | grep -E "nvidia|amdgpu|i915").
2. Compute Stack Functional Test
Run a simple inference task to ensure the compute provider (CUDA/ROCm/OneAPI) is functional.
# Verify Ollama can see the GPU
ollama list
# Run a small model
ollama run tinyllama "Hello, what is your name?"
3. VRAM and Performance
Use vendor-specific tools to monitor VRAM usage during inference:
- NVIDIA: nvidia-smi
- AMD: rocm-smi
- Intel: intel_gpu_top
Verify that the model weights are fully loaded into VRAM and that the inference speed (tokens per second) is within the expected range for the hardware.
Hot Dashboard Testing
NeuralDrive includes a dedicated GPU monitoring service (neuraldrive-gpu-monitor.service). Access the dashboard at https://<ip>/monitor/ to verify:
- Real-time temperature reporting.
- Power consumption metrics.
- Multi-GPU visibility (if applicable).
Testing Matrix
The maintainers keep a spreadsheet of verified hardware configurations. Before a major release, tests are performed on:
- NVIDIA GeForce (Consumer)
- NVIDIA RTX/A-Series (Professional)
- AMD Radeon RX (Consumer)
- Intel Arc (Consumer)
Note: If you have access to hardware not currently in our test matrix, please run test-gpu.sh and share the results on GitHub.
This chapter is for contributors and maintainers.
API Tests
NeuralDrive uses the pytest framework to test the System API and ensure that the backend logic remains correct through code changes.
Test Environment
API tests are located in tests/test_api.py. They use FastAPI's TestClient to simulate HTTP requests without needing a running server.
Mocking System Calls
Since many API endpoints interact with the underlying OS (e.g., restarting services, reading logs), the tests use the unittest.mock library to simulate these interactions. This allows the tests to run in a non-Debian environment (like a macOS development machine or a standard CI runner).
Running the Tests
To run the API test suite locally:
# Ensure your dev venv is active
pip install pytest httpx
pytest tests/test_api.py
Coverage Areas
The test suite covers:
1. Authentication
- Verifying that requests without a Bearer token are rejected.
- Verifying that incorrect tokens are rejected.
- Verifying that valid tokens allow access.
2. Service Management
- Mocking systemctl calls to verify that the API correctly handles service start/stop/restart commands.
- Verifying that the API correctly parses service status output.
3. Log Retrieval
- Testing the logic that reads and truncates system journals.
- Ensuring that the API correctly handles cases where a service does not exist or has no logs.
4. Configuration Changes
- Verifying that network configuration changes are correctly written to the internal config files.
- Testing the API key rotation logic.
Adding New Tests
When adding a new endpoint to the System API:
- Create a corresponding test function in test_api.py.
- Mock any new system calls or filesystem interactions.
- Assert that the response status code and body match the expected output (a sketch follows below).
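A minimal sketch of what such a test might look like (the import path, endpoint, and patch target are illustrative assumptions; in practice the token check would also be stubbed, for example via dependency overrides):

from unittest.mock import patch

from fastapi.testclient import TestClient

from neuraldrive_api.main import app  # hypothetical import path

client = TestClient(app)
HEADERS = {"Authorization": "Bearer test-key"}

def test_restart_service_calls_systemctl():
    # Patch the subprocess layer so no real systemctl call is made
    with patch("neuraldrive_api.main.subprocess.run") as run:
        resp = client.post("/system/services/neuraldrive-ollama/restart", headers=HEADERS)
    assert resp.status_code == 200
    run.assert_called_once()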
Tip: Use pytest -v for verbose output and pytest --cov to check the test coverage of the API source code.
This chapter is for contributors and maintainers.
Hardware Compatibility Testing
Hardware Compatibility Testing (HCT) is the process of verifying that NeuralDrive runs reliably on various physical machine configurations.
The HCT Process
HCT is performed manually by contributors and community members. It focuses on the areas where virtualization (QEMU) cannot provide accurate results.
1. Boot Compatibility
- UEFI vs BIOS: Testing boot success on both modern UEFI and legacy BIOS systems.
- Secure Boot: Verifying if the image boots with Secure Boot enabled (requires signed kernels).
- USB Controller Compatibility: Ensuring the live system can boot from USB 2.0, 3.0, and 3.1 ports.
2. Network Stability
- Ethernet Chipsets: Testing common drivers (Intel, Realtek, Mellanox).
- Wi-Fi Support: Verifying that firmware for common Wi-Fi cards (Intel Wireless, Broadcom) is included and functional.
3. Storage Performance
- Persistence Latency: Measuring the performance impact of the OverlayFS layer on different types of media (USB stick vs. NVMe SSD).
- LUKS Performance: Ensuring that encrypted persistence does not significantly degrade model loading times.
Reporting Results
We use a "Hardware Compatibility List" (HCL) to track verified systems. When reporting a test result, include:
- Manufacturer and Model (e.g., Dell PowerEdge R740, Razer Blade 15).
- CPU and RAM.
- GPU Model and VRAM.
- NeuralDrive Version.
- Status (Verified, Issues Found, Not Working).
Community Testing Program
NeuralDrive encourages users to participate in the testing program by providing pre-release "Beta" ISOs. Feedback from these tests is used to refine the gpu-detect.sh script and include missing firmware in the base image.
Tip: If a system fails to boot, capturing the output of
journalctl -b(if reachable via SSH) or taking a photo of the console screen is essential for debugging.
This chapter is for contributors and maintainers.
Performance Benchmarking
Benchmarking allows us to track the performance of the NeuralDrive appliance over time and compare different hardware configurations.
Methodology
We focus on two primary metrics: Inference Speed and Resource Efficiency.
1. Inference Speed (Tokens per Second)
This is measured using the Ollama API. We use a standardized set of prompts and models (e.g., Llama 3 8B) to ensure consistency.
- Time to First Token (TTFT): The delay between sending a request and receiving the first character.
- Tokens per Second (TPS): The average generation speed once the model has started responding.
2. Resource Efficiency
- VRAM Utilization: How much of the available GPU memory is consumed by the model weights and the KV cache.
- System Memory Overhead: The RAM usage of the base OS, Caddy, WebUI, and the System API.
- Power Consumption: Measured via nvidia-smi or external power meters during peak inference.
Benchmarking Tools
Internal Benchmark Script
NeuralDrive includes a utility at /usr/lib/neuraldrive/benchmark.sh. It performs the following:
- Downloads a specific test model.
- Runs a series of 5 prompts.
- Calculates the average TPS and TTFT (one way to derive these is sketched below).
- Logs the results along with system metadata (CPU/GPU info).
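As a rough illustration of how those two numbers can be derived from Ollama's non-streaming /api/generate response (field names are the standard Ollama timing metrics; the model and prompt are placeholders, and TTFT is only approximated here):

import httpx

resp = httpx.post(
    "http://localhost:11434/api/generate",
    json={"model": "tinyllama", "prompt": "Explain OverlayFS in one sentence.", "stream": False},
    timeout=None,
).json()

# Durations are reported in nanoseconds
ttft_s = resp["prompt_eval_duration"] / 1e9              # rough proxy for time to first token
tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)  # generation speed
print(f"TTFT ~ {ttft_s:.2f}s, TPS ~ {tps:.1f}")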
External Tools
- Ollama-Benchmark: A community tool for stress-testing Ollama instances.
- Prometheus/Grafana: For long-term monitoring of performance metrics (available via the
neuraldrive-gpu-monitorservice).
Comparing Configurations
Benchmarks are used to evaluate:
- Quantization Levels: Comparing 4-bit (q4_0) vs 8-bit (q8_0) performance.
- Driver Versions: Detecting regressions in new NVIDIA or ROCm driver releases.
- Filesystem Impact: Comparing model loading times from SquashFS vs. persistence layers.
Note: Benchmark results are highly dependent on hardware. Always include the specific CPU and GPU models when sharing performance data.
This chapter is for contributors and maintainers.
Versioning
NeuralDrive follows a structured versioning scheme to ensure that users and developers can easily identify the age and feature set of a given image.
Calendar Versioning (CalVer)
We use a variation of Calendar Versioning (CalVer) for our releases. This reflects the project's nature as a collection of upstream components (Debian, Ollama, WebUI) that change frequently.
The format is: YYYY.MM.REVISION
- YYYY: The four-digit year of release.
- MM: The two-digit month of release.
- REVISION: The total number of releases ever made. This number never resets — it always increments, even across year/month boundaries.
Examples: 2026.04.1, 2026.05.2, 2027.01.53
The REVISION serves as a monotonically increasing release counter. Given any two NeuralDrive versions, the one with the higher REVISION is always newer, regardless of the date components.
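A tiny illustration of that ordering rule (the helper name is arbitrary):

def revision(version: str) -> int:
    # "YYYY.MM.REVISION" -> the monotonically increasing release counter
    return int(version.split(".")[2])

# 2027.01.53 is newer than 2026.05.2 because its REVISION is higher
assert revision("2027.01.53") > revision("2026.05.2")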
Version File
The primary source of truth for the system version is the file /etc/neuraldrive/version. This file is written during the build process (by build.sh) into config/includes.chroot/etc/neuraldrive/version and is used by the TUI, WebUI, and System API to display the version.
Git Tags
Git tags are the source of truth for determining REVISION numbers. Tags follow the format vYYYY.MM.REVISION (e.g., v2026.04.1).
Use scripts/tag-release.sh to create the next release tag:
./scripts/tag-release.sh --dry-run # preview
./scripts/tag-release.sh # create tag
git push origin v2026.04.1 # push it
The script counts all existing v* tags and sets REVISION to count + 1.
Development Builds
Commits on main that are not on an exact release tag produce dev versions labeled with the date and short git hash:
dev-2026.04.15-a1b2c3d
The build system resolves version automatically:
- NEURALDRIVE_VERSION env var (if set explicitly)
- Exact git tag on HEAD (stripped of the v prefix)
- Dev fallback: dev-YYYY.MM.DD-SHORTHASH
Component Versioning
While the appliance has its own version, the individual components are also tracked:
- Debian Base: Debian 12 (Bookworm).
- Ollama: Tracked via the binary version (e.g., 0.1.32).
- Open WebUI: Tracked via the git tag or pip version.
The System API provides an endpoint (GET /system/status) that returns the version string.
Note: Major architectural changes that break backward compatibility with older persistence partitions will be signaled by a "Breaking Change" notice in the release notes.
This chapter is for contributors and maintainers.
Release Checklist
The release checklist ensures that every version of NeuralDrive is thoroughly tested and meets our quality standards before being distributed to the public.
Pre-Build Phase
- Changelog: All changes since the last release are documented in CHANGELOG.md.
- Version: The etc/neuraldrive/version file is updated.
- Dependencies: Python requirements and system package lists are verified for compatibility.
- Documentation: Developer and User Guides reflect the latest features and architectural changes.
Build Phase
- Clean Build: lb clean --all is run before starting the production build.
- Variants: ISO images for all supported variants (Full, NVIDIA-Only, Minimal) are generated.
- Checksums: SHA256SUMS files are created for all artifacts.
Testing Phase
- QEMU Boot: All variants successfully boot to the TUI in a virtual environment.
- NVIDIA GPU: Verified functional on at least one GeForce and one professional (A-series) card.
- AMD GPU: Verified functional on at least one ROCm-compatible Radeon card.
- Intel GPU: Verified functional on an Arc GPU (if applicable for the release).
- Setup Wizard: The first-boot experience is tested from start to finish, including persistence encryption.
- WebUI & API: All primary routes respond correctly over HTTPS.
Distribution Phase
- Signing: The SHA256SUMS file is signed using the project's GPG key.
- GitHub Release: A new release is created with a detailed description and attached artifacts.
- Social: Announcements are posted to the project's Discord, Matrix, and Twitter channels.
Post-Release Phase
- Community Support: Monitor feedback and report any critical bugs.
- Bugfix Releases: If critical regressions are found, a .1 or .2 patch is prepared immediately.
Tip: This checklist is integrated into our GitHub PR template and must be completed by the maintainer before a merge to main.
This chapter is for contributors and maintainers.
ISO Signing
To ensure the integrity and authenticity of the NeuralDrive images, every official release is digitally signed using GPG.
The Signing Process
The project maintainers use a dedicated GPG key to sign the SHA256SUMS file associated with each release.
1. Generating Checksums
sha256sum neuraldrive-*.iso > SHA256SUMS
2. Signing the Checksum File
The maintainer signs the SHA256SUMS file with a detached signature:
gpg --detach-sign --armor SHA256SUMS
This generates a SHA256SUMS.asc file.
Verification for Users
Users can verify the integrity of their download by following these steps:
1. Import the Public Key
The public key is available on the GitHub repository and key servers.
gpg --import neuraldrive-public.key
2. Verify the Signature
gpg --verify SHA256SUMS.asc SHA256SUMS
This should output "Good signature from NeuralDrive (Release Key)".
3. Verify the ISO
sha256sum -c SHA256SUMS --ignore-missing
This should output "OK" for the downloaded ISO.
Secure Boot Signing
In addition to GPG signing for distribution, the boot chain inside the ISO must be trusted by the firmware for Secure Boot to work without manual key enrollment. NeuralDrive currently uses the standard Debian shim (signed with a Microsoft-trusted key) together with Debian's GRUB and kernel binaries, which are signed by Debian's official keys and trusted via the shim.
Warning: Never use an ISO image that fails the checksum verification or signature check. This protects against corrupted downloads and potentially malicious tampering.
This chapter is for contributors and maintainers.
Image Variants
NeuralDrive is distributed in several variants, each tailored for different hardware targets and use cases. This helps keep the ISO size manageable and ensures that the system is optimized for its intended GPU provider.
Full Variant (Recommended)
The Full variant includes the complete driver stack for all supported GPU vendors:
- NVIDIA (Proprietary)
- AMD (ROCm)
- Intel (OneAPI)
Characteristics
- Size: ~6-8GB.
- Hardware Support: Any compatible system with a modern GPU.
- Best For: General use, mixed hardware environments, or users who may swap GPUs between machines.
NVIDIA-Only Variant
The NVIDIA-Only variant is optimized for systems with NVIDIA hardware. It excludes the AMD and Intel compute libraries to reduce disk footprint.
Characteristics
- Size: ~4-5GB.
- Hardware Support: NVIDIA GeForce/RTX/A-Series GPUs only.
- Best For: Dedicated AI workstations and servers using NVIDIA hardware.
Minimal (CPU-Only) Variant
The Minimal variant excludes all proprietary GPU drivers and compute stacks. It is intended for testing, development, or low-power hardware.
Characteristics
- Size: ~1.5-2GB.
- Hardware Support: CPU-only (AVX/AVX2 optimized).
- Best For: Virtual machines, CI/CD testing, or systems where the GPU is not supported by Ollama.
Build Comparison
| Feature | Full | NVIDIA-Only | Minimal |
|---|---|---|---|
| Ollama | Yes | Yes | Yes |
| WebUI | Yes | Yes | Yes |
| NVIDIA Drivers | Yes | Yes | No |
| ROCm Libraries | Yes | No | No |
| Intel OneAPI | Yes | No | No |
| SquashFS Size | Large | Medium | Small |
Custom Variants
Developers can create their own variants by modifying the config/package-lists/ directory and adding a new BUILD_VARIANT flag to the build.sh script.
Tip: For custom enterprise deployments, we recommend starting with the NVIDIA-Only or Full variant and removing any unnecessary networking or utility packages to further reduce the attack surface.