This chapter is for contributors and maintainers.
Introduction
Welcome to the NeuralDrive Developer Guide. This documentation provides a deep technical look at the internals of NeuralDrive, a headless Large Language Model (LLM) appliance built on Debian 12.
NeuralDrive is designed to transform standard hardware into a high-performance AI server with minimal configuration. It combines modern LLM runtimes with a robust, immutable system architecture to ensure stability and ease of deployment.
Target Audience
This guide is intended for:
- Contributors: Developers looking to improve the core system, add features, or fix bugs.
- Maintainers: Individuals responsible for managing the build pipeline and release process.
- Image Builders: Users who need to create custom ISO images with specific hardware drivers, pre-loaded models, or modified security policies.
Technology Stack
NeuralDrive leverages several key technologies to provide a seamless experience:
- Base System: Debian 12 (Bookworm) managed via the `live-build` framework.
- Inference Engine: Ollama for efficient local LLM execution.
- User Interface: Open WebUI for a modern, feature-rich chat interface.
- Edge Proxy: Caddy server for TLS termination, routing, and authentication.
- System Management: A custom FastAPI-based System API and a Textual-based TUI for console interactions.
The project prioritizes security through systemd hardening, dedicated service users, and an automated TLS certificate management system.
Project Vision
NeuralDrive aims to bridge the gap between complex AI research environments and production-ready appliances. By treating the entire OS as a single, reproducible unit, we ensure that the environment remains consistent across different hardware configurations.
For end-user documentation covering installation and basic usage, refer to the User Guide.
This chapter is for contributors and maintainers.
Development Environment Setup
Setting up a reliable development environment is the first step toward contributing to NeuralDrive. Because the project relies on live-build to generate a bootable Debian image, the host environment must support several low-level system tools.
Supported Environments
There are three primary ways to set up your development environment:
Option A: Debian 12 Native (Recommended)
Developing on a native Debian 12 (Bookworm) system is the most reliable method. It avoids potential issues with loop device mounting and filesystem permissions that can occur in containerized environments.
Option B: Docker (Any OS)
If you are on macOS, Windows, or a non-Debian Linux distribution, you can use the provided Docker environment. This container encapsulates all necessary build dependencies. Note that building requires privileged mode to manage loop devices for SquashFS and ISO generation.
Option C: Virtual Machine
Running Debian 12 inside a VM (via VirtualBox, Proxmox, or VMware) provides the benefits of a native environment while keeping the build system isolated from your primary OS.
Prerequisites
Regardless of your environment, you must install the core build dependencies.
Core Build Tools
Install the following packages on a Debian-based host:
sudo apt update
sudo apt install -y \
live-build \
debootstrap \
squashfs-tools \
xorriso \
grub-pc-bin \
grub-efi-amd64-bin \
mtools \
yq \
git \
curl
Python Environment
The System API and TUI are developed in Python. It is recommended to use a virtual environment for local development:
python3 -m venv venv
source venv/bin/activate
pip install textual psutil httpx rich # TUI dependencies
pip install fastapi uvicorn # API dependencies
Repository Structure
After cloning the repository, familiarize yourself with the layout:
- `config/`: The core of the `live-build` configuration.
- `config/hooks/`: Scripts executed inside the chroot during the build process.
- `config/includes.chroot/`: Files that are copied directly onto the final system filesystem.
- `scripts/`: Helper scripts for building, flashing, and testing.
- `docs/`: Markdown source for this documentation and the user guide.
Tooling and Editors
Any text editor can be used, but VS Code or Neovim are recommended for their robust support for Shell and Python.
Tip: Install the ShellCheck extension to catch common errors in hook scripts and helper utilities.
QEMU for Testing
To test the generated ISO images without flashing a physical drive, install QEMU:
sudo apt install qemu-system-x86 qemu-utils
This allows you to run the tests/test-boot.sh utility to verify that the image boots correctly in a virtualized environment.
This chapter is for contributors and maintainers.
Building from Source
NeuralDrive uses a customized live-build workflow to generate its bootable ISO images. The build process can be initiated either natively on a Debian system or through a Docker container.
Standard Build Process
The primary entry point for building is the build.sh script located in the project root.
Native Build
To start a build on a native Debian host:
sudo ./build.sh
Docker Build
If you prefer using Docker, use the provided compose configuration:
docker compose up builder
The Docker method uses privileged: true and mounts the current directory into the container to allow the build system to interact with kernel loop devices.
Build Stages
The build.sh script coordinates several distinct phases:
- Validation: Checks that the host environment has all necessary tools and that configuration files are valid.
- Configuration: Runs `lb config` to set up the live-build environment based on parameters in the `config/` directory.
- Branding: Applies NeuralDrive-specific themes, splash screens, and versioning info.
- Model Staging: Downloads base models defined in `neuraldrive-models.yaml` so they can be baked into the image (if configured).
- Chroot Construction: Downloads the Debian base and installs packages listed in `config/package-lists/`.
- Hook Execution: Runs the scripts in `config/hooks/live/` to configure services and user accounts.
- Binary Stage: Packs the filesystem into a SquashFS image and generates the final ISO.
Incremental Builds and Cleanup
Building the entire system from scratch can take between 30 and 90 minutes depending on your internet connection and CPU speed.
To reset the build environment and start fresh:
sudo lb clean --all
Warning: Avoid manually deleting files in the `chroot/` directory, as this can leave stale mount points on your host system. Always use `lb clean`.
Common Build Errors
- Loop Device Exhaustion: If the build fails during the binary stage, you may have run out of available loop devices. Run `losetup -a` to check and reboot the host if necessary (see the sketch below).
- GPG Errors: Failures during the archive staging usually indicate a missing or expired repository key in `config/archives/`.
- Space Requirements: Ensure you have at least 40 GB of free space before starting a build, as the chroot and temporary SquashFS files are large.
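Stale loop devices left over from an interrupted build can often be released without rebooting. A minimal sketch (the device name is illustrative; detach only devices that belong to the failed build):

```bash
# List all loop devices and the files they are attached to
losetup -a

# Detach a specific stale device left over from a failed build
sudo losetup -d /dev/loop7

# Or, on a dedicated build host, detach every used loop device (use with care)
sudo losetup -D
```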
The final output will be located in the build/ directory as a .iso file.
This chapter is for contributors and maintainers.
Running Tests
NeuralDrive includes a suite of automated and manual tests to ensure system stability across different hardware targets.
Boot Testing with QEMU
Before flashing to physical hardware, use QEMU to verify that the ISO image boots to the TUI login screen.
./tests/test-boot.sh build/neuraldrive-dev.iso
This script launches a virtual machine with:
- 8GB of RAM
- UEFI boot support
- A virtual disk for testing persistence
- Port forwarding for the System API (3001) and WebUI (443)
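Under the hood, the script wraps a QEMU invocation roughly like the following. This is a simplified sketch; the exact flags, disk image name, OVMF path, and forwarded ports used by `test-boot.sh` may differ.

```bash
# Create a small scratch disk for persistence testing (name is illustrative)
qemu-img create -f qcow2 persistence-test.qcow2 20G

qemu-system-x86_64 \
    -enable-kvm \
    -m 8192 \
    -bios /usr/share/ovmf/OVMF.fd \
    -cdrom build/neuraldrive-dev.iso \
    -drive file=persistence-test.qcow2,format=qcow2,if=virtio \
    -netdev user,id=net0,hostfwd=tcp::3001-:3001,hostfwd=tcp::8443-:443 \
    -device virtio-net-pci,netdev=net0
```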
GPU Validation
Because QEMU does not easily simulate a physical GPU, vendor-specific detection and inference tests must be run on real hardware.
Automatic Detection Test
sudo /usr/lib/neuraldrive/gpu-detect.sh
This should correctly identify the active GPU and generate /run/neuraldrive/gpu.conf with the appropriate vendor tags.
Inference Verification
The tests/test-gpu.sh utility performs a small inference task to verify that the Ollama service can communicate with the GPU drivers and load a model into VRAM.
API and Integration Testing
The System API is tested using pytest. These tests verify that the FastAPI endpoints correctly interact with systemd and the underlying config files.
# From the project root with the venv active
pytest tests/test_api.py
These tests cover:
- Authentication token verification
- Service status reporting
- Log retrieval
- Network configuration changes
CI Integration
Every Pull Request triggers a subset of these tests via GitHub Actions:
- Linting: ShellCheck for scripts and Ruff for Python code.
- Unit Tests: Running the API test suite against a mock system.
- Build Test: Attempting to run `lb config` and a partial `lb build` to catch configuration errors.
Note: Full ISO builds and QEMU boot tests are typically reserved for merges into the `main` branch due to their long execution time.
This chapter is for contributors and maintainers.
How to Contribute
NeuralDrive is a community-driven project. We welcome contributions of all kinds, from core system improvements to documentation updates and bug reports.
Finding an Issue
If you are looking for a place to start, check the GitHub Issues page for labels like good first issue or help wanted. These are specifically curated for new contributors.
For more complex features, it is recommended to search the existing issues or start a new Discussion thread to ensure your proposed approach aligns with the project's long-term architecture.
Types of Contributions
Core Code
Contributions to the build system (live-build configs), service units, or system scripts (gpu-detect.sh, first-boot.sh). This requires familiarity with Debian and shell scripting.
Applications
Development of the custom Python applications, including the FastAPI System API and the Textual-based TUI.
Documentation
Improving this Developer Guide or the User Guide. Clear documentation is as important as working code.
Testing and QA
Testing the latest snapshots on a variety of hardware (NVIDIA, AMD, Intel GPUs) and reporting the results.
Communication Channels
- GitHub Discussions: The primary place for architectural debate and general questions.
- Discord/Matrix: For real-time coordination and quick troubleshooting (links available in the README).
Contribution Workflow
- Fork the repository.
- Create a new branch for your work.
- Implement your changes and add tests where appropriate.
- Ensure your code follows the Code Style Guidelines.
- Submit a Pull Request.
Note: All contributors must adhere to the project's Code of Conduct to ensure a welcoming and inclusive environment for everyone.
This chapter is for contributors and maintainers.
Code Style and Standards
To maintain a consistent and maintainable codebase, all contributions must adhere to the following style guidelines.
Shell Scripting
NeuralDrive relies heavily on shell scripts for system configuration and build hooks.
- Interpreter: Use `#!/bin/bash` or `#!/bin/sh` as appropriate.
- Safety: Start all scripts with `set -euo pipefail`.
- Indentation: Use 4 spaces for indentation.
- Linting: All scripts must pass `shellcheck` without warnings.
- Functions: Define logic in functions rather than a flat script structure (a short skeleton follows this list).
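A minimal skeleton that satisfies these rules (the function names and log prefix are illustrative, not project conventions):

```bash
#!/bin/bash
set -euo pipefail

log() {
    echo "neuraldrive: $*" >&2
}

configure_example() {
    # 4-space indentation, logic wrapped in functions
    log "applying example configuration"
}

main() {
    configure_example
}

main "$@"
```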
Python
The System API and TUI are written in Python.
- Style: Adhere to PEP 8.
- Indentation: Use 4 spaces.
- Formatting: Use `ruff` for both linting and formatting.
- Types: Use Python type hints for all function signatures and complex variables.
- Dependencies: New dependencies must be added to the appropriate `requirements.txt` file and justified in the PR.
Configuration Files (YAML/JSON)
- YAML: Use 2-space indentation.
- JSON: Use 4-space indentation and ensure it is valid via `jq`.
Commit Messages
We follow the Conventional Commits specification. This allows for automated changelog generation and versioning.
Format: `<type>(<scope>): <description>`
Common types:
- `feat`: A new feature
- `fix`: A bug fix
- `docs`: Documentation changes
- `style`: Changes that do not affect the meaning of the code (formatting, etc.)
- `refactor`: A code change that neither fixes a bug nor adds a feature
- `test`: Adding missing tests or correcting existing tests
- `chore`: Changes to the build process or auxiliary tools
Documentation
- Use Markdown for all documentation.
- Avoid AI slop phrases and maintain a professional, technical tone.
- Ensure all relative links between chapters are correct.
- Code blocks must have the appropriate language tag (e.g., `bash`, `python`, `yaml`).
This chapter is for contributors and maintainers.
Pull Request Process
This chapter outlines the steps and requirements for submitting a Pull Request (PR) to the NeuralDrive repository.
Branching Strategy
All development should occur on branches derived from the main branch. Use descriptive names for your branches:
- `feat/description-of-feature`
- `fix/description-of-bug`
- `docs/description-of-docs-change`
Preparation
Before submitting your PR:
- Sync with Main: Rebase your branch on the latest `main` to ensure there are no merge conflicts.
- Run Tests: Ensure all automated tests pass locally.
- Linting: Run `shellcheck` on shell scripts and `ruff` on Python files.
- Documentation: Update any relevant documentation files if your changes affect the system architecture or user experience.
Submission
When creating the PR on GitHub:
- Provide a clear and concise title using Conventional Commits format.
- Use the PR template to describe the changes, the motivation behind them, and how they were tested.
- Reference any related issues (e.g., `Closes #123`).
Review and Feedback
- At least one maintainer must review and approve the PR before it can be merged.
- Be prepared to address feedback and make requested changes.
- If you make updates, push them to the same branch; the PR will update automatically.
CI Requirements
The following checks must pass for a PR to be considered for merging:
- Build Validation: The `live-build` configuration must be valid.
- Linting: All linters must report zero issues.
- API Tests: The `pytest` suite for the System API must pass.
Merging Policy
NeuralDrive uses a Squash and Merge policy. This keeps the main branch history clean and ensures that each feature or fix is represented by a single, well-documented commit.
Note: Only maintainers have permission to merge PRs into the `main` branch.
This chapter is for contributors and maintainers.
Issue Guidelines
We use GitHub Issues to track bugs, feature requests, and tasks. Effective issue reporting helps maintainers understand and resolve problems faster.
Bug Reports
Before filing a bug report, search the existing issues to see if it has already been reported. If not, use the "Bug Report" template and include:
- System Version: The version of NeuralDrive you are using (found in `/etc/neuraldrive/version`).
- Hardware Specs: CPU, RAM, and specifically your GPU model and driver version.
- Steps to Reproduce: A clear, numbered list of steps that lead to the issue.
- Expected vs Actual Behavior: What you expected to happen and what actually happened.
- Logs: Relevant logs from `journalctl` or the System API.
Feature Requests
Feature requests should be submitted using the "Feature Request" template. Good requests include:
- Problem Statement: What problem does this feature solve?
- Proposed Solution: A description of how the feature should work.
- Alternatives Considered: Any other ways you thought about solving the problem.
- Context: Why this feature is important for the NeuralDrive appliance.
Issue Labels
Maintainers use several labels to organize the backlog:
- `bug`: Something is broken.
- `enhancement`: New feature or improvement.
- `documentation`: Changes to docs.
- `help wanted`: Tasks that are ready for community contribution.
- `good first issue`: Simple tasks for new contributors.
- `triage`: New issues that need further investigation.
Issue Lifecycle
- New: The issue has been created and is waiting for triage.
- Accepted: A maintainer has verified the bug or approved the feature request.
- In Progress: Someone is actively working on the issue.
- Resolved: The fix or feature has been merged into `main`.
Tip: If you find an issue you want to work on, please leave a comment so others know it is being handled.
This chapter is for contributors and maintainers.
System Overview
NeuralDrive is a specialized Linux distribution designed to function as a headless LLM appliance. It prioritizes reliability, security, and ease of use by abstracting the complexities of GPU drivers and model orchestration.
Runtime Stack
The system follows a layered architecture that moves from low-level hardware management to high-level user interfaces.
+-------------------------------------------------------+
| Web Browser (UI) |
+-------------------------------------------------------+
| (HTTPS)
+-------------------------------------------------------+
| Caddy Proxy |
| (TLS, Routing, Authentication, Rate Limiting) |
+-----------+---------------+-----------+---------------+
| | |
+-----------v-----------+ | +-------v-------+ +---|---+
| Open WebUI | | | System API | | TUI |
| (Frontend Application)| | | (FastAPI) | | (TTY) |
+-----------+-----------+ | +-------+-------+ +---|---+
| | | |
+-----------v---------------v-----------v---------------v-------+
| Ollama |
| (Inference Engine & Model Manager) |
+-------------------------------+-------------------------------+
|
+-------------------------------v-------------------------------+
| GPU Hardware / Drivers |
| (NVIDIA CUDA, AMD ROCm, Intel OneAPI) |
+---------------------------------------------------------------+
| Debian 12 Base |
+---------------------------------------------------------------+
Component Roles
Caddy Proxy
Acts as the secure gateway for the entire appliance. It handles TLS termination using self-signed or ACME-provided certificates. Caddy routes traffic to the appropriate backend service based on the URL path and enforces Bearer token authentication for API requests.
Ollama
The core inference engine. It manages model lifecycles (downloading, loading, unloading) and provides an OpenAI-compatible API. It is isolated within its own systemd service with restricted device access.
Open WebUI
A self-contained web interface that communicates with Ollama. It provides a user-friendly environment for chatting with models, managing documents (RAG), and configuring user profiles.
System API
A custom FastAPI application that provides programmatic control over the appliance. It handles tasks like restarting services, retrieving logs, and updating network configurations.
Textual TUI
A terminal-based user interface that appears on the physical console. It allows administrators to view system status, networking info, and perform the initial setup wizard without needing a network connection.
Data Flow
- User Request: An HTTPS request arrives at Caddy on port 443.
- Routing: Caddy determines if the request is for the WebUI (`/`), the inference API (`/v1/`), or the System API (`/system/`).
- Authentication: If the request is for an API endpoint, Caddy verifies the Bearer token.
- Backend Processing: The request is proxied to the relevant local service (e.g., localhost:11434 for Ollama).
- Response: The backend service returns data to Caddy, which then passes it back to the user over the encrypted connection.
This chapter is for contributors and maintainers.
Boot Sequence
The NeuralDrive boot sequence is designed to move from a cold start to a fully functional LLM appliance as quickly as possible. It uses systemd to manage parallelization and service ordering.
Timeline of Events
- Firmware (UEFI/BIOS): The system initializes hardware and locates the EFI system partition on the boot media.
- GRUB Bootloader: Loads the kernel and the initial RAM disk (initrd).
- Kernel & initramfs: The kernel boots and the `live-boot` scripts mount the compressed SquashFS filesystem. If persistence is detected, the overlayfs layer is established.
- systemd Init: The systemd process (PID 1) starts and begins processing unit files.
Service Ordering
The following list details the startup order of NeuralDrive-specific services:
1. Initialization Phase
- `neuraldrive-setup.service`: A oneshot service that runs `/usr/lib/neuraldrive/first-boot.sh`. It checks for the existence of a sentinel file (`/etc/neuraldrive/.setup-complete`). If missing, it blocks the TTY and runs the setup wizard.
- `neuraldrive-zram.service`: Configures swap space in RAM to handle memory-intensive model loading.
2. Hardware and Security Phase
- `neuraldrive-gpu-detect.service`: Runs `/usr/lib/neuraldrive/gpu-detect.sh` to identify the GPU vendor and load the appropriate kernel modules (NVIDIA, AMD, or Intel).
- `neuraldrive-certs.service`: Checks if TLS certificates exist in `/etc/neuraldrive/tls/`. If not, it generates a new self-signed CA and server certificate.
3. Application Phase
- `neuraldrive-ollama.service`: Starts the inference engine. This service `Requires=neuraldrive-gpu-detect` to ensure drivers are loaded first.
- `neuraldrive-webui.service`: Launches the Open WebUI container/process. It `Wants=neuraldrive-ollama` but can start independently.
- `neuraldrive-system-api.service`: Starts the FastAPI backend.
4. Gateway Phase
- `neuraldrive-caddy.service`: Starts the Caddy proxy. It `Requires=neuraldrive-certs` to ensure it has valid TLS material for binding to port 443.
5. Console Phase
- `neuraldrive-show-ip.service`: A simple oneshot that prints the current IP address and mDNS hostname to the console.
- TUI (`getty@tty1`): The Textual TUI is launched on the main console, providing a dashboard for local administration.
Dependency Visualization
[Hardware Detect] -> [GPU Detect] -> [Ollama] -> [WebUI]
                                                     \
[ZRAM Setup]                                          \
                                                       +-> [Caddy]
[Certs Gen] ------------------------------------------/
Note: Failures in the `gpu-detect` service will prevent `ollama` from starting, effectively putting the appliance into a "degraded" mode where only the System API and TUI are fully functional for troubleshooting.
This chapter is for contributors and maintainers.
Service Dependency Graph
NeuralDrive uses systemd to manage a complex tree of service dependencies. Understanding this graph is essential for troubleshooting startup issues and adding new components.
Dependency Types
We primarily use three types of systemd dependencies:
- `Requires=`: Strong dependency. If the required unit fails, this unit will not start.
- `Wants=`: Weak dependency. This unit will attempt to start the wanted unit, but will proceed even if it fails.
- `After=`/`Before=`: Controls ordering. Does not imply a requirement, only the sequence in which units are started.
Core Dependency Tree
The following diagram illustrates the relationship between the primary NeuralDrive services.
multi-user.target
├── neuraldrive-caddy.service
│ ├── After: network-online.target, neuraldrive-certs.service
│ ├── Requires: neuraldrive-certs.service
│ └── Wants: network-online.target
├── neuraldrive-webui.service
│ ├── After: network.target, neuraldrive-ollama.service
│ └── Wants: neuraldrive-ollama.service
├── neuraldrive-ollama.service
│ ├── After: network.target, neuraldrive-gpu-detect.service
│ └── Requires: neuraldrive-gpu-detect.service
├── neuraldrive-system-api.service
│ └── After: network.target
├── neuraldrive-gpu-detect.service
│ └── Before: neuraldrive-ollama.service
└── neuraldrive-certs.service
├── After: local-fs.target, network-online.target
└── Before: neuraldrive-caddy.service
Service Breakdown
neuraldrive-ollama
This is the most critical service in the application stack. It requires neuraldrive-gpu-detect to ensure that kernel modules for NVIDIA, AMD, or Intel GPUs are loaded before the Ollama binary attempts to initialize its compute provider.
neuraldrive-caddy
As the edge proxy, Caddy is the final piece of the puzzle. It requires `neuraldrive-certs` because it cannot bind to port 443 without valid certificate files in `/etc/neuraldrive/tls/`. It also wants and is ordered after `network-online.target` so that network interfaces are available before it starts.
neuraldrive-gpu-monitor
This service runs independently of Ollama. It Wants=neuraldrive-gpu-detect but can run in a fallback mode using CPU-only monitoring if no GPU is found.
Failure Cascades
- GPU Detection Failure: If `gpu-detect` fails, `ollama` will not start. Consequently, the WebUI will show connection errors, though the System API will remain available for logs.
- Certificate Failure: If `certs.service` fails to generate or find certificates, `caddy` will fail to start. This makes the appliance unreachable over the network via HTTPS.
Tip: Use `systemctl list-dependencies neuraldrive-caddy.service` on a running system to see a live representation of the current dependency tree.
This chapter is for contributors and maintainers.
Storage Architecture
NeuralDrive uses a hybrid storage model that combines an immutable base system with a persistent writable layer. This design ensures that the appliance remains stable over time while allowing for model storage and user data persistence.
Partition Layout
The standard NeuralDrive image expects a 4-partition layout on the boot media (typically a USB drive or internal SSD):
- Partition 1 (EFI): FAT32. Contains the GRUB bootloader and EFI binaries.
- Partition 2 (Boot): Ext4. Contains the Linux kernel and initrd.
- Partition 3 (Live System): ISO9660 or SquashFS. Contains the compressed, read-only root filesystem.
- Partition 4 (Persistence): Ext4 or LUKS2-encrypted. Contains the `persistence` layer used by `live-boot`.
Immutable Root (SquashFS)
The core operating system is stored in a highly compressed SquashFS image. During the boot process, this image is mounted as the read-only root (/). This ensures that:
- Accidental changes to system binaries are impossible.
- The system always boots into a known-good state.
- The disk footprint of the OS remains small.
Persistence and OverlayFS
To allow for data persistence, NeuralDrive uses overlayfs. This kernel feature merges the read-only SquashFS layer with a writable directory on the persistence partition.
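Conceptually, the merge performed by live-boot looks like the following. This is an illustrative sketch only; the actual mount points, option strings, and target directory used by live-boot differ.

```bash
# Read-only SquashFS mount as the lower layer, writable persistence
# partition providing the upper and work directories, merged at a
# hypothetical target before the system pivots into it.
mount -t overlay overlay \
    -o lowerdir=/run/live/rootfs/filesystem.squashfs,upperdir=/run/live/persistence/rw,workdir=/run/live/persistence/work \
    /root-merged
```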
Key Persistent Paths
While the entire filesystem can be made persistent, NeuralDrive is configured to prioritize specific directories:
- `/var/lib/neuraldrive/models/`: Stores the large LLM weights for Ollama.
- `/var/lib/neuraldrive/webui/`: Stores the Open WebUI database and user-uploaded documents.
- `/etc/neuraldrive/`: Stores system configuration, API keys, and TLS certificates.
- `/var/log/`: Persists system logs across reboots for troubleshooting.
Model Storage Management
Because LLM models can be dozens of gigabytes in size, NeuralDrive handles model storage separately from the main OS updates. When a user downloads a model via the WebUI or System API, it is saved directly to the persistence partition. This means models survive system updates (re-flashing the SquashFS partition).
Encryption (LUKS2)
For deployments requiring high security, the persistence partition can be encrypted using LUKS2. This is handled during the first-boot setup wizard. If encryption is enabled:
- The user provides a passphrase.
- The persistence partition is formatted as a LUKS2 volume.
- The system adds the necessary `crypttab` entries to the initramfs to prompt for the password at boot (see the command sketch below).
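The wizard's encryption step boils down to something like the following. This is illustrative only; the actual device path, mapper name, and filesystem label are determined at runtime.

```bash
# Format the persistence partition as LUKS2 and open it
sudo cryptsetup luksFormat --type luks2 /dev/sdX4
sudo cryptsetup open /dev/sdX4 neuraldrive-persistence

# Create the writable filesystem inside the encrypted container
sudo mkfs.ext4 -L persistence /dev/mapper/neuraldrive-persistence
```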
Warning: If the persistence partition is lost or corrupted, all downloaded models and user configurations will be deleted. Always ensure the system is shut down cleanly to prevent filesystem corruption on the writable layer.
This chapter is for contributors and maintainers.
Network Architecture
NeuralDrive is designed to operate as a secure network appliance. It uses a combination of Caddy for edge routing, Avahi for service discovery, and nftables for firewalling.
Edge Proxy (Caddy)
Caddy serves as the single point of entry for all network traffic. It listens on two ports with distinct responsibilities:
Port 443 — Web Dashboard
| External Path | Internal Destination | Purpose |
|---|---|---|
| `/*` | localhost:3000 | Open WebUI |
Port 8443 — API Gateway
| External Path | Internal Destination | Purpose |
|---|---|---|
| `/v1/*` | localhost:11434 | Ollama OpenAI-compatible API (authenticated) |
| `/api/*` | localhost:11434 | Ollama Native API (authenticated) |
| `/system/*` | localhost:3001 | System API (FastAPI) |
| `/monitor/*` | localhost:1312 | GPU Hot Dashboard |
| `/health` | 200 OK | Liveness Probe |
This dual-port architecture separates browser traffic from programmatic API access, allowing each to be managed and monitored independently.
Service Discovery (mDNS)
To simplify headless access, NeuralDrive runs avahi-daemon. By default, the appliance advertises itself as neuraldrive.local. This allows users to access the WebUI at https://neuraldrive.local without needing to know the IP address.
The mDNS name can be changed via the System API or the first-boot wizard.
Firewall (nftables)
The system uses nftables with a "default drop" policy. The firewall configuration is managed via /etc/neuraldrive/nftables.conf, loaded through a systemd drop-in at /etc/systemd/system/nftables.service.d/neuraldrive.conf.
Permitted Traffic
- Inbound TCP 443/8443: WebUI and API access.
- Inbound TCP 22: SSH (rate-limited to 3 attempts per minute).
- Inbound UDP 5353: mDNS for service discovery.
- Inbound ICMP: Echo requests (rate-limited to 5 per second).
- Outbound: All traffic permitted (required for downloading models and system updates).
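On a running appliance, the active policy can be inspected with `nft`. The inbound allowances above correspond roughly to rules like the following (a simplified sketch; the shipped `/etc/neuraldrive/nftables.conf` uses its own table and chain names):

```bash
# Dump the active firewall rules on a running appliance
sudo nft list ruleset

# Simplified equivalents of the permitted inbound traffic
sudo nft add rule inet filter input tcp dport '{ 443, 8443 }' accept
sudo nft add rule inet filter input tcp dport 22 ct state new limit rate 3/minute accept
sudo nft add rule inet filter input udp dport 5353 accept
sudo nft add rule inet filter input icmp type echo-request limit rate 5/second accept
```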
Internal Port Assignments
Services are bound to localhost whenever possible to ensure they are only accessible via the Caddy proxy or the local TUI.
- 3000: Open WebUI
- 3001: System API (FastAPI)
- 11434: Ollama
- 1312: GPU Monitor
Note: The System API and Ollama services enforce their own authentication (API keys), but Caddy provides the first layer of defense by requiring valid TLS and potentially enforcing IP-based allowlists.
This chapter is for contributors and maintainers.
Security Model
The NeuralDrive security model is built on the principle of defense-in-depth. As an appliance that often handles sensitive or private data, the system must protect against both external network attacks and local privilege escalation.
Threat Model
The primary threats NeuralDrive is designed to mitigate include:
- Unauthorized Inference: Using the LLM without a valid API key.
- System Tampering: Unauthorized changes to the system configuration or service units.
- Data Exfiltration: Accessing stored model weights or user chat history.
- Denial of Service: Exhausting system resources (GPU VRAM or system memory) through malicious requests.
Defense Layers
1. Service Isolation
Each major component runs as a dedicated, non-root user. This limits the "blast radius" if a single service is compromised.
| Service | User | UID |
|---|---|---|
| Ollama | neuraldrive-ollama | 901 |
| WebUI | neuraldrive-webui | 902 |
| Caddy | neuraldrive-caddy | 903 |
| Monitor | neuraldrive-monitor | 904 |
| System API | neuraldrive-api | 905 |
2. systemd Hardening
Every service unit employs advanced systemd hardening directives:
- `ProtectSystem=strict`: The root filesystem is read-only for the service.
- `ProtectHome=yes`: Access to `/home` is denied.
- `PrivateTmp=yes`: A private `/tmp` directory is created.
- `NoNewPrivileges=yes`: Prevents the service and its children from gaining new privileges via `setuid` binaries.
- `PrivateDevices=no`: Explicitly disabled for the Ollama service to allow access to GPU device nodes (`/dev/nvidia*`, `/dev/dri/*`) required for accelerated inference.
- DeviceAllow removal: All `DeviceAllow` lines were removed from the Ollama service unit. On cgroup v2 systems, `DeviceAllow` uses eBPF device filters that blocked CUDA access even with explicit allow rules for GPU devices. Removing these rules was necessary to enable reliable GPU acceleration.
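The effect of these directives can be audited on a running appliance with systemd's built-in tooling:

```bash
# Print the exposure/hardening report for the Ollama unit
systemd-analyze security neuraldrive-ollama.service

# Show the sandbox-related settings that are actually in effect
systemctl show neuraldrive-ollama.service \
    -p ProtectSystem -p ProtectHome -p PrivateTmp -p NoNewPrivileges -p PrivateDevices
```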
3. Authentication and Authorization
NeuralDrive uses a dual-key system for authentication:
- Admin Password: Used for local TUI access and the initial WebUI account creation.
- API Key: A 32-character token (prefixed with `nd-`) used for all programmatic access to the inference and system APIs.
The System API key is stored in /etc/neuraldrive/api.key with 0600 permissions, owned by the neuraldrive-api user.
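A typical authenticated inference request through the API gateway (port 8443, as described in the Network Architecture chapter) looks roughly like this. The key value and model name are placeholders; `-k` skips verification of the self-signed certificate, or pass the appliance CA with `--cacert` instead.

```bash
curl -sk https://neuraldrive.local:8443/v1/chat/completions \
    -H "Authorization: Bearer nd-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
    -H "Content-Type: application/json" \
    -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```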
4. Transport Layer Security (TLS)
All network communication is encrypted via TLS 1.3. The system generates a unique CA and server certificate during the first-boot process. Caddy enforces HTTPS for all routes.
SSH Security
SSH is disabled by default. When enabled via the System API:
- Only key-based authentication is permitted.
- Only the `neuraldrive-admin` user is allowed to log in.
- `fail2ban` monitors logs and bans IPs after three failed attempts.
Immutable OS
The read-only SquashFS root filesystem prevents persistent malware from being installed on the system. Any changes made to the system directories (outside of the persistence layer) are discarded on reboot.
Warning: Security is a shared responsibility. While NeuralDrive provides a hardened base, users must ensure their API keys are kept secret and that physical access to the appliance is restricted.
This chapter is for contributors and maintainers.
live-build Overview
live-build is a set of scripts used to create Debian Live system images. It is the core framework used by NeuralDrive to generate its bootable ISO files.
How live-build Works
The live-build process is divided into several stages, each responsible for a different part of the system creation:
- Bootstrap: Downloads a minimal Debian base system using `debootstrap`.
- Chroot: Enters the base system and installs additional packages, executes hooks, and applies configurations.
- Binary: Packages the chroot into a SquashFS image and creates the final bootable medium (ISO or HDD image).
- Source: (Optional) Creates a source image containing the source code for all packages used.
Configuration Logic
The behavior of live-build is controlled by the contents of the config/ directory. When you run lb config, these files are read to generate a master configuration for the build.
Key Directories
- `config/package-lists/`: Defines which packages are installed from the Debian repositories.
- `config/includes.chroot/`: Files placed here are copied directly into the chroot filesystem before it is packed.
- `config/hooks/`: Executable scripts that run inside the chroot to perform complex setup tasks.
- `config/archives/`: Custom repository definitions and GPG keys.
NeuralDrive Implementation
NeuralDrive extends the standard live-build workflow with a custom wrapper (build.sh). This wrapper handles pre-build tasks like validating the environment and post-build tasks like branding the ISO.
Build Stages in NeuralDrive
- Pre-Configuration: Setting version strings and updating model metadata.
- Standard live-build Workflow: Running `lb clean`, `lb config`, and `lb build`.
- Artifact Management: Moving the finished ISO to the `output/` directory and generating checksums.
Benefits of live-build
- Reproducibility: The same configuration produces the same image every time.
- Flexibility: Easily switch between different Debian branches (Stable, Testing, Sid).
- Automation: The entire process can be run in a CI/CD environment without manual intervention.
For official documentation on live-build, refer to the Debian Live Manual.
This chapter is for contributors and maintainers.
Directory Structure
This chapter describes the purpose of the primary directories and files in the NeuralDrive repository.
Root Directory
- `build.sh`: The main entry point for starting a build.
- `Dockerfile`: Defines the containerized build environment.
- `docker-compose.yml`: Orchestrates the builder container and volume mounts.
- `neuraldrive-build.yaml.example`: Template for CI/CD or local build configurations.
config/
The config/ directory is the heart of the live-build setup.
- `archives/`: Contains `.list` and `.key` files for external repositories (e.g., ROCm, Intel, Debian Backports).
- `hooks/`:
  - `live/`: Scripts that run inside the chroot during the build process. They must be named with a numeric prefix (e.g., `01-setup-system.chroot`).
- `includes.chroot/`: This directory mirrors the root filesystem of the final appliance.
  - `etc/neuraldrive/`: Configuration files for Ollama, WebUI, and the System API.
  - `etc/systemd/system/`: systemd unit files for all NeuralDrive services.
  - `usr/lib/neuraldrive/`: Location for custom scripts, Python applications, and their virtual environments.
- `package-lists/`:
  - `neuraldrive.list.chroot`: List of standard Debian packages to install.
  - `nvidia.list.chroot`: Packages required for NVIDIA GPU support.
- `preseed/`: (Empty) NeuralDrive uses a live system approach rather than a traditional Debian installer, so preseed files are not used.
scripts/
Contains utility scripts for developers and maintainers:
- `neuraldrive-flash.sh`: Writes a generated ISO to a physical USB drive.
- `download-models.sh`: Downloads model weights from the Ollama registry for pre-loading.
- `seed-models.sh`: Stages downloaded models into the build filesystem.
- `apply-branding.sh`: Applies NeuralDrive branding to the Open WebUI interface.
- `validate-config.sh`: Validates the build configuration before starting.
- `post-build.sh`: Post-build cleanup and image finalization.
docs/
Source files for the documentation.
- `user-guide/`: End-user documentation.
- `dev-guide/`: This developer guide.
tests/
Integration and unit tests.
- `test_api.py`: Pytest suite for the System API.
- `test-boot.sh`: Launches the ISO in QEMU for boot verification.
- `test-gpu.sh`: Shell script for on-target GPU validation.
plan/
Internal design documents and implementation plans used during the development of NeuralDrive.
This chapter is for contributors and maintainers.
Build Hooks
Hooks are scripts that live-build executes inside the chroot environment during the build process. They are used to perform configuration tasks that cannot be handled by simple file inclusion or package installation.
Hook Execution Order
Hooks are executed in alphabetical order. In NeuralDrive, we use a numeric prefix (e.g., 01-, 02-) to ensure a predictable sequence. All hooks are located in config/hooks/live/ and use the .chroot suffix.
Current Hooks Breakdown
01-setup-system.chroot
Performs base system configuration, including:
- Setting the default locale and timezone.
- Configuring the hostname.
- Creating the `neuraldrive-admin` user.
- Setting up the `sudoers` file.
02-setup-autologin.chroot
Configures the system to automatically log into the TTY1 console and launch the NeuralDrive TUI. This involves modifying getty service overrides.
03-install-extras.chroot
Installs components that are not available via standard APT repositories, such as the Ollama binary. It also handles the installation of GPU-specific firmware.
04-install-python-apps.chroot
Sets up the Python virtual environments for the System API, WebUI, and TUI. It runs pip install for all requirements and ensures the environments are correctly owned by their respective service users.
05-generate-configs.chroot
Generates default configuration files and ensures correct permissions for sensitive files (like API keys and TLS directories). It also enables all NeuralDrive systemd services.
Writing New Hooks
When adding a new hook:
- Naming: Use the `.chroot` suffix and place the file in `config/hooks/live/`.
- Interpreter: Always start with `#!/bin/sh` or `#!/bin/bash`.
- Safety: Include `set -e` to ensure the build fails if the hook encounters an error.
- Permissions: Ensure the script is executable (`chmod +x`). A minimal example follows this list.
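A minimal hook that follows these rules might look like this (the file name and its contents are purely illustrative, not an existing project hook):

```bash
#!/bin/sh
# config/hooks/live/06-example.chroot
set -e

echo "I: example hook running inside the chroot"

# Create a drop-in directory with standard permissions
install -d -m 0755 /etc/neuraldrive/example.d
```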
Note: Hooks run as the root user inside the chroot. Be careful when modifying system files and always verify that the changes will be persistent in the final SquashFS image.
This chapter is for contributors and maintainers.
Package Lists
Package lists define which software is retrieved from APT repositories and installed into the NeuralDrive image.
Core List (neuraldrive.list.chroot)
This list contains the essential packages for the appliance:
- Base System: `systemd`, `udev`, `kmod`, `ca-certificates`.
- Networking: `caddy`, `avahi-daemon`, `nftables`, `curl`, `wget`.
- Python Stack: `python3`, `python3-venv`, `python3-pip`.
- Utilities: `vim`, `htop`, `pciutils`, `usbutils`, `p7zip-full`.
GPU-Specific Lists
To support different hardware configurations, we use specialized package lists:
NVIDIA (nvidia.list.chroot)
- `nvidia-driver`: The core proprietary driver.
- `nvidia-smi`: System management interface for monitoring.
- `nvidia-cuda-toolkit`: Required for compute tasks.
- `libnvidia-encode1`: For video encoding/decoding if needed by secondary apps.
AMD (ROCm)
Packages for ROCm support are typically pulled from the official Radeon repositories defined in the archives/ directory. These include rocm-hip-sdk and amdgpu-dkms.
Intel (OneAPI)
Similar to AMD, Intel packages like intel-oneapi-runtime-libs and intel-opencl-icd are sourced from the Intel OneAPI repository.
How live-build Handles Lists
During the chroot stage, live-build reads every file with the .list.chroot extension and passes the package names to apt-get install.
Dependencies
live-build handles dependency resolution automatically. However, to keep the image size small, we explicitly use --no-install-recommends in the global build configuration.
Customizing Package Lists
If you need to add a package to your custom build:
- Create a new file in `config/package-lists/` (e.g., `custom.list.chroot`), as shown below.
- Add the package names, one per line.
- Run a new build.
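For example (the package names are illustrative):

```bash
# Create a custom package list and rebuild
cat > config/package-lists/custom.list.chroot <<'EOF'
tmux
iotop
EOF

sudo ./build.sh
```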
Tip: For temporary testing, you can add packages to `neuraldrive.list.chroot`, but it is better to keep custom additions in a separate file for maintainability.
This chapter is for contributors and maintainers.
Archive Sources
NeuralDrive supplements the standard Debian repositories with third-party archives to provide the latest GPU drivers and specialized software. These are managed via the config/archives/ directory.
Repository Configuration
Each third-party archive requires two files:
- `.list` file: Defines the repository URL and components (e.g., `deb https://repo.radeon.com/rocm/apt/latest focal main`).
- `.key` file: The GPG public key used to verify the packages in the repository.
Currently Configured Archives
Debian Backports
Used to pull newer versions of certain packages (like the Linux kernel) while remaining on the Stable (Bookworm) base.
NVIDIA Repository
Provides the latest proprietary drivers and CUDA toolkit directly from NVIDIA.
ROCm (AMD)
Provides the Radeon Open Compute stack. We pin this to specific versions to ensure compatibility with Ollama's build requirements.
Intel OneAPI
Provides the necessary libraries for Intel Arc and Data Center GPUs.
Repository Pinning
To prevent third-party repositories from accidentally upgrading core Debian packages, we use APT pinning. This is configured in config/archives/*.pref.chroot files.
Example pin for the NVIDIA repository:
Package: *
Pin: origin developer.download.nvidia.com
Pin-Priority: 600
Adding a New Archive
To add a new repository:
- Download the GPG key and place it in `config/archives/repo-name.key`.
- Create a list file at `config/archives/repo-name.list.chroot`.
- (Optional) Create a preferences file at `config/archives/repo-name.pref.chroot` if pinning is required. A command-line sketch of these steps follows.
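A quick sketch of those steps on the command line (the URL, suite, origin, and priority are placeholders for the real repository details):

```bash
# 1. Fetch and store the signing key
curl -fsSL https://repo.example.com/archive-key.asc -o config/archives/repo-name.key

# 2. Define the repository
echo "deb https://repo.example.com/debian bookworm main" \
    > config/archives/repo-name.list.chroot

# 3. (Optional) Pin the repository below the default Debian priority
cat > config/archives/repo-name.pref.chroot <<'EOF'
Package: *
Pin: origin repo.example.com
Pin-Priority: 400
EOF
```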
Warning: Be cautious when adding third-party archives. Every new source increases the risk of package conflicts and can significantly increase the size of the final ISO image. Always verify the authenticity of GPG keys before adding them to the project.
This chapter is for contributors and maintainers.
Docker Build Environment
The Docker-based build environment provides a consistent, isolated workspace for generating NeuralDrive images regardless of the host operating system.
Dockerfile Walkthrough
The Dockerfile in the project root defines the builder image. It is based on debian:bookworm to match the target OS.
Key components of the Dockerfile:
- Base Layer: Installs `live-build`, `debootstrap`, and other core utilities.
- Workdir: Sets `/build` as the working directory.
- Volume: Declares `/output` as a destination for the finished ISO.
- Entrypoint: A script that runs `lb clean`, `lb config`, and `lb build` in sequence.
Docker Compose Configuration
The docker-compose.yml simplifies the process of launching the builder with the correct permissions and mounts.
services:
  builder:
    build: .
    privileged: true
    volumes:
      - .:/build
      - ./output:/output
    environment:
      - BUILD_VARIANT=full
Privileged Mode
The privileged: true flag is mandatory. live-build uses chroot, mount, and mknod, all of which require elevated privileges. Additionally, generating SquashFS and ISO images requires access to the host's loop devices.
Building with Docker
To start a build:
docker compose run --rm builder
The finished ISO will appear in the ./output/ directory on your host machine.
Benefits and Limitations
Benefits
- No Host Contamination: Build dependencies are not installed on your primary OS.
- Cross-Platform: Build from macOS or Windows (using Docker Desktop).
- CI Readiness: The same Docker image used for local development is used in GitHub Actions.
Limitations
- Performance: Building inside a container can be slightly slower due to I/O overhead on non-Linux hosts.
- Loop Device Contention: If multiple builds are run simultaneously on the same host, they may compete for the same loop devices, leading to failures.
Tip: If you encounter "Permission Denied" errors when accessing the `./output/` directory on Linux, ensure that your user has permission to write to that folder, as files created by the root user inside the container may have restricted permissions on the host.
This chapter is for contributors and maintainers.
CI/CD Pipeline
NeuralDrive uses GitHub Actions to automate the testing, building, and distribution of system images.
Workflow Structure
The primary workflow is defined in .github/workflows/build.yml. It consists of several jobs that run in sequence.
1. Lint and Test
- Runs `shellcheck` on all scripts in `config/hooks/` and `scripts/`.
- Runs `ruff` on the Python codebase.
- Executes the `pytest` suite for the System API.
- This job runs on every push and pull request.
2. Build Matrix
When a change is merged into main or a tag is created, the build job is triggered. It uses a matrix to build multiple variants of NeuralDrive simultaneously:
- Full: Includes drivers for NVIDIA, AMD, and Intel.
- NVIDIA-Only: Optimized for NVIDIA hardware.
- Minimal: CPU-only, intended for testing and low-power hardware.
3. Artifact Publishing
Finished ISO images are uploaded as GitHub Action artifacts. For tagged releases, the workflow also:
- Generates SHA256 checksums.
- Signs the checksums using the project's GPG key.
- Creates a new GitHub Release and uploads the ISOs and signatures.
Configuration (neuraldrive-build.yaml)
The CI pipeline can be configured via a neuraldrive-build.yaml file in the repository root. This allows maintainers to:
- Toggle specific build variants.
- Define which models are pre-loaded into the images.
- Set custom version strings for nightlies.
Runner Requirements
Building ISO images requires a Linux runner with support for nested virtualization or privileged containers. We use large GitHub-hosted runners to ensure there is enough disk space and CPU power to complete builds within the 60-minute timeout.
Automated Testing in CI
In addition to static analysis, the CI pipeline attempts a "Dry Run" build:
- It runs `lb config` to verify the configuration is valid.
- It performs the bootstrap stage to ensure the Debian repositories are accessible.
- Full binary builds are only performed on the `main` branch to conserve resources.
Note: Because CI runners do not have physical GPUs, we cannot perform full GPU validation in the cloud. These tests remain a manual requirement for the release checklist.
This chapter is for contributors and maintainers.
GPU Auto-Detection
The gpu-detect.sh script is a critical component of the NeuralDrive boot sequence. It is responsible for identifying the installed hardware and ensuring the correct compute stack is initialized.
Logic Overview
The script runs during the neuraldrive-gpu-detect.service phase. It performs the following steps:
- PCI Enumeration: Uses `lspci` to scan for VGA and 3D controllers.
- Vendor Identification: Matches the PCI IDs against known vendor strings (NVIDIA, AMD, Intel).
- Module Loading: Calls `modprobe` to load the appropriate kernel modules (e.g., `nvidia`, `amdgpu`, or `i915`).
- Configuration Generation: Writes the detected state to `/run/neuraldrive/gpu.conf`. A simplified sketch of this flow follows.
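The sketch below captures the shape of the logic; the real script adds error handling, Secure Boot checks, and the NVIDIA-specific steps described in the next sections.

```bash
#!/bin/bash
set -euo pipefail

# Identify the first VGA/3D controller and extract its PCI vendor ID
gpu_line=$(lspci -nn | grep -Ei 'vga|3d controller' | head -n 1 || true)

case "$gpu_line" in
    *10de*) vendor=NVIDIA; modprobe -a nvidia nvidia-current-uvm nvidia-drm ;;
    *1002*) vendor=AMD;    modprobe amdgpu ;;
    *8086*) vendor=INTEL;  modprobe i915 ;;
    *)      vendor=CPU ;;
esac

# Publish the result for later services
mkdir -p /run/neuraldrive
echo "VENDOR=${vendor}" > /run/neuraldrive/gpu.conf
```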
Vendor Detection Details
NVIDIA
If an NVIDIA card is detected (PCI vendor ID 10de), the script:
- Loads the `nvidia`, `nvidia-current-uvm`, and `nvidia-drm` modules via `modprobe`. Note that on Debian systems, the CUDA Unified Virtual Memory module is named `nvidia-current-uvm`, not `nvidia-uvm`.
- Executes `nvidia-modprobe -u` to create the `/dev/nvidia-uvm` and `/dev/nvidia-uvm-tools` device nodes. Without these nodes, CUDA memory allocation fails silently, and Ollama falls back to CPU.
- Enables persistence mode with `nvidia-smi -pm 1`.
- Sets `VENDOR=NVIDIA` in the config file.
- If module loading fails, records `NVIDIA_MODULE_MISSING=true`.
Boot-Time Module Loading
In addition to the detection script, the system includes /etc/modules-load.d/nvidia-uvm.conf. This file contains nvidia-current-uvm to ensure the module is automatically loaded at boot.
Ollama Service Integration
As a safety net, the Ollama systemd unit also includes ExecStartPre commands for both modprobe nvidia-current-uvm and nvidia-modprobe -u. This ensures the necessary drivers and device nodes are present even if the primary detection service is delayed.
cgroup v2 and Device Access
On systems using cgroup v2, standard DeviceAllow rules in systemd units utilize eBPF filters that can inadvertently block CUDA access, even when explicit allow rules are defined. NeuralDrive avoids this by removing all DeviceAllow directives from the Ollama service and relying on PrivateDevices=no instead.
AMD
If an AMD card is detected (PCI vendor ID 1002), the script:
- Loads the `amdgpu` module.
- Sets `VENDOR=AMD`.
- If module loading fails, records `AMD_MODULE_MISSING=true`.
Intel
If an Intel GPU is detected (PCI vendor ID 8086), the script:
- Loads the `i915` module.
- Sets `VENDOR=INTEL`.
- If module loading fails, records `INTEL_MODULE_MISSING=true`.
The gpu.conf File
The output of the detection process is stored in a runtime environment file:
# /run/neuraldrive/gpu.conf
VENDOR=NVIDIA
Additional keys may be present for error conditions (e.g., NVIDIA_MODULE_MISSING=true) or Secure Boot detection (SECURE_BOOT=true). This file is available to subsequent services for determining the active compute provider.
Troubleshooting and Fallbacks
If no GPU is detected, or if module loading fails:
- The script sets `VENDOR=CPU`.
- A message is logged to standard output.
- Ollama will start in CPU-only mode, which is significantly slower but allows the appliance to remain functional.
Modifying Detection Logic
To add support for new hardware or refine the detection process, modify /usr/lib/neuraldrive/gpu-detect.sh in the repository.
Note: Changes to the detection script require a re-build of the ISO or a manual update to the file on the persistence layer for testing.
This chapter is for contributors and maintainers.
Ollama Integration
Ollama serves as the core inference engine for NeuralDrive. It is managed as a systemd service and configured to optimize resource usage on the appliance.
Installation
The Ollama binary is installed to /usr/local/bin/ollama during the build process via the 03-install-extras hook. We use the official static binary to ensure compatibility across different Debian versions.
Service Configuration
The neuraldrive-ollama.service manages the lifecycle of the inference engine.
Service Unit Highlights
- User: Runs as `neuraldrive-ollama` (UID 901).
- Dependencies: `Requires=neuraldrive-gpu-detect.service`.
- Security: The service uses `PrivateDevices=no` to allow GPU access. Note that all `DeviceAllow` directives were removed because cgroup v2's eBPF device filter blocked CUDA access even with explicit allow rules.
- Resource Limits:
  - `MemoryHigh=90%`: Triggers aggressive swapping/GC when system memory is nearly full.
  - `MemoryMax=95%`: The hard limit before the OOM killer intervenes.
- GPU Initialization: The unit includes `ExecStartPre` commands to ensure CUDA is ready:
  - `ExecStartPre=-/sbin/modprobe nvidia-current-uvm`: Loads the CUDA Unified Virtual Memory module (named `nvidia-current-uvm` in the Debian package).
  - `ExecStartPre=-/usr/bin/nvidia-modprobe -u`: Creates the `/dev/nvidia-uvm` and `/dev/nvidia-uvm-tools` device nodes.
Persistent Config Overrides
The service unit includes two EnvironmentFile directives to manage configuration:
- `EnvironmentFile=/etc/neuraldrive/ollama.conf`: Contains baked-in system defaults.
- `EnvironmentFile=-/var/lib/neuraldrive/config/ollama.conf`: Allows persistent user-defined overrides. The `-` prefix ensures the service starts even if this file is missing.
Configuration (ollama.conf)
System-wide settings are defined in the environment files:
- `OLLAMA_HOST=127.0.0.1:11434`: Ensures the API is only accessible locally (proxied by Caddy).
- `OLLAMA_MODELS=/var/lib/neuraldrive/models/`: Directs model weights to the persistence layer.
- `OLLAMA_KEEP_ALIVE=5m`: Models are unloaded from VRAM after 5 minutes of inactivity.
- `OLLAMA_MAX_LOADED_MODELS=0`: Set to `0` for auto mode. Ollama manages multiple models based on available VRAM using LRU (Least Recently Used) eviction.
- `OLLAMA_NUM_PARALLEL=1`: Processes one request at a time to maintain deterministic performance.
API Usage Details
Loading Models
To load a model, send a POST request to /api/generate with keep_alive set to -1. Note that keep_alive must be an integer; passing it as a string ("-1") will result in a rejection.
Unloading Models
To unload a model, send a POST request to /api/generate with keep_alive set to 0. To verify the eviction, poll /api/ps until the model no longer appears. A race condition exists where the 200 OK response may return before the eviction process is fully complete.
Monitoring
GET /api/ps returns a list of running models, including the size_vram utilized by each.
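The corresponding requests against the local Ollama socket look like this (the model name is a placeholder):

```bash
# Load a model and keep it resident indefinitely (keep_alive must be a JSON number)
curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.2", "keep_alive": -1}'

# Request an unload, then poll until the model disappears from /api/ps
curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.2", "keep_alive": 0}'
curl -s http://localhost:11434/api/ps
```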
GPU Support
Ollama automatically detects the compute provider based on the drivers loaded by gpu-detect.sh.
- NVIDIA: Uses the CUDA runner.
- AMD: Uses the ROCm/HIP runner.
- Intel: Uses the OneAPI runner.
- CPU: Falls back to the AVX/AVX2 optimized CPU runner.
Model Management
Models can be managed via the Open WebUI or the ollama CLI. When a model is "pulled," it is stored as a series of blobs in the persistent /var/lib/neuraldrive/models/ directory.
Tip: To interact with Ollama manually for troubleshooting, run the CLI as the service user from the `neuraldrive-admin` account: `sudo -u neuraldrive-ollama ollama list`
This chapter is for contributors and maintainers.
Open WebUI Integration
Open WebUI provides the primary user interface for NeuralDrive. It is a feature-rich chat environment that communicates with the Ollama backend.
Installation and Environment
Open WebUI is installed into a Python virtual environment at /usr/lib/neuraldrive/webui/venv/. This isolation prevents dependency conflicts with the system Python or the System API.
The service is managed by neuraldrive-webui.service and runs as the neuraldrive-webui user (UID 902).
Configuration (webui.env)
Key configuration parameters are stored in /etc/neuraldrive/webui.env:
- `OLLAMA_BASE_URL=http://localhost:11434`: The internal address of the Ollama service.
- `DATA_DIR=/var/lib/neuraldrive/webui`: Persistent location for the SQLite database and user uploads.
- `ENABLE_SIGNUP=false`: Disables public account creation for security.
- `WEBUI_AUTH=true`: Enforces login for all users.
- `WEBUI_NAME=NeuralDrive`: Customizes the branding of the interface.
Service Lifecycle
The WebUI service Wants=neuraldrive-ollama. This means systemd will attempt to start Ollama whenever the WebUI is started. However, the WebUI is capable of running even if Ollama is temporarily unavailable, showing a "Connection Error" in the settings.
Data Persistence
The /var/lib/neuraldrive/webui directory contains:
- `webui.db`: The SQLite database containing user accounts, chat history, and settings.
- `uploads/`: Documents uploaded for RAG (Retrieval-Augmented Generation).
- `cache/`: Temporary files and model templates.
Customization
To modify the default behavior of Open WebUI on NeuralDrive:
- Update the environment variables in `config/includes.chroot/etc/neuraldrive/webui.env`.
- For UI changes, the CSS or frontend assets can be modified in the source before building.
Note: Major updates to Open WebUI often require database migrations. These are handled automatically by the application on startup, but it is recommended to back up the `webui.db` file before performing a system upgrade.
This chapter is for contributors and maintainers.
Caddy Reverse Proxy
Caddy is the edge proxy and security gateway for the NeuralDrive appliance. It handles TLS termination, URL routing, and authentication for API endpoints.
The Caddyfile
The routing logic is defined in /etc/neuraldrive/Caddyfile. This file is loaded by the neuraldrive-caddy.service.
Key Routing Rules
NeuralDrive uses two separate server blocks — one for the web dashboard and one for the API gateway:
:443 {
    # TLS termination for the Web UI
    tls /etc/neuraldrive/tls/server.crt /etc/neuraldrive/tls/server.key

    # All requests proxy to Open WebUI
    reverse_proxy localhost:3000
}

:8443 {
    # TLS termination for the API gateway
    tls /etc/neuraldrive/tls/server.crt /etc/neuraldrive/tls/server.key

    # Ollama Inference API (/v1/* and /api/*)
    # Requires Bearer token matching NEURALDRIVE_API_KEY
    @api_authenticated {
        path /v1/* /api/*
        header Authorization "Bearer {env.NEURALDRIVE_API_KEY}"
    }
    handle @api_authenticated {
        reverse_proxy localhost:11434
    }

    # Unauthenticated API requests get 401
    @api_routes {
        path /v1/* /api/*
    }
    handle @api_routes {
        respond 401
    }

    # GPU Monitoring
    handle /monitor/* {
        reverse_proxy localhost:1312
    }

    # System Management API
    handle /system/* {
        reverse_proxy localhost:3001
    }

    # Health Check (public)
    handle /health {
        respond "OK" 200
    }
}
The dual-port architecture keeps the user-facing web UI separate from the machine-to-machine API gateway, allowing each to be managed independently.
Security Features
TLS Management
Caddy uses the certificates generated by neuraldrive-certs.service. By default, these are self-signed RSA 4096-bit certificates. Caddy is configured to only allow modern TLS protocols (1.2 and 1.3).
API Authentication
For requests to /v1/* and /api/*, Caddy can be configured to enforce Bearer token authentication. The valid API key is sourced from the NEURALDRIVE_API_KEY environment variable, which is populated from /etc/neuraldrive/caddy.env.
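As a quick illustration (the hostname, key value, model, and CA path are placeholders), an authenticated request to the API gateway might look like this:

import httpx

API_KEY = "nd-..."  # value of NEURALDRIVE_API_KEY (placeholder)
resp = httpx.post(
    "https://neuraldrive.local:8443/api/generate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"model": "tinyllama", "prompt": "Hello", "stream": False},
    verify="/path/to/neuraldrive-ca.crt",  # trust the appliance's self-signed CA
)
print(resp.status_code)  # 401 if the token does not match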
Capabilities
The neuraldrive-caddy.service uses AmbientCapabilities=CAP_NET_BIND_SERVICE. This allows the non-root neuraldrive-caddy user to bind to privileged ports like 443.
Environment Variables (caddy.env)
- NEURALDRIVE_API_KEY: The master 32-character key for the appliance.
- DOMAIN_NAME: (Optional) Used for ACME/Let's Encrypt integration if the user provides a public domain.
Customizing Routes
To add a new service to the NeuralDrive stack:
- Assign it a local port (e.g., 8080).
- Add a handle_path block to the Caddyfile.
- Re-build the image or restart the neuraldrive-caddy service.
Tip: Use caddy validate --config /etc/neuraldrive/Caddyfile to check for syntax errors before restarting the service.
This chapter is for contributors and maintainers.
System Management API
The NeuralDrive System API is a custom FastAPI application that provides programmatic control over the appliance's hardware and software configuration.
Application Structure
The source code is located at /usr/lib/neuraldrive/api/neuraldrive_api/. The application is consolidated in a single entry point:
main.py: Route definitions, token verification, and all endpoint logic.
Authentication
All endpoints (except /system/ca-cert) require a Bearer token. The API verifies this token against the master key stored in /etc/neuraldrive/api.key.
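A minimal sketch of how such a check could be expressed as a FastAPI dependency (the function name and example route are illustrative, not the actual implementation):

from pathlib import Path

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()

def verify_token(credentials: HTTPAuthorizationCredentials = Depends(bearer)) -> None:
    # Compare the presented token against the master key on disk
    master_key = Path("/etc/neuraldrive/api.key").read_text().strip()
    if credentials.credentials != master_key:
        raise HTTPException(status_code=401, detail="Invalid API key")

@app.get("/system/status", dependencies=[Depends(verify_token)])
def system_status() -> dict:
    return {"status": "ok"}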
Primary Endpoints
System Status
- GET /system/status: Returns CPU/RAM usage and system uptime.
- GET /system/gpu: Returns detailed GPU metrics (temp, VRAM, utilization).
Service Management
- GET /system/services: Lists the status of all NeuralDrive services.
- POST /system/services/{name}/{action}: Allows starting, stopping, or restarting services.
- GET /system/logs: Retrieves the last N lines of the system journal for a specific service.
Configuration
- POST /system/network/hostname: Updates the system hostname and mDNS name.
- GET /system/security: Returns the current firewall status and SSH settings.
- POST /system/api-keys/rotate: Generates a new master API key.
systemd Integration
The API interacts with systemd via the systemctl CLI or the dbus Python bindings. It is limited to a whitelist of NeuralDrive-specific services to prevent unauthorized modification of core OS components.
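To illustrate the whitelist idea (the set contents and helper name are assumptions, not the shipped code), a guarded systemctl call might look like:

import subprocess

from fastapi import HTTPException

# Only NeuralDrive's own units may be managed through the API
ALLOWED_SERVICES = {
    "neuraldrive-ollama",
    "neuraldrive-webui",
    "neuraldrive-caddy",
    "neuraldrive-gpu-monitor",
}

def control_service(name: str, action: str) -> None:
    if name not in ALLOWED_SERVICES or action not in {"start", "stop", "restart"}:
        raise HTTPException(status_code=400, detail="Service or action not permitted")
    subprocess.run(["systemctl", action, f"{name}.service"], check=True)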
Development and Testing
The API can be run locally for development:
# With venv active
uvicorn main:app --host 0.0.0.0 --port 3001
Testing is handled via pytest in the tests/test_api.py file. These tests mock the system calls to ensure the API logic is correct without needing a full NeuralDrive environment.
Warning: The System API runs as a privileged user (neuraldrive-api) with specific sudo permissions to manage services. Never expose port 3001 directly to the internet; always route traffic through the Caddy proxy.
This chapter is for contributors and maintainers.
Terminal User Interface (TUI)
The NeuralDrive TUI provides a local management console for administrators. It is designed to be usable directly from a physical keyboard and monitor without requiring a network connection.
Technology Stack
The TUI is built using the Textual framework, a modern Python library for building sophisticated terminal applications. It uses async I/O to maintain a responsive interface even while performing long-running system tasks.
Interface Structure
The TUI is divided into several screens:
Dashboard
The default screen showing:
- System hostname and version.
- Current IP addresses (IPv4 and IPv6).
- mDNS address (neuraldrive.local).
- CPU, Memory, and Disk usage gauges.
- GPU status overview.
The dashboard supports manual refresh via the R key and displays a live clock.
Models
Lists all LLM models currently stored in the persistence layer. Shows model name and metadata columns (params, quantization, disk size, VRAM usage, and status). Users can Load, Unload, or Delete models. This screen refreshes automatically on user action.
Services
Provides a list of all NeuralDrive systemd units with their current status (active, inactive, failed). Users can select a service to view its recent logs or trigger a restart. This screen auto-polls every 5 seconds.
Logs
System-wide log viewer for NeuralDrive services and kernel messages.
Chat
A lightweight chat interface allowing users to test models locally. It includes a model selector dropdown and supports streaming responses via @work(exclusive=True). Model selection persists across screen switches.
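The sketch below shows one way the streaming pattern could be wired up with Textual's worker API and httpx; the widget IDs, model name, and minimal layout are illustrative assumptions rather than the shipped screen.

import json

import httpx
from textual import work
from textual.app import App, ComposeResult
from textual.widgets import Input, RichLog

class ChatSketch(App):
    """Minimal chat loop: submit a prompt, stream tokens into a log widget."""

    def compose(self) -> ComposeResult:
        yield RichLog(id="chat-log", wrap=True)
        yield Input(placeholder="Ask the model...")

    def on_input_submitted(self, event: Input.Submitted) -> None:
        self.stream_reply(event.value)

    @work(exclusive=True)  # a new prompt cancels any in-flight generation
    async def stream_reply(self, prompt: str) -> None:
        log = self.query_one("#chat-log", RichLog)
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream(
                "POST",
                "http://localhost:11434/api/generate",
                json={"model": "tinyllama", "prompt": prompt, "stream": True},
            ) as response:
                async for line in response.aiter_lines():
                    if line:
                        # Append each chunk as it arrives (a real screen would join them)
                        log.write(json.loads(line).get("response", ""))

if __name__ == "__main__":
    ChatSketch().run()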
Hotkeys
- F1: Dashboard
- F2: Models
- F3: Services
- F4: Logs
- F5: Chat
- Q: Quit
Navigation Model
The TUI uses a zone-based focus system.
- Tab / Shift+Tab: Cycle focus between different zones within a screen.
- Arrow Keys: Navigate within the currently focused zone.
- Enter: Activate the selected item or button.
Custom Widgets
Several custom composite widgets are used to build the interface:
- SafeHeader: A subclass of Textual's Header that catches NoMatches exceptions during _on_mount, working around Textual bug #4258.
- ServiceItem: Displays service name, status label, and control buttons (Start, Stop, Restart).
- ModelItem: Displays model name, metadata, and action buttons (Load, Unload, Delete).
Crash Dump Logging
The TUI overrides App._handle_exception to write crash dumps to /var/lib/neuraldrive/logs/tui-crash-*.log with a full traceback. The entire main() function is also wrapped in a try/except block to catch crashes occurring outside the Textual event loop. Screenshots are saved to /var/lib/neuraldrive/screenshots/.
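A minimal sketch of the outer crash-dump wrapper described above (the import path is hypothetical; the log directory and filename pattern follow the description):

import sys
import traceback
from datetime import datetime
from pathlib import Path

CRASH_DIR = Path("/var/lib/neuraldrive/logs")

def main() -> None:
    from neuraldrive_tui.main import NeuralDriveTUI  # hypothetical import path
    NeuralDriveTUI().run()

if __name__ == "__main__":
    try:
        main()
    except Exception:
        # Catch failures outside the Textual event loop and write a crash dump
        CRASH_DIR.mkdir(parents=True, exist_ok=True)
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        (CRASH_DIR / f"tui-crash-{stamp}.log").write_text(traceback.format_exc())
        sys.exit(1)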
CLI Flags
--wizard: Removes the sentinel file (/etc/neuraldrive/first-boot-complete) and forces the first-boot wizard to re-run on the next launch.
Command Palette
The Textual command palette is explicitly disabled (ENABLE_COMMAND_PALETTE = False).
Auto-Login and Startup
The TUI is launched automatically on TTY1 via a getty@tty1 service override created by the 02-setup-autologin.chroot build hook. This override configures autologin for the neuraldrive-admin user, and a .bashrc snippet detects TTY1 and runs /usr/local/bin/neuraldrive-tui — a launcher script that activates the Python virtual environment and starts the application.
Code Location
The source code for the TUI is located at /usr/lib/neuraldrive/tui/.
- main.py: The main NeuralDriveTUI application class and screen orchestration.
- styles.tcss: Textual CSS stylesheet for the interface.
- widgets/: Custom UI components (gauges, log viewers).
- screens/: Individual screen definitions (dashboard, models, services, network, logs, chat, wizard).
Refresh Intervals
- Dashboard: Manual refresh (R key) with live clock.
- Services: Auto-polls every 5 seconds.
- Models: Refreshes on user action.
- System Metrics: Refreshed every 2 seconds.
Modifying the TUI
To add a new screen or widget:
- Define the component in the widgets/ or screens/ directory.
- Register the new screen in main.py.
- Test locally by running python main.py (ensure you have the textual library installed in your venv).
Tip: Use the textual console tool during development to see live debug output and CSS reload notifications.
This chapter is for contributors and maintainers.
First-Boot Wizard
The First-Boot Wizard is a specialized mode of the TUI that guides the user through the initial configuration of the appliance.
Execution Trigger
The wizard is not a separate service. It is an integrated component of the TUI application (main.py). Upon startup, the TUI checks for the existence of the sentinel file /etc/neuraldrive/first-boot-complete. If this file is missing, the TUI presents the wizard interface before allowing access to the main dashboard.
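A minimal sketch of that startup check (the helper name and return values are illustrative):

from pathlib import Path

SENTINEL = Path("/etc/neuraldrive/first-boot-complete")

def initial_screen() -> str:
    # If the sentinel is missing, the wizard runs before the dashboard is shown
    return "dashboard" if SENTINEL.exists() else "wizard"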
Wizard Flow
The wizard consists of the following steps:
- Welcome: Introduction and hardware verification.
- Storage/Persistence: Detects the boot device, creates the persistence partition, and initializes the directory structure:
  - /var/lib/neuraldrive/ollama
  - /var/lib/neuraldrive/models
  - /var/lib/neuraldrive/config
  - /var/lib/neuraldrive/webui
  - /var/lib/neuraldrive/logs
- Security: Prompts for the neuraldrive-admin password and generates initial credentials.
- Network: Configuration of Ethernet or Wi-Fi.
- Models: Selection of initial models for download.
- Done: Finalizes configuration and generates the sentinel file.
Credential Generation
- Admin Password: The user is prompted to set the password for the neuraldrive-admin account.
- API Key: The system automatically generates a 32-character random string, prefixed with nd-. This key is displayed once and then stored in the persistence layer (a generation sketch follows below).
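A minimal sketch of how such a key could be generated with Python's standard library (illustrative only; the shipped generator may differ in length handling and character set):

import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def generate_api_key(length: int = 32) -> str:
    # 32 random characters, prefixed with "nd-" as described above
    return "nd-" + "".join(secrets.choice(ALPHABET) for _ in range(length))

print(generate_api_key())  # e.g. nd-XXXXXXXX...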
Sentinel File
Completion of the wizard creates the sentinel file at /etc/neuraldrive/first-boot-complete. This ensures that subsequent reboots bypass the wizard and proceed directly to the standard dashboard.
CLI Re-run
To re-run the wizard on a configured system, use the following command:
neuraldrive-tui --wizard
This command removes the sentinel file, forcing the wizard to launch on the next application start.
Customizing the Wizard
The wizard logic is integrated into the TUI application. To add a new step:
- Create a new Screen class in the screens/ directory.
- Add the screen to the wizard orchestration loop in main.py.
Note: For development, you can re-trigger the wizard by using the --wizard flag. Warning: This may overwrite existing credentials and configuration.
This chapter is for contributors and maintainers.
Certificate Generation
NeuralDrive includes an automated system for managing TLS certificates, ensuring that all network communication is encrypted from the moment the appliance first boots.
The generate-certs.sh Script
The generate-certs.sh script is located at /usr/lib/neuraldrive/generate-certs.sh. It is executed by the neuraldrive-certs.service.
Certificate Parameters
The script uses openssl to generate a self-signed Root CA and a Server Certificate with the following parameters:
- Algorithm: RSA 4096-bit.
- Digest: SHA-256.
- Validity: 365 days.
- Subject Alternative Names (SAN):
  - DNS:neuraldrive.local
  - DNS:<hostname>.local
  - IP:<eth0_ip>
  - IP:127.0.0.1
Certificate Storage
All certificate material is stored in the persistent directory /etc/neuraldrive/tls/:
- neuraldrive-ca.crt: The public Root CA certificate. Users should install this on their client machines to trust the appliance.
- server.crt: The certificate presented by Caddy to clients.
- server.key: The private key for the server certificate (permission 0600).
- ca.key: The private key for the Root CA (permission 0600).
Persistence and Regeneration
The certificates are generated once during the first-boot process. Because they are stored on the persistence partition, they survive system updates.
Regeneration Triggers
The neuraldrive-certs.service uses an ExecCondition that checks for the existence of /etc/neuraldrive/tls/server.crt. If the file is present, the service exits without action. A new certificate is generated only if:
- The server certificate file has been manually deleted.
- The system is performing its first boot and no certificates exist yet.
Exporting the CA
To allow client browsers to connect without security warnings, the neuraldrive-ca.crt can be downloaded via the System API at GET /system/ca-cert.
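For example (the hostname and output path are placeholders), the CA can be fetched and saved with a few lines of Python; certificate verification is disabled here only because the CA is not yet trusted:

import httpx

# /system/ca-cert is the one unauthenticated System API endpoint
resp = httpx.get("https://neuraldrive.local:8443/system/ca-cert", verify=False)
resp.raise_for_status()

with open("neuraldrive-ca.crt", "wb") as fh:
    fh.write(resp.content)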
Warning: Never share or export the .key files. If the private keys are compromised, the security of the appliance's network communication is invalidated.
This chapter is for contributors and maintainers.
Test Strategy
NeuralDrive uses a multi-layered testing strategy to ensure that the appliance is stable across a wide variety of hardware and software configurations.
Test Philosophy
We prioritize integration testing over unit testing. Since NeuralDrive is a system appliance, its stability depends on the interaction between the kernel, drivers, systemd, and the application stack.
Key Principles
- Reproducibility: Tests should yield the same results given the same ISO image and virtualized environment.
- Automation: Wherever possible, tests should run in the CI pipeline without manual intervention.
- Hardware Diversity: While CI handles basic logic, manual "Target Hardware" testing is mandatory for every release.
Test Categories
1. Boot Testing
Ensures the ISO image is correctly formatted and can boot to a functional state. This is primarily handled via QEMU.
2. Hardware and GPU Testing
Validates that gpu-detect.sh correctly identifies hardware and that the appropriate compute stack (CUDA, ROCm, OneAPI) is loaded. This must be done on physical hardware.
3. API Testing
Verifies the endpoints of the System API and the Ollama inference API. These tests ensure that the core logic of the appliance is working as expected.
4. Security Auditing
Periodic checks of service isolation, systemd hardening, and firewall rules. This involves running automated security scanners against the running appliance.
The Test Life Cycle
- Local Development: Developers run unit tests and QEMU boot tests.
- Pull Request: CI runs linting, API tests, and build validation.
- Pre-Release: Maintainers perform full ISO builds and verify them on a variety of target hardware.
- Post-Release: Community feedback and bug reports are triaged and integrated back into the test suite.
Note: For detailed instructions on running specific tests, refer to the subsequent chapters in this section.
This chapter is for contributors and maintainers.
QEMU Boot Tests
QEMU is the primary tool for verifying that the generated ISO images boot correctly and that the initial system services are initialized.
The test-boot.sh Script
Located at scripts/test-boot.sh, this script automates the process of launching an ISO in a virtual machine.
Usage
./scripts/test-boot.sh build/neuraldrive-dev.iso
VM Configuration
The script configures QEMU with the following parameters:
- CPU: host (if available) or qemu64.
- Memory: 8GB.
- Boot Mode: UEFI (via OVMF firmware).
- Networking: User-mode networking with port forwarding:
  - 4443 -> 443 (WebUI)
  - 3001 -> 3001 (System API)
- Persistence: A virtual 20GB disk is created to simulate the persistence partition.
What is Validated?
A successful QEMU boot test confirms:
- GRUB Integrity: The bootloader loads and displays the menu.
- Kernel/Initrd: The system successfully transitions from the initramfs to the SquashFS root.
- systemd Startup: Core services reach the multi-user.target.
- TUI Initialization: The console on TTY1 displays either the dashboard or the setup wizard.
- Network Connectivity: The virtual machine receives an IP address and the forwarded ports respond to requests (a quick way to automate this check is sketched below).
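One way to script the last check (port numbers come from the forwarding table above; the retry loop and timeout values are illustrative) is to poll the forwarded System API port until the unauthenticated CA endpoint responds:

import time
import httpx

# Port 3001 is forwarded straight to the System API; /system/ca-cert needs no token
for _ in range(60):
    try:
        if httpx.get("http://localhost:3001/system/ca-cert", timeout=2).status_code == 200:
            print("System API is up")
            break
    except httpx.TransportError:
        pass
    time.sleep(5)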
Adding New Boot Tests
To test specific scenarios (like multiple disks or specific network configurations), you can pass additional flags to test-boot.sh.
# Example: Testing with a simulated secondary disk
./scripts/test-boot.sh --extra-drive /path/to/disk.img build/neuraldrive-dev.iso
Tip: Use the -nographic flag if you are testing on a headless server. You can then connect to the TUI via a virtual serial console or SSH if enabled.
This chapter is for contributors and maintainers.
GPU Testing
GPU testing is the most critical part of the NeuralDrive validation process. Because the appliance's value depends on its ability to utilize hardware acceleration, every release must be verified on real hardware.
On-Target Validation
The test-gpu.sh script is included in the ISO image at /usr/lib/neuraldrive/test-gpu.sh.
1. Detection Verification
The first step is verifying that gpu-detect.sh has identified the hardware correctly.
cat /run/neuraldrive/gpu.conf
Check that NEURALDRIVE_GPU_VENDOR matches your hardware and that the appropriate kernel modules are loaded (lsmod | grep -E "nvidia|amdgpu|i915").
2. Compute Stack Functional Test
Run a simple inference task to ensure the compute provider (CUDA/ROCm/OneAPI) is functional.
# Verify Ollama can see the GPU
ollama list
# Run a small model
ollama run tinyllama "Hello, what is your name?"
3. VRAM and Performance
Use vendor-specific tools to monitor VRAM usage during inference:
- NVIDIA: nvidia-smi
- AMD: rocm-smi
- Intel: intel_gpu_top
Verify that the model weights are fully loaded into VRAM and that the inference speed (tokens per second) is within the expected range for the hardware.
Hot Dashboard Testing
NeuralDrive includes a dedicated GPU monitoring service (neuraldrive-gpu-monitor.service). Access the dashboard at https://<ip>/monitor/ to verify:
- Real-time temperature reporting.
- Power consumption metrics.
- Multi-GPU visibility (if applicable).
Testing Matrix
The maintainers keep a spreadsheet of verified hardware configurations. Before a major release, tests are performed on:
- NVIDIA GeForce (Consumer)
- NVIDIA RTX/A-Series (Professional)
- AMD Radeon RX (Consumer)
- Intel Arc (Consumer)
Note: If you have access to hardware not currently in our test matrix, please run test-gpu.sh and share the results on GitHub.
This chapter is for contributors and maintainers.
API Tests
NeuralDrive uses the pytest framework to test the System API and ensure that the backend logic remains correct through code changes.
Test Environment
API tests are located in tests/test_api.py. They use FastAPI's TestClient to simulate HTTP requests without needing a running server.
Mocking System Calls
Since many API endpoints interact with the underlying OS (e.g., restarting services, reading logs), the tests use the unittest.mock library to simulate these interactions. This allows the tests to run in a non-Debian environment (like a macOS development machine or a standard CI runner).
Running the Tests
To run the API test suite locally:
# Ensure your dev venv is active
pip install pytest httpx
pytest tests/test_api.py
Coverage Areas
The test suite covers:
1. Authentication
- Verifying that requests without a Bearer token are rejected.
- Verifying that incorrect tokens are rejected.
- Verifying that valid tokens allow access.
2. Service Management
- Mocking systemctl calls to verify that the API correctly handles service start/stop/restart commands.
- Verifying that the API correctly parses service status output.
3. Log Retrieval
- Testing the logic that reads and truncates system journals.
- Ensuring that the API correctly handles cases where a service does not exist or has no logs.
4. Configuration Changes
- Verifying that network configuration changes are correctly written to the internal config files.
- Testing the API key rotation logic.
Adding New Tests
When adding a new endpoint to the System API:
- Create a corresponding test function in test_api.py.
- Mock any new system calls or filesystem interactions.
- Assert that the response status code and body match the expected output (a sketch follows below).
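A minimal sketch of what such a test might look like (the import path, endpoint, and patch target are illustrative assumptions; in practice the token check would also be stubbed, for example via dependency overrides):

from unittest.mock import patch

from fastapi.testclient import TestClient

from neuraldrive_api.main import app  # hypothetical import path

client = TestClient(app)
HEADERS = {"Authorization": "Bearer test-key"}

def test_restart_service_calls_systemctl():
    # Patch the subprocess layer so no real systemctl call is made
    with patch("neuraldrive_api.main.subprocess.run") as run:
        resp = client.post("/system/services/neuraldrive-ollama/restart", headers=HEADERS)
    assert resp.status_code == 200
    run.assert_called_once()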
Tip: Use pytest -v for verbose output and pytest --cov to check the test coverage of the API source code.
This chapter is for contributors and maintainers.
Hardware Compatibility Testing
Hardware Compatibility Testing (HCT) is the process of verifying that NeuralDrive runs reliably on various physical machine configurations.
The HCT Process
HCT is performed manually by contributors and community members. It focuses on the areas where virtualization (QEMU) cannot provide accurate results.
1. Boot Compatibility
- UEFI vs BIOS: Testing boot success on both modern UEFI and legacy BIOS systems.
- Secure Boot: Verifying if the image boots with Secure Boot enabled (requires signed kernels).
- USB Controller Compatibility: Ensuring the live system can boot from USB 2.0, 3.0, and 3.1 ports.
2. Network Stability
- Ethernet Chipsets: Testing common drivers (Intel, Realtek, Mellanox).
- Wi-Fi Support: Verifying that firmware for common Wi-Fi cards (Intel Wireless, Broadcom) is included and functional.
3. Storage Performance
- Persistence Latency: Measuring the performance impact of the OverlayFS layer on different types of media (USB stick vs. NVMe SSD).
- LUKS Performance: Ensuring that encrypted persistence does not significantly degrade model loading times.
Reporting Results
We use a "Hardware Compatibility List" (HCL) to track verified systems. When reporting a test result, include:
- Manufacturer and Model (e.g., Dell PowerEdge R740, Razer Blade 15).
- CPU and RAM.
- GPU Model and VRAM.
- NeuralDrive Version.
- Status (Verified, Issues Found, Not Working).
Community Testing Program
NeuralDrive encourages users to participate in the testing program by providing pre-release "Beta" ISOs. Feedback from these tests is used to refine the gpu-detect.sh script and include missing firmware in the base image.
Tip: If a system fails to boot, capturing the output of
journalctl -b(if reachable via SSH) or taking a photo of the console screen is essential for debugging.
This chapter is for contributors and maintainers.
Performance Benchmarking
Benchmarking allows us to track the performance of the NeuralDrive appliance over time and compare different hardware configurations.
Methodology
We focus on two primary metrics: Inference Speed and Resource Efficiency.
1. Inference Speed (Tokens per Second)
This is measured using the Ollama API. We use a standardized set of prompts and models (e.g., Llama 3 8B) to ensure consistency.
- Time to First Token (TTFT): The delay between sending a request and receiving the first character.
- Tokens per Second (TPS): The average generation speed once the model has started responding.
2. Resource Efficiency
- VRAM Utilization: How much of the available GPU memory is consumed by the model weights and the KV cache.
- System Memory Overhead: The RAM usage of the base OS, Caddy, WebUI, and the System API.
- Power Consumption: Measured via nvidia-smi or external power meters during peak inference.
Benchmarking Tools
Internal Benchmark Script
NeuralDrive includes a utility at /usr/lib/neuraldrive/benchmark.sh. It performs the following:
- Downloads a specific test model.
- Runs a series of 5 prompts.
- Calculates the average TPS and TTFT (one way to derive these is sketched below).
- Logs the results along with system metadata (CPU/GPU info).
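As a rough illustration of how those two numbers can be derived from Ollama's non-streaming /api/generate response (field names are the standard Ollama timing metrics; the model and prompt are placeholders, and TTFT is only approximated here):

import httpx

resp = httpx.post(
    "http://localhost:11434/api/generate",
    json={"model": "tinyllama", "prompt": "Explain OverlayFS in one sentence.", "stream": False},
    timeout=None,
).json()

# Durations are reported in nanoseconds
ttft_s = resp["prompt_eval_duration"] / 1e9              # rough proxy for time to first token
tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)  # generation speed
print(f"TTFT ~ {ttft_s:.2f}s, TPS ~ {tps:.1f}")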
External Tools
- Ollama-Benchmark: A community tool for stress-testing Ollama instances.
- Prometheus/Grafana: For long-term monitoring of performance metrics (available via the
neuraldrive-gpu-monitorservice).
Comparing Configurations
Benchmarks are used to evaluate:
- Quantization Levels: Comparing 4-bit (q4_0) vs 8-bit (q8_0) performance.
- Driver Versions: Detecting regressions in new NVIDIA or ROCm driver releases.
- Filesystem Impact: Comparing model loading times from SquashFS vs. persistence layers.
Note: Benchmark results are highly dependent on hardware. Always include the specific CPU and GPU models when sharing performance data.
This chapter is for contributors and maintainers.
Versioning
NeuralDrive follows a structured versioning scheme to ensure that users and developers can easily identify the age and feature set of a given image.
Calendar Versioning (CalVer)
We use a variation of Calendar Versioning (CalVer) for our releases. This reflects the project's nature as a collection of upstream components (Debian, Ollama, WebUI) that change frequently.
The format is: YYYY.MM.REVISION
- YYYY: The four-digit year of release.
- MM: The two-digit month of release.
- REVISION: The total number of releases ever made. This number never resets — it always increments, even across year/month boundaries.
Examples: 2026.04.1, 2026.05.2, 2027.01.53
The REVISION serves as a monotonically increasing release counter. Given any two NeuralDrive versions, the one with the higher REVISION is always newer, regardless of the date components.
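A tiny illustration of that ordering rule (the helper name is arbitrary):

def revision(version: str) -> int:
    # "YYYY.MM.REVISION" -> the monotonically increasing release counter
    return int(version.split(".")[2])

# 2027.01.53 is newer than 2026.05.2 because its REVISION is higher
assert revision("2027.01.53") > revision("2026.05.2")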
Version File
The primary source of truth for the system version is the file /etc/neuraldrive/version. This file is written during the build process (by build.sh) into config/includes.chroot/etc/neuraldrive/version and is used by the TUI, WebUI, and System API to display the version.
Git Tags
Git tags are the source of truth for determining REVISION numbers. Tags follow the format vYYYY.MM.REVISION (e.g., v2026.04.1).
Use scripts/tag-release.sh to create the next release tag:
./scripts/tag-release.sh --dry-run # preview
./scripts/tag-release.sh # create tag
git push origin v2026.04.1 # push it
The script counts all existing v* tags and sets REVISION to count + 1.
Development Builds
Commits on main that are not on an exact release tag produce dev versions labeled with the date and short git hash:
dev-2026.04.15-a1b2c3d
The build system resolves version automatically:
- NEURALDRIVE_VERSION env var (if set explicitly)
- Exact git tag on HEAD (stripped of the v prefix)
- Dev fallback: dev-YYYY.MM.DD-SHORTHASH
Component Versioning
While the appliance has its own version, the individual components are also tracked:
- Debian Base: Debian 12 (Bookworm).
- Ollama: Tracked via the binary version (e.g., 0.1.32).
- Open WebUI: Tracked via the git tag or pip version.
The System API provides an endpoint (GET /system/status) that returns the version string.
Note: Major architectural changes that break backward compatibility with older persistence partitions will be signaled by a "Breaking Change" notice in the release notes.
This chapter is for contributors and maintainers.
Release Checklist
The release checklist ensures that every version of NeuralDrive is thoroughly tested and meets our quality standards before being distributed to the public.
Pre-Build Phase
- Changelog: All changes since the last release are documented in CHANGELOG.md.
- Version: The etc/neuraldrive/version file is updated.
- Dependencies: Python requirements and system package lists are verified for compatibility.
- Documentation: Developer and User Guides reflect the latest features and architectural changes.
Build Phase
- Clean Build: lb clean --all is run before starting the production build.
- Variants: ISO images for all supported variants (Full, NVIDIA-Only, Minimal) are generated.
- Checksums: SHA256SUMS files are created for all artifacts.
Testing Phase
- QEMU Boot: All variants successfully boot to the TUI in a virtual environment.
- NVIDIA GPU: Verified functional on at least one GeForce and one professional (A-series) card.
- AMD GPU: Verified functional on at least one ROCm-compatible Radeon card.
- Intel GPU: Verified functional on an Arc GPU (if applicable for the release).
- Setup Wizard: The first-boot experience is tested from start to finish, including persistence encryption.
- WebUI & API: All primary routes respond correctly over HTTPS.
Distribution Phase
- Signing: The SHA256SUMS file is signed using the project's GPG key.
- GitHub Release: A new release is created with a detailed description and attached artifacts.
- Social: Announcements are posted to the project's Discord, Matrix, and Twitter channels.
Post-Release Phase
- Community Support: Monitor feedback and report any critical bugs.
- Bugfix Releases: If critical regressions are found, a .1 or .2 patch is prepared immediately.
Tip: This checklist is integrated into our GitHub PR template and must be completed by the maintainer before a merge to main.
This chapter is for contributors and maintainers.
ISO Signing
To ensure the integrity and authenticity of the NeuralDrive images, every official release is digitally signed using GPG.
The Signing Process
The project maintainers use a dedicated GPG key to sign the SHA256SUMS file associated with each release.
1. Generating Checksums
sha256sum neuraldrive-*.iso > SHA256SUMS
2. Signing the Checksum File
The maintainer signs the SHA256SUMS file with a detached signature:
gpg --detach-sign --armor SHA256SUMS
This generates a SHA256SUMS.asc file.
Verification for Users
Users can verify the integrity of their download by following these steps:
1. Import the Public Key
The public key is available on the GitHub repository and key servers.
gpg --import neuraldrive-public.key
2. Verify the Signature
gpg --verify SHA256SUMS.asc SHA256SUMS
This should output "Good signature from NeuralDrive (Release Key)".
3. Verify the ISO
sha256sum -c SHA256SUMS --ignore-missing
This should output "OK" for the downloaded ISO.
Secure Boot Signing
In addition to GPG signing for distribution, the boot chain inside the ISO must be trusted by the firmware for Secure Boot to work without manual key enrollment. NeuralDrive currently uses the standard Debian shim (signed with a Microsoft-trusted key) together with Debian's GRUB and kernel binaries, which are signed by Debian's official keys and trusted via the shim.
Warning: Never use an ISO image that fails the checksum verification or signature check. This protects against corrupted downloads and potentially malicious tampering.
This chapter is for contributors and maintainers.
Image Variants
NeuralDrive is distributed in several variants, each tailored for different hardware targets and use cases. This helps keep the ISO size manageable and ensures that the system is optimized for its intended GPU provider.
Full Variant (Recommended)
The Full variant includes the complete driver stack for all supported GPU vendors:
- NVIDIA (Proprietary)
- AMD (ROCm)
- Intel (OneAPI)
Characteristics
- Size: ~6-8GB.
- Hardware Support: Any compatible system with a modern GPU.
- Best For: General use, mixed hardware environments, or users who may swap GPUs between machines.
NVIDIA-Only Variant
The NVIDIA-Only variant is optimized for systems with NVIDIA hardware. It excludes the AMD and Intel compute libraries to reduce disk footprint.
Characteristics
- Size: ~4-5GB.
- Hardware Support: NVIDIA GeForce/RTX/A-Series GPUs only.
- Best For: Dedicated AI workstations and servers using NVIDIA hardware.
Minimal (CPU-Only) Variant
The Minimal variant excludes all proprietary GPU drivers and compute stacks. It is intended for testing, development, or low-power hardware.
Characteristics
- Size: ~1.5-2GB.
- Hardware Support: CPU-only (AVX/AVX2 optimized).
- Best For: Virtual machines, CI/CD testing, or systems where the GPU is not supported by Ollama.
Build Comparison
| Feature | Full | NVIDIA-Only | Minimal |
|---|---|---|---|
| Ollama | Yes | Yes | Yes |
| WebUI | Yes | Yes | Yes |
| NVIDIA Drivers | Yes | Yes | No |
| ROCm Libraries | Yes | No | No |
| Intel OneAPI | Yes | No | No |
| SquashFS Size | Large | Medium | Small |
Custom Variants
Developers can create their own variants by modifying the config/package-lists/ directory and adding a new BUILD_VARIANT flag to the build.sh script.
Tip: For custom enterprise deployments, we recommend starting with the NVIDIA-Only or Full variant and removing any unnecessary networking or utility packages to further reduce the attack surface.