This chapter is for contributors and maintainers.
GPU Auto-Detection
The gpu-detect.sh script is a critical component of the NeuralDrive boot sequence. It is responsible for identifying the installed hardware and ensuring the correct compute stack is initialized.
Logic Overview
The script runs during the neuraldrive-gpu-detect.service phase. It performs the following steps:
- PCI Enumeration: Uses lspci to scan for VGA and 3D controllers.
- Vendor Identification: Matches the PCI IDs against known vendor strings (NVIDIA, AMD, Intel).
- Module Loading: Calls modprobe to load the appropriate kernel modules (e.g., nvidia, amdgpu, or i915).
- Configuration Generation: Writes the detected state to /run/neuraldrive/gpu.conf (the overall flow is sketched below).
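The sketch below illustrates these four steps; it is not the shipped gpu-detect.sh. The lspci -n field positions and the single-GPU assumption are simplifications, and a system with both an integrated and a discrete GPU would need additional selection logic.

```
#!/bin/sh
# Illustrative sketch of the detection flow, not the shipped script.
mkdir -p /run/neuraldrive

# PCI enumeration: lspci -n prints "slot class: vendor:device";
# class 0300 = VGA controller, 0302 = 3D controller. Take the first match.
vendor_id=$(lspci -n | awk '$2 ~ /^030[02]:/ { split($3, id, ":"); print id[1]; exit }')

# Vendor identification by PCI vendor ID.
case "$vendor_id" in
    10de) VENDOR=NVIDIA ;;
    1002) VENDOR=AMD ;;
    8086) VENDOR=INTEL ;;
    *)    VENDOR=CPU ;;    # no supported GPU found
esac

# Configuration generation: record the detected vendor. Module loading is
# vendor-specific and is sketched in the sections below.
echo "VENDOR=$VENDOR" > /run/neuraldrive/gpu.conf
```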
Vendor Detection Details
NVIDIA
If an NVIDIA card is detected (PCI vendor ID 10de), the script:
- Loads the nvidia, nvidia-current-uvm, and nvidia-drm modules via modprobe. Note that on Debian systems, the CUDA Unified Virtual Memory module is named nvidia-current-uvm, not nvidia-uvm.
- Executes nvidia-modprobe -u to create the /dev/nvidia-uvm and /dev/nvidia-uvm-tools device nodes. Without these nodes, CUDA memory allocation fails silently, and Ollama falls back to CPU.
- Enables persistence mode with nvidia-smi -pm 1.
- Sets VENDOR=NVIDIA in the config file.
- If module loading fails, records NVIDIA_MODULE_MISSING=true (see the sketch below).
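A hedged sketch of this branch, using only the commands named above; the error handling in the shipped script may differ, and VENDOR=NVIDIA is written to gpu.conf as in the overview sketch:

```
# Sketch of the NVIDIA branch (PCI vendor ID 10de).
if modprobe nvidia && modprobe nvidia-current-uvm && modprobe nvidia-drm; then
    # Create /dev/nvidia-uvm and /dev/nvidia-uvm-tools; without these nodes
    # CUDA memory allocation fails silently and Ollama falls back to CPU.
    nvidia-modprobe -u
    # Keep the driver initialized between clients.
    nvidia-smi -pm 1
else
    # Recorded for diagnosis; the appliance then runs in CPU mode
    # (see Troubleshooting and Fallbacks).
    echo "NVIDIA_MODULE_MISSING=true" >> /run/neuraldrive/gpu.conf
fi
```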
Boot-Time Module Loading
In addition to the detection script, the system includes /etc/modules-load.d/nvidia-uvm.conf. This file contains nvidia-current-uvm to ensure the module is automatically loaded at boot.
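For reference, the file is a plain module list read by systemd-modules-load at boot, one module name per line; lines starting with # are ignored:

```
# /etc/modules-load.d/nvidia-uvm.conf
nvidia-current-uvm
```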
Ollama Service Integration
As a safety net, the Ollama systemd unit also includes ExecStartPre commands for both modprobe nvidia-current-uvm and nvidia-modprobe -u. This ensures the necessary drivers and device nodes are present even if the primary detection service is delayed.
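The snippet below shows what such a safety net could look like as a drop-in; the drop-in path and binary locations are illustrative (typical Debian locations), and the shipped unit may embed these lines directly. The leading "-" tells systemd not to fail the unit if a command exits non-zero, for example on a machine without an NVIDIA GPU.

```
# Illustrative drop-in, e.g. /etc/systemd/system/ollama.service.d/10-gpu.conf
[Service]
ExecStartPre=-/sbin/modprobe nvidia-current-uvm
ExecStartPre=-/usr/bin/nvidia-modprobe -u
```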
cgroup v2 and Device Access
On systems using cgroup v2, systemd enforces DeviceAllow rules through eBPF device filters, and these filters can inadvertently block CUDA device access even when explicit allow rules are defined. NeuralDrive avoids this by removing all DeviceAllow directives from the Ollama service and relying on PrivateDevices=no instead.
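To confirm that a target system is on the unified cgroup v2 hierarchy (and therefore subject to this behavior), a standard check is:

```
# Prints "cgroup2fs" on a cgroup v2 (unified) hierarchy.
stat -fc %T /sys/fs/cgroup
```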
AMD
If an AMD card is detected (PCI vendor ID 1002), the script:
- Loads the
amdgpumodule. - Sets
VENDOR=AMD. - If module loading fails, records
AMD_MODULE_MISSING=true.
Intel
If an Intel GPU is detected (PCI vendor ID 8086), the script:
- Loads the
i915module. - Sets
VENDOR=INTEL. - If module loading fails, records
INTEL_MODULE_MISSING=true.
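The AMD and Intel branches follow the same pattern; a combined sketch, assuming the enumeration from the overview sketch above (VENDOR=AMD or VENDOR=INTEL is written there):

```
# Sketch of the AMD (1002) and Intel (8086) branches.
case "$vendor_id" in
    1002)
        modprobe amdgpu || echo "AMD_MODULE_MISSING=true"   >> /run/neuraldrive/gpu.conf
        ;;
    8086)
        modprobe i915   || echo "INTEL_MODULE_MISSING=true" >> /run/neuraldrive/gpu.conf
        ;;
esac
```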
The gpu.conf File
The output of the detection process is stored in a runtime environment file:
```
# /run/neuraldrive/gpu.conf
VENDOR=NVIDIA
```
Additional keys may be present for error conditions (e.g., NVIDIA_MODULE_MISSING=true) or Secure Boot detection (SECURE_BOOT=true). This file is available to subsequent services for determining the active compute provider.
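Downstream consumers can read the file either as a systemd EnvironmentFile or by sourcing it from a shell script. A minimal illustration follows; the consuming unit and script are hypothetical:

```
# In a consuming systemd unit (the "-" makes a missing file non-fatal):
#   [Service]
#   EnvironmentFile=-/run/neuraldrive/gpu.conf
#
# Or from a shell script:
. /run/neuraldrive/gpu.conf
if [ "$VENDOR" = "NVIDIA" ]; then
    echo "CUDA compute stack selected"
fi
```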
Troubleshooting and Fallbacks
If no GPU is detected, or if module loading fails:
- The script sets VENDOR=CPU.
- A message is logged to standard output.
- Ollama will start in CPU-only mode, which is significantly slower but allows the appliance to remain functional.
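When diagnosing an unexpected CPU fallback, the detection service's journal, the generated configuration, and the loaded kernel modules are the first things to check:

```
journalctl -u neuraldrive-gpu-detect.service    # detection script output
cat /run/neuraldrive/gpu.conf                   # detected vendor and error keys
lsmod | grep -E 'nvidia|amdgpu|i915'            # confirm the kernel module loaded
```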
Modifying Detection Logic
To add support for new hardware or refine the detection process, modify /usr/lib/neuraldrive/gpu-detect.sh in the repository.
Note: Changes to the detection script require rebuilding the ISO, or manually updating the file on the persistence layer for testing.