Audience: Everyone
Glossary
This alphabetical list defines technical terms and concepts used throughout the NeuralDrive documentation.
- API Key: A unique authentication token (`nd-xxxx`) used to secure access to the inference and system management APIs.
- Avahi: A system service that facilitates service discovery on a local network via mDNS. It allows the `neuraldrive.local` hostname to resolve without a central DNS server.
- Caddy: A high-performance, memory-safe web server that serves as NeuralDrive's reverse proxy, managing TLS encryption and request routing.
- CUDA: NVIDIA's parallel computing platform and programming model that enables hardware acceleration on NVIDIA GPUs.
- GGUF: The primary file format used by NeuralDrive for storing and distributing quantized LLM weights. It is optimized for fast loading and efficient memory usage.
- Inference: The process of using a trained machine learning model to generate an output (e.g., text, images, or embeddings) based on input data.
- Live System: An operating system designed to boot and run entirely from removable media (like a USB drive) without requiring installation to a permanent hard disk.
- LUKS: Linux Unified Key Setup. The standard for Linux disk encryption, used by NeuralDrive to secure data on the persistence partition.
- mDNS: Multicast DNS. A protocol that resolves hostnames in small networks that do not have a dedicated local DNS server.
- Ollama: The underlying inference engine in NeuralDrive that manages downloading, loading, and serving large language models.
- Open WebUI: A feature-rich, self-hosted web interface that provides a user-friendly chat environment for interacting with local LLMs.
- Overlayfs: A union filesystem that allows NeuralDrive to layer a writable storage area (the persistence partition) over a read-only base system.
- Persistence: A dedicated writable partition on the NeuralDrive USB media that stores downloaded models, user accounts, and system configuration between reboots.
- Quantization: The process of reducing the precision of a model's weights (e.g., from 16-bit to 4-bit) to reduce its memory footprint and increase inference speed.
- RAG: Retrieval-Augmented Generation. A technique that combines LLM generation with external data retrieval to improve the accuracy and relevance of responses.
- ROCm: AMD's open-source software stack for GPU computing, enabling hardware acceleration on compatible AMD graphics cards.
- SquashFS: A highly compressed, read-only filesystem used for the base NeuralDrive operating system image.
- TUI: Terminal User Interface. The text-based management console that appears on the physical NeuralDrive device for initial setup and monitoring.
- VRAM: Video RAM. The high-speed memory dedicated to the GPU, which determines the maximum size of the model that can be hardware-accelerated.
- zram: A kernel feature that creates a compressed swap area in system RAM, increasing effective memory capacity for memory-intensive LLM tasks.
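The memory savings described in the Quantization entry above can be estimated with simple arithmetic. The sketch below is illustrative only: it computes raw weight storage for a hypothetical 7-billion-parameter model at a few bit widths, and ignores the metadata and per-layer overhead that real GGUF files add, so actual file sizes will differ.

```python
def model_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes, scaled to GiB."""
    return n_params * bits_per_weight / 8 / 2**30

# Illustrative 7B-parameter model at common precisions.
params = 7e9
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {model_size_gib(params, bits):.1f} GiB")
# 16-bit: 13.0 GiB
#  8-bit:  6.5 GiB
#  4-bit:  3.3 GiB
```

This is why a 4-bit quantized model can fit in the VRAM of a consumer GPU that could never hold the same model at full 16-bit precision.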