Audience: Developers

API Endpoint Reference

NeuralDrive provides two primary interfaces for model inference: an OpenAI-compatible API for standard tool integration and the native Ollama API for low-level control.

Authentication

Every API request must include your API key (shown here as the placeholder nd-xxxx) as a Bearer token in the Authorization header:

Authorization: Bearer nd-xxxx
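
As an illustration of the header format above, the following minimal Python sketch attaches the key to a request object without sending it. The helper name build_request is hypothetical, the host neuraldrive.local is borrowed from the curl examples in this reference, and nd-xxxx remains a placeholder for a real key.

```python
import urllib.request

API_KEY = "nd-xxxx"  # placeholder from this reference; substitute your real key


def build_request(url: str, api_key: str = API_KEY) -> urllib.request.Request:
    """Return a GET request that carries the key as a Bearer token.

    build_request is an illustrative helper, not part of NeuralDrive.
    """
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


# Build (but do not send) a request for the model list endpoint.
req = build_request("https://neuraldrive.local:8443/v1/models")
print(req.get_header("Authorization"))
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here because it depends on the server's TLS certificate being trusted by the client.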

OpenAI-Compatible API

Base URL: https://<IP_ADDRESS>:8443/v1/

Method  Path                  Description
POST    /v1/chat/completions  Chat completions (supports streaming).
POST    /v1/completions       Text completions for non-chat models.
GET     /v1/models            Lists all available local models.
POST    /v1/embeddings        Generates vector embeddings for a given input.
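
The chat completions endpoint is listed as supporting streaming. In the OpenAI wire format, a streamed response arrives as server-sent events: each "data:" line holds a JSON chunk whose text fragment sits in choices[0].delta.content, and the stream ends with "data: [DONE]". The sketch below parses such lines, assuming NeuralDrive follows that convention unchanged; the function name iter_stream_text and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming lines.

    Assumes the standard "data: {...}" / "data: [DONE]" framing.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample lines, not captured from a real server.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Use "}}]}',
    'data: {"choices": [{"delta": {"content": "TLS."}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Use TLS."
```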

Chat Completion Example

curl https://neuraldrive.local:8443/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer nd-xxxx" \
  -d '{
    "model": "llama3:8b",
    "messages": [
      {"role": "user", "content": "How do I secure an API?"}
    ]
  }'
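
For callers who prefer Python to curl, the same request can be sketched with the standard library. The helper names chat_request, extract_reply, and ask are hypothetical, and the response shape (choices[0].message.content) is the usual OpenAI chat completion layout, assumed here rather than confirmed by this reference.

```python
import json
import urllib.request

BASE_URL = "https://neuraldrive.local:8443/v1"  # host taken from the curl example
API_KEY = "nd-xxxx"  # placeholder key


def chat_request(prompt: str, model: str = "llama3:8b") -> urllib.request.Request:
    """Build the POST request equivalent to the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def extract_reply(response: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response body."""
    return response["choices"][0]["message"]["content"]


def ask(prompt: str) -> str:
    """Send the request and return the reply (needs a reachable server)."""
    with urllib.request.urlopen(chat_request(prompt)) as resp:
        return extract_reply(json.load(resp))
```

Calling ask("How do I secure an API?") would perform the network round trip; the two smaller helpers can be exercised offline.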

Native Ollama API

Base URL: https://<IP_ADDRESS>:8443/api/

Method  Path           Description
POST    /api/generate  Low-level text generation.
POST    /api/chat      Native chat completion format.
GET     /api/tags      List locally installed model tags.
POST    /api/pull      Download a new model from the registry.
POST    /api/show      Retrieve detailed model metadata.
DELETE  /api/delete    Remove a local model.
POST    /api/copy      Create a copy or alias of a model.
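
To make two of these endpoints more concrete, the sketch below reads model names from a /api/tags response and builds a request body for /api/copy. The field names ("models", "name", "source", "destination") come from the upstream Ollama API and are assumed to pass through NeuralDrive unchanged; the helper names and the sample data are illustrative only.

```python
import json
from typing import List


def model_names(tags_response: dict) -> List[str]:
    """Return the sorted model names from a /api/tags response body."""
    return sorted(m["name"] for m in tags_response.get("models", []))


def copy_payload(source: str, destination: str) -> str:
    """Build the JSON body for /api/copy (upstream Ollama field names)."""
    return json.dumps({"source": source, "destination": destination})


# Trimmed, hand-written sample; real responses carry more fields per model.
sample_tags = {"models": [{"name": "mistral:7b"}, {"name": "llama3:8b"}]}
print(model_names(sample_tags))
print(copy_payload("llama3:8b", "my-llama"))
```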

Native Chat Example

curl https://neuraldrive.local:8443/api/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer nd-xxxx" \
  -d '{
    "model": "llama3:8b",
    "messages": [
      {"role": "user", "content": "Explain quantization."}
    ],
    "stream": false
  }'
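
With "stream": false the upstream Ollama API returns one JSON object whose reply sits in message.content. With streaming enabled it returns newline-delimited JSON, one fragment per line, with the last object marked "done": true. The sketch below handles both cases under the assumption that NeuralDrive forwards these responses unmodified; collect_chat and the sample lines are hypothetical.

```python
import json
from typing import Iterable


def collect_chat(lines: Iterable[str]) -> str:
    """Join message fragments from native chat response lines.

    Works for a single non-streamed object or a streamed NDJSON body.
    """
    parts = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines between objects
        obj = json.loads(line)
        parts.append(obj.get("message", {}).get("content", ""))
        if obj.get("done"):
            break  # final object reached
    return "".join(parts)


# Hand-written sample lines, not captured from a real server.
streamed = [
    '{"message": {"role": "assistant", "content": "Fewer "}, "done": false}',
    '{"message": {"role": "assistant", "content": "bits."}, "done": false}',
    '{"message": {"role": "assistant", "content": ""}, "done": true}',
]
print(collect_chat(streamed))  # prints "Fewer bits."
```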

Note: For information on how to manage the NeuralDrive system itself (logs, services, networking), see the System Management API reference.