Ollama

Ollama is the engine powering the homelab’s local AI capabilities. It runs Large Language Models (LLMs) like Llama 3 or Mistral directly on the server, so prompts and responses stay on the local machine.

In this homelab, Ollama is integrated into the system for use by agents (like PicoClaw) and can be accessed via the dashboard using any compatible web client (like Page Assist).

| Port  | Description  | Access                |
|-------|--------------|-----------------------|
| 11434 | API Endpoint | Local Network & Proxy |

The service is proxied via Nginx and accessible at:

  • Local: http://<homelab-ip>:11434 (requires Local Network access)
  • Remote: https://ollama-home.javiersc.com (via Cloudflare Tunnel)

Ollama does not require external secrets for its basic operation. Authentication, if needed by clients, is handled at the proxy or application level.

  • State: Configured via homelab.backupPaths = [ "/var/lib/ollama" ].
  • Exclusions: The models/ directory is excluded from backups to save space, as models can be easily redownloaded using ollama pull.

Ensure that the service is configured to listen on all interfaces:

services.ollama.host = "0.0.0.0";
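
As a minimal sketch, the listen address usually sits alongside the other `services.ollama` options in the NixOS configuration. The `enable` and `port` lines below are assumptions based on the upstream nixpkgs module, not copied from this homelab's configuration; the port simply mirrors the one documented above.

```nix
{
  services.ollama = {
    enable = true;      # assumption: the service is enabled in this module
    host = "0.0.0.0";   # listen on all interfaces, as described above
    port = 11434;       # assumption: set explicitly to match the documented API port
  };
}
```

Binding to `0.0.0.0` makes the unauthenticated API reachable from every network the host is attached to, so this setup relies on the firewall and the proxy to restrict access.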

If the dashboard link fails, check the status of the Nginx proxy and confirm that port 11434 is registered in homelab.proxies.ollama. The local firewall rules are derived in modules/nixos/system/network.nix.

For hardware without a dedicated GPU (like the Intel N150), use small quantized models to maintain low latency, especially for voice interaction:

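| Model        | Size | Best For                        |
|--------------|------|---------------------------------|
| gemma2:2b    | 2B   | High quality (Classic)          |
| gemma4:e2b   | 2B   | New! High efficiency/Voice      |
| gemma4:e4b   | 4B   | High reasoning (Slower on CPU)  |
| llama3.2:1b  | 1B   | General purpose (Balanced)      |
| qwen2.5:0.5b | 500M | Voice (Lowest latency)          |
| qwen2.5:1.5b | 1.5B | Complex instructions            |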
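
One way to keep a chosen set of these models available is to declare them in the configuration, so they are pulled automatically instead of by hand with ollama pull. This is only a sketch: it assumes the nixpkgs Ollama module in use provides `services.ollama.loadModels`, and the two models listed are examples picked from the table, not this homelab's actual selection.

```nix
{
  services.ollama.loadModels = [
    "qwen2.5:0.5b"  # smallest model in the table, suited to voice
    "llama3.2:1b"   # balanced general-purpose model
  ];
}
```

Because the models/ directory is excluded from backups, declaring models this way would also let a restored host fetch them again without manual steps.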