Ollama

Overview

Ollama powers the homelab’s local AI capabilities. It runs Large Language Models (LLMs) such as Llama 3 or Mistral directly on the server, so inference stays local and prompts are not sent to a third-party API.

In this homelab, agents such as PicoClaw use Ollama, and it can also be reached from the dashboard with any compatible web client, such as Page Assist.

Ports

Port	Description	Access
`11434`	API Endpoint	Local Network & Proxy

External Access

The service is proxied via Nginx and accessible at:

Local: http://<homelab-ip>:11434 (requires Local Network access)
Remote: https://ollama-home.javiersc.com (via Cloudflare Tunnel)

Secrets

Ollama does not require external secrets for its basic operation. Authentication, if needed by clients, is handled at the proxy or application level.

Backup

State: Configured via homelab.backupPaths = [ "/var/lib/ollama" ].
Exclusions: The models/ directory is excluded from backups to save space, since models can be re-downloaded with ollama pull.

Troubleshooting

Service is not responding locally

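By default Ollama binds only to 127.0.0.1, so requests from other machines on the local network are refused. Ensure that the service is configured to listen on all interfaces: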

services.ollama.host = "0.0.0.0";
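For context, the line above could sit in a module like the following. This is a minimal sketch that assumes the stock NixOS `services.ollama` module; how this homelab actually wraps the service is not shown here, so treat the surrounding structure as an assumption.

```nix
{
  # Sketch only: assumes the upstream NixOS `services.ollama` module is in use.
  services.ollama = {
    enable = true;
    # Bind to all interfaces instead of the default loopback address.
    host = "0.0.0.0";
    # 11434 is already the module default; written out to match the Ports table.
    port = 11434;
  };
}
```

Binding to 0.0.0.0 only changes where Ollama listens. The port must still be allowed through the firewall before other hosts can connect.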

Dashboard link doesn’t work

If the dashboard link fails, check that the Nginx proxy is running and that port 11434 is registered in homelab.proxies.ollama. The local firewall rules are derived in modules/nixos/system/network.nix.

Recommended Models (N150 CPU)

For hardware without a dedicated GPU (like the Intel N150), use small quantized models to maintain low latency, especially for voice interaction:

Model	Parameters	Best For
`gemma2:2b`	2B	High quality (Classic)
`gemma4:e2b`	2B	New! High efficiency/Voice
`gemma4:e4b`	4B	High reasoning (Slower on CPU)
`llama3.2:1b`	1B	General purpose (Balanced)
`qwen2.5:0.5b`	0.5B	Voice (Lowest latency)
`qwen2.5:1.5b`	1.5B	Complex instructions
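Because the models/ directory is not backed up, it can help to declare the models the server should always have. The sketch below uses the `loadModels` option of the upstream NixOS `services.ollama` module, which pulls the listed models once the service has started. It assumes a nixpkgs release recent enough to ship that option, and whether this homelab sets it is an assumption. The two tags are taken from the table above purely as an illustration.

```nix
{
  # Sketch only: `loadModels` comes from the upstream NixOS Ollama module.
  # Nothing in this document confirms that the homelab uses it.
  services.ollama.loadModels = [
    "qwen2.5:0.5b" # lowest latency, suited to voice
    "llama3.2:1b"  # balanced general-purpose model
  ];
}
```

With a list like this, a freshly restored or rebuilt host would fetch its models again without a manual ollama pull.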