ollama
Containers running Ollama
Usage with Docker
Two compose modes are available:
- CPU mode (default): compose.yaml
- GPU mode (NVIDIA): compose.yaml + compose-gpu.yaml (override)
CPU mode (default)
- Start :
docker compose up -d
- Stop :
docker compose down
GPU mode (NVIDIA)
- Ensure that GPU support is enabled in Docker (this runs an NVIDIA CUDA benchmark sample inside a container) :
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
- Start :
docker compose -f compose.yaml -f compose-gpu.yaml up -d
- Stop :
docker compose -f compose.yaml -f compose-gpu.yaml down
Tip: alternatively, list both files in the COMPOSE_FILE environment variable instead of repeating the -f options :
COMPOSE_FILE=compose.yaml:compose-gpu.yaml docker compose up -d
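The COMPOSE_FILE tip above can be extended to a whole shell session, so that later `docker compose` commands (exec, down) pick up the same files without repeating the `-f` options. The following is a minimal sketch, assuming both compose files sit in the current directory and a Linux or macOS shell (where `:` is the path separator); the `compose_files` helper is a hypothetical name, not part of this repository.

```shell
# compose_files [cpu|gpu]: print the COMPOSE_FILE value for the given mode.
# Hypothetical helper; file names are the ones used in this README.
compose_files() {
  case "${1:-cpu}" in
    gpu) printf '%s\n' 'compose.yaml:compose-gpu.yaml' ;;
    cpu) printf '%s\n' 'compose.yaml' ;;
    *)   echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Usage (requires Docker, so left commented out here):
# export COMPOSE_FILE="$(compose_files gpu)"
# docker compose config   # print the merged configuration
# docker compose up -d
# docker compose down
```

Exporting the variable once keeps start and stop symmetric: the same files are used for `up` and `down`.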
CLI and API usage
Use these commands once Ollama is started (CPU or GPU mode):
- To use Ollama CLI :
# pull models from https://ollama.com/library
docker compose exec ollama ollama pull llama3
docker compose exec ollama ollama pull gemma2
# run a model interactively (it is pulled automatically if missing)
docker compose exec ollama ollama run llama3.1
- To use Ollama API :
# list models
curl -sS http://localhost:11434/api/tags | jq -r '.models[].name'
# pull model from https://ollama.com/library
curl http://localhost:11434/api/pull -d '{
"name": "llama3"
}'
# generate a completion (the model must already be pulled)
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"prompt": "Why is the sky blue?"
}'
- To create a custom model from an Ollama Modelfile (a sample is available in models/geoassistant) :
docker compose exec ollama /bin/bash
ollama create geoassistant -f /models/geoassistant/Modelfile
ollama run geoassistant
# Do you know the most visited museums in Paris?
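By default, /api/generate streams its answer as a sequence of JSON chunks, and the API is not reachable until the container has finished starting. The sketch below shows one way to wait for the server and then request a single, non-streamed reply. It assumes Ollama listens on localhost:11434 as in the curl examples above and that `curl` and `jq` are installed; `OLLAMA_URL`, `retry` and `generate` are hypothetical names introduced for illustration.

```shell
# Assumption: same host and port as the curl examples in this README.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# retry <attempts> <delay_seconds> <command...>
# Rerun a command until it succeeds or the attempts are exhausted.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# generate <model> <prompt>
# Build the JSON body with jq (safe quoting), disable streaming,
# and print only the generated text.
generate() {
  jq -n --arg model "$1" --arg prompt "$2" \
      '{model: $model, prompt: $prompt, stream: false}' |
    curl -fsS "$OLLAMA_URL/api/generate" -d @- |
    jq -r '.response'
}

# Usage (requires a running container and a pulled model):
# retry 30 1 curl -fsS "$OLLAMA_URL/api/version" \
#   && generate llama3 "Why is the sky blue?"
```

Building the request body with `jq -n --arg` avoids hand-escaping quotes when the prompt itself contains quotes or newlines.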
Resources
Ollama :
- ollama.com - Library (available models)
- ollama - API
- github.com - ollama/ollama
- hub.docker.com - ollama/ollama
GPU support :
Clients :