Open Models
Run open-source LLMs on a real Linux machine. One-click recipes for Ollama with DeepSeek, Llama, Qwen, Gemma, Granite, TinyLlama and more. No quotas, no API keys.
What you get
Each Open Models recipe installs Ollama on a Linux box, pulls the model, and starts it as a systemd service. The model is exposed on the box's Ollama port (11434 by default), and you can talk to it from any co-deployed machine, from a chat UI on the same workspace, or from a coding agent; a quick smoke test follows the list below.
- No API keys. Your model, your machine, your data.
- No quotas. The bottleneck is the box, not a rate limit.
- GPU when you need it. Pick a GPU-backed box for the bigger models, a CPU box for the small ones.
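Once the box is up, hitting the Ollama HTTP API from any co-deployed machine is the fastest way to confirm it is reachable. A minimal sketch: ollama-box stands in for whatever hostname your workspace gives the Ollama machine, 11434 is Ollama's default port, and the model tag should match whatever the recipe pulled (qwen2.5-coder here is just an example).
# One-shot completion over the workspace network
curl http://ollama-box:11434/api/generate -d '{
  "model": "qwen2.5-coder",
  "prompt": "Write a haiku about systemd.",
  "stream": false
}'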
Available recipes
The full list lives at easyenv.io/ai/open-models. The current featured set:
- ollama_deepseek - DeepSeek. Strong on reasoning and code generation.
- ollama_qwen2_5_coder - Qwen 2.5 Coder 7B. Best paired with OpenCode for agent-driven coding.
- ollama_qwen2_5 - Qwen 2.5 7B. Tool-capable, JSON-tuned chat model.
- ollama_llama3.2, ollama_llama3_2_3b, ollama_llama3.1_8b - Meta's Llama family.
- ollama_granite3_2 - IBM Granite 3.2 8B. Strong instruction following, tool-capable.
- ollama_gemma - Google Gemma open-weight models.
- ollama_tinyllama - Compact model for low-RAM boxes.
Pick a machine
7B models run on a CPU box if you can wait, but you will want a GPU-backed box for anything serious. 8B+ models with long context windows need at least 16 GB of RAM (and a GPU helps a lot).
# Boot a workspace with an Ollama recipe pre-installed
easyenv workspace create \
  --template ollama_qwen2_5_coder \
  --gpu
Open WebUI as a chat surface
Want a polished chat UI in the browser? Add the openwebui recipe on a co-deployed machine. It auto-discovers the Ollama box on the same workspace and you get a ChatGPT-style interface backed by your own model.
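If you ever want to wire this up by hand instead of using the recipe (for example, running the upstream Open WebUI Docker image on another box), Open WebUI only needs the Ollama box's URL, set through its OLLAMA_BASE_URL environment variable. A sketch, with ollama-box as a placeholder hostname:
# Manual alternative to the recipe: run Open WebUI yourself and point it at the Ollama box
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://ollama-box:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui ghcr.io/open-webui/open-webui:main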
Pair with an agent
Every Open Models recipe is designed to pair with a Personal Assistant. The agents look for a co-deployed Ollama box on the workspace VPN and configure themselves automatically. See Personal Assistants for the agent catalog.
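Ollama also serves an OpenAI-compatible API under /v1 on the same port, which is what most coding agents and SDKs expect; if an agent does not auto-discover the box, pointing its base URL at the Ollama machine is usually enough. A sketch of that endpoint, with ollama-box as a placeholder hostname and qwen2.5-coder as an example model tag:
# OpenAI-compatible chat completion served by Ollama
curl http://ollama-box:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder",
    "messages": [{"role": "user", "content": "Refactor this function to be pure."}]
  }'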
Bring your own model
Anything Ollama supports works. Boot any of the recipes above (or a blank Ubuntu machine with Ollama installed) and run ollama pull <model> from the terminal. The service picks it up on the next request.
Models hosted on Hugging Face work too: run ollama run hf.co/<org>/<repo> and Ollama will fetch and cache the weights on the machine.
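For example, adding a model that none of the recipes ship with (mistral here is just an illustration; any tag from the Ollama library works):
# From a terminal on the Ollama box
ollama pull mistral     # fetch and cache the weights
ollama list             # confirm the model now shows up locally
# The running service serves it by name on the next request; no restart needed.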