2026-02-15 | Tutorial
Getting Started with Ollama: Run LLMs Locally
Ollama lets you run large language models entirely on your own hardware. This tutorial walks you through installing it and running your first model.
Step 1: Install Ollama
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh
# Verify installation
ollama --version

Step 2: Pull and Run a Model
# Pull Llama 3.2 (3B — runs on 8 GB RAM)
ollama pull llama3.2
# Start an interactive chat session
ollama run llama3.2

Step 3: Use the REST API
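Ollama's server listens on http://localhost:11434 by default (the desktop app starts it for you; `ollama serve` starts it by hand). By default, /api/chat streams its reply as newline-delimited JSON, one chunk per line; the example in this step passes `stream: false` to get a single JSON object instead. Assembling the streamed form looks like the sketch below, which simulates the chunks so it runs without a live server:

```javascript
// Ollama streams replies as newline-delimited JSON: one object per
// line, each carrying a slice of the answer in message.content.
// collectStream concatenates those slices into the full reply.
function collectStream(ndjson) {
  let text = "";
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue;
    const chunk = JSON.parse(line);
    if (chunk.message?.content) text += chunk.message.content;
  }
  return text;
}

// Simulated chunks in the shape /api/chat streams when stream is true:
const sample = [
  '{"message":{"role":"assistant","content":"Rayleigh "},"done":false}',
  '{"message":{"role":"assistant","content":"scattering."},"done":true}',
].join("\n");

console.log(collectStream(sample)); // Rayleigh scattering.
```

With a real request you would read `response.body` incrementally and feed each decoded line through the same logic.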
const response = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.2",
    messages: [{ role: "user", content: "Why is the sky blue?" }],
    stream: false,
  }),
});
const data = await response.json();
console.log(data.message.content);

Recommended Models
- llama3.2:3b — fastest, fits in 8 GB VRAM
- mistral:7b — excellent instruction following
- qwen2.5-coder:7b — best for code generation
- nomic-embed-text — embeddings for RAG pipelines
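The last model in the list, nomic-embed-text, returns embedding vectors rather than text: POST to /api/embeddings with { model, prompt } and the response carries an { embedding: [...] } array. In a RAG pipeline you rank documents by comparing those vectors with cosine similarity. A minimal sketch, with small hand-written vectors standing in for real embeddings so it runs without a server:

```javascript
// Cosine similarity: a standard relevance score between embedding
// vectors (1 = same direction, 0 = orthogonal).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Tiny stand-in vectors; real ones come from
//   POST http://localhost:11434/api/embeddings
//   body: { "model": "nomic-embed-text", "prompt": "your text" }
const query = [0.9, 0.1, 0.0];
const docA = [0.8, 0.2, 0.1]; // points the same way as the query
const docB = [0.0, 0.1, 0.9]; // unrelated

console.log(cosineSimilarity(query, docA) > cosineSimilarity(query, docB)); // true
```

Embed your documents once, store the vectors, then at query time embed the question and return the highest-scoring documents as context for the chat model.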