2026.2.15 | Tutorial
Getting Started with Ollama: Run LLMs Locally
Tags: Ollama · LLM · Local AI · TypeScript
Ollama lets you run large language models entirely on your own hardware. This tutorial walks you through installation and running your first model.
Step 1: Install Ollama
```shell
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Verify installation
ollama --version
```

Step 2: Pull and Run a Model
```shell
# Pull Llama 3.2 (3B — runs on 8 GB RAM)
ollama pull llama3.2

# Start an interactive chat session
ollama run llama3.2
```

Step 3: Use the REST API
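The chat example that follows assumes the Ollama server is up; it listens on http://localhost:11434 whenever the desktop app or `ollama serve` is running. A minimal reachability check, assuming the default port (the root endpoint simply reports server status):

```typescript
// Ping the local Ollama server (default port 11434 is an assumption;
// adjust the base URL if you run Ollama elsewhere).
async function ollamaUp(base = "http://localhost:11434"): Promise<boolean> {
  try {
    const res = await fetch(base + "/"); // root endpoint reports server status
    return res.ok;
  } catch {
    return false; // server not started: launch the app or run `ollama serve`
  }
}
```

If this returns false, start the server before trying any of the API calls below.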
```typescript
// Ask the local /api/chat endpoint a question (non-streaming)
const response = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.2",
    messages: [{ role: "user", content: "Why is the sky blue?" }],
    stream: false, // return one complete JSON object instead of a token stream
  }),
});

const data = await response.json();
console.log(data.message.content);
```

Recommended Models
- llama3.2:3b — fastest, fits in 8 GB VRAM
- mistral:7b — excellent instruction following
- qwen2.5-coder:7b — best for code generation
- nomic-embed-text — embeddings for RAG pipelines
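The Step 3 example sets stream: false for simplicity; by default /api/chat streams its reply as newline-delimited JSON, one token fragment per line. A sketch of consuming that stream in Node 18+, where fetch response bodies are async-iterable (the helper name drainNdjson is invented here, not part of Ollama):

```typescript
// Split buffered NDJSON text into complete parsed lines plus any leftover
// partial line, which is kept for the next network chunk.
function drainNdjson(buffer: string): { objects: any[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // last element may be an incomplete line
  const objects = lines.filter((l) => l.trim()).map((l) => JSON.parse(l));
  return { objects, rest };
}

// Stream a chat reply, printing tokens as they arrive.
async function streamChat(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2",
      messages: [{ role: "user", content: prompt }],
      // stream defaults to true, so no flag is needed here
    }),
  });

  const decoder = new TextDecoder();
  let buffer = "";
  let full = "";
  for await (const chunk of res.body as any) {
    buffer += decoder.decode(chunk, { stream: true });
    const { objects, rest } = drainNdjson(buffer);
    buffer = rest;
    for (const part of objects) {
      if (part.message?.content) {
        process.stdout.write(part.message.content); // token fragment
        full += part.message.content;
      }
    }
  }
  return full;
}
```

Streaming is what makes a local chat UI feel responsive: tokens appear as the model generates them instead of after the whole reply is done.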
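For the RAG use case mentioned in the last entry above, a sketch of fetching embeddings and comparing them: Ollama's /api/embed endpoint returns one vector per input string (this assumes you have pulled the model first with `ollama pull nomic-embed-text`):

```typescript
// Sketch: embed a piece of text with a locally pulled embedding model.
async function embed(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embed", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", input: text }),
  });
  const data = await res.json();
  return data.embeddings[0]; // one vector per input string
}

// Cosine similarity for ranking document chunks against a query vector.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Embed your documents once, store the vectors, then embed each query and rank chunks by cosine similarity before passing the top hits to a chat model.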