2026-02-15 | Tutorial

Getting Started with Ollama: Run LLMs Locally

Tags: Ollama, LLM, Local AI, TypeScript

Ollama lets you run large language models entirely on your own hardware, with no data leaving your machine. This tutorial walks you through installing Ollama, running your first model, and calling the local REST API.

Step 1: Install Ollama

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Verify installation
ollama --version

Step 2: Pull and Run a Model

# Pull Llama 3.2 (3B — runs on 8 GB RAM)
ollama pull llama3.2

# Start an interactive chat session
ollama run llama3.2

Step 3: Use the REST API

const response = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.2",
    messages: [{ role: "user", content: "Why is the sky blue?" }],
    stream: false,
  }),
});

const data = await response.json();
console.log(data.message.content);
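The request above sets stream: false to get the whole reply in one JSON object. With streaming enabled (Ollama's default), the endpoint instead returns newline-delimited JSON chunks, each carrying the next piece of the reply in message.content. Here is a minimal sketch of consuming that stream; the chunkContent and streamChat names are illustrative helpers, not part of Ollama's API:

```typescript
// Extract the content fragment from one NDJSON chunk line.
// The final chunk has done: true and no new content, so default to "".
function chunkContent(line: string): string {
  const chunk = JSON.parse(line) as { message?: { content?: string } };
  return chunk.message?.content ?? "";
}

// Stream a chat reply from a local Ollama server and return the full text.
async function streamChat(prompt: string): Promise<string> {
  const response = await fetch("http://localhost:11434/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2",
      messages: [{ role: "user", content: prompt }],
      stream: true, // the default; shown here for clarity
    }),
  });

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let full = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Simplifying assumption: each read ends on a line boundary.
    // Production code should buffer partial lines across reads.
    for (const line of decoder.decode(value, { stream: true }).split("\n")) {
      if (line.trim()) full += chunkContent(line);
    }
  }
  return full;
}
```

This assumes a runtime with the Fetch API and web streams built in (Node 18+, Deno, or Bun). Printing each fragment as it arrives instead of accumulating it gives you the familiar token-by-token typing effect.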

Recommended Models

  1. llama3.2:3b — fastest, fits in 8 GB VRAM
  2. mistral:7b — excellent instruction following
  3. qwen2.5-coder:7b — best for code generation
  4. nomic-embed-text — embeddings for RAG pipelines
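For the last entry on that list, here is a sketch of how nomic-embed-text might slot into a RAG pipeline, assuming the /api/embeddings endpoint (which takes a model and a prompt and returns a single embedding vector) and that the model has already been pulled. The embed and cosineSimilarity helpers are illustrative names:

```typescript
// Request an embedding vector for a piece of text from local Ollama.
// Assumes `ollama pull nomic-embed-text` has been run beforehand.
async function embed(text: string): Promise<number[]> {
  const response = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  const data = (await response.json()) as { embedding: number[] };
  return data.embedding;
}

// Cosine similarity between two vectors — the core comparison used to
// rank stored document chunks against a query embedding in RAG retrieval.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

In a full pipeline you would embed your documents once, store the vectors, then at query time embed the question and return the chunks with the highest cosine similarity as context for the chat model.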