This week's tip: running open weight models (commonly called open source models) with Claude Code.

Most people assume Claude Code requires an Anthropic API key or subscription. It doesn't.

I was showing someone Claude Code recently and the API spend was a blocker for them. But Claude Code accepts any Anthropic-compatible API endpoint via the ANTHROPIC_BASE_URL environment variable, so you can point it at open weight models running locally instead.

Ollama is the easiest way to get a local model running. Once it's installed, enable "Expose Ollama to the network" in Ollama's settings so Claude Code can reach it.

Ollama Settings with Expose Ollama to the network enabled
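Before going further, it's worth confirming that Ollama is actually reachable. A minimal sketch, assuming Ollama's default port 11434 (`check_ollama` is a made-up helper name; `/api/version` is Ollama's lightweight version endpoint):

```shell
#!/usr/bin/env sh
# Sanity check: can this machine reach the Ollama server?
# Assumes the default port 11434; override OLLAMA_URL if yours differs.
OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

check_ollama() {
  # /api/version responds on any running Ollama instance
  if curl -fsS "$OLLAMA_URL/api/version" >/dev/null 2>&1; then
    echo "Ollama is reachable at $OLLAMA_URL"
  else
    echo "Could not reach Ollama at $OLLAMA_URL -- is the app running and exposed to the network?"
  fi
}

check_ollama
```

If this prints the failure message, fix your Ollama setup before touching Claude Code; nothing downstream will work otherwise.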

Then pull a model, like the recently released Gemma 4 model from Google.

ollama pull gemma4:e2b

Then set these environment variables:

export ANTHROPIC_BASE_URL="http://localhost:11434"
export ANTHROPIC_AUTH_TOKEN="" # empty or literally anything
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1

And launch Claude Code pointing at your model:

claude --model gemma4:e2b

You can also set the environment variables inline if you don't want them in your shell startup script:

ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_API_KEY="" claude --model gemma4:e2b
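If you use this often, a small wrapper function saves retyping. A sketch you could drop in your shell startup script (`claude_local` is a made-up name; the default model is the one used above):

```shell
#!/usr/bin/env sh
# Wrapper: launch Claude Code against a local Ollama model.
# Usage: claude_local            -> uses gemma4:e2b
#        claude_local qwen2.5:7b -> uses the named model
claude_local() {
  ANTHROPIC_BASE_URL="http://localhost:11434" \
  ANTHROPIC_AUTH_TOKEN="ollama" \
  ANTHROPIC_API_KEY="" \
  claude --model "${1:-gemma4:e2b}"
}
```

Because the assignments prefix the `claude` invocation, they apply only to that command and never leak into your regular Anthropic-backed sessions.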

For the models themselves, you'll want one that supports tool/function calling, since that's what powers Claude Code's agentic capabilities like reading files and running commands. On a machine with 16-32GB of RAM, good starting points are gemma4:e2b or qwen2.5:7b. You can explore all the models in the Ollama library.
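If you're unsure whether a model you've pulled supports tools, recent Ollama builds list a Capabilities section in `ollama show` output. A hedged sketch (`supports_tools` is a made-up helper, and the exact Capabilities formatting is an assumption about your Ollama version):

```shell
#!/usr/bin/env sh
# Check whether a pulled model advertises tool-calling support.
supports_tools() {
  # Look for a "tools" entry in the model's `ollama show` output
  ollama show "$1" | grep -qi '^ *tools *$' && echo "$1 supports tool calling"
}

# Example: supports_tools gemma4:e2b
```

A model without tool support will still chat, but Claude Code's file edits and command execution will silently degrade, so it's worth checking up front.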

Claude Code running gemma4:e2b

Honest take: open weight models are not as capable as Anthropic models. You will get lower edit accuracy and more retries. This is not a like-for-like swap and I don't use it as my daily driver. But if you are new to the field and want to learn AI-assisted coding workflows without burning a hole in your wallet, this is a legitimate way in.

That's it! Short and sweet. Until the next one!
