The premise was simple. Get a personal AI agent running for myself — always on, reachable from my phone, on the cheapest stack I could make work.
OpenClaw is the platform that turned out to fit the brief. It’s @steipete’s open-source personal AI assistant. It sits between an LLM and the real world, with chat apps like WhatsApp as the interface and tools (browser, shell, web search) as the hands.
Here’s the shape of what my instance ended up as.
The hardware
- Raspberry Pi 5, running Debian 13 (Trixie) ARM64. Sits next to the router. On 24/7. Costs pennies in electricity.
- MacBook on the same LAN, reachable from the Pi over SSH. Used as a secondary box for larger storage and a fallback LLM.
That’s the whole fleet. Both are devices I already owned.
The platform
OpenClaw runs as a systemd user service on the Pi, so it comes up automatically on boot and stays running. Installed via npm, configured through a single openclaw.json file. That’s it. No Docker, no Kubernetes, no managed anything.
The brain: GPT-5.3 Codex, via ChatGPT OAuth
This is the part that surprised me most.
OpenClaw can authenticate to OpenAI’s models through ChatGPT account OAuth instead of an API key. For personal-volume use, that means the agent runs on my flat ChatGPT subscription rather than per-token API billing. No metered invoices, no fear of leaving a loop running overnight.
GPT-5.3 Codex is the default model for everything — the chat interface, the scheduled jobs, the tools.
The fallback: local LLM on the Mac
Ollama runs on the Mac with Qwen3 8B, exposed to the agent as an alternative model. It’s not the primary anymore — for the judgement-heavy tasks I throw at it, the hosted model is still meaningfully smarter — but having a private, zero-cost local path in reserve felt worth keeping.
I’d originally hoped to make this the primary brain. The hardware reality is that 8GB of RAM on an M1 only realistically gets you to 3B-class models at usable speeds, and those weren’t quite smart enough for what I wanted. Bumping up to an 8B was the compromise: slower than I’d want from a primary, but fine for a fallback.
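For anyone who wants to see the primary-then-fallback idea spelled out, here is a minimal Python sketch. It is not how OpenClaw handles model selection internally; `ask_with_fallback`, `hosted_model`, and `local_model` are hypothetical names, and the two model functions are stand-ins so the example runs without any network access.

```python
from typing import Callable, Sequence

# Hypothetical type: a "model" is any function that maps a prompt to a reply.
Model = Callable[[str], str]


def ask_with_fallback(prompt: str, models: Sequence[tuple[str, Model]]) -> tuple[str, str]:
    """Try each (name, model) pair in order; return (name, reply) from the first that works."""
    errors = []
    for name, model in models:
        try:
            return name, model(prompt)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stand-ins for the real backends (assumptions, not real API calls).
def hosted_model(prompt: str) -> str:
    raise ConnectionError("hosted backend unreachable")


def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


used, reply = ask_with_fallback(
    "Summarise today's feeds",
    [("hosted", hosted_model), ("local", local_model)],
)
print(used, reply)
```

In this run the hosted stand-in fails on purpose, so the local one answers. Keeping the order in a plain list means swapping which model is primary is a one-line change.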
The interface: WhatsApp
The agent lives in WhatsApp. You message it like a contact.
OpenClaw’s WhatsApp channel plugin handles the bridge — personal WhatsApp, not the Business API. It’s allowlist-based, so only approved numbers can actually talk to it. Setup was much less painful than I expected.
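The allowlist idea itself is simple enough to sketch in a few lines of Python. This is an illustration of the concept only, not the plugin's actual code or config format; `ALLOWED_NUMBERS`, `normalise`, and `is_allowed` are hypothetical names, and the phone numbers are placeholders.

```python
import re

# Hypothetical allowlist; the numbers are placeholders, not real contacts.
ALLOWED_NUMBERS = {"+447700900123", "+447700900456"}


def normalise(number: str) -> str:
    """Strip spaces, dashes and brackets so formatting differences don't matter."""
    cleaned = re.sub(r"[^\d+]", "", number)
    return cleaned if cleaned.startswith("+") else "+" + cleaned


def is_allowed(sender: str, allowlist: set[str] = ALLOWED_NUMBERS) -> bool:
    """Only senders whose normalised number is on the list get through."""
    return normalise(sender) in allowlist


print(is_allowed("+44 7700 900-123"))  # on the list, just formatted differently
print(is_allowed("+1 202 555 0100"))   # not on the list
```

The point of normalising first is that the same number can arrive with or without spaces and punctuation, and the check should not care.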
The big advantage: no new app to install, no new habit to build. The thing I already use all day became the control surface for the agent.
The tools wired in
Four of them:
- web_fetch: pulls any URL, whether that's RSS feeds, public APIs, or web pages.
- web_search: live search via Tavily (free tier is enough at personal volume).
- exec: runs shell commands on the Pi.
- CloakBrowser: a stealth Chromium build that bypasses bot detection on JS-heavy sites that normal scrapers can’t touch.
CloakBrowser is the one I’d flag as non-obvious. Most flight, travel, and aggregator sites detect and block conventional scraping immediately. CloakBrowser patches Chromium at the source level so it looks like a real human browser. Without it, half the things I wanted the agent to monitor would just hit a wall.
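So the shape of the thing ended up looking like this: WhatsApp on my phone talks to OpenClaw on the Pi, which calls GPT-5.3 Codex by default, can fall back to Qwen3 8B on the Mac, and has the four tools above to work with.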
What it actually does
The jobs running right now are kind of random. They’re me poking at the edges — seeing what the agent can do, where it breaks, what’s worth pushing further. None of it is critical workflow, and that’s the point.
Two flavours of scheduled jobs.
OpenClaw cron jobs — the AI runs the task and decides what to say:
- Daily news briefing — pulls a long list of RSS feeds covering my work domain, summarises the last 24 hours into something readable, and pings the result to WhatsApp every morning.
- Lottery monitor — checks the next prize pool and tells me to buy a ticket if it’s above a threshold worth a punt.
- Market scan — hits a public jobs API a few times a week to surface where roles in my space are moving: which functions are hiring, what skills (especially AI-adjacent) are showing up in the postings, where the market is drifting.
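The "last 24 hours" part of the news briefing is the only mechanical step in that job, and it can be sketched in plain Python. This is an assumption about how such a filter could look, not what the agent actually runs; `recent_items` and `SAMPLE` are hypothetical, and the sketch only handles RSS 2.0 feeds with a standard pubDate.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def recent_items(rss_xml: str, now: datetime, window: timedelta = timedelta(hours=24)):
    """Return (title, link) for RSS 2.0 items published within `window` before `now`."""
    cutoff = now - window
    results = []
    for item in ET.fromstring(rss_xml).iter("item"):
        pub = item.findtext("pubDate")
        if not pub:
            continue  # skip undated items rather than guess
        try:
            published = parsedate_to_datetime(pub)
        except (TypeError, ValueError):
            continue  # skip dates that don't parse
        if published.tzinfo is None:
            published = published.replace(tzinfo=timezone.utc)  # assume UTC if unstated
        if cutoff <= published <= now:
            results.append((item.findtext("title", default=""), item.findtext("link", default="")))
    return results


# Made-up sample feed so the sketch runs offline.
SAMPLE = """<rss version="2.0"><channel>
<item><title>Fresh</title><link>https://example.com/a</link><pubDate>Mon, 06 Jan 2025 08:00:00 GMT</pubDate></item>
<item><title>Stale</title><link>https://example.com/b</link><pubDate>Sat, 04 Jan 2025 08:00:00 GMT</pubDate></item>
</channel></rss>"""

now = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
print(recent_items(SAMPLE, now))
```

Only the item from the previous hour survives the filter. The summarising that comes after is the judgement part, and that is the bit worth handing to the model.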
System crontab job — a deterministic Python script, no AI in the loop:
- Flight price monitor — uses Playwright + CloakBrowser to scrape a flight aggregator for a specific trip I’m planning and pings WhatsApp if the total drops below a target.
The split between the two is deliberate. Summarising RSS feeds is a judgement task — the AI is good at it. Scraping a flight site for an exact route on exact dates is a precision task — a plain script with CloakBrowser is more reliable and more predictable.
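To make the "precision task" side concrete, here is a rough Python sketch of the decision step in a price monitor like that one. It leaves out the Playwright and CloakBrowser scraping entirely and starts from an already-scraped price string; `parse_price`, `alert_message`, the pound sign, and the target figure are all assumptions for illustration, not my actual script.

```python
import re
from decimal import Decimal
from typing import Optional


def parse_price(text: str) -> Decimal:
    """Pull the first number out of a scraped price string such as '£1,249.00'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return Decimal(match.group().replace(",", ""))


def alert_message(scraped: str, target: Decimal) -> Optional[str]:
    """Return the alert text if the price is below target, otherwise None."""
    price = parse_price(scraped)
    if price < target:
        return f"Flight total {price} is below target {target}"
    return None


# Hypothetical scraped values and target.
print(alert_message("Total: £1,249.00", Decimal("1300")))
print(alert_message("Total: £1,450.00", Decimal("1300")))
```

Same input, same output, every time. There is nothing here for a model to interpret, which is why this job sits in the system crontab rather than in an AI cron job.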
What’s deliberately not connected (yet)
The obvious next integrations would be email, Notion, and the project management tools I live in every day. None of those are wired up right now, and that’s on purpose.
This is exploration mode. I want to understand the limits, the failure modes, the rough edges — what the agent quietly gets wrong, how often, what kind of supervision it actually needs — before I hand it the keys to anything I rely on. That kind of confidence is hard to build if the first thing you do is point it at your inbox.
Email and Notion are on the roadmap. Just not yet.
Design decisions worth flagging
- Always-on Pi instead of a cloud server. No monthly hosting bill. The whole agent stack lives at the edge of my own network, on a box that costs pennies a day to run.
- WhatsApp as the UI. No app to build, no notification system to design. The thing I already use all day becomes the control panel.
- Two types of scheduling. AI cron when the output needs judgement. Plain crontab when the output needs precision. Knowing which is which probably saved more pain than any model upgrade would have.
- CloakBrowser for the hard scrapes. A surprising number of “monitor this site” ideas die quietly on bot detection. Having a tool that gets through them changes what the agent can actually do.
- OAuth-backed model access. Running on a flat ChatGPT subscription instead of metered API calls makes the cost model predictable at personal volume. It’s the difference between “use it freely” and “watch the meter.”
Where Claude Code fit in
Claude Code is the thing that actually let me get all of this wired together without stalling.
I’m a business operator, not a deep engineer. The Pi side, the systemd service, the WhatsApp bridge, the OAuth setup, the cron syntax, the Playwright script, the CloakBrowser config — every one of those was something I could have spent a week getting stuck on. Instead I talked through each problem with Claude Code and kept moving.
That’s the part of this experiment I keep coming back to. The bottleneck used to be technical depth. Now the bottleneck is mostly just deciding what I actually want the thing to do.
Where it goes next
The setup isn’t finished. I keep adding tools, tweaking what’s scheduled, refining how the agent talks to me. Some of it works. Some of it doesn’t.
The longer-term plan is to swap the Pi out for a new Mac mini once the next chip lands. Same headless role, but with enough horsepower to actually host a usable local model in one box — which would collapse the brain and the gateway onto the same machine and probably let me move off the hosted backend too.
For now though, Pi + WhatsApp + GPT-5.3 Codex on a ChatGPT subscription is the sweet spot. The whole point of the exercise was figuring out how low the floor is. Turns out the floor is pretty low.
Back to building.
— Howie