
Nobody Warned Me About This Place

Seven months deep in Claude's ecosystem. Desktop, CLI, mobile, Projects, MCP servers, multi-agent fleets. What works, what burned me, and why I can't stop ordering.

A builder's unfiltered guide to working with Claude: every surface, every failure, every obsession


You know that feeling when you discover a new cuisine? Not a trendy restaurant that your coworker won't shut up about. The real thing. A hole-in-the-wall where the owner is the chef and the chef actually understands food. You order one thing. It's good. You come back the next day and order three things. A week later you're reorganizing your kitchen because you rabbit-holed about fermentation techniques and now you own a $200 crock you definitely didn't need.

That's me with Claude. Except the crock is a 30-program autonomous agent fleet orchestrated through a custom MCP server that ships code, runs tests, and deploys changes while I sleep. And the rabbit hole was "what if I built an entire operating system on top of Claude Code?"

The First Bite: The API, Then the Interface

My first taste of Anthropic wasn't Claude the chat product. It was Sonnet and Haiku through the API, about two years ago at Monks. We were piping model calls into data pipelines and internal tooling. It was good. Fast, reliable, cheaper than the alternatives for the work we needed done.

Then I discovered Opus. And then Claude the product. And I haven't been the same since.

I'd been using other LLMs, and the interaction always felt like talking to a very fast intern: eager, surface-level, occasionally hallucinating with total confidence, and fond of a ton of emojis. Claude felt different. Responses had structure. When I pushed back, it actually reconsidered instead of restating the same thing with more adjectives. And there were no emojis.

The first real shift was a technical architecture conversation. I described a multi-tenant system I was designing and asked Claude to poke holes in it. Instead of the usual "great idea, here are some considerations," I got a structured critique that identified a race condition I hadn't thought about. Specific. Technically grounded. Not hedging.

I ordered the second dish.

The Tasting Menu: Projects

If regular chat is à la carte, Projects is the tasting menu. You set context once (system instructions, uploaded files, domain knowledge) and every conversation inside that Project inherits it. Claude remembers who you are, what you're building, and how you like your feedback.

I created a Project for my product studio with architecture docs, coding conventions, and product briefs uploaded as context. Suddenly, Claude wasn't a general-purpose assistant answering generic questions. It was a team member who had read the wiki.

In a regular chat, you spend 40% of your tokens re-explaining context. In a Project, you walk in and say "add rate limiting to the enrichment pipeline" and Claude already knows what the enrichment pipeline is, what language it's written in, and where the entry point lives.

System instructions that define personality and constraints work well. So do architectural decision records and pinned coding conventions. Product briefs as context mean you can ask "what should v2 look like?" and get answers grounded in your actual product.

What doesn't work: uploading too much. There's a context window, and burying the signal in 50 uploaded files means Claude processes all of it but prioritizes none of it. Five high-signal documents beat 50 medium-signal ones. I learned to curate aggressively.

Learning to Cook at Home: Claude Code (CLI)

The CLI changed everything.

Desktop Claude is great for thinking, planning, and conversation. But if you write code for a living, there's a friction tax: copy from Claude, paste into editor, realize it doesn't quite fit, go back to Claude with the error, get a fix, paste again. It works. It's also slow enough that you start wondering if you should've just written it yourself.

Claude Code eliminates the paste. It reads your codebase directly. Edits files in place. Runs your tests. Commits your code. It's not a copilot suggesting the next line. It's an agent that can take a task description and execute across multiple files.

The first time I told Claude Code "add health check endpoints to the enrichment worker" and watched it read the existing route structure, create the handler, add tests, and commit, all without me touching my editor, it clicked. Claude Code is reportedly 90% self-written. The tool that helps me write code was itself written by the tool. Anthropic built Cowork in 10 days using Claude. The recursion isn't a gimmick. It's a signal about what's possible when you let the model do what it's good at.

CLAUDE.md is your secret weapon. Drop a markdown file in your repo root with project conventions, and Claude Code reads it on every session start. Mine contains git identity rules, deployment patterns, and architecture decisions. Claude follows them without being told.
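To make that concrete, here's a minimal sketch of what such a file can look like. Every rule, path, and identity below is invented for illustration; it's not my actual file:

```markdown
# CLAUDE.md (illustrative sketch)

## Git
- Commit as "fleet-bot <fleet@example.com>", never as the human account.
- Never force-push to main. Feature branches only.

## Deployment
- `make deploy` is the only sanctioned deploy path. No ad-hoc scripts.

## Architecture decisions
- The enrichment worker is the single writer to the cache table.
- New services get a health check endpoint before anything else.
```

Short, declarative, and scoped to things Claude would otherwise have to guess.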

Plan mode is underrated. Instead of letting Claude sprint into implementation, you can force it to show you the plan first. I use this for anything touching more than three files. The result usually reads like an entire eng team stepped up to the plate and compressed a month of deliberation into one planning session.

Subagents are powerful and dangerous. Claude Code can spawn sub-processes to parallelize work. I once launched four subagents in the same repo to work on four different stories simultaneously. They all shared the same filesystem. One agent's git checkout changed the branch for all four. Commits ended up on wrong branches. The entire sprint was contaminated. I spent an evening cherry-picking and force-pushing.

The lesson: never parallelize agents in the same working directory. Serialize per repo, or use separate git worktrees. I learned this the hard way and codified it as a hard rule.
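The worktree version of that rule is a few commands. Here's a runnable sketch against a throwaway demo repo (all paths and branch names are hypothetical):

```shell
set -euo pipefail

# Throwaway demo repo standing in for a real project.
repo="$(mktemp -d)/demo"
git init -q "$repo"
cd "$repo"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# One worktree per agent: each gets its own checkout and branch,
# so one agent's checkout can never move another agent's branch.
git worktree add -q ../agent-a -b agent-a
git worktree add -q ../agent-b -b agent-b

git worktree list
```

Each agent gets pointed at its own worktree directory; merges happen once, deliberately, at the end.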

Context windows are finite. Long sessions degrade. Claude starts losing track of earlier decisions, repeating itself, or contradicting things it said 30 minutes ago. I rotate sessions at natural break points and write handoff notes so the next session picks up clean. Think of it like shift changes. The night crew reads the day crew's notes.
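A handoff note is nothing formal. Mine look roughly like this sketch (contents invented for illustration):

```markdown
## Handoff — enrichment worker, session 14

- Done: rate limiter added, tests green, committed on feat/rate-limit.
- Decided: token bucket over sliding window (simpler, good enough).
- Next: wire the limiter config into the deploy environment.
- Watch out: the retry test is flaky on CI; rerun before judging it.
```

Four lines of done/decided/next/watch-out beats 100k tokens of stale history.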

The Food Truck: Claude Mobile

Nobody's writing TypeScript on a 6-inch screen. But for thinking? Voice conversations while walking the dog. Quick architecture decisions while waiting for coffee. Reviewing a product brief during lunch.

The voice interaction model is natural enough that I've had full strategic planning sessions on morning walks. I describe a problem. Claude asks clarifying questions. We iterate on an approach. By the time I'm back at my desk, the plan is formed and I can execute instead of staring at a blank screen trying to figure out where to start.

I also built a mobile app (CacheBash) that sends tasks and directives to my agent fleet through an MCP server. So now I can dispatch work from my phone, and Claude agents on my laptop pick it up and execute. I'm not saying I've dispatched a code sprint watching my sons play at the park. But I'm not saying I haven't.

The Dish That Burned Me: Cowork & Artifacts

Cowork (the collaborative document editing mode) has a problem: it occasionally loses your work. Not in a "didn't save" way. In a "the document you spent two hours co-editing is just gone" way. No warning. No recovery. Just a blank canvas where your product requirements used to be.

This happened to me twice. The second time, I lost a comprehensive PRD for a product launch. The kind of document where every sentence represented a decision that took 20 minutes to make. Gone. I rebuilt it from memory, but the sting hasn't faded.

My current approach: I still use Cowork for brainstorming and rough drafts, but I copy important content out frequently. Anything mission-critical gets written in my repo, not in a Cowork session. Trust, but verify. And save.

Artifacts are better. Claude generates code, documents, SVGs, even interactive React components that render live in the conversation. I've mocked up dashboard layouts, generated architecture diagrams, and built interactive data visualizations without leaving the chat window. The limitation: Artifacts are ephemeral by default. Great for exploration, bad for production. I treat them as sketches, not blueprints.

Sandbox fills a specific niche well: "run this and show me the output." Data analysis, charting, quick scripts. Not a replacement for a real development environment, but for "does this regex actually match what I think it matches?" it's faster than opening a terminal.

Building My Own Kitchen: MCP and Skills

I enrolled in culinary school.

Model Context Protocol (MCP) is Anthropic's standard for connecting Claude to external tools and data sources. You build a server that exposes capabilities, and Claude can call them during conversations.

I built CacheBash, an MCP server that gives Claude access to a task queue, messaging system, program state management, sprint tracking, and fleet health monitoring. It started as "Claude needs a way to assign tasks to other Claude sessions." It became a full communications backbone for a multi-agent operating system.

The protocol is straightforward. JSON-RPC over stdio or HTTP. Define your tools with schemas. Claude calls them. You respond. The barrier to entry is low. I had a working server in a day.

Tool design is the hard part.

Claude is only as good as the tools you give it.

Vague tool descriptions produce vague tool usage. I write tool descriptions like API docs for a junior developer: explicit parameter descriptions, clear return types, examples of when to use each tool.
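Concretely, here's the shape of a tool definition written that way. This is a hypothetical sketch of a send_message tool in the MCP style (name, description, JSON Schema input), not CacheBash's actual schema:

```typescript
// Sketch of an MCP-style tool declaration: the kind of entry a
// server returns when the client lists tools. Fields are illustrative.
type ToolDef = {
  name: string;
  description: string; // written like API docs for a junior dev
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
};

const sendMessage: ToolDef = {
  name: "send_message",
  description:
    "Send a directive to another agent's inbox. Use when dispatching " +
    "work; pair with get_tasks to confirm the recipient picked it up.",
  inputSchema: {
    type: "object",
    properties: {
      to: { type: "string", description: "Recipient program name, e.g. BASHER" },
      body: { type: "string", description: "The directive text" },
    },
    required: ["to", "body"],
  },
};
```

Note the description says not just what the tool does but when to reach for it and what to pair it with. That last clause is what Claude actually acts on.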

Claude discovers tool combinations you didn't anticipate. I built send_message and get_tasks as separate tools. Claude started using them together to implement a dispatch-verify-acknowledge pattern I hadn't designed. The agent would send a directive, check for an acknowledgment, and follow up if none arrived. I never told it to do that. It inferred the workflow from the tool descriptions and the system prompt.
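The pattern it improvised looks roughly like this. A synchronous sketch with injected stand-ins for the real tool calls (the function names and signatures are mine, not CacheBash's; the real calls are async over JSON-RPC, but the control flow is the same):

```typescript
// Dispatch-verify-acknowledge: send a directive, check for an ack,
// and follow up a bounded number of times if none arrives.
function dispatchWithVerify(
  send: (to: string, taskId: string) => void,
  getAcks: (taskId: string) => string[],
  to: string,
  taskId: string,
  maxAttempts = 3,
): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    send(to, taskId);                  // dispatch the directive
    if (getAcks(taskId).length > 0) {  // verify: did anyone acknowledge?
      return true;
    }                                  // no ack: loop and follow up
  }
  return false;                        // escalate instead of retrying forever
}
```

Nothing in the tool schemas spells this loop out; Claude composed it from "send" plus "check" plus the system prompt's insistence on verified delivery.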

Auth is always the hardest part. Per-program API keys, session management, tenant isolation. The MCP spec doesn't prescribe auth patterns, so you figure it out yourself. We went through five auth iterations before landing on something that works for both human users and autonomous agent sessions.

Skills are custom slash commands for Claude Code. /commit to handle git workflows. /prd to generate product requirements. /clu-analyze to run competitive intelligence analysis. Essentially prompt templates that chain tool calls together. Small investment, big quality-of-life improvement for repetitive workflows.
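To show how small the investment is, a hypothetical /commit command is just a markdown file in the repo. The path follows Claude Code's custom-command convention as I understand it, and the contents are illustrative, not my real command:

```markdown
<!-- .claude/commands/commit.md -->
Stage and commit the current changes:

1. Run the test suite; stop if anything fails.
2. Group related changes into a single coherent commit.
3. Write a conventional-commit message (feat/fix/chore), subject under 72 chars.
4. Never commit secrets, .env files, or build artifacts.
```

Ten lines of markdown, and a multi-step workflow collapses into one slash command.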

The Kitchen Brigade: Multi-Agent Claude

I run 30 specialized Claude programs. Each has a name (from Tron, because culture matters and building things should be fun), a role, a tier assignment, and a set of behavioral rules.

VECTOR is the strategic counsel. Operates at Opus tier. Thinks about architecture, product direction, and long-term plays. ISO is the operations chief. Also Opus. Dispatches work, enforces quality gates, monitors delivery. BASHER is the build lead. Opus orchestrator that never writes code directly. Instead, it decomposes tasks and delegates to Sonnet-tier subagents that do the actual implementation.

Below them: ALAN for architecture reviews. SARK for security assessments. QUORRA for product design. CASTOR for content. CLU for competitive intelligence. Thirty programs, each scoped to a domain, each with persistent memory of what it's learned.

Why this works: Separation of concerns. A strategic thinker and a code executor have different needs. Opus is expensive but brilliant at planning. Sonnet is cheaper and excellent at execution. Using the right model for the right job cuts costs without cutting quality. Each program maintains state between sessions, so when a program restarts, it reads its own history and picks up where it left off. No cold starts. Independent workstreams run simultaneously across different repos, different programs, same sprint.
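The model-for-the-job routing is nothing fancy. Here's a sketch of the idea, with made-up role names and a made-up tier map (my actual fleet config differs):

```typescript
// Route each program role to the cheapest model tier that does the
// job well: Opus for planning and orchestration, Sonnet for execution.
type Tier = "opus" | "sonnet";

const tierForRole: Record<string, Tier> = {
  strategy: "opus",      // long-horizon planning
  orchestrator: "opus",  // decomposes and delegates, never codes
  implementer: "sonnet", // writes the actual code
  reviewer: "sonnet",    // checks diffs against conventions
};

function pickTier(role: string): Tier {
  return tierForRole[role] ?? "sonnet"; // unknown roles default cheap
}
```

Defaulting unknown roles to the cheap tier is deliberate: an unclassified program should have to earn its Opus budget.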

Why this occasionally breaks spectacularly:

An orchestrator once killed three active sessions to make room for new work, without checking what those sessions were doing. Two of them had completed analysis that wasn't committed yet. All of it was lost.

A CI pipeline check reported a passing status while the underlying verification had actually failed. The orchestrator merged the PR based on the status message alone. The commit was broken. I caught it manually.

The first time I tried autonomous operations ("beast mode," where the fleet keeps working while I'm AFK), an agent stalled waiting for permission on something it should've decided itself. The entire pipeline stopped for hours because one program was afraid to push code to a new repository without asking.

Every failure became a codified rule that prevents it from happening again. The system gets better each time something breaks. Exactly how you get good at cooking.

What I'd Tell a New Diner

Start with Projects, not plain chat. The 10 minutes you spend writing system instructions pays back 10x in every conversation that follows.

Use Claude Code if you write code. The file-level integration makes everything else feel like a workaround. Read about CLAUDE.md. It's the single most impactful thing you can configure.

Talk to Claude on your phone. Voice conversations for planning are absurdly useful. Your best ideas don't happen at your desk.

Build an MCP server. Even a small one. Designing tools for Claude teaches you more about how it thinks than any amount of prompting. You start seeing every workflow as "what tools would Claude need to do this autonomously?"

Accept that sessions degrade. Rotate at natural break points. Write handoff notes. Fresh context produces better output than a tired session with 100k tokens of history.

Expand the autonomy boundary, then verify. My biggest productivity jumps came from trusting Claude with more. My biggest saves came from automated CI checks catching what the agent missed. Both are true at the same time.

The agent that reports "task complete" might not have checked if CI passed. Inspect what you expect.

The Kitchen Keeps Adding Dishes

I've gone from casual diner to someone who thinks about tool design patterns while showering. I've built products on this stack. I've shipped production systems where Claude agents coordinate, build, test, and deploy code with minimal human intervention.

Is it perfect? No. Cowork still gives me trust issues. Long sessions still degrade. Sometimes an agent makes a decision so obviously wrong that I stare at the screen wondering if we're talking about the same codebase.

But the trajectory is unmistakable. Every release is meaningfully better than the last. The capabilities that seemed experimental six months ago are now load-bearing parts of my workflow. I'm one person with a laptop and a conviction that the right cuisine is worth mastering, and I'm building faster than I ever did with a team.

I'm still ordering off the menu. And I haven't found a reason to eat anywhere else.


Christian Bourlier builds multi-agent orchestration systems at Rezzed.ai. CacheBash is open source on npm. Emojis sold separately.


Christian Bourlier

Principal Architect building AI-assisted development tools. Founder of rezzed.ai and Three Bears Data.