The Three Levels of AI Usage
Most people are stuck on Level 1. Premium members operate at Level 2. This course is about Level 3 — and why it changes everything.
Level 1: Chat (Basic)
You ask a question, you get an answer. One prompt, one response. This is how 95% of people use AI. It is useful but limited — the AI has no memory, no tools, and no ability to act on your behalf.
Level 2: Workflows (Premium)
You chain multiple steps together, using techniques such as reverse prompting, tool integration, and multi-model approaches. You are orchestrating AI across tasks. But YOU are still the conductor. Every step requires your input.
Level 3: Agentic (VIP)
The AI operates autonomously. You define the goal, the constraints, and the tools — then it runs. It makes decisions, handles errors, calls other agents, and delivers results. You review and approve. This is where AI becomes a workforce, not a tool.
What we actually run: 31 autonomous agents across 5 companies. A scout agent researches market opportunities. An analyst scores them. A builder creates the product. A QC agent verifies quality. A hustler lists and promotes it. All coordinated by a gateway daemon. Zero human intervention between "find an opportunity" and "product listed for sale."
Anatomy of an Agent
An AI agent is not magic. It is a loop with four components:
while task_not_complete:
    1. OBSERVE: Read the current state (files, data, APIs, messages)
    2. THINK: Decide what to do next (the LLM reasoning step)
    3. ACT: Execute an action (write code, call API, send message)
    4. EVALUATE: Check if the action succeeded, adjust if not
That is it. Every agent framework (LangChain, CrewAI, AutoGen, Claude Code, or custom) implements some version of this loop. The differences are in what tools the agent can use, how it handles errors, and how multiple agents coordinate.
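To make the four components concrete, here is a minimal Python sketch of the loop. It is a toy under stated assumptions, not how any particular framework implements it: the "task" is just moving a counter to a goal value, and `simple_policy` is a hypothetical stand-in for the LLM reasoning step. The names `AgentState`, `run_agent`, and `simple_policy` are invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    """Everything the loop needs to remember between iterations."""
    goal: int
    value: int = 0
    history: list[str] = field(default_factory=list)


def run_agent(
    state: AgentState,
    think: Callable[[int, int], str],
    max_steps: int = 20,
) -> AgentState:
    """Run observe -> think -> act -> evaluate until done or out of steps."""
    for _ in range(max_steps):
        # 1. OBSERVE: read the current state.
        observed = state.value
        if observed == state.goal:
            break  # task complete

        # 2. THINK: decide the next action (an LLM call in a real agent).
        action = think(observed, state.goal)

        # 3. ACT: execute the chosen action.
        if action == "increment":
            state.value += 1
        elif action == "decrement":
            state.value -= 1
        else:
            state.history.append(f"unknown action: {action}")
            continue

        # 4. EVALUATE: keep the change only if it moved us closer to the goal.
        if abs(state.goal - state.value) < abs(state.goal - observed):
            state.history.append(action)
        else:
            state.value = observed
            state.history.append(f"reverted {action}")
    return state


def simple_policy(current: int, goal: int) -> str:
    """Hypothetical stand-in for the LLM reasoning step."""
    return "increment" if current < goal else "decrement"
```

Calling `run_agent(AgentState(goal=3), simple_policy)` walks the counter up to the goal and records each accepted action in `history`. The `max_steps` cap is a deliberate design choice: an agent whose THINK step keeps proposing useless actions should stop on its own instead of looping forever.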
Your First Agent: File Organizer
Let us build something real. This agent watches a folder, classifies files, and moves them to the right place. We will use Claude Code (CLI) because it is the simplest agent runtime available.
Step 1: Define the Task
claude -p "Look at every file in ~/Downloads that was added today.
For each file:
1. Determine what type of document it is (invoice, receipt, photo,
document, code, media, other)
2. Create a subfolder in ~/Documents/Sorted/ for that type if it
doesn't exist
3. Move the file there
4. Give me a summary of what you moved and where"
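For comparison, the same four steps can be sketched as a fixed Python script. This is a deterministic baseline, not what the agent does internally: it classifies by file extension using a hypothetical `CATEGORIES` table, and it treats "added today" as "modified today", which is only an approximation. It is useful mainly as a way to see what the agent adds, namely judgment about files the table cannot classify.

```python
import shutil
from datetime import date, datetime
from pathlib import Path

# Hypothetical extension-to-category map; the agent in the prompt above
# reasons about each file instead of using a fixed table.
CATEGORIES = {
    ".jpg": "photo", ".jpeg": "photo", ".png": "photo",
    ".pdf": "document", ".docx": "document", ".txt": "document",
    ".py": "code", ".js": "code", ".sh": "code",
    ".mp3": "media", ".mp4": "media", ".mov": "media",
}


def classify(path: Path) -> str:
    """Return a category for the file, falling back to 'other'."""
    return CATEGORIES.get(path.suffix.lower(), "other")


def sort_todays_files(source: Path, dest_root: Path) -> list[tuple[str, str]]:
    """Move files modified today from source into dest_root/<category>/."""
    moved = []
    today = date.today()
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue
        # Assumption: modification time approximates "added today".
        modified = datetime.fromtimestamp(path.stat().st_mtime).date()
        if modified != today:
            continue
        target_dir = dest_root / classify(path)
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
        moved.append((path.name, target_dir.name))
    return moved
```

The returned list of `(filename, category)` pairs plays the role of the summary requested in step 4 of the prompt.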
Step 2: Make It Recurring
Wrap it in a cron job and it runs every evening:
# crontab -e
0 20 * * * claude -p "Sort today's downloads..." --output-format json >> ~/logs/sort.log 2>&1
Step 3: Add Intelligence
Now make it smarter — have it read the contents of documents to classify them, extract key information, and log it to a spreadsheet:
claude -p "For each PDF in ~/Downloads from today:
1. Read the content
2. Classify: invoice, contract, receipt, letter, or other
3. Extract: date, amount (if financial), sender/company
4. Move to ~/Documents/Sorted/{type}/
5. Append a row to ~/Documents/file-log.csv with:
filename, type, date, amount, sender, original_path
Give me the final CSV contents when done."
Build This Today
If you have Claude Code installed, run the Step 1 command right now. Watch it examine your Downloads folder, reason about each file, and organize them. This is not a hypothetical: do it and see an agent work in real time.
Then set up the cron job from Step 2. Congratulations — you now have an autonomous agent running on your machine.
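If you would rather have cron call a script than a long one-liner, a small Python wrapper is one option. The sketch below is an assumption about how you might structure it, not part of Claude Code: it reuses only the `-p` and `--output-format json` flags from Step 2, and the names `build_command` and `run_and_log` are invented here. It also creates the log directory and records failures such as a missing binary, which is a common problem because cron runs with a minimal PATH.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import Callable


def build_command(prompt: str) -> list[str]:
    """Build the CLI call from Step 2: print mode with JSON output."""
    return ["claude", "-p", prompt, "--output-format", "json"]


def run_and_log(
    prompt: str,
    log_path: Path,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout_seconds: int = 600,
) -> int:
    """Run the agent once, append a timestamped record, return the exit code."""
    started = datetime.now().isoformat(timespec="seconds")
    try:
        result = runner(
            build_command(prompt),
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
        code = result.returncode
        output = result.stdout or result.stderr or ""
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # A missing binary or a hung run is logged instead of crashing silently.
        code, output = 1, f"{type(exc).__name__}: {exc}"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        log.write(f"[{started}] exit={code}\n{output.strip()}\n")
    return code
```

The `runner` parameter exists so the function can be exercised with a fake in place of `subprocess.run`, which lets you check the logging behavior without spending API calls.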
Multi-Agent Architecture
Single agents are useful. Multiple agents coordinating are transformative. Here is how we structure our production system:
The Hub-and-Spoke Model
                ┌─────────────┐
                │ STRATEGIST  │ ← Sets priorities, allocates work
                │  (Opus 4)   │
                └──────┬──────┘
                       │
       ┌───────────────┼───────────────┐
       │               │               │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│    SCOUT    │ │   BUILDER   │ │   HUSTLER   │
│ (Llama70B)  │ │ (Llama70B)  │ │  (GPT-5.4)  │
│  Research   │ │   Create    │ │    Sell     │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
       │               │               │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│   ANALYST   │ │   HERMES    │ │     OPS     │
│ (Sonnet 4)  │ │ (QC Agent)  │ │ (Llama70B)  │
│    Score    │ │   Verify    │ │   Monitor   │
└─────────────┘ └─────────────┘ └─────────────┘
Key design principles:
- Separation of concerns: Each agent has ONE job. Scout finds, analyst scores, builder creates, QC verifies. No agent does everything.
- Model matching: Use expensive models (Opus, GPT-5.4) for tasks requiring deep reasoning and cheap or local models (Llama 70B) for high-volume tasks. Prefer free local models wherever they are good enough.
- Trust boundaries: Agents that spend money or take public actions require approval. Research agents run freely.
- Heartbeat supervision: Every agent has a check-in interval. If it goes silent, the ops agent investigates and restarts or escalates.
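The principles above can be sketched as a small agent registry in Python. This is an illustration under assumptions, not the production gateway: the agent names echo the diagram, but the model labels, heartbeat intervals, and the helper names `may_act` and `silent_agents` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    job: str                # separation of concerns: one job per agent
    model: str              # model matching: cheap vs. expensive
    needs_approval: bool    # trust boundary: spends money or acts publicly
    heartbeat_seconds: int  # maximum silence before ops steps in


# Illustrative values only; the model labels and intervals are invented.
AGENTS = [
    AgentSpec("scout", "research", "llama-70b", False, 300),
    AgentSpec("analyst", "score", "sonnet", False, 300),
    AgentSpec("hustler", "sell", "gpt", True, 600),
]


def may_act(spec: AgentSpec, approved: bool) -> bool:
    """Agents behind a trust boundary act only after a human approves."""
    return approved or not spec.needs_approval


def silent_agents(
    specs: list[AgentSpec], last_seen: dict[str, float], now: float
) -> list[str]:
    """Return agents whose last check-in is older than their interval."""
    stale = []
    for spec in specs:
        seen = last_seen.get(spec.name)
        # An agent that has never checked in counts as silent.
        if seen is None or now - seen > spec.heartbeat_seconds:
            stale.append(spec.name)
    return stale
```

In this sketch the ops agent would call `silent_agents` on a timer and restart or escalate whatever it returns, while the gateway would check `may_act` before letting an agent like the hustler publish or spend.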
Hardware Considerations
You do not need a data center. Here is what different budgets get you:
Low-cost — Cloud Only
- A consumer ChatGPT or Claude subscription as your primary agent
- Free tier APIs for supporting tasks
- GitHub Actions for scheduled agent runs
- Good for 1-3 agents on simple tasks
$500-1500 — Local + Cloud Hybrid
- Mac Mini M4 (16GB) — runs small local models for free
- Ollama for local inference (Llama 3.2, Phi, Mistral)
- Cloud APIs for complex reasoning tasks only
- Good for 5-10 agents, mixed workloads
$3000-5000 — Serious Local Compute
- Mac Studio M4 Max (128GB) — runs 70B parameter models locally
- Llama 3.3 70B runs entirely on-device, with zero API costs for most tasks
- Cloud APIs only for frontier models (Opus, GPT-5)
- Good for 20-30+ agents, production workloads
Understanding VRAM and Model Size
The single most important hardware spec for local AI is unified memory (Apple Silicon) or VRAM (GPU). Here is the rough math:
Model Parameters × Quantization Bytes = VRAM Required
(The figures below run roughly 15-20% above the raw product, which leaves room for context and runtime overhead.)
7B model × Q4 (0.5 bytes/param) = ~4GB (runs on phones)
13B model × Q4 = ~8GB (runs on a laptop)
30B model × Q4 = ~17GB (needs a 24GB+ GPU)
70B model × Q4 = ~42GB (needs 48GB+ or a Mac Studio)
405B model × Q4 = ~230GB (needs serious iron)
Apple Silicon is uniquely good for this because CPU and GPU share the same memory pool. A 128GB Mac Studio can load a 70B model AND still have 30+ GB for running other agents, apps, and tasks simultaneously.
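The rough math above is easy to turn into a quick calculator. The Python sketch below is an estimate only: the 0.5 bytes per parameter for Q4 comes from the table, the Q8 and FP16 sizes follow from their bit widths, and the 20% overhead default is an assumption chosen to land near the table's figures. Real usage varies with context length and runtime.

```python
# Bytes per parameter for common precisions (Q4 value taken from the table).
BYTES_PER_PARAM = {"Q4": 0.5, "Q8": 1.0, "FP16": 2.0}


def weights_gb(params_billions: float, quant: str = "Q4") -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[quant]


def estimated_gb(
    params_billions: float, quant: str = "Q4", overhead: float = 0.2
) -> float:
    """Weights plus an assumed fractional overhead for context and runtime."""
    return weights_gb(params_billions, quant) * (1 + overhead)


def fits(
    params_billions: float,
    memory_gb: float,
    quant: str = "Q4",
    overhead: float = 0.2,
) -> bool:
    """True if the estimated footprint fits in the given memory."""
    return estimated_gb(params_billions, quant, overhead) <= memory_gb
```

For example, `fits(70, 48)` checks the 70B row against a 48GB budget, and `fits(70, 24)` shows why a 24GB GPU is not enough for the same model at Q4.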
Emerging Tech to Watch
- MCP (Model Context Protocol): Anthropic's open standard for connecting AI to tools and data sources. Every major model provider is adopting it. Learn it now.
- Claude Code / Codex: CLI-native agent runtimes that give AI direct access to your filesystem, terminal, and APIs. This is how agents actually DO things, not just talk about them.
- Local model quantization advances: Formats and methods such as GGUF, GPTQ, and AWQ keep making larger models fit on smaller hardware with less quality loss.
- Agent-to-agent protocols: Standards for agents to discover, negotiate with, and delegate to other agents. Still early but moving fast.
- Open source agent frameworks: LangGraph, CrewAI, AutoGen, DSPy — each with different strengths. We will cover which to use when in upcoming sessions.
This week's live stream: We are going to walk through our actual production gateway — how 31 agents are configured, how they communicate, how we handle failures, and how the whole system costs less than a Netflix subscription in metered API fees. Bring questions.
Your Homework
- Build the file organizer agent from the exercise above
- Set it up on a cron schedule so it runs daily without you
- Identify ONE repetitive task in your business that follows a predictable pattern — that is your next agent candidate
- In our next 1-on-1, we will design that agent together and get it running