0. 4:47pm Friday — what an autonomous agent actually does
The CFO Slacks: "why is gross margin down 3 points this quarter?" A human analyst would open BigQuery, dig through invoices, cross-reference cost reports, check a couple of policy docs, draft a memo. About four hours of work. With an autonomous agent the same flow looks like this:
MCP: The agent connects to the company's warehouse-mcp server and the policies-mcp server at session start. Both expose their tools via one standard handshake.
Skill: The prompt "why is gross margin down" matches the trigger of a margin-investigation skill the finance team wrote. The skill's procedure (run these queries, in this order, with these guardrails) is now in the agent's working memory.
Tool: Following the skill, the agent emits a run_sql tool call to pull this-quarter vs last-quarter unit economics by SKU. The runtime validates that the query is read-only, runs it, and feeds the result back into context.
RAG: Two SKUs show abnormal cost growth. The agent searches the policies KB for any pricing-floor or supplier-contract rule that applies to those SKUs. RAG returns the relevant policy clauses with citations.
API: The agent calls POST /v1/memos on the internal API to file a draft memo with the findings, citations, and a recommended action. It then sends the CFO a Slack DM with a permalink.
Four hours of work, six seconds of agent time. None of those five steps is interchangeable. Strip out RAG and the memo is unsourced. Strip out the skill and the agent may pick a different query ordering every run. Strip out MCP and you write per-vendor integration code instead. The rest of this page is the disciplined version of that intuition.
1. TL;DR
An API is how programs talk to a system. An MCP server is a standard way to expose tools to AI assistants. A Skill is a packaged playbook the assistant pulls into its working memory when a trigger matches. RAG is the pattern of fetching relevant chunks from a search index and feeding them into a model's prompt before it answers. A Tool is the typed function the model chooses to invoke during generation. APIs and MCP are infrastructure; RAG and skills are knowledge; tools are the interface through which the model reaches all of them.
2. The 5 primitives, one card each
Each card states what the primitive is, what the model literally cannot do without it, and what becomes possible when it's present. The sequence diagrams below the grid show the actual shape of the data moving on the wire.
API: An HTTP endpoint a caller invokes to read or write a system.
MCP: A vendor-neutral JSON-RPC protocol for exposing tools to AI assistants.
Skill: A self-contained instruction bundle the assistant loads when a trigger matches.
RAG: Embed query → vector search → splice top-k chunks into the prompt → generate.
Tool: A typed function the LLM chooses to invoke during generation.
2.1 API — the protocol any program already speaks
time
│
▼
┌──── step 1: the caller knows the contract up front ─────────────────┐
│ │
│ from the docs / OpenAPI spec: │
│ POST /v1/leads │
│ headers: { Authorization: "Bearer sk_…" } │
│ body: { name, email, source } │
│ returns: 201 { id, createdAt } | 4xx { code, message } │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 2: one round-trip = one resource op ───────────────────────┐
│ │
│ ┌─────────┐ POST /v1/leads ┌───────────────────────────┐ │
│ │ client │ ─────────────────────► │ server (your stack │ │
│ │ (any │ │ or theirs): │ │
│ │ HTTP- │ │ • auth │ │
│ │ speaker│ ◄──────────────────── │ • validate │ │
│ └─────────┘ 201 { id: "lead_42" } │ • insert into DB │ │
│ │ • enqueue webhook │ │
│ └───────────────────────────┘ │
│ │
│ stateless: the server doesn't remember the previous call │
│ idempotent reads: GET /v1/leads/42 returns the same row twice │
└─────────────────────────────────────────────────────────────────────┘
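The contract in step 1 can be written down as types, so a caller can build the request and interpret both documented outcomes without guessing. A minimal TypeScript sketch, assuming only the field names shown in the diagram; the helper names, the `ApiResult` union, and any base URL passed in are illustrative assumptions, not part of a real API:

```typescript
// Request and response shapes copied from the contract in the diagram.
export interface CreateLeadBody { name: string; email: string; source: string }
export interface Lead { id: string; createdAt: string }
export interface ApiError { code: string; message: string }

// Hypothetical result type: either the created lead or a typed error.
export type ApiResult =
  | { ok: true; lead: Lead }
  | { ok: false; status: number; error: ApiError };

export interface HttpRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the request purely from the documented contract; no server state is needed.
export function buildCreateLeadRequest(
  baseUrl: string,
  apiKey: string,
  body: CreateLeadBody,
): HttpRequest {
  return {
    method: "POST",
    url: `${baseUrl}/v1/leads`,
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

// Map the raw status code and JSON text onto the two documented outcomes.
export function parseCreateLeadResponse(status: number, text: string): ApiResult {
  const json = JSON.parse(text) as Record<string, unknown>;
  if (status === 201) {
    return {
      ok: true,
      lead: { id: String(json["id"]), createdAt: String(json["createdAt"]) },
    };
  }
  return {
    ok: false,
    status,
    error: { code: String(json["code"]), message: String(json["message"]) },
  };
}
```

Keeping the request builder and the response parser free of network calls means the same two functions work with whatever HTTP client the caller already uses.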
── what an API gives an agent: the ability to actually do things in
the world. without one, the model is a chat bot.
2.2 MCP — one protocol, many servers, zero per-vendor SDK
time
│
▼
┌──── step 1: client connects to one MCP server ──────────────────────┐
│ │
│ AI client ──► initialize { │
│ protocolVersion: "2025-06-18", │
│ clientInfo: { name:"neww-agent", version:"1.4" } │
│ } │
│ AI client ◄── result { │
│ serverInfo: { name:"linear-server", version:"…" }, │
│ capabilities: { tools:{…}, resources:{…} } │
│ } │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 2: client asks "what can you do?" ─────────────────────────┐
│ │
│ AI client ──► tools/list │
│ AI client ◄── { tools: [ │
│ { name:"linear_create_issue", │
│ inputSchema: { type:"object", required:["title","team"] }}, │
│ { name:"linear_list_issues", inputSchema: { … } }, │
│ { name:"linear_update_status", inputSchema: { … } } │
│ ]} │
│ │
│ ─────── the LLM now "knows" these 3 tools exist ─────── │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 3: model decides to call one ──────────────────────────────┐
│ │
│ USER: "open a ticket for the staging deploy failure" │
│ │
│ LLM emits: │
│ { name:"linear_create_issue", │
│ args:{ title:"Staging deploy fails on apply step", │
│ team:"INFRA", priority:"high" }} │
│ │
│ runtime ──► tools/call { name, arguments } ──► MCP server │
│ │
│ MCP server runs the REAL Linear API under the hood, returns: │
│ ◄── { content:[{ type:"text", │
│ text:"Created INFRA-412" }]} │
│ │
│ LLM continues: "Filed INFRA-412 ✓" │
└─────────────────────────────────────────────────────────────────────┘
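The three exchanges above (initialize, tools/list, tools/call) can be sketched as plain JSON-RPC messages handled by a toy in-memory server. This is a simplified illustration, not the official SDK: `createToyMcpServer`, the `run` handler, and the trimmed message shapes are assumptions made for the example, and a real client would use a proper transport such as stdio or HTTP.

```typescript
// Simplified JSON-RPC 2.0 envelopes, following the method names in the diagram.
export interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}
export interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}
export interface ToolDef {
  name: string;
  inputSchema: { type: "object"; required: string[] };
  run: (args: Record<string, unknown>) => string; // hypothetical handler
}

// Toy server: answers initialize, tools/list and tools/call in memory.
export function createToyMcpServer(serverName: string, tools: ToolDef[]) {
  return function handle(req: RpcRequest): RpcResponse {
    const reply = (result: unknown): RpcResponse => ({ jsonrpc: "2.0", id: req.id, result });
    const fail = (code: number, message: string): RpcResponse => ({
      jsonrpc: "2.0",
      id: req.id,
      error: { code, message },
    });
    switch (req.method) {
      case "initialize":
        return reply({
          protocolVersion: "2025-06-18",
          serverInfo: { name: serverName },
          capabilities: { tools: {} },
        });
      case "tools/list":
        // Only the name and schema are advertised; the handler stays server-side.
        return reply({ tools: tools.map(({ name, inputSchema }) => ({ name, inputSchema })) });
      case "tools/call": {
        const toolName = String(req.params?.["name"]);
        const args = (req.params?.["arguments"] ?? {}) as Record<string, unknown>;
        const tool = tools.find((candidate) => candidate.name === toolName);
        if (tool === undefined) return fail(-32602, `Unknown tool: ${toolName}`);
        const missing = tool.inputSchema.required.filter((key) => !(key in args));
        if (missing.length > 0) return fail(-32602, `Missing arguments: ${missing.join(", ")}`);
        return reply({ content: [{ type: "text", text: tool.run(args) }] });
      }
      default:
        return fail(-32601, `Method not found: ${req.method}`);
    }
  };
}
```

Because discovery and invocation are ordinary messages, the client code stays the same when a different server is plugged in; only the advertised tool list changes.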
── what MCP gives an agent: one universal way to use any vendor's
tools without writing a per-vendor SDK. plug in 11 servers, get
11 toolkits the LLM can use immediately.
2.3 Skill — a procedure the org agreed on, made loadable
time
│
▼
┌──── step 1: a skill is a file on disk ──────────────────────────────┐
│ │
│ ~/.claude/skills/finance-rebalance/skill.md │
│ ───────────────────────────────────────────────── │
│ --- │
│ name: finance-rebalance │
│ trigger: ["rebalance", "drift > 5%"] │
│ resources: ["policies/allocation.md"] │
│ --- │
│ Procedure (for the assistant to follow): │
│ 1. read_holdings(workspaceId) │
│ 2. compare allocation vs target band; if drift < 1% stop │
│ 3. require human approval before any execute_trade │
│ 4. emit a 1-page memo citing the policy file │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 2: trigger matches → skill loaded into context ────────────┐
│ │
│ USER: "rebalance my IRA" │
│ │
│ matcher: "rebalance" → finance-rebalance ✓ │
│ runtime inlines skill.md + cited resources into the system │
│ prompt, before any user turn is processed │
│ │
│ ┌──────────────────────────────────────────────┐ │
│ │ system context now contains: │ │
│ │ • the skill body (procedure + safety rules) │ │
│ │ • policies/allocation.md (the cited file) │ │
│ └──────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 3: agent follows the procedure, not its instincts ─────────┐
│ │
│ instead of guessing what "rebalance" means, the agent uses the │
│ exact 4 steps in the skill — same procedure every time, across │
│ every user, with audit-friendly citations │
│ │
└─────────────────────────────────────────────────────────────────────┘
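Step 2 above, matching a trigger and inlining the skill into the system context, can be sketched in a few lines. This is a minimal illustration assuming simple case-insensitive substring matching; the `Skill` shape, `matchSkill`, `buildSystemPrompt`, and the injected `readResource` callback are hypothetical names, and a real runtime may match triggers differently.

```typescript
export interface Skill {
  name: string;
  triggers: string[];  // phrases from the frontmatter `trigger` list
  resources: string[]; // paths of the files the skill cites
  body: string;        // the procedure text for the assistant
}

// Return the first skill whose trigger phrase appears in the user message.
export function matchSkill(skills: Skill[], userMessage: string): Skill | undefined {
  const text = userMessage.toLowerCase();
  return skills.find((skill) =>
    skill.triggers.some((trigger) => text.includes(trigger.toLowerCase())),
  );
}

// Inline the skill body and its cited resources ahead of the user turn.
// `readResource` is injected so the sketch never touches the file system.
export function buildSystemPrompt(
  basePrompt: string,
  skill: Skill | undefined,
  readResource: (path: string) => string,
): string {
  if (skill === undefined) return basePrompt;
  const resources = skill.resources.map((path) => `## ${path}\n${readResource(path)}`);
  return [basePrompt, `# Skill: ${skill.name}`, skill.body, ...resources].join("\n\n");
}
```

Nothing in this sketch executes the procedure; it only places text in context, which is the distinction between a skill and a tool.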
── what a skill gives an agent: a fixed playbook the org has agreed
on. removes variance run-to-run. crucial for regulated workflows.
2.4 RAG — answers grounded in your own data
time
│
▼
┌──── step 1: embed the query into a vector ──────────────────────────┐
│ │
│ USER: "what's our refund policy for enterprise customers?" │
│ │
│ query ──► embedder ──► [0.0142, -0.221, 0.087, … 1533 more] │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 2: find similar chunks in the vector store ────────────────┐
│ │
│ vector ──► Qdrant.search( index="company_kb", k=4 ) │
│ │
│ returns top-4 matches with similarity score + source: │
│ 0.92 policies/refund.md#L41-L78 "...enterprise SKUs may..." │
│ 0.88 contracts/MSA-v3.md#L210-L218 "...refund window of 30..." │
│ 0.84 faq/billing.md#L12-L20 "...standard tier excl..." │
│ 0.71 blog/2024-refund-update.md "...we changed our..." │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 3: splice chunks into the prompt and generate ─────────────┐
│ │
│ final prompt to the LLM: │
│ [system: you answer using ONLY the citations below] │
│ [context: {chunk_1}, {chunk_2}, {chunk_3}, {chunk_4}] │
│ [user: what's our refund policy for enterprise customers?] │
│ │
│ LLM responds, with citations pinned to the chunks: │
│ "Enterprise SKUs allow 30 days [policies/refund.md L41], │
│ subject to MSA §4.3 [contracts/MSA-v3.md L210]." │
│ │
└─────────────────────────────────────────────────────────────────────┘
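Steps 2 and 3 above can be sketched without a vector database: score every chunk by cosine similarity, keep the top k, and splice them into the prompt. This is a toy illustration under stated assumptions: vectors are tiny hand-written arrays rather than real embeddings, the search is brute force rather than the approximate search a store like Qdrant performs, and `Chunk`, `topK`, and `buildGroundedPrompt` are hypothetical names.

```typescript
export interface Chunk { source: string; text: string; vector: number[] }
export interface Hit { score: number; chunk: Chunk }

// Cosine similarity: dot product divided by the product of the vector norms.
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every chunk, sort descending, keep the first k.
export function topK(query: number[], chunks: Chunk[], k: number): Hit[] {
  return chunks
    .map((chunk) => ({ score: cosineSimilarity(query, chunk.vector), chunk }))
    .sort((left, right) => right.score - left.score)
    .slice(0, k);
}

// Splice the retrieved chunks into the prompt, numbered so the answer can cite them.
export function buildGroundedPrompt(question: string, hits: Hit[]): string {
  const context = hits
    .map((hit, index) => `[${index + 1}] (${hit.chunk.source}) ${hit.chunk.text}`)
    .join("\n");
  return [
    "system: you answer using ONLY the citations below",
    `context:\n${context}`,
    `user: ${question}`,
  ].join("\n\n");
}
```

Carrying the source path alongside each chunk is what makes the citations in the final answer possible; the model only ever sees what was spliced in.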
── what RAG gives an agent: ground-truth answers from YOUR data,
even data the model has never seen. solves "the AI hallucinated
something that contradicts our policy."
2.5 Tool — the mechanism every other primitive reaches the model through
time
│
▼
┌──── step 1: model is shown the available tools ─────────────────────┐
│ │
│ system: │
│ "you are an agent. use the tools when helpful." │
│ tools (schema sent with every turn): │
│ [ │
│ { name:"read_holdings", args:{ workspaceId:string } }, │
│ { name:"search_kb", args:{ query:string, k:int } }, │
│ { name:"run_sql", args:{ sql:string } } │
│ ] │
│ user: "top 5 customers by revenue last quarter?" │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 2: model emits a tool call instead of an answer ───────────┐
│ │
│ LLM ──► { name:"run_sql", │
│ args:{ sql: "SELECT name, SUM(amount) AS rev │
│ FROM invoices │
│ WHERE paid_at >= '2026-01-01' │
│ AND paid_at < '2026-04-01' │
│ GROUP BY name ORDER BY rev DESC LIMIT 5"}} │
│ │
│ the model has NOT answered yet — it has asked the runtime to do │
│ the work and feed the result back in │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 3: runtime executes (sandboxed) and returns the result ────┐
│ │
│ runtime.executeTool("run_sql", {sql}) │
│ ──► validates: SELECT-only? not on auth table? row cap? │
│ ──► prisma.$queryRawUnsafe(sql) │
│ ──► tool_result = [ │
│ { name:"Acme", rev:412300 }, │
│ { name:"Globex", rev:308100 }, │
│ { name:"InitVivo", rev:285450 }, │
│ { name:"Yotsuba", rev:201020 }, │
│ { name:"Hooli", rev:177540 } │
│ ] │
│ │
└─────────────────────────────────────────────────────────────────────┘
┌──── step 4: model uses the result to answer ────────────────────────┐
│ │
│ LLM (final): "Last quarter the top 5 customers were Acme │
│ ($412K), Globex ($308K), InitVivo ($285K), Yotsuba ($201K), │
│ and Hooli ($178K)." │
│ │
└─────────────────────────────────────────────────────────────────────┘
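The validation in step 3 is the part most worth spelling out, since the SQL text comes from the model. The sketch below is deliberately naive and rests on assumptions: it uses text checks rather than a SQL parser, the `DENIED_TABLES` list and `MAX_ROWS` cap are made-up values, and `runQuery` is an injected stand-in for the real database call. A production runtime would more likely combine a parser with a read-only database role.

```typescript
export interface ToolCall { name: string; args: Record<string, unknown> }
export type Row = Record<string, unknown>;
export type ToolResult = { ok: true; rows: Row[] } | { ok: false; error: string };

const DENIED_TABLES = ["auth"]; // hypothetical deny-list
const MAX_ROWS = 1000;          // hypothetical row cap

// Naive text checks: one statement, starts with SELECT, no write keywords,
// no denied table names. Returns a reason string, or null when the query passes.
export function validateReadOnlySql(sql: string): string | null {
  const normalized = sql.trim().replace(/;\s*$/, "");
  if (normalized.includes(";")) return "multiple statements are not allowed";
  if (!/^select\b/i.test(normalized)) return "only SELECT statements are allowed";
  if (/\b(insert|update|delete|drop|alter|truncate|grant|create)\b/i.test(normalized)) {
    return "write keyword found";
  }
  for (const table of DENIED_TABLES) {
    if (new RegExp(`\\b${table}\\b`, "i").test(normalized)) {
      return `table "${table}" is not queryable`;
    }
  }
  return null;
}

// Dispatch one tool call: validate first, execute second, cap the rows last.
export function executeTool(call: ToolCall, runQuery: (sql: string) => Row[]): ToolResult {
  if (call.name !== "run_sql") return { ok: false, error: `unknown tool: ${call.name}` };
  const sql = call.args["sql"];
  if (typeof sql !== "string") return { ok: false, error: "args.sql must be a string" };
  const problem = validateReadOnlySql(sql);
  if (problem !== null) return { ok: false, error: problem };
  return { ok: true, rows: runQuery(sql).slice(0, MAX_ROWS) };
}
```

Returning the rejection reason as a tool result, rather than throwing, lets the model see why the call failed and try a corrected query on its next turn.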
── what tool-use gives an agent: the ability to ACT during generation,
not just produce text. every other primitive on this page reaches
the model through this mechanism.
3. Six real-world scenarios across industries
Same five primitives, different verticals. Each step is tagged with the primitive carrying it so you can see the composition.
“Audit this 80-page MSA for unusual indemnity clauses and flag anything that diverges from our standard.”
- 1 · Skill: contract-review skill loaded; it defines the diff procedure and citation requirements.
- 2 · RAG: embeds each section and retrieves the org's standard MSA clauses for comparison.
- 3 · Tool: clause_diff tool produces a structured before/after for every divergence.
- 4 · API: POST to Linear opens a review ticket with the diff attached.
“For each new lead this week, find their company size, funding stage, and recent product launches.”
- 1 · API: GET /v1/leads?since=7d returns the new leads from the CRM.
- 2 · MCP: brightdata MCP scrapes LinkedIn, Crunchbase, and the company blog for each lead.
- 3 · Tool: score_lead tool merges enriched fields and computes a fit score.
- 4 · API: PATCH /v1/leads/{id} writes the enrichment back to the CRM.
“Watch the staging deploy. If it fails, open a Linear ticket with the failing step and ping me.”
- 1 · API: polls GET /v1/deploys/{id}/status every 15s until 'failed' or 'success'.
- 2 · Tool: parse_logs tool extracts the failing step and the first stack trace from build output.
- 3 · MCP: linear-server.create_issue opens INFRA-### with title, step, trace, and labels.
- 4 · MCP: slack-server.dm posts to the on-call engineer with the ticket link.
“Draft a personalized reply to every negative review from last week, citing the actual feature we shipped that addresses it.”
- 1 · API: GET /v1/reviews?rating<=3&since=7d returns negative reviews.
- 2 · RAG: embeds each review and retrieves matching changelog entries from the product KB.
- 3 · Skill: review-reply skill enforces tone, length, and 'no false claims' rules.
- 4 · Tool: send_review_reply tool drafts each response and stages it for human approval.
“Reconcile yesterday's Stripe payouts against the GL. Surface anything that doesn't match.”
- 1 · MCP: stripe-mcp lists payouts; quickbooks-mcp lists GL entries.
- 2 · Tool: match_records tool diffs the two sets and groups by payout id.
- 3 · Skill: reconciliation skill enforces the 'flag don't fix' rule: the agent never adjusts the GL silently.
- 4 · API: POST /v1/reports/recon writes the report; if mismatches exist, opens a ticket.
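The diff-and-group step in the reconciliation scenario above is easy to picture as code. A minimal sketch, assuming amounts are integer cents and that each GL entry carries the payout id it belongs to; `Payout`, `GlEntry`, `Mismatch`, and this `matchRecords` signature are hypothetical, not the shape of any real Stripe or QuickBooks object:

```typescript
export interface Payout { id: string; amountCents: number }
export interface GlEntry { payoutId: string; amountCents: number }
export interface Mismatch {
  payoutId: string;
  reason: "missing_in_gl" | "missing_in_stripe" | "amount_differs";
  stripeCents?: number;
  glCents?: number;
}

// Group GL entries by payout id, then compare totals in both directions.
// The function only reports differences; it never modifies either side.
export function matchRecords(payouts: Payout[], entries: GlEntry[]): Mismatch[] {
  const glTotals = new Map<string, number>();
  for (const entry of entries) {
    glTotals.set(entry.payoutId, (glTotals.get(entry.payoutId) ?? 0) + entry.amountCents);
  }

  const mismatches: Mismatch[] = [];
  const seenPayouts = new Set<string>();
  for (const payout of payouts) {
    seenPayouts.add(payout.id);
    const glCents = glTotals.get(payout.id);
    if (glCents === undefined) {
      mismatches.push({ payoutId: payout.id, reason: "missing_in_gl", stripeCents: payout.amountCents });
    } else if (glCents !== payout.amountCents) {
      mismatches.push({ payoutId: payout.id, reason: "amount_differs", stripeCents: payout.amountCents, glCents });
    }
  }

  // Anything left in the GL without a matching payout is flagged as well.
  glTotals.forEach((glCents, payoutId) => {
    if (!seenPayouts.has(payoutId)) {
      mismatches.push({ payoutId, reason: "missing_in_stripe", glCents });
    }
  });
  return mismatches;
}
```

Returning a list of flagged differences and nothing else mirrors the 'flag don't fix' rule: the decision about how to correct the ledger stays with a human.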
“why is gross margin down 3 points this quarter?”
- 1 · Skill: margin-investigation skill loaded; it defines the queries and the memo format.
- 2 · Tool: run_sql tool runs the cohort-by-SKU cost decomposition.
- 3 · RAG: retrieves applicable pricing-floor and supplier-contract clauses.
- 4 · API: POST /v1/memos files the draft memo; Slack DM with the link.
4. Things people get wrong
MCP is just APIs with a new name.
APIs are endpoints defined by the system that owns the data. MCP is a protocol that wraps any system so an AI client can discover its tools, call them, and stream results — using one handshake regardless of vendor. The MCP server usually calls APIs under the hood; that's the wrapping, not the equivalence.
RAG is just a database.
A database stores rows; RAG is the pattern of using a vector index (or hybrid search) to fetch only the relevant fragments and inject them into the prompt before generation. The store is one ingredient; retrieval + splicing + grounded generation is the recipe.
A skill is the same as a tool.
A tool executes something — runs SQL, sends an email. A skill tells the assistant how — it's text loaded into context. Skills don't run; they instruct. Most useful skills tell the agent which tools to use in which order.
If we have an API, we have an AI-ready surface.
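APIs were designed for programs, not for LLMs. To be agent-ready a surface needs typed schemas, predictable error shapes, idempotency, and a discovery mechanism. MCP supplies typed schemas, a discovery call, and a common result envelope on top of your API; tool definitions supply the schemas inside your agent runtime. Idempotency still has to be designed into the underlying operations.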
An agent is just an LLM with a system prompt.
A chatbot is an LLM with a system prompt. An agent is an LLM in a loop that can pause mid-turn, call tools, observe results, and decide its next step. The loop is the agent; the prompt is just the starting condition.
RAG and fine-tuning solve the same problem.
Use RAG for facts that change (this week's policies, yesterday's tickets, this customer's history). Use fine-tuning for behaviour that should be the same across all users (tone, format, hard skills the base model lacks). Many production systems end up using both.
5. Which primitive do I need? — a decision flow
┌─── "I want to add a new capability to the agent." ──────────────────┐
│                                                                     │
│                             start here                              │
│                                  │                                  │
│   ┌──────────────────────────────┴─────────────────────────────┐    │
│   │          Is the capability a recurring procedure           │    │
│   │    that humans should agree on once and re-use forever?    │    │
│   └──────────────────────────────┬─────────────────────────────┘    │
│                                  │                                  │
│                  yes ◄───────────┴───────────► no                   │
│                   │                            │                    │
│                   ▼                            │                    │
│              add a SKILL                       │                    │
│            (file on disk)                      ▼                    │
│                                   ┌────────────────────────────┐    │
│                                   │ Will OTHER AI clients      │    │
│                                   │ (Cursor, Claude Code, …)   │    │
│                                   │ also want to reach this    │    │
│                                   │ capability?                │    │
│                                   └────────────┬───────────────┘    │
│                                                │                    │
│                                 yes ◄──────────┴──────────► no      │
│                                  │                          │       │
│                                  ▼                          ▼       │
│                          expose it as MCP          add it as a TOOL │
│                          (one server, many         (typed function  │
│                          clients reuse it)         your agent uses) │
│                                                                     │
│ ─────── if the capability READS data the model didn't see in        │
│         training ── always pair it with RAG.                        │
│                                                                     │
│ ─────── if the capability is "talk to a system the company          │
│         already exposes" ── that system already has an API; you     │
│         are just wrapping it.                                       │
└─────────────────────────────────────────────────────────────────────┘
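The decision flow reduces to two yes/no questions plus two side notes, which makes it straightforward to encode. A small sketch, where the `Capability` fields and the `choosePrimitive` name are assumptions introduced for illustration:

```typescript
export interface Capability {
  recurringProcedure: boolean;  // should humans agree on it once and reuse it?
  sharedAcrossClients: boolean; // will other AI clients also want to reach it?
  readsUnseenData: boolean;     // does it read data the model did not see in training?
  wrapsExistingSystem: boolean; // does it talk to a system the company already exposes?
}

export type Primitive = "Skill" | "MCP" | "Tool";

export interface Recommendation {
  primary: Primitive;
  pairWithRag: boolean; // first side note in the flow
  wrapsApi: boolean;    // second side note in the flow
}

// Walk the two questions in order; the side notes apply regardless of the branch taken.
export function choosePrimitive(capability: Capability): Recommendation {
  let primary: Primitive;
  if (capability.recurringProcedure) {
    primary = "Skill";
  } else if (capability.sharedAcrossClients) {
    primary = "MCP";
  } else {
    primary = "Tool";
  }
  return {
    primary,
    pairWithRag: capability.readsUnseenData,
    wrapsApi: capability.wrapsExistingSystem,
  };
}
```

For example, a one-off lookup against an internal system that only this agent uses comes out as a Tool that wraps an existing API.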
6. Side-by-side
| Aspect | API | MCP | Skill | RAG | Tool |
|---|---|---|---|---|---|
| What it is | HTTP endpoint | Tool-exposure protocol | Markdown playbook | Retrieve-then-generate pattern | Typed function the LLM can call |
| Who calls it | Any program with creds | Any MCP-aware AI client | The assistant on trigger | Your app code / agent runtime | The LLM during generation |
| Stateful? | No (REST norm) | Yes (session per connection) | No (content only) | No (per-query) | Per-call |
| Auth model | API key / OAuth / session | None / token / OAuth | None (it's a file) | Inherits vector store auth | Inherits underlying surface |
| Latency profile | 100 ms – 5 s | 50 ms – 30 s (stdio cold start) | Free (in-context) | ~200 ms embed + ANN | Whatever it wraps |
| Discovery | Docs / OpenAPI | tools/list JSON-RPC | Frontmatter trigger | N/A | Schema declared at call time |
| Versioning | URL or header | protocolVersion in handshake | File version / git | Index version | Argument schema |
| Composes with | Anything | Tools, RAG, other MCP | Tools, RAG, MCP | Tools, MCP, APIs | RAG, MCP, APIs |
| neww.ai surface | apps/web/src/app/api/v1/* | 11 user-scope servers | Roadmap (not yet 1st-class) | Qdrant + lib/web-data/ | lib/agent/tools/index.ts |
7. The recipe for an autonomous agent
An autonomous agent, one that can take a goal and pursue it without a human in the inner loop, needs a control loop plus all five primitives, each doing the specific job nothing else can do:
- 1 · loop: Wrap an LLM in a planner that can call tools, observe results, and decide whether to continue or stop. This is the agent.
- 2 · Skills: Give the agent the org's playbooks for recurring tasks so the procedure is the same on every run. Removes variance.
- 3 · MCP: Wire it to every external system through one protocol. The agent picks up new toolkits with zero new SDK code.
- 4 · RAG: Connect it to the org's living knowledge: policies, tickets, docs, code, recent decisions. Grounds every claim.
- 5 · Tools: Wrap the primitives above in typed functions with input validation, output schemas, and sandboxing. This is the interface the model actually uses.
- 6 · APIs: Behind each tool, the real side effect: sending the email, filing the ticket, placing the trade, running the SQL.
Drop any one of those layers and you regress: drop the loop and it's a chatbot; drop skills and you get inconsistency; drop MCP and you write SDKs forever; drop RAG and you hallucinate; drop tools and the model can't act; drop APIs and there's nothing real to act on.
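The loop in layer 1 is small enough to sketch in full. The version below is a simplified illustration under stated assumptions: the model is an injected function rather than a real LLM call, everything is synchronous for brevity, and `runAgent`, `ModelTurn`, and the `maxSteps` budget are hypothetical names rather than the API of any particular framework.

```typescript
export type ModelTurn =
  | { type: "tool_call"; name: string; args: Record<string, unknown> }
  | { type: "final"; text: string };

export interface Message { role: "user" | "assistant" | "tool"; content: string }

export type Model = (history: Message[]) => ModelTurn;
export type ToolFn = (args: Record<string, unknown>) => string;

// Call the model, run any tool it asks for, feed the observation back,
// and repeat until it produces a final answer or the step budget runs out.
export function runAgent(
  goal: string,
  model: Model,
  tools: Map<string, ToolFn>,
  maxSteps = 8,
): { answer: string; history: Message[] } {
  const history: Message[] = [{ role: "user", content: goal }];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history);
    if (turn.type === "final") {
      history.push({ role: "assistant", content: turn.text });
      return { answer: turn.text, history };
    }
    const tool = tools.get(turn.name);
    const observation =
      tool === undefined ? `error: unknown tool ${turn.name}` : tool(turn.args);
    // Record both the request and the observation so the next turn can see them.
    history.push({
      role: "assistant",
      content: JSON.stringify({ name: turn.name, args: turn.args }),
    });
    history.push({ role: "tool", content: observation });
  }
  return { answer: "stopped: step budget exhausted", history };
}
```

The step budget is the one guard included here; without some stopping condition, a model that keeps requesting tools would never return.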
8. In the neww.ai codebase
Apart from Skill, which is still on the roadmap, every primitive on this page maps to a real file or route you can read today.
| Primitive | neww.ai surface |
|---|---|
| API | apps/web/src/app/api/v1/*; apps/web/src/app/api/agent/master/dispatch/route.ts |
| MCP | 11 user-scope servers in ~/.claude.json; apps/web/src/app/api/mcp/route.ts (outbound); allsystemsmvp/tests/testmcps.py (connectivity probe) |
| Skill | Roadmap: Claude Code skills are used by builders today; a platform skill layer is planned at lib/agent/skills/. |
| RAG | apps/web/src/lib/web-data/fabric.ts; apps/web/src/lib/web-data/router.ts; apps/web/src/lib/web-data/connectors/* (40 connectors); Qdrant + Meilisearch hybrid arm |
| Tool | apps/web/src/lib/agent/tools/index.ts; apps/web/src/lib/agent/tools/data.ts; apps/web/src/lib/agent/tools/security.ts; wired into the model via apps/web/src/lib/ai/orchestrator/tool-loop.ts |
| Routing | apps/web/src/lib/ai-router.ts (provider selection, retry, budget enforcement, cross-provider fallback) |
9. Further reading
- Model Context Protocol specification — the open standard behind MCP servers.
- Anthropic tool-use guide — reference for how Claude invokes tools.
- neww.ai Help Center — quickstarts, connector setup, and packs.