AgentNavaKit lets you describe an AI agent in a single TypeScript object and run it on a managed runtime. The runtime executes the agent loop, handles tool calls, and streams typed events back — render them in any UI you like.
Two paths. The fast one hands a single prompt to your coding agent and ships a working agent in one shot. The manual one is five steps.
One prompt → working agent
Paste this into Claude Code, Codex, or Cursor. It detects your package manager, installs AgentNavaKit, scaffolds a real agent, runs it, and curl-tests it.
Set up AgentNavaKit (`@agentnava/kit`) in this project end-to-end. Agents always run on the AgentNava backend — never locally. The lifecycle is: configure (upload spec + metadata) → test (ephemeral instance) → deploy (versioned production).
1. Detect the package manager from the lockfile (bun.lockb → bun, pnpm-lock.yaml → pnpm, yarn.lock → yarn, else npm). Use it consistently.
2. Install `@agentnava/kit` and `valibot`.
3. Check that the CLI is authenticated to my workspace by running `agentnava-kit whoami`. If it errors, stop and tell me to run `agentnava-kit login` (or set `AGENTNAVAKIT_API_KEY` if running in CI). Don't proceed until I confirm.
4. Create `agents/hello/role.md`:
- A single H1 "# What this agent does"
- One short paragraph describing a concise, helpful generalist that calls users "you"
5. Create `agents/hello.ts` calling `defineAgent({...})`:
- name: 'Hello'
- modelClass: 'standard'
- spec: ['./hello/role.md'] // relative path; CLI resolves at configure time
- triggers: [{ kind: 'chat' }]
6. Add package.json scripts:
- "agent:configure": "agentnava-kit configure agents/hello.ts"
- "agent:test": "agentnava-kit test agents/hello.ts"
- "agent:deploy": "agentnava-kit deploy agents/hello.ts"
7. Run `agent:configure` to upload the spec + metadata. Print whatever the CLI returns.
8. Run `agent:test`. The CLI returns a URL of the form `https://<temp-id>.test.agents.agentnava.com`. Wait for it.
9. Curl-test the chat endpoint at that URL with a sample message; show the streaming SSE output.
10. Print a one-paragraph summary of what was installed, the test URL, and what to try next. Do NOT run `agent:deploy` — that's production and the user does it explicitly.
Reference: https://docs.agentnava.com
Don't modify unrelated files. Ask before making any ambiguous choice.
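Step 1 of the prompt above maps lockfiles to package managers in a fixed priority order. As a rough sketch of that rule only (the helper name `detectPackageManager` is hypothetical and not part of AgentNavaKit), it could look like this:

```typescript
// Hypothetical helper, not an AgentNavaKit API: mirrors the lockfile
// priority listed in step 1 of the prompt (bun, then pnpm, then yarn, else npm).
type PackageManager = 'bun' | 'pnpm' | 'yarn' | 'npm';

const LOCKFILES: ReadonlyArray<[string, PackageManager]> = [
  ['bun.lockb', 'bun'],
  ['pnpm-lock.yaml', 'pnpm'],
  ['yarn.lock', 'yarn'],
];

function detectPackageManager(filesInProjectRoot: readonly string[]): PackageManager {
  for (const [lockfile, manager] of LOCKFILES) {
    // The first matching lockfile wins, in the order listed above.
    if (filesInProjectRoot.indexOf(lockfile) !== -1) return manager;
  }
  return 'npm'; // fallback when no known lockfile is present
}
```

Passing the file names from the project root, for example `detectPackageManager(['package.json', 'pnpm-lock.yaml'])`, selects pnpm; a project with no lockfile falls back to npm.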
Or do it by hand
1. Install the package
npm install @agentnava/kit valibot
# or: bun add @agentnava/kit valibot
AgentNavaKit ships TypeScript source. No build step. valibot provides input schemas for custom tools.
2. Connect your workspace
agentnava-kit login
# opens https://agentnava.com/cli/auth in your browser.
# pick a workspace; the CLI saves a token to ~/.agentnavakit/credentials.
Every command after this targets that workspace. To switch workspaces later: agentnava-kit login --workspace <slug>. For CI or scripts, set the env var instead:
export AGENTNAVAKIT_API_KEY=sk_live_…
# generate one at agentnava.com → Settings → API keys
3. Write the behavior spec
// agents/hello/role.md
# What this agent does
You are a concise, helpful generalist. Answer in two paragraphs max unless the user asks for detail. Always address the user as "you".
The Spec is one or more .md files. Their concatenated body is exactly what the runtime sends to the model on every turn — write what you want the agent to actually do.
Sends the spec files + the TS metadata to the backend as a configuration draft. Re-run after any change.
6. Test
agentnava-kit test agents/hello.ts
# ✓ test instance ready at:
# https://<temp-id>.test.agents.agentnava.com
# streaming events. ctrl-c to stop.
Spins up an ephemeral instance on the backend from the current configuration. Hit it from curl, your own UI, or share the URL with a teammate. No local server runs on your machine.
The response is an SSE stream of typed AgentEvents. Any UI that reads SSE can render the reply. See Sessions for the full chat protocol, including attachments and multi-turn history.
8. Deploy to production
agentnava-kit deploy agents/hello.ts
# ✓ deployed v1 — promoted to production
# https://hello.agents.agentnava.com
# embed snippet: <script src="..."></script>
Stamps the current configuration as a version and promotes it to the stable production URL. See Operate for custom domains, rollbacks, and BYOK.
Spec
An agent has two parts. The Spec — one or more .md files that describe what the agent does, how it talks, and what it won't do. The wiring — a TypeScript object that hooks the Spec up to a model, triggers, integrations, knowledge, and tools.
Why multiple .md files
Behavior is rarely one paragraph. Splitting the Spec into multiple files keeps each one focused, reviewable, and reusable:
By concern — separate what the agent does from how it talks from what it won't do. Changing tone shouldn't require re-reading the role file.
By owner — your legal team owns off-limits.md; product owns role.md; design owns tone.md. Each PR's diff lives where the reviewer expects it.
By reusability — one shared/legal-off-limits.md file can be referenced by every agent in your workspace. Change it once; every agent picks it up on next deploy.
The runtime concatenates the files in the order you list them — that becomes the agent's instructions.
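To make the ordering rule concrete, here is a minimal sketch of concatenation in listed order. The function name `buildInstructions` and the blank-line separator are assumptions for illustration; the text above does not say how the runtime joins the files.

```typescript
// Illustrative only: joins spec file bodies in the order they are listed.
// Assumption: a blank line separates files; the real separator is not documented here.
function buildInstructions(
  specPaths: readonly string[],
  fileBodies: ReadonlyMap<string, string>,
): string {
  return specPaths
    .map((path) => {
      const body = fileBodies.get(path);
      if (body === undefined) throw new Error(`Missing spec file: ${path}`);
      return body.trim();
    })
    .join('\n\n');
}
```

Because the list order is the output order, moving `off-limits.md` ahead of `tone.md` in `spec` would change where those rules appear in the instructions.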
The agent lifecycle
An agent moves through three stages. Each is one CLI command — they're all the same operation under the hood, just with different persistence and versioning rules.
| Stage | Command | What it does |
| --- | --- | --- |
| Configure | agentnava-kit configure | Sends your spec files + the TS metadata (name, model, triggers, connections, sources, tools) to the backend as a draft. No public URL yet. Re-run any time you change anything. |
| Test | agentnava-kit test | Configures (if needed), then spins up an ephemeral test instance. Returns a test URL you can hit from your own UI or from curl. Streams events back to your terminal. The instance disappears when you stop the command. |
| Deploy | agentnava-kit deploy | Same as test, but stamps the configuration as a new version, persists it on the backend, and promotes it to the production URL. The previous version stays alive until the new one passes a health check. |
Test and deploy do the same thing — run your agent on the backend with the current configuration. The only difference: deploy persists + bumps the version, test doesn't.
The CLI reads the TS, resolves the spec paths, uploads every file's content, and posts the full configuration (spec + metadata) to the backend in one request. Idempotent — files with unchanged content are skipped.
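The "unchanged content is skipped" behavior can be pictured as a fingerprint comparison. The sketch below is not the CLI's actual implementation. It assumes the backend can report a fingerprint per path, and it uses a simple FNV-1a-style hash over UTF-16 code units as a stand-in for whatever the CLI really uses.

```typescript
// Stand-in fingerprint: FNV-1a-style 32-bit hash over UTF-16 code units.
// Assumption: the real CLI may use a different hashing scheme entirely.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Returns the paths whose local content differs from what was last uploaded.
function filesToUpload(
  local: ReadonlyMap<string, string>,
  remoteFingerprints: ReadonlyMap<string, string>,
): string[] {
  const changed: string[] = [];
  local.forEach((body, path) => {
    // New files have no remote fingerprint, so they always count as changed.
    if (remoteFingerprints.get(path) !== fingerprint(body)) changed.push(path);
  });
  return changed;
}
```

Running the same configure twice with no edits yields an empty list under this model, which is what makes the operation idempotent.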
3. Test
agentnava-kit test agents/concierge.ts
# ✓ test instance ready at:
# https://<temp-id>.test.agents.agentnava.com
# streaming events. ctrl-c to stop.
Test creates a live, ephemeral instance from the current configuration. Point your own UI at it, run curl against it, share the URL with a teammate for review. Re-run configure + test as you iterate. The instance is torn down when you stop the command.
4. Deploy
agentnava-kit deploy agents/concierge.ts
# ✓ deployed v4 — promoted to production
# https://bayhomes-concierge.agents.agentnava.com
# previous version v3 kept warm during health check
Deploy stamps the current configuration as a new version and replaces the live agent. The version number auto-increments. To roll back: agentnava-kit rollback bayhomes-concierge --to v3.
What the .md files look like
concierge/role.md
# What this agent does
Answers questions from prospective Bay Area buyers about BayHomes listings, school districts, and visit times.
It calls users "you" and always cites a listing ID when discussing a specific home.
concierge/tone.md
# How it talks
Friendly. Direct. Never pushy.
Two paragraphs max per reply unless the buyer asks for detail.
concierge/off-limits.md
# What it will not do
- Quote final or specific pricing — always defer to a human agent.
- Give legal advice.
- Share information about other clients.
When asked anything in this list, say so plainly and offer to connect the buyer with a human.
At a glance
| Field | Type | Required | What it changes |
| --- | --- | --- | --- |
| name | string | yes | What users address the agent as in chat (e.g. "Hey Concierge, …") and how it introduces itself. |
| description | string | no | The one-line pitch users see when picking the agent from a catalog. No behavioral effect. |
| spec | string[] | yes | Paths to the .md files that describe the agent. Their concatenated body IS the agent's instructions. |
| modelClass | 'standard' \| 'premium' \| 'advanced' | yes | How much reasoning the agent applies per turn. Higher classes think longer, cost more, and chain more tool calls before replying. |
| provider | | no | Whose account pays for model calls. 'auto' = the managed runtime; anything else = your own cloud account (BYOK). |
| triggers | Trigger[] | yes | When the agent wakes up. Without at least one trigger, the agent can never run. |
| connections | ConnectionDef[] | no | Live integrations the agent can use during a turn, for example to post to Slack, draft a PR, or look up calendar slots. |
| sources | Source[] | no | Knowledge the agent reads before replying. Pair with uploadable: true to let users or devs upload docs at runtime. |
| tools | ToolDef[] | no | Custom functions you write. The agent decides when to call them. |
Fields
name (string, required)
What the agent is called in the conversation. The agent answers to this name; users address it by it. Also appears in the workspace header, the embed widget, and catalog cards — but the conversational identity is what matters.
name: 'BayHomes Concierge'
description (string, optional)
The one-line pitch users read when picking the agent from a catalog. Does not affect behavior. Skip it if you're not publishing the agent to a public catalog.
description: 'Embedded chat for prospective Bay Area buyers — listings, schools, viewings.'
spec (string[], required)
IDs of spec files in your workspace's registry. Their concatenated body IS the instructions the runtime sends to the model on every turn. What lives at those IDs is exactly what the agent treats as ground truth about itself.
You upload spec files with agentnava-kit spec push (see the workflow above). The IDs are just paths-minus-.md:
spec: [
'concierge/role',
'concierge/tone',
'shared/legal-off-limits', // re-used across many agents
]
Files at these IDs are read at deploy time and embedded in the agent version. Editing a spec file requires re-pushing AND re-deploying — old agent versions stay on their old spec content.
modelClass ('standard' | 'premium' | 'advanced', required)
How much reasoning the agent applies on each turn. Higher classes think longer, can chain more tool calls before replying, and cost more per turn.
| Class | When to pick it | Default route |
| --- | --- | --- |
| standard | Short replies, FAQ, simple Q&A. The agent answers in one shot, with at most one tool call. | GLM-4.7 tier |
| premium | Drafts, multi-step reasoning, RAG across several sources, summarization with citations. | Sonnet tier |
| advanced | Planning across many tools and steps: research, code generation, complex workflows. | Opus tier |
To freeze behavior so the agent doesn't auto-upgrade when the runtime ships a new default model, pin a version:
modelClass: 'standard' // rolling — auto-upgrades over time
modelClass: 'standard@2026-05' // pinned to the May 2026 snapshot, forever
modelClass: 'premium@v3' // pinned to integer release v3
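The three forms above (rolling, month snapshot, integer release) can be told apart mechanically. The parser below is a sketch for illustration, not the kit's own validation. The type names and the exact accepted patterns (`YYYY-MM` and `v<integer>`) are inferred from the three examples and may not cover every pin format the runtime accepts.

```typescript
type ModelClassName = 'standard' | 'premium' | 'advanced';

// Hypothetical representation of the pin suffix after '@'.
type Pin =
  | { kind: 'rolling' }
  | { kind: 'snapshot'; month: string }
  | { kind: 'release'; version: number };

interface ParsedModelClass {
  name: ModelClassName;
  pin: Pin;
}

const CLASS_NAMES: readonly string[] = ['standard', 'premium', 'advanced'];

function parseModelClass(value: string): ParsedModelClass {
  const [name, suffix, ...rest] = value.split('@');
  if (CLASS_NAMES.indexOf(name) === -1 || rest.length > 0) {
    throw new Error(`Unknown model class: ${value}`);
  }
  const className = name as ModelClassName;
  // No '@' suffix means the rolling class that auto-upgrades over time.
  if (suffix === undefined) return { name: className, pin: { kind: 'rolling' } };
  // Assumed snapshot pattern: four-digit year, dash, two-digit month.
  if (/^\d{4}-\d{2}$/.test(suffix)) {
    return { name: className, pin: { kind: 'snapshot', month: suffix } };
  }
  // Assumed release pattern: 'v' followed by an integer.
  const release = /^v(\d+)$/.exec(suffix);
  if (release) {
    return { name: className, pin: { kind: 'release', version: Number(release[1]) } };
  }
  throw new Error(`Unrecognized pin: ${suffix}`);
}
```

A check like this in CI can catch a typo such as `'standard@2026-5'` before configure is run.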
provider (optional)
Whose account pays for the model calls. 'auto' bills the managed runtime, so you don't manage any cloud credentials. Any other value routes inference through your own cloud account; tokens are billed there. See Operate → BYOK for how to attach credentials.
triggers (Trigger[], required)
When the agent wakes up. Without at least one trigger, the agent never runs.
A self-driving loop runs the agent every N seconds, optionally capped. Useful for monitoring tasks.
manual
Your code calls agent.run({...}) programmatically. Useful for batch jobs.
connections (ConnectionDef[], optional)
Live integrations the agent can use during a turn. Each connection registers a handful of tools (Slack registers post_message, list_channels, etc.) and may add a trigger (e.g. Slack mention). Full catalog at Connections.
Custom functions you write in TypeScript. The agent decides when to call them based on what the user asks. Use this for behavior that no connection covers — calling your internal API, computing something, hitting a third-party SaaS the catalog doesn't yet support.
import { defineTool } from '@agentnava/kit';
import { object, string } from 'valibot';
const lookupZestimate = defineTool({
  name: 'lookup_zestimate',
  description: 'Get the Zillow zestimate for an address',
  inputSchema: object({ address: string() }),
  handler: async ({ address }) => {
    // implementation
  },
});
tools: [lookupZestimate]
The runtime drives the call/result loop and emits tool-start / tool-end events the host UI can render.
Uploading documents
A documents source needs files. Three ways to get them in:
At deploy. Add uri: './docs' + uploadOnDeploy: true to the source. The CLI uploads the folder when you run agentnava-kit deploy. Re-deploy to refresh.
From a script. Use the upload endpoint to push files programmatically from a backend.
From your UI. Drop in the upload widget so end-users (or your own admins) can attach files after the agent is deployed.
HTTP upload
POST https://<agent-id>.agents.agentnava.com/sources/<source-id>/files
Authorization: Bearer <workspace-key>
Content-Type: multipart/form-data
(multipart body: one form part per file)
Returns { fileId, status: 'indexing' } per file. Indexing typically completes in 30–90s. Once indexed, the file is searchable from the agent's next turn.
The data-uploads="true" flag adds a file-attach button in the chat composer. Uploaded files become available to that user's conversation and (if the source is shared) to the agent's RAG index.
Event stream
The runtime emits a typed SSE stream the host UI consumes.
Any consumer that reads SSE can render the stream — React, Vue, Svelte, vanilla JS, a CLI. No framework lock-in.
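As a sketch of what a framework-free consumer might do, the code below parses a block of SSE text into events and folds the message deltas into a reply. Several details are assumptions: that the event type arrives in the SSE `event:` field, that each `data:` payload is JSON, and that a `message-delta` payload carries a `text` field. Only the event names come from this documentation. It also assumes the input contains whole frames; a real consumer would buffer partial frames across network chunks.

```typescript
interface ParsedEvent {
  type: string;
  data: unknown;
}

// Parses complete SSE frames (separated by a blank line) into typed events.
function parseSseText(raw: string): ParsedEvent[] {
  const events: ParsedEvent[] = [];
  for (const frame of raw.split(/\r?\n\r?\n/)) {
    let type = 'message'; // SSE default when no event: field is present
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith('event:')) type = line.slice(6).trim();
      else if (line.startsWith('data:')) dataLines.push(line.slice(5).replace(/^ /, ''));
    }
    if (dataLines.length === 0) continue; // empty frame or comment-only frame
    events.push({ type, data: JSON.parse(dataLines.join('\n')) });
  }
  return events;
}

// Concatenates the text of message-delta events (assumed payload: { text: string }).
function collectReply(events: readonly ParsedEvent[]): string {
  let reply = '';
  for (const event of events) {
    if (event.type !== 'message-delta') continue;
    const payload = event.data as { text?: unknown } | null;
    if (payload && typeof payload.text === 'string') reply += payload.text;
  }
  return reply;
}
```

A UI would typically call `collectReply` incrementally as frames arrive and render `tool-start` / `tool-end` events as status indicators alongside the text.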
Advanced
meta (Record<string, unknown>, optional)
Anything you set here is readable from your tool handlers via ctx.spec.meta.X. Use it for per-deploy config that the agent itself doesn't need to reason about — brand colors, deploy targets, feature flags.
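Because `meta` is typed as `Record<string, unknown>`, values need narrowing before use. The helpers below are a sketch with stand-in types: only the `ctx.spec.meta` path comes from the text above, while `ToolContext`, `readFlag`, `readString`, and the example keys are hypothetical.

```typescript
// Stand-in for the handler context; only the ctx.spec.meta path is documented.
interface ToolContext {
  spec: { meta?: Record<string, unknown> };
}

// Reads a boolean feature flag, falling back when the key is missing or mistyped.
function readFlag(ctx: ToolContext, key: string, fallback: boolean): boolean {
  const value = ctx.spec.meta?.[key];
  return typeof value === 'boolean' ? value : fallback;
}

// Reads a string setting such as a brand color, with the same fallback rule.
function readString(ctx: ToolContext, key: string, fallback: string): string {
  const value = ctx.spec.meta?.[key];
  return typeof value === 'string' ? value : fallback;
}
```

Inside a handler this keeps per-deploy configuration out of the Spec, for example `readFlag(ctx, 'betaSearch', false)` to gate an experimental code path.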
A connection wires an agent to a third-party service in one declaration. It handles auth, auto-registers a set of tools, and may add a trigger. Most agents only need connections — never individual defineTool calls.
Add a connection with one prompt
Paste this in. Swap the service name to install any of the connections below.
Add the Slack connection to the AgentNavaKit agent in this project.
1. Locate the agent's TS file (likely `agents/*.ts` calling `defineAgent({...})`).
2. Import `connect` from `@agentnava/kit` if not already imported.
3. Add `connect.slack({ workspace: '<workspace-slug>' })` to the `connections` array. Ask me for the workspace slug if you can't infer it.
4. Run `agentnava-kit auth slack`, print the OAuth URL it returns, and wait for me to authorize it in the browser.
5. Once authorized, run `agentnava-kit configure <agent-file>.ts` to push the updated configuration.
6. Run `agentnava-kit test <agent-file>.ts` and verify the agent can call `post_message` to a test channel.
Reference: https://docs.agentnava.com#connections
Don't deploy to production. Don't touch unrelated agents.
Catalog
Slack (OAuth)
Post, read channels, reply in threads, mention-trigger.
connect.slack({ workspace: 'acme' })
Auto-registered tools: post_message, list_channels, thread_reply, search
GitHub (OAuth)
PRs, issues, commits, repo file access.
connect.github({ org: 'acme' })
Auto-registered tools: open_pr, comment_issue, list_repos, get_file
Gmail (OAuth)
Send, search, label. Trigger on inbound matching a query.
The agent now has nine new tools and a Slack mention trigger, with zero credential handling in the spec.
Sessions
Once an agent is deployed, end users interact with it through sessions. Each session is one user's ongoing conversation — its own message history, its own attached files, its own state. A single deployed agent serves many concurrent sessions in parallel; each one is independent.
How sessions get created
Embed widget. Mounting <script src="https://embed.agentnava.com/v1.js" data-agent="…"> on a page creates one session per browser tab automatically. The session ID is persisted in localStorage so the user can refresh and resume.
Your own UI. Call POST /agents/<id>/sessions to mint a session ID, then use it on every subsequent chat call.
Implicit (single-turn). Hit POST /chat directly without a session ID — the runtime mints a one-shot session that's torn down after the response.
The chat protocol
To send a turn to an existing session:
POST https://<agent-url>/sessions/<session-id>/messages
Authorization: Bearer <public-token>
Content-Type: application/json
{
  "role": "user",
  "content": "Can you walk me through this floor plan?",
  "attachments": [
    { "id": "file_4f1c9a", "type": "image", "name": "floor-plan.png" },
    { "id": "file_2a9d12", "type": "file", "name": "hoa-rules.pdf" }
  ]
}
Response is an SSE stream of AgentEvents — message-delta chunks, then message-end, plus any tool-start/tool-end/phase events the agent emits along the way.
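For a custom UI, the request above can be assembled by a small pure function. This is a sketch under stated assumptions: `buildMessageRequest` and its types are hypothetical names, and omitting `attachments` when the list is empty is a guess, since the example only shows a request that includes them.

```typescript
interface Attachment {
  id: string;
  type: 'image' | 'file';
  name: string;
}

interface MessageRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds the POST /sessions/<session-id>/messages request shown above.
function buildMessageRequest(
  agentUrl: string,
  sessionId: string,
  publicToken: string,
  content: string,
  attachments: readonly Attachment[] = [],
): MessageRequest {
  const base = agentUrl.replace(/\/+$/, ''); // tolerate a trailing slash
  const payload: { role: 'user'; content: string; attachments?: readonly Attachment[] } = {
    role: 'user',
    content,
  };
  // Assumption: the field may be omitted when there is nothing to attach.
  if (attachments.length > 0) payload.attachments = attachments;
  return {
    url: `${base}/sessions/${encodeURIComponent(sessionId)}/messages`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${publicToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  };
}
```

The result can be handed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`, with the response body then read as a stream.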
Attachments
An attachments entry references a file the user already uploaded for this session. Two ways to put files there:
// 1. Upload a file to a session — returns a file ID you attach to messages
POST https://<agent-url>/sessions/<session-id>/files
Authorization: Bearer <public-token>
Content-Type: multipart/form-data
(multipart body: the file to upload)
// → { "id": "file_4f1c9a", "type": "image", "status": "ready" }
// 2. The embed widget does this automatically when the user drops a file
// into the chat composer (if data-uploads="true" is on the script tag).
Attached files live for the lifetime of the session. They're not added to the agent's RAG index — they're transient context for that one conversation. (For persistent knowledge, use document upload on a documents source instead.)
Session lifecycle
| State | What it means |
| --- | --- |
| active | The session has been touched in the last hour. Messages stream as SSE. |
| idle | No activity for an hour, but history is still in memory. Resuming is instant. |
| hibernated | No activity for 24h. History persisted to durable storage; first resume after this state has a small wake-up cost (~200ms). |
| closed | Explicitly ended via DELETE /sessions/<id>, or by the agent emitting a handoff event. Attachments are deleted; messages remain queryable from the audit log. |
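The time thresholds in the lifecycle above can be expressed as a small classifier, useful for example when a dashboard wants to predict whether a resume will be instant. This is a sketch: the function name is hypothetical, and the boundary handling (exactly one hour counts as idle, exactly 24 hours counts as hibernated) is an assumption.

```typescript
type SessionState = 'active' | 'idle' | 'hibernated' | 'closed';

const HOUR_MS = 60 * 60 * 1000;

// Derives the expected state from the last activity time.
// `closed` is explicit (DELETE or handoff), so it is passed in rather than inferred.
function classifySession(lastActivityMs: number, nowMs: number, closed = false): SessionState {
  if (closed) return 'closed';
  const inactiveFor = nowMs - lastActivityMs;
  if (inactiveFor < HOUR_MS) return 'active';
  if (inactiveFor < 24 * HOUR_MS) return 'idle';
  return 'hibernated';
}
```

Under this model only `hibernated` sessions pay the wake-up cost on their first resume.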
Querying sessions
GET https://<agent-url>/sessions/<session-id>/messages
# → full message history, newest first
GET https://<agent-url>/sessions?since=2026-05-10T00:00:00Z
# → list of recent sessions (workspace-scoped, requires a workspace token)
For analytics, the workspace dashboard surfaces session counts, average turn count, common handoff reasons, and per-version metrics.
Operate
Deploy an agent, point a custom domain at it, embed it on a webpage, and route model calls through your own cloud account.
Deploy
agentnava-kit deploy agents/hello.ts
The CLI returns:
Public URL — https://<agent-id>.agents.agentnava.com
Version number — auto-incremented integer (v1, v2, …)
Embed snippet — one-line <script> tag for any webpage
Webhook URL — for webhook triggers
Health URL — https://<agent-id>.agents.agentnava.com/health
Deploys are atomic per agent. Each one stamps a new version and promotes it to production. The previous version stays warm until the new one passes a health check.
Versions & rollback
agentnava-kit versions hello # list every version of the agent
agentnava-kit rollback hello --to v3 # immediately re-point production to v3
Rollback is instant, and versions stay queryable as long as the agent exists. Spec content and TS metadata are preserved for every shipped version.
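One way to reason about versions and rollback is as a list of shipped versions plus a pointer to the live one. The model below is purely conceptual and says nothing about how the backend stores versions. In particular, it assumes a deploy after a rollback continues from the highest version number rather than from the live one.

```typescript
interface AgentVersions {
  versions: readonly number[]; // every shipped version, oldest first
  live: number | null;         // version currently serving production
}

// Deploy stamps the next integer version and promotes it.
function deployVersion(state: AgentVersions): AgentVersions {
  const next = state.versions.length === 0 ? 1 : Math.max(...state.versions) + 1;
  return { versions: [...state.versions, next], live: next };
}

// Rollback only moves the pointer; no version is deleted.
function rollbackTo(state: AgentVersions, target: number): AgentVersions {
  if (state.versions.indexOf(target) === -1) {
    throw new Error(`v${target} was never deployed`);
  }
  return { versions: state.versions, live: target };
}
```

Since rollback leaves the list untouched, rolling forward again is just another pointer move or a fresh deploy.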
Custom domain
Map agent.yourdomain.com to the agent's public URL via CNAME, then run:
Drops in any HTML page. data-theme accepts light, dark, or auto.
BYOK — bring your own keys
Route inference through your cloud account. The runtime still runs on AgentNava; only model calls hit the chosen provider. Credentials attach once at the workspace level and are validated with a probe call before any agent uses them.
No setup. AgentNava bills inference. Cheapest to start; switch to BYOK once volume picks up.
In defineAgent, omit provider or set it to 'auto':
defineAgent({
// ...
provider: 'auto', // or omit entirely
});
Tokens billed to your AWS account. Validates against Bedrock model access at connection time.
Attach an IAM user (or assume-role ARN) with bedrock:InvokeModelWithResponseStream on the model classes the workspace uses.
The runtime is charged per workspace plan; model usage is metered separately. With provider: 'auto', model calls are billed by AgentNava. With BYOK, they're billed by your cloud provider. Plan details and pricing are on agentnava.com/pricing.
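The billing rule for model calls reduces to a single check on `provider`. The function below is only an illustration of that rule. The name `inferenceBilledBy` is hypothetical, and the string `'bedrock'` used in the usage note is a made-up provider value, since the accepted BYOK identifiers are not listed here.

```typescript
type BilledBy = 'agentnava' | 'your-cloud-provider';

// Omitted or 'auto' means AgentNava bills inference; any other value means BYOK.
function inferenceBilledBy(provider?: string): BilledBy {
  return provider === undefined || provider === 'auto' ? 'agentnava' : 'your-cloud-provider';
}
```

For example, `inferenceBilledBy()` and `inferenceBilledBy('auto')` both resolve to AgentNava, while a BYOK value such as the hypothetical `'bedrock'` resolves to your own cloud account. The per-workspace runtime charge applies in both cases.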