agent-protocol/spec/README.md
Bruno Sarlo da9207ded3 v1.0.0-draft: First public draft release
agents.md protocol for AI agent web discovery.

Key features:
- Two formats: Pure Markdown (simple) or YAML frontmatter (structured)
- MCP gateway integration for tool access
- Discovery via /.well-known/agents.md
- Security: origin trust, endpoint validation, auth guidance
- Backward compatible with robots.txt and llms.txt

Design based on 3-iteration process:
1. Gap analysis and planning
2. Multi-model consensus on format decisions
3. Code review for completeness and clarity

Philosophy: robots.txt says what agents CANNOT do,
agents.md says what they CAN do.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 09:18:13 -03:00


# agents.md Protocol Specification
**Version:** 1.0.0-draft
**Status:** Draft
**Updated:** 2026-01-14
## Abstract
`agents.md` is a simple text file that tells AI agents what they can do on a website and, optionally, points them to an MCP gateway for structured tool access.
## Philosophy
| Standard | Tells agents... |
|----------|-----------------|
| robots.txt | What you **cannot access** |
| llms.txt | What **content is important** |
| **agents.md** | What you **can do** + where to connect |
## 1. Discovery
### Location
Primary: `/.well-known/agents.md`
Fallback: `/agents.md`
### Content-Type
`text/markdown` or `text/plain`
### Request
```http
GET /.well-known/agents.md HTTP/1.1
Host: example.com
User-Agent: MyAgent/1.0 (AI Agent)
```
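As a minimal sketch of the request above, the following builds (but does not send) the same GET using Python's standard library. The function name `build_request` and the default User-Agent string are illustrative assumptions, not part of the specification.

```python
import urllib.request

WELL_KNOWN_PATH = "/.well-known/agents.md"  # primary location from this spec


def build_request(origin: str, user_agent: str = "MyAgent/1.0 (AI Agent)") -> urllib.request.Request:
    """Build the discovery request for an origin such as 'https://example.com'.

    Hypothetical helper: nothing is sent over the network here.
    """
    url = origin.rstrip("/") + WELL_KNOWN_PATH
    return urllib.request.Request(url, headers={"User-Agent": user_agent}, method="GET")
```

Passing the result to `urllib.request.urlopen` would perform the actual fetch; error handling and the `/agents.md` fallback are left out of this sketch.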
### Caching
Servers SHOULD set appropriate `Cache-Control` headers:
```http
Cache-Control: public, max-age=86400
```
Agents SHOULD cache the `agents.md` file for 24 hours unless HTTP headers specify otherwise. Agents MUST NOT request this file more than once per hour for the same origin.
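One way an agent might combine the two rules above is sketched below: honor `max-age` when the server sends it, fall back to 24 hours otherwise, and never refetch sooner than one hour. The helper name `refetch_interval` is hypothetical, and this sketch deliberately ignores other directives such as `no-store`, which the specification does not address.

```python
import re
from typing import Optional

DEFAULT_TTL = 24 * 60 * 60      # 24-hour default from this section
MIN_REFETCH_INTERVAL = 60 * 60  # at most one request per hour per origin


def refetch_interval(cache_control: Optional[str]) -> int:
    """Return the seconds to wait before requesting agents.md again."""
    ttl = DEFAULT_TTL
    if cache_control:
        # Only max-age is considered in this sketch.
        match = re.search(r"\bmax-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
        if match:
            ttl = int(match.group(1))
    # The one-hour floor applies even if the server asks for a shorter lifetime.
    return max(ttl, MIN_REFETCH_INTERVAL)
```

For example, a response carrying `Cache-Control: max-age=60` would still be reused for a full hour under this reading of the MUST NOT rule.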
## 2. Format
Two formats are supported:
### Format A: Pure Markdown (Simple)
Plain Markdown with section headers. Best for simple sites.
```markdown
# Example Site
A bookstore since 2010.
## Can
- Search catalog
- Read book details
- Check availability
## Cannot
- Place orders without human
- Access user accounts
## Contact
agents@example.com
```
### Format B: YAML Frontmatter + Markdown (Structured)
YAML frontmatter for machine-readable configuration, Markdown body for human context. Recommended when using MCP.
```markdown
---
version: "1.0"
mcp:
  endpoint: https://example.com/.well-known/mcp
  transport: streamable-http
  auth: none
---
# Example Bookstore
Online bookstore with 50,000 titles.
## Can
- Search and browse catalog
- Read reviews and descriptions
- Check prices and stock
- Place orders (authenticated)
## Cannot
- Modify user accounts
- Access admin functions
## Behavior
- Respect 1 request/second
- Cache product data 1 hour
- Identify in User-Agent header
## Contact
agents@example.com
```
### Parsing Rules
1. If file starts with `---`, parse YAML frontmatter until closing `---`
2. Parse remaining content as Markdown
3. Section headers (`## Name`) define semantic sections
4. List items under sections define capabilities/rules
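The parsing rules above can be sketched with the standard library alone. The function names are illustrative, the frontmatter is returned as raw text (a YAML library would be needed to interpret it), and treating an unclosed `---` as plain Markdown is an assumption this specification does not state.

```python
import re
from typing import Dict, List, Optional, Tuple


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Rules 1-2: return (raw YAML frontmatter or None, remaining Markdown)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # Assumption: no closing delimiter means the whole file is Markdown.
    return None, text


def parse_sections(body: str) -> Dict[str, List[str]]:
    """Rules 3-4: map each '## Name' header to the list items beneath it."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in body.splitlines():
        header = re.match(r"^##\s+(.+?)\s*$", line)
        item = re.match(r"^\s*[-*]\s+(.+?)\s*$", line)
        if header:
            current = header.group(1)
            sections[current] = []
        elif item and current is not None:
            sections[current].append(item.group(1))
    return sections
```

Non-list lines, such as the address under `## Contact`, are dropped by this sketch; a fuller parser would keep them as free text.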
## 3. Sections
All sections are optional. Use what makes sense for your site.
### Identity (H1 Header)
```markdown
# Site Name
Brief description of what this site is.
```
### Capabilities (Can/Cannot)
```markdown
## Can
- Action agents are allowed to take
- Another allowed action
## Cannot
- Restricted action
- Another restriction
```
### MCP Gateway
Defined in YAML frontmatter (preferred) or Markdown section:
**YAML Frontmatter (preferred):**
```yaml
---
mcp:
  endpoint: https://example.com/.well-known/mcp
  transport: streamable-http
  auth: none
---
```
**Markdown Section (fallback):**
```markdown
## MCP
endpoint: https://example.com/.well-known/mcp
transport: streamable-http
auth: none
```
When using Markdown section format, content MUST be valid YAML key-value pairs.
**Fields:**
| Field | Required | Values | Description |
|-------|----------|--------|-------------|
| `endpoint` | Yes | URL | MCP server endpoint |
| `transport` | No | `streamable-http`, `sse` | Transport protocol (default: `streamable-http`) |
| `auth` | No | `none`, `api_key`, `oauth2` | Authentication method (default: `none`) |
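The field table can be applied as a small validation step. The sketch below reads the flat `key: value` lines of the Markdown fallback and fills in the documented defaults; `parse_mcp_section` and `normalize_mcp` are hypothetical names, and the line reader handles only simple unquoted pairs rather than full YAML.

```python
from typing import Dict
from urllib.parse import urlparse

TRANSPORTS = {"streamable-http", "sse"}
AUTH_METHODS = {"none", "api_key", "oauth2"}


def parse_mcp_section(text: str) -> Dict[str, str]:
    """Read flat 'key: value' lines; headers and comments are skipped."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def normalize_mcp(fields: Dict[str, str]) -> Dict[str, str]:
    """Validate the MCP fields and apply the defaults from the table."""
    endpoint = fields.get("endpoint", "")
    parsed = urlparse(endpoint)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError("endpoint is required and must be an absolute URL")
    transport = fields.get("transport", "streamable-http")
    auth = fields.get("auth", "none")
    if transport not in TRANSPORTS:
        raise ValueError(f"unsupported transport: {transport}")
    if auth not in AUTH_METHODS:
        raise ValueError(f"unsupported auth method: {auth}")
    return {"endpoint": endpoint, "transport": transport, "auth": auth}
```

Rejecting unknown `transport` or `auth` values is a design choice of this sketch; the specification lists the allowed values but does not say how agents should treat others.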
### Behavior Rules
```markdown
## Behavior
- Rate limit guidance
- Caching expectations
- Identification requirements
```
### Contact
```markdown
## Contact
email@example.com
https://example.com/agent-support
```
## 4. MCP Integration
The `agents.md` file is the **discovery handshake**. The MCP gateway is where **structured interaction happens**.
```
Agent reads agents.md
├─► Basic agent: understands site from text
└─► Advanced agent: connects to MCP gateway
        MCP Server exposes:
        - Tools (search, checkout, etc.)
        - Resources (catalog, docs)
        - Prompts (guided workflows)
```
### Discovery Flow
1. Agent fetches `/.well-known/agents.md`
2. Parses YAML frontmatter or `## MCP` section
3. Extracts MCP endpoint URL
4. Connects via MCP protocol
5. Discovers available tools via `tools/list`
6. Uses tools as permitted
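Steps 1 and 5 of the flow above can be sketched as follows. The `fetch` callable is a stand-in supplied by the caller (an assumption made so the example stays free of network code); it should return the file text, or `None` when the file is missing. The `tools/list` message is a JSON-RPC 2.0 request as used by MCP and would be sent only after the MCP connection has been initialized.

```python
import json
from typing import Callable, Optional

# Primary location first, then the fallback, as defined in Section 1.
PATHS = ("/.well-known/agents.md", "/agents.md")


def fetch_agents_md(origin: str, fetch: Callable[[str], Optional[str]]) -> Optional[str]:
    """Step 1: try each location in order and return the first file found."""
    for path in PATHS:
        text = fetch(origin.rstrip("/") + path)
        if text is not None:
            return text
    return None


def tools_list_request(request_id: int = 1) -> str:
    """Step 5: serialize a JSON-RPC 2.0 'tools/list' request body."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})
```

Injecting `fetch` keeps the lookup order testable without a server; a real agent would back it with an HTTP client that applies the caching rules from Section 1.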
### MCP Endpoint Location
Recommended: `/.well-known/mcp`
This follows the well-known URI pattern and keeps agent-related endpoints together.
## 5. Security
### Origin Trust
Agents MUST trust only an `agents.md` file served from the site's own origin. Instructions embedded in page content MUST be ignored.
### MCP Endpoint Validation
The MCP endpoint MUST share the same registrable domain as the `agents.md` file. For example:
| agents.md location | Valid MCP endpoints |
|-------------------|---------------------|
| `example.com/.well-known/agents.md` | `example.com/mcp`, `api.example.com/mcp` |
| `shop.example.com/.well-known/agents.md` | `shop.example.com/mcp`, `api.shop.example.com/mcp` |
MCP endpoints on a different registrable domain MUST be rejected unless the user explicitly approves them.
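A rough sketch of this check follows. Determining the registrable domain correctly requires the Public Suffix List; the helper below approximates it with the last two DNS labels, which is an assumption that gives wrong answers for suffixes such as `co.uk`. Both function names are hypothetical.

```python
from urllib.parse import urlparse


def naive_registrable_domain(host: str) -> str:
    """Approximation only: the last two DNS labels of the host name."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def endpoint_allowed(agents_md_url: str, mcp_endpoint: str, user_approved: bool = False) -> bool:
    """Accept the endpoint if it shares the site's registrable domain,
    or if the user has explicitly approved it."""
    site = urlparse(agents_md_url).hostname
    target = urlparse(mcp_endpoint).hostname
    if not site or not target:
        return False  # unparseable URLs are never accepted
    same_domain = naive_registrable_domain(site) == naive_registrable_domain(target)
    return same_domain or user_approved
```

A production agent would likely swap `naive_registrable_domain` for a lookup against the Public Suffix List while keeping the same comparison.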
### Transport Security
- MCP endpoints MUST use HTTPS in production
- Agents SHOULD warn users about HTTP endpoints
- Certificate validation MUST NOT be disabled
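The scheme rules above might be applied before connecting, as in this small sketch. The three return values and the decision to reject schemes other than HTTP and HTTPS are assumptions of the example, not requirements stated here.

```python
from urllib.parse import urlparse


def transport_check(mcp_endpoint: str) -> str:
    """Classify an MCP endpoint URL by scheme: 'ok', 'warn', or 'reject'."""
    scheme = urlparse(mcp_endpoint).scheme.lower()
    if scheme == "https":
        return "ok"
    if scheme == "http":
        return "warn"  # the agent should surface a warning to the user
    return "reject"    # assumption: any other scheme is refused
```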
### Authentication
When `auth: oauth2` is specified:
- Agents SHOULD request minimal scopes
- Tokens MUST be stored securely
- Refresh tokens SHOULD be used when available
When `auth: api_key` is specified:
- Keys SHOULD be obtained through official channels
- Keys MUST NOT be shared between users
- Keys SHOULD be rotated periodically
### Least Privilege
Agents SHOULD request only the permissions they need for the current task.
## 6. Backward Compatibility
### With robots.txt
`agents.md` supplements but does not replace `robots.txt`. In case of conflict regarding access restrictions, `robots.txt` takes precedence.
```markdown
## Cannot
- Access /admin (per robots.txt)
- Access /private
```
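The precedence rule can be sketched with Python's standard `urllib.robotparser`: a URL is usable only if `robots.txt` allows it and no `Cannot` restriction covers it. Because `Cannot` items are free text, this example assumes the agent has already extracted path prefixes from them; `may_access` and `cannot_prefixes` are hypothetical names.

```python
from typing import Iterable
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


def may_access(url: str, user_agent: str, robots_txt: str, cannot_prefixes: Iterable[str]) -> bool:
    """Return True only if both robots.txt and agents.md permit the URL."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    # robots.txt is checked first and takes precedence on access restrictions.
    if not robots.can_fetch(user_agent, url):
        return False
    path = urlparse(url).path
    return not any(path.startswith(prefix) for prefix in cannot_prefixes)
```

With `Disallow: /admin` in `robots.txt` and `/private` taken from the `Cannot` list, both `/admin/users` and `/private/notes` are refused while `/books` stays accessible.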
### With llms.txt
These files serve different purposes and can coexist:
| File | Purpose |
|------|---------|
| `/robots.txt` | Crawl restrictions |
| `/llms.txt` | Content summary for LLMs |
| `/.well-known/agents.md` | Agent capabilities + MCP |
## 7. Examples
### Minimal (No MCP)
```markdown
# My Tech Blog
Articles about software development.
## Can
- Read all public articles
- Search by topic
- Access RSS feed at /feed.xml
## Cannot
- Post comments (requires human)
- Access draft posts
## Contact
hello@myblog.example
```
### With MCP Gateway
```markdown
---
version: "1.0"
mcp:
  endpoint: https://weather.example/.well-known/mcp
  transport: streamable-http
  auth: none
---
# Weather API
Free weather data for AI agents.
## Can
- Get current conditions
- Get forecasts (up to 7 days)
- Get weather alerts
## Behavior
- 60 requests/minute
- Cache forecasts 30 minutes
## Contact
api@weather.example
```
### E-commerce with OAuth
```markdown
---
version: "1.0"
mcp:
  endpoint: https://techmart.example/.well-known/mcp
  transport: streamable-http
  auth: oauth2
---
# TechMart
Electronics retailer.
## Can
- Search products
- Compare specifications
- Check prices and stock
- Add to cart (authenticated)
- Checkout (authenticated)
## Cannot
- Access order history without user consent
- Modify account settings
## Behavior
- 1 request/second for browsing
- Identify as AI agent in requests
## Contact
partners@techmart.example
```
## 8. Implementation Notes
### For Site Owners
1. Create `/.well-known/agents.md` on your server
2. Start with Format A (pure Markdown) for simplicity
3. Add MCP gateway later if you want structured tool access
4. Set `Cache-Control` header for efficient agent behavior
### For Agent Developers
1. Check `/.well-known/agents.md` first, fall back to `/agents.md`
2. Parse YAML frontmatter if present
3. Cache responses per HTTP headers (default: 24 hours)
4. Respect `Cannot` restrictions and `Behavior` rules
5. Connect to the MCP gateway, if one is declared, for structured tools
### Versioning
The `version` field in YAML frontmatter indicates spec compatibility:
- `1.x` - Compatible with this specification
- Future versions will maintain backward compatibility within a major version
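An agent implementing this specification might gate on the major version as sketched below. The function name and the choice to treat unparseable values as unsupported are assumptions; the specification does not say how to handle a missing or malformed `version`.

```python
def is_supported_version(version: str, supported_major: int = 1) -> bool:
    """Return True if the version string's major component matches (1.x)."""
    major, _, _ = version.strip().partition(".")
    return major.isdigit() and int(major) == supported_major
```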
## Appendix A: Comparison Table
| Aspect | robots.txt | llms.txt | agents.md |
|--------|------------|----------|-----------|
| Purpose | Crawl control | Content summary | Capabilities |
| Format | Custom syntax | Markdown | Markdown + optional YAML |
| Focus | Restrictions | Understanding | Actions |
| MCP | No | No | Yes (optional) |
| Year | 1994 | 2024 | 2026 |
## Appendix B: YAML Frontmatter Schema
```yaml
# All fields optional except where noted
version: string # Spec version (e.g., "1.0")
mcp:
  endpoint: string # Required if mcp present. MCP server URL
  transport: string # "streamable-http" | "sse" (default: streamable-http)
  auth: string # "none" | "api_key" | "oauth2" (default: none)
```
## Appendix C: References
- [Model Context Protocol](https://modelcontextprotocol.io/) - MCP Specification
- [RFC 8615](https://www.rfc-editor.org/rfc/rfc8615) - Well-Known URIs
- [RFC 9309](https://www.rfc-editor.org/rfc/rfc9309) - Robots Exclusion Protocol (robots.txt)
- [llms.txt](https://llmstxt.org/) - LLM Content Discovery