From da9207ded368370d6a417b9587499528b053365e Mon Sep 17 00:00:00 2001
From: Bruno Sarlo <bruno@sarlo.uy>
Date: Wed, 14 Jan 2026 09:18:13 -0300
Subject: [PATCH] v1.0.0-draft: First public draft release

agents.md protocol for AI agent web discovery.

Key features:
- Two formats: Pure Markdown (simple) or YAML frontmatter (structured)
- MCP gateway integration for tool access
- Discovery via /.well-known/agents.md
- Security: origin trust, endpoint validation, auth guidance
- Backward compatible with robots.txt and llms.txt

Design based on a 3-iteration process:
1. Gap analysis and planning
2. Multi-model consensus on format decisions
3. Code review for completeness and clarity

Philosophy: robots.txt says what agents CANNOT do,
agents.md says what they CAN do.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
 CHANGELOG.md   |  52 +++++++
 README.md      |  60 ++++++---
 spec/README.md | 360 ++++++++++++++++++++++++++++++++-----------------
 3 files changed, 327 insertions(+), 145 deletions(-)
 create mode 100644 CHANGELOG.md
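
Note on the Parsing Rules added to spec/README.md in this patch: the four
rules could be sketched roughly as below. This is a minimal illustration
under stated assumptions, not a reference implementation. The names
`split_frontmatter` and `parse_sections` are hypothetical, the frontmatter is
returned as raw text (a real agent would pass it to a YAML parser), and the
handling of a missing closing fence is a guess, since the spec does not
cover that case.

```python
from typing import Dict, List, Optional, Tuple

FENCE = "---"


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Rule 1: if the file starts with the fence, return (raw YAML, rest)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != FENCE:
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == FENCE:
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # No closing fence: treat the whole file as Markdown (an assumption;
    # the spec does not say how to handle this case).
    return None, text


def parse_sections(body: str) -> Dict[str, List[str]]:
    """Rules 2-4: map each '## Name' header to the list items under it."""
    sections: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:].strip()
            sections[current] = []
        elif stripped.startswith("- ") and current is not None:
            sections[current].append(stripped[2:].strip())
    return sections
```

Calling `split_frontmatter` on a Format B file yields the YAML text and the
Markdown body; passing that body to `parse_sections` gives a dict such as
`Can` and `Cannot` mapped to their list items. Sections without list items
(for example `Contact`) map to an empty list in this sketch.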

diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..26f0583
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,52 @@
+# Changelog
+
+All notable changes to the agents.md protocol specification are documented in this file.
+
+## [1.0.0-draft] - 2026-01-14
+
+First public draft release.
+
+### Added
+
+- **Core specification** defining agents.md file format
+- **Two format options**: Pure Markdown (simple) and YAML frontmatter + Markdown (structured)
+- **MCP gateway integration** for pointing agents to Model Context Protocol servers
+- **Discovery mechanism** via `/.well-known/agents.md` (primary) and `/agents.md` (fallback)
+- **Security section** with origin trust, endpoint validation, and authentication guidance
+- **Backward compatibility** guidance for robots.txt and llms.txt coexistence
+- **Caching recommendations** for both servers and agents
+- **Examples** covering minimal sites, MCP-enabled APIs, and OAuth-protected e-commerce
+
+### Design Decisions
+
+Based on multi-model consensus:
+
+1. **Hybrid format** - Optional YAML frontmatter for machine-readable config, Markdown body for human context
+2. **Simple naming** - Keep `Can/Cannot` section names for intuitiveness
+3. **MCP as pointer** - Reference MCP endpoints, don't embed tool schemas
+4. **Websites first** - Focus on the web use case, extensible to other contexts in the future
+
+### Philosophy
+
+| Standard | Purpose |
+|----------|---------|
+| robots.txt | What agents cannot access |
+| llms.txt | What content is important |
+| agents.md | What agents can do |
+
+## [0.2.0] - 2026-01-14
+
+Internal revision with MCP integration.
+
+### Changed
+- Simplified format from API-centric to text-centric
+- Added MCP gateway pointer concept
+- Removed JSON Schema tool definitions
+
+## [0.1.0] - 2026-01-14
+
+Initial internal draft.
+
+### Added
+- Basic protocol concept
+- API tool definitions (later removed)
diff --git a/README.md b/README.md
index c171f6a..2e42906 100644
--- a/README.md
+++ b/README.md
@@ -2,15 +2,18 @@
 
 **Tell AI agents what they can do on your website.**
 
-## The Gap
+[![Version](https://img.shields.io/badge/version-1.0.0--draft-blue)]()
+[![License](https://img.shields.io/badge/license-CC0-green)]()
 
-| File | Purpose |
-|------|---------|
-| robots.txt | What bots **cannot access** |
+## The Problem
+
+| File | What it tells agents |
+|------|---------------------|
+| robots.txt | What you **cannot access** |
 | llms.txt | What **content matters** |
-| **agents.md** | What agents **can do** |
+| **???** | What you **can do** |
 
-## Quick Start
+## The Solution
 
 Create `/.well-known/agents.md`:
 
@@ -31,13 +34,20 @@ An online bookstore.
 hello@mysite.com
 ```
 
-That's it. Plain text. Human readable.
+That's it. Plain text. Human readable. Machine parseable.
 
 ## With MCP Gateway
 
-Point agents to your MCP server for structured tool access:
+Point agents to your [MCP](https://modelcontextprotocol.io/) server for structured tool access:
 
-```markdown
+```yaml
+---
+version: "1.0"
+mcp:
+  endpoint: https://mysite.com/.well-known/mcp
+  transport: streamable-http
+  auth: oauth2
+---
 # My Site
 
 An online bookstore.
@@ -47,11 +57,6 @@ An online bookstore.
 - Check prices
 - Place orders (authenticated)
 
-## MCP
-endpoint: https://mysite.com/.well-known/mcp
-transport: streamable-http
-auth: oauth2
-
 ## Contact
 hello@mysite.com
 ```
@@ -68,20 +73,31 @@ Agent requests /.well-known/agents.md
 
 ## Documentation
 
-- [Specification](./spec/README.md) - Full protocol spec
-- [Examples](./examples/) - Real-world examples
-- [FAQ](./docs/FAQ.md) - Common questions
+- **[Specification](./spec/README.md)** - Full protocol spec (v1.0.0-draft)
+- **[Examples](./examples/)** - Real-world examples
+- **[FAQ](./docs/FAQ.md)** - Common questions
+- **[Changelog](./CHANGELOG.md)** - Version history
+
+## Quick Comparison
+
+| Aspect | robots.txt | llms.txt | agents.md |
+|--------|------------|----------|-----------|
+| Purpose | Crawl control | Content summary | Capabilities |
+| Format | Custom | Markdown | Markdown + YAML |
+| MCP | No | No | Yes |
 
 ## Status
 
-**Draft** - Version 0.2.0
+**v1.0.0-draft** - First public draft release
+
+Feedback welcome via issues.
 
 ## Related Standards
 
-- [robots.txt](https://www.rfc-editor.org/rfc/rfc9309) - Crawl restrictions (1994)
-- [llms.txt](https://llmstxt.org/) - Content for LLMs (2024)
-- [AGENTS.md](https://agents.md/) - Repository instructions (2025)
-- [MCP](https://modelcontextprotocol.io/) - Tool protocol (2024)
+- [robots.txt](https://www.rfc-editor.org/rfc/rfc9309) - RFC 9309 (in use since 1994, standardized 2022)
+- [llms.txt](https://llmstxt.org/) - Jeremy Howard (2024)
+- [AGENTS.md](https://agents.md/) - OpenAI/Sourcegraph (2025)
+- [MCP](https://modelcontextprotocol.io/) - Anthropic (2024)
 
 ## License
 
diff --git a/spec/README.md b/spec/README.md
index b2c4251..1e4a66f 100644
--- a/spec/README.md
+++ b/spec/README.md
@@ -1,6 +1,6 @@
 # agents.md Protocol Specification
 
-**Version:** 0.2.0
+**Version:** 1.0.0-draft
 **Status:** Draft
 **Updated:** 2026-01-14
 
@@ -18,23 +18,40 @@ A simple text file that tells AI agents what they can do on a website and option
 
 ## 1. Discovery
 
-**Location:** `/.well-known/agents.md` or `/agents.md`
+### Location
 
-**Content-Type:** `text/markdown` or `text/plain`
+Primary: `/.well-known/agents.md`
+Fallback: `/agents.md`
 
-Agents request the file like any HTTP resource:
+### Content-Type
 
-```
+`text/markdown` or `text/plain`
+
+### Request
+
+```http
 GET /.well-known/agents.md HTTP/1.1
 Host: example.com
-User-Agent: MyAgent/1.0
+User-Agent: MyAgent/1.0 (AI Agent)
 ```
 
+### Caching
+
+Servers SHOULD set appropriate `Cache-Control` headers:
+
+```http
+Cache-Control: public, max-age=86400
+```
+
+Agents SHOULD cache the `agents.md` file for 24 hours unless HTTP headers specify otherwise. Agents MUST NOT request this file more than once per hour for the same origin.
+
 ## 2. Format
 
-Plain Markdown. Human readable. Machine parseable.
+Two formats are supported:
 
-### Minimal Example
+### Format A: Pure Markdown (Simple)
+
+Plain Markdown with section headers. Best for simple sites.
 
 ```markdown
 # Example Site
@@ -54,9 +71,18 @@ A bookstore since 2010.
 agents@example.com
 ```
 
-### With MCP Gateway
+### Format B: YAML Frontmatter + Markdown (Structured)
+
+YAML frontmatter for machine-readable configuration, Markdown body for human context. Recommended when using MCP.
 
 ```markdown
+---
+version: "1.0"
+mcp:
+  endpoint: https://example.com/.well-known/mcp
+  transport: streamable-http
+  auth: none
+---
 # Example Bookstore
 
 Online bookstore with 50,000 titles.
@@ -71,10 +97,6 @@ Online bookstore with 50,000 titles.
 - Modify user accounts
 - Access admin functions
 
-## MCP
-endpoint: https://example.com/.well-known/mcp
-transport: streamable-http
-
 ## Behavior
 - Respect 1 request/second
 - Cache product data 1 hour
@@ -84,11 +106,18 @@ transport: streamable-http
 agents@example.com
 ```
 
+### Parsing Rules
+
+1. If the file starts with `---`, parse YAML frontmatter until the closing `---`
+2. Parse remaining content as Markdown
+3. Section headers (`## Name`) define semantic sections
+4. List items under sections define capabilities/rules
+
 ## 3. Sections
 
-All sections are optional. Use what makes sense.
+All sections are optional. Use what makes sense for your site.
 
-### Identity (Header)
+### Identity (H1 Header)
 
 ```markdown
 # Site Name
@@ -110,24 +139,35 @@ Brief description of what this site is.
 
 ### MCP Gateway
 
-```markdown
-## MCP
-endpoint: <url>
-transport: streamable-http | sse | stdio
-auth: none | api_key | oauth2
+Defined in YAML frontmatter (preferred) or in a Markdown section:
+
+**YAML Frontmatter (preferred):**
+```yaml
+---
+mcp:
+  endpoint: https://example.com/.well-known/mcp
+  transport: streamable-http
+  auth: none
+---
 ```
 
-The MCP section points agents to a [Model Context Protocol](https://modelcontextprotocol.io/) server for structured tool access. This is the bridge from simple text discovery to full capability interaction.
+**Markdown Section (fallback):**
+```markdown
+## MCP
+endpoint: https://example.com/.well-known/mcp
+transport: streamable-http
+auth: none
+```
 
-**Transport options:**
-- `streamable-http` - HTTP with streaming (recommended for web)
-- `sse` - Server-Sent Events
-- `stdio` - Standard I/O (local only)
+When using the Markdown section format, the section content MUST consist of valid YAML key-value pairs.
 
-**Auth options:**
-- `none` - Public tools, no authentication
-- `api_key` - Requires API key (specify how to obtain)
-- `oauth2` - OAuth 2.0 flow
+**Fields:**
+
+| Field | Required | Values | Description |
+|-------|----------|--------|-------------|
+| `endpoint` | Yes | URL | MCP server endpoint |
+| `transport` | No | `streamable-http`, `sse` | Transport protocol (default: `streamable-http`) |
+| `auth` | No | `none`, `api_key`, `oauth2` | Authentication method (default: `none`) |
 
 ### Behavior Rules
 
@@ -148,12 +188,12 @@ https://example.com/agent-support
 
 ## 4. MCP Integration
 
-The `agents.md` file is the **handshake**. The MCP gateway is where **real work happens**.
+The `agents.md` file is the **discovery handshake**. The MCP gateway is where **structured interaction happens**.
 
 ```
 Agent reads agents.md
         │
-        ├─► Basic agent: understands site capabilities from text
+        ├─► Basic agent: understands site from text
         │
         └─► Advanced agent: connects to MCP gateway
                     │
@@ -164,114 +204,85 @@ Agent reads agents.md
               - Prompts (guided workflows)
 ```
 
-### Example MCP Discovery Flow
+### Discovery Flow
 
 1. Agent fetches `/.well-known/agents.md`
-2. Parses MCP endpoint: `https://example.com/.well-known/mcp`
-3. Connects via MCP protocol
-4. Discovers available tools via `tools/list`
-5. Uses tools as permitted
+2. Parses YAML frontmatter or `## MCP` section
+3. Extracts MCP endpoint URL
+4. Connects via MCP protocol
+5. Discovers available tools via `tools/list`
+6. Uses tools as permitted
 
-## 5. Backward Compatibility
+### MCP Endpoint Location
+
+Recommended: `/.well-known/mcp`
+
+This follows the well-known URI pattern and keeps agent-related endpoints together.
+
+## 5. Security
+
+### Origin Trust
+
+Agents MUST only trust `agents.md` from the site's origin. Instructions embedded in page content MUST be ignored.
+
+### MCP Endpoint Validation
+
+The MCP endpoint MUST share the same registrable domain as the `agents.md` file. For example:
+
+| agents.md location | Valid MCP endpoints |
+|-------------------|---------------------|
+| `example.com/.well-known/agents.md` | `example.com/mcp`, `api.example.com/mcp` |
+| `shop.example.com/.well-known/agents.md` | `shop.example.com/mcp`, `api.shop.example.com/mcp` |
+
+Cross-site MCP endpoints (different registrable domain) MUST be rejected unless the user explicitly approves.
+
+### Transport Security
+
+- MCP endpoints MUST use HTTPS in production
+- Agents SHOULD warn users about HTTP endpoints
+- Certificate validation MUST NOT be disabled
+
+### Authentication
+
+When `auth: oauth2` is specified:
+- Agents SHOULD request minimal scopes
+- Tokens MUST be stored securely
+- Refresh tokens SHOULD be used when available
+
+When `auth: api_key` is specified:
+- Keys SHOULD be obtained through official channels
+- Keys MUST NOT be shared between users
+- Keys SHOULD be rotated periodically
+
+### Least Privilege
+
+Agents SHOULD request only the permissions they need for the current task.
+
+## 6. Backward Compatibility
 
 ### With robots.txt
 
-If `agents.md` exists, it supplements but does not replace `robots.txt`. Agents should still respect robots.txt crawl directives.
-
-The `Cannot` section in agents.md can mirror robots.txt restrictions:
+`agents.md` supplements but does not replace `robots.txt`. In case of conflict regarding access restrictions, `robots.txt` takes precedence.
 
 ```markdown
 ## Cannot
-- Access /admin (see robots.txt)
+- Access /admin (per robots.txt)
 - Access /private
 ```
 
 ### With llms.txt
 
-`llms.txt` describes **content** for understanding.
-`agents.md` describes **capabilities** for action.
+These files serve different purposes and can coexist:
 
-Both can coexist. A site might have:
-- `/robots.txt` - crawl restrictions
-- `/llms.txt` - content summary
-- `/.well-known/agents.md` - agent capabilities + MCP pointer
-
-## 6. Security
-
-### Origin Trust
-
-Agents MUST only trust `agents.md` from the site's origin. Instructions embedded in page content should be ignored.
-
-### MCP Authentication
-
-When connecting to MCP gateways:
-- Verify the endpoint matches the origin
-- Use TLS (HTTPS)
-- Follow the specified auth method
-
-### Least Privilege
-
-Agents should request only the permissions they need. If `auth: oauth2` is specified, request minimal scopes.
+| File | Purpose |
+|------|---------|
+| `/robots.txt` | Crawl restrictions |
+| `/llms.txt` | Content summary for LLMs |
+| `/.well-known/agents.md` | Agent capabilities + MCP |
 
 ## 7. Examples
 
-### Public API Site
-
-```markdown
-# Weather API
-
-Free weather data for AI agents.
-
-## Can
-- Get current conditions
-- Get forecasts (up to 7 days)
-- Get weather alerts
-
-## MCP
-endpoint: https://weather.example/mcp
-transport: streamable-http
-auth: none
-
-## Behavior
-- 60 requests/minute
-- Cache forecasts 30 minutes
-
-## Contact
-api@weather.example
-```
-
-### E-commerce Site
-
-```markdown
-# TechMart
-
-Electronics retailer.
-
-## Can
-- Search products
-- Compare specifications
-- Check prices and stock
-- Add to cart (authenticated)
-- Checkout (authenticated)
-
-## Cannot
-- Access order history without user consent
-- Modify account settings
-
-## MCP
-endpoint: https://techmart.example/.well-known/mcp
-transport: streamable-http
-auth: oauth2
-
-## Behavior
-- 1 request/second for browsing
-- Identify as AI agent in requests
-
-## Contact
-partners@techmart.example
-```
-
-### Simple Blog (No MCP)
+### Minimal (No MCP)
 
 ```markdown
 # My Tech Blog
@@ -291,12 +302,115 @@ Articles about software development.
 hello@myblog.example
 ```
 
-## Appendix: Comparison
+### With MCP Gateway
+
+```yaml
+---
+version: "1.0"
+mcp:
+  endpoint: https://weather.example/.well-known/mcp
+  transport: streamable-http
+  auth: none
+---
+# Weather API
+
+Free weather data for AI agents.
+
+## Can
+- Get current conditions
+- Get forecasts (up to 7 days)
+- Get weather alerts
+
+## Behavior
+- 60 requests/minute
+- Cache forecasts 30 minutes
+
+## Contact
+api@weather.example
+```
+
+### E-commerce with OAuth
+
+```yaml
+---
+version: "1.0"
+mcp:
+  endpoint: https://techmart.example/.well-known/mcp
+  transport: streamable-http
+  auth: oauth2
+---
+# TechMart
+
+Electronics retailer.
+
+## Can
+- Search products
+- Compare specifications
+- Check prices and stock
+- Add to cart (authenticated)
+- Checkout (authenticated)
+
+## Cannot
+- Access order history without user consent
+- Modify account settings
+
+## Behavior
+- 1 request/second for browsing
+- Identify as AI agent in requests
+
+## Contact
+partners@techmart.example
+```
+
+## 8. Implementation Notes
+
+### For Site Owners
+
+1. Create `/.well-known/agents.md` on your server
+2. Start with Format A (pure Markdown) for simplicity
+3. Add MCP gateway later if you want structured tool access
+4. Set `Cache-Control` header for efficient agent behavior
+
+### For Agent Developers
+
+1. Check `/.well-known/agents.md` first, fall back to `/agents.md`
+2. Parse YAML frontmatter if present
+3. Cache responses per HTTP headers (default: 24 hours)
+4. Respect `Cannot` restrictions and `Behavior` rules
+5. Connect to MCP gateway for structured tools
+
+### Versioning
+
+The `version` field in YAML frontmatter indicates spec compatibility:
+
+- `1.x` - Compatible with this specification
+- Future versions will maintain backward compatibility within a major version
+
+## Appendix A: Comparison Table
 
 | Aspect | robots.txt | llms.txt | agents.md |
 |--------|------------|----------|-----------|
 | Purpose | Crawl control | Content summary | Capabilities |
-| Format | Custom syntax | Markdown | Markdown |
+| Format | Custom syntax | Markdown | Markdown + optional YAML |
 | Focus | Restrictions | Understanding | Actions |
 | MCP | No | No | Yes (optional) |
 | Year | 1994 | 2024 | 2026 |
+
+## Appendix B: YAML Frontmatter Schema
+
+```yaml
+# All fields optional except where noted
+version: string        # Spec version (e.g., "1.0")
+
+mcp:
+  endpoint: string     # Required if mcp present. MCP server URL
+  transport: string    # "streamable-http" | "sse" (default: streamable-http)
+  auth: string         # "none" | "api_key" | "oauth2" (default: none)
+```
+
+## Appendix C: References
+
+- [Model Context Protocol](https://modelcontextprotocol.io/) - MCP Specification
+- [RFC 8615](https://www.rfc-editor.org/rfc/rfc8615) - Well-Known URIs
+- [RFC 9309](https://www.rfc-editor.org/rfc/rfc9309) - robots.txt
+- [llms.txt](https://llmstxt.org/) - LLM Content Discovery