From f2b3a146854f03ea69f155291d2caca35c714aa5 Mon Sep 17 00:00:00 2001
From: Bruno Sarlo <bruno@sarlo.uy>
Date: Wed, 14 Jan 2026 09:01:56 -0300
Subject: [PATCH] v0.2: Simplify spec + add MCP gateway integration

Major revision based on first principles thinking:
- Simplified format: plain Markdown, human readable
- Focus on capabilities (Can/Cannot) not API schemas
- MCP gateway pointer for structured tool access
- Clear positioning vs robots.txt and llms.txt

The agents.md file is the handshake.
The MCP gateway is where real work happens.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
---
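Reviewer note (not part of the commit): the section layout defined in spec/README.md below (a `# ` title, `## Can` / `## Cannot` bullet lists, and a `## MCP` block of `key: value` lines) can be read with very little code. The following is a minimal sketch under that assumption only; the function names `parse_agents_md`, `bullets`, and `key_values` are hypothetical and are not defined by the spec.

```python
def parse_agents_md(text):
    """Split an agents.md document into its title and its `## ` sections."""
    doc = {"title": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            # The first level-1 heading is treated as the site name.
            doc["title"] = line[2:].strip()
        elif line.startswith("## "):
            # Section names are lower-cased so "Can" and "can" match.
            current = line[3:].strip().lower()
            doc["sections"][current] = []
        elif current is not None and line:
            doc["sections"][current].append(line)
    return doc


def bullets(lines):
    """Return the text of each `- ` list item."""
    return [l[2:].strip() for l in lines if l.startswith("- ")]


def key_values(lines):
    """Parse `key: value` lines, splitting on the first colon only."""
    out = {}
    for l in lines:
        key, sep, value = l.partition(":")
        if sep:
            out[key.strip()] = value.strip()
    return out


SAMPLE = """\
# My Site

An online bookstore.

## Can
- Search and browse
- Check prices

## MCP
endpoint: https://mysite.com/.well-known/mcp
transport: streamable-http
auth: oauth2
"""

doc = parse_agents_md(SAMPLE)
can = bullets(doc["sections"].get("can", []))
mcp = key_values(doc["sections"].get("mcp", []))
```

Splitting on the first colon keeps the `https://` in the endpoint value intact. Free-form sections such as `## Contact` are left as raw lines, since the spec does not give them a key/value shape.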
 README.md               |  88 ++++++++++++
 docs/FAQ.md             | 111 +++++++++++++++
 examples/ecommerce.md   | 165 ++++++++++++++++++++++
 examples/weather-api.md | 106 ++++++++++++++
 spec/README.md          | 302 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 772 insertions(+)
 create mode 100644 README.md
 create mode 100644 docs/FAQ.md
 create mode 100644 examples/ecommerce.md
 create mode 100644 examples/weather-api.md
 create mode 100644 spec/README.md
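
Reviewer note (not part of the commit): the "MCP Authentication" rules in spec/README.md ask agents to verify that the endpoint matches the origin and to use TLS. One possible reading of that check is sketched below, comparing scheme, host, and port. The helper names `origin` and `endpoint_is_trusted` are hypothetical, and the spec does not say whether subdomains count as the same origin; this sketch assumes they do not.

```python
from urllib.parse import urlsplit

# Default ports, so that https://example.com and https://example.com:443 compare equal.
DEFAULT_PORTS = {"https": 443, "http": 80}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def endpoint_is_trusted(agents_md_url, endpoint_url):
    """True if the MCP endpoint uses HTTPS and shares the origin of agents.md."""
    if urlsplit(endpoint_url).scheme != "https":
        return False
    return origin(agents_md_url) == origin(endpoint_url)
```

For example, `endpoint_is_trusted("https://example.com/.well-known/agents.md", "https://example.com/.well-known/mcp")` is true, while an endpoint on a different host or over plain HTTP is rejected.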

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c171f6a
--- /dev/null
+++ b/README.md
@@ -0,0 +1,88 @@
+# agents.md
+
+**Tell AI agents what they can do on your website.**
+
+## The Gap
+
+| File | Purpose |
+|------|---------|
+| robots.txt | What bots **cannot access** |
+| llms.txt | What **content matters** |
+| **agents.md** | What agents **can do** |
+
+## Quick Start
+
+Create `/.well-known/agents.md`:
+
+```markdown
+# My Site
+
+An online bookstore.
+
+## Can
+- Search catalog
+- Read book details
+- Check availability
+
+## Cannot
+- Place orders (requires human)
+
+## Contact
+hello@mysite.com
+```
+
+That's it. Plain text. Human readable.
+
+## With MCP Gateway
+
+Point agents to your MCP server for structured tool access:
+
+```markdown
+# My Site
+
+An online bookstore.
+
+## Can
+- Search and browse
+- Check prices
+- Place orders (authenticated)
+
+## MCP
+endpoint: https://mysite.com/.well-known/mcp
+transport: streamable-http
+auth: oauth2
+
+## Contact
+hello@mysite.com
+```
+
+## How It Works
+
+```
+Agent requests /.well-known/agents.md
+              │
+              ├─► Basic: reads text, understands capabilities
+              │
+              └─► Advanced: connects to MCP gateway for tools
+```
+
+## Documentation
+
+- [Specification](./spec/README.md) - Full protocol spec
+- [Examples](./examples/) - Real-world examples
+- [FAQ](./docs/FAQ.md) - Common questions
+
+## Status
+
+**Draft** - Version 0.2.0
+
+## Related Standards
+
+- [robots.txt](https://www.rfc-editor.org/rfc/rfc9309) - Crawl restrictions (1994)
+- [llms.txt](https://llmstxt.org/) - Content for LLMs (2024)
+- [AGENTS.md](https://agents.md/) - Repository instructions (2025)
+- [MCP](https://modelcontextprotocol.io/) - Tool protocol (2024)
+
+## License
+
+CC0 1.0 Universal - Public Domain
diff --git a/docs/FAQ.md b/docs/FAQ.md
new file mode 100644
index 0000000..bc8ed16
--- /dev/null
+++ b/docs/FAQ.md
@@ -0,0 +1,111 @@
+# Frequently Asked Questions
+
+## General
+
+### Why not just use robots.txt?
+
+robots.txt tells bots what they *cannot* do. agents.md tells AI agents what they *can* do. They're complementary:
+
+- robots.txt: "Don't crawl /admin"
+- agents.md: "You can search our catalog via this API"
+
+### Why Markdown?
+
+1. Human readable and editable
+2. Widely supported in documentation tools
+3. YAML frontmatter is a proven pattern
+4. Renders nicely on GitHub and documentation sites
+
+### Is this related to MCP (Model Context Protocol)?
+
+Yes. agents.md is inspired by MCP but designed for web discovery rather than local tool execution. Its tool definition format resembles MCP's so that it feels familiar to developers already using MCP.
+
+### Why /.well-known/?
+
+It follows RFC 8615, which defines well-known URIs. The same pattern is used by:
+- `/.well-known/security.txt`
+- `/.well-known/apple-app-site-association`
+- `/.well-known/openid-configuration`
+
+## Implementation
+
+### Do I need to remove my robots.txt?
+
+No. Keep your robots.txt for traditional crawlers. The `Cannot` section in agents.md can mirror or extend those rules for AI agents.
+
+### Can I have different capabilities for different agents?
+
+Yes. Serve different content depending on the User-Agent header, or implement OAuth2 scopes for fine-grained access control.
+
+### How should agents authenticate?
+
+Start simple:
+1. No auth for public read-only tools
+2. API keys for rate-limited or premium features
+3. OAuth2 for user-specific actions
+
+### What if my API changes?
+
+Update your agents.md. Agents should re-fetch it periodically, respecting Cache-Control headers.
+
+## Security
+
+### Can malicious sites trick agents?
+
+The protocol specifies that agents MUST only trust agents.md served from the site's own origin. Instructions embedded in page content should be ignored.
+
+### How do I prevent abuse?
+
+1. Use rate limits (per-tool and global)
+2. Require API keys for sensitive operations
+3. Use OAuth2 scopes for user actions
+4. Monitor usage patterns
+
+### Should I expose all my APIs?
+
+No. Only expose what you want agents to use. Internal APIs, admin endpoints, and sensitive operations should not appear in agents.md.
+
+## Compatibility
+
+### What about GraphQL APIs?
+
+You can define tools that call GraphQL endpoints:
+
+```yaml
+tools:
+  - name: query_products
+    endpoint: "POST /graphql"
+    parameters:
+      type: object
+      properties:
+        query:
+          type: string
+          description: "GraphQL query (limited to Product type)"
+```
+
+### Can I use this with OpenAPI/Swagger?
+
+Yes! You can generate agents.md from OpenAPI specs. We're working on tooling for this.
+
+### What about WebSocket endpoints?
+
+agents.md focuses on request-response patterns. For real-time features, document WebSocket endpoints in the Markdown section but define polling alternatives as tools.
+
+## Adoption
+
+### How do I tell if a site supports agents.md?
+
+1. Check `/.well-known/agents.md`
+2. Look for `<link rel="agent">` in the HTML
+3. Check for a `Link` header in the HTTP response
+
+### What if the site doesn't have agents.md?
+
+Fall back to:
+1. Traditional web scraping (respecting robots.txt)
+2. Looking for documented APIs
+3. Using general browsing capabilities
+
+### Who decides what goes in agents.md?
+
+Site owners. This is an opt-in protocol. Sites choose what capabilities to expose.
diff --git a/examples/ecommerce.md b/examples/ecommerce.md
new file mode 100644
index 0000000..a3f7207
--- /dev/null
+++ b/examples/ecommerce.md
@@ -0,0 +1,165 @@
+# Example: E-commerce Site
+
+This example shows how an e-commerce site might expose its API to AI agents.
+
+## agent.md
+
+```yaml
+---
+protocol_version: "0.1"
+name: "Tech Store"
+description: "Browse products, check prices, and manage wishlists"
+
+robots:
+  disallow:
+    - /admin
+    - /checkout
+    - /account/orders
+  crawl_delay: 2
+
+tools:
+  - name: search_products
+    description: "Search for products by name, category, or features"
+    endpoint: "GET /api/products/search"
+    parameters:
+      type: object
+      properties:
+        q:
+          type: string
+          description: "Search query"
+        category:
+          type: string
+          enum: ["laptops", "phones", "tablets", "accessories"]
+        min_price:
+          type: number
+          minimum: 0
+        max_price:
+          type: number
+        in_stock:
+          type: boolean
+          default: true
+        sort:
+          type: string
+          enum: ["price_asc", "price_desc", "rating", "newest"]
+          default: "rating"
+        limit:
+          type: integer
+          default: 20
+          maximum: 50
+      required:
+        - q
+    auth: none
+    rate_limit: "100/minute"
+
+  - name: get_product
+    description: "Get detailed product information including specs and reviews"
+    endpoint: "GET /api/products/{id}"
+    parameters:
+      type: object
+      properties:
+        id:
+          type: string
+        include_reviews:
+          type: boolean
+          default: false
+      required:
+        - id
+    auth: none
+
+  - name: compare_products
+    description: "Compare specifications of multiple products"
+    endpoint: "POST /api/products/compare"
+    parameters:
+      type: object
+      properties:
+        product_ids:
+          type: array
+          items:
+            type: string
+          minItems: 2
+          maxItems: 5
+      required:
+        - product_ids
+    auth: none
+
+  - name: get_price_history
+    description: "Get price history for a product"
+    endpoint: "GET /api/products/{id}/price-history"
+    parameters:
+      type: object
+      properties:
+        id:
+          type: string
+        days:
+          type: integer
+          default: 30
+          maximum: 365
+      required:
+        - id
+    auth: api_key
+    scopes:
+      - price:read
+
+  - name: add_to_wishlist
+    description: "Add a product to user's wishlist"
+    endpoint: "POST /api/wishlist"
+    parameters:
+      type: object
+      properties:
+        product_id:
+          type: string
+        notify_on_sale:
+          type: boolean
+          default: true
+      required:
+        - product_id
+    auth: oauth2
+    scopes:
+      - wishlist:write
+
+auth:
+  api_key:
+    header: "X-API-Key"
+    obtain: "https://techstore.example/developers"
+
+  oauth2:
+    authorization_url: "https://techstore.example/oauth/authorize"
+    token_url: "https://techstore.example/oauth/token"
+    scopes:
+      wishlist:read: "View your wishlist"
+      wishlist:write: "Modify your wishlist"
+      price:read: "Access price history data"
+
+rate_limits:
+  global: "1000/hour"
+  per_tool: true
+
+contact:
+  email: "api@techstore.example"
+  url: "https://techstore.example/developers/docs"
+---
+
+# Tech Store Agent API
+
+AI agents can help users find products, compare prices, and track deals.
+
+## Public Tools (No Auth)
+
+- **search_products** - Find products by name or category
+- **get_product** - Get detailed product info
+- **compare_products** - Side-by-side comparison
+
+## Authenticated Tools
+
+### API Key Required
+- **get_price_history** - Historical pricing data
+
+### OAuth2 Required
+- **add_to_wishlist** - Save products for later
+
+## Best Practices
+
+1. Cache product details (they don't change often)
+2. Use price history to advise on purchase timing
+3. Respect rate limits during peak hours
+```
diff --git a/examples/weather-api.md b/examples/weather-api.md
new file mode 100644
index 0000000..d103f1f
--- /dev/null
+++ b/examples/weather-api.md
@@ -0,0 +1,106 @@
+# Example: Weather Service
+
+A simple weather API demonstrating minimal and practical agent.md usage.
+
+## agent.md
+
+```yaml
+---
+protocol_version: "0.1"
+name: "Weather Service"
+description: "Current weather and forecasts for any location"
+
+tools:
+  - name: get_current
+    description: "Get current weather conditions for a location"
+    endpoint: "GET /api/weather/current"
+    parameters:
+      type: object
+      properties:
+        location:
+          type: string
+          description: "City name, address, or coordinates (lat,lon)"
+        units:
+          type: string
+          enum: ["metric", "imperial"]
+          default: "metric"
+      required:
+        - location
+    response:
+      type: json
+      schema:
+        type: object
+        properties:
+          temperature:
+            type: number
+          feels_like:
+            type: number
+          humidity:
+            type: integer
+          conditions:
+            type: string
+          wind_speed:
+            type: number
+    auth: none
+    rate_limit: "60/minute"
+
+  - name: get_forecast
+    description: "Get weather forecast for upcoming days"
+    endpoint: "GET /api/weather/forecast"
+    parameters:
+      type: object
+      properties:
+        location:
+          type: string
+        days:
+          type: integer
+          minimum: 1
+          maximum: 14
+          default: 7
+        units:
+          type: string
+          enum: ["metric", "imperial"]
+          default: "metric"
+      required:
+        - location
+    auth: api_key
+    rate_limit: "30/minute"
+
+  - name: get_alerts
+    description: "Get active weather alerts for a location"
+    endpoint: "GET /api/weather/alerts"
+    parameters:
+      type: object
+      properties:
+        location:
+          type: string
+      required:
+        - location
+    auth: none
+
+auth:
+  api_key:
+    header: "X-Weather-Key"
+    obtain: "https://weather.example/api-keys"
+    description: "Free tier: 1000 requests/day"
+
+rate_limits:
+  global: "1000/day"
+
+contact:
+  url: "https://weather.example/api/docs"
+---
+
+# Weather API for Agents
+
+Simple, reliable weather data.
+
+## Free Tools
+- Current conditions (no key needed)
+- Weather alerts (no key needed)
+
+## API Key Required
+- Extended forecasts (up to 14 days)
+
+Get your free API key at weather.example/api-keys
+```
diff --git a/spec/README.md b/spec/README.md
new file mode 100644
index 0000000..b2c4251
--- /dev/null
+++ b/spec/README.md
@@ -0,0 +1,302 @@
+# agents.md Protocol Specification
+
+**Version:** 0.2.0
+**Status:** Draft
+**Updated:** 2026-01-14
+
+## Abstract
+
+A simple text file that tells AI agents what they can do on a website and optionally points them to an MCP gateway for structured tool access.
+
+## Philosophy
+
+| Standard | Tells agents... |
+|----------|-----------------|
+| robots.txt | What you **cannot access** |
+| llms.txt | What **content is important** |
+| **agents.md** | What you **can do** + where to connect |
+
+## 1. Discovery
+
+**Location:** `/.well-known/agents.md` or `/agents.md`
+
+**Content-Type:** `text/markdown` or `text/plain`
+
+Agents request the file like any HTTP resource:
+
+```
+GET /.well-known/agents.md HTTP/1.1
+Host: example.com
+User-Agent: MyAgent/1.0
+```
+
+## 2. Format
+
+Plain Markdown. Human readable. Machine parseable.
+
+### Minimal Example
+
+```markdown
+# Example Site
+
+A bookstore since 2010.
+
+## Can
+- Search catalog
+- Read book details
+- Check availability
+
+## Cannot
+- Place orders without a human
+- Access user accounts
+
+## Contact
+agents@example.com
+```
+
+### With MCP Gateway
+
+```markdown
+# Example Bookstore
+
+Online bookstore with 50,000 titles.
+
+## Can
+- Search and browse catalog
+- Read reviews and descriptions
+- Check prices and stock
+- Place orders (authenticated)
+
+## Cannot
+- Modify user accounts
+- Access admin functions
+
+## MCP
+endpoint: https://example.com/.well-known/mcp
+transport: streamable-http
+
+## Behavior
+- Respect 1 request/second
+- Cache product data 1 hour
+- Identify in User-Agent header
+
+## Contact
+agents@example.com
+```
+
+## 3. Sections
+
+All sections are optional. Use what makes sense.
+
+### Identity (Header)
+
+```markdown
+# Site Name
+
+Brief description of what this site is.
+```
+
+### Capabilities (Can/Cannot)
+
+```markdown
+## Can
+- Action agents are allowed to take
+- Another allowed action
+
+## Cannot
+- Restricted action
+- Another restriction
+```
+
+### MCP Gateway
+
+```markdown
+## MCP
+endpoint: <url>
+transport: streamable-http | sse | stdio
+auth: none | api_key | oauth2
+```
+
+The MCP section points agents to a [Model Context Protocol](https://modelcontextprotocol.io/) server for structured tool access. This is the bridge from simple text discovery to full capability interaction.
+
+**Transport options:**
+- `streamable-http` - HTTP with streaming (recommended for web)
+- `sse` - Server-Sent Events
+- `stdio` - Standard I/O (local only)
+
+**Auth options:**
+- `none` - Public tools, no authentication
+- `api_key` - Requires API key (specify how to obtain)
+- `oauth2` - OAuth 2.0 flow
+
+### Behavior Rules
+
+```markdown
+## Behavior
+- Rate limit guidance
+- Caching expectations
+- Identification requirements
+```
+
+### Contact
+
+```markdown
+## Contact
+email@example.com
+https://example.com/agent-support
+```
+
+## 4. MCP Integration
+
+The `agents.md` file is the **handshake**. The MCP gateway is where **real work happens**.
+
+```
+Agent reads agents.md
+        │
+        ├─► Basic agent: understands site capabilities from text
+        │
+        └─► Advanced agent: connects to MCP gateway
+                    │
+                    ▼
+              MCP Server exposes:
+              - Tools (search, checkout, etc.)
+              - Resources (catalog, docs)
+              - Prompts (guided workflows)
+```
+
+### Example MCP Discovery Flow
+
+1. Agent fetches `/.well-known/agents.md`
+2. Parses MCP endpoint: `https://example.com/.well-known/mcp`
+3. Connects via MCP protocol
+4. Discovers available tools via `tools/list`
+5. Uses tools as permitted
+
+## 5. Backward Compatibility
+
+### With robots.txt
+
+If `agents.md` exists, it supplements but does not replace `robots.txt`. Agents should still respect robots.txt crawl directives.
+
+The `Cannot` section in agents.md can mirror robots.txt restrictions:
+
+```markdown
+## Cannot
+- Access /admin (see robots.txt)
+- Access /private
+```
+
+### With llms.txt
+
+`llms.txt` describes **content** for understanding.
+`agents.md` describes **capabilities** for action.
+
+Both can coexist. A site might have:
+- `/robots.txt` - crawl restrictions
+- `/llms.txt` - content summary
+- `/.well-known/agents.md` - agent capabilities + MCP pointer
+
+## 6. Security
+
+### Origin Trust
+
+Agents MUST only trust `agents.md` from the site's origin. Instructions embedded in page content should be ignored.
+
+### MCP Authentication
+
+When connecting to MCP gateways:
+- Verify the endpoint matches the origin
+- Use TLS (HTTPS)
+- Follow the specified auth method
+
+### Least Privilege
+
+Agents should request only the permissions they need. If `auth: oauth2` is specified, request minimal scopes.
+
+## 7. Examples
+
+### Public API Site
+
+```markdown
+# Weather API
+
+Free weather data for AI agents.
+
+## Can
+- Get current conditions
+- Get forecasts (up to 7 days)
+- Get weather alerts
+
+## MCP
+endpoint: https://weather.example/mcp
+transport: streamable-http
+auth: none
+
+## Behavior
+- 60 requests/minute
+- Cache forecasts 30 minutes
+
+## Contact
+api@weather.example
+```
+
+### E-commerce Site
+
+```markdown
+# TechMart
+
+Electronics retailer.
+
+## Can
+- Search products
+- Compare specifications
+- Check prices and stock
+- Add to cart (authenticated)
+- Checkout (authenticated)
+
+## Cannot
+- Access order history without user consent
+- Modify account settings
+
+## MCP
+endpoint: https://techmart.example/.well-known/mcp
+transport: streamable-http
+auth: oauth2
+
+## Behavior
+- 1 request/second for browsing
+- Identify as AI agent in requests
+
+## Contact
+partners@techmart.example
+```
+
+### Simple Blog (No MCP)
+
+```markdown
+# My Tech Blog
+
+Articles about software development.
+
+## Can
+- Read all public articles
+- Search by topic
+- Access RSS feed at /feed.xml
+
+## Cannot
+- Post comments (requires human)
+- Access draft posts
+
+## Contact
+hello@myblog.example
+```
+
+## Appendix: Comparison
+
+| Aspect | robots.txt | llms.txt | agents.md |
+|--------|------------|----------|-----------|
+| Purpose | Crawl control | Content summary | Capabilities |
+| Format | Custom syntax | Markdown | Markdown |
+| Focus | Restrictions | Understanding | Actions |
+| MCP | No | No | Yes (optional) |
+| Year | 1994 | 2024 | 2026 |