
Bruno Sarlo da9207ded3 v1.0.0-draft: First public draft release

agents.md protocol for AI agent web discovery.

Key features:
- Two formats: Pure Markdown (simple) or YAML frontmatter (structured)
- MCP gateway integration for tool access
- Discovery via /.well-known/agents.md
- Security: origin trust, endpoint validation, auth guidance
- Backward compatible with robots.txt and llms.txt

Design based on 3-iteration process:
1. Gap analysis and planning
2. Multi-model consensus on format decisions
3. Code review for completeness and clarity

Philosophy: robots.txt says what agents CANNOT do,
agents.md says what they CAN do.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-14 09:18:13 -03:00

README.md

agents.md Protocol Specification

Version: 1.0.0-draft
Status: Draft
Updated: 2026-01-14

Abstract

agents.md is a simple text file that tells AI agents what they can do on a website and, optionally, points them to an MCP gateway for structured tool access.

Philosophy

Standard	Tells agents...
robots.txt	What you cannot access
llms.txt	What content is important
agents.md	What you can do + where to connect

1. Discovery

Location

Primary: /.well-known/agents.md
Fallback: /agents.md

Content-Type

text/markdown or text/plain

Request

GET /.well-known/agents.md HTTP/1.1
Host: example.com
User-Agent: MyAgent/1.0 (AI Agent)
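
The lookup order above (primary path first, then the fallback) can be sketched in Python. This is a minimal illustration, not a reference client: the names `discover`, `http_get`, and `CANDIDATE_PATHS` are hypothetical, and rejecting responses whose Content-Type is neither `text/markdown` nor `text/plain` is an assumption about how strictly a client might read the Content-Type section.

```python
from typing import Callable, Optional
from urllib.request import Request, urlopen

# Paths in the order the spec lists them: primary first, then fallback.
CANDIDATE_PATHS = ("/.well-known/agents.md", "/agents.md")
ACCEPTED_TYPES = ("text/markdown", "text/plain")

Fetcher = Callable[[str], Optional[tuple[str, str]]]


def http_get(url: str) -> Optional[tuple[str, str]]:
    """Return (content_type, body) for a successful response, else None."""
    request = Request(url, headers={"User-Agent": "MyAgent/1.0 (AI Agent)"})
    try:
        with urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.headers.get_content_type(), body
    except OSError:
        # URLError, HTTPError (e.g. 404) and timeouts are all OSError subclasses.
        return None


def discover(origin: str, get: Fetcher = http_get) -> Optional[str]:
    """Try each candidate path in order and return the first usable body."""
    for path in CANDIDATE_PATHS:
        result = get(origin.rstrip("/") + path)
        if result is None:
            continue
        content_type, body = result
        # Assumption: skip anything not served as Markdown or plain text.
        if content_type in ACCEPTED_TYPES:
            return body
    return None
```

Passing the fetcher as a parameter keeps the lookup logic testable without network access; `discover("https://example.com")` would use the real HTTP fetcher.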

Caching

Servers SHOULD set appropriate Cache-Control headers:

Cache-Control: public, max-age=86400

Agents SHOULD cache the agents.md file for 24 hours unless HTTP caching headers specify a different lifetime. Regardless of those headers, agents MUST NOT request this file more than once per hour for the same origin.
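
One way an agent might combine the 24-hour default, the server's `max-age`, and the one-request-per-hour floor is sketched below. The function names are hypothetical, and the sketch deliberately ignores other Cache-Control directives such as `no-store`, which a real client would also need to handle.

```python
import re
from typing import Optional

DEFAULT_TTL = 24 * 60 * 60      # default lifetime when no header says otherwise
MIN_REFETCH_INTERVAL = 60 * 60  # floor: at most one request per hour per origin


def max_age_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Extract max-age from a Cache-Control header value, if present."""
    if not cache_control:
        return None
    match = re.search(r"(?:^|[\s,])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def seconds_until_refetch(cache_control: Optional[str], age: int) -> int:
    """How long to keep using a cached copy that is `age` seconds old."""
    ttl = max_age_seconds(cache_control)
    if ttl is None:
        ttl = DEFAULT_TTL
    # Even a very short max-age cannot push the agent below the hourly floor.
    ttl = max(ttl, MIN_REFETCH_INTERVAL)
    return max(0, ttl - age)
```

With the header from the example above, a copy fetched one hour ago stays valid for another 23 hours; with `max-age=60` the hourly floor still applies.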

2. Format

Two formats are supported:

Format A: Pure Markdown (Simple)

Plain Markdown with section headers. Best for simple sites.

# Example Site

A bookstore since 2010.

## Can
- Search catalog
- Read book details
- Check availability

## Cannot
- Place orders without a human
- Access user accounts

## Contact
agents@example.com

Format B: YAML Frontmatter + Markdown (Structured)

YAML frontmatter for machine-readable configuration, Markdown body for human context. Recommended when using MCP.

---
version: "1.0"
mcp:
  endpoint: https://example.com/.well-known/mcp
  transport: streamable-http
  auth: none
---
# Example Bookstore

Online bookstore with 50,000 titles.

## Can
- Search and browse catalog
- Read reviews and descriptions
- Check prices and stock
- Place orders (authenticated)

## Cannot
- Modify user accounts
- Access admin functions

## Behavior
- Respect 1 request/second
- Cache product data 1 hour
- Identify in User-Agent header

## Contact
agents@example.com

Parsing Rules

1. If file starts with ---, parse YAML frontmatter until closing ---
2. Parse remaining content as Markdown
3. Section headers (## Name) define semantic sections
4. List items under sections define capabilities/rules
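
The parsing rules above can be sketched with the Python standard library alone. This is an illustrative reading of the rules, not a normative parser: `AgentsFile` and `parse_agents_md` are hypothetical names, the frontmatter is returned as raw text (a real implementation would hand it to a YAML parser), and keeping non-list lines such as a contact address under their section is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AgentsFile:
    frontmatter: str = ""  # raw YAML text, left unparsed in this sketch
    title: str = ""        # text of the H1 header, if any
    sections: dict[str, list[str]] = field(default_factory=dict)


def parse_agents_md(text: str) -> AgentsFile:
    lines = text.splitlines()
    parsed = AgentsFile()

    # Rule 1: a file starting with --- carries frontmatter up to the closing ---.
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                parsed.frontmatter = "\n".join(lines[1:index])
                lines = lines[index + 1:]
                break

    # Rules 2-4: walk the Markdown body, grouping entries under "## Name" headers.
    current = None
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:].strip()
            parsed.sections.setdefault(current, [])
        elif stripped.startswith("# ") and not parsed.title:
            parsed.title = stripped[2:].strip()
        elif current is not None and stripped:
            # List markers are dropped; plain lines (e.g. an email) are kept as-is.
            entry = stripped[2:] if stripped.startswith("- ") else stripped
            parsed.sections[current].append(entry.strip())
    return parsed
```

Applied to the Format B example, `parse_agents_md(text).sections["Cannot"]` would hold the two restricted actions as plain strings.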

3. Sections

All sections are optional. Use what makes sense for your site.

Identity (H1 Header)

# Site Name

Brief description of what this site is.

Capabilities (Can/Cannot)

## Can
- Action agents are allowed to take
- Another allowed action

## Cannot
- Restricted action
- Another restriction

MCP Gateway

The MCP gateway is defined in YAML frontmatter (preferred) or in a Markdown section:

YAML Frontmatter (preferred):

---
mcp:
  endpoint: https://example.com/.well-known/mcp
  transport: streamable-http
  auth: none
---

Markdown Section (fallback):

## MCP
endpoint: https://example.com/.well-known/mcp
transport: streamable-http
auth: none

When using Markdown section format, content MUST be valid YAML key-value pairs.

Fields:

Field	Required	Values	Description
`endpoint`	Yes	URL	MCP server endpoint
`transport`	No	`streamable-http`, `sse`	Transport protocol (default: `streamable-http`)
`auth`	No	`none`, `api_key`, `oauth2`	Authentication method (default: `none`)
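
The field table translates directly into a small validation step. The sketch below assumes the `mcp` block has already been parsed into a Python mapping; `McpConfig` and `mcp_config_from_mapping` are hypothetical names, and treating a URL without a host as invalid is an assumption about what "URL" means here.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

TRANSPORTS = ("streamable-http", "sse")
AUTH_METHODS = ("none", "api_key", "oauth2")


@dataclass(frozen=True)
class McpConfig:
    endpoint: str
    transport: str = "streamable-http"  # default from the field table
    auth: str = "none"                  # default from the field table


def mcp_config_from_mapping(raw: dict) -> McpConfig:
    """Apply defaults and reject values outside the documented sets."""
    endpoint = raw.get("endpoint")
    if not isinstance(endpoint, str) or not urlsplit(endpoint).netloc:
        raise ValueError("mcp.endpoint is required and must be an absolute URL")
    config = McpConfig(
        endpoint=endpoint,
        transport=raw.get("transport", "streamable-http"),
        auth=raw.get("auth", "none"),
    )
    if config.transport not in TRANSPORTS:
        raise ValueError(f"unsupported transport: {config.transport!r}")
    if config.auth not in AUTH_METHODS:
        raise ValueError(f"unsupported auth method: {config.auth!r}")
    return config
```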

Behavior Rules

## Behavior
- Rate limit guidance
- Caching expectations
- Identification requirements

Contact

## Contact
email@example.com
https://example.com/agent-support

4. MCP Integration

The agents.md file is the discovery handshake. The MCP gateway is where structured interaction happens.

Agent reads agents.md
        │
        ├─► Basic agent: understands site from text
        │
        └─► Advanced agent: connects to MCP gateway
                    │
                    ▼
              MCP Server exposes:
              - Tools (search, checkout, etc.)
              - Resources (catalog, docs)
              - Prompts (guided workflows)

Discovery Flow

1. Agent fetches /.well-known/agents.md
2. Parses YAML frontmatter or ## MCP section
3. Extracts MCP endpoint URL
4. Connects via MCP protocol
5. Discovers available tools via tools/list
6. Uses tools as permitted
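
The middle of this flow (locating the MCP settings, then asking for the tool list) might look like the sketch below. It is a simplification under stated assumptions: `find_mcp_settings` and `tools_list_request` are hypothetical helpers, the scanner only understands flat `key: value` lines rather than full YAML, and a real MCP session performs an initialization handshake before `tools/list` is sent.

```python
import json
from typing import Optional


def _key_values(lines: list[str]) -> dict[str, str]:
    """Read flat `key: value` lines; anything else is ignored."""
    pairs = {}
    for line in lines:
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            pairs[key.strip()] = value.strip().strip("\"'")
    return pairs


def find_mcp_settings(text: str) -> Optional[dict[str, str]]:
    lines = text.splitlines()
    block: list[str] = []
    # Preferred source: indented lines under a top-level `mcp:` key in frontmatter.
    if lines and lines[0].strip() == "---":
        inside = False
        for line in lines[1:]:
            if line.strip() == "---":
                break
            if line.rstrip() == "mcp:":
                inside = True
            elif inside and line.startswith((" ", "\t")):
                block.append(line)
            else:
                inside = False
    # Fallback source: the lines under a `## MCP` Markdown section.
    if not block:
        inside = False
        for line in lines:
            if line.startswith("## "):
                inside = line[3:].strip() == "MCP"
            elif inside:
                block.append(line)
    settings = _key_values(block)
    return settings if "endpoint" in settings else None


def tools_list_request(request_id: int = 1) -> str:
    """Serialize the JSON-RPC 2.0 message that asks an MCP server for its tools."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})
```

A basic agent that finds no settings (`None`) simply stops after reading the Markdown body, which matches the "basic agent" branch of the diagram.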

MCP Endpoint Location

Recommended: /.well-known/mcp

This follows the well-known URI pattern and keeps agent-related endpoints together.

5. Security

Origin Trust

Agents MUST only trust agents.md from the site's origin. Instructions embedded in page content MUST be ignored.

MCP Endpoint Validation

The MCP endpoint MUST share the same registrable domain as the agents.md file. For example:

agents.md location	Valid MCP endpoints
`example.com/.well-known/agents.md`	`example.com/mcp`, `api.example.com/mcp`
`shop.example.com/.well-known/agents.md`	`shop.example.com/mcp`, `api.shop.example.com/mcp`

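MCP endpoints on a different registrable domain MUST be rejected unless the user explicitly approves them. A subdomain such as api.example.com is a different origin from example.com but shares its registrable domain, so it passes this check.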
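
A sketch of the registrable-domain check follows. The important caveat is in `naive_registrable_domain`: taking the last two DNS labels is only an illustration and gives wrong answers for suffixes such as `co.uk`; a real agent would consult the Public Suffix List. The function names and the `user_approved` flag are hypothetical.

```python
from typing import Callable
from urllib.parse import urlsplit


def naive_registrable_domain(host: str) -> str:
    """Last two DNS labels. ASSUMPTION for illustration; wrong for e.g. co.uk."""
    return ".".join(host.lower().rstrip(".").split(".")[-2:])


def endpoint_allowed(
    agents_md_url: str,
    endpoint_url: str,
    registrable_domain: Callable[[str], str] = naive_registrable_domain,
    user_approved: bool = False,
) -> bool:
    """Accept same-site endpoints; others only with explicit user approval."""
    source_host = urlsplit(agents_md_url).hostname
    target_host = urlsplit(endpoint_url).hostname
    if not source_host or not target_host:
        return False  # not an absolute URL, nothing to compare
    same_site = registrable_domain(source_host) == registrable_domain(target_host)
    return same_site or user_approved
```

Taking the domain function as a parameter lets a caller swap in a Public Suffix List lookup without touching the comparison logic.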

Transport Security

- MCP endpoints MUST use HTTPS in production
- Agents SHOULD warn users about HTTP endpoints
- Certificate validation MUST NOT be disabled
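
These three rules could be enforced with a scheme check plus a default TLS context, as in the sketch below. `classify_endpoint` and `strict_tls_context` are hypothetical names, and rejecting schemes other than `http` and `https` outright is an assumption the spec does not state.

```python
import ssl
from urllib.parse import urlsplit


def classify_endpoint(endpoint: str) -> str:
    """Return "ok" for HTTPS, "warn" for plain HTTP, "reject" otherwise."""
    scheme = urlsplit(endpoint).scheme.lower()
    if scheme == "https":
        return "ok"
    if scheme == "http":
        return "warn"  # the agent should surface this to the user
    return "reject"    # assumption: no other schemes are accepted


def strict_tls_context() -> ssl.SSLContext:
    # create_default_context() turns on certificate and hostname verification;
    # the rule is simply never to switch either of them off.
    return ssl.create_default_context()
```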

Authentication

When auth: oauth2 is specified:

- Agents SHOULD request minimal scopes
- Tokens MUST be stored securely
- Refresh tokens SHOULD be used when available

When auth: api_key is specified:

- Keys SHOULD be obtained through official channels
- Keys MUST NOT be shared between users
- Keys SHOULD be rotated periodically

Least Privilege

Agents SHOULD request only the permissions they need for the current task.

6. Backward Compatibility

With robots.txt

agents.md supplements but does not replace robots.txt. In case of conflict regarding access restrictions, robots.txt takes precedence.

## Cannot
- Access /admin (per robots.txt)
- Access /private
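
The precedence rule can be expressed as a check in which a robots.txt disallow always wins. The sketch uses Python's standard `urllib.robotparser`; the function `may_access` and its `allowed_by_agents_md` flag are hypothetical, standing in for whatever decision the agent derived from the Can/Cannot sections.

```python
from urllib.robotparser import RobotFileParser


def may_access(robots_txt: str, user_agent: str, url: str, allowed_by_agents_md: bool) -> bool:
    """robots.txt is consulted first; agents.md can only narrow access further."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        return False  # robots.txt takes precedence over anything in agents.md
    return allowed_by_agents_md
```

For a robots.txt containing `Disallow: /admin`, a URL under `/admin` is refused even if agents.md appeared to permit it.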

With llms.txt

agents.md and llms.txt serve different purposes and can coexist alongside robots.txt:

File	Purpose
`/robots.txt`	Crawl restrictions
`/llms.txt`	Content summary for LLMs
`/.well-known/agents.md`	Agent capabilities + MCP

7. Examples

Minimal (No MCP)

# My Tech Blog

Articles about software development.

## Can
- Read all public articles
- Search by topic
- Access RSS feed at /feed.xml

## Cannot
- Post comments (requires human)
- Access draft posts

## Contact
hello@myblog.example

With MCP Gateway

---
version: "1.0"
mcp:
  endpoint: https://weather.example/.well-known/mcp
  transport: streamable-http
  auth: none
---
# Weather API

Free weather data for AI agents.

## Can
- Get current conditions
- Get forecasts (up to 7 days)
- Get weather alerts

## Behavior
- 60 requests/minute
- Cache forecasts 30 minutes

## Contact
api@weather.example

E-commerce with OAuth

---
version: "1.0"
mcp:
  endpoint: https://techmart.example/.well-known/mcp
  transport: streamable-http
  auth: oauth2
---
# TechMart

Electronics retailer.

## Can
- Search products
- Compare specifications
- Check prices and stock
- Add to cart (authenticated)
- Checkout (authenticated)

## Cannot
- Access order history without user consent
- Modify account settings

## Behavior
- 1 request/second for browsing
- Identify as AI agent in requests

## Contact
partners@techmart.example

8. Implementation Notes

For Site Owners

1. Create /.well-known/agents.md on your server
2. Start with Format A (pure Markdown) for simplicity
3. Add MCP gateway later if you want structured tool access
4. Set Cache-Control header for efficient agent behavior

For Agent Developers

1. Check /.well-known/agents.md first, fall back to /agents.md
2. Parse YAML frontmatter if present
3. Cache responses per HTTP headers (default: 24 hours)
4. Respect Cannot restrictions and Behavior rules
5. Connect to MCP gateway for structured tools

Versioning

The version field in YAML frontmatter indicates spec compatibility:

- 1.x: compatible with this specification
- Future versions will maintain backward compatibility within a major version
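
A client-side compatibility check for the `version` field might be as small as the sketch below. `is_compatible` is a hypothetical helper, and how to treat a file with no `version` field at all is left to the caller, since the field is optional.

```python
def is_compatible(version: str, supported_major: int = 1) -> bool:
    """True when the major component of `version` matches the supported one."""
    major, _, _ = version.strip().partition(".")
    return major.isdigit() and int(major) == supported_major
```

Under this reading, `"1.0"` and a later `"1.4"` are both accepted by a 1.x client, while `"2.0"` is not.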

Appendix A: Comparison Table

Aspect	robots.txt	llms.txt	agents.md
Purpose	Crawl control	Content summary	Capabilities
Format	Custom syntax	Markdown	Markdown + optional YAML
Focus	Restrictions	Understanding	Actions
MCP	No	No	Yes (optional)
Year	1994	2024	2026

Appendix B: YAML Frontmatter Schema

# All fields optional except where noted
version: string        # Spec version (e.g., "1.0")

mcp:
  endpoint: string     # Required if mcp present. MCP server URL
  transport: string    # "streamable-http" | "sse" (default: streamable-http)
  auth: string         # "none" | "api_key" | "oauth2" (default: none)

Appendix C: References

- Model Context Protocol - MCP Specification
- RFC 8615 - Well-Known Uniform Resource Identifiers (URIs)
- RFC 9309 - Robots Exclusion Protocol (robots.txt)
- llms.txt - LLM Content Discovery