v0.2: Simplify spec + add MCP gateway integration

Major revision based on first principles thinking:
- Simplified format: plain Markdown, human readable
- Focus on capabilities (Can/Cannot) not API schemas
- MCP gateway pointer for structured tool access
- Clear positioning vs robots.txt and llms.txt

The agents.md file is the handshake.
The MCP gateway is where real work happens.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 09:01:56 -03:00
commit f2b3a14685
5 changed files with 772 additions and 0 deletions

88
README.md Normal file

@@ -0,0 +1,88 @@
# agents.md
**Tell AI agents what they can do on your website.**
## The Gap
| File | Purpose |
|------|---------|
| robots.txt | What bots **cannot access** |
| llms.txt | What **content matters** |
| **agents.md** | What agents **can do** |
## Quick Start
Create `/.well-known/agents.md`:
```markdown
# My Site
An online bookstore.
## Can
- Search catalog
- Read book details
- Check availability
## Cannot
- Place orders (requires human)
## Contact
hello@mysite.com
```
That's it. Plain text. Human readable.
## With MCP Gateway
Point agents to your MCP server for structured tool access:
```markdown
# My Site
An online bookstore.
## Can
- Search and browse
- Check prices
- Place orders (authenticated)
## MCP
endpoint: https://mysite.com/.well-known/mcp
transport: streamable-http
auth: oauth2
## Contact
hello@mysite.com
```
## How It Works
```
Agent requests /.well-known/agents.md
├─► Basic: reads text, understands capabilities
└─► Advanced: connects to MCP gateway for tools
```
## Documentation
- [Specification](./spec/README.md) - Full protocol spec
- [Examples](./examples/) - Real-world examples
- [FAQ](./docs/FAQ.md) - Common questions
## Status
**Draft** - Version 0.2.0
## Related Standards
- [robots.txt](https://www.rfc-editor.org/rfc/rfc9309) - Crawl restrictions (1994)
- [llms.txt](https://llmstxt.org/) - Content for LLMs (2024)
- [AGENTS.md](https://agents.md/) - Repository instructions (2025)
- [MCP](https://modelcontextprotocol.io/) - Tool protocol (2024)
## License
CC0 1.0 Universal - Public Domain

111
docs/FAQ.md Normal file

@@ -0,0 +1,111 @@
# Frequently Asked Questions
## General
### Why not just use robots.txt?
robots.txt tells bots what they *cannot* do. agents.md tells AI agents what they *can* do. They're complementary:
- robots.txt: "Don't crawl /admin"
- agents.md: "You can search our catalog via this API"
### Why Markdown?
1. Human readable and editable
2. Widely supported in documentation tools
3. YAML frontmatter is a proven pattern
4. Renders nicely on GitHub and documentation sites
### Is this related to MCP (Model Context Protocol)?
Inspired by MCP, but designed for web discovery rather than local tool execution. The tool definition format is similar, so agents.md feels familiar to developers already using MCP.
### Why /.well-known/?
Following RFC 8615 for well-known URIs. This is the same pattern used by:
- `/.well-known/security.txt`
- `/.well-known/apple-app-site-association`
- `/.well-known/openid-configuration`
## Implementation
### Do I need to remove my robots.txt?
No. Keep your robots.txt for traditional crawlers. The `robots` section in agents.md can mirror or extend those rules for AI agents.
### Can I have different capabilities for different agents?
Yes. Use content negotiation based on User-Agent or implement OAuth2 scopes for fine-grained access control.
### How should agents authenticate?
Start simple:
1. No auth for public read-only tools
2. API keys for rate-limited or premium features
3. OAuth2 for user-specific actions
### What if my API changes?
Update your agents.md. Agents should re-fetch it periodically, respecting Cache-Control headers.
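As a rough sketch of the re-fetch rule, an agent could derive a time-to-live from the `Cache-Control` header it received with the file. The function names and the one-hour `default_ttl` fallback are assumptions made for illustration; the protocol does not define them.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the freshness lifetime in seconds, or None if the header gives none."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    # Treat no-store / no-cache as "always re-fetch" in this sketch.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))
    return None


def should_refetch(fetched_at: float, now: float, cache_control: str,
                   default_ttl: int = 3600) -> bool:
    """Decide whether a cached copy is stale.

    default_ttl is an assumed fallback for responses without max-age.
    """
    ttl = max_age_seconds(cache_control)
    if ttl is None:
        ttl = default_ttl
    return now - fetched_at >= ttl
```

Timestamps are passed in rather than read from the clock so the decision is easy to test.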
## Security
### Can malicious sites trick agents?
The protocol specifies that agents MUST only parse agents.md from the site's origin. Instructions in page content are ignored.
### How do I prevent abuse?
1. Use rate limits (per-tool and global)
2. Require API keys for sensitive operations
3. Use OAuth2 scopes for user actions
4. Monitor usage patterns
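A minimal server-side sketch of the first point, assuming limits are written as `"<count>/<period>"` strings like the `"100/minute"` and `"1000/hour"` values used in the examples. The class name, the sliding-window strategy, and the `second` period are illustrative choices, not part of the protocol.

```python
import time
from collections import deque
from typing import Optional

PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(spec: str) -> tuple:
    """Turn a string such as "100/minute" into (max_calls, window_seconds)."""
    count, _, period = spec.partition("/")
    return int(count), PERIOD_SECONDS[period.strip().lower()]


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any `window` seconds."""

    def __init__(self, spec: str):
        self.limit, self.window = parse_rate_limit(spec)
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget calls that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False
```

One limiter per tool plus one shared limiter would cover both the per-tool and the global case.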
### Should I expose all my APIs?
No. Only expose what you want agents to use. Internal APIs, admin endpoints, and sensitive operations should not be in agents.md.
## Compatibility
### What about GraphQL APIs?
You can define tools that call GraphQL endpoints:
```yaml
tools:
  - name: query_products
    endpoint: "POST /graphql"
    parameters:
      type: object
      properties:
        query:
          type: string
          description: "GraphQL query (limited to Product type)"
```
### Can I use this with OpenAPI/Swagger?
Yes! You can generate agents.md from OpenAPI specs. We're working on tooling for this.
### What about WebSocket endpoints?
agents.md focuses on request-response patterns. For real-time features, document WebSocket endpoints in the Markdown section but define polling alternatives as tools.
## Adoption
### How do I tell if a site supports agents.md?
1. Check `/.well-known/agents.md`
2. Look for `<link rel="agent">` in HTML
3. Check for `Link` header in HTTP response
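The second and third checks could be sketched as below. The HTML check looks for the `rel="agent"` link named above. Using the same `agent` relation in the HTTP `Link` header is an assumption, since the FAQ does not say which relation the header carries. Both helper names are hypothetical.

```python
import re
from html.parser import HTMLParser
from typing import Optional


class _AgentLinkFinder(HTMLParser):
    """Record the href of the first <link> whose rel list contains "agent"."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "agent" in rels and self.href is None:
            self.href = attributes.get("href")


def agent_link_from_html(html: str) -> Optional[str]:
    finder = _AgentLinkFinder()
    finder.feed(html)
    return finder.href


def agent_link_from_header(link_header: str) -> Optional[str]:
    """Scan a Link header value for a target with rel="agent" (assumed relation)."""
    for target, params in re.findall(r"<([^>]*)>([^,]*)", link_header):
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', params, re.IGNORECASE)
        if rel and "agent" in rel.group(1).lower().split():
            return target
    return None
```

Either helper returns a possibly relative URL, which the agent would resolve against the page it came from.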
### What if the site doesn't have agents.md?
Fall back to:
1. Traditional web scraping (respecting robots.txt)
2. Looking for documented APIs
3. Using general browsing capabilities
### Who decides what goes in agents.md?
Site owners. This is an opt-in protocol. Sites choose what capabilities to expose.

165
examples/ecommerce.md Normal file

@@ -0,0 +1,165 @@
# Example: E-commerce Site
This example shows how an e-commerce site might expose its API to AI agents.
## agent.md
```yaml
---
protocol_version: "0.1"
name: "Tech Store"
description: "Browse products, check prices, and manage wishlists"
robots:
  disallow:
    - /admin
    - /checkout
    - /account/orders
  crawl_delay: 2
tools:
  - name: search_products
    description: "Search for products by name, category, or features"
    endpoint: "GET /api/products/search"
    parameters:
      type: object
      properties:
        q:
          type: string
          description: "Search query"
        category:
          type: string
          enum: ["laptops", "phones", "tablets", "accessories"]
        min_price:
          type: number
          minimum: 0
        max_price:
          type: number
        in_stock:
          type: boolean
          default: true
        sort:
          type: string
          enum: ["price_asc", "price_desc", "rating", "newest"]
          default: "rating"
        limit:
          type: integer
          default: 20
          maximum: 50
      required:
        - q
    auth: none
    rate_limit: "100/minute"
  - name: get_product
    description: "Get detailed product information including specs and reviews"
    endpoint: "GET /api/products/{id}"
    parameters:
      type: object
      properties:
        id:
          type: string
        include_reviews:
          type: boolean
          default: false
      required:
        - id
    auth: none
  - name: compare_products
    description: "Compare specifications of multiple products"
    endpoint: "POST /api/products/compare"
    parameters:
      type: object
      properties:
        product_ids:
          type: array
          items:
            type: string
          minItems: 2
          maxItems: 5
      required:
        - product_ids
    auth: none
  - name: get_price_history
    description: "Get price history for a product"
    endpoint: "GET /api/products/{id}/price-history"
    parameters:
      type: object
      properties:
        id:
          type: string
        days:
          type: integer
          default: 30
          maximum: 365
      required:
        - id
    auth: api_key
    scopes:
      - price:read
  - name: add_to_wishlist
    description: "Add a product to user's wishlist"
    endpoint: "POST /api/wishlist"
    parameters:
      type: object
      properties:
        product_id:
          type: string
        notify_on_sale:
          type: boolean
          default: true
      required:
        - product_id
    auth: oauth2
    scopes:
      - wishlist:write
auth:
  api_key:
    header: "X-API-Key"
    obtain: "https://techstore.example/developers"
  oauth2:
    authorization_url: "https://techstore.example/oauth/authorize"
    token_url: "https://techstore.example/oauth/token"
    scopes:
      wishlist:read: "View your wishlist"
      wishlist:write: "Modify your wishlist"
      price:read: "Access price history data"
rate_limits:
  global: "1000/hour"
  per_tool: true
contact:
  email: "api@techstore.example"
  url: "https://techstore.example/developers/docs"
---
# Tech Store Agent API
AI agents can help users find products, compare prices, and track deals.
## Public Tools (No Auth)
- **search_products** - Find products by name or category
- **get_product** - Get detailed product info
- **compare_products** - Side-by-side comparison
## Authenticated Tools
### API Key Required
- **get_price_history** - Historical pricing data
### OAuth2 Required
- **add_to_wishlist** - Save products for later
## Best Practices
1. Cache product details (they don't change often)
2. Use price history to advise on purchase timing
3. Respect rate limits during peak hours
```

106
examples/weather-api.md Normal file

@@ -0,0 +1,106 @@
# Example: Weather Service
A simple weather API demonstrating minimal and practical agent.md usage.
## agent.md
```yaml
---
protocol_version: "0.1"
name: "Weather Service"
description: "Current weather and forecasts for any location"
tools:
  - name: get_current
    description: "Get current weather conditions for a location"
    endpoint: "GET /api/weather/current"
    parameters:
      type: object
      properties:
        location:
          type: string
          description: "City name, address, or coordinates (lat,lon)"
        units:
          type: string
          enum: ["metric", "imperial"]
          default: "metric"
      required:
        - location
    response:
      type: json
      schema:
        type: object
        properties:
          temperature:
            type: number
          feels_like:
            type: number
          humidity:
            type: integer
          conditions:
            type: string
          wind_speed:
            type: number
    auth: none
    rate_limit: "60/minute"
  - name: get_forecast
    description: "Get weather forecast for upcoming days"
    endpoint: "GET /api/weather/forecast"
    parameters:
      type: object
      properties:
        location:
          type: string
        days:
          type: integer
          minimum: 1
          maximum: 14
          default: 7
        units:
          type: string
          enum: ["metric", "imperial"]
          default: "metric"
      required:
        - location
    auth: api_key
    rate_limit: "30/minute"
  - name: get_alerts
    description: "Get active weather alerts for a location"
    endpoint: "GET /api/weather/alerts"
    parameters:
      type: object
      properties:
        location:
          type: string
      required:
        - location
    auth: none
auth:
  api_key:
    header: "X-Weather-Key"
    obtain: "https://weather.example/api-keys"
    description: "Free tier: 1000 requests/day"
rate_limits:
  global: "1000/day"
contact:
  url: "https://weather.example/api/docs"
---
# Weather API for Agents
Simple, reliable weather data.
## Free Tools
- Current conditions (no key needed)
- Weather alerts (no key needed)
## API Key Required
- Extended forecasts (up to 14 days)
Get your free API key at weather.example/api-keys
```

302
spec/README.md Normal file

@@ -0,0 +1,302 @@
# agents.md Protocol Specification
**Version:** 0.2.0
**Status:** Draft
**Updated:** 2026-01-14
## Abstract
A simple text file that tells AI agents what they can do on a website and optionally points them to an MCP gateway for structured tool access.
## Philosophy
| Standard | Tells agents... |
|----------|-----------------|
| robots.txt | What you **cannot access** |
| llms.txt | What **content is important** |
| **agents.md** | What you **can do** + where to connect |
## 1. Discovery
**Location:** `/.well-known/agents.md` or `/agents.md`
**Content-Type:** `text/markdown` or `text/plain`
Agents request the file like any HTTP resource:
```
GET /.well-known/agents.md HTTP/1.1
Host: example.com
User-Agent: MyAgent/1.0
```
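A minimal sketch of the lookup, assuming the agent tries `/.well-known/agents.md` first and falls back to `/agents.md`. The spec lists both locations but does not state an order, so the precedence here is an assumption. The `fetch` callable is a hypothetical stand-in for whatever HTTP client the agent uses.

```python
from typing import Callable, List, Optional
from urllib.parse import urljoin

# Locations from the Discovery section, in the order this sketch probes them.
CANDIDATE_PATHS = ("/.well-known/agents.md", "/agents.md")


def candidate_urls(origin: str) -> List[str]:
    """Build the URLs to probe for an origin such as https://example.com."""
    return [urljoin(origin, path) for path in CANDIDATE_PATHS]


def discover(origin: str, fetch: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the first agents.md body found, or None if neither location answers.

    `fetch` is caller-supplied: it returns the response body for a URL,
    or None on a 404 or a network error.
    """
    for url in candidate_urls(origin):
        body = fetch(url)
        if body is not None:
            return body
    return None
```

Injecting `fetch` keeps the lookup logic independent of any particular HTTP library and lets it be tested against an in-memory dictionary.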
## 2. Format
Plain Markdown. Human readable. Machine parseable.
### Minimal Example
```markdown
# Example Site
A bookstore since 2010.
## Can
- Search catalog
- Read book details
- Check availability
## Cannot
- Place orders without a human
- Access user accounts
## Contact
agents@example.com
```
### With MCP Gateway
```markdown
# Example Bookstore
Online bookstore with 50,000 titles.
## Can
- Search and browse catalog
- Read reviews and descriptions
- Check prices and stock
- Place orders (authenticated)
## Cannot
- Modify user accounts
- Access admin functions
## MCP
endpoint: https://example.com/.well-known/mcp
transport: streamable-http
## Behavior
- Respect 1 request/second
- Cache product data 1 hour
- Identify in User-Agent header
## Contact
agents@example.com
```
## 3. Sections
All sections are optional. Use what makes sense.
### Identity (Header)
```markdown
# Site Name
Brief description of what this site is.
```
### Capabilities (Can/Cannot)
```markdown
## Can
- Action agents are allowed to take
- Another allowed action
## Cannot
- Restricted action
- Another restriction
```
### MCP Gateway
```markdown
## MCP
endpoint: <url>
transport: streamable-http | sse | stdio
auth: none | api_key | oauth2
```
The MCP section points agents to a [Model Context Protocol](https://modelcontextprotocol.io/) server for structured tool access. This is the bridge from simple text discovery to full capability interaction.
**Transport options:**
- `streamable-http` - HTTP with streaming (recommended for web)
- `sse` - Server-Sent Events
- `stdio` - Standard I/O (local only)
**Auth options:**
- `none` - Public tools, no authentication
- `api_key` - Requires API key (specify how to obtain)
- `oauth2` - OAuth 2.0 flow
### Behavior Rules
```markdown
## Behavior
- Rate limit guidance
- Caching expectations
- Identification requirements
```
### Contact
```markdown
## Contact
email@example.com
https://example.com/agent-support
```
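Because the format is plain Markdown, a small line-based reader is enough to recover the sections described above. The sketch below is one possible reading, not a reference parser. `parse_agents_md` and `mcp_settings` are hypothetical names, and the handling of list markers and `key: value` lines is an assumption drawn from the examples in this document.

```python
def parse_agents_md(text: str) -> dict:
    """Split an agents.md document into title, description, and sections."""
    doc = {"title": None, "description": [], "sections": {}}
    current = None  # name of the "## " section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif current is None:
            doc["description"].append(line)
        else:
            # Drop a leading list marker: "- Search catalog" -> "Search catalog".
            item = line[2:].strip() if line.startswith("- ") else line
            doc["sections"][current].append(item)
    return doc


def mcp_settings(doc: dict) -> dict:
    """Read "key: value" lines from the MCP section into a dict."""
    settings = {}
    for line in doc["sections"].get("MCP", []):
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings
```

Unknown sections are kept as-is, which matches the rule that every section is optional.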
## 4. MCP Integration
The `agents.md` file is the **handshake**. The MCP gateway is where **real work happens**.
```
Agent reads agents.md
├─► Basic agent: understands site capabilities from text
└─► Advanced agent: connects to MCP gateway
      MCP Server exposes:
      - Tools (search, checkout, etc.)
      - Resources (catalog, docs)
      - Prompts (guided workflows)
```
### Example MCP Discovery Flow
1. Agent fetches `/.well-known/agents.md`
2. Parses MCP endpoint: `https://example.com/.well-known/mcp`
3. Connects via MCP protocol
4. Discovers available tools via `tools/list`
5. Uses tools as permitted
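Step 4 of the flow above can be sketched at the message level. MCP messages are JSON-RPC 2.0, so a `tools/list` call is a small JSON object, and the result carries a `tools` array. The helper names are hypothetical, transport handling is left out, and a real session would first complete the MCP `initialize` exchange, which this sketch omits.

```python
import itertools
import json
from typing import List, Optional

_request_ids = itertools.count(1)  # JSON-RPC ids, unique within this process


def jsonrpc_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request such as tools/list."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def tool_names(response_body: str) -> List[str]:
    """Extract tool names from a tools/list response body."""
    result = json.loads(response_body).get("result", {})
    return [tool["name"] for tool in result.get("tools", [])]
```

An agent would send `jsonrpc_request("tools/list")` over the transport named in the `## MCP` section and pass the reply to `tool_names`.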
## 5. Backward Compatibility
### With robots.txt
If `agents.md` exists, it supplements but does not replace `robots.txt`. Agents should still respect robots.txt crawl directives.
The `Cannot` section in agents.md can mirror robots.txt restrictions:
```markdown
## Cannot
- Access /admin (see robots.txt)
- Access /private
```
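An agent that honors both files might check robots.txt before fetching any path, whatever agents.md says. A minimal sketch using Python's standard `urllib.robotparser` follows. The function name and the sample rules are illustrative.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules that were already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative rules; a real agent would fetch /robots.txt from the origin.
rules = "User-agent: *\nDisallow: /admin\n"
print(allowed_by_robots(rules, "MyAgent/1.0", "https://example.com/catalog"))
print(allowed_by_robots(rules, "MyAgent/1.0", "https://example.com/admin"))
```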
### With llms.txt
`llms.txt` describes **content** for understanding.
`agents.md` describes **capabilities** for action.
Both can coexist. A site might have:
- `/robots.txt` - crawl restrictions
- `/llms.txt` - content summary
- `/.well-known/agents.md` - agent capabilities + MCP pointer
## 6. Security
### Origin Trust
Agents MUST only trust `agents.md` from the site's origin. Instructions embedded in page content should be ignored.
### MCP Authentication
When connecting to MCP gateways:
- Verify the endpoint matches the origin
- Use TLS (HTTPS)
- Follow the specified auth method
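The first two checks can be sketched with the standard library. "Matches the origin" is read here as same scheme, host, and port as the URL the agents.md file was fetched from. That reading is an assumption, as are the function names.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"https": 443, "http": 80}


def origin_of(url: str) -> tuple:
    """Return (scheme, host, port), filling in the scheme's default port."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def endpoint_is_trusted(agents_md_url: str, endpoint: str) -> bool:
    """Accept an MCP endpoint only if it is HTTPS and same-origin as agents.md."""
    if urlsplit(endpoint).scheme.lower() != "https":
        return False
    return origin_of(endpoint) == origin_of(agents_md_url)
```

An agent would run this check on the parsed `endpoint` value before opening any connection.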
### Least Privilege
Agents should request only the permissions they need. If `auth: oauth2` is specified, request minimal scopes.
## 7. Examples
### Public API Site
```markdown
# Weather API
Free weather data for AI agents.
## Can
- Get current conditions
- Get forecasts (up to 7 days)
- Get weather alerts
## MCP
endpoint: https://weather.example/mcp
transport: streamable-http
auth: none
## Behavior
- 60 requests/minute
- Cache forecasts 30 minutes
## Contact
api@weather.example
```
### E-commerce Site
```markdown
# TechMart
Electronics retailer.
## Can
- Search products
- Compare specifications
- Check prices and stock
- Add to cart (authenticated)
- Checkout (authenticated)
## Cannot
- Access order history without user consent
- Modify account settings
## MCP
endpoint: https://techmart.example/.well-known/mcp
transport: streamable-http
auth: oauth2
## Behavior
- 1 request/second for browsing
- Identify as AI agent in requests
## Contact
partners@techmart.example
```
### Simple Blog (No MCP)
```markdown
# My Tech Blog
Articles about software development.
## Can
- Read all public articles
- Search by topic
- Access RSS feed at /feed.xml
## Cannot
- Post comments (requires human)
- Access draft posts
## Contact
hello@myblog.example
```
## Appendix: Comparison
| Aspect | robots.txt | llms.txt | agents.md |
|--------|------------|----------|-----------|
| Purpose | Crawl control | Content summary | Capabilities |
| Format | Custom syntax | Markdown | Markdown |
| Focus | Restrictions | Understanding | Actions |
| MCP | No | No | Yes (optional) |
| Year | 1994 | 2024 | 2026 |