From f2b3a146854f03ea69f155291d2caca35c714aa5 Mon Sep 17 00:00:00 2001 From: Bruno Sarlo Date: Wed, 14 Jan 2026 09:01:56 -0300 Subject: [PATCH] v0.2: Simplify spec + add MCP gateway integration Major revision based on first principles thinking: - Simplified format: plain Markdown, human readable - Focus on capabilities (Can/Cannot) not API schemas - MCP gateway pointer for structured tool access - Clear positioning vs robots.txt and llms.txt The agents.md file is the handshake. The MCP gateway is where real work happens. Co-Authored-By: Claude Opus 4.5 --- README.md | 88 ++++++++++++ docs/FAQ.md | 111 +++++++++++++++ examples/ecommerce.md | 165 ++++++++++++++++++++++ examples/weather-api.md | 106 ++++++++++++++ spec/README.md | 302 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 772 insertions(+) create mode 100644 README.md create mode 100644 docs/FAQ.md create mode 100644 examples/ecommerce.md create mode 100644 examples/weather-api.md create mode 100644 spec/README.md diff --git a/README.md b/README.md new file mode 100644 index 0000000..c171f6a --- /dev/null +++ b/README.md @@ -0,0 +1,88 @@ +# agents.md + +**Tell AI agents what they can do on your website.** + +## The Gap + +| File | Purpose | +|------|---------| +| robots.txt | What bots **cannot access** | +| llms.txt | What **content matters** | +| **agents.md** | What agents **can do** | + +## Quick Start + +Create `/.well-known/agents.md`: + +```markdown +# My Site + +An online bookstore. + +## Can +- Search catalog +- Read book details +- Check availability + +## Cannot +- Place orders (requires human) + +## Contact +hello@mysite.com +``` + +That's it. Plain text. Human readable. + +## With MCP Gateway + +Point agents to your MCP server for structured tool access: + +```markdown +# My Site + +An online bookstore. 
+ +## Can +- Search and browse +- Check prices +- Place orders (authenticated) + +## MCP +endpoint: https://mysite.com/.well-known/mcp +transport: streamable-http +auth: oauth2 + +## Contact +hello@mysite.com +``` + +## How It Works + +``` +Agent requests /.well-known/agents.md + │ + ├─► Basic: reads text, understands capabilities + │ + └─► Advanced: connects to MCP gateway for tools +``` + +## Documentation + +- [Specification](./spec/README.md) - Full protocol spec +- [Examples](./examples/) - Real-world examples +- [FAQ](./docs/FAQ.md) - Common questions + +## Status + +**Draft** - Version 0.2.0 + +## Related Standards + +- [robots.txt](https://www.rfc-editor.org/rfc/rfc9309) - Crawl restrictions (1994) +- [llms.txt](https://llmstxt.org/) - Content for LLMs (2024) +- [AGENTS.md](https://agents.md/) - Repository instructions (2025) +- [MCP](https://modelcontextprotocol.io/) - Tool protocol (2024) + +## License + +CC0 1.0 Universal - Public Domain diff --git a/docs/FAQ.md b/docs/FAQ.md new file mode 100644 index 0000000..bc8ed16 --- /dev/null +++ b/docs/FAQ.md @@ -0,0 +1,111 @@ +# Frequently Asked Questions + +## General + +### Why not just use robots.txt? + +robots.txt tells bots what they *cannot* do. agent.md tells AI agents what they *can* do. They're complementary: + +- robots.txt: "Don't crawl /admin" +- agent.md: "You can search our catalog via this API" + +### Why Markdown? + +1. Human readable and editable +2. Widely supported in documentation tools +3. YAML frontmatter is a proven pattern +4. Renders nicely on GitHub and documentation sites + +### Is this related to MCP (Model Context Protocol)? + +Inspired by MCP, but designed for web discovery rather than local tool execution. The tool definition format is similar to make agent.md familiar to developers already using MCP. + +### Why /.well-known/? + +Following RFC 8615 for well-known URIs. 
This is the same pattern used by: +- `/.well-known/security.txt` +- `/.well-known/apple-app-site-association` +- `/.well-known/openid-configuration` + +## Implementation + +### Do I need to remove my robots.txt? + +No. Keep your robots.txt for traditional crawlers. The `robots` section in agent.md can mirror or extend those rules for AI agents. + +### Can I have different capabilities for different agents? + +Yes. Use content negotiation based on User-Agent or implement OAuth2 scopes for fine-grained access control. + +### How should agents authenticate? + +Start simple: +1. No auth for public read-only tools +2. API keys for rate-limited or premium features +3. OAuth2 for user-specific actions + +### What if my API changes? + +Update your agent.md. Agents should re-fetch periodically (respect Cache-Control headers). + +## Security + +### Can malicious sites trick agents? + +The protocol specifies that agents MUST only parse agent.md from the site's origin. Instructions in page content are ignored. + +### How do I prevent abuse? + +1. Use rate limits (per-tool and global) +2. Require API keys for sensitive operations +3. Use OAuth2 scopes for user actions +4. Monitor usage patterns + +### Should I expose all my APIs? + +No. Only expose what you want agents to use. Internal APIs, admin endpoints, and sensitive operations should not be in agent.md. + +## Compatibility + +### What about GraphQL APIs? + +You can define tools that call GraphQL endpoints: + +```yaml +tools: + - name: query_products + endpoint: "POST /graphql" + parameters: + type: object + properties: + query: + type: string + description: "GraphQL query (limited to Product type)" +``` + +### Can I use this with OpenAPI/Swagger? + +Yes! You can generate agent.md from OpenAPI specs. We're working on tooling for this. + +### What about WebSocket endpoints? + +agent.md focuses on request-response patterns. 
For real-time features, document WebSocket endpoints in the Markdown section but define polling alternatives as tools. + +## Adoption + +### How do I tell if a site supports agent.md? + +1. Check `/.well-known/agent.md` +2. Look for `` in HTML +3. Check for `Link` header in HTTP response + +### What if the site doesn't have agent.md? + +Fall back to: +1. Traditional web scraping (respecting robots.txt) +2. Looking for documented APIs +3. Using general browsing capabilities + +### Who decides what goes in agent.md? + +Site owners. This is an opt-in protocol. Sites choose what capabilities to expose. diff --git a/examples/ecommerce.md b/examples/ecommerce.md new file mode 100644 index 0000000..a3f7207 --- /dev/null +++ b/examples/ecommerce.md @@ -0,0 +1,165 @@ +# Example: E-commerce Site + +This example shows how an e-commerce site might expose its API to AI agents. + +## agent.md + +```yaml +--- +protocol_version: "0.1" +name: "Tech Store" +description: "Browse products, check prices, and manage wishlists" + +robots: + disallow: + - /admin + - /checkout + - /account/orders + crawl_delay: 2 + +tools: + - name: search_products + description: "Search for products by name, category, or features" + endpoint: "GET /api/products/search" + parameters: + type: object + properties: + q: + type: string + description: "Search query" + category: + type: string + enum: ["laptops", "phones", "tablets", "accessories"] + min_price: + type: number + minimum: 0 + max_price: + type: number + in_stock: + type: boolean + default: true + sort: + type: string + enum: ["price_asc", "price_desc", "rating", "newest"] + default: "rating" + limit: + type: integer + default: 20 + maximum: 50 + required: + - q + auth: none + rate_limit: "100/minute" + + - name: get_product + description: "Get detailed product information including specs and reviews" + endpoint: "GET /api/products/{id}" + parameters: + type: object + properties: + id: + type: string + include_reviews: + type: boolean + default: 
false + required: + - id + auth: none + + - name: compare_products + description: "Compare specifications of multiple products" + endpoint: "POST /api/products/compare" + parameters: + type: object + properties: + product_ids: + type: array + items: + type: string + minItems: 2 + maxItems: 5 + required: + - product_ids + auth: none + + - name: get_price_history + description: "Get price history for a product" + endpoint: "GET /api/products/{id}/price-history" + parameters: + type: object + properties: + id: + type: string + days: + type: integer + default: 30 + maximum: 365 + required: + - id + auth: api_key + scopes: + - price:read + + - name: add_to_wishlist + description: "Add a product to user's wishlist" + endpoint: "POST /api/wishlist" + parameters: + type: object + properties: + product_id: + type: string + notify_on_sale: + type: boolean + default: true + required: + - product_id + auth: oauth2 + scopes: + - wishlist:write + +auth: + api_key: + header: "X-API-Key" + obtain: "https://techstore.example/developers" + + oauth2: + authorization_url: "https://techstore.example/oauth/authorize" + token_url: "https://techstore.example/oauth/token" + scopes: + wishlist:read: "View your wishlist" + wishlist:write: "Modify your wishlist" + price:read: "Access price history data" + +rate_limits: + global: "1000/hour" + per_tool: true + +contact: + email: "api@techstore.example" + url: "https://techstore.example/developers/docs" +--- + +# Tech Store Agent API + +AI agents can help users find products, compare prices, and track deals. + +## Public Tools (No Auth) + +- **search_products** - Find products by name or category +- **get_product** - Get detailed product info +- **compare_products** - Side-by-side comparison + +## Authenticated Tools + +### API Key Required +- **get_price_history** - Historical pricing data + +### OAuth2 Required +- **add_to_wishlist** - Save products for later + +## Best Practices + +1. Cache product details (they don't change often) +2. 
Use price history to advise on purchase timing +3. Respect rate limits during peak hours +``` diff --git a/examples/weather-api.md b/examples/weather-api.md new file mode 100644 index 0000000..d103f1f --- /dev/null +++ b/examples/weather-api.md @@ -0,0 +1,106 @@ +# Example: Weather Service + +A simple weather API demonstrating minimal and practical agent.md usage. + +## agent.md + +```yaml +--- +protocol_version: "0.1" +name: "Weather Service" +description: "Current weather and forecasts for any location" + +tools: + - name: get_current + description: "Get current weather conditions for a location" + endpoint: "GET /api/weather/current" + parameters: + type: object + properties: + location: + type: string + description: "City name, address, or coordinates (lat,lon)" + units: + type: string + enum: ["metric", "imperial"] + default: "metric" + required: + - location + response: + type: json + schema: + type: object + properties: + temperature: + type: number + feels_like: + type: number + humidity: + type: integer + conditions: + type: string + wind_speed: + type: number + auth: none + rate_limit: "60/minute" + + - name: get_forecast + description: "Get weather forecast for upcoming days" + endpoint: "GET /api/weather/forecast" + parameters: + type: object + properties: + location: + type: string + days: + type: integer + minimum: 1 + maximum: 14 + default: 7 + units: + type: string + enum: ["metric", "imperial"] + default: "metric" + required: + - location + auth: api_key + rate_limit: "30/minute" + + - name: get_alerts + description: "Get active weather alerts for a location" + endpoint: "GET /api/weather/alerts" + parameters: + type: object + properties: + location: + type: string + required: + - location + auth: none + +auth: + api_key: + header: "X-Weather-Key" + obtain: "https://weather.example/api-keys" + description: "Free tier: 1000 requests/day" + +rate_limits: + global: "1000/day" + +contact: + url: "https://weather.example/api/docs" +--- + +# Weather API 
for Agents + +Simple, reliable weather data. + +## Free Tools +- Current conditions (no key needed) +- Weather alerts (no key needed) + +## API Key Required +- Extended forecasts (up to 14 days) + +Get your free API key at weather.example/api-keys +``` diff --git a/spec/README.md b/spec/README.md new file mode 100644 index 0000000..b2c4251 --- /dev/null +++ b/spec/README.md @@ -0,0 +1,302 @@ +# agents.md Protocol Specification + +**Version:** 0.2.0 +**Status:** Draft +**Updated:** 2026-01-14 + +## Abstract + +A simple text file that tells AI agents what they can do on a website and optionally points them to an MCP gateway for structured tool access. + +## Philosophy + +| Standard | Tells agents... | +|----------|-----------------| +| robots.txt | What you **cannot access** | +| llms.txt | What **content is important** | +| **agents.md** | What you **can do** + where to connect | + +## 1. Discovery + +**Location:** `/.well-known/agents.md` or `/agents.md` + +**Content-Type:** `text/markdown` or `text/plain` + +Agents request the file like any HTTP resource: + +``` +GET /.well-known/agents.md HTTP/1.1 +Host: example.com +User-Agent: MyAgent/1.0 +``` + +## 2. Format + +Plain Markdown. Human readable. Machine parseable. + +### Minimal Example + +```markdown +# Example Site + +A bookstore since 2010. + +## Can +- Search catalog +- Read book details +- Check availability + +## Cannot +- Place orders without human +- Access user accounts + +## Contact +agents@example.com +``` + +### With MCP Gateway + +```markdown +# Example Bookstore + +Online bookstore with 50,000 titles. 
+ +## Can +- Search and browse catalog +- Read reviews and descriptions +- Check prices and stock +- Place orders (authenticated) + +## Cannot +- Modify user accounts +- Access admin functions + +## MCP +endpoint: https://example.com/.well-known/mcp +transport: streamable-http + +## Behavior +- Respect 1 request/second +- Cache product data 1 hour +- Identify in User-Agent header + +## Contact +agents@example.com +``` + +## 3. Sections + +All sections are optional. Use what makes sense. + +### Identity (Header) + +```markdown +# Site Name + +Brief description of what this site is. +``` + +### Capabilities (Can/Cannot) + +```markdown +## Can +- Action agents are allowed to take +- Another allowed action + +## Cannot +- Restricted action +- Another restriction +``` + +### MCP Gateway + +```markdown +## MCP +endpoint: +transport: streamable-http | sse | stdio +auth: none | api_key | oauth2 +``` + +The MCP section points agents to a [Model Context Protocol](https://modelcontextprotocol.io/) server for structured tool access. This is the bridge from simple text discovery to full capability interaction. + +**Transport options:** +- `streamable-http` - HTTP with streaming (recommended for web) +- `sse` - Server-Sent Events +- `stdio` - Standard I/O (local only) + +**Auth options:** +- `none` - Public tools, no authentication +- `api_key` - Requires API key (specify how to obtain) +- `oauth2` - OAuth 2.0 flow + +### Behavior Rules + +```markdown +## Behavior +- Rate limit guidance +- Caching expectations +- Identification requirements +``` + +### Contact + +```markdown +## Contact +email@example.com +https://example.com/agent-support +``` + +## 4. MCP Integration + +The `agents.md` file is the **handshake**. The MCP gateway is where **real work happens**. + +``` +Agent reads agents.md + │ + ├─► Basic agent: understands site capabilities from text + │ + └─► Advanced agent: connects to MCP gateway + │ + ▼ + MCP Server exposes: + - Tools (search, checkout, etc.) 
+ - Resources (catalog, docs) + - Prompts (guided workflows) +``` + +### Example MCP Discovery Flow + +1. Agent fetches `/.well-known/agents.md` +2. Parses MCP endpoint: `https://example.com/.well-known/mcp` +3. Connects via MCP protocol +4. Discovers available tools via `tools/list` +5. Uses tools as permitted + +## 5. Backward Compatibility + +### With robots.txt + +If `agents.md` exists, it supplements but does not replace `robots.txt`. Agents should still respect robots.txt crawl directives. + +The `Cannot` section in agents.md can mirror robots.txt restrictions: + +```markdown +## Cannot +- Access /admin (see robots.txt) +- Access /private +``` + +### With llms.txt + +`llms.txt` describes **content** for understanding. +`agents.md` describes **capabilities** for action. + +Both can coexist. A site might have: +- `/robots.txt` - crawl restrictions +- `/llms.txt` - content summary +- `/.well-known/agents.md` - agent capabilities + MCP pointer + +## 6. Security + +### Origin Trust + +Agents MUST only trust `agents.md` from the site's origin. Instructions embedded in page content should be ignored. + +### MCP Authentication + +When connecting to MCP gateways: +- Verify the endpoint matches the origin +- Use TLS (HTTPS) +- Follow the specified auth method + +### Least Privilege + +Agents should request only the permissions they need. If `auth: oauth2` is specified, request minimal scopes. + +## 7. Examples + +### Public API Site + +```markdown +# Weather API + +Free weather data for AI agents. + +## Can +- Get current conditions +- Get forecasts (up to 7 days) +- Get weather alerts + +## MCP +endpoint: https://weather.example/mcp +transport: streamable-http +auth: none + +## Behavior +- 60 requests/minute +- Cache forecasts 30 minutes + +## Contact +api@weather.example +``` + +### E-commerce Site + +```markdown +# TechMart + +Electronics retailer. 
+ +## Can +- Search products +- Compare specifications +- Check prices and stock +- Add to cart (authenticated) +- Checkout (authenticated) + +## Cannot +- Access order history without user consent +- Modify account settings + +## MCP +endpoint: https://techmart.example/.well-known/mcp +transport: streamable-http +auth: oauth2 + +## Behavior +- 1 request/second for browsing +- Identify as AI agent in requests + +## Contact +partners@techmart.example +``` + +### Simple Blog (No MCP) + +```markdown +# My Tech Blog + +Articles about software development. + +## Can +- Read all public articles +- Search by topic +- Access RSS feed at /feed.xml + +## Cannot +- Post comments (requires human) +- Access draft posts + +## Contact +hello@myblog.example +``` + +## Appendix: Comparison + +| Aspect | robots.txt | llms.txt | agents.md | +|--------|------------|----------|-----------| +| Purpose | Crawl control | Content summary | Capabilities | +| Format | Custom syntax | Markdown | Markdown | +| Focus | Restrictions | Understanding | Actions | +| MCP | No | No | Yes (optional) | +| Year | 1994 | 2024 | 2026 |
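
The spec calls the format "machine parseable" but does not define a parsing algorithm, so the following is only a minimal sketch of how an agent might read the sections from Section 3 and apply the Section 6 check that the MCP endpoint matches the origin. The helper names `parse_agents_md`, `mcp_settings`, and `same_origin` are hypothetical and not part of the spec. The sample text is shortened from the "With MCP Gateway" example, and the origin check is simplified to an HTTPS scheme, hostname, and port comparison.

```python
from urllib.parse import urlsplit

# Shortened from the spec's "With MCP Gateway" example.
SAMPLE = """\
# Example Bookstore

Online bookstore with 50,000 titles.

## Can
- Search and browse catalog
- Place orders (authenticated)

## Cannot
- Modify user accounts

## MCP
endpoint: https://example.com/.well-known/mcp
transport: streamable-http

## Contact
agents@example.com
"""


def parse_agents_md(text):
    """Split an agents.md document into title, description, and `## ` sections."""
    doc = {"title": None, "description": [], "sections": {}}
    current = None  # name of the section being read; None while in the header
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            # Section names are lower-cased so "## MCP" is stored as "mcp".
            current = line[3:].strip().lower()
            doc["sections"][current] = []
        elif line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif current is None:
            doc["description"].append(line)
        else:
            # Bullets lose their "- " marker; other lines are kept verbatim.
            item = line[2:] if line.startswith("- ") else line
            doc["sections"][current].append(item)
    return doc


def mcp_settings(doc):
    """Read the `key: value` lines of the MCP section into a dict."""
    settings = {}
    for line in doc["sections"].get("mcp", []):
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def same_origin(site_url, endpoint):
    """True when both URLs use HTTPS and share hostname and port (assumed rule)."""
    site, mcp = urlsplit(site_url), urlsplit(endpoint)
    if site.scheme != "https" or mcp.scheme != "https":
        return False
    return (site.hostname, site.port or 443) == (mcp.hostname, mcp.port or 443)
```

An agent following the discovery flow in Section 4 could call `parse_agents_md` on the fetched file, pass the result to `mcp_settings`, and connect to the endpoint only if `same_origin` accepts it. Because all sections are optional, a missing `## MCP` section simply yields an empty dict, which corresponds to the basic text-only path.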