<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Hidden Layer — GEO &amp; Agent-Readiness Research</title>
    <link>https://hidden-layer-blogs.pages.dev/blog</link>
    <description>Agent-readiness audits and GEO research for the AI era.</description>
    <language>en-US</language>
    <lastBuildDate>Tue, 12 May 2026 00:00:00 GMT</lastBuildDate>
    <atom:link href="https://hidden-layer-blogs.pages.dev/feed.xml" rel="self" type="application/rss+xml" />

    <item>
      <title>The GEO leaderboard: who&apos;s winning AI visibility in 2026</title>
      <link>https://hidden-layer-blogs.pages.dev/post/geo-leaderboard-2026</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/geo-leaderboard-2026</guid>
      <pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate>
      <description>Agent integration = 0 across all 118 domains. The agentic web is coming, but zero companies have started building for it.</description>
      <content:encoded><![CDATA[# The GEO leaderboard: who's winning AI visibility in 2026

> Agent integration = 0 across all 118 domains. The agentic web is coming, but zero companies have started building for it.

**Category:** Leaderboard | **Date:** 2026-05-12 | **Read:** 10 min

---

Agent Integration scored 0/13 across all 118 domains in every industry category. Zero. Not one company in our dataset has deployed agent-specific endpoints, structured tool responses, or machine-readable capability declarations. The agentic web is being built, but the companies that will populate it aren't preparing for it.

## The expanded leaderboard

118 domains across 12 industries. Here are the top 10; the full dataset includes F-grade domains like hm.com (F/14), adidas.com (F/15), and freshworks.com (F/17).

| Rank | Domain | Grade/Score |
| --- | --- | --- |
| 1 | twilio.com | B/86 |
| 2 | akeneo.com | B/85 |
| 3 | cloudflare.com | B/84 |
| 4 | samsung.com | B/84 |
| 5 | atlassian.com | B/84 |
| 6 | nike.com | B/80 |
| 7 | shopify.com | B/79 |
| 8 | render.com | B/77 |
| 9 | allbirds.com | B/76 |
| 10 | notion.so | B/76 |

## Industry benchmarks

| Industry | Median Score | N |
| --- | --- | --- |
| Consumer Electronics | C/72 | 4 |
| B2B SaaS | C/65 | 36 |
| Financial Services | D/59 | 7 |
| Developer Tools | D/59 | 20 |
| AI/ML | D/55 | 14 |
| Marketplace | D/54 | 3 |
| Footwear | D/53 | 3 |
| Entertainment | D/53 | 6 |
| Media & Publishing | D/50 | 10 |
| Fashion | F/41 | 4 |
| Social Media | F/36 | 5 |
| Sportswear | F/35 | 4 |

## Pattern 1: Bot access is solved — AI visibility isn't

Every domain achieved the maximum 75/75 on Bot Access. Technical infrastructure is no longer the constraint. But AI Visibility medians sit at only 12–18/68 across industries — LLMs can reach your content but don't know what it means. The gap between accessibility and understandability is 57 points wide.

## Pattern 2: AI companies are getting worse, not better

OpenAI dropped to D/57 (was C/62). Anthropic fell to D/48 (was C/52). Vercel crashed from #1 at B/82 to #14 at C/70 — partly because Vercel's robots.txt now carries an ai-train=no content signal, an explicit AI-training opt-out that penalises its score. These are the companies building the models — yet they're among the worst at making their own content discoverable to those same models.

## Pattern 3: The within-category canyon

In Sportswear: Nike B/80 vs Adidas F/15 — a 65-point gap. In B2B SaaS: Twilio B/86 vs Freshworks F/17 — a 69-point gap. Nike and Twilio built API-first infrastructure; Adidas and Freshworks use marketing-page-first architectures. The structural choice, made years before GEO existed, determines whether you're invisible to agents.

## What separates B from F

- Agent integration endpoints (0/13 across all 118 domains) — no company has started; the first-mover advantage is completely unclaimed
- Entity linking to knowledge graphs — Wikipedia/Wikidata sameAs on Organisation schema; GEO Presence medians sit at 18–30/46
- llms.txt implementation — AI Discovery plateaus at 12–15/20 median without it; Shopify has it, Anthropic doesn't
- Render gap — AI Visibility at 12–18/68 reflects JavaScript-heavy pages that serve thin HTML shells to crawlers

## What nobody has done yet

No domain scored above 0 on Agent Integration. The category covers /.well-known/agent-card.json (A2A protocol), /.well-known/oauth-protected-resource (RFC 9728), and MCP discovery endpoints. These are a day's work each. Zero companies have shipped any of them. Being the first in your industry category to hit 1/13 is a genuinely differentiating signal right now.
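For a sense of what "a day's work" looks like, here's a sketch of the RFC 9728 file — the values are illustrative, and the RFC defines more optional fields than shown:

```
{
  "resource": "https://yourdomain.com",
  "authorization_servers": ["https://auth.yourdomain.com"],
  "bearer_methods_supported": ["header"],
  "scopes_supported": ["read", "write"],
  "resource_documentation": "https://yourdomain.com/docs/api"
}
```

Serve it as static JSON at /.well-known/oauth-protected-resource — the agent card and MCP discovery files are the same order of effort.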

## Where to start if you're C, D, or F

1. Deploy llms.txt at your root with current sitemap references — improves AI Discovery score by 12–15 points. Check llmstxt.org for current format.
2. Add entity links: Wikipedia and Wikidata sameAs on your Organisation and Product schema — boosts GEO Presence from the 18–30 range toward the 40s (a minimal sketch follows this list).
3. Implement structured product schema if you sell anything — Commerce scores are 0–2/2 across all 118 domains; this is table stakes.
4. File agent integration as a roadmap item — /.well-known/agent-card.json (A2A) and /.well-known/oauth-protected-resource (RFC 9728) are the next frontier. First movers will hold these positions.
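
The entity-link block from step 2 is small enough to show in full. A minimal Organisation example — drop it into a `<script type="application/ld+json">` tag on your homepage. Note that the Schema.org type itself is spelled Organization, and the URLs here are placeholders for your own entities:

```
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Company",
  "url": "https://yourdomain.com",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Your_Company",
    "https://www.wikidata.org/wiki/Q00000000",
    "https://www.linkedin.com/company/your-company"
  ]
}
```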

Run your domain at hidden-layer-blogs.pages.dev to see where you rank.


**Tags:** Leaderboard, GEO, Rankings, AI Visibility]]></content:encoded>
      <category>Leaderboard</category>
      <tag>Leaderboard</tag>
      <tag>GEO</tag>
      <tag>Rankings</tag>
      <tag>AI Visibility</tag>
    </item>

    <item>
      <title>Agent integration: the 0/13 frontier</title>
      <link>https://hidden-layer-blogs.pages.dev/post/agent-integration-frontier</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/agent-integration-frontier</guid>
      <pubDate>Tue, 12 May 2026 00:00:00 GMT</pubDate>
      <description>Hidden Layer audited 118 domains across 12 industries. Every single one scored 0/13 on agent integration. The frontier is completely open.</description>
      <content:encoded><![CDATA[# Agent integration: the 0/13 frontier

> Hidden Layer audited 118 domains across 12 industries. Every single one scored 0/13 on agent integration. The frontier is completely open.

**Category:** Agent Integration | **Date:** 2026-05-12 | **Read:** 8 min

---

Hidden Layer audited 118 domains across 12 industries. Every single one scored 0/13 on agent integration. Not one B2B SaaS company. Not one Developer Tools platform. Not one AI company. The agent integration category is the only one where the maximum score equals the minimum score across every domain in the dataset — and both are zero.

This isn't about companies doing it badly. It's that no one has started.

## What agent integration actually means

Agent integration means making your services discoverable and callable by AI agents through standardised protocols. Three files do most of the work:

- MCP server card (/.well-known/mcp.json) — declares your MCP server's capabilities, tools, and connection requirements to any agent that checks. The Model Context Protocol launched Nov 2024.
- A2A agent card (/.well-known/agent-card.json) — Google's Agent-to-Agent protocol, launched Apr 2025. Tells external agents your agent's name, skills, and how to invoke it.
- OpenAPI spec (/openapi.json) — a standard 3.1+ description of your REST API at a well-known path. Most companies publish this at /api/swagger or /docs — not the standardised location agents look for.

The three protocols are independent and complementary. MCP is about tool access. A2A is about agent-to-agent delegation. OpenAPI is about raw API discoverability. A B2B SaaS company could implement all three in a day.

## The 8 checks, ranked by opportunity

| Check | Points | What it requires | Est. time |
| --- | --- | --- | --- |
| OpenAPI spec | 3 | /openapi.json with OpenAPI 3.1+ spec | 20 min if spec exists |
| A2A agent card | 2 | /.well-known/agent-card.json (Google A2A spec) | 10 min |
| OAuth protected resource | 2 | /.well-known/oauth-protected-resource (RFC 9728) | 15 min |
| API catalog | 1 | /.well-known/api linking to API resources | 5 min |
| MCP server card | 1 | /.well-known/mcp.json declaring MCP capabilities | 10 min |
| Web Bot Auth | 1 | /.well-known/http-message-signatures-directory (IETF draft) | 10 min |
| Listed in MCP registry | 2 | Submit to registry.smithery.ai | 5 min |
| Agent skills index | 1 | /.well-known/agent-skills/index.json | 10 min |

## Who moves first?

Companies with existing REST APIs are already halfway there. Twilio and Stripe both have OpenAPI specs — they just live at non-standard paths. Moving them to /openapi.json is a one-line Nginx/Cloudflare rewrite. Adding an agent-card.json is JSON authoring.
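
The rewrite itself is small. A sketch for Nginx, assuming the spec already lives on a docs host (the upstream path is illustrative):

```
# Expose the existing spec at the standardised path
location = /openapi.json {
    proxy_pass https://docs.yourdomain.com/api/reference/openapi.json;
}
```

A permanent redirect works too, but serving the document directly means crawlers that don't follow redirects still get the spec.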

Developer tools platforms (Cloudflare, Vercel, Render) are structurally ready. Their APIs are already public, documented, and designed for machines. Adding MCP and A2A discovery is a config file, not a product decision.

Content-only sites (media, fashion, consumer brands) are furthest away — they'd need to build APIs first. But most don't need to. Their play is llms.txt and entity linking, not MCP servers. Agent integration points are honestly irrelevant for a brand like Adidas.

The irony remains: AI companies (OpenAI D/57, Anthropic D/48) — whose products depend on agent interoperability — have zero agent integration signals on their own domains. The 0/13 applies to them as much as to anyone.

## First mover advantage is real

Today's 0/13 vs 0/13 looks equal. In 12 months it's 10/13 vs 0/13 — and that gap is durable for the companies that move first. Agent discovery works like SEO in 2004: the companies that invested early hold positions that are structurally hard to displace.

The Hidden Layer audit will add weight to these checks as adoption rises. Currently they're low-weighted because sub-1% of the web uses them — the methodology doesn't punish you for missing protocols that don't exist yet. But as Smithery grows, as A2A adoption rises, the weight increases. The first 30 companies in each industry to hit 5/13 will own that category's agent discovery ranking.

## How to implement the two highest-value checks

Start with the A2A agent card (2 pts) and the OpenAPI spec (3 pts). Together they're 5/13 — already ahead of 100% of the current web. The OpenAPI spec usually just needs relocating to /openapi.json (see above); the two files below — the A2A card and the 1-point MCP card — are the ones you author from scratch.

Create /.well-known/agent-card.json:

```
{
  "name": "Your Service Name",
  "description": "What your service does in one sentence",
  "url": "https://yourdomain.com",
  "version": "1.0.0",
  "skills": [
    {
      "name": "primary-capability",
      "description": "What agents can do with your service"
    }
  ]
}
```

Create /.well-known/mcp.json:

```
{
  "mcpVersion": "1.0.0",
  "server": {
    "name": "Your MCP Server",
    "version": "1.0.0",
    "description": "What tools your MCP server exposes"
  },
  "capabilities": {
    "tools": { "listChanged": true }
  }
}
```

Both files are static JSON served from your web root. No backend logic required. Add an Nginx location block or a Cloudflare Pages static file and you're done. Run your domain at hidden-layer-blogs.pages.dev after deploying — you should see 3/13 immediately (2 points for the A2A card plus 1 for the MCP card), and 6/13 once your OpenAPI spec is reachable at /openapi.json.
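
A quick self-check before re-running the audit — these just confirm the files are reachable at the paths the checks look for:

```
curl -s https://yourdomain.com/.well-known/agent-card.json
curl -s https://yourdomain.com/.well-known/mcp.json
curl -s -o /dev/null -w "%{http_code}\n" https://yourdomain.com/openapi.json
```

The first two should return your JSON; the third should print 200 once your OpenAPI spec is at the standard path.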


**Tags:** Agent Integration, MCP, A2A, GEO, Research]]></content:encoded>
      <category>Agent Integration</category>
      <tag>Agent Integration</tag>
      <tag>MCP</tag>
      <tag>A2A</tag>
      <tag>GEO</tag>
      <tag>Research</tag>
    </item>

    <item>
      <title>46+ checks, one grade: how we score AI readiness</title>
      <link>https://hidden-layer-blogs.pages.dev/post/scoring-methodology</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/scoring-methodology</guid>
      <pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate>
      <description>A transparent look at the 8 categories, 229 base points, and why a B doesn&apos;t mean you&apos;re ready and an F doesn&apos;t mean you&apos;re invisible.</description>
      <content:encoded><![CDATA[# 46+ checks, one grade: how we score AI readiness

> A transparent look at the 8 categories, 229 base points, and why a B doesn't mean you're ready and an F doesn't mean you're invisible.

**Category:** Methodology | **Date:** 2026-05-11 | **Read:** 6 min

---

Hidden Layer runs 46+ checks across 8 categories and returns a letter grade from A to F. The methodology is transparent — here's the full rationale for what we check and why we weight it the way we do.

## The 8 categories

### 1. Discoverability (39 points max)

Can AI systems find you at all? This covers robots.txt presence and parse-ability, sitemap.xml structure (including sub-sitemap fan-out), HTTPS enforcement, and response headers. A domain with a 403 robots.txt gets a hard fail on bot access — we treat CDN blocks as equivalent to a blanket Disallow: /.

Why 39 points? Discovery is the precondition for everything else. A site that blocks crawlers doesn't get to score on agent integration.

### 2. Bot Access (75 points max)

The heaviest category. We check 12 canonical AI bot UAs against robots.txt rules: GPTBot, ClaudeBot, OAI-SearchBot, ChatGPT-User, Claude-SearchBot, Claude-User, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, Meta-ExternalAgent, and Bytespider. We also score the training_search_mismatch signal — a site that blocks all training bots while allowing search bots takes a penalty because the combination signals an inconsistent AI policy.

We apply RFC 9309 inheritance: bots with no explicit section inherit the User-agent: * rule. A permissive wildcard rule is a pass for all uncovered bots. We weight retrieval bots (OAI-SearchBot, ChatGPT-User, Claude-SearchBot, Claude-User, PerplexityBot) higher than training bots — the immediate commercial impact of blocking search-type bots is larger.
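
A small example of how that inheritance plays out — only GPTBot has an explicit section here, so the other eleven bots we check match the wildcard group:

```
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
```

Under RFC 9309, ClaudeBot, PerplexityBot, and the rest fall back to the User-agent: * group and are allowed; only GPTBot is blocked.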

### 3. AI Discovery (15 points max)

Is llms.txt present and parseable? We check for the file at /llms.txt, validate it has a top-level H1 and at least one section, and check for a /llms-full.txt companion. 15 points reflects that llms.txt is a genuine differentiator — present on under 20% of major domains — but not yet universal infrastructure.

### 4. Agent Integration (13 points max)

The emerging-standards category. These specs have <1% adoption across the web but signal forward-looking infrastructure. We weight each proportionally to its maturity:

- OpenAPI spec at standard well-known paths — 3pts (OpenAPI is established; well-known path discovery is new)
- /.well-known/agent-card.json (A2A protocol) — 2pts
- /.well-known/oauth-protected-resource (RFC 9728) — 2pts
- Listed in Smithery MCP registry — 2pts
- /.well-known/mcp.json (informal MCP card) — 1pt
- /.well-known/api catalog — 1pt
- /.well-known/http-message-signatures-directory (Web Bot Auth draft) — 1pt
- /.well-known/agent-skills/index.json — 1pt

### 5. AI Visibility (34 points max)

Content legibility and identity signals. This category measures how well AI systems can extract meaning from your site:

- JSON-LD structured data (Schema.org) — 8pts: the primary machine-readable identity signal
- Open Graph meta tags — 5pts: used for content preview and entity extraction
- sameAs entity linking — 5pts: links your domain to Wikidata, Wikipedia, LinkedIn in JSON-LD
- Content efficiency (text-to-HTML ratio) — 5pts: JS-heavy sites penalised for AI legibility
- Content-Signal directive (robots.txt or X-Robots-Tag) — 3pts
- Speakable Schema.org markup — 2pts
- /pricing.md machine-readable pricing — 2pts
- Agent-mode view (non-HTML response to AI UA) — 2pts
- Markdown content negotiation (Accept: text/markdown) — 2pts

### 6. GEO Presence (46 points max)

The GEO-specific category — measuring whether AI systems actually know and recommend your brand. This is what distinguishes Hidden Layer from infrastructure-only audits.

- LLM cold recall — 15pts: we probe a language model with no tools or context and ask it to describe your domain. Pass = model recognises and accurately describes you.
- LLM category share of voice — 10pts: we ask the model to list the top 10 brands in your industry. Pass = your brand appears.
- Wikipedia / Wikidata presence — 8pts: the strongest predictor of LLM citation accuracy in our dataset.
- HN mentions (Algolia HN API) — 5pts: HN is a high-weight LLM training corpus source.
- Brand-name search (DuckDuckGo instant) — 5pts: brand search returns your domain.
- Reddit mentions — 3pts: community corpus presence.

### 7. Commerce (2 points max)

Payment-pointer and x402 protocol presence. Very early-stage — most sites score 0 here. This will expand as AI-native payment protocols mature.

### 8. Product Pages (variable)

For domains with e-commerce product pages, we auto-discover product URLs from the sitemap and audit up to 10 pages for Schema.org product schema (Product, ProductGroup, IndividualProduct), completeness of offers/price/availability, image and brand presence, and aggregate ratings. Scored as a percentage of observed product page completeness; weight adjusts to domain size.

## The grade scale

| Grade | Score | What it means |
| --- | --- | --- |
| A | 90–100% | AI-optimised: llms.txt, strong discoverability, explicit bot policies, schema complete |
| B | 75–89% | AI-friendly: bots allowed, llms.txt present, some GEO signals in place |
| C | 60–74% | AI-accessible: core discoverability works, higher-order signals missing |
| D | 45–59% | AI-limited: significant blocks or gaps, agents struggle to get accurate information |
| F | 0–44% | AI-inaccessible: CDN blocks, critical failures, or deliberate AI exclusion |

## Common misreads

"A D score means AI can't find us." Not necessarily. A D often means you're accessible but haven't implemented bot-access rules or llms.txt explicitly. Agents can still reach you; they have less policy certainty and fewer curated signals to work from.

"An A means we're done." The spec is evolving. An A today is passing the current 46+ checks. Weights will shift as adoption rises — A2A and OAuth resource metadata are currently at 1–2pts because <1% of sites implement them. When 10% do, the weight goes up.

"Low score is about content quality." We don't score content. We score discoverability, access signals, and LLM-observable presence. A site with brilliant content behind a CDN firewall scores the same as a site with no content behind the same firewall.

"The LLM check is subjective." The GEO Presence checks use a deterministic probe with a fixed model (llama-3.1-8b via CF Workers AI) and structured parsing. The same domain at the same time should return the same result. The model is updated by CF — results may shift when the model checkpoint updates.

## What we don't check (yet)

Render gap analysis requires a headless browser — measuring what JS-rendered content looks like to AI crawlers vs what the raw HTTP response contains. Not free at scale, so it's not scored. It's the biggest single gap in the current methodology.


**Tags:** Scoring, Methodology, Agent Readiness]]></content:encoded>
      <category>Methodology</category>
      <tag>Scoring</tag>
      <tag>Methodology</tag>
      <tag>Agent Readiness</tag>
    </item>

    <item>
      <title>Product schema in the AI era: why your store&apos;s JSON-LD is now table stakes</title>
      <link>https://hidden-layer-blogs.pages.dev/post/product-schema-ai-commerce</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/product-schema-ai-commerce</guid>
      <pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate>
      <description>AI shopping agents don&apos;t browse — they parse. If your product pages don&apos;t emit Product, Offer, and AggregateRating in server-rendered JSON-LD, you&apos;re invisible to the next wave of commerce traffic.</description>
      <content:encoded><![CDATA[# Product schema in the AI era: why your store's JSON-LD is now table stakes

> AI shopping agents don't browse — they parse. If your product pages don't emit Product, Offer, and AggregateRating in server-rendered JSON-LD, you're invisible to the next wave of commerce traffic.

**Category:** Structured Data | **Date:** 2026-05-11 | **Read:** 7 min

---

When a user asks ChatGPT 'what are the best sustainable running shoes under $150?', the model doesn't open a browser and shop. It draws from training data and, increasingly, from real-time agent-browsing where AI tools fetch product pages and extract structured information. What they find — or don't find — determines whether your products appear in AI-generated recommendations.

The mechanism is product schema: JSON-LD embedded in your page HTML that declares product name, price, availability, ratings, brand, and identifiers in a machine-readable format. It's been used by Google for rich results since 2012. In 2026, it's become the primary signal AI shopping agents use to extract product information without running JavaScript.

## What AI agents actually extract from a product page

When an AI agent visits a product page, it issues an HTTP GET — the same request a curl command makes. It receives the server-rendered HTML. On a well-built commerce site, embedded somewhere in that HTML is a `<script type="application/ld+json">` block containing the product's canonical data.

An agent looking at a product page wants:

- name — the canonical product name
- offers — current price, currency, and availability (InStock / OutOfStock)
- image — at least one image URL for visual context
- description — a text description the agent can cite or summarise
- brand — brand name for attribution
- aggregateRating — review score and count (trust signal)
- sku or gtin13 — product identifiers for comparison across retailers

If that JSON-LD block is missing, the agent has to parse free-form HTML — notoriously unreliable — or return nothing. A missing schema block is a silent revenue leak: the product exists, but AI agents can't reliably describe or recommend it.
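
Put together, a minimal Product block carrying those fields looks roughly like this — the values are illustrative, not a real product:

```
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Trail Runner 2",
  "description": "Lightweight trail running shoe with a recycled mesh upper.",
  "image": "https://yourstore.com/images/trail-runner-2.jpg",
  "brand": { "@type": "Brand", "name": "YourBrand" },
  "sku": "TR2-BLK-10",
  "gtin13": "0000000000000",
  "aggregateRating": { "@type": "AggregateRating", "ratingValue": "4.6", "reviewCount": "212" },
  "offers": {
    "@type": "Offer",
    "price": "129.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
```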

## Shopify and ProductGroup: the variant schema problem

Shopify stores add a complication: variant products (a shoe in 10 sizes and 4 colours) emit `@type: ProductGroup` rather than `@type: Product`. ProductGroup is a Schema.org type introduced to handle this pattern — the group has name, brand, and offers, and each variant is nested under `hasVariant`.

This is correct Schema.org. But an AI agent — or an audit tool — checking for `@type: Product` will find nothing and score the page as missing product schema. Hidden Layer's product audit now accepts ProductGroup, IndividualProduct, and Product as valid types. But many third-party AI tools don't. If your Shopify store has been flagged as 'missing structured data' by SEO tools, check whether they're checking for ProductGroup.

```
// Valid Shopify product page schema (simplified)
{
  "@context": "https://schema.org",
  "@type": "ProductGroup",
  "name": "Allbirds Men's Tree Runners",
  "brand": { "@type": "Brand", "name": "Allbirds" },
  "image": "https://cdn.allbirds.com/image/upload/...",
  "description": "Lightweight running shoes made from eucalyptus tree fiber.",
  "offers": {
    "@type": "AggregateOffer",
    "priceCurrency": "USD",
    "lowPrice": "110",
    "highPrice": "145",
    "availability": "https://schema.org/InStock"
  },
  "hasVariant": [
    {
      "@type": "Product",
      "name": "Allbirds Men's Tree Runners — Size 10",
      "sku": "M_TR_10_NGMW",
      "offers": { "@type": "Offer", "price": "110", "priceCurrency": "USD" }
    }
    // ...103 more variants
  ]
}
```

## The five signals that determine your product schema score

Hidden Layer scores product pages against eight checks. The five most commonly missing:

| Signal | Points | Why it matters |
| --- | --- | --- |
| Product/ProductGroup schema present | 10 | Primary signal — without this, nothing else counts |
| offers with price + availability | 8 | AI agents need current price to recommend or compare |
| aggregateRating present | 6 | Trust signal — models weight review scores in recommendations |
| image URL present | 4 | Visual context for multimodal models and shopping interfaces |
| brand present | 3 | Attribution — links product to brand entity in training data |

Schema completeness matters beyond the audit score. When a model synthesises a product recommendation, it tends to name products it has complete, consistent information about. A product with name, price, brand, and reviews in structured data is more likely to be cited accurately than one where the model had to infer from free-form text.

## Discovery: can AI agents even find your product pages?

Structured data on product pages is only half the problem. AI agents also need to discover which pages are product pages in the first place. The primary mechanism is your sitemap.xml.

Shopify stores typically expose `/sitemap.xml` which links to sub-sitemaps by type: `/sitemap_products_1.xml`, `/sitemap_pages_1.xml`, etc. An agent that correctly fans out from the root sitemap will find all product URLs. But many Shopify themes use custom sitemap generators or disable the built-in sitemap entirely — leaving AI agents unable to discover the product catalogue without scraping navigation links.

The concrete test: fetch your sitemap.xml and count the product URLs. If that number is zero or suspiciously low, check your Shopify sitemap settings and whether your theme overrides the default.
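
A rough way to run that test from the command line — the grep patterns assume Shopify-style product paths, so adjust for your platform:

```
# 1. Does the root sitemap reference any product sub-sitemaps?
curl -s https://yourdomain.com/sitemap.xml | grep -o '<loc>[^<]*products[^<]*</loc>'

# 2. Paste one of the returned URLs here (change any &amp; back to &) and count the entries
curl -s 'SUB_SITEMAP_URL' | grep -c '<loc>'
```

If the first command returns nothing, agents can't fan out from your root sitemap to the catalogue at all.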

## The bot access problem: WAF rules that block AI shopping agents

The second common failure mode is WAF rules. Cloudflare's Bot Fight Mode, enabled by default on many Shopify stores, blocks requests from non-browser user agents. AI shopping agents that browse product pages to extract information — OAI-SearchBot, Claude-User, PerplexityBot — arrive with non-browser UAs and get 403 responses.

A 403 on a product page doesn't just fail the request — it means the product catalogue is invisible to that AI system for all future requests until the block is lifted. Cloudflare's dashboard has a 'Verified Bots' policy that explicitly allows listed AI crawlers through WAF. Enabling it takes two minutes and restores access for every crawler on the list.

## Quick checklist: product AI-readiness in 30 minutes

1. Fetch a product URL with curl: `curl -s https://yourdomain.com/products/your-product | grep "application/ld+json"`. If nothing returns, your product schema is missing or injected by JavaScript after load.
2. Check the schema type: grep for `"@type": "Product"` or `"@type": "ProductGroup"`. Both are valid. If you see `ProductGroup`, verify it has at least name, offers, and image at the group level.
3. Verify offers completeness: `price`, `priceCurrency`, and `availability` should all be present. Missing availability is the most common gap in commerce schema.
4. Check your sitemap: `curl -s https://yourdomain.com/sitemap.xml | grep sitemap`. Count how many product sub-sitemaps are listed. Zero means AI agents can't discover your catalogue.
5. Check your Cloudflare WAF: Dashboard → Security → Bots → Bot Fight Mode → configure verified bots policy to allow AI crawlers.
6. Run a Hidden Layer audit: the product_pages category in the result shows per-page schema completeness and discovery status.

Product schema has been best practice for SEO since 2012. In 2026, it's becoming load-bearing infrastructure for AI commerce. The stores that get product recommendations in AI-generated shopping guides are the ones where an agent can fetch a URL, parse a JSON-LD block, and extract price, brand, availability, and reviews in under 100ms. That's the bar.


**Tags:** Product Schema, JSON-LD, E-commerce, Shopify]]></content:encoded>
      <category>Structured Data</category>
      <tag>Product Schema</tag>
      <tag>JSON-LD</tag>
      <tag>E-commerce</tag>
      <tag>Shopify</tag>
    </item>

    <item>
      <title>Cold recall: the 15-point GEO check that tells you if AI knows your brand</title>
      <link>https://hidden-layer-blogs.pages.dev/post/cold-recall</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/cold-recall</guid>
      <pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate>
      <description>We ask a language model — no tools, no search, just training knowledge — to describe your domain. 15 points. The highest-weight single check in the audit. Here&apos;s why it matters and how to move the needle.</description>
      <content:encoded><![CDATA[# Cold recall: the 15-point GEO check that tells you if AI knows your brand

> We ask a language model — no tools, no search, just training knowledge — to describe your domain. 15 points. The highest-weight single check in the audit. Here's why it matters and how to move the needle.

**Category:** GEO Fundamentals | **Date:** 2026-05-11 | **Read:** 8 min

---

Most GEO signals are infrastructure: robots.txt entries, llms.txt files, JSON-LD blocks. You add them, you deploy, you verify. Cold recall is different.

Cold recall measures whether a language model — questioned with no tools, no search, no retrieval context — can accurately describe your brand from its training data alone. It's the 15-point check at the heart of the GEO Presence category and the highest-weight single signal in the Hidden Layer audit.

## What the test actually does

When Hidden Layer audits your domain, it issues a prompt to a language model (llama-3.1-8b via Cloudflare Workers AI). The prompt is: 'What is [your domain]? Describe what it does in 2-3 sentences. If you have no training data about this specific website, reply with exactly: Unknown - not in training data.'

The model runs with no tools enabled, no web search, no retrieval-augmented context. It answers entirely from knowledge absorbed during training. Pass conditions: the model names your brand and describes it in terms consistent with your industry. Fail: the model says 'Unknown', hallucinates a different company entirely, or confuses you with a competitor.

This is not an arbitrary test. It's the closest proxy available for the question 'does this brand exist in the AI consciousness?'
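
You can approximate the probe yourself against the same model family. This is a sketch, not the audit's exact implementation — it assumes a Cloudflare account with Workers AI enabled, and the model id is a guess at the binding named above:

```
curl -s "https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/ai/run/@cf/meta/llama-3.1-8b-instruct" \
  -H "Authorization: Bearer $CF_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is yourdomain.com? Describe what it does in 2-3 sentences. If you have no training data about this specific website, reply with exactly: Unknown - not in training data."
      }
    ]
  }'
```

The raw response is the same kind of material the audit parses for a pass or fail.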

## Why cold recall is worth 15 points

The 15 points reflect a simple reality: cold recall is the GEO signal that can't be faked. You can ship a llms.txt in an afternoon. You can add JSON-LD schema to your homepage in an hour. You cannot retroactively change what's in a model's training weights.

Infrastructure signals tell crawlers how to access your content. Cold recall measures whether the content was worth crawling. A brand with strong cold recall is already present in AI conversations even without optimal infrastructure. A brand that fails cold recall may be fully instrumented — perfect llms.txt, explicit bot permissions, complete schema — but the model has nothing to recall.

In terms of user impact: when someone asks Claude 'recommend me a [your category] company for [your use case]', the answer comes from training data. Not from your llms.txt. Not from your robots.txt. From the model's weights. Cold recall is the signal that measures whether you're in those weights at all.

## The feedback loop: how brands end up in training data

LLM training data isn't curated — it's scraped. Models like GPT-4, Claude, and Llama are trained on web crawls: Common Crawl, C4, the Pile, and proprietary variants. The model's weights reflect the frequency and quality of references to your brand across the web.

The key datasets feeding most public LLMs:

- Common Crawl — a monthly crawl of ~3 billion pages. The backbone of most LLM pretraining sets. Getting crawled here reliably is table stakes.
- Wikipedia — the highest-density authoritative signal in most training sets. A Wikipedia article about your company is the single highest-value action you can take for cold recall.
- Hacker News (via the Algolia HN dataset) — engineering-adjacent brands with HN posts and active comment threads appear disproportionately in coding assistants and developer-focused models.
- Reddit — r/[your category] discussions are massively over-indexed in training data relative to their raw word count. A brand mentioned in 100 Reddit threads outweighs a single press release.
- Academic papers and tech reports — essential for AI/ML brands. Being cited in a paper that describes a dataset or benchmark creates a uniquely durable presence in model weights.
- GitHub READMEs — repositories that mention your brand as a tool, dependency, or comparison are read by code-focused LLMs. Integrations, plugins, and client libraries all contribute.

Models typically cut off training data 6–18 months before release, then may run for 12–24 months before being superseded. You can't get into the training data of a model that has already shipped — its weights are frozen — but getting into the training data of the *next* model is the bet you're making today.

## What fails the test (and why)

Several patterns cause brands to fail cold recall despite being legitimate, established companies:

- Too new — the training cutoff predates the company's launch or public presence.
- Geography-specific — a regional brand strong in one country but absent from the English-language web that dominates most training sets.
- B2B only — brands with no consumer presence, no public documentation, and no coverage in general-interest publications that feed training corpora.
- Brand name collision — a name shared with a much larger entity. The model knows the bigger brand and either ignores or confuses the smaller one.
- Content behind login — meaningful coverage exists but is paywalled, behind a corporate intranet, or on platforms not crawled for training.

The common thread: absence from the open, crawlable, attributable web. The model knows what the training crawler could read.

## How to improve cold recall: a priority order

1. Build a Wikipedia article (or claim your stub). Not just a mention — a standalone article about the company meeting Wikipedia notability guidelines. This is the highest-signal single action you can take, and it takes 2–6 weeks from submission to approval if coverage exists.
2. Get cited in at least 3 independent tech or industry publications. Journalist coverage in TechCrunch, Wired, The Verge, or relevant vertical press gets crawled by Common Crawl and surfaces in nearly every training corpus.
3. Ship a Show HN post and participate in the comments. Aim for >100 points. An HN post with engaged discussion will appear in model weights for years.
4. Publish a technical blog post that other developers reference or link to. Inbound links in READMEs and tutorials dramatically amplify corpus presence for developer tools.
5. Get mentioned in Reddit threads in relevant subreddits. These don't need to be your own posts — third-party comparisons and recommendations carry more weight.
6. Open-source something useful. A library or tool with GitHub stars generates organic README mentions, dependency declarations, and tutorial content across the entire developer web.
7. Get an analyst or research mention. Gartner Magic Quadrant, Forrester Wave, or a university study citing your product is a high-authority signal the models weight heavily.

## The timing problem

Training a 70B+ parameter model takes months. Models release with a knowledge cutoff 6–18 months before launch. You're building for the next checkpoint, not the current one.

The practical implication: cold recall improvements take 12–24 months to fully materialise across deployed models. The companies that will have strong cold recall in 2027 are doing the corpus-building work today. Wikipedia article written in early 2026 → crawled by Common Crawl → in the 2026 Q3 training run → deployed in a model in early 2027.

This is the opposite of infrastructure signals, which take effect on the next crawl cycle (days to weeks). Cold recall is a long-game investment. The brands winning AI recommendations in 2027 started building training corpus presence in 2025.

## Run the test on your domain

Hidden Layer's GEO audit includes the cold recall check as the first signal in the GEO Presence category. The result shows the model's raw response — not just pass/fail — so you can read exactly what the model 'knows' about your brand.

A pass shows the model recognising your brand accurately. A fail shows either 'Unknown' or, more informatively, whatever the model believes about you — which may be outdated, incomplete, or confused with a competitor. Both outcomes are actionable: you know exactly what the current model state is, and you have the corpus-building checklist above to improve the next one.


**Tags:** Cold Recall, GEO, LLM Training, Wikipedia]]></content:encoded>
      <category>GEO Fundamentals</category>
      <tag>Cold Recall</tag>
      <tag>GEO</tag>
      <tag>LLM Training</tag>
      <tag>Wikipedia</tag>
    </item>

    <item>
      <title>The render gap: why your website is invisible to AI</title>
      <link>https://hidden-layer-blogs.pages.dev/post/render-gap</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/render-gap</guid>
      <pubDate>Fri, 01 May 2026 00:00:00 GMT</pubDate>
      <description>Your homepage loads beautifully in a browser. An AI crawler sees a skeleton. Here&apos;s why — and how to measure the gap for your own domain.</description>
      <content:encoded><![CDATA[# The render gap: why your website is invisible to AI

> Your homepage loads beautifully in a browser. An AI crawler sees a skeleton. Here's why — and how to measure the gap for your own domain.

**Category:** GEO Fundamentals | **Date:** 2026-05-01 | **Read:** 5 min

---

Modern websites are JavaScript-first. The browser downloads a mostly-empty HTML shell, runs hundreds of kilobytes of JS, calls a dozen APIs, then paints the UI. For a human with a fast laptop this is invisible — the page feels instant.

For an AI crawler, it's a wall.

## What AI crawlers actually receive

Most AI training and indexing bots — GPTBot, ClaudeBot, PerplexityBot, Google-Extended — are HTTP crawlers. They fetch a URL, read the response body, and move on. They don't run JavaScript. They don't execute React hydration. They don't wait for API calls to populate content.

So when an AI crawler visits a React or Next.js site that relies on client-side data fetching, it sees the skeleton: a `<div id="__next"></div>` and some script tags. Navigation items rendered by JavaScript? Gone. Product descriptions fetched from an API? Gone. Pricing tables, specifications, structured data injected after mount? Gone.

The AI model trained on this content — or the LLM agent visiting your site to complete a task — works with a stripped, meaningless version of your content.

## Server-side rendering closes the gap

Next.js, Nuxt, SvelteKit, and Remix all support server-side rendering (SSR) or static generation (SSG). In Next.js, when you use getServerSideProps, getStaticProps, or the App Router's async server components — the other frameworks have equivalents — the full HTML, including all content, is in the initial HTTP response. That's what the crawler receives.

The fix isn't to add a special AI-friendly mode. It's to ship HTML that means something without JavaScript. SSR and SSG are already the right architecture for performance, SEO, and accessibility. GEO readiness is another reason to get there.

## How to measure your render gap

Two approaches:

1. curl test: `curl -s -A "ClaudeBot/1.0" https://yourdomain.com | grep -o '<p[ >]' | wc -l`. Using `grep -o` counts every paragraph tag even when the HTML is minified onto a single line (where `grep -c` would report 1). If the count is near-zero while your live page has dozens of content blocks, you have a gap.
2. Disable JS in DevTools. Chrome DevTools → Settings → Debugger → Disable JavaScript. Reload. What you see is roughly what AI crawlers see. If the page is blank or barely functional, your render gap is severe.

Hidden Layer's audit checks this automatically — comparing the static HTML response against expected structural markers. A site with a high render-gap score in the AI Visibility category has content locked behind JavaScript that AI agents can't reach.

## What about dynamic content?

Not all dynamic content needs to be in the initial render. Product reviews loaded on scroll, personalised recommendations, live inventory — these are fine as client-side. The content that matters for AI is the canonical information: product specs, prices, descriptions, structured data, navigation, contact information.

If that content is in your initial HTML, AI crawlers can read it. If it's hydrated in later, they can't.


**Tags:** Render Gap, AI Crawlers, GEO]]></content:encoded>
      <category>GEO Fundamentals</category>
      <tag>Render Gap</tag>
      <tag>AI Crawlers</tag>
      <tag>GEO</tag>
    </item>

    <item>
      <title>robots.txt in the AI era: what 10 major brands got wrong</title>
      <link>https://hidden-layer-blogs.pages.dev/post/robots-ai-bots</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/robots-ai-bots</guid>
      <pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate>
      <description>robots.txt was designed for search-engine crawlers in 1994. Now 12+ AI bots use it as policy. Most brands haven&apos;t updated their files in years — and it shows.</description>
      <content:encoded><![CDATA[# robots.txt in the AI era: what 10 major brands got wrong

> robots.txt was designed for search-engine crawlers in 1994. Now 12+ AI bots use it as policy. Most brands haven't updated their files in years — and it shows.

**Category:** Bot Policy | **Date:** 2026-04-28 | **Read:** 6 min

---

The Robots Exclusion Protocol was proposed by Martijn Koster in 1994 to let webmasters tell search engine crawlers which pages to skip. For thirty years, it worked fine. Google, Bing, Yahoo — a handful of well-behaved crawlers that most teams understood.

Then 2023 happened. OpenAI launched GPTBot. Anthropic published ClaudeBot documentation. Perplexity, Cohere, Meta, Apple, ByteDance — each with their own crawler UA string. Suddenly robots.txt became AI policy, and most brands had no policy at all.

## What the major bots actually check

Each AI crawler looks for its own user-agent string in robots.txt. GPTBot looks for `User-agent: GPTBot`. ClaudeBot looks for `User-agent: ClaudeBot`. If neither is present, most crawlers fall back to the `User-agent: *` rule — which for most sites is `Allow: /`.

This means an absence of AI bot rules isn't neutral. On most sites, it means all AI crawlers are implicitly allowed. That may be your intent — but it should be a conscious decision, not a default.

## The 403 trap

Some brands use Cloudflare's Bot Fight Mode or similar WAF rules to block non-browser traffic at the edge. When an AI crawler hits the robots.txt endpoint and receives a 403, it has no way to distinguish "this file doesn't exist" from "this domain has a CDN that blocks all bots." The practical result is equivalent to a blanket Disallow: / — the crawler treats the domain as inaccessible.

If you're using edge-level bot blocking, verify that your CDN rules pass through well-known AI crawler UAs. Cloudflare's verified bot list now includes GPTBot, ClaudeBot, and PerplexityBot — enabling "Allow verified bots" in the dashboard restores access.
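
The fastest way to check whether your edge — rather than your robots.txt — is setting AI policy:

```
curl -s -o /dev/null -w "%{http_code}\n" -A "GPTBot/1.2" https://yourdomain.com/robots.txt
curl -s -o /dev/null -w "%{http_code}\n" -A "Mozilla/5.0" https://yourdomain.com/robots.txt
```

A 403 on the first line and a 200 on the second means a WAF rule, not your robots.txt, is making the decision.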

## Training vs. browsing bots: different rules, same file

Not all AI bots are the same. GPTBot and ClaudeBot are training crawlers — they download content for model training. OAI-SearchBot, Claude-SearchBot, and PerplexityBot are retrieval crawlers — they fetch live content when a user asks the LLM to browse the web. ChatGPT-User and Claude-User are on-demand fetchers — triggered by a user asking the AI to read a specific URL.

You may want different policies for each type. A brand that doesn't want AI training on its content might still want retrieval bots to send the AI tool's users to their pages. robots.txt supports this with separate User-agent blocks.

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /
```

## Content-Signals: the emerging extension

The Content Signals draft adds a structured Content-Signal line to robots.txt, declaring per-use intent — search, ai-input, ai-train — with more granularity than a binary Allow/Disallow. It's not a formal standard yet, but adding the line costs nothing and signals intent to crawlers that check for it.

```
Content-Signal: search=yes, ai-input=yes, ai-train=no
```

## Where most brands fail

Running Hidden Layer audits on 50+ domains reveals a consistent pattern: most brands have robots.txt files that predate AI crawlers entirely. They block Googlebot-Image and Bing crawlers with precision — and have no mention of GPTBot, ClaudeBot, or any other AI UA. Policy by omission.

The fix takes 10 minutes: add explicit User-agent blocks for each AI crawler with the Allow or Disallow that reflects your actual intent. Write it like policy, because that's what it is.


**Tags:** robots.txt, AI Bots, Bot Policy]]></content:encoded>
      <category>Bot Policy</category>
      <tag>robots.txt</tag>
      <tag>AI Bots</tag>
      <tag>Bot Policy</tag>
    </item>

    <item>
      <title>llms.txt: a year in, what actually works</title>
      <link>https://hidden-layer-blogs.pages.dev/post/llms-txt-guide</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/llms-txt-guide</guid>
      <pubDate>Wed, 22 Apr 2026 00:00:00 GMT</pubDate>
      <description>Shopify has one. Samsung has one. Anthropic and OpenAI don&apos;t. A practical look at what llms.txt does, what format AI models actually process, and whether it&apos;s worth your time.</description>
      <content:encoded><![CDATA[# llms.txt: a year in, what actually works

> Shopify has one. Samsung has one. Anthropic and OpenAI don't. A practical look at what llms.txt does, what format AI models actually process, and whether it's worth your time.

**Category:** Standards | **Date:** 2026-04-22 | **Read:** 7 min

---

In September 2024, Jeremy Howard proposed llms.txt: a markdown file at /llms.txt giving AI systems a curated overview of your site's content, purpose, and key pages. The idea was simple — the same way robots.txt tells crawlers what to avoid, llms.txt tells LLMs what to know.

A year in, adoption is uneven, format is inconsistent, and the spec has evolved. Here's what we know.

## What the spec actually says

The canonical spec at llmstxt.org defines a minimal structure:

- A top-level heading (# Site Name) with a brief description as blockquote
- An optional ## Description section with detailed context
- Optional sections with markdown links to key pages
- An optional /llms-full.txt companion with extended content

The file should be plain markdown, UTF-8, served at /llms.txt with Content-Type: text/plain. CORS headers are needed if AI tools fetch it cross-origin: Access-Control-Allow-Origin: *.
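
A minimal file following that structure — the domain and sections are illustrative:

```
# Example Corp

> Example Corp makes workflow automation software for mid-market finance teams.

## Products

- [Workflow Builder](https://example.com/products/builder): no-code automation for approval chains
- [Audit Trail](https://example.com/products/audit): immutable logs for compliance teams

## Developer docs

- [API reference](https://example.com/docs/api): REST API covering every product feature
- [Webhooks guide](https://example.com/docs/webhooks): event payloads and retry behaviour
```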

## Who has adopted it (as of May 2026)

Shopify — Yes. 860 bytes. Clean structure with product/partner/developer sections. One of the earliest enterprise adopters.

Samsung — Yes. 4KB. Explicit AI bot rules in robots.txt plus a structured llms.txt with product family sections.

Anthropic — No. Despite being the company behind Claude, anthropic.com has no llms.txt as of this writing.

OpenAI — No. openai.com has no llms.txt either. Neither company dogfoods the signal their crawlers respect.

## Does it actually affect LLM outputs?

This is the honest question. The answer is: sometimes, for retrieval-based queries, yes.

When a user asks an LLM tool like Perplexity or Claude's web search to "explain what [brand] does," the tool fetches the site. If llms.txt is present and well-structured, it's often what gets retrieved and parsed — it's small, machine-readable, and explicitly curated. The LLM answer is more accurate than if it had to parse the homepage's marketing copy.

For training-based knowledge (what the model knows without looking anything up), llms.txt has no effect. The model was trained on whatever content was crawled before cutoff.

## What format works best

Based on testing with Claude, GPT-4o, and Perplexity:

- Lead with a one-sentence description as the first paragraph after the H1. This is what gets extracted as a summary.
- Use H2 sections with meaningful names — "Products", "Pricing", "Developer docs" — not "Section 1".
- Link to your most important pages with descriptive anchor text. The link text matters more than the URL.
- Keep it under 5KB. The point is curation, not documentation. If you need more, use llms-full.txt.
- Update it quarterly. A stale llms.txt pointing to deprecated pages is worse than none.

## Is it worth it?

For most sites: yes, because it takes 30 minutes once and the upside is durable. The bar for AI discoverability is low enough that a well-structured llms.txt is genuinely differentiating.

For B2B SaaS: critical. Your buyers are using LLMs to research vendors. A good llms.txt that explains your ICP, pricing model, and integration story directly influences how Claude or GPT describes you to a buyer who asks "what's the best tool for X?"


**Tags:** llms.txt, AI Discovery, Standards]]></content:encoded>
      <category>Standards</category>
      <tag>llms.txt</tag>
      <tag>AI Discovery</tag>
      <tag>Standards</tag>
    </item>

    <item>
      <title>The 12 AI bots crawling your site right now</title>
      <link>https://hidden-layer-blogs.pages.dev/post/ai-bot-registry</link>
      <guid isPermaLink="true">https://hidden-layer-blogs.pages.dev/post/ai-bot-registry</guid>
      <pubDate>Wed, 08 Apr 2026 00:00:00 GMT</pubDate>
      <description>A field guide to every major AI crawler: what company runs it, what it does with your content, whether it respects robots.txt, and what UA string to put in your rules.</description>
      <content:encoded><![CDATA[# The 12 AI bots crawling your site right now

> A field guide to every major AI crawler: what company runs it, what it does with your content, whether it respects robots.txt, and what UA string to put in your rules.

**Category:** Bot Registry | **Date:** 2026-04-08 | **Read:** 8 min

---

There are twelve AI crawlers that matter right now. Some train models. Some power live search. Some fetch content on-demand when a user asks an AI tool to read a URL. All of them use your robots.txt as policy — if you have one.

## Training crawlers

Training crawlers download content at scale for use in model training datasets. They're not tied to live user queries — they run on schedules, build large corpora, and the content they collect shows up in model behaviour months later.

### GPTBot — OpenAI

UA: Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2; +https://openai.com/gptbot)

OpenAI's primary training crawler. Respects robots.txt. Blocking GPTBot means your content doesn't appear in future GPT training runs — it doesn't affect existing model knowledge or live ChatGPT browsing.

### ClaudeBot — Anthropic

UA: Mozilla/5.0 AppleWebKit/537.36 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com)

Anthropic's training crawler. Respects robots.txt. Like GPTBot, blocking it affects future training, not current model knowledge.

### Google-Extended — Google

Separate from Googlebot (search indexing). Controls whether your content is used to train Gemini. Can be blocked independently without affecting Google Search.

### Applebot-Extended — Apple

Apple's AI training crawler, used for training Apple Intelligence models. Distinct from Applebot (used for Spotlight and Siri search results).

### CCBot — Common Crawl

Non-profit crawler that publishes a public web archive. Many open-source models train on Common Crawl data. Frequently blocked by brands concerned about AI training.

### Meta-ExternalAgent — Meta

Used for training Meta's Llama and other AI models. Respects robots.txt.

### Bytespider — ByteDance

ByteDance's crawler, used for TikTok, Doubao, and other AI products. Historically controversial for its robots.txt compliance. Hidden Layer recommends blocking by default unless you have specific reason to allow ByteDance AI training.

## Retrieval/search crawlers

These bots fetch content in response to live user queries — when you ask Perplexity a question, it sends PerplexityBot to retrieve current information.

### OAI-SearchBot — OpenAI

Powers ChatGPT's web browsing and SearchGPT. This is the bot you want to allow even if you block GPTBot — it sends users to your site and appears as a referral source in analytics.

### Claude-SearchBot — Anthropic

Powers Claude's web search feature. Same commercial reasoning as OAI-SearchBot — this is traffic and discovery, not training.

### PerplexityBot — Perplexity

Powers Perplexity's real-time answers. Blocking PerplexityBot means your content doesn't appear in Perplexity answers, which for some industries is now a meaningful traffic source.

## On-demand fetch bots

### ChatGPT-User — OpenAI

Used when a ChatGPT user shares a URL in conversation. Appears as a session visit, not a continuous crawl.

### Claude-User — Anthropic

When a user pastes a URL and asks Claude to read it, this is the UA string used.

## The robots.txt template

For most brands, the recommended starting point:

```
# Training crawlers — allow by default
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Applebot-Extended
Allow: /

# Training crawlers — common to block
User-agent: CCBot
Disallow: /

User-agent: Bytespider
Disallow: /

# Retrieval bots — almost always allow
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# On-demand fetch — almost always allow
User-agent: ChatGPT-User
Allow: /

User-agent: Claude-User
Allow: /
```

This makes your intent explicit rather than relying on the User-agent: * fallback. Update it when new AI crawlers launch — this list will grow.


**Tags:** AI Bots, Bot Registry, robots.txt]]></content:encoded>
      <category>Bot Registry</category>
      <tag>AI Bots</tag>
      <tag>Bot Registry</tag>
      <tag>robots.txt</tag>
    </item>
  </channel>
</rss>