Research Foundation

Built on peer-reviewed science, not marketing claims.

GEO: Generative Engine Optimization

KDD 2024 — Princeton, Georgia Tech, AI2, IIT Delhi
Tested 9 optimization strategies across 10,000 queries on GEO-bench. Validated on Perplexity.ai with 22-37% real-world improvements. The foundational controlled experiment for GEO.
Key finding: Adding citations to content increases AI visibility by up to +115%. Statistics add +40%. Keyword stuffing has ~0% effect.
Cite Sources +27-115%
Quotations +41%
Statistics +33%
Fluency +29%
Technical Terms +18%
Authority +16%
Readability +14%
Unique Words +7%
Keyword Stuffing ~0%

AutoGEO: Automatic Generative Engine Optimization

ICLR 2026 — Carnegie Mellon University
Uses frontier LLMs to automatically discover optimization rules, then trains compact models via reinforcement learning. Achieved a +50.99% improvement over the best Princeton GEO baseline. Introduces the GEU (Generative Engine Utility) metric.
How we use it: Informs our scoring weight updates and content analysis heuristics. Answer-first structure and passage density checks are based on AutoGEO findings.

C-SEO Bench: Conversational SEO Methods

2025 — Puerto et al.
Found that most conversational SEO methods are largely ineffective and frequently have a negative impact. As adoption increases, gains decrease, which reveals a zero-sum competitive dynamic.
How we use it: Validates our focus on technical infrastructure optimization over content manipulation. We build white-hat GEO tools.

Schema Markup & AI Citations

Growth Marshal, February 2026 — Study of 730 pages
Attribute-rich schema markup earns a 61.7% citation rate vs. 41.6% for generic schema. Generic schema actually underperforms having no schema at all.
How we use it: Our schema richness check differentiates between rich and generic schema. Generic schema gets zero bonus points.
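Example sketch: one way a rich-versus-generic distinction could be scored, assuming the page's schema is available as a JSON-LD string. The function names, the attribute threshold of 5, and the bonus value of 10 are hypothetical illustrations, not values from the Growth Marshal study or from our production scoring. Nested `@graph` structures are not handled here.

```python
import json

# Keys that describe the document structure rather than the entity (assumption).
STRUCTURAL_KEYS = {"@context", "@type", "@id"}


def count_attributes(node):
    """Count non-empty descriptive properties on one JSON-LD object."""
    return sum(
        1
        for key, value in node.items()
        if key not in STRUCTURAL_KEYS and value not in (None, "", [], {})
    )


def schema_richness_bonus(jsonld_text, rich_threshold=5, bonus=10):
    """Return `bonus` for attribute-rich schema and 0 for generic or invalid schema.

    rich_threshold and bonus are illustrative, uncalibrated values.
    """
    try:
        data = json.loads(jsonld_text)
    except json.JSONDecodeError:
        return 0
    nodes = data if isinstance(data, list) else [data]
    richest = max(
        (count_attributes(n) for n in nodes if isinstance(n, dict)),
        default=0,
    )
    return bonus if richest >= rich_threshold else 0
```

In this sketch a bare `Article` with only a `headline` scores zero, while the same type with author, dates, publisher, and image filled in earns the bonus.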

AI Citations Report 2026

OtterlyAI — 1M+ citations analyzed
73% of sites have technical barriers blocking AI crawlers. Reference-grade content receives 3-5x more citations. Reddit is the #1 cited domain across all AI platforms.
How we use it: Our robots.txt, CDN, and JS rendering checks address the 73% barrier problem. Content quality analysis targets reference-grade optimization.
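Example sketch: a robots.txt barrier check could look roughly like the following, using Python's standard `urllib.robotparser`. The crawler list and the function name `blocked_ai_crawlers` are assumptions for illustration. The sketch only inspects robots.txt rules and does not cover CDN blocking or JS rendering.

```python
from urllib.robotparser import RobotFileParser

# Illustrative list of AI crawler user-agent tokens (assumption, not exhaustive).
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_ai_crawlers(robots_txt, url="https://example.com/", agents=AI_CRAWLERS):
    """Return the agents that the given robots.txt text forbids from fetching `url`."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

Passing the robots.txt body as text keeps the check testable offline; fetching the file is left to the caller.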

AI Mode Citation Factors

SE Ranking, 2025 — 2.3M pages analyzed
Domain traffic is the #1 predictor (3x), followed by referring domains (2.5x). Flesch-Kincaid Grade 6-8 is the readability sweet spot. Sections of 100-150 words between headings are optimal. FAQ schema markup has zero impact — only real FAQ content in the body matters.
How we use it: Informs our readability, passage density, FAQ-in-content, and content structure checks.
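Example sketch: the readability and section-length findings could be checked along these lines. The Flesch-Kincaid grade formula is the standard one; the syllable counter is a rough vowel-group heuristic, and the function names and the way sections are passed in as plain text are assumptions for illustration.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count vowel groups (heuristic, not a dictionary lookup)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade: 0.39 * words/sentences + 11.8 * syllables/words - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def section_length_ok(section_text, low=100, high=150):
    """True when the text between two headings falls in the 100-150 word range."""
    return low <= len(section_text.split()) <= high
```

A page-level check could then flag sections where `section_length_ok` is false or where `flesch_kincaid_grade` falls outside roughly 6 to 8.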

GEO Optimizer focuses on infrastructure optimization (robots.txt, llms.txt, schema, meta tags, content structure), not content manipulation. The adversarial research (ETH Zurich, Harvard, UC Berkeley) shows that manipulative techniques degrade the ecosystem. We build tools for white-hat GEO.