How many issues does your crawler actually find?

crawlbench is a fixed corpus of deliberately broken pages — planted SEO and AI-search defects with ground-truth labels — used to score the detection rate of commercial and open-source crawlers. One number, reproducible, versioned.

#  Tool                       Type  License     Detection rate  Found  Missed  False+  Ignored
1  SECrawl                    both  freemium    80%             242    61      0       7,570
2  Screaming Frog SEO Spider  seo   commercial  31%             94     209     0       2,809

Run date: 2026-06-15 · Corpus: v0.1 · 2 tools tested against 303 planted issues across 25 categories.
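
The published detection rates are consistent with found issues divided by total planted issues, rounded to the nearest percent. A minimal sketch of that arithmetic follows, assuming this is the scoring rule; the methodology page remains the authoritative source, and the function name `detection_rate` is illustrative only.

```python
def detection_rate(found: int, missed: int) -> int:
    """Return found / (found + missed) as a whole-number percentage.

    Assumption: the rate is computed over planted issues only, so the
    False+ and Ignored columns do not enter the calculation.
    """
    planted = found + missed
    if planted == 0:
        raise ValueError("corpus has no planted issues")
    return round(100 * found / planted)


# Found and Missed counts copied from the results table (303 planted issues).
results = {
    "SECrawl": (242, 61),
    "Screaming Frog SEO Spider": (94, 209),
}

for tool, (found, missed) in results.items():
    print(f"{tool}: {detection_rate(found, missed)}%")
# SECrawl: 80%
# Screaming Frog SEO Spider: 31%
```

Both rows sum to 303 (242 + 61 and 94 + 209), which matches the corpus size stated in the run summary.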

Planted, not pulled

Every issue in the corpus is intentionally introduced and labelled. The corpus contains no pages pulled from real websites, so there is no ambiguity about whether a reported issue is actually a defect.
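
One way to picture why planted labels remove ambiguity: if each planted issue is recorded as a (page, issue type) pair, scoring reduces to set comparison. The sketch below assumes that label shape. The paths and issue codes are hypothetical and not taken from the corpus, and how unmatched reports are split between the False+ and Ignored columns is left to the methodology page.

```python
# Hypothetical ground truth: each planted issue is a (page path, issue code) pair.
planted = {
    ("/fixtures/a.html", "missing-canonical"),
    ("/fixtures/b.html", "hreflang-no-return-tag"),
    ("/fixtures/c.html", "noindex-in-sitemap"),
}

# Hypothetical crawler report, normalised to the same (path, code) shape.
reported = {
    ("/fixtures/a.html", "missing-canonical"),
    ("/fixtures/c.html", "noindex-in-sitemap"),
    ("/fixtures/c.html", "title-too-long"),
}

found = planted & reported      # planted issues the tool detected
missed = planted - reported     # planted issues the tool did not report
unmatched = reported - planted  # reports with no planted counterpart

print(len(found), len(missed), len(unmatched))  # 2 1 1
```

Because every label is known in advance, a report either matches a planted pair or it does not; no human judgement is needed to decide whether a finding is genuine.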

Reproducible

The corpus is versioned and self-hostable. Anyone can re-run any tool against the same fixtures and replicate the score.

SEO + GEO

Coverage spans classic technical SEO (crawlability, indexability, hreflang) and the newer AI-search readiness category, often called generative engine optimization (GEO), which few tools test rigorously.

What's in scope

The v1 corpus targets seven issue categories. See the methodology page for the full taxonomy and scoring rules.