What AI Crawlers See – and What They Don't
Much of what is considered standard in everyday SEO is invisible to LLMs in conversation mode. The new Writesonic Study 2026 analyzed six major LLM crawlers (ChatGPT, Claude, Gemini-Conversation, Perplexity, Bing Copilot, Google AI Overviews) and confirms: all content must be delivered in a way that remains understandable without JavaScript, without extensive scrolling, and without metadata in the head.
The Three Crawler Tiers (HTML-only / Headless / Full-Browser)
A simple grid helps with classification, where today's LLM crawlers mostly fall into Tiers 1–2:
- Tier 1: HTML-only Parser – read static HTML, follow links to a limited extent, do not execute JavaScript, do not scroll.
- Tier 2: Headless Light – can parse HTML more robustly and take individual fallbacks (e.g., <noscript>).
- Tier 3: Full-Browser – execute JavaScript, perform complex rendering, and scroll.
However, behaviors from Tiers 1–2 dominate in the study.
What All 6 LLMs Reliably See — and What Not
The Writesonic tests reveal clear patterns:
- 3 out of 6 LLMs do not execute JavaScript. JS-only content (SPAs, client-side injected reviews, lazy-load content) remains invisible.
- 0 out of 6 LLMs read JSON-LD in live conversation mode. Important: JSON-LD remains relevant for the Google search index – two different worlds.
- 0 out of 6 LLMs read meta descriptions or OG tags in conversation mode.
- 5 out of 6 LLMs reliably read the <title> tag. It is thus the most important head element for LLM readability.
- 0 out of 6 LLMs scroll. Content "below the fold" and lazy-loaded images/text blocks are ignored.
- 4 out of 6 LLMs read <noscript> content.
- CSS-hidden content (e.g., display:none, accordions) is visible; ::before/::after pseudo-content is invisible.
- Microdata in the body is read better than JSON-LD in the head.
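To get a feel for these patterns, the following minimal sketch approximates what a non-rendering, HTML-only parser could extract from a page. It is an illustration built on Python's standard html.parser, not a reproduction of any real crawler; the class name HtmlOnlyView and the sample page are assumptions made for this example.

```python
from html.parser import HTMLParser


class HtmlOnlyView(HTMLParser):
    """Collects roughly what a non-rendering, HTML-only parser could read."""

    # Content of these elements is never treated as readable body text.
    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text = []    # text chunks in document order
        self._stack = []  # currently open <title> or skipped elements

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED or tag == "title":
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        chunk = data.strip()
        if not chunk:
            return
        if self._stack and self._stack[-1] == "title":
            self.title += chunk
        elif not self._stack:
            self.text.append(chunk)


# Hypothetical sample page: head metadata, CSS-hidden text, and a JS-injected review.
SAMPLE = """<html><head><title>Trail Shoes | Grip on Wet Rock | Acme</title>
<meta name="description" content="Only in the head">
<script type="application/ld+json">{"@type": "Product"}</script></head>
<body><h1>Trail shoes</h1>
<div style="display:none">Hidden accordion text</div>
<div id="reviews"></div>
<script>document.getElementById("reviews").textContent = "Injected review";</script>
</body></html>"""

view = HtmlOnlyView()
view.feed(SAMPLE)
print(view.title)
print(view.text)
```

In this sample the title, the heading, and the CSS-hidden text are collected, while the meta description, the JSON-LD block, and the script-injected review never reach the text list. The parser does not apply CSS at all, which is why display:none content stays readable in this model.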
Title Tag is the New Gold — Practical Recommendation
If 5 out of 6 crawlers reliably read the title, it becomes the central lever. Recommendations:
- Precise, information-dense, under 60 characters; avoid empty phrases.
- Structural suggestion: Primary topic | specific benefit | brand.
- Vary by page type (category, product, guide), but keep terminology consistent.
- Place the strongest terms first; brand name at the end, unless there is strong brand demand.
- Synchronize H1 and title semantically without blind copying: The title condenses, the H1 explains.
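The recommendations above can be turned into a simple lint, sketched below under stated assumptions: the filler phrase list, the issue labels, and the function name check_title are hypothetical, and the brand position rule is only a soft warning because the text allows brand-first titles where brand demand is strong.

```python
from dataclasses import dataclass, field

MAX_TITLE_LENGTH = 60  # "under 60 characters" from the recommendations
FILLER_PHRASES = ("welcome to", "home page", "click here")  # hypothetical list


@dataclass
class TitleReport:
    title: str
    issues: list = field(default_factory=list)

    @property
    def ok(self):
        return not self.issues


def check_title(title, brand):
    """Return a report listing which title recommendations are violated."""
    report = TitleReport(title=title.strip())
    text = report.title
    if not text:
        report.issues.append("missing")
        return report
    if len(text) >= MAX_TITLE_LENGTH:
        report.issues.append("too_long")
    if any(phrase in text.lower() for phrase in FILLER_PHRASES):
        report.issues.append("filler_phrase")
    # Segments follow the suggested "Primary topic | specific benefit | brand" shape.
    segments = [segment.strip() for segment in text.split("|")]
    if len(segments) > 1 and brand.lower() in segments[0].lower():
        report.issues.append("brand_first")  # soft warning only
    return report


good = check_title("Trail Shoes | Grip on Wet Rock | Acme", brand="Acme")
bad = check_title(
    "Acme | Welcome to our home page for the very best trail running shoes online",
    brand="Acme",
)
print(good.ok, good.issues)
print(bad.ok, bad.issues)
```

The first title passes all checks; the second is flagged for length, a filler phrase, and the leading brand name.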
JSON-LD is Not Dead, but Microdata Wins in LLM-Read
The study shows: In conversation mode, JSON-LD is not read, whereas Microdata in the body tends to be. Action recommendation:
- Keep JSON-LD for the search index (products, FAQs, organization, breadcrumbs).
- Mirror critical facts as Microdata directly in the visible body (prices, availability, ratings), content-identical to the JSON-LD.
- Avoid contradictions between structured data and visible text.
- Use semantic HTML elements (article, header, nav, main, footer) to cleanly structure the body content for parsers.
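Keeping JSON-LD and Microdata content-identical is easy to get wrong when both are maintained separately. The sketch below compares the two on a single page and reports contradictions. It assumes a single top-level JSON-LD object per script block and only reads itemprop values from the content attribute or the element's first text node; the class and function names, the markup, and the prices are made up for illustration.

```python
import json
from html.parser import HTMLParser


class StructuredDataCollector(HTMLParser):
    """Collects JSON-LD objects and flat Microdata itemprop values."""

    def __init__(self):
        super().__init__()
        self.json_ld = []    # parsed JSON-LD objects
        self.microdata = {}  # itemprop -> value
        self._in_json_ld = False
        self._buffer = []
        self._pending_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []
            return
        prop = attrs.get("itemprop")
        # Elements with itemscope are containers, not values, so skip them.
        if prop and "itemscope" not in attrs:
            if "content" in attrs:
                self.microdata[prop] = attrs["content"]
            else:
                self._pending_prop = prop  # value is the following text node

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.json_ld.append(json.loads("".join(self._buffer)))
            self._in_json_ld = False

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)
        elif self._pending_prop and data.strip():
            self.microdata[self._pending_prop] = data.strip()
            self._pending_prop = None


def flatten(node, out):
    """Flatten nested JSON-LD into property -> string value, ignoring @-keys."""
    for key, value in node.items():
        if key.startswith("@"):
            continue
        if isinstance(value, dict):
            flatten(value, out)
        else:
            out[key] = str(value)
    return out


def find_contradictions(html):
    """Return {property: (json_ld_value, microdata_value)} for mismatches."""
    collector = StructuredDataCollector()
    collector.feed(html)
    declared = {}
    for obj in collector.json_ld:
        if isinstance(obj, dict):
            flatten(obj, declared)
    return {
        prop: (declared[prop], seen)
        for prop, seen in collector.microdata.items()
        if prop in declared and declared[prop] != seen
    }


# Hypothetical product page where the visible price drifted from the JSON-LD.
PAGE = """<html><head>
<script type="application/ld+json">
{"@type": "Product", "name": "Trail Shoe",
 "offers": {"@type": "Offer", "price": "89.90", "priceCurrency": "EUR"}}
</script></head>
<body>
<div itemscope itemtype="https://schema.org/Product">
  <h1 itemprop="name">Trail Shoe</h1>
  <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
    <span itemprop="price">84.90</span>
    <meta itemprop="priceCurrency" content="EUR">
  </div>
</div>
</body></html>"""

print(find_contradictions(PAGE))
```

Here only the price is reported, because the name and currency agree in both representations. Flattening by property name is a deliberate simplification; pages with several products would need the comparison scoped per item.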
Lazy Loading & Below-the-fold — The Underestimated Killers
Since 0 out of 6 LLMs scroll, lazy-load mechanisms cut off core content from perception. Consequences:
- Place the core message, product USPs, prices, and primary calls-to-action above the first viewport height.
- Load hero-relevant images/text content without lazy-load (or with server-side inline fallback). Do not use purely client-side injection for main content.
- Provide <noscript> fallbacks for lazy-loaded content.
- Check SPAs and review widgets: without server-side rendering, they remain invisible.
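One way to spot at-risk images is to scan the delivered HTML for lazy-loading markers and check whether a static fallback exists. The sketch below treats loading="lazy" and a data-src attribute as such markers and pairs them with images inside noscript elements; both the markers and the pairing by image URL are assumptions for this example, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LazyContentAudit(HTMLParser):
    """Flags images that depend on JavaScript or scrolling to load."""

    def __init__(self):
        super().__init__()
        self.in_noscript = False
        self.lazy_images = []      # URLs of images marked as lazy
        self.fallback_images = []  # URLs of images found inside <noscript>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "noscript":
            self.in_noscript = True
        elif tag == "img":
            if self.in_noscript:
                self.fallback_images.append(attrs.get("src"))
            elif attrs.get("loading") == "lazy" or "data-src" in attrs:
                self.lazy_images.append(attrs.get("data-src") or attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "noscript":
            self.in_noscript = False


def images_without_fallback(html):
    """Return lazy-loaded image URLs that have no <noscript> counterpart."""
    audit = LazyContentAudit()
    audit.feed(html)
    return [src for src in audit.lazy_images if src not in audit.fallback_images]


# Hypothetical markup: one eager image, one lazy image with a fallback, one without.
PAGE = """<main>
  <img src="/img/hero.jpg" alt="Hero">
  <img data-src="/img/chart.png" class="lazyload" alt="Price chart">
  <noscript><img src="/img/chart.png" alt="Price chart"></noscript>
  <img src="/img/team.jpg" loading="lazy" alt="Team">
</main>"""

print(images_without_fallback(PAGE))
```

In this sample only /img/team.jpg is reported. The same idea extends to text blocks, but those usually need project-specific markers, so they are left out here.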
What Our Audit Makes of This
We are expanding technical audits to include four targeted checks and moderately adjusting weightings:
- title_tag_present: Checks existence, length, and precision of the title tag per URL.
- lazy_loaded_main_content: Warning if essential content is exclusively lazy-loaded or only becomes visible after interaction.
- noscript_fallback_present: Records whether a <noscript> fallback is present.
- css_generated_critical_content: Reports risk if relevant copy is generated via CSS pseudo-elements (::before/::after).
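As an illustration of how a check like css_generated_critical_content might work, the following sketch scans a stylesheet for ::before/::after rules whose content property carries real copy. It is not our production implementation: the regular expression only handles simple quoted strings, and the MIN_WORDS threshold and the function name css_generated_copy are assumptions chosen for this example.

```python
import re

# Matches a rule whose selector ends in ::before/::after (or the legacy
# single-colon form) and captures a quoted string from its content property.
PSEUDO_RULE = re.compile(
    r"""(?P<selector>[^{}]*::?(?:before|after))\s*\{
        [^}]*?(?<![\w-])content\s*:\s*(?P<quote>["'])(?P<text>.*?)(?P=quote)""",
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)

MIN_WORDS = 2  # hypothetical threshold: shorter strings count as decoration


def css_generated_copy(css):
    """Return (selector, text) pairs for pseudo-elements that carry copy."""
    findings = []
    for match in PSEUDO_RULE.finditer(css):
        text = match.group("text").strip()
        if len(text.split()) >= MIN_WORDS:
            findings.append((match.group("selector").strip(), text))
    return findings


# Hypothetical stylesheet: two rules inject copy, one is a decorative marker.
CSS = """
.price::after { content: "incl. VAT, free shipping"; }
.nav li::before { content: ">"; }
.badge:after { color: red; content: 'Only 3 left in stock'; }
.plain { content: "not a pseudo-element rule"; }
"""

for selector, text in css_generated_copy(CSS):
    print(selector, "->", text)
```

The price note and the stock badge are reported, while the single-character list marker is treated as decoration. A word-count threshold is a crude heuristic; a real audit would likely also weigh which element the pseudo-content is attached to.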