Methodology
LLMSE grades websites across six dimensions using automated analysis. Each analyzer produces a score from 0 to 100, which is mapped to a letter grade (A, B, C, D, or F). This page explains how each grade is calculated.
Grade Thresholds
Each analyzer uses its own grade thresholds tuned to the nature of the metric:
| Grade | SEO | EEAT | AEO | Readability | WCAG | GARM |
|---|---|---|---|---|---|---|
| A | 90–100 | 90–100 | 85–100 | 60–100 | 90–100 | 80–100 |
| B | 80–89 | 80–89 | 70–84 | 50–59 | 80–89 | 60–79 |
| C | 70–79 | 70–79 | 55–69 | 30–49 | 70–79 | 40–59 |
| D | 60–69 | 60–69 | 40–54 | 10–29 | 60–69 | 20–39 |
| F | 0–59 | 0–59 | 0–39 | 0–9 | 0–59 | 0–19 |
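The table can be read as a lookup of lower bounds per analyzer. The sketch below is illustrative only: the dictionary layout and the `to_grade` name are assumptions, not LLMSE's actual code, and only the numeric bounds come from the table.

```python
# Lower bound of each grade per analyzer, copied from the thresholds table.
# GRADE_FLOORS and to_grade are hypothetical names used for illustration.
GRADE_FLOORS = {
    "SEO":         {"A": 90, "B": 80, "C": 70, "D": 60},
    "EEAT":        {"A": 90, "B": 80, "C": 70, "D": 60},
    "AEO":         {"A": 85, "B": 70, "C": 55, "D": 40},
    "Readability": {"A": 60, "B": 50, "C": 30, "D": 10},
    "WCAG":        {"A": 90, "B": 80, "C": 70, "D": 60},
    "GARM":        {"A": 80, "B": 60, "C": 40, "D": 20},
}


def to_grade(analyzer: str, score: float) -> str:
    """Map a 0-100 score to a letter grade for the given analyzer."""
    for grade in ("A", "B", "C", "D"):
        if score >= GRADE_FLOORS[analyzer][grade]:
            return grade
    return "F"  # anything below the D floor
```

With these bounds, the same score can earn different grades: `to_grade("SEO", 65)` returns `"D"`, while `to_grade("Readability", 65)` returns `"A"`.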
SEO Analysis
Approach: Deduction-based. Every page starts at 100 points; issues found during analysis deduct from the score.
The analyzer runs 96+ checks across 20 categories, including title tags, meta descriptions, headings, structured data, images, links, URL structure, security headers, and HTML validation. The final score is clamped to 0–100.
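A minimal sketch of deduction-based scoring with clamping is shown here. The `Issue` type, the check names, and the point values are hypothetical; this page does not list the per-check deductions.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    check: str      # hypothetical check identifier
    deduction: int  # hypothetical point value, not an LLMSE constant


def seo_score(issues: list[Issue]) -> int:
    """Start at 100, subtract every deduction, clamp to 0-100."""
    score = 100 - sum(issue.deduction for issue in issues)
    return max(0, min(100, score))
```

The clamp matters once deductions exceed 100 in total: two issues worth 60 and 70 points yield a score of 0, not −30.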
E-E-A-T Analysis
Approach: Weighted deduction across four pillars, each scored independently from 100.
Each pillar deducts points for issues (critical: −20, warning: −10, info: −3) and awards up to +20 bonus points for positive signals. The four pillar scores are then combined into a single overall score using per-pillar weights.
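The pillar arithmetic can be sketched as follows. The severity deductions and the bonus cap come from the text; the equal weights, the per-pillar clamp, and the function names are assumptions made only so the example runs.

```python
SEVERITY_DEDUCTION = {"critical": 20, "warning": 10, "info": 3}  # from the text
MAX_BONUS = 20  # from the text

# ASSUMPTION: equal weights, chosen for illustration. The real weights differ
# per pillar and are not reproduced here.
PILLAR_WEIGHTS = {
    "experience": 0.25,
    "expertise": 0.25,
    "authoritativeness": 0.25,
    "trustworthiness": 0.25,
}


def pillar_score(severities: list[str], bonus: int = 0) -> int:
    """Score one pillar: 100 minus deductions, plus a capped bonus."""
    score = 100 - sum(SEVERITY_DEDUCTION[s] for s in severities)
    score += min(bonus, MAX_BONUS)
    return max(0, min(100, score))  # ASSUMPTION: each pillar is clamped to 0-100


def overall_score(pillars: dict[str, float]) -> float:
    """Weighted sum of the four pillar scores."""
    return sum(PILLAR_WEIGHTS[name] * score for name, score in pillars.items())
```

For example, a pillar with one critical, one warning, and one info issue and no bonus scores 100 − 20 − 10 − 3 = 67.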
AEO Analysis
Approach: Additive scoring across 10 metrics that measure how well content is optimized for AI answer engines (ChatGPT, Perplexity, Gemini, Claude).
A clickbait penalty of up to 10 points is applied for ALL CAPS headings, excessive punctuation, and known clickbait phrases. The final score is clamped to 0–100.
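As a rough sketch of how an additive score with a capped penalty fits together: only the 10-point cap and the three signal types come from the text. The phrase list, the points per signal, and the detection rules are hypothetical.

```python
import re

MAX_PENALTY = 10  # from the text

# ASSUMPTIONS: example phrases and a flat per-signal cost, for illustration only.
CLICKBAIT_PHRASES = ("you won't believe", "what happens next", "doctors hate")
POINTS_PER_SIGNAL = 4


def clickbait_penalty(headings: list[str]) -> int:
    """Count clickbait signals across headings and cap the total penalty."""
    signals = 0
    for heading in headings:
        letters = [c for c in heading if c.isalpha()]
        if len(letters) >= 4 and all(c.isupper() for c in letters):
            signals += 1  # ALL CAPS heading
        if re.search(r"[!?]{2,}", heading):
            signals += 1  # excessive punctuation
        if any(phrase in heading.lower() for phrase in CLICKBAIT_PHRASES):
            signals += 1  # known clickbait phrase
    return min(MAX_PENALTY, signals * POINTS_PER_SIGNAL)


def aeo_score(metric_points: list[float], headings: list[str]) -> float:
    """Sum the per-metric points, subtract the penalty, clamp to 0-100."""
    total = sum(metric_points) - clickbait_penalty(headings)
    return max(0, min(100, total))
```

A heading such as `"YOU WON'T BELIEVE THIS!!!"` trips all three signals, so the penalty reaches the cap of 10 even though the raw total would be 12.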
Readability Analysis
Approach: Direct measurement using the Flesch Reading Ease formula. Higher scores mean easier-to-read content.
Thresholds are web-optimized: most successful web content scores 60+ (grade A) because general audiences prefer clear, concise writing. Additional metrics reported include Flesch-Kincaid grade level, reading time, word count, and difficult word count.
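The standard Flesch Reading Ease formula is 206.835 − 1.015 × (words ÷ sentences) − 84.6 × (syllables ÷ words). The sketch below applies it to raw counts. The syllable counter is a crude vowel-group heuristic added for illustration and is not necessarily how LLMSE counts syllables.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, dropping a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(1, count)


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula applied to raw counts."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def score_text(text: str) -> float:
    """Tokenize naively and score the text; returns 0.0 for empty input."""
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return flesch_reading_ease(len(words), sentences, syllables)
```

A passage of 100 words in 5 sentences with 150 syllables scores 206.835 − 20.3 − 126.9 = 59.635, which falls just short of the A threshold of 60. The raw formula can fall outside 0–100 for very short or very dense text; this sketch does not clamp it.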
WCAG Accessibility Analysis
Approach: Deduction-based. Starts at 100, deducts for accessibility issues found via 15 automated checks against WCAG 2.1 Level A criteria.
Automated checks cover approximately 30–40% of WCAG 2.1 Level A criteria. Full compliance requires manual testing for interactive behaviors, color contrast, keyboard navigation, and screen reader compatibility.
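To illustrate what one automated check might look like, the sketch below flags `<img>` elements without an `alt` attribute (related to WCAG success criterion 1.1.1, Non-text Content) using Python's standard `html.parser`. The class name and the 5-point deduction are assumptions; this page does not specify the per-check deductions.

```python
from html.parser import HTMLParser

DEDUCTION_PER_IMAGE = 5  # ASSUMPTION: illustrative value, not an LLMSE constant


class ImgAltChecker(HTMLParser):
    """Counts <img> tags that have no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        self.missing_alt = 0

    def handle_starttag(self, tag, attrs):
        # An empty alt="" is allowed: it marks the image as decorative.
        if tag == "img" and "alt" not in dict(attrs):
            self.missing_alt += 1


def wcag_score(html: str) -> int:
    """Start at 100 and deduct for each image missing alt text."""
    checker = ImgAltChecker()
    checker.feed(html)
    return max(0, 100 - checker.missing_alt * DEDUCTION_PER_IMAGE)
```

A check like this works on static markup, which is why it can be automated; judging whether the alt text is actually meaningful still needs a human reviewer.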
GARM Brand Safety Analysis
Approach: Category-based risk mapping using the GARM Brand Suitability Framework. LLMSE's 58-category taxonomy is mapped to GARM risk levels.
The base risk score is adjusted by content sentiment: positive sentiment adds up to 15 points, and negative sentiment deducts up to 20 points. The 11 GARM categories evaluated are adult content, arms and ammunition, crime, death/injury/military conflict, sensitive social issues, online piracy, hate speech, terrorism, drugs, spam, and obscenity.
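The sentiment adjustment can be sketched as below. The +15 and −20 limits come from the text; the sentiment range of −1 to 1, the linear scaling, and the final clamp are assumptions for illustration.

```python
MAX_POSITIVE_BONUS = 15    # from the text
MAX_NEGATIVE_PENALTY = 20  # from the text


def adjust_for_sentiment(base_score: float, sentiment: float) -> float:
    """Adjust a base score by sentiment.

    ASSUMPTIONS: sentiment lies in [-1, 1], the adjustment scales linearly
    with it, and the result is clamped to 0-100.
    """
    sentiment = max(-1.0, min(1.0, sentiment))
    if sentiment >= 0:
        adjusted = base_score + sentiment * MAX_POSITIVE_BONUS
    else:
        adjusted = base_score + sentiment * MAX_NEGATIVE_PENALTY
    return max(0.0, min(100.0, adjusted))
```

Under these assumptions, a base score of 70 becomes 85 with fully positive sentiment and 60 with a sentiment of −0.5. The asymmetry means negative framing costs more than positive framing earns.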
Data Collection
All grades are computed from automated static analysis of HTML content fetched at classification time. LLMSE does not execute JavaScript or measure runtime performance. The analysis pipeline:
- Fetches the target URL and parses the HTML response
- Runs each analyzer independently against the parsed content
- Calculates scores and maps them to letter grades using the thresholds above
- Caches results in Redis for subsequent lookups
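The four steps above can be sketched as a single function. Everything here is a hypothetical outline: `fetch`, `analyzers`, and `to_grade` are injected callables, a plain dictionary stands in for Redis, and the raw HTML string is passed to analyzers in place of a parsed document.

```python
from typing import Callable, MutableMapping


def analyze_url(
    url: str,
    fetch: Callable[[str], str],                  # returns the HTML for a URL
    analyzers: dict[str, Callable[[str], float]],  # name -> scoring function
    to_grade: Callable[[str, float], str],         # (analyzer name, score) -> grade
    cache: MutableMapping[str, dict],              # stand-in for Redis
) -> dict:
    """Fetch a page, run every analyzer independently, grade, and cache."""
    if url in cache:
        return cache[url]  # subsequent lookups skip the fetch entirely

    html = fetch(url)
    results = {}
    for name, analyzer in analyzers.items():
        score = analyzer(html)
        results[name] = {"score": score, "grade": to_grade(name, score)}

    cache[url] = results
    return results
```

Passing the fetcher and cache as parameters keeps the sketch testable without network access or a running Redis instance.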
Technology detection (CMS, frameworks, server software), mail provider identification (MX records), and DNS provider identification (NS records) are performed separately during domain enrichment.
Limitations
- Static analysis only — JavaScript-rendered content, client-side interactions, and runtime performance are not evaluated
- WCAG coverage — Automated checks cover 30–40% of WCAG 2.1 Level A; full accessibility compliance requires manual testing
- Point-in-time snapshot — Grades reflect the page content at the time of analysis and may change as sites are updated
- Single-page scope — Each analysis evaluates the specific URL provided, not the entire website