The Sentiment Economy: What 1.17 Million URLs Reveal About Positive vs Negative Web Content
The internet feels negative. Doom-scrolling is a real phenomenon: a Stanford HAI study found that news sources publish nearly twice as much negative as positive content, and a 2023 Nature Human Behaviour study found that each additional negative word in a headline increases click-through rates by 2.3%. Algorithms reward outrage, and the perception is that the web is a hostile place.
The data says otherwise. We classified 1,172,644 URLs by content sentiment (Good, Neutral, or Bad) as part of LLMSE's multi-dimensional website analysis. The headline finding: 83.5% of classified URLs are positive. Just 0.32% are negative. For every URL classified as Bad, there are 260 classified as Good.
That gap between perception and reality has consequences. Advertisers waste $26.8 billion annually on programmatic inefficiencies, partly because imprecise sentiment assessment leads to over-blocking of brand-safe inventory. The Global Alliance for Responsible Media (GARM), whose Brand Suitability Framework served as the industry's shared language for content risk, was discontinued in August 2024 after X (formerly Twitter) filed an antitrust lawsuit, leaving no replacement standard. And 78% of brands cite brand safety as a top concern despite actual brand risk being at a record low of 1.5%.
This report maps the sentiment landscape of the web: where positivity concentrates, where the 0.32% hides, how sentiment correlates with quality, and what it all means for the $4.82 billion ad verification industry.
The Data
We analyzed 1,172,644 URLs with sentiment classifications in LLMSE's database as of February 26, 2026.
| Sentiment | URLs | Share |
|---|---|---|
| Good | 979,325 | 83.5% |
| Neutral | 189,557 | 16.2% |
| Bad | 3,762 | 0.32% |
Three observations stand out:
- The web is overwhelmingly positive. 83.5% of classified content has positive sentiment — upbeat, constructive, or affirmatively toned. This aligns with Meta's transparency reports showing less than 1% of Facebook and Instagram content is removed for policy violations.
- Neutral content is the real second tier. 16.2% of URLs are neutral — informational, factual, or editorial content without strong sentiment signals. This is where reference material, documentation, and straight reporting live.
- Negative content is vanishingly rare. At 0.32%, Bad sentiment content is a statistical rounding error. The 3,762 negative URLs are outnumbered 260-to-1 by positive content.
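The shares and the 260-to-1 ratio follow directly from the three counts in the table. A minimal sketch of that arithmetic (the variable names are illustrative, not part of LLMSE):

```python
# Counts copied from the sentiment table above.
counts = {"Good": 979_325, "Neutral": 189_557, "Bad": 3_762}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}
good_to_bad = counts["Good"] / counts["Bad"]

for label, share in shares.items():
    print(f"{label:<8}{share:.2%}")
print(f"Good:Bad ratio = {good_to_bad:.0f}:1")
```

The three counts sum to the 1,172,644 URLs in the dataset, so the shares add up to 100%.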
Where Positivity Lives: Sentiment by Category
The 83.5% average masks variation across content categories. Some industries are almost uniformly positive; others harbor disproportionate negativity.
The Most Positive Categories
| Category | Matched URLs | Good | Good% | Bad | Bad% |
|---|---|---|---|---|---|
| Shopping | 20,435 | 18,666 | 91.3% | 17 | 0.1% |
| Style and Fashion | 5,699 | 5,100 | 89.5% | 11 | 0.2% |
| Entertainment | 151,136 | 133,969 | 88.6% | 187 | 0.1% |
| Food and Drink | 20,184 | 17,660 | 87.5% | 23 | 0.1% |
| Travel | 13,317 | 11,638 | 87.4% | 35 | 0.3% |
| Business and Industry | 252,417 | 219,860 | 87.1% | 328 | 0.1% |
| Internet and Telecom | 32,707 | 28,503 | 87.1% | 411 | 1.3% |
Shopping leads at 91.3% positive — and with a staggering 1,098:1 Good-to-Bad ratio. For every negative shopping URL, there are over a thousand positive ones. This makes intuitive sense: e-commerce is fundamentally optimistic. Product pages, reviews, and promotional content are designed to persuade, not depress.
Food and Drink (87.5%), Style and Fashion (89.5%), and Entertainment (88.6%) follow the same pattern — consumer-facing categories where the commercial incentive is to present content positively.
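The per-category percentages and Good-to-Bad ratios can be reproduced from the raw counts in the table. The sketch below uses a small helper, `summarize`, which is a hypothetical name introduced here for illustration:

```python
# (matched URLs, Good, Bad) per category, copied from the table above.
categories = {
    "Shopping": (20_435, 18_666, 17),
    "Style and Fashion": (5_699, 5_100, 11),
    "Entertainment": (151_136, 133_969, 187),
    "Internet and Telecom": (32_707, 28_503, 411),
}

def summarize(matched, good, bad):
    """Return (Good share, Bad share, Good-to-Bad ratio) for one category."""
    return good / matched, bad / matched, good / bad

for name, row in categories.items():
    good_pct, bad_pct, ratio = summarize(*row)
    print(f"{name:<22}{good_pct:6.1%} {bad_pct:6.2%} {ratio:7.0f}:1")
```

Internet and Telecom shows why both columns matter: its Good share is close to the other categories in the table, but its Good-to-Bad ratio is far lower because its Bad share is roughly ten times theirs.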
The Most Negative Categories
| Category | Matched URLs | Good% | Bad | Bad% | Good:Bad Ratio |
|---|---|---|---|---|---|
| Adult | 21,535 | 63.6% | 680 | 3.2% | 20:1 |
| Disasters | 1,749 | 65.1% | 39 | 2.2% | 29:1 |
| Crime | 2,726 | 69.7% | 54 | 2.0% | 35:1 |
| Sensitive Topics | 7,407 | 81.4% | 108 | 1.5% | 56:1 |
| Internet and Telecom | 32,707 | 87.1% | 411 | 1.3% | 69:1 |
| News and Media | 28,270 | 82.5% | 211 | 0.7% | 111:1 |
Adult content leads negative sentiment at 3.2% — ten times the web average. This is the single largest source of Bad URLs, contributing 680 of the 3,762 total (18.1%). Yet even Adult content is predominantly positive (63.6%) — the category is negative relative to the web, not in absolute terms.
Disasters (2.2%) and Crime (2.0%) follow predictably — these categories cover inherently negative topics. But even here, the Good-to-Bad ratios are 29:1 and 35:1 respectively. Crime prevention sites, disaster relief organizations, and educational content vastly outnumber sensationalist coverage.
Internet and Telecom's 1.3% stands out. This is the fifth-most-negative category by percentage, behind only categories that cover inherently sensitive subject matter. Its negativity is driven by Telecommunications (116 Bad URLs), File Sharing (52), and Web Hosting (52). Complaint-oriented content (service outage reports, hosting reviews, and telecom consumer grievances) concentrates negativity here.
Where the 3,762 Bad URLs Hide
The 0.32% of negative content isn't randomly distributed. It clusters in specific subcategories:
| Subcategory | Bad URLs | Share of All Bad |
|---|---|---|
| Adult > Photos | 178 | 4.7% |
| Adult > Interracial | 139 | 3.7% |
| Internet and Telecom > Telecommunications | 116 | 3.1% |
| Business and Industry > Business Services | 114 | 3.0% |
| Adult > Videos | 111 | 2.9% |
| Sensitive Topics > Spam or Harmful Content | 81 | 2.2% |
| Agriculture > Crop Farming | 61 | 1.6% |
| Entertainment > Movies | 57 | 1.5% |
| Internet and Telecom > File Sharing | 52 | 1.4% |
| Internet and Telecom > Web Hosting | 52 | 1.4% |
| Computer and Electronics > Programming | 48 | 1.3% |
Adult content accounts for 18.1% of all Bad URLs across its subcategories (680 total), making it the single largest contributor. But the remaining 81.9% of negative content is spread across 28 other categories — no single non-Adult category exceeds 11%.
Telecom complaints (3.1%) and business services (3.0%) represent the consumer frustration layer of the web — sites where people document negative experiences with service providers.
The presence of Agriculture > Crop Farming (1.6%) is unexpected. Manual inspection reveals these are sites discussing crop failures, pest infestations, and agricultural market downturns — genuinely negative topics in a category that rarely appears in brand safety discussions.
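Each "Share of All Bad" figure is the subcategory's Bad count divided by the 3,762 Bad URLs in the dataset. A minimal sketch of that calculation, using the five largest subcategories and the category-level Bad counts from the category tables earlier in the report (the function name `share_of_bad` is an illustrative assumption):

```python
TOTAL_BAD = 3_762  # all Bad-sentiment URLs in the dataset

# Bad-URL counts for the five largest subcategories in the table above.
bad_by_subcategory = {
    "Adult > Photos": 178,
    "Adult > Interracial": 139,
    "Internet and Telecom > Telecommunications": 116,
    "Business and Industry > Business Services": 114,
    "Adult > Videos": 111,
}

# Category-level Bad counts from the category tables earlier in the report.
bad_by_category = {
    "Adult": 680,
    "Internet and Telecom": 411,
    "Business and Industry": 328,
}

def share_of_bad(count):
    """Fraction of all Bad URLs represented by `count`."""
    return count / TOTAL_BAD

for name, count in bad_by_subcategory.items():
    print(f"{name:<45}{share_of_bad(count):.1%}")

top_three = share_of_bad(sum(bad_by_category.values()))
print(f"Top three categories combined: {top_three:.1%}")
```

Note the difference in level: Business Services is a subcategory with 114 Bad URLs (3.0%), while its parent category, Business and Industry, holds 328 (8.7%).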
Sentiment and Quality: Do Positive Sites Score Better?
We cross-referenced sentiment with four quality dimensions. The relationship between sentiment and quality is not straightforward.
EEAT (Expertise, Experience, Authoritativeness, Trustworthiness)
| Sentiment | A | B | C | D | F | Total |
|---|---|---|---|---|---|---|
| Good | 3.0% | 22.8% | 23.7% | 44.4% | 6.1% | 531,815 |
| Neutral | 7.6% | 11.4% | 19.5% | 60.6% | 0.9% | 90,154 |
| Bad | 2.5% | 19.0% | 50.7% | 26.0% | 1.8% | 2,852 |
Bad-sentiment sites cluster in C-grade EEAT (50.7%) — more than double the Good-sentiment rate (23.7%). This suggests that negative content tends to come from sites with moderate but not terrible expertise signals. Sites bad enough to score D or F on EEAT (thin content, no author credentials) often don't have enough substance to register strong negative sentiment either.
Neutral content has the highest A-grade rate (7.6%), roughly 2.5x the Good rate (3.0%) and 3x the Bad rate (2.5%). This makes sense: the most authoritative sites (universities, government agencies, reference databases) tend to present information neutrally rather than with positive or negative framing.
GARM Brand Safety
| Sentiment | A (Safe) | B | C | D | F (Floor) | Total |
|---|---|---|---|---|---|---|
| Good | 94.9% | 2.8% | 1.1% | 0.0% | 1.3% | 62,193 |
| Neutral | 88.8% | 5.2% | 0.0% | 1.7% | 4.3% | 8,939 |
| Bad | 77.2% | 0.0% | 4.5% | 0.0% | 18.3% | 531 |
Sentiment and brand safety are strongly correlated — but not perfectly. 77.2% of Bad-sentiment URLs are still brand-safe (GARM A-grade). A URL can have negative sentiment (a complaint about a product, a critical review, a report on a scandal) and still fall outside all 11 GARM risk categories.
The critical insight: 18.3% of Bad-sentiment URLs hit the GARM floor (F-grade, universally unsafe for advertising), compared to just 1.3% of Good-sentiment and 4.3% of Neutral-sentiment URLs. Negative sentiment is 14x more likely to be GARM floor content than positive sentiment.
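The 14x figure is a ratio of two rates, and it is worth reading alongside the group sizes. The sketch below recomputes the ratio and estimates absolute counts from the published percentages; because those percentages are rounded, the counts are approximations, and the helper names are illustrative:

```python
# GARM F-grade (floor) share and group size per sentiment, from the table above.
garm_floor = {
    "Good": (0.013, 62_193),
    "Neutral": (0.043, 8_939),
    "Bad": (0.183, 531),
}

def relative_rate(group, baseline):
    """How many times higher the floor rate is in `group` than in `baseline`."""
    return garm_floor[group][0] / garm_floor[baseline][0]

def approx_count(group):
    """Approximate number of floor URLs, reconstructed from a rounded share."""
    share, total = garm_floor[group]
    return round(share * total)

print(f"Bad vs Good: {relative_rate('Bad', 'Good'):.1f}x")
print(f"Bad vs Neutral: {relative_rate('Bad', 'Neutral'):.1f}x")
for group in garm_floor:
    print(f"{group:<8}~{approx_count(group)} floor URLs")
```

Under these approximations the Bad group contributes fewer floor URLs in absolute terms than the much larger Good group, even though its rate is far higher. That is one reason a sentiment signal alone cannot stand in for a safety classification.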
SEO Quality
| Sentiment | A+B+C (Passing) | D+F (Failing) | Total |
|---|---|---|---|
| Good | 1.9% | 98.1% | 687,896 |
| Neutral | 2.6% | 97.4% | 124,797 |
| Bad | 2.6% | 97.4% | 3,092 |
SEO quality is essentially independent of sentiment — all three groups have 97-98% failure rates. This is consistent with the broader finding from our State of Website SEO 2026 report: the vast majority of websites fail basic SEO regardless of their content quality or sentiment.
WCAG Accessibility
| Sentiment | A (Pass) | F (Fail) | Total |
|---|---|---|---|
| Good | 17.7% | 27.0% | 68,422 |
| Neutral | 14.1% | 33.4% | 8,928 |
| Bad | 18.5% | 41.0% | 524 |
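Negative-sentiment sites have the highest accessibility failure rate, with 41.0% scoring F on WCAG checks compared to 27.0% for positive content. Their A-grade rate (18.5%) is close to that of positive content (17.7%), so the gap sits in the failing tail, and the Bad group is small (524 URLs). A larger share of sites producing negative content appears to skip basic accessibility work.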
The Language Factor: Where Negativity Speaks
Sentiment varies meaningfully by content language:
| Language | URLs | Good% | Bad% |
|---|---|---|---|
| Czech | 5,535 | 83.9% | 4.4% |
| Chinese | 36,881 | 75.5% | 1.5% |
| Japanese | 23,451 | 85.7% | 1.2% |
| Turkish | 7,411 | 75.3% | 0.9% |
| Korean | 6,191 | 81.7% | 0.8% |
| Polish | 8,926 | 83.5% | 0.6% |
| Thai | 2,238 | 90.5% | 0.6% |
| French | 33,453 | 85.2% | 0.4% |
| Spanish | 27,195 | 85.9% | 0.4% |
| German | 50,549 | 88.4% | 0.2% |
| English | 806,624 | 84.3% | 0.1% |
| Indonesian | 8,311 | 85.4% | 0.1% |
Czech leads negativity at 4.4%, 44x the English rate of 0.1%. The Czech web's Bad-sentiment concentration is driven by specific content patterns in the Czech online ecosystem, though the Czech sample is comparatively small (5,535 URLs).
Chinese (1.5%) and Japanese (1.2%) have notably higher negativity rates than the Western European languages in the table (French, Spanish, and German, all at 0.4% or below). This may reflect cultural differences in how sentiment is expressed online, or differences in the types of content that dominate each language's web presence.
English has the lowest negativity rate (0.1%) among major languages — reflecting the sheer volume of commercial, corporate, and institutional content in English that dilutes any negative signal. With 806,624 URLs, the English web's scale means that even rare negative content amounts to 898 URLs in absolute terms.
Thai is the most positive language at 90.5% Good sentiment — consistent with cultural communication norms that favor positive framing.
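Cross-language multiples divide each language's Bad% by the English Bad%. Because the English rate is small, rounding it to 0.1% noticeably moves the result. The sketch below computes the multiples from the rounded table values and, for comparison, from the unrounded English rate implied by the 898 Bad URLs mentioned above (names are illustrative):

```python
# Bad-sentiment rates in percent, copied from the language table.
bad_rate = {
    "Czech": 4.4,
    "Chinese": 1.5,
    "Japanese": 1.2,
    "Turkish": 0.9,
    "Korean": 0.8,
    "English": 0.1,
}

# Unrounded English rate, from 898 Bad URLs out of 806,624 English URLs.
english_unrounded = 898 / 806_624 * 100

def times_english(lang, english_rate=bad_rate["English"]):
    """Multiple of the English Bad rate for a given language."""
    return bad_rate[lang] / english_rate

for lang in ("Czech", "Chinese", "Japanese"):
    rounded = times_english(lang)
    unrounded = times_english(lang, english_unrounded)
    print(f"{lang:<9}{rounded:5.0f}x (rounded)  {unrounded:5.1f}x (unrounded)")
```

With the rounded English rate the Czech multiple is 44x; with the unrounded rate it is closer to 40x. The ordering of languages does not change either way.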
The Gender Non-Effect
Unlike most other dimensions, sentiment barely varies by gender targeting:
| Target Gender | URLs | Good% | Neutral% | Bad% |
|---|---|---|---|---|
| Male | 650,777 | 81.9% | 17.8% | 0.3% |
| Female | 310,760 | 86.7% | 13.1% | 0.3% |
| All | 198,520 | 84.1% | 15.6% | 0.3% |
All three gender segments have identical Bad-sentiment rates (0.3%). Female-targeted content is slightly more positive (86.7% vs 81.9%) — a 4.8 percentage point gap driven by the higher proportion of lifestyle, beauty, health, and shopping content in female-targeted categories. But negative content is equally rare across all audience segments.
The Infrastructure View
Web server choice shows minor sentiment variation:
| Server | URLs | Good% | Bad% |
|---|---|---|---|
| Cloudflare | 319,038 | 83.1% | 0.3% |
| nginx | 258,880 | 79.3% | 0.5% |
| Apache | 181,606 | 84.8% | 0.4% |
| LiteSpeed | 46,873 | 83.3% | 0.3% |
nginx has the highest Bad-sentiment rate (0.5%), which is modest in absolute terms but 67% higher than Cloudflare's 0.3%. One plausible explanation is that nginx's open-source, self-hosted nature means it serves a wider range of content types, including sites that larger CDN providers might decline to serve. Apache's slightly higher positivity (84.8%) may reflect its concentration in established business and institutional sites.
What This Means for Advertisers
1. The web is safer than the narrative suggests
At 83.5% positive and 0.32% negative, the data contradicts the doom-scrolling narrative. The research cited in the introduction explains the disconnect: according to the Stanford HAI research, negative content is shared 1.91x more often, and the Nature Human Behaviour study found that each negative headline word increases clicks by 2.3%. Algorithms amplify negativity, but the underlying content is overwhelmingly positive. Advertisers over-blocking based on perceived risk are excluding brand-safe inventory at scale.
2. Sentiment is not brand safety
77.2% of Bad-sentiment URLs are still GARM brand-safe. A negative product review, a critical news article, or a complaint about a telecom provider all register as negative sentiment but carry zero brand risk. Treating sentiment and safety as equivalent leads to massive over-blocking. DoubleVerify's 2025 data shows the actual brand suitability violation rate is 5.2% globally — and declining.
3. Negative content concentrates predictably
Adult (18.1% of all Bad URLs), Internet and Telecom (10.9%), and Business and Industry (8.7%) account for 37.7% of all negative content. For advertisers, this means category-level filters remain the most efficient tool. Sentiment filtering alone would block the 77.2% of negative content that is brand-safe, while positive-sentiment content in risky categories would pass a sentiment filter untouched and requires a different signal entirely.
4. The GARM vacuum creates opportunity
With the GARM framework dissolved, advertisers lack a shared standard for content suitability. Forrester noted that the dissolution "laid bare the fragility of self-regulatory approaches." The $4.82 billion ad verification market (projected to reach $15.87 billion by 2033) needs data-driven frameworks that separate perception from reality. Tools that can distinguish a negative restaurant review (brand-safe) from hate speech (brand-unsafe) at scale are where the market is heading.
5. Language matters for international campaigns
Czech content is 44x more likely to be negative than English content. Chinese and Japanese are 15x and 12x more likely respectively. International advertisers need language-specific sentiment thresholds rather than global blocklists. A 0.1% negative rate in English is not the same risk profile as 4.4% in Czech.
Key Findings
- 83.5% of 1.17 million URLs are positive, 0.32% are negative. The sentiment distribution is 260:1 in favor of positive content.
- Shopping is the most positive category (91.3%) with a 1,098:1 Good-to-Bad ratio. Adult is the most negative (3.2%) but still majority positive (63.6%).
- 77.2% of Bad-sentiment URLs are brand-safe. Sentiment and brand safety are correlated but not equivalent, and conflating them leads to massive over-blocking of safe inventory.
- Negative sentiment correlates with lower quality. Bad-sentiment sites are 14x more likely to hit the GARM floor, 50.7% cluster in C-grade EEAT (vs 23.7% for Good), and 41.0% fail WCAG accessibility (vs 27.0% for Good).
- Language is the strongest predictor of negative sentiment. Czech (4.4% Bad) runs 44x higher than English (0.1%). Chinese, Japanese, Turkish, and Korean are all at 0.8% or higher.
- Gender targeting has no effect on negative sentiment. Male, female, and all-audience content have identical 0.3% negativity rates, although female-targeted content is 4.8 percentage points more positive than male-targeted content.
- The 3,762 Bad URLs cluster in Adult (18.1%), Internet and Telecom (10.9%), and Business and Industry (8.7%). Three categories account for over a third of all negative content in the dataset.
Methodology
This analysis covers 1,172,644 URLs with sentiment classifications in the LLMSE database as of February 26, 2026. Sentiment (Good, Neutral, Bad) is assigned during the LLM-based classification process alongside category, subcategory, demographics, and language.
Cross-references were computed using Redis sorted set intersections between the sentiment-{Good|Neutral|Bad} indices and category, quality grade (seo-{A-F}, eeat-{A-F}, wcag-{A-F}, garm-{A-F}, readability-{A-F}), gender (sex-{male|female|all}), language (lang-{Language}), and server (server-{Server}) indices. All intersections represent domains present in both the sentiment index and the cross-referenced dimension.
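The intersection logic can be illustrated without a Redis server. The sketch below uses plain Python sets as stand-ins for the indices; the key names follow the pattern described above, but the member values are invented placeholders and `cross_tab` is a hypothetical helper, not LLMSE's code. Against a real Redis instance, the same counts would come from sorted-set intersection commands such as ZINTERSTORE or ZINTERCARD.

```python
# Stand-in for the Redis indices: key name -> set of members.
# All members below are invented placeholders, not LLMSE data.
indices = {
    "sentiment-Good": {"a.example", "b.example", "c.example", "e.example"},
    "sentiment-Bad": {"d.example"},
    "garm-A": {"a.example", "b.example", "d.example"},
    "garm-F": {"c.example"},
}

def cross_tab(sentiments, grades, prefix):
    """Intersect each sentiment index with each grade index.

    Returns {sentiment: {grade: (count, share)}}, where share is taken over
    members that appear in at least one grade index for this dimension.
    """
    table = {}
    for sentiment in sentiments:
        members = indices[f"sentiment-{sentiment}"]
        sizes = {
            grade: len(members & indices.get(f"{prefix}-{grade}", set()))
            for grade in grades
        }
        total = sum(sizes.values())
        table[sentiment] = {
            grade: (n, n / total if total else 0.0) for grade, n in sizes.items()
        }
    return table

result = cross_tab(["Good", "Bad"], "ABCDF", "garm")
for sentiment, row in result.items():
    print(sentiment, {g: f"{n} ({share:.0%})" for g, (n, share) in row.items()})
```

In this toy data, `e.example` has a sentiment label but no GARM grade, so it drops out of the GARM cross-tab. The same effect explains why the Total column differs from table to table in this report: each total counts only members present in both the sentiment index and the cross-referenced dimension.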
Limitations: (1) Sentiment is classified by the LLM as part of the multi-dimensional analysis — it reflects the model's assessment of overall content tone, not human-annotated ground truth. (2) The 1.17M URL dataset is biased toward the commercial web (sites submitted for classification or discovered through crawling); it is not a random sample of all internet content. (3) "Bad" sentiment captures content with negative tone or harmful framing but does not distinguish between types of negativity (e.g., legitimate criticism vs. malicious content). (4) Category-level analysis counts each URL equally regardless of traffic or influence.
External statistics are sourced from ANA, DoubleVerify, Stanford HAI, Nature Human Behaviour, Meta transparency reports, Forrester, and other cited publications. These provide industry context but were not generated from LLMSE data.
Explore the Data
Browse sentiment-filtered results on LLMSE — search for s:Good, s:Neutral, or s:Bad using the advanced search. Cross-reference with categories, quality grades, and demographics using the filter system. The REST API provides programmatic access to all classification data including sentiment. Check any URL's sentiment with the comprehensive audit.
This analysis was conducted using LLMSE, which has classified over 1.4 million websites across SEO, EEAT, WCAG accessibility, readability, and GARM brand safety dimensions. All data reflects the database as of February 2026. To analyze your own site, visit llmse.ai/classify.