When a Content Network Starts Publishing to Itself



TL;DR

A large automated content network was publishing mostly to a handful of its own sites, leaving more than half of them inactive. A 28-day audit confirmed the skew. The imbalance had two causes, one in placement and one in supply, and fixes for both are in place and being monitored.

A large automated content network is primarily publishing to just a handful of sites, leaving over half of its sites inactive, according to a recent audit. This imbalance could affect the network’s SEO health and content diversity, raising questions about the system’s underlying processes.

The network consists of 474 WordPress sites managed by two interconnected systems: Stenvrik, which sources and judges news content, and DojoClaw, which rewrites and distributes it. Despite the systems functioning correctly at an individual decision level, a 28-day audit revealed that 80% of the content was concentrated on only 8% of the sites, mostly technology-focused outlets. Meanwhile, 249 sites received no posts during that period.

The audit identified two separate problems. The first was within-topic concentration: the LLM-based matcher kept surfacing the same popular tech sites and never reached less active or dormant ones. The second was a supply-demand mismatch: most content was tech-related, while the majority of sites cover other categories such as Home, Health, and Food, which received little to no content. Individual routing decisions were sound; the imbalance came from how candidates were ordered and from what was being supplied.

To address this, the team implemented targeted fixes in DojoClaw’s selection process, including caps on site publication frequency, a global recency-based ordering to prioritize idle sites, and a starvation floor to ensure dormant sites could still receive content. These changes aim to diversify distribution and balance the network’s output across all sites.


ThorstenMeyerAI.com
AI & Tooling · Engineering Note
Systems at scale


A 474-site network quietly collapsed onto 38 of its own favorites while half the catalog went dark. The throughput graph looked fine. The fix wasn’t one thing — it was two causes and a three-part repair across two decoupled systems.

Stenvrik

News-intelligence layer

Ingests hundreds of feeds, scores & geo-tags stories, surfaces what’s trending.

SUPPLY · what’s worth covering

DojoClaw

AI content engine

Rewrites a story in each site’s voice and fans it out across the catalog.

PLACEMENT · where it lands & how it reads

01 The symptom

80% of output on 8% of sites

A 28-day audit, bucketed per site, was lopsided in a way the totals had hidden. Every individual placement was “correct” — the aggregate was a slow-motion failure.

Where 28 days of syndication actually landed

474-site catalog · per-site audit

Top 38 sites (8% of catalog): 80% of all posts
Top 4 sites (all tech titles): 200+ articles/week each
249 sites (53% of catalog): ZERO posts, half the network dark
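
An audit like this can be reproduced from per-site post counts. The following is a minimal sketch, not the audit the team actually ran: the function name, the input shape (a mapping of site id to posts in the window), and the toy numbers are all assumptions made for illustration.

```python
def concentration_report(post_counts, catalog_size, share=0.80):
    """Summarize how concentrated posting is across a catalog.

    post_counts: hypothetical mapping of site id -> posts in the audit
                 window; sites with no posts may be missing entirely.
    """
    total = sum(post_counts.values())
    ranked = sorted(post_counts.values(), reverse=True)

    # Walk down the ranking until the running total covers `share` of posts.
    running, sites_needed = 0, 0
    for count in ranked:
        if running >= share * total:
            break
        running += count
        sites_needed += 1

    active = sum(1 for c in post_counts.values() if c > 0)
    return {
        "total_posts": total,
        "sites_for_share": sites_needed,
        "pct_of_catalog": round(100 * sites_needed / catalog_size, 1),
        "dormant_sites": catalog_size - active,
    }


# Toy data: a 10-site catalog in which only four sites published anything.
report = concentration_report({"a": 60, "b": 25, "c": 10, "d": 5}, catalog_size=10)
```

On the toy data, two sites (20% of the catalog) carry at least 80% of posts and six sites are dormant. Bucketing per site is what exposes this; a network-wide total of 100 posts would look healthy.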

02 The diagnosis · refuse the obvious

Not one bug — two independent causes

The tempting move is to blame the matcher and move on. The data showed two distinct problems living on two different systems, each needing its own fix.

Cause 1 · DojoClaw

Within-topic concentration

The matcher kept surfacing the same broad tech sites for every tech story, and rotation only shuffled candidates within the matched pool. A site that never entered the pool could never get a turn — fair only among the already-chosen.
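
To see why rotation alone cannot fix this, consider a toy model in which placement is a round-robin over the matched pool. This is an illustrative sketch only: the function, the site names, and the round-robin policy are assumptions, not DojoClaw’s actual rotation logic.

```python
from collections import Counter
from itertools import cycle


def rotate_within_pool(matched_pool, n_stories, fan_out):
    """Round-robin placement that only ever sees the matched pool."""
    counts = Counter()
    turn = cycle(matched_pool)
    for _ in range(n_stories):
        for _ in range(fan_out):
            counts[next(turn)] += 1
    return counts


# Hypothetical 10-site catalog; the matcher keeps returning the same 3 sites.
catalog = [f"site-{i}" for i in range(10)]
pool = catalog[:3]

counts = rotate_within_pool(pool, n_stories=6, fan_out=2)
dormant = [s for s in catalog if counts[s] == 0]
```

The three pooled sites share the twelve placements evenly, and the other seven stay at zero no matter how many stories arrive. The rotation is perfectly fair and the catalog is still starved.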

Cause 2 · Stenvrik

Supply ≠ demand

53% of supplied content was tech/AI, but only ~13% of sites are. The catalog skews the other way, so the non-tech majority of sites starved for on-topic material.

Supply: tech/AI content in, 53%
Demand: tech/AI sites in catalog, ~13%
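
One way to quantify the mismatch is to divide each category’s share of supplied content by its share of sites. The sketch below uses only the two figures given here (53% and ~13%); the "other" bucket is simply the remainder, and the function name is an assumption.

```python
def supply_demand_ratio(content_share, site_share):
    """Return content share divided by catalog share for each category.

    A ratio above 1 means the category receives more content than its
    share of sites; below 1 means its sites are underfed.
    """
    return {
        cat: content_share.get(cat, 0.0) / site_share[cat]
        for cat in site_share
        if site_share[cat] > 0
    }


# 53% and ~13% come from the audit; "other" is just the complement.
content_share = {"tech_ai": 0.53, "other": 0.47}
site_share = {"tech_ai": 0.13, "other": 0.87}

ratios = supply_demand_ratio(content_share, site_share)
```

Under these numbers tech/AI sites see roughly four times their proportional share of content, while everything else gets a little over half of its share. A per-category breakdown would show which non-tech categories are worst off, but those figures are not given here.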

03 The load balancer · flip it

Watch the network rebalance

[Interactive placement simulator: a grid in which each square is one of the 474 sites, colored by how much it is publishing (dark/0, light, healthy, busy, overloaded). Toggling the selection logic spreads placement off the red-hot favorites and into the dark long tail. Starting state: 38 sites carrying 80% of posts, 249 dark sites with zero posts, hottest sites at ~30/day.]

The matcher’s relevance gate is the same either way; the only change is how candidates are ordered after it.

04 The three-part fix

Placement, supply, throughput

Two causes meant the fix had to touch both systems — and only then could the ceiling rise without re-concentrating the load.

Fix 1 · DojoClaw

Placement levers

Per-site weekly cap: any site over 25 posts/7d drops from the pool, pushing selection into the long tail. The cap relaxes only if enforcing it would starve a fan-out of candidates.
Global LRU: order by network-wide recency, not just within-topic recency, so sites idle across the whole network float to the top.
Starvation floor: guaranteed by construction, since the most-idle eligible site is always among the picks.
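
The three levers can be sketched as a single selection function. This is a minimal illustration under stated assumptions, not DojoClaw’s code: the function name, the argument names, and the data shapes (a list of candidates that already passed the relevance gate, a 7-day post count per site, and a last-publish timestamp per site) are all hypothetical.

```python
def pick_sites(candidates, posts_7d, last_post_ts, max_sites=5, weekly_cap=25):
    """Pick fan-out targets from candidates that passed the relevance gate."""
    # Global LRU: least recently published first, across the whole network.
    # Sites with no recorded publish time sort ahead of everything else.
    by_idleness = sorted(
        candidates, key=lambda s: last_post_ts.get(s, float("-inf"))
    )

    # Weekly cap: sites over the cap drop out of the pool.
    under_cap = [s for s in by_idleness if posts_7d.get(s, 0) <= weekly_cap]
    picks = under_cap[:max_sites]

    # Relax the cap only if enforcing it would leave the fan-out short.
    if len(picks) < max_sites:
        allowed = set(under_cap)
        over_cap = [s for s in by_idleness if s not in allowed]
        picks += over_cap[: max_sites - len(picks)]

    # Starvation floor: because the list is ordered by idleness, the
    # most-idle eligible site is always picks[0] when any site is eligible.
    return picks


candidates = ["hot", "warm", "idle", "never"]
posts_7d = {"hot": 40, "warm": 10, "idle": 0}
last_post_ts = {"hot": 1000, "warm": 900, "idle": 100}

narrow = pick_sites(candidates, posts_7d, last_post_ts, max_sites=2)
wide = pick_sites(candidates, posts_7d, last_post_ts, max_sites=4)
```

With two slots, the picks are the never-published site and the idle one. With four slots, the over-cap site is let back in, but only in last place and only because the fan-out would otherwise come up short.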

Fix 2 · Stenvrik

Supply rebalance

Audited existing feeds for liveness — removed ones returning HTTP 200 but zero items (broken RSS).
Added a verified batch across Home, Garden, Health, Food, Fashion, Auto, Science, Pets & more — every feed fetched live first, weighted to the most idle categories.
Flagged throttled feeds (big publishers exposing only 1–2 items) for replacement rather than burying the risk.
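
A liveness check along these lines has to look past the status code and count what the feed actually contains. The sketch below is an assumption about how such a check could work, not Stenvrik’s implementation; the function name, the labels, and the threshold parameter are hypothetical, and it takes an already-fetched status code and body so that it runs without network access.

```python
import xml.etree.ElementTree as ET


def classify_feed(status_code, body, throttled_max=2):
    """Classify one fetched feed as 'dead', 'throttled', or 'live'."""
    if status_code != 200:
        return "dead"
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "dead"

    # Count RSS <item> and Atom <entry> elements, ignoring XML namespaces.
    items = sum(
        1
        for el in root.iter()
        if isinstance(el.tag, str) and el.tag.rsplit("}", 1)[-1] in ("item", "entry")
    )
    if items == 0:
        return "dead"        # HTTP 200 but nothing in the feed
    if items <= throttled_max:
        return "throttled"   # only 1-2 items exposed
    return "live"


empty = "<rss><channel><title>x</title></channel></rss>"
thin = "<rss><channel><item/><item/></channel></rss>"
full = "<rss><channel><item/><item/><item/></channel></rss>"

labels = [classify_feed(200, body) for body in (empty, thin, full)]
```

The empty feed is the case the audit calls out: the server answers 200, so a status-only health check passes, yet there is nothing to ingest.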

Fix 3 · Scheduler

Throughput raise

Fan-out width maxSites 5 → 7: the extra slots land on fresh sites because the cap is now enforced.
Quota depth K 2 → 3: every category’s daily cap scaled ×1.5.
Honest note: a documented target of ~950/day, which the code never delivered because of a units quirk, stays gated behind a sign-off.
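
The quota change can be sanity-checked against the daily ceilings this note reports (~188/day before, ~280/day after). The sketch below only does that arithmetic; the helper names are made up, and how the per-category caps are rounded in the real scheduler is not stated.

```python
def scaled_ceiling(ceiling, k_old, k_new):
    """Scale a daily ceiling by the ratio of new to old quota depth."""
    return ceiling * k_new / k_old


def pct_gain(before, after):
    """Percentage increase from `before` to `after`."""
    return 100 * (after - before) / before


flat = scaled_ceiling(188, k_old=2, k_new=3)   # flat x1.5 on the old ceiling
gain = pct_gain(188, 280)                      # using the reported ~280/day
```

A flat ×1.5 on 188 gives 282, which is consistent with the rounded ~280/day figure, and going from 188 to 280 is a gain of about 49%.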

05 What it adds up to

The scoreboard — with an honest asterisk

The change is behavioral: it shapes future placement, it doesn’t retroactively rescue the month sites sat dark. The proof is in the next weeks of data — which is why the instrumentation is the real deliverable.

Metric | Before | After
Concentration | 80% on 38 sites | cap + LRU + floor
Dormant sites | 249 (53%) | shrinking ↓
Feed sources | 245 | 271 verified
Daily ceiling | ~188/day | ~280/day · +49%
Fan-out width | 5 | 7

Why two systems, not one

Supply and placement are genuinely separate concerns. Diagnosing the imbalance meant looking at both sides and seeing they disagreed. A clean boundary made a failure that spanned both legible — good system boundaries organize thought, not just code.

The tradeoff taken

Ordering by load & idleness sacrifices a little topical ranking for dramatically better coverage. All candidates already cleared the relevance gate — so it’s a deliberate trade, not a regression.

Stenvrik (news-intelligence) · DojoClaw (content engine) · figures reflect the May 2026 engineering audit & the behavioral changes made in response · the network’s response is being tracked.

Implications for Content Network Health and SEO

This imbalance can harm the network’s overall SEO performance by creating patterns search engines might interpret as spammy or low-quality. It also reduces content diversity and value for the less-active sites, potentially impacting their growth and relevance. The case highlights the importance of systemic checks and dynamic balancing in automated publishing systems to prevent a self-reinforcing skew that can undermine long-term sustainability.

Background of Automated Content Distribution Challenges

Large-scale automated content networks often rely on complex systems to source, judge, and distribute articles across many sites. Previous issues have included content quality, relevance, and distribution fairness. This case underscores how even well-functioning decision components can produce unintended systemic effects, such as over-concentration on popular sites, if systemic biases or supply mismatches are not addressed.

The specific setup involves two decoupled systems, Stenvrik for content selection and DojoClaw for content rewriting and distribution, each making independent decisions based on different criteria. The recent audit revealed that these decoupled processes, while efficient at decision-making, can also produce unintended imbalances if not carefully managed.

“The system was functioning correctly at each decision point, but the aggregate behavior was skewed, leading to a lopsided distribution that went unnoticed until we examined the data in detail.”

— Thorsten Meyer, system operator

Unresolved Questions About Long-Term Impact

It is not yet clear how effective the implemented fixes will be over time or whether additional systemic adjustments will be necessary. The long-term impact on SEO, site engagement, and content diversity remains to be seen and will require ongoing monitoring.

Next Steps for Balancing Content Distribution

The team plans to monitor the network closely over the coming weeks to evaluate the effectiveness of recent algorithm adjustments. Further refinements may include dynamic weighting of site activity, more granular control over content categories, and periodic audits to prevent recurrence of imbalance. Additionally, broader systemic changes could be implemented to ensure more equitable distribution across all sites.

Key Questions

Why did most content go to only a few sites?

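Two things combined. The matcher kept surfacing the same broad tech sites for every tech story, and rotation only happened within that matched pool, so sites outside it never got a turn. At the same time, 53% of supplied content was tech/AI while only ~13% of sites are, so most of the catalog had little on-topic material to receive.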

Are these issues common in automated content networks?

They can be. Systemic imbalances can occur when distribution algorithms do not account for site activity and content diversity, especially in large, decoupled systems.

Will the fixes prevent this imbalance in the future?

The recent algorithm adjustments aim to diversify distribution, but ongoing monitoring and further refinements will be necessary to ensure long-term balance.

Could this imbalance harm the network’s SEO?

Potentially, as search engines may interpret over-concentration on a few sites as spammy or low-quality, affecting overall visibility and ranking.

Source: ThorstenMeyerAI.com
