·6 min read·Teardown

What 115 articles a day actually requires

Drafted through my n8n + AI pipeline, edited by me.

By the end of this you'll know what 'autonomous' actually requires under the hood, and why the impressive part is the boring reliability layer, not the AI.

Key numbers: 115-plus articles published per day, across four categories, with zero manual steps on a normal day.

The mess

'Autonomous' is a word people use loosely. Most 'set and forget' systems quietly need a person every single day: restart this, re-run that, patch the thing that broke overnight. For SourceRated the word has a literal meaning. Nobody touches the pipeline on a normal day, and it still publishes 115+ credible articles across four categories.

The wrong way people solve it

They build the happy path and call it done. The scraper works on a good day, the model scores cleanly on tidy input, the publish succeeds when nothing is wrong. Then a source changes its layout, two articles turn out to be the same story under different headlines, the model is unsure, a publish fails halfway, and the 'autonomous' system needs exactly the babysitting it was supposed to remove.

The system view

The loop, end to end, is not exotic. A cron fires every 30 minutes, self-heals if it was ever left unscheduled, and detaches so a slow run survives the host's timeouts. Pre-flight guards reserve the slot so two cycles never collide, confirm auto-publish is on, and check the AI key. Then it fetches the feeds, dedupes in three passes (by URL, by headline before the AI call, and by headline again after it), generates the rewrite and an initial credibility score, and publishes. There is no human in the loop: a review queue exists but is currently switched off, so everything auto-publishes. Failures are logged, not alerted. The work is not in the steps. It is in the seams between them.
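To make the pre-flight guards concrete, here is a minimal sketch of a slot reservation, assuming a small SQLite table as the lock store. The table name `slots` and the functions `slot_key`, `reserve_slot`, and `preflight` are hypothetical names for illustration, not the pipeline's actual code.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional, Tuple

SLOT_MINUTES = 30  # matches the 30-minute cron described above


def slot_key(now: datetime) -> str:
    """Round a timestamp down to the start of its 30-minute slot."""
    minute = now.minute - now.minute % SLOT_MINUTES
    return now.replace(minute=minute, second=0, microsecond=0).isoformat()


def reserve_slot(conn: sqlite3.Connection, now: datetime) -> bool:
    """Return True if this run claimed the slot, False if another run holds it."""
    conn.execute("CREATE TABLE IF NOT EXISTS slots (slot TEXT PRIMARY KEY)")
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO slots (slot) VALUES (?)", (slot_key(now),))
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists, so another cycle owns this slot.
        return False


def preflight(
    conn: sqlite3.Connection,
    now: datetime,
    auto_publish_on: bool,
    api_key: Optional[str],
) -> Tuple[bool, str]:
    """Run the three guards in the order the article lists them."""
    if not reserve_slot(conn, now):
        return False, "slot already reserved"
    if not auto_publish_on:
        return False, "auto-publish is off"
    if not api_key:
        return False, "missing AI key"
    return True, "ok"
```

The primary key does the locking here: two runs that start in the same half hour compute the same key, and only the first insert succeeds.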

Cron (every 30 min) → Guard (slot lock, auto-publish on, API key) → Dedup (URL + headline, before and after the AI) → Generate (rewrite + credibility score) → Auto-publish (no human gate) → Record (log + circuit breaker; readers vote later).

The SourceRated cycle: a self-healing cron every 30 minutes, pre-flight guards, an RSS fetch, three dedup passes, AI generation with a credibility score, automatic publishing with transparency metadata, enrichment and SEO, then logging and a circuit breaker. Readers vote on the score after publish.

  1. Trigger: Cron, every 30 min. Self-heals, detaches past host timeouts.
  2. Pre-flight guards. Lock the slot, auto-publish on, API key.
  3. Fetch RSS. Candidates per category.
  4. Decision: Dedup, three passes. URL, pre-AI and post-AI headline.
  5. Decision: AI generate. Rewrite + credibility score + citations.
  6. Action: Auto-publish + transparency meta. No human gate.
  7. Action: Enrich + SEO. Verify citations, image, Rank Math.
  8. Record: Log + circuit breaker. Score history; readers vote later, score evolves.

One SourceRated cycle, every 30 minutes. Auto-published, logged, no human in the loop.
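The three dedup passes in the cycle can be sketched roughly as follows. This is an illustration under stated assumptions: it uses plain string similarity from the standard library rather than any semantic comparison, the 0.85 threshold is an arbitrary choice, the `utm_` filter is one guess at what URL normalization might strip, and `Deduper`, `process`, and the `rewrite` callback (a stand-in for the AI call) are hypothetical names.

```python
import re
from difflib import SequenceMatcher
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop utm_* params, the fragment, and a trailing slash."""
    parts = urlsplit(url.strip())
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query) if not k.lower().startswith("utm_")]
    )
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"), query, "")
    )


def normalize_headline(headline: str) -> str:
    """Lowercase and strip punctuation so trivial edits do not defeat the comparison."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", headline.lower()).split())


class Deduper:
    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold
        self.urls = set()
        self.headlines = []

    def seen_url(self, url: str) -> bool:
        return normalize_url(url) in self.urls

    def seen_headline(self, headline: str) -> bool:
        candidate = normalize_headline(headline)
        return any(
            SequenceMatcher(None, candidate, known).ratio() >= self.threshold
            for known in self.headlines
        )

    def remember(self, url: str, *headlines: str) -> None:
        self.urls.add(normalize_url(url))
        self.headlines.extend(normalize_headline(h) for h in headlines)


def process(candidates, rewrite, deduper):
    """Run each (url, headline) candidate through the three passes."""
    published = []
    for url, headline in candidates:
        if deduper.seen_url(url):  # pass 1: URL
            continue
        if deduper.seen_headline(headline):  # pass 2: headline, before the AI call
            continue
        new_headline = rewrite(headline)  # stand-in for the AI rewrite
        if deduper.seen_headline(new_headline):  # pass 3: headline, after the AI call
            continue
        deduper.remember(url, headline, new_headline)
        published.append(new_headline)
    return published
```

The pre-AI pass saves a model call on an obvious repeat. The post-AI pass catches the case where two different source headlines get rewritten into nearly the same one.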

What I would build

Exactly what makes it survive a bad day. Custom scrapers across multiple sources. SQLite-based semantic dedup, so the same story in a different headline gets stripped before it publishes, plus a post-publish dedup guard that trashes a near-duplicate from the last hour. An AI credibility score assigned at publish, with a community-voting layer that lets readers move it over time. A publish step running in Docker. And then the part that earns the word autonomous: retries, and a circuit breaker that stops after three straight AI failures and switches to a backup provider on its own. Every run is logged, with a full history of every score change.
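The circuit breaker described above can be sketched in a few lines. This is a minimal version under assumptions: `primary` and `backup` are hypothetical callables wrapping the two AI providers, and the sketch never resets back to the primary once tripped, which a real system would probably want to revisit.

```python
class ProviderBreaker:
    """Switch to a backup provider after N consecutive primary failures."""

    def __init__(self, primary, backup, max_failures: int = 3):
        self.primary = primary
        self.backup = backup
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.tripped = False

    def generate(self, prompt: str) -> str:
        if not self.tripped:
            try:
                result = self.primary(prompt)
                self.consecutive_failures = 0  # any success resets the streak
                return result
            except Exception:
                self.consecutive_failures += 1
                if self.consecutive_failures < self.max_failures:
                    raise  # let the caller log it and retry later
                self.tripped = True  # third straight failure: stop using primary
        return self.backup(prompt)
```

Only consecutive failures count, so one bad response in an otherwise healthy run does not trigger the switch. The first two failures surface to the caller; the third is absorbed by the backup.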

What can break

A source goes down or quietly changes its structure. Two articles are near-duplicates and both slip through. The model scores something at low confidence. A publish dies halfway and leaves a half-written record behind. Bad input data poisons the feed. Every one of these is handled on purpose, because the alternative is finding out from a reader.
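As one illustration of handling the half-written record, here is a sketch that assumes the publish record and its first score-history entry live in the same SQLite database, so both writes can share a transaction. The schema, the `publish` function, and the `fail_midway` flag (which simulates a crash between the two writes) are all hypothetical.

```python
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles (url TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS score_history (url TEXT NOT NULL, score REAL NOT NULL)"
    )


def publish(
    conn: sqlite3.Connection,
    url: str,
    body: str,
    score: float,
    fail_midway: bool = False,
) -> None:
    """Write the article and its initial score together, or not at all."""
    with conn:  # commits if the block finishes, rolls back if it raises
        conn.execute("INSERT INTO articles (url, body) VALUES (?, ?)", (url, body))
        if fail_midway:
            raise RuntimeError("simulated crash between writes")
        conn.execute(
            "INSERT INTO score_history (url, score) VALUES (?, ?)", (url, score)
        )


def count(conn: sqlite3.Connection, table: str) -> int:
    # Table name is interpolated only because this helper is for the demo tables above.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

If the run dies between the two inserts, the rollback removes the article row too, so the next cycle sees a clean slate and can retry the same URL.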

What the business gets

A system that genuinely runs without you, credibility you can defend instead of claim, and your attention back for the work that actually needs judgment. Reliability here is not a feature bolted on at the end. It is the product.

Anyone can ship an automation that works in the demo. The valuable part is making it fail loudly and survive the day you are not watching.

Bring me the workflow you wish ran without you. I'll tell you what I'd make autonomous first, and what still needs a human.
