BOTWIN.TOKYO SYSTEMS//OPEN-SOURCE NEWS ENGINE//CLOUDFLARE PAGES//D1 LIVE//SELF-HOSTED
OPEN-SOURCE AI NEWSPAPER ENGINE

Botwin's Morning Wire

An autonomous newspaper you run yourself. It ingests the day's news, rewrites it into neutral wire-style copy with your own LLM, validates every article, ranks the edition, and publishes a clean daily briefing to your Cloudflare account.

MIT · Open Source
100% · Self-Hosted
D1 · No Rebuilds

MIT License · Self-Hosted · Bring Your Own LLM · Cloudflare Pages

What It Does

“A personal morning intelligence system that rewrites the news — and you own the whole stack.”

Botwin's Morning Wire reads from your configured feeds, rewrites every story into neutral wire-style copy via an LLM you control, and validates the output before any of it touches the published edition. Clone the repo, point it at your own LLM and Cloudflare account, and you have a daily broadsheet that's entirely yours — fully auditable, served from the edge.

Quickstart

Clone, configure, run.

Get a local broadsheet running in five commands. Full setup lives in the README.

bash · local setup
git clone https://github.com/botwin-tokyo/daily-wire-dot-tokyo
cd daily-wire-dot-tokyo
npm install
cp .env.example .env # add your LLM endpoint + (optional) Cloudflare keys
npm run dev # local broadsheet dev server

Generate today's edition — run the pipeline phases in order via Hermes (/fetch-news → /clean-chunks → /chunk-articles → /publish-pipeline), or by hand:

bash · generate an edition
npm run ingest:articles # fetch + compile to local DB
npx tsx agentskills/clean-chunks/clean-chunks.ts # clear previous run artifacts
npx tsx agentskills/chunk-articles/chunk-articles.ts # chunk + rewrite
npx tsx agentskills/publish-pipeline/publish-pipeline.ts # review → publish

Before You Start

What you'll need.

Runtime

Node.js & npm

Node 22+. The ingest pipeline, rewrite scripts, and the TanStack Start frontend all run on it.

Bring your own

An OpenAI-compatible LLM

Local (LM Studio, llama.cpp, Ollama-compatible) or hosted. Set BRAIN_API_URL, BRAIN_API_KEY, and BRAIN_MODEL — every rewrite and validation runs on a model you control.

Fetching

Docker

Runs the Ladder proxy (npm run ladder:up) used to fetch publisher pages. Firecrawl is an optional fallback for blocked sources.

To publish

A Cloudflare account

D1 stores live editions, KV handles rate limiting, Pages serves the site at the edge. Local dev works without it via static fallback.

Self-hosted by design. You bring your own LLM and your own Cloudflare account — nothing phones home.
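As a rough illustration of how the three BRAIN_* settings listed above might be consumed, the sketch below reads them from an environment object and builds a request for an OpenAI-compatible chat completions endpoint. It assumes BRAIN_API_URL is a base URL such as http://localhost:1234/v1. The names loadBrainConfig and buildRewriteRequest, the prompt, and the temperature are hypothetical and not taken from the repo.

```typescript
// Sketch only: assumes BRAIN_API_URL is the base URL of an OpenAI-compatible
// server (for example "http://localhost:1234/v1"). Helper names, prompt text,
// and temperature are illustrative assumptions, not the repo's code.

interface BrainConfig {
  apiUrl: string;
  apiKey: string;
  model: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Read the three BRAIN_* variables and fail early if any is missing.
function loadBrainConfig(env: Record<string, string | undefined>): BrainConfig {
  const apiUrl = env.BRAIN_API_URL;
  const apiKey = env.BRAIN_API_KEY;
  const model = env.BRAIN_MODEL;
  if (!apiUrl || !apiKey || !model) {
    throw new Error("Set BRAIN_API_URL, BRAIN_API_KEY, and BRAIN_MODEL");
  }
  return { apiUrl, apiKey, model };
}

// Build a chat-completions request for one rewrite. Nothing is sent here,
// so the function stays pure and easy to test.
function buildRewriteRequest(config: BrainConfig, sourceText: string): ChatRequest {
  const base = config.apiUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify({
        model: config.model,
        temperature: 0.2, // assumed value; lower tends to give steadier copy
        messages: [
          { role: "system", content: "Rewrite the story as neutral wire-style copy." },
          { role: "user", content: sourceText },
        ],
      }),
    },
  };
}
```

Passing the result to fetch(url, init) would send the request. Keeping the builder separate from the network call makes it easy to check the payload against a local server before pointing it at a hosted one.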

Sources & Fetching

Fifty sources, wired in.

Botwin's Morning Wire ships with roughly 50 pre-built source fetchers spanning every section. Fetching routes through a self-hosted Ladder proxy, with Firecrawl as an optional fallback for sources that block it — and adding your own is a single fetcher file.
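The two-tier idea described above can be sketched as a small helper that tries the primary tier and only reaches for the fallback when the first attempt fails. This is a minimal sketch under stated assumptions: both tiers are injected as plain functions, and fetchWithFallback, PageFetcher, and FetchResult are hypothetical names. It says nothing about how the repo actually calls Ladder or Firecrawl.

```typescript
// Sketch only: the two tier functions are injected, so nothing here depends on
// how Ladder or Firecrawl are really invoked. All names are illustrative.

type PageFetcher = (url: string) => Promise<string>;

interface FetchResult {
  tier: "ladder" | "firecrawl";
  html: string;
}

// Try the primary tier first; fall back only if it throws or returns nothing.
async function fetchWithFallback(
  url: string,
  ladder: PageFetcher,
  firecrawl?: PageFetcher, // optional, matching the "optional fallback" wording
): Promise<FetchResult> {
  try {
    const html = await ladder(url);
    if (html.trim().length > 0) return { tier: "ladder", html };
    throw new Error("empty response from primary tier");
  } catch (primaryError) {
    // No fallback configured: surface the original failure.
    if (!firecrawl) throw primaryError;
    const html = await firecrawl(url);
    if (html.trim().length === 0) {
      throw new Error(`both tiers returned nothing for ${url}`);
    }
    return { tier: "firecrawl", html };
  }
}
```

Recording which tier served each page, as the tier field does here, would make it easier to spot sources that routinely block the primary proxy.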

50 · Source fetchers
9 · Sections covered
2-tier · Ladder + Firecrawl

BBC · The Guardian · AP · Reuters · Al Jazeera · France 24 · CoinDesk · TechCrunch · The Verge · Ars Technica · MacRumors · Washington Post

The Pipeline

FETCH → CLEAN → CHUNK → REWRITE → VALIDATE → REVIEW → ASSEMBLE → PUBLISH → RENDER

FETCH (LOCAL): Collect news from configured RSS, APIs, and publisher pages
CLEAN (LOCAL): Remove old generated files and previous run artifacts
CHUNK (LOCAL): Split incoming articles into category batches
REWRITE (LLM): Rewrite stories into neutral wire-style copy via your own LLM
VALIDATE (VALIDATED): Check for chain-of-thought leaks, JSON artifacts, empty content, second-person advice. Strongest guardrail in the system.
RETRY / QUARANTINE (RETRY): Recover failed rewrites or remove them from publication
REVIEW (LOCAL): Deduplicate, boost trending topics, clean formatting
ASSEMBLE (LOCAL): Merge reviewed articles into the daily file and populate the edition database
PUBLISH (D1): Build the schema-valid edition JSON and upsert it to Cloudflare D1
RENDER (EDGE): Serve via TanStack Start frontend on Cloudflare Pages
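
To make the VALIDATE and RETRY / QUARANTINE steps above more concrete, here is a minimal sketch of the four named checks plus a simple triage rule. The regexes are deliberately crude, and validateRewrite, triage, the Issue labels, and the three-attempt limit are all hypothetical. The repo's actual rules may differ.

```typescript
// Sketch only: the four checks mirror the VALIDATE step, but the patterns,
// names, and retry limit are illustrative assumptions, not the repo's rules.

type Issue = "chain-of-thought" | "json-artifact" | "empty" | "second-person";
type Verdict = "publish" | "retry" | "quarantine";

function validateRewrite(body: string): Issue[] {
  const text = body.trim();
  if (text.length === 0) return ["empty"];

  const issues: Issue[] = [];

  // Reasoning leaks: <think> tags, or an opening that narrates the model's process.
  if (/<\/?think>/i.test(text) || /^(okay|alright|let me|first, i)\b/i.test(text)) {
    issues.push("chain-of-thought");
  }

  // JSON artifacts: a triple-backtick fence, or a body that is a bare object/array.
  if (/`{3}/.test(text) || /^[\[{][\s\S]*[\]}]$/.test(text)) {
    issues.push("json-artifact");
  }

  // Second-person advice: crude match on "you", "your", "you're".
  // Quoted speech would trip this too, so a real check needs more care.
  if (/\byou(r|'re)?\b/i.test(text)) {
    issues.push("second-person");
  }

  return issues;
}

// attempt is the 1-based number of rewrites tried so far for this article.
function triage(body: string, attempt: number, maxAttempts = 3): Verdict {
  if (validateRewrite(body).length === 0) return "publish";
  return attempt < maxAttempts ? "retry" : "quarantine";
}
```

Returning a list of issues rather than a single boolean keeps the reason for each failure available to whoever inspects the quarantine later.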
Why It's Different

Six things that make this not a blog.

It rewrites, not just scrapes

Articles are rewritten into a consistent neutral editorial voice before publication.

Validation before every publish

Every AI-generated article must pass validation before it is published. Output that fails is retried or quarantined instead of reaching the edition.

Canonical storage in D1

Editions are stored in your Cloudflare D1 database, not only in ephemeral build output.

Live updates without rebuilds

The site updates daily without a Git push or frontend redeployment.

Full source transparency

Every story carries source attribution, confidence scores, and AI disclosure metadata.

You own the whole stack

Your machine, your LLM, your Cloudflare account. No vendor, no lock-in, MIT-licensed.

Architecture

Four layers. One newsroom.

① Hermes Skills

Operator personality

$ /morning-wire run

Slash-command-driven automation that runs the pipeline phases from the operator's machine.

② TypeScript Scripts

Code pipeline personality

fetch · rewrite · validate · publish

The deterministic and LLM-backed scripts that handle every phase from ingestion to final edition.

③ Cloudflare Platform

Infrastructure personality

D1 · KV · Pages

D1 stores live editions. KV backs rate limiting (and caching down the road). Pages serves the frontend globally at the edge.

④ TanStack Start Frontend

Product personality

World · Biz · Sci · Culture

The responsive broadsheet UI with API routes, archive, search, article pages, and category views.

Deploy Your Own

Fork it, point it at Cloudflare, ship your own wire.

Provision the storage layer, then connect your fork to Cloudflare Pages.

bash · provision cloudflare
wrangler d1 create <your-database-name> # then apply migrations/0001 + 0002
wrangler kv namespace create KV
wrangler pages secret put ADMIN_TOKEN --project-name=<your-pages-project>

Connect your fork to Cloudflare Pages — push to main and it deploys. D1 is the source of truth, so new editions go live with no rebuild and no Git push. The static files are only a local-dev fallback.
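
Because D1 is the source of truth, publishing an edition comes down to an upsert keyed on the edition date. The sketch below builds such a statement as plain data. The editions table, its columns, and the Edition shape are hypothetical; the real schema is whatever the repo's migrations define.

```typescript
// Sketch only: table name, column names, and the Edition shape are hypothetical.
// The real schema lives in the repo's migrations.

interface Edition {
  date: string; // e.g. "2025-01-31"
  articles: unknown[];
}

interface UpsertStatement {
  sql: string;
  params: [string, string, string];
}

// Build an SQLite-style upsert: insert the edition, or overwrite the payload
// if a row for that date already exists.
function buildEditionUpsert(edition: Edition, now: Date = new Date()): UpsertStatement {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(edition.date)) {
    throw new Error(`edition date must be YYYY-MM-DD, got "${edition.date}"`);
  }
  return {
    sql:
      "INSERT INTO editions (edition_date, payload, updated_at) VALUES (?, ?, ?) " +
      "ON CONFLICT(edition_date) DO UPDATE SET " +
      "payload = excluded.payload, updated_at = excluded.updated_at",
    params: [edition.date, JSON.stringify(edition), now.toISOString()],
  };
}
```

With a D1 binding named db (the binding name is an assumption), the statement would run as db.prepare(stmt.sql).bind(...stmt.params).run(). The ON CONFLICT clause only works if edition_date is the primary key or carries a unique index.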

Tech Stack

Stamped, classified, accounted for.

REACT 19
TYPESCRIPT
TANSTACK START
VITE
TAILWIND CSS
ZOD
SQLITE
CLOUDFLARE D1
CLOUDFLARE KV
CLOUDFLARE PAGES
WRANGLER
DOCKER
OPENAI-COMPATIBLE LLM
HERMES
SOURCE SERIF 4

Trust & Safety

No garbage in. No garbage out.

No chain-of-thought leaks

Internal LLM reasoning is stripped before any output reaches the edition.

No raw JSON artifacts

Structured output is validated and cleaned before publication.

No empty article bodies

Empty rewrites are caught and quarantined automatically.

No second-person advice

AI-generated copy that breaks editorial voice is flagged and removed.

Source transparency preserved

Every story carries its original source attribution and URL.

No browser-exposed secrets

Admin routes are protected by a bearer token and rate-limited.

Bad rewrites are quarantined, not silently dropped. The operator can inspect, retry, or discard them.
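
The bearer-token and rate-limit protection on admin routes mentioned above could look roughly like the following. This is a sketch under stated assumptions: isAuthorized, recordRequest, the window size, and the request limit are hypothetical, and the counter state is a plain object here, whereas the project keeps rate-limit state in KV.

```typescript
// Sketch only: names, window size, and limit are illustrative. In a deployment
// the WindowState would be read from and written back to KV per client.

// Compare two strings without returning early on the first mismatch.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end; "|| 0" turns that into 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Accept only "Bearer <token>" where the token matches ADMIN_TOKEN.
function isAuthorized(authorizationHeader: string | null, adminToken: string): boolean {
  if (!authorizationHeader || adminToken.length === 0) return false;
  const match = /^Bearer\s+(.+)$/.exec(authorizationHeader);
  return match !== null && safeEqual(match[1], adminToken);
}

interface WindowState {
  windowStart: number; // epoch milliseconds when the current window opened
  count: number; // requests seen in the current window
}

// Fixed-window limiter: returns the updated state and whether the request passes.
function recordRequest(
  prev: WindowState | undefined,
  nowMs: number,
  limit = 10,
  windowMs = 60_000,
): { state: WindowState; allowed: boolean } {
  if (!prev || nowMs - prev.windowStart >= windowMs) {
    return { state: { windowStart: nowMs, count: 1 }, allowed: limit >= 1 };
  }
  const state = { windowStart: prev.windowStart, count: prev.count + 1 };
  return { state, allowed: state.count <= limit };
}
```

Checking the token before touching the counter would keep unauthenticated traffic from consuming an operator's request budget, though the reverse order is also defensible if the goal is to slow down token guessing.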

❧ Built For Operators ❧

A newspaper that behaves like infrastructure.

Botwin's Morning Wire is built like a newsroom pipeline, an automation stack, and a publishing platform at once. It runs every morning. It validates its own output. It publishes like infrastructure.

Roadmap

What's next on the press.

  • Better monitoring and alerting
  • More polished admin controls
  • Expanded personalization options
  • Improved source reliability scoring
  • Feed management UI
  • Better edition analytics
  • More advanced topic clustering
  • Voice briefing support

Open source under MIT. Issues and pull requests welcome.

This is not a blog.
It is an autonomous daily publication engine — and the source is yours.

Clone the repo, run the pipeline, and publish your own edition.