
Building an Automated AI News Posting System - Lessons Learned

Fazm Team · 2 min read
news-automation · rss · content-posting · ai-system · automation

We built an automated system that monitors news sources, filters relevant stories, and posts them with AI-generated summaries. It sounds straightforward. It wasn't.

Scraping Breaks, RSS Doesn't

Our first version scraped news sites directly on a schedule. Within a week, we hit rate limits, got blocked by Cloudflare, and dealt with layout changes that broke our parsers. Switching to RSS feeds solved most of this. RSS is boring and old and incredibly reliable. Most major publications still maintain their feeds even if they don't advertise them.
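A minimal sketch of the feed-parsing side, using only the standard library. The function name and field choices here are illustrative, not our production code; in production the feed body comes from an HTTP GET on each publication's feed URL.

```python
import xml.etree.ElementTree as ET

def parse_feed(xml_text: str) -> list[dict]:
    """Extract title/link pairs from an RSS 2.0 feed body."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
        })
    return items

# Example feed body; real feeds have more fields (pubDate, guid, description)
SAMPLE = """<rss version="2.0"><channel>
  <item><title>Story A</title><link>https://example.com/a</link></item>
  <item><title>Story B</title><link>https://example.com/b</link></item>
</channel></rss>"""

print(parse_feed(SAMPLE))
```

Because RSS is a fixed, stable schema, this parser doesn't break when a site redesigns its pages, which is exactly what killed the scraper version.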

Deduplication Is Harder Than It Looks

The same story appears on multiple outlets with different headlines. Simple title matching misses most duplicates. URL-based dedup fails when outlets use different slugs. What worked for us was embedding headlines and checking cosine similarity against a rolling window of recent posts. Anything above 0.85 similarity gets flagged as a duplicate.
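The dedup loop can be sketched as below. In the real system the embedding comes from an embedding model; here a toy bag-of-words vectorizer stands in so the example is self-contained, and the class name is hypothetical. The 0.85 threshold and rolling window match the approach described above.

```python
import math
from collections import Counter, deque

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class Deduper:
    def __init__(self, threshold: float = 0.85, window: int = 200):
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # embeddings of recent posts

    def is_duplicate(self, headline: str) -> bool:
        vec = embed(headline)
        if any(cosine(vec, seen) >= self.threshold for seen in self.recent):
            return True
        self.recent.append(vec)  # only remember headlines we actually post
        return False
```

The bounded `deque` is the key detail: comparing against every post ever made would get slower forever, while a window of recent posts is enough because duplicate coverage clusters within hours of the original story.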

Queue Management Prevents Flooding

Without rate limiting on the output side, our system would post 40 stories in five minutes whenever a major event happened. Nobody wants that. A simple queue with configurable intervals - one post every 15 minutes maximum - made the output feel curated instead of robotic.

Let the AI Summarize, Not Editorialize

Early prompts asked the model to write engaging commentary. The results felt fake. Switching to straightforward summaries - what happened, who's involved, why it matters - produced content people actually read. The AI adds value by condensing, not by pretending to have opinions.
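The shape of the summary prompt can be sketched like this. The exact wording below is illustrative, not our production prompt, but it captures the shift: enumerate the facts wanted and explicitly forbid commentary.

```python
def summary_prompt(headline: str, article_text: str) -> str:
    """Build a summarization prompt that asks for facts, not commentary."""
    return (
        "Summarize the news story below in three short parts:\n"
        "1. What happened\n"
        "2. Who is involved\n"
        "3. Why it matters\n"
        "Do not add opinions, hype, or editorial commentary.\n\n"
        f"Headline: {headline}\n\n"
        f"Article:\n{article_text}"
    )
```

Constraining the output format also makes the posts scannable and uniform, which matters more for a feed than any single clever summary.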

The system runs reliably now with minimal maintenance. The key insight is that the AI part was the easy part. The infrastructure around it - feeds, dedup, queuing, scheduling - is where all the real work lives.

Fazm is an open source macOS AI agent, available on GitHub.
