Build a Reddit + Stack Overflow Monitor That Sends You Opportunities on Discord
I run a developer resource site with 200+ pages — cheat sheets, error fixes, tutorials, comparisons. The problem: how do I find people who actually need this content, right when they’re asking for help?
The answer: a Python script that scans Reddit, Stack Overflow, and Hacker News every 2 hours, matches posts against my content library, and sends opportunities to Discord. No API keys needed. Runs for free on GitHub Actions.
Here’s exactly how I built it.
What we’re building
A monitoring system that:
- Scans 37 subreddits, Stack Overflow, and Hacker News for new posts
- Matches posts against your content using phrase-based keyword matching
- Sends matched opportunities to a Discord channel via webhook
- Runs automatically every 2 hours using GitHub Actions
- Tracks seen posts so you never get duplicate notifications
Tech stack: Python (standard library only), GitHub Actions, Discord webhooks.
Step 1: Set up the Discord webhook
First, you need a place to receive notifications. Discord webhooks are the simplest option — no bot setup, no authentication, just a URL you POST to.
- Open Discord → go to your server → pick a channel (or create one called #opportunities)
- Click the gear icon → Integrations → Webhooks → New Webhook
- Name it something like “Content Monitor”
- Click Copy Webhook URL
That URL is all you need. Test it with curl:
curl -X POST "YOUR_WEBHOOK_URL" \
-H "Content-Type: application/json" \
-d '{"content": "Hello from the monitor!"}'
If you see the message in Discord, you’re good. That’s how webhooks work — you POST JSON data to a URL and the receiving service handles it.
Step 2: Understand the data sources
We’re pulling from three sources, all with free public APIs:
Reddit exposes every subreddit as JSON by appending .json to the URL:
https://www.reddit.com/r/learnprogramming/new.json?limit=25
No API key needed. Just set a User-Agent header (Reddit blocks requests without one). Each post has a title, selftext (body), permalink, created_utc, and num_comments.
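The post doesn't show the fetch code at this point, so here is a minimal sketch of how that Reddit call could look with the standard library (the `parse_listing`/`fetch_subreddit` names and the User-Agent string are my own, not from the original script):

```python
import json
import urllib.request

def parse_listing(data):
    """Extract the fields we care about from a Reddit listing payload."""
    posts = []
    for child in data["data"]["children"]:
        p = child["data"]
        posts.append({
            "title": p["title"],
            "body": p.get("selftext", ""),          # empty for link posts
            "url": "https://www.reddit.com" + p["permalink"],
            "created": p["created_utc"],
            "comments": p["num_comments"],
        })
    return posts

def fetch_subreddit(sub, limit=25):
    # Reddit rejects urllib's default User-Agent, so set a descriptive one.
    url = f"https://www.reddit.com/r/{sub}/new.json?limit={limit}"
    req = urllib.request.Request(url, headers={"User-Agent": "content-monitor/1.0"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_listing(json.load(resp))
```

Keeping the parsing separate from the HTTP call also makes the field extraction easy to test without hitting the network.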
Stack Overflow
The Stack Exchange API is free for basic usage:
https://api.stackexchange.com/2.3/questions?order=desc&sort=creation&tagged=python&site=stackoverflow&filter=withbody&pagesize=20
Returns questions with title, body, tags, answer count, and creation date. We filter for questions with 2 or fewer answers — those are the ones where your answer will actually be seen.
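A sketch of that fetch-and-filter step (function names are mine; one gotcha worth knowing is that the Stack Exchange API gzips every response, which `urllib` does not decompress for you):

```python
import gzip
import json
import urllib.request

SO_URL = ("https://api.stackexchange.com/2.3/questions"
          "?order=desc&sort=creation&tagged={tag}"
          "&site=stackoverflow&filter=withbody&pagesize=20")

def low_answer_questions(items, max_answers=2):
    """Keep questions with few answers — where a new answer is still seen."""
    return [q for q in items if q.get("answer_count", 0) <= max_answers]

def fetch_so_questions(tag):
    req = urllib.request.Request(SO_URL.format(tag=tag),
                                 headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        raw = resp.read()
    if raw[:2] == b"\x1f\x8b":  # gzip magic bytes — decompress if needed
        raw = gzip.decompress(raw)
    return low_answer_questions(json.loads(raw).get("items", []))
```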
Hacker News
The HN Firebase API is dead simple:
https://hacker-news.firebaseio.com/v0/askstories.json # Get Ask HN story IDs
https://hacker-news.firebaseio.com/v0/item/12345.json # Get a specific story
We focus on “Ask HN” posts since those are questions where you can provide helpful answers.
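Unlike Reddit, HN requires two round trips — one for the ID list, one per item. A minimal sketch of that pattern (helper names are mine, not from the original script):

```python
import json
import urllib.request

HN_API = "https://hacker-news.firebaseio.com/v0"

def hn_item_url(item_id):
    return f"https://news.ycombinator.com/item?id={item_id}"

def fetch_json(url):
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def fetch_ask_hn(limit=30):
    """Fetch the newest Ask HN stories — one item request per story ID."""
    posts = []
    for item_id in fetch_json(f"{HN_API}/askstories.json")[:limit]:
        item = fetch_json(f"{HN_API}/item/{item_id}.json")
        if item and item.get("type") == "story":
            posts.append({
                "title": item.get("title", ""),
                "body": item.get("text", ""),   # Ask HN body, may be empty
                "url": hn_item_url(item_id),
            })
    return posts
```

Capping the ID list with `limit` keeps the per-item requests bounded, which matters when this runs every 2 hours.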
Step 3: Build the content index
The script needs to know what content you have so it can match posts against it. We read all markdown files from the blog directory and extract titles and tags:
import os, re
KEYWORD_MAP = {}
def load_content_index():
blog_dir = "src/content/blog"
for f in os.listdir(blog_dir):
if not f.endswith(".md"):
continue
slug = f.replace(".md", "")
        with open(os.path.join(blog_dir, f), encoding="utf-8") as fh:
head = fh.read(1500) # Only need the frontmatter
# Extract title
m = re.search(r'^title:\s*["\'](.*?)["\']', head, re.M)
title = m.group(1) if m else slug
# Extract tags
m = re.search(r'^tags:\s*\[(.*?)\]', head, re.M)
tags = [t.strip().strip('"').strip("'")
for t in m.group(1).split(",")] if m else []
KEYWORD_MAP[slug] = {
"title": title,
"tags": tags,
"url": f"https://yoursite.com/blog/{slug}/",
}
This uses regex to parse the YAML frontmatter. We only read the first 1500 characters since the frontmatter is always at the top.
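For reference, those regexes assume frontmatter shaped roughly like this (a hypothetical page — the tag values are illustrative):

```markdown
---
title: "Python Cheat Sheet"
tags: ["python", "cheat-sheet"]
---
```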
Step 4: Build the matching engine
This is the most important part. Bad matching = useless notifications. The key insight: match on phrases, not single words.
Single-word matching (“python”, “docker”, “git”) matches everything and gives you garbage results. Phrase matching (“learning python”, “cannot read property of undefined”, “react vs vue”) gives you high-intent matches.
We build different matching rules based on content type:
# Error fix pages — match on the actual error message
# "cannot read property of undefined" → our error fix page
error_phrases = {
"cannot-read-property-undefined": [
"cannot read property of undefined",
"cannot read properties of undefined",
],
"git-merge-conflict": [
"merge conflict",
"resolve conflict",
],
# ... one entry per error fix page
}
# Comparison pages — match on "X vs Y" patterns
# "react vs vue" or "should I use react or vue" → our comparison
# a and b stand for each tool pair, e.g. a, b = "react", "vue"
comparison_phrases = [
f"{a} vs {b}", f"{b} vs {a}",
f"{a} or {b}", f"{b} or {a}",
f"should i use {a} or {b}",
]
# Tutorial pages — match on learning intent
# "learning python" or "new to docker" → our tutorial
# tech stands for each technology name, e.g. tech = "python"
tutorial_phrases = [
f"what is {tech}", f"new to {tech}",
f"learning {tech}", f"{tech} for beginners",
]
The matching function checks every post against every rule:
def match_post(title, body):
text = (title + " " + body).lower()
matches = []
for slug, display_title, url, phrases, match_type in CONTENT_RULES:
matched = [p for p in phrases if p in text]
if not matched:
continue
score = len(matched)
if match_type == "error":
score += 2 # Error matches are high-intent
matches.append({
"title": display_title,
"url": url,
"score": score,
"matched": matched[:3],
})
matches.sort(key=lambda x: x["score"], reverse=True)
return matches[:3]
Error matches get a bonus score because someone pasting an error message is the highest-intent signal — they have a problem right now and need a fix.
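`match_post` iterates a flat `CONTENT_RULES` list of `(slug, title, url, phrases, match_type)` tuples that the snippets above never assemble. One plausible way to flatten the error-phrase dict against the content index (the `build_rules` name is mine, and I'm assuming the `KEYWORD_MAP` shape from Step 3):

```python
def build_rules(keyword_map, error_phrases):
    """Flatten per-page phrase lists into (slug, title, url, phrases, type) tuples."""
    rules = []
    for slug, phrases in error_phrases.items():
        page = keyword_map.get(slug, {})
        rules.append((
            slug,
            page.get("title", slug),  # fall back to the slug if no frontmatter title
            page.get("url", f"https://yoursite.com/blog/{slug}/"),
            phrases,
            "error",
        ))
    return rules
```

The comparison and tutorial phrase lists would be appended the same way with their own `match_type`, so the scoring bonus in `match_post` can tell them apart.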
Step 5: Deduplication
Without deduplication, you’ll get the same post every time the script runs. We track seen posts using a simple JSON file:
import hashlib, json, time
SEEN_FILE = "scripts/.seen-posts.json"
def load_seen():
try:
with open(SEEN_FILE) as f:
data = json.load(f)
# Prune entries older than 48 hours
cutoff = time.time() - 172800
return {k: v for k, v in data.items() if v > cutoff}
except (FileNotFoundError, json.JSONDecodeError):
return {}
def post_id(url):
return hashlib.md5(url.encode()).hexdigest()[:12]
Each post URL gets hashed to a short ID. We store the ID with a timestamp and auto-prune after 48 hours to keep the file small.
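The snippet above shows loading and hashing but not the write-back; a minimal counterpart could look like this (`save_seen` is my name for it):

```python
import hashlib
import json
import time

SEEN_FILE = "scripts/.seen-posts.json"

def save_seen(seen, new_urls, path=SEEN_FILE):
    """Record newly notified posts, then write the map back to disk."""
    now = time.time()
    for url in new_urls:
        seen[hashlib.md5(url.encode()).hexdigest()[:12]] = now
    with open(path, "w") as f:
        json.dump(seen, f)
```

One caveat when this runs on GitHub Actions: each run starts on a fresh runner, so the seen file only persists between runs if the workflow commits it back to the repo (or caches it) — an extra step not shown in the workflow below.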
Step 6: Send to Discord
Discord webhooks accept up to 10 embeds per request. Embeds are richer than plain text — they support titles, links, colors, and footers:
import json, urllib.request
def send_discord(webhook_url, embeds):
for i in range(0, len(embeds), 10):
batch = embeds[i:i+10]
payload = json.dumps({"embeds": batch}).encode()
req = urllib.request.Request(
webhook_url,
data=payload,
headers={"Content-Type": "application/json"},
)
        urllib.request.urlopen(req, timeout=10).close()
Each embed looks like this:
{
"title": "🔥 r/learnpython — Is this a good way to self-learn Python?",
"url": "https://reddit.com/r/learnpython/...",
"description": "💡 [Python Cheat Sheet](https://yoursite.com/blog/python-cheat-sheet/)\n ↳ matched: *learning python, self-learn python*",
"color": 0x6366F1, # Purple for Reddit
"footer": {"text": "💬 3 replies · 45m ago"},
}
We color-code by source: purple for Reddit, orange for Stack Overflow, and a different shade of orange for Hacker News.
Step 7: Automate with GitHub Actions
The script works locally, but you want it running automatically. GitHub Actions lets you run code on a schedule using cron syntax — for free.
Create .github/workflows/reddit-monitor.yml:
name: Content Monitor
on:
schedule:
- cron: '15 */2 * * *' # Every 2 hours at :15
workflow_dispatch: # Manual trigger button
jobs:
monitor:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.12'
- run: python scripts/reddit-monitor.py
env:
DISCORD_WEBHOOK: ${{ secrets.DISCORD_WEBHOOK }}
The cron expression 15 */2 * * * means “at minute 15 of every 2nd hour” — so 0:15, 2:15, 4:15, etc. We offset by 15 minutes to avoid the rush of jobs that run at :00.
Add your Discord webhook URL as a repository secret:
- Go to your repo → Settings → Secrets and variables → Actions
- Click “New repository secret”
- Name: DISCORD_WEBHOOK, Value: your webhook URL
Step 8: Test it
Run locally first:
DISCORD_WEBHOOK="your-webhook-url" python scripts/reddit-monitor.py
You should see output like:
Loaded 156 matching rules from 229 files
Scanning Reddit...
r/learnprogramming
r/learnpython
...
Scanning Stack Overflow...
Scanning Hacker News...
Total posts fetched: 315
New opportunities: 14
Sent 14 opportunities to Discord
Check your Discord channel — you should see color-coded embeds with matched content links.
Run it again immediately to verify deduplication:
Total posts fetched: 315
New opportunities: 0
Zero duplicates. The seen file is working.
The result
Every 2 hours, I get Discord notifications like:
🔥 r/learnpython — Is this a good way to self-learn Python for finance?
💡 Python Cheat Sheet — The Only Reference You Need
↳ matched: learning python, self-learn python
💬 3 replies · 45m ago
I click through, read the post, and if it’s a good fit, I write a helpful answer that naturally links to my content. No spam, no self-promotion — just answering questions with useful resources.
What I’d improve
- Smarter matching — NLP or embeddings instead of phrase matching would catch more nuanced questions
- Priority scoring — weight by subreddit size, post upvotes, and comment count
- Response templates — pre-generate answer drafts based on the matched content
- Analytics — track which answers drive the most traffic back to the site
But the simple version works surprisingly well. 14 quality matches from 315 posts is a solid hit rate, and each one is a genuine opportunity to help someone.
Full source code
The complete script is about 300 lines of Python with zero dependencies (standard library only). You can find it in the GitHub repo.