Architecture
This page describes how the community bots are structured and how data flows through the system.
Overview
The project contains four automation bots that run on Bluesky (and historically Mastodon), along with an RSS data collector that maintains the metadata used by the blog-post bot. Each one is a standalone Python class in src/, triggered by a scheduled GitHub Actions workflow (cron). The bot classes then call the appropriate platform API: Mastodon.py for Mastodon and atproto for Bluesky.
GitHub Actions (cron)
      │
      ▼
Bot class (src/)
      │
┌─────┴─────┐
▼           ▼
Mastodon    Bluesky
(Mastodon.py)  (atproto)
Bots
Promote Anniversaries (promote_anniversaries.py)
Posts a celebration message on the anniversary date recorded in each women-in-tech profile.
Flow: The bot loads events.json, compares each entry's date against today, builds a text post with the person's image, and sends it to both platforms.
metadata/events.json
│
▼
PromoteAnniversary.promote_anniversary()
│
├─ is_matching_current_date() ← compare MM-DD to today
├─ build_post() ← format text + hashtags
├─ download_image() ← fetch from GitHub raw URL
└─ send_post()
   ├─ send_post_to_mastodon()
   └─ send_post_to_bluesky()
Each entry in events.json holds a name, anniversary date (MM-DD), description, image filename, and wiki link. The bot runs daily and only posts when the date matches.
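The date check and post formatting described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names follow the diagram, but the dictionary keys (`name`, `date`, `description`, `wiki_link`), the post wording, and the hashtag are assumptions.

```python
from datetime import date
from typing import Optional


def is_matching_current_date(event_date: str, today: Optional[date] = None) -> bool:
    """Return True when an MM-DD string matches today's month and day."""
    today = today or date.today()
    return event_date == today.strftime("%m-%d")


def build_post(event: dict, hashtags: list) -> str:
    """Format a text post; the key names used here are hypothetical."""
    tags = " ".join(hashtags)
    return (
        f"Today we celebrate {event['name']}! {event['description']}\n"
        f"{event['wiki_link']}\n"
        f"{tags}"
    )


# Placeholder entry; the real schema of events.json may differ.
example_event = {
    "name": "Example Person",
    "date": "03-08",
    "description": "Pioneer of example computing.",
    "wiki_link": "https://example.org/wiki",
}

if is_matching_current_date(example_event["date"]):
    print(build_post(example_event, ["#WomenInTech"]))
```

Passing `today` explicitly keeps the comparison testable without depending on the date the code runs.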
Promote Blog Posts (promote_blog_post.py)
Rotates through community members and shares their latest blog post.
Flow: The bot reads a counter file to pick the next community member, fetches their RSS feed, generates a summary via Gemini, posts to the platform, then commits the updated counter back to the repository.
metadata/{pyladies,rladies}_meta_data.json ← member list + RSS feeds
metadata/*_counter_*.txt ← tracks current position
│
▼
PromoteBlogPost.promote_blog_post()
│
├─ read_metadata_json() ← load member list
├─ read_counter_name() ← determine next member
├─ process_feeds()
│ ├─ fetch RSS feed
│ ├─ parse_pub_date()
│ ├─ download_image()
│ ├─ generate_summary() ← Gemini API
│ └─ send_post()
├─ update_counter() ← advance position
└─ push changes to repo ← commit updated counter
The counter files (*_counter_mastodon.txt, *_counter_bluesky.txt) persist the current rotation index between runs. After each post the bot commits the updated counter back to the repository.
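The rotation state could be handled roughly as shown below. This sketch assumes the counter file holds a single integer index and that the rotation wraps around at the end of the member list; neither detail is confirmed by this page, and `read_counter` is a hypothetical helper name (only `update_counter` appears in the diagram).

```python
from pathlib import Path


def read_counter(path: Path) -> int:
    """Return the stored rotation index, or 0 if the file is missing or empty."""
    try:
        return int(path.read_text().strip() or 0)
    except FileNotFoundError:
        return 0


def update_counter(path: Path, current: int, member_count: int) -> int:
    """Advance the index, wrap at the end of the list (assumed), and persist it."""
    next_index = (current + 1) % member_count
    path.write_text(str(next_index))
    return next_index
```

Because the index lives in a file inside the repository, the workflow has to commit that file after each run; otherwise the next scheduled run would start from the same position.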
Boost Tags (boost_tags.py)
Reposts any public post tagged #pyladies or #rladies.
Flow: The bot searches each configured hashtag, then reposts any public post it finds that hasn't already been boosted.
config.TAGS ← ['#rladies', '#pyladies']
│
▼
BoostTags.boost_tags()
│
├─ repost_tags_mastodon()
│ └─ timeline_hashtag() → status_reblog()
└─ repost_tags_bluesky()
   └─ feed.search_posts() → feed.repost()
config.IGNORE_SERVERS lists Mastodon instances whose posts should be skipped.
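One way to express the Mastodon-side filtering is sketched below. It works on plain dictionaries shaped like Mastodon status objects (`visibility`, `reblogged`, `account.acct`); the helper names `server_of` and `boostable` and the `home_server` parameter are hypothetical and not taken from the project.

```python
from typing import Collection, Iterable, Iterator


def server_of(acct: str, home_server: str) -> str:
    """Remote accounts look like 'user@server'; local ones have no '@'."""
    return acct.split("@", 1)[1] if "@" in acct else home_server


def boostable(
    statuses: Iterable[dict],
    ignore_servers: Collection[str],
    home_server: str,
) -> Iterator[dict]:
    """Yield public, not-yet-boosted statuses from servers that are not ignored."""
    for status in statuses:
        if status["visibility"] != "public":
            continue
        if status.get("reblogged"):
            continue
        if server_of(status["account"]["acct"], home_server) in ignore_servers:
            continue
        yield status
```

In the bot, each yielded status would then be passed to the reblog call shown in the diagram.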
Boost Mentions (boost_mentions.py)
Boosts and likes posts that mention the bot accounts.
Flow: The bot reads its notification feed, then boosts and likes any new mention it finds.
BoostMentions.boost_mentions()
│
├─ [Mastodon]
│ ├─ notifications(types=['mention'])
│ ├─ status_reblog()
│ └─ status_favourite()
└─ [Bluesky]
   ├─ notification.list_notifications()
   ├─ feed.repost()
   └─ feed.like()
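The mention flow is the same on both platforms, so it can be sketched independently of either client. The version below is only an illustration: the normalized notification shape (`type`, `post_id`) and the `already_handled` set are assumptions, and this page does not say how the real bot decides that a mention is new. The `boost` and `like` callables stand in for the platform calls named in the diagram.

```python
from typing import Callable, Iterable, List, Set


def boost_mentions(
    notifications: Iterable[dict],
    boost: Callable[[str], None],
    like: Callable[[str], None],
    already_handled: Set[str],
) -> List[str]:
    """Boost and like each mention not handled before; return the handled IDs."""
    handled = []
    for note in notifications:
        if note.get("type") != "mention":
            continue  # ignore follows, likes, and other notification types
        post_id = note["post_id"]
        if post_id in already_handled:
            continue  # skip mentions processed in an earlier pass
        boost(post_id)
        like(post_id)
        already_handled.add(post_id)
        handled.append(post_id)
    return handled
```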
RSS Data Collector (get_rss_data.py)
Scrapes community RSS feeds and regenerates the metadata JSON files used by the blog-post bot.
Flow: The bot fetches the community blog list from GitHub, parses each blog's RSS feed, and writes an updated metadata JSON file locally.
RSSData.get_rss_data()
│
├─ get_json_data() ← load existing metadata
├─ get_meta_data() ← fetch + parse RSS feeds
│ └─ extract_elements()
└─ write updated JSON ← metadata/{pyladies,rladies}_meta_data.json
This bot runs on a daily schedule and keeps the member list fresh.
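To illustrate the kind of extraction step involved, the sketch below pulls title, link, and publication date out of an RSS 2.0 document. It deliberately uses only the standard library (`xml.etree.ElementTree` and `email.utils`) so that it runs without extra packages; the project itself lists feedparser for this job, and the returned field names are assumptions.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Tiny made-up feed used only to exercise the function.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example Blog</title>
  <item>
    <title>First post</title>
    <link>https://example.com/first</link>
    <pubDate>Mon, 01 Jan 2024 09:00:00 +0000</pubDate>
  </item>
</channel></rss>"""


def extract_elements(rss_text: str) -> list:
    """Return one dict per <item> with its title, link, and parsed date."""
    root = ET.fromstring(rss_text)
    entries = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
            "published": parsedate_to_datetime(pub) if pub else None,
        })
    return entries


entries = extract_elements(SAMPLE_RSS)
```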
Directory Structure
The repository is organised as follows: bot source code lives in src/, persistent state in metadata/, documentation in docs/, and scheduled triggers in .github/workflows/.
.
├── src/
│   ├── config.py                        # Shared constants (tags, API URLs, ignored servers)
│   ├── promote_anniversaries.py         # Anniversary bot
│   ├── promote_blog_post.py             # Blog-post promotion bot
│   ├── boost_tags.py                    # Hashtag boost bot
│   ├── boost_mentions.py                # Mention boost bot
│   ├── get_rss_data.py                  # RSS metadata collector
│   ├── debug.py                         # Dry-run / testing helper
│   └── helper/
│       ├── login_mastodon.py            # Mastodon authentication
│       ├── login_bluesky.py             # Bluesky authentication
│       └── check_length_anniversary.py  # Character-limit validation
├── metadata/
│   ├── events.json                      # Women-in-tech profiles (anniversary bot)
│   ├── pyladies_meta_data.json          # PyLadies members + RSS feeds
│   ├── rladies_meta_data.json           # R-Ladies members + RSS feeds
│   └── *_counter_*.txt                  # Rotation state for blog-post bot
├── archive/                             # Audit copies of posted content
├── docs/                                # MkDocs documentation source
├── .github/workflows/                   # Scheduled GitHub Actions (one per bot × community)
├── pyproject.toml                       # Dependencies (pdm)
└── mkdocs.yml                           # Documentation site config
Scheduling
All bots are triggered by GitHub Actions cron schedules. The table below shows each workflow, its schedule, and the bot module it runs.
| Workflow | Schedule | Bot |
|---|---|---|
| pyladies_anniversaries.yml | Daily @ 11:00 UTC | Promote Anniversaries |
| rladies_anniversaries.yml | Daily @ 11:00 UTC | Promote Anniversaries |
| pyladies_promote_blog.yml | Every 2 days @ 07:00 UTC | Promote Blog Posts |
| rladies_promote_blog.yml | Every 2 days @ 07:00 UTC | Promote Blog Posts |
| pyladies_boost_tags.yml | Every 6 hours | Boost Tags |
| rladies_boost_tags.yml | Every 6 hours | Boost Tags |
| pyladies_boost_mentions.yml | Every 30 minutes | Boost Mentions |
| rladies_boost_mentions.yml | Every 30 minutes | Boost Mentions |
| pyladies_rss_feed.yml | Daily | RSS Data Collector |
| rladies_rss_feed.yml | Daily | RSS Data Collector |
Configuration & Authentication
Environment variables (stored as GitHub Secrets) drive all credentials and paths. The table below lists each variable and which bot module reads it.
| Variable | Used by |
|---|---|
| PLATFORM | All bots; value is "mastodon" or "bluesky" |
| USERNAME / PASSWORD | Bluesky login |
| ACCESS_TOKEN, CLIENT_ID, CLIENT_SECRET | Mastodon OAuth |
| GEMINI_API_KEY | Blog-post bot (AI summaries) |
| JSON_FILE, COUNTER | Blog-post bot (file paths) |
| IMAGES, ARCHIVE_DIRECTORY | Image and archive storage |
As an alternative to environment variables, each bot accepts a config_dict constructor argument. Local testing and the debug helper (debug.py) use this argument to supply settings directly.
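A lookup that prefers config_dict and falls back to the environment might look like the sketch below. `BotConfig` and its `get` method are hypothetical names used for illustration; the real bots may resolve their settings differently.

```python
import os
from typing import Mapping, Optional


class BotConfig:
    """Resolve settings from an optional dict first, then from the environment."""

    def __init__(self, config_dict: Optional[Mapping[str, str]] = None):
        self._overrides = dict(config_dict or {})

    def get(self, key: str) -> str:
        if key in self._overrides:
            return self._overrides[key]
        try:
            return os.environ[key]
        except KeyError:
            # Fail loudly so a missing secret is noticed at start-up.
            raise KeyError(f"Missing configuration value: {key}") from None


# Local testing: no environment variables needed.
local = BotConfig({"PLATFORM": "bluesky"})
```

With this shape, a workflow run would construct `BotConfig()` without arguments and rely on the GitHub Secrets exported as environment variables.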
Dry-Run Mode
Every bot respects a no_dry_run flag. When the flag is False (the default for local runs), the bot executes all of its logic but skips the actual API calls to the social media platforms. To post for real, set no_dry_run=True in production or pass --no-dry-run in the workflow step.
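The flag's behaviour can be sketched with argparse as follows. The function names and the `post_fn` callable are illustrative assumptions; only the `--no-dry-run` option and the `no_dry_run` flag come from this page.

```python
import argparse
from typing import Callable, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the command line; dry-run is the default unless --no-dry-run is given."""
    parser = argparse.ArgumentParser(description="Run a bot (sketch)")
    parser.add_argument(
        "--no-dry-run",
        action="store_true",
        help="actually call the platform API instead of only logging",
    )
    return parser.parse_args(argv)


def send_post(text: str, no_dry_run: bool, post_fn: Callable[[str], None]) -> bool:
    """Call post_fn only when no_dry_run is True; return whether a post was sent."""
    if not no_dry_run:
        print(f"[dry run] would post: {text}")
        return False
    post_fn(text)
    return True
```

Making the real post opt-in means a forgotten flag results in a harmless log line, not an accidental public post.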
Key Dependencies
The table below lists the third-party libraries the project depends on and what each is used for.
| Library | Purpose |
|---|---|
| atproto | Bluesky / AT Protocol client |
| Mastodon.py | Mastodon API client |
| feedparser | RSS feed parsing |
| beautifulsoup4 | HTML scraping |
| google-generativeai | Gemini API for post summaries |
| requests | HTTP downloads (images, feeds) |