December, 2025

Building a Custom Static Site Generator

2025-12-24 | Tags: python, engineering, ssg, retrospective

Project Intent

The objective of this project was to develop a lightweight, semi-automated blogging engine. Rather than relying on pre-built CMS solutions (WordPress, Ghost) or complex frameworks (Next.js, Hugo), I chose to build a custom Static Site Generator (SSG) in Python. The goal was to maintain absolute control over the data structure and minimize dependencies while adhering to the Unix philosophy of modularity.

Initial Architecture

The architecture was designed as a "Regenerative" model. In this system, the output (HTML) is treated as disposable. On every build, the dist directory is wiped, and the entire site is reconstructed from the input (Markdown files).
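As a rough illustration of the "disposable output" idea, the wipe step could look something like the sketch below. The function name reset_output_dir is hypothetical and not taken from the actual builder.py; the only assumption is that the output lives in a single directory that is safe to delete.

```python
import shutil
from pathlib import Path


def reset_output_dir(dist: Path) -> None:
    """Remove any previous build output, then recreate an empty directory.

    Hypothetical helper: the real script may structure this differently.
    """
    if dist.exists():
        shutil.rmtree(dist)  # discard everything left over from the last build
    dist.mkdir(parents=True)  # start the new build from an empty directory
```

Because nothing in the output directory survives a build, a deleted or renamed Markdown file can never leave a stale HTML page behind.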

The Technology Stack

I selected a minimal Python stack to ensure maintainability and ease of deployment:

  • Core Logic: Python 3.13 (Standard Library for file I/O and Regex).
  • Content Layer: Markdown with YAML Frontmatter (via python-frontmatter). This separates metadata (dates, tags) from the prose.
  • Presentation Layer: Jinja2. A templating engine that decouples the HTML structure from the Python logic.
  • Styling: Pico.css (Classless). A semantic CSS framework that styles plain HTML elements and renders responsively across screen sizes, without class-name clutter. The resulting HTML is easy to restyle to match an existing website, or to pair with a different CSS framework.

The Build Pipeline

The resulting builder.py script executes a linear pipeline:

  1. Sanitization: The distribution folder is cleaned to prevent artifact accumulation.
  2. Passthrough: Static assets (CSS, images) are copied verbatim.
  3. Ingestion: The script walks the content/YYYY-MM directories, parsing filenames via Regex to extract slugs and timestamps.
  4. Rendering: Markdown is converted to HTML, injected into Jinja2 templates, and written to the output directory.
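To make the ingestion step more concrete, here is a minimal sketch of filename parsing with the standard library. The exact naming scheme is not stated above, so the pattern YYYY-MM-DD-slug.md is an assumption, and PostRef, parse_post_name, and discover_posts are hypothetical names rather than the ones used in builder.py.

```python
import re
from datetime import date
from pathlib import Path
from typing import List, NamedTuple, Optional

# Assumed filename convention: YYYY-MM-DD-some-slug.md
POST_NAME = re.compile(r"^(\d{4})-(\d{2})-(\d{2})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")
# Month folders as described in the post: content/YYYY-MM
MONTH_DIR = re.compile(r"^\d{4}-\d{2}$")


class PostRef(NamedTuple):
    published: date
    slug: str
    path: Path


def parse_post_name(path: Path) -> Optional[PostRef]:
    """Extract date and slug from a filename; return None if it does not fit."""
    match = POST_NAME.match(path.name)
    if match is None:
        return None
    year, month, day, slug = match.groups()
    try:
        published = date(int(year), int(month), int(day))
    except ValueError:  # digits matched, but not a real calendar date
        return None
    return PostRef(published, slug, path)


def discover_posts(content: Path) -> List[PostRef]:
    """Collect posts from content/YYYY-MM folders, skipping anything else."""
    posts = []
    for month_dir in sorted(content.iterdir()):
        if not (month_dir.is_dir() and MONTH_DIR.match(month_dir.name)):
            continue
        for md_file in sorted(month_dir.glob("*.md")):
            ref = parse_post_name(md_file)
            if ref is not None:
                posts.append(ref)
    return posts
```

Returning None for non-matching names, instead of raising, lets the caller decide whether a stray file is a warning or a build failure.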

Conclusion

I am also testing collaboration with AI, using my most up-to-date set of system instructions (XML, JSON) with highly structured prompts. Going from first concept to this first version took around 7 hours, most of it spent defining the criteria. I was aiming for a skeletal back-end based on best practice, with solid error handling and a "headless" architecture for easy integration into common front-end tools (Obsidian, etc.).

The LLM did most of the programming, while I provided the logic and criteria and evaluated and debugged the code. The LLM even wrote all of this blog entry besides the conclusion, as a summary of the process; I just edited. This definitely feels like a proof of concept in terms of my own efficiency gain! For reference, I started using a professional LLM around 2 weeks ago, and have otherwise never used any besides the AI mode on Google's search page and OpenEvidence for quick review when I'm working as a nurse.

Next steps: VPS integration, Obsidian front-end integration, Docker, and reading up on Jinja2, as my needs have always been more along the BeautifulSoup4 line. Consider refactoring to support multiple posts in a single day, or adding an error-handling protocol that handles this situation gracefully. Add custom styling and navigation elements for my personal blog, though this is less important.
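For the same-day case, one possible starting point is simply to detect the situation before rendering. The sketch below is hypothetical and assumes each post is available as a (date, slug) pair; find_same_day_posts is an illustrative name, not part of the current script.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def find_same_day_posts(posts: Iterable[Tuple[date, str]]) -> Dict[date, List[str]]:
    """Return only the dates that carry more than one post slug."""
    by_day = defaultdict(list)
    for published, slug in posts:
        by_day[published].append(slug)
    # Keep just the days where a collision could occur
    return {day: slugs for day, slugs in by_day.items() if len(slugs) > 1}
```

The build could then either stop with a clear message listing the affected dates, or use the result to decide which posts need a disambiguated output path.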