13 min read

Why I Wrote my own Static Site Generator

Embracing sustainability

Will Carhart

A good first try

The version of willcarh.art that you are blissfully enjoying today is actually v2. It is (in my humble opinion) a more refined version of the website. However, most of this refinement is probably unbeknownst to you. This is the story of how in trying to refactor my website I ended up writing a Static Site Generator (SSG) from scratch.

The initial version of willcarh.art was a full-stack Django + PostgreSQL monolith deployed on Heroku. By the culminating iteration of willcarh.art v1 the project had become a frankensteined hodgepodge of templates, classes, and database manipulations. Each of the three Django sub-applications had static files in its respective <app>/static/ folder, but also referenced static files in other apps' static/ folders, as well as a global static/ folder for the whole site. The front-end for one sub-application symlinked to a completely different project structure, because I wrote the front-end first and thought it would be easier than translating it into a true Django-esque project structure. Rendered files were the product of a nested tangle of templates, which meant an instance of a subtemplate could not reference a supertemplate's minutiae, and the supertemplates quickly became overly complex trying to maintain a quasi-global namespace of configurations for each page that implemented one.

There were a few nice-to-have features to show off my web-development chops (dark mode, card filtering, etc.) that simply could not be implemented under the current project organization. What started as a refactor quickly erupted into a conflagration of legacy code colliding with new, jerry-rigged features.


Growing pains

As the site grew in content it quickly became difficult to maintain. One week a seemingly simple schema update would break the database layer, the next I would struggle to style simple DOM elements due to conflicting CSS files. Most of this was due to my error, not the services I was using. However, I also was not trying to maintain the website as a full-time product; I was hoping to keep it as a touch-and-go resource that I could keep updated as necessary. I knew that the current approach was not sustainable, and so I eventually broke down my observations into four chief complaints about the v1 implementation.


Difficult to maintain

The simple task of publishing a new blog post involved copying and pasting HTML from previous blog posts. There was no abstraction between content and code; they were one and the same. This probably explains why I wrote plenty of blog posts but only published a few.


Difficult to expand

In the original version of willcarh.art, I had a section on my homepage for career experience. I had space for exactly 3 positions, because when I first wrote the site I had only had 3 positions. Upon moving jobs, it became apparent that adding space for a 4th position would require a major refactor of the entire homepage. It was the opposite of modularity - everything was loaded into one index.html file.


Difficult to deploy

The website, a web app, was deployed via Heroku. I architected a simple website of a blog, project portfolio, and homepage into a full-stack Django application because, well, it's all I knew how to do at the time. When I wanted to add another metadata field to a project or blog post, such as an Updated Date in addition to the Published Date, I'd have to migrate the entire database. This became cumbersome for a website with fewer than 10 pages of text.

In addition, Heroku came with its own quirks. I had a hard time setting up the site's naked domain: www.willcarh.art worked consistently but willcarh.art did not. I thought this was due to my own DNS naiveté, but apparently I'm not the first one to have this problem. Heroku also advertises a free tier, but the free dynos go to sleep after 30 minutes, meaning if your website is not frequently accessed it can take more than 20 seconds for it to load. There are workarounds, but none of them are free.

As a result, the overly complicated web app masquerading as a static site was costing me $7/month and was not the one-click deploy solution I wanted.


Difficult to fix

With v1 I took on tech debt willingly in order to get the site published initially. However, the quick fixes I had implemented to get things up and running came back to bite me. In addition, there were few notes and little documentation to remind me why I made the decisions I did. This complaint is not something unique to willcarh.art, but rather a practice on which I needed to improve going forward.


Immediate legacy code

I feel like a popular tech buzzword that gets thrown around is legacy code. It's used in terms of previous projects, acquired companies, and COBOL codebases that apparently still power 95% of all ATM swipes. For me, however, code becomes "legacy" after about 20 minutes. Maybe I simply have a poor memory, or perhaps I am not good at multitasking. Maybe I can blame it on "too many meetings." Regardless, almost as soon as I've written a piece of code, I've forgotten how it works. And when I say how it "works," I don't mean how the algorithm operates or why the code works syntactically, I mean I forget the reasoning behind why I made the decisions I did. Why does div.col-md-6.no-shift have left:1rem!important;? I don't know, but definitely don't touch it.

In v1 of willcarh.art, I wrote a custom email handler for the website's contact form. I wrote a Python integration to interact with Gmail and send emails from an automated account. I even wrote a blog post about it. It seemed pretty robust and was working well when the website was first deployed. However, when Google updated a small flag in the Gmail API, the tool became completely unusable. It was not unusable because of the API change; it was unusable because I couldn't remember how I had set it up, and thus it never became functional again (side note: this is why you might have noticed a lack of contact form on the current site).

This was the downfall of the original version of willcarh.art. It was unmaintainable not because of the tools I was or wasn't using, but because it was too complex, undocumented, and unfamiliar to me. It was not sustainable.


A burgeoning idea

Between my bouts of software frustration with v1 of willcarh.art, I worked on a few other projects. One was docs and one was wanda. I won't go into detail here, but essentially both of these projects are statically generated sites. Docs is a statically generated site built using Docsify and is home to my software tool documentation. Wanda is another static site I built where I also wrote the generator. I created a virtual Christmas card and realized how repetitive some of the process was, and eventually wrote a prototype Python script to do it for me, which evolved into a simple SSG. Unknown to me at the time, this practice of using code to generate HTML from templates is called Static Site Generation, and I am very much not the first person to come up with it. Once I learned about SSGs, I realized how over-engineered willcarh.art v1 was and how it desperately needed to be a statically generated site.


The benefits of a static site

For the uninitiated, a static site is a collection of static files that make up a webpage. This is most commonly a collection of HTML, CSS, and JavaScript files, but that is not a strict requirement. This differs from a more orthodox web application, where there is usually a server-side application that creates webpages based on dynamic content (e.g. database queries). While static sites are not a one-size-fits-all, they are perfect for willcarh.art.


Let's reinvent the wheel

I feel like my immediate reaction to spaghetti code is to rewrite the whole thing from scratch, even if debugging the actual issue would take a fraction of the time. I would venture to guess that I'm not alone in that sentiment. But, before I write my own SSG, it's good to look at the current spectrum of tools already out there. What would my custom SSG do that an off-the-shelf tool would not? Well, it's not that willcarh.art couldn't be generated with Jekyll, Hugo, or <your SSG goes here>. In fact, you'd probably be able to replicate the vast majority of this site with WordPress.

But, I'm a software engineer, dammit. I'd say the motivation for writing my own SSG is as much pride and curiosity as it is "calculated" necessity. I make all the decisions, I make all the mistakes. When someone asks "how did you make your website?" I don't want the answer to be "I bought a theme for $20 and put some words in a text file." If I'm already going to learn a new tool to statically generate my website, why don't I make the tool too? I'm being hyperbolic, but the sentiment holds true. Somewhere in the thought process it comes down to a trade-off between ownership and luxury. Do I want the luxury of my own website, or do I want the pride in knowing that I wrote my own website? I usually want the latter, but it's not always sensible. My custom SSG will need to resolve all four chief complaints of willcarh.art v1 for it to overcome an off-the-shelf equivalent.


A bespoke solution

After ~150 commits, willcarh.art v2 is just about complete. The site is generated from content in markdown files and utilizes a custom templating system to reuse components wherever possible. Let's take a look at those complaints from earlier.


Fixing maintainability with markdown

One of the biggest issues with v1 was the lack of abstraction between content and code. In v2, all of the content for the site is sourced from markdown files in content/. This means that the markdown files are human-readable on GitHub and easy to keep updated. Adding a new position is as simple as adding another entry to content/experience.md.
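To make the content/code split concrete, here's a minimal Python sketch of reading entries out of a markdown file. It is not the site's actual generator: the entry format (a "## " heading per position followed by "key: value" lines), the function name parse_entries, and the sample data are all assumptions made for illustration.

```python
def parse_entries(markdown_text):
    """Split markdown text into entries, one per '## ' heading (assumed format)."""
    entries = []
    current = None
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            # A new heading starts a new entry.
            current = {"title": line[3:].strip()}
            entries.append(current)
        elif current is not None and ":" in line:
            # 'key: value' lines attach metadata to the current entry.
            key, value = line.split(":", 1)
            current[key.strip().lower()] = value.strip()
    return entries


# Stand-in for the contents of a file like content/experience.md.
EXPERIENCE_MD = """\
## Software Engineer
company: Example Corp
years: 2019-2021

## Intern
company: Another Example
years: 2018
"""

positions = parse_entries(EXPERIENCE_MD)
for position in positions:
    print(position["title"], "at", position["company"])
```

With a layout like this, a fourth position is just another heading and a couple of lines in the markdown file; the parsing code doesn't change.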

Heyo!

The markdown generation for willcarh.art became so powerful that I spun it out into its own separate project, marq.



Fixing expansion with a static site

The static site is much simpler than its predecessor. There's no database and little code outside the generator. In addition, the generator builds a resulting src/ folder of static HTML files, so it's easy to track down issues when they arise.


Fixing deployment with Netlify

Another gripe with v1 was the complicated deployment practice for Heroku. With v2, I moved to Netlify, which is to the JAMStack as Docker is to containerization. The entire site deployment is a Git-based workflow. When the repository is updated, the site is redeployed. Netlify has other nice features too, like Netlify Functions (think: AWS Lambda) and Branch Deploys (for work-in-progress). Although Netlify is not the only company to offer these services for static sites, they were one of the first. And, best of all, it's free. It was a sweet sensation tearing down my $7/month hobby dyno from Heroku.


Fixing fixing with better documentation

The last complaint about v1 was the overflowing tech debt. v2's generator is fully featured and (hopefully) as future-proof as it can be. In addition, there are verbose comments throughout the codebase with an increased emphasis on documentation. Class names are more descriptive, and generation errors are explanatory.


A powerful result

This is the part where I boast about the code I've written. willcarh.art's generator uses a custom Handlebars-esque templating language for linking static files, generating computed values, and building HTML. It reads from template HTML files, which reference content via the templating language, to build all of the site's static files.

Here's an example. Take a look at the experience tabs on the About page. The generator reads from the template templates/about.html, which specifies {{html:exp-tabs}}. This prompts the generator to read in the content from content/experience.md and populate the experience tabs. As a result, content updates do not require code updates and are still tracked in version control. This method is used for all of the project cards, blog posts, and vault rows throughout the site. Even these inline links are built using the templating system, so if the source files ever change I don't need to update every blog post that references them.
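The tag expansion described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the generator's real code: the tag grammar is guessed from the single {{html:exp-tabs}} example, and expand_template, render_exp_tabs, and the renderer table are hypothetical names.

```python
import re

# Matches tags shaped like {{kind:name}}, e.g. {{html:exp-tabs}} (assumed grammar).
TAG_PATTERN = re.compile(r"\{\{(\w+):([\w-]+)\}\}")


def render_exp_tabs(entries):
    """Hypothetical renderer: turn experience entries into tab markup."""
    return "".join(f'<div class="tab">{entry}</div>' for entry in entries)


def expand_template(template, renderers):
    """Replace each {{kind:name}} tag with the output of its registered renderer."""

    def substitute(match):
        key = (match.group(1), match.group(2))
        if key not in renderers:
            # Fail loudly so a typo in a template cannot ship silently.
            raise ValueError(f"unknown template tag: {match.group(0)}")
        return renderers[key]()

    return TAG_PATTERN.sub(substitute, template)


# In a real build this list would be read from content/experience.md.
experience = ["Company A", "Company B"]
renderers = {("html", "exp-tabs"): lambda: render_exp_tabs(experience)}

page = expand_template("<section>{{html:exp-tabs}}</section>", renderers)
print(page)
```

The template only names what it wants; the renderer decides how the content becomes HTML. That separation is what lets a content change skip the code entirely.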

The templating also supports dynamically generated content and statically linked files. Content is loaded into JavaScript files that get shipped with the site. Static files are referenced using the templating language so there are no broken links.
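One way to get the "no broken links" property is to resolve every static reference at build time and refuse to build if a file is missing. Here's a rough Python sketch of that idea. The {{static:...}} tag syntax, the /static prefix, and the function name link_static_files are assumptions for illustration, not the site's real syntax.

```python
import re
from pathlib import PurePosixPath

# Hypothetical tag syntax for static references, e.g. {{static:css/main.css}}.
STATIC_TAG = re.compile(r"\{\{static:([^}]+)\}\}")


def link_static_files(html, known_files, prefix="/static"):
    """Rewrite static tags to URLs, raising if any referenced file is unknown."""
    missing = []

    def substitute(match):
        relative = match.group(1).strip()
        if relative not in known_files:
            # Collect every missing file so one build reports them all.
            missing.append(relative)
            return match.group(0)
        return str(PurePosixPath(prefix) / relative)

    linked = STATIC_TAG.sub(substitute, html)
    if missing:
        raise FileNotFoundError(f"missing static files: {', '.join(missing)}")
    return linked


# In a real build this set would come from walking the static folder on disk.
known = {"css/main.css", "js/dark-mode.js"}
print(link_static_files('<link href="{{static:css/main.css}}">', known))
```

Because the check runs during generation, a renamed or deleted file stops the build instead of surfacing later as a 404.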

However, one of the biggest achievements of the generator is its markdown capabilities. I wrote the markdown to HTML converter from scratch, and it supports all the features of GitHub-flavored markdown, as well as some other additions (inline YouTube videos, comments, and more). The markdown converter has become so powerful that I will likely make it a separate project in the future. You can check out the markdown converter's code here.
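For a feel of what a from-scratch converter involves, here's a deliberately tiny Python sketch. It handles only headings, paragraphs, inline code, bold, italics, and links, so it is nowhere near GitHub-flavored markdown and is not the converter described above. The names convert_inline and convert_markdown are made up for this example.

```python
import html
import re

# Inline rules are applied in order; bold must run before italics.
INLINE_RULES = [
    (re.compile(r"`([^`]+)`"), r"<code>\1</code>"),
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
    (re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)"), r'<a href="\2">\1</a>'),
]


def convert_inline(text):
    """Escape raw HTML characters, then apply the inline rules."""
    text = html.escape(text, quote=False)
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text


def convert_markdown(source):
    """Convert blank-line-separated blocks into headings or paragraphs."""
    blocks = []
    for chunk in re.split(r"\n\s*\n", source.strip()):
        heading = re.fullmatch(r"(#{1,6})\s+(.*)", chunk)
        if heading:
            level = len(heading.group(1))
            body = convert_inline(heading.group(2).strip())
            blocks.append(f"<h{level}>{body}</h{level}>")
        else:
            # Join soft-wrapped lines into a single paragraph.
            text = " ".join(line.strip() for line in chunk.splitlines())
            blocks.append(f"<p>{convert_inline(text)}</p>")
    return "\n".join(blocks)


print(convert_markdown("# Hello\n\nSome **bold** and *italic* text."))
```

A known simplification: markdown syntax inside a code span is still processed by the later rules. Handling that properly means tokenizing the input rather than chaining regular expressions, and it's a good example of how quickly the edge cases pile up.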

There are also a number of fun easter eggs and accessory features throughout the site. For instance, try hovering your mouse over the published date and read time near the top of this blog post. The dark mode toggle in the top right uses the LocalStorage API to remember your preference. Analytics are provided by Plausible, a privacy-conscious alternative to Google Analytics. Images are hosted from a CDN to optimize page loads.


So...was it worth it?

Relevant XKCD about automation

https://xkcd.com/1319

Overall, I'd say yes! Did the SSG take more time than an off-the-shelf equivalent? Probably. Did the SSG take more time than its non-static equivalent? Nope. Did I learn a bunch along the way? You betcha.

Overall, I feel that willcarh.art's SSG is as future-proof as possible. It has plenty of rich features that turn my code and opinions into a snazzy-lookin' webpage. It was a ton of fun to write and it made me think through parts of the networking stack that I haven't needed to in the past. While it's not a replacement for Hugo or Jekyll, it sure gives them a run for their money without a hit to my software engineering pride.


Conclusion

Looking back on v1, I think it's really easy to point out my flaws in authoring the site. You could say, "Will, you being bad at configuring DNS is no fault of Heroku, you can't bash their product just because you're incompetent," and you'd be right. However, it was my perceived incompetence that led me to easier-to-use solutions for my use case, like Netlify and the statically generated site. The journey is always a learning experience and part of organic growth is the failure, and that's OK. There's no one way to make a website, and there's a plethora of powerful software tools to do it these days. This is what I found worked best for me, and it might not work best for you. Hugo, Jekyll, Docsify, and others are great SSGs. However, if you yearn for control and want to get into the nuts and bolts of an SSG, I can't recommend enough writing your own. You'll have an intimate connection with your website and feel like a true software engineer.

If you're interested, here's a link to the old site. I can't promise it'll be deployed forever, but you can always find its source code on my GitHub. In addition, you can find the code for the current site here. If you're interested in similar projects I've written, take a look at wanda.


🦉

Artwork by jenteva


©  Will Carhart