How this Website is Put Together

It's about time I documented how this website (and server) works. Maybe it's my own bias, but I'm quite happy with this setup and think it is valuable to share.

The website serves static HTML, is easy to update, has a relatively low hosting cost ($5/month), is as fast as I need it to be, and also provides a number of handy services.

In addition to hosting this website, my server is configured to:

  • host a few other static websites
  • host several personal git repositories
  • send SMS messages
  • send + receive email
  • run my ActivityPub instance

There's nothing too fancy on board, but it does take a small amount of maintenance. I've been running this setup for years, and it has held up well.

This website itself

This website is put together via a few custom scripts which turn markdown files (with some metadata) into HTML, plus Atom and RSS feeds. Nothing about the build script is particularly interesting, but if you're interested in seeing it, just ask; I'd be happy to share.
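
If you want a sense of the shape of it, the whole thing boils down to something like this (a sketch only: my scripts are custom, the file layout here is hypothetical, and pandoc merely stands in for my own markdown-to-HTML step):

#!/bin/sh
# sketch: render each markdown post (plus metadata) into an HTML page
for post in posts/*.md; do
  out="www/$(basename "$post" .md).html"
  pandoc --standalone -f markdown -t html "$post" -o "$out"
done
# a separate pass over the post metadata emits the Atom and RSS feeds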

One notable thing to call out is that all global CSS, JavaScript, and web font files on this site are inlined into the HTML payload itself.

You may be asking yourself: why would you do this?

Well, you don't really need much CSS or JavaScript to make a website look decent. However, I do use a web font for section headings: Roboto Slab Light.

Normally when using web fonts, you have two choices: show a fallback font while loading, or hide text while loading. This site takes a different approach: it inlines the web fonts as data: URLs in the CSS, which is itself inlined in the HTML document. This means fonts display immediately, with no additional network requests. Neat!
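
Concretely, each page's inline stylesheet carries an @font-face rule along these lines (the base64 payload, generated ahead of time with something like base64 -w0 on the .woff2 file, is truncated here):

@font-face {
  font-family: "Roboto Slab";
  font-style: normal;
  font-weight: 300;
  /* the entire font file is base64-encoded directly into the stylesheet */
  src: url("data:font/woff2;base64,d09GMgABAAAA...") format("woff2");
}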

You may be thinking, "but you're sending down the font files for each page, isn't that inefficient?"

Well, yes. However, many web fonts are surprisingly small in comparison to other web assets. Case in point: currently, my home page is 21.5 KB (compressed) over the wire, including an embedded web font and all the CSS and JavaScript needed for this website. That's a lot smaller than most images on the web.

Heck, as of 2024-04-30, google.com pulls in ~795 KB uncached and ~36.7 KB cached.

Serving the web pages

This is all running on nginx with a pretty minimal configuration. One thing I periodically do is ensure that the TLS/SSL configuration is robust. There are a few tools to help:

  • Mozilla provides a very handy SSL configuration helper, which helps ensure that your server is configured to use the most appropriate TLS/SSL protocols, ciphers, and configurations: https://ssl-config.mozilla.org/
  • Qualys provides a similarly handy tool to "grade" your server configuration. If you did everything right with the above configuration, you should have an A or A+ rating: https://www.ssllabs.com/ssltest/index.html

For certificates, I use Let's Encrypt, which since its introduction in 2016 has managed to make getting certificates both free and easy. The power of non-profits!

Mostly because I don't trust the automated process, I renew the certificates manually every 3 months. However! It's incredibly easy. Here are the steps:

  1. I ssh into the host and run sudo certbot renew
  2. ... that's it!

Well, that's not entirely true. I actually end up copying the new certificates off the host onto my laptop to put them in my configuration management toolchain, but I'll talk about that later.

DNS

The Internet is a pay-to-play forum, so if you want a domain, you've got to buy one. I went with Namecheap.

It's fine, I have no complaints. I manually edit the DNS entries on their website, which could probably be automated if I wanted to (they do have an API).

Notably, Namecheap only lets you manage a subset of DNS record types: A, AAAA, ALIAS, CAA, CNAME, NS, SRV, TXT, and MX. These fit all of my needs, but if for some reason you need something more, use a different provider (or use NS records to point to your own nameservers).

I specifically have:

  • A set up for abstract.properties and (at the time of this writing) host-0004.abstract.properties pointing to the same IP address.
  • CNAME set up to point git.abstract.properties to abstract.properties
  • MX for receiving email
  • TXT for email SPF, DKIM, and random things like various site verifications
  • CAA for verifying my TLS issuer (Let's Encrypt) matches the certificate sent over https://

Here's what the main ones currently are (as of 2024-04-30).

~ $ dig +short abstract.properties a
159.89.121.44
~ $ dig +short git.abstract.properties cname
abstract.properties.
~ $ dig +short abstract.properties mx
10 host-0004.abstract.properties.
~ $ dig +short abstract.properties txt
"v=spf1 a:host-0004.%{d} -all"
"google-site-verification=drdyZbCygHk9pqJkp3iGK5FM0KxldV8OnJo51JLvTXc"
~ $ dig +short abstract.properties caa
0 issue "letsencrypt.org"

Hosting

And if you have domains pointing to an IP address... well, you'll need either a stable IP address from your ISP and a reliable electrical grid, or a hosting provider to do the hard work of keeping a computer running 24/7 for you.

My hosting provider is DigitalOcean, and since my needs are light, I pay $5/month for the smallest "droplet" host, which is a "tiny" virtual host:

  • 1 CPU
  • 1 GB of RAM
  • 25 GB of storage (fast, SSD)
  • A stable IP address
  • A PTR DNS record pointing that IP address to the name of your choosing! (NOTE: this is important for email)

This is not a beefy machine. However, if you're serving mostly static web pages off disk, it doesn't matter. Running ab locally, I measured 422.76 requests per second on this hardware, which, to be honest, is far more traffic than this website will ever see. I'm fairly certain it'd survive a hug of death.
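
For reference, the benchmark invocation was something like the following; -n is the total number of requests and -c is the concurrency (the exact numbers I used are lost to history):

~ $ ab -n 10000 -c 100 http://127.0.0.1/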

All in all, DigitalOcean is great. They've got both cheap options as well as most of the bells and whistles you need if you're doing something more serious. And they're not Amazon, which is a plus.

HTTPS

This website is configured to always use https://. TLS certificates are obtained via Let's Encrypt, which I adore. Prior to their launch, getting even a paid TLS certificate was a big pain in the butt. Now it's so easy, I'm honestly suspicious that I'm doing it wrong.

In order to tell the world that I am using TLS certificates from Let's Encrypt, I have a CAA DNS record set to issue "letsencrypt.org" per their recommendations.

Of course, https:// is only useful if people are directed to it, so I have configured nginx to redirect via HTTP 301 and to send a Strict-Transport-Security header telling clients that, from then on, they should only use https://.
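
In nginx terms, that amounts to roughly this (a sketch; listen/server_name details trimmed):

# port 80: permanently redirect everything to https://
server {
    listen 80;
    listen [::]:80;
    server_name abstract.properties;
    return 301 https://$host$request_uri;
}

# and inside the port-443 server block:
add_header Strict-Transport-Security "max-age=63072000" always;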

I mentioned these before, but will repeat them here since they're very important:

Mozilla provides configuration snippets for ensuring that your SSL configuration is secure. If you're running a web server and you're not sure which TLS protocols and ciphers to use, just visit https://ssl-config.mozilla.org/ and use their recommendations.
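
As of this writing, their "intermediate" profile for nginx looks roughly like this (from memory and trimmed; use the generator for the authoritative, current version):

ssl_protocols TLSv1.2 TLSv1.3;
# the full ECDHE+AES-GCM/CHACHA20 cipher list is longer; take it from the generator
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-CHACHA20-POLY1305;
ssl_prefer_server_ciphers off;
ssl_session_timeout 1d;
ssl_session_cache shared:SSL:10m;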

To manually verify that things are working as expected, I use Qualys' free SSL Labs SSL Server Test, which at the time of this writing gives this site a solid A score. Not an A+, but I'll take it.

Configuration Management

So we've got a web server running on some host that is hooked up to DNS... but what's the host actually configured to do?

Well, for the first few years of running this server, I just used stock Ubuntu on the host and managed everything... manually. It worked for a few years!

But as with everything in life, if you don't keep things documented & organized and don't have a clear record of what you've done & what you need to do, you're gonna have a hard time maintaining quality.

There are a lot of different configuration management products out there. Heck, at my first job I helped build and maintain a home-grown configuration management system. I've personally used CFEngine, Chef, and Ansible. While I can't say any of the three are great, I settled on Ansible.

Mostly this was because I felt CFEngine was very awkward to maintain, and Chef requires running in a client-server mode that seems like overkill for managing a single host.

Ansible itself is pretty straightforward. You keep an "inventory" of hosts and define "roles" which your hosts may participate in. Each of these roles has associated tasks that allow you to do things like install (templated) files, set up users/groups, and trigger actions when the task identifies it has performed a change.

It comes with a big batteries-included set of modules that make most things very easy.
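
A typical role task looks something like this (file paths and names here are hypothetical, but the template module and the notify/handler mechanism are stock Ansible):

# roles/web/tasks/main.yml (hypothetical)
- name: install nginx site configuration
  template:
    src: site.conf.j2
    dest: /etc/nginx/sites-available/abstract.properties.conf
  notify: reload nginx

# roles/web/handlers/main.yml (hypothetical)
- name: reload nginx
  service:
    name: nginx
    state: reloaded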

The main complaints I have are minor:

  • It's astonishingly slow (doing a no-op update on my single host takes 1 minute and 47 seconds)
  • Tasks that involve looping over sets of data (say, installing a number of templated files in named subdirectories) are awkward and involve "calling" out to a separate sub-task, which must live in a different YAML file
  • Too much YAML

Keeping secrets: s/encrypt and s/decrypt

One thing you need to consider when dealing with software configuration is how to store secrets. For my own site infrastructure, this is not particularly exciting, but I put together a surprisingly simple, surprisingly effective solution.

I made two scripts, s/encrypt and s/decrypt: fairly straightforward Python scripts that take a passphrase and encrypt/decrypt the specified files using openssl enc.

Specifically, I use these two commands to encrypt and decrypt files:

  • encrypt: openssl enc -aes-256-cbc -pbkdf2 -salt -in $FILE -out $FILE.aes
  • decrypt: openssl enc -aes-256-cbc -pbkdf2 -salt -d -in $FILE.aes -out $FILE
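
In spirit, s/encrypt is little more than the following (a sketch, not the real script; mine has more guard rails):

#!/usr/bin/env python3
# sketch of s/encrypt: read a passphrase once, then encrypt each named file with openssl enc
import getpass
import subprocess
import sys

passphrase = getpass.getpass("passphrase: ")
for path in sys.argv[1:]:
    subprocess.run(
        ["openssl", "enc", "-aes-256-cbc", "-pbkdf2", "-salt",
         "-pass", "stdin", "-in", path, "-out", path + ".aes"],
        input=(passphrase + "\n").encode(),
        check=True,
    )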

Prior to running ansible-playbook to perform any updates on my host, I decrypt all the encrypted files on disk. Once the ansible-playbook run is complete, I delete the decrypted files. Any time I need to change the contents of the encrypted files, I decrypt them, modify the contents, and then encrypt them again.

Because only the encrypted versions are stored in git alongside the rest of the ansible configuration and associated files, I don't need to worry about accidentally leaking information that should not be leaked.

To avoid accidentally checking the decrypted files into git, I add them to .gitignore, decrypt immediately prior to running ansible, and delete the decrypted files immediately after with git clean -i.

By the way, git clean -i is by far my favorite thing about git. It has such a good user experience!

SSH and remote access

I run SSH on a non-standard port (159, ensuring it's not firewalled by UFW), configure it to disallow password authentication, and rotate my SSH keys whenever I rotate my TLS certificates.

This is maybe tin-foil-hat paranoia, but it avoids all the SSH port scanner/vulnerability scripts out there.

One thing to note about non-standard SSH ports: to reduce the likelihood of a non-root user gaining access to the system, the port should be less than 1024. Ports below 1024 can only be bound by root, so if sshd ever stops running, a non-root user can't grab that same port and replace the daemon with a malicious proxy.
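
For the record, the relevant sshd_config lines are just (standard OpenSSH directives):

# /etc/ssh/sshd_config: the two settings described above
Port 159
PasswordAuthentication no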

Unattended upgrades

Since I'm not actively monitoring this host, I need it to automatically upgrade, so I have unattended-upgrades installed to take care of that.
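
On Ubuntu this amounts to installing the package and making sure the periodic apt settings are on (these paths and keys are standard Debian/Ubuntu):

~ $ sudo apt install unattended-upgrades

# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";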

Alerting (SMS)

I wrote a very simple shell script which:

  1. Stores standard input in a temporary content-addressable location served by this host.
  2. Sends an SMS, via Twilio, from a number I control to my phone, containing a portion of the input and a URL to the complete content.
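
A sketch of that script is below. The paths, variable names, and alert URL are hypothetical, but the Twilio Messages API call is the standard one:

#!/bin/sh
# read the alert from stdin, store it content-addressed, then text myself a summary + link
body=$(cat)
hash=$(printf '%s' "$body" | sha256sum | cut -c1-16)
printf '%s\n' "$body" > "/var/www/alerts/$hash.txt"
curl -s "https://api.twilio.com/2010-04-01/Accounts/$TWILIO_ACCOUNT_SID/Messages.json" \
  -u "$TWILIO_ACCOUNT_SID:$TWILIO_AUTH_TOKEN" \
  --data-urlencode "From=$TWILIO_FROM" \
  --data-urlencode "To=$MY_PHONE" \
  --data-urlencode "Body=$(printf '%.100s' "$body") https://abstract.properties/alerts/$hash.txt"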

This gives me very easy alerting / notification when certain events happen on my host. I have a few periodic cron scripts which will alert me if I'm running low on disk space or other notable events occur.

This costs $0.0075 per SMS, which with my current $28.80 trial balance means I have 3,840 free alert notifications directly to my phone. Surprisingly easy and cheap self-monitoring. And once that trial balance runs out, I'll happily send them another $20 for this service.

Note: this is all self-monitoring. If an asteroid were to hit the data center, I would not be notified about it. Oh well, I don't feel like spending another $5/month for a geographically distant host to monitor the main one.

git

I'm trying to wean myself off relying on GitHub. Since I work on a few different computers, it's helpful to have a remote git server for hosting. And since git has a protocol that works over SSH, I use this host as the origin for my repositories.

In addition to SSH, you may have noticed that I've linked to files served by git.abstract.properties! To make this work, I run gitweb behind nginx via FastCGI. Handy whenever I need to link to a specific file.

One thing to note is that gitweb is a pretty complex, giant Perl script. At the time of this writing, Ubuntu's latest version on Bionic has patched all of the known CVEs, but this is probably the shakiest thing I have running. Please don't hack me.

Backups

Because I'm hosting the "authoritative" copy of my git repositories on this host, I need to have a reliable backup solution. For this, I use Tarsnap.

It's a bit of a pain to set up, but it's overall very inexpensive (as of 2024-04-30, my daily storage cost is $0.004188688352335824; yes, Tarsnap does its daily accounting in picodollars, rounding to the benefit of customers) and very secure.

Because all of the repositories I'm hosting are personal projects, I don't really care about backing up their full history. So I wrote a weekly backup script which archives only the most recent commit at each repository's HEAD, in a way that can leverage Tarsnap's compression and de-duplication efficiencies. This isn't ideal; I wish there were a way to tell git to use a more backup-friendly layout for its internal files.
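
The gist of that weekly script (a sketch; repository paths and archive naming are hypothetical):

#!/bin/sh
# snapshot the tree at HEAD of each bare repo into a stable path, then archive with tarsnap
for repo in /srv/git/*.git; do
  name=$(basename "$repo" .git)
  mkdir -p "/tmp/git-backup/$name"
  git --git-dir="$repo" archive HEAD | tar -x -C "/tmp/git-backup/$name"
done
# stable paths + plain files let tarsnap's dedup see unchanged content across weeks
tarsnap -c -f "git-$(date +%Y-%m-%d)" /tmp/git-backup
rm -rf /tmp/git-backup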

Email

Email is significantly more difficult to set up and get right. There are several moving parts, all of which need to be set up and configured correctly for you to be able to send and receive email from others.

I started writing about how I've set up email, and ended up writing more than I've already written for everything else combined. So... I'm going to not publish that right now. Please let me know if you'd like to read this.

ActivityPub

I don't use Mastodon. However, I follow many people who do and people there are able to follow and interact with the posts I write. This is the power of federation, which is powered by ActivityPub.

When I wanted to set up my own ActivityPub server, I was looking for lightweight servers. That is, servers which could be run on a host with 1 GB of RAM that also serves several static websites. I ended up finding Pleroma, which bills itself as "a lightweight fediverse server" and decided to give it a shot.

It has a dependency on PostgreSQL, but I was able to follow the installation instructions and everything seems to work just fine.

To be honest, I'm not sure I'd recommend it. I've been running it for less than a year, I have a daily cron job set up to restart it because sometimes it gets OOM-killed, and I'm not too confident in its security.

That's it!

And... that's about everything that's used in putting this website together. If for some reason you're still reading this and want to poke into the code involved, feel free to take a look at the ansible setup I use to manage this host: https://git.abstract.properties/?p=fleet-management.git;a=summary

And if you happen to stumble on anything that is Bad™ in there (e.g. leaked secrets or insecure configuration), please contact me directly via email or a direct message on the Fediverse™: @sufian@apub.abstract.properties. If you find something bad in there and are in the NYC area, I'll treat you to a coffee.