Self-Hosting

The concept

Self-hosting means hosting things yourself instead of running them under someone else’s management. Sometimes this means provisioning your very own VM from a cloud service; other times it means having your own physical device, like an old Raspberry Pi in a closet.

Either way, you can then choose to host services to be used only for yourself (e.g. your own media server, NAS, IoT device hub, dashboard for the air humidity in your attic, etc.) or even to be exposed to other people through the internet (e.g. a website, live dashboard of the ISS urine tank levels, your own fediverse Mastodon node, or any other crazy idea you may have).

The why

The reasons why you may do this are pretty varied and everyone will have their own story here. I do feel it generally falls into a few broad categories:

1. You are a big nerd

Admittedly this is more of a prerequisite than a reason. It usually takes quite some technical know-how to set up, host and maintain your own services, at least compared to filling in a registration form for someone else’s app. That said, the most popular self-hosted services tend to be about as user-friendly as they can be.

Setting up these services for yourself can teach you a lot about how the internet and networking in general work, which is satisfying in itself, and may make your life easier if you work in IT :)

2. You are a control freak

A very satisfying aspect of self-hosting is that you are in complete control of what is running and how it is configured. It is your device after all. Doubly so if the software you are running is open-source (it generally is).

To illustrate the point: If Netflix suddenly removes your favorite movie, doubles your monthly fee, or messes with the recommendation algorithm in a way you dislike, your choice is to cancel your subscription or just accept it. If you self-host your media, that movie will always be stored on your server, there is no fee, and you may fully customize the recommendation system.

For example, while I run this blog on my Raspberry Pi, I have a separate, slightly heavier machine that is inaccessible from the internet and acts as my NAS and media hub. (I got this one refurbished for cheap because it didn’t have the specs to run Windows 11, so businesses were getting rid of it. Thank you Microsoft!)

3. You got a philosophy

The fact you have full control can also be put to use ‘fixing’ some more issues people have with popular services running on someone else’s servers: The amount of telemetry and subsequent re-selling of that data is enormous. Self-hosting allows you to severely restrict or remove that.

Another aspect I’m partial to is the thought that too much centralization is not always a good thing. Software can be seen as an ecosystem where single points of failure can breed fragility. Monopolies and walled gardens are seldom good in the long term, so sometimes it’s good to do your own thing when you can.

The goal

My personal reasons were some combination of the above, and having time between Christmas and New Year’s to actually set it all up. My main goal was to be able to self-host the website you’re reading now. Added benefits were having some tools to use in my internal network, and a solid basis in case I ever want to share more things on the internet.

The idea was to take control of hosting this simple static website as far as is reasonable. This means I have to rely on someone else for my domain registration and DNS, but I don’t want to host anything on cloud servers. I had an old Raspberry Pi 2 lying around in a box for years, so this was the perfect occasion to give it a new life, given that its specs should be good enough and its power consumption is low.

I also wanted it to run by itself as much as possible without needing many manual actions from me, as I mostly don’t feel like babying a server in my evenings/weekends. It’s directly exposed to the internet, which is always a risk. I don’t want it to be a big deal if it gets compromised, so nothing sensitive will be on it or accessible from it.

The set-up

Below is a more technical overview of how I set up the website hosting.

TL;DR:

Reverse-proxy & file server: Caddy
Dynamic DNS update, monitoring, server updates: CRON-triggered Bash scripts
Docker container updates: Watchtower
Alerting: Gotify
IP Bans: Fail2Ban
Privacy-friendly visitor stats: Goatcounter

Fig: My current server set-up

The Handshake a.k.a. DNS configuration

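The DNS configuration makes sure that when someone opens your website in their browser, the request ends up at your server. Simplified, the flow is: the visitor’s browser asks a DNS server for the IP address behind your domain name, then sends the request to that address, where your router passes it on to your server (visitor laptop -> DNS server -> your router -> your server).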

The setup for me consisted of two parts:

Dynamic DNS: My ISP changes my IP address every few weeks. I use a CRON job to curl my DNS provider so the record is updated automatically.
Port Forwarding: I told my router to forward all port 80/443 traffic directly to my Pi’s static internal IP.
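
As a rough illustration of the dynamic DNS step (the set-up described here uses Bash and curl; this is a Python sketch of the same idea, not the actual script): a scheduled job compares the current public IP with the last one it reported, and only calls the provider when they differ. The endpoint URL and the parameter names `hostname`, `ip` and `token` are made up for the example, since every DNS provider defines its own update API.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical update endpoint; every DNS provider defines its own URL and parameters.
UPDATE_ENDPOINT = "https://dns.example.com/update"


def needs_update(current_ip: str, last_known_ip: Optional[str]) -> bool:
    """Return True when the public IP differs from the one last sent to the provider."""
    return current_ip.strip() != (last_known_ip or "").strip()


def build_update_url(hostname: str, ip: str, token: str) -> str:
    """Build the URL the scheduled job would request (parameter names are assumptions)."""
    query = urlencode({"hostname": hostname, "ip": ip, "token": token})
    return f"{UPDATE_ENDPOINT}?{query}"
```

Skipping the call when nothing changed keeps the job cheap enough to run every few minutes.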

Reverse-proxy

A reverse proxy takes care of the next step in the chain: it receives the incoming requests on your server and forwards each one to the service that should handle it. If you have more services running internally, only the reverse proxy needs to be directly exposed to the internet.

I’m using Caddy as a reverse proxy, which handles logging, HTTPS certificates out of the box, and supports rate limiting. Alternatively, Nginx is a very popular option as well.
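
To make the routing idea concrete, here is a minimal Python sketch of what a reverse proxy decides for each request: look at the hostname the visitor asked for and pick the internal target. This is not how Caddy is implemented or configured; the hostnames, the port and the `ROUTES` table are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical routing table: public hostname -> internal target for that service.
ROUTES: Dict[str, str] = {
    "blog.example.com": "static-files",
    "push.example.com": "127.0.0.1:8080",
}


def pick_upstream(host_header: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the internal target for a request, or None if the host is unknown."""
    host = host_header.split(":", 1)[0].strip().lower()  # drop an optional ":port"
    return routes.get(host)
```

Requests for hostnames that are not in the table get no target at all, which is what keeps internal-only services unreachable from outside.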

File server

Having a static website has the big advantage that no content needs to be dynamically generated when a user request comes in. It’s just a matter of grabbing the right files from the server filesystem and sending them along. This is not only much faster, but also avoids some of the security concerns, updates and bugs that you may have when running a PHP or Python server.

Caddy and Nginx can both act as a file server, so no additional software component is needed for this (another benefit of static websites). You just have to make sure your website files are in a folder on the server.
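
A small Python sketch of what "grabbing the right files" involves, under the assumption of a single site folder with `index.html` files: map the URL path to a file, and refuse anything that tries to climb out of that folder. Caddy and Nginx do this (and much more) for you; the function name is made up, and percent-decoding of URLs is left out for brevity.

```python
from pathlib import Path
from typing import Optional


def resolve_static_file(site_root: Path, url_path: str) -> Optional[Path]:
    """Map a URL path to a file under site_root, or None if there is no such file."""
    root = site_root.resolve()
    relative = url_path.split("?", 1)[0].lstrip("/")  # ignore any query string
    candidate = (root / relative).resolve()
    if candidate.is_dir():
        candidate = candidate / "index.html"
    # Refuse anything that escaped the site root, e.g. via "../" segments.
    if root != candidate and root not in candidate.parents:
        return None
    return candidate if candidate.is_file() else None
```

A `None` result corresponds to answering with a 404.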

In principle by now you have everything you need to successfully host and serve your static website. The only thing the coming sections describe is how to get the peace of mind that everything is running as expected without having to check it yourself.

Logging and Monitoring

The basis of it all is writing a log entry for each request, so you can act on the logs later. Caddy does this for me.

Initially I set up a Loki stack to ingest the Caddy logs, and used Prometheus to store extensive metrics on the server, so I could display dashboards with Grafana. This resulted in a crazy amount of metrics and graphs, which were reassuring to (be able to) see when I just set the whole system up. Also it was gorgeous. However this was obvious overkill, and I never actually looked at them much anyways beyond the very basics.

So relatively quickly I removed the entire Loki stack, and replaced it with a Bash script running on a CRON schedule, looking only at some essentials:

Requests/sec
CPU Temp
RAM usage
Memory usage

I like this a lot better.
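
For a feel of how little code such checks need, here is a Python sketch of a few of them (the script described above is Bash; this is only an illustration). It assumes the usual Linux sources: the CPU temperature in millidegrees from `/sys/class/thermal/thermal_zone0/temp`, and memory figures from `/proc/meminfo`. The function names are made up.

```python
from typing import Dict


def cpu_temp_celsius(raw: str) -> float:
    """Convert the millidegree value from the sysfs thermal zone file to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def memory_used_percent(meminfo: str) -> float:
    """Compute used memory from /proc/meminfo text via MemTotal and MemAvailable."""
    fields: Dict[str, int] = {}
    for line in meminfo.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])  # values are in kB
    total, available = fields["MemTotal"], fields["MemAvailable"]
    return 100.0 * (total - available) / total


def requests_per_second(request_count: int, window_seconds: float) -> float:
    """Average request rate over the window the scheduled job looks at."""
    return request_count / window_seconds if window_seconds > 0 else 0.0
```

A scheduled job would read those two files, count the log lines written since its previous run, and raise an alert when a value crosses a threshold of your choosing.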

All this is to make sure the server keeps running successfully. To know if people visit the website, and which pages are most visited, I use Goatcounter, which is a privacy-friendly basic web analytics service. This can be self-hosted as well, though I use the hosted service.

Alerting

Whenever something is wrong, I would like to be alerted without having to log in to my server, so I need something to send me a message (my domain name was so cheap it didn’t come with an e-mail address, otherwise sending an email from it would be the simplest option). For this I use Gotify, a simple server I run on my machine (behind Caddy), that also has a phone app which can send me push messages.
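
Sending such a message boils down to one HTTP request. The Python sketch below posts to Gotify’s `/message` endpoint with an application token in the `X-Gotify-Key` header; the base URL and token are placeholders, and the helper names are made up for the example.

```python
import json
import urllib.request

# Placeholder values: the real base URL and app token come from your own Gotify instance.
GOTIFY_URL = "https://push.example.com"
APP_TOKEN = "replace-me"


def build_alert(title: str, message: str, priority: int = 5) -> urllib.request.Request:
    """Prepare a POST to Gotify's /message endpoint, authenticated with an app token."""
    body = json.dumps({"title": title, "message": message, "priority": priority})
    return urllib.request.Request(
        f"{GOTIFY_URL}/message",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Gotify-Key": APP_TOKEN},
        method="POST",
    )


def send_alert(title: str, message: str, priority: int = 5) -> int:
    """Send the alert and return the HTTP status code."""
    with urllib.request.urlopen(build_alert(title, message, priority), timeout=10) as resp:
        return resp.status
```

A monitoring script could then call something like `send_alert("Pi is hot", "CPU at 81 C", priority=8)` when a check fails.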

Banning the baddies

One last thing to think about: As soon as you open ports on your router, a bunch of people/bots are going to go try all kinds of stuff on your system. A majority of the traffic is actually bots trying to exploit known security vulnerabilities in popular software. (The number of requests trying to access the nonexistent WordPress admin console is crazy.)

More “visitors” I wasn’t prepared for are the LLM-bots that are scraping the entire internet hoovering up as much data as possible, sometimes multiple times a day. A robots.txt file may help to dissuade some of them.
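
To check what a given robots.txt actually tells a crawler, Python’s standard library ships a parser. In this sketch the rules block one example crawler (`GPTBot`) and allow everyone else; which bots you list is up to you, and only well-behaved crawlers will honour the file at all.

```python
from urllib.robotparser import RobotFileParser

# Example rules; "GPTBot" is one crawler name, the list you want may differ.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return whether a crawler that respects robots.txt may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```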

Layered Defense

To keep my Pi safe these are some of the things I use:

Fail2Ban: A popular software package which can be configured to monitor your traffic and ban IP addresses based on number of (404’d) requests, location or user-agent. This helps against the worst traffic, though the bots send requests from so many different IPs that it will never completely stop them.
Watchtower: Because so much of the bot traffic targets known vulnerabilities, keeping your software updated is important. Linux packages can be updated on a CRON schedule automatically, and I use Watchtower to do the same for my Docker containers (which are running Caddy and Gotify).
VLAN Isolation: Another step you should take is isolating your server on the home network. This is more of a precaution so, in case someone manages to take control of your server, they cannot use that to spread to other devices in your network. How to do this depends a bit on the router that you have and what it lets you do in terms of isolating the server device on the network, but it is good to take a look at. In my case, the Raspberry Pi has no way of connecting to the media server or any other machine in my network.
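
The core idea behind banning on 404s can be sketched in a few lines of Python. This is not Fail2Ban’s implementation or configuration, just the counting logic: remember recent 404s per IP and flag an address once it exceeds a threshold within a time window. It assumes JSON access logs in chronological order with the fields `ts`, `status` and `request.remote_ip`, as Caddy’s JSON logs typically provide; the thresholds are arbitrary.

```python
import json
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Set


def find_offenders(log_lines: Iterable[str], max_404s: int = 10,
                   window_seconds: float = 600.0) -> Set[str]:
    """Return IPs with more than max_404s 404 responses inside a sliding time window."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    offenders: Set[str] = set()
    for line in log_lines:
        try:
            entry = json.loads(line)
            status = entry["status"]
            ip = entry["request"]["remote_ip"]
            ts = float(entry["ts"])
        except (ValueError, KeyError, TypeError):
            continue  # skip lines that are not access-log entries
        if status != 404:
            continue
        hits = recent[ip]
        hits.append(ts)
        # Drop hits that fell out of the time window.
        while hits and ts - hits[0] > window_seconds:
            hits.popleft()
        if len(hits) > max_404s:
            offenders.add(ip)
    return offenders
```

The returned addresses are the ones a tool like Fail2Ban would then block at the firewall for some time.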

What’s next

Having a set-up like this gives a great base from which to play with other projects. If it’s something just for my personal home lab I leave it on the media server, isolated from the web (ideas include IoT humidity sensors for the garden, pH meters for my ferments or a private Forgejo server). If I want to share it with everyone it’s just a matter of deploying its Dockerfile to the Pi, and adding a few lines to my Caddy config to forward a certain path/subdomain to it.

The Pi has been working quite well so far. Every now and then something minor happens and I have to make sure it’s all running smoothly again (at least my alerting setup works great!), but I’m probably doing better than GitHub on the uptime, so I’ll take that as a win.

In any case I hope this post demystifies a bit what is needed to fully own your own corner of the internet. Let me know what you (want to) self-host!
