You might already know the “cloud native” proxy Traefik from Kubernetes. Or not, since Kubernetes hides most of the configuration from you. But Traefik is really useful on its own and has a lot of functionality. I find two of its features especially useful for smaller servers or home servers: the configuration/service discovery and Traefik’s ability to obtain and renew Let’s Encrypt certificates.

Traefik Configuration Discovery

Traefik configuration discovery decouples the service configuration from the proxy configuration. Instead of editing a proxy config file every time you want to add a new webpage, you place the configuration at a known source and let Traefik pick it up. This allows for modular deployments: the services that Traefik serves do not have to be known beforehand, and adding one causes little to no downtime of the proxy.

Traefik can watch multiple sources for configurations of the services it should proxy. The service configurations can be read from a file, etcd, Redis and others. One particularly useful variant is the Docker configuration discovery, where Traefik reads the configuration from the labels of a running container.
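To give an idea of the file-based variant (which is not used in the rest of this post), here is a minimal sketch. The service name whoami, the domain whoami.example.com, the backend address and the file name whoami.yml are all made up for illustration. The first part goes into Traefik's static configuration, the second part is a file dropped into the watched directory.

```yaml
# Static configuration (traefik.yml): watch a directory for service configs
providers:
  file:
    directory: /etc/traefik/dynamic
    watch: true
---
# Dynamic configuration (hypothetical file /etc/traefik/dynamic/whoami.yml)
http:
  routers:
    whoami:
      # Hypothetical domain, replace with your own
      rule: "Host(`whoami.example.com`)"
      service: whoami
  services:
    whoami:
      loadBalancer:
        servers:
          # Hypothetical backend address
          - url: "http://10.0.0.5:8080"
```

With watch enabled, Traefik should pick up new or changed files in that directory without a restart.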

Automatic Let’s Encrypt Certificates with Traefik

Traefik is able to obtain Let’s Encrypt certificates for the domains of your services. When discovering a new configuration, for instance when a Docker container starts up, Traefik will check whether SSL is required and how the certificate shall be resolved. If the ACME resolver is selected, Traefik will obtain an SSL certificate for the domain the service will run on and also keep it up to date.
The domain in question has to point to the server that runs Traefik, of course.

The following example shows how to set up Traefik in a Docker container and discover services on the same Docker network, serving them with SSL certificates. I will reuse the jumphost VM created in my guide for Azure VM Deployment With Ansible.

Making the VM a Docker Host

The VM we created earlier is not yet capable of running Docker containers. We will add this functionality now, including a Docker network to which we will attach all our containers.
Additionally, we add our user to the docker group so we can manage the Docker containers without having to sudo every time.

roles/dockerhost/tasks/main.yml
---
- name: Add docker packages
  apt:
    name:
      - docker.io
      - python3-docker
    state: present

- name: Ensure docker internal network
  docker_network:
    name: internal
    appends: yes
    driver_options:
      # Makes debugging easier
      com.docker.network.bridge.name: docker-internal
    ipam_config:
      # We will pass the subnet from the playbook later on
      - subnet: "{{ network_subnet }}"

- name: Add azureadmin to docker group
  user:
    name: azureadmin
    groups: docker
    append: yes
...

In an earlier post I already highlighted how to name your Docker network and why.

Traefik Proxy Installation

Now we can install Traefik. We start it as a Docker container and expose ports 80 and 443. The ports are not hard-coded in the role. They are passed in from the playbook later, which makes the role more versatile.

Apart from starting the container, we just make sure that the needed directories and the acme.json file exist and that the config file is in place.

roles/traefik/tasks/main.yml
---
- name: Ensure traefik config directory
  file:
    name: "{{ item }}"
    state: directory
  loop:
    - "{{ traefik_config_directory }}"
    - "{{ traefik_config_directory }}/dynamic"

- name: Check acme.json
  stat:
    path: "{{ traefik_config_directory }}/acme.json"
  register: acme_json

- name: Ensure acme.json
  file:
    name: "{{ traefik_config_directory }}/acme.json"
    state: touch
    mode: 0600
  when: not acme_json.stat.exists

- name: Copy traefik config
  template:
    src: files/traefik.yml.j2
    dest: "{{ traefik_config_directory }}/traefik.yml"
  register: copy_traefik_config

- name: Bring up traefik
  docker_container:
    name: traefik
    image: traefik:v2.6.1
    # Restart Traefik when the config changed
    recreate: "{{ copy_traefik_config.changed }}"
    networks:
      # Our network from before
      - name: internal
    networks_cli_compatible: yes
    restart_policy: unless-stopped
    # We will pass these from the playbook
    published_ports: "{{ published_ports }}"
    volumes:
      - "{{ traefik_config_directory }}:/etc/traefik"
      - /var/run/docker.sock:/var/run/docker.sock
    labels:
      # Disable Traefik from serving its own container
      traefik.enable: "false"
...

The config file is relatively short. We only tell Traefik where to discover the configurations, define the entrypoints and configure the resolver for the Let’s Encrypt certificates.

An entrypoint is an IP/port combination on which Traefik will listen. We can later choose the names of these entrypoints in our service configuration and therefore specify where a service is available. The names have to be consistent in Traefik and the Docker container labels where they are referenced.

We will use two entrypoints: web for port 80 and websecure for port 443.
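As a rough sketch, this is what the entrypoints section of the Traefik configuration looks like once the two entrypoints are filled in. The commented-out redirection block is not part of my setup. It is an optional addition, available in Traefik 2.2 and later, that would send plain HTTP requests on to HTTPS.

```yaml
entryPoints:
  web:
    address: ":80"
    # Optional, not part of the setup in this post:
    # redirect everything arriving on port 80 to the websecure entrypoint
    # http:
    #   redirections:
    #     entryPoint:
    #       to: websecure
    #       scheme: https
  websecure:
    address: ":443"
```

The names web and websecure are free to choose, as long as the container labels reference exactly the same names.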

roles/traefik/files/traefik.yml.j2
---
# Improve privacy
global:
  sendAnonymousUsage: no
log:
  level: WARN
# Here we enable the Docker provider and tell Traefik to reach the
# containers over the docker network "internal" that we created earlier
providers:
  docker:
    network: internal
# We pass the entrypoints from the playbook
entryPoints:
{% for name, address in entrypoints.items() %}
  {{ name }}:
    address: "{{ address }}"
{% endfor %}

# Use the default storage to store SSL certificates
tls:
  stores:
    default: {}
# Use the ACME resolver and name it "letsEncryptResolver"
certificatesResolvers:
  letsEncryptResolver:
    acme:
      email: admin@example.com
      storage: /etc/traefik/acme.json
      httpChallenge:
        entryPoint: web
...
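While you are still experimenting, it can help to point the resolver at the Let’s Encrypt staging environment, so that repeated attempts do not run into the rate limits of the production servers. The following is a minimal sketch of the resolver with that change. The caServer line is an addition for illustration and not part of the configuration used in this post.

```yaml
certificatesResolvers:
  letsEncryptResolver:
    acme:
      email: admin@example.com
      storage: /etc/traefik/acme.json
      # Addition for testing only: use the Let's Encrypt staging CA.
      # Remove this line again to get certificates that browsers trust.
      caServer: https://acme-staging-v02.api.letsencrypt.org/directory
      httpChallenge:
        entryPoint: web
```

Certificates from the staging environment are not trusted by browsers. When switching back to production you will probably have to empty acme.json so that Traefik requests fresh certificates.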

In our variables file we just configure the location of the mount point for the traefik config.

roles/traefik/vars/main.yml
---
traefik_config_directory: /srv/data/traefik/config
...

Ok, done. Now we need a service to show off.

Grafana with Traefik Autodiscovery

Grafana will do nicely for our demonstration because it doesn’t need other services on our VM. Just like before we mainly ensure the necessary directories and copy over the config file.

The labels of the Grafana container are what configures Traefik. There, we specify the URL that Grafana shall be on, the port of the Grafana container to proxy, that we want to use the letsEncryptResolver from the Traefik configuration to obtain an SSL certificate and that we want to use compression on responses.

roles/grafana/tasks/main.yml
---
- name: Include dockerhost vars
  include_vars:
    dir: ../../dockerhost/vars

- name: Include grafana vars
  include_vars:
    dir: vars

- name: Ensure grafana directories
  file:
    name: "{{ item }}"
    state: directory
    owner: "{{ docker_data_user_name }}"
    group: "{{ docker_data_user_name }}"
  loop:
    - "{{ grafana_data_directory }}"
    - "{{ grafana_config_directory }}"

# We need these directories in later posts ;-)
- name: Ensure grafana config directories
  file:
    name: "{{ item }}"
    state: directory
    owner: "{{ docker_data_user_name }}"
    group: "{{ docker_data_user_name }}"
  loop:
    - "{{ grafana_config_directory }}/provisioning/access-control"
    - "{{ grafana_config_directory }}/provisioning/dashboards"
    - "{{ grafana_config_directory }}/provisioning/datasources"
    - "{{ grafana_config_directory }}/provisioning/notifiers"
    - "{{ grafana_config_directory }}/provisioning/plugins"

- name: Copy grafana config
  template:
    src: files/grafana.ini
    dest: "{{ grafana_config_directory }}/grafana.ini"
  register: copy_grafana_configuration

- name: Ensure grafana container
  docker_container:
    name: grafana
    image: grafana/grafana:8.2.6
    networks:
      # Our internal network again. Here Traefik will discover it
      - name: internal
    networks_cli_compatible: yes
    volumes:
      - "{{ grafana_data_directory }}:/var/lib/grafana"
      - "{{ grafana_config_directory }}:/etc/grafana"
    user: "{{ docker_data_uid }}:{{ docker_data_uid }}"
    restart_policy: unless-stopped
    restart: "{{ copy_grafana_configuration.changed }}"
    labels:
      # What URL Grafana shall run on
      traefik.http.routers.grafana.rule: "Host(`{{ ansible_host }}`) && PathPrefix(`/grafana`)"
      # Grafana doesn't like a prefix in requests, so we remove it
      traefik.http.middlewares.grafana-prefix.stripprefix.prefixes: "/grafana"
      # Compress responses to save bandwidth
      traefik.http.middlewares.grafana-compression.compress: "true"
      # We want port 443
      traefik.http.routers.grafana.entrypoints: "websecure"
      # And of course SSL
      traefik.http.routers.grafana.tls: "true"
      # This line tells Traefik to fetch a certificate using the ACME resolver
      traefik.http.routers.grafana.tls.certresolver: letsEncryptResolver
      # Here we just switch on the two middlewares configured above
      traefik.http.routers.grafana.middlewares: "grafana-prefix,grafana-compression"
      # This is where Traefik will send the requests to
      traefik.http.services.grafana.loadbalancer.server.port: "3000"
...

For Grafana it’s not enough to specify the URL in the container label. Grafana wants to know about it separately. Grafana is very picky when it comes to this URL. When it’s not spot on, Grafana won’t work properly.

We also set an admin password in our configuration. This service will be on the open web after all.

roles/grafana/files/grafana.ini
[server]
domain = {{ ansible_host }}
root_url = https://{{ ansible_host }}/grafana

[security]
admin_password = {{ grafana_admin_password }}

Now we specify the password variable alongside the config directories in the vars file. However, it only points to another variable in a special vars file that we call vault.yml. There we place the actual password.

roles/grafana/vars/main.yml
---
grafana_data_directory: /srv/data/grafana/data
grafana_config_directory: /srv/data/grafana/config
grafana_admin_password: "{{ vault_grafana_admin_password }}"
...

The vault file looks like this:

roles/grafana/vars/vault.yml
---
vault_grafana_admin_password: CHANGE_ME
...

The reason for storing the password in an extra file is that we will later encrypt the vault file, so nobody with access to the code can get the Grafana password. Still, we want to be able to locate the grafana_admin_password variable, and we cannot search in the encrypted vault file. Therefore we create the indirection that points to the vault file and makes it easy for us to find our variables.

You can encrypt the vault file like this:

ansible-vault encrypt roles/grafana/vars/vault.yml

Combining all Ansible Roles

Before we execute the playbook, we have to tell Ansible where it can find the VM. Be sure to reference the SSH key that you used when creating the jumphost VM in the Azure VM Deployment With Ansible post.

hosts.yml
---
all:
  hosts:
    jumphost:
      ansible_host: azuredemojumphost.northcentralus.cloudapp.azure.com
      ansible_port: 22
      ansible_user: azureadmin
      ansible_ssh_private_key_file: ~/.ssh/id_rsa_azure
  children:
...

The playbook then targets the jumphost and applies all the roles we created above. Note that here we also pass the variables we referenced earlier: the docker network subnet, the Traefik published ports and entrypoints.

jumphost.yml
---
- name: Provision jumphost
  hosts: jumphost
  become: yes
  roles:
    - role: dockerhost
      vars:
        # Any private IP range that is not yet in use is ok
        # (172.16.0.0 to 172.31.255.255 is reserved for private networks)
        network_subnet: 172.20.0.0/16
    - role: traefik
      vars:
        published_ports:
          - "80:80"
          - "443:443"
        entrypoints:
          web: ":80"
          websecure: ":443"
    - grafana
...

Now we can execute the playbook. If you have already encrypted your vault file, use the --ask-vault-pass parameter.

ansible-playbook --ask-vault-pass jumphost.yml

Grafana should now be accessible under azuredemojumphost.northcentralus.cloudapp.azure.com (you should pick your own URL).
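Adding another service follows the same pattern: attach the container to the internal network and describe the routing in its labels. Here is a minimal sketch of such a task, assuming the small traefik/whoami demo image and a /whoami path that I picked for illustration. Neither is part of the setup above.

```yaml
---
# Hypothetical second service, e.g. in roles/whoami/tasks/main.yml
- name: Ensure whoami container
  docker_container:
    name: whoami
    image: traefik/whoami
    networks:
      # Same network as Traefik, so the container can be reached
      - name: internal
    networks_cli_compatible: yes
    restart_policy: unless-stopped
    labels:
      # Same host as Grafana, but a different path (assumption)
      traefik.http.routers.whoami.rule: "Host(`{{ ansible_host }}`) && PathPrefix(`/whoami`)"
      traefik.http.routers.whoami.entrypoints: "websecure"
      traefik.http.routers.whoami.tls: "true"
      traefik.http.routers.whoami.tls.certresolver: letsEncryptResolver
      # The whoami image listens on port 80 inside the container
      traefik.http.services.whoami.loadbalancer.server.port: "80"
...
```

No change to the Traefik role or its config file is needed for this. Traefik should discover the container from its labels as soon as it starts.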