Azure Scale Set Monitoring With Prometheus and Grafana
When running more and more machines it becomes impractical to check on each of them by logging in and going through the numbers yourself. This is especially true for a variable number of machines like in cloud scale sets.
So what can we do? Prometheus is a popular solution to collect and store metrics from your machines. You can then browse them either via its included web interface or third party apps like Grafana.
In this post we will look at a practical example of metric collection with Prometheus on Microsoft Azure scale sets. I assume that you already have an Azure deployment set up. If not, check out my post on Microsoft Azure VM deployment.
We will run Prometheus in a docker container on a jumphost VM utilizing the also present Traefik. I got a post about how to set up Traefik with Ansible on your jumphost if you need it. Prometheus will then fetch the metrics from a small exporter app on each of the Azure scale set VMs. Finally, we display the data with Grafana that also runs in a container on the jumphost.