Setting up a basic nomad cluster

Create your very own cluster ready to scale out
Published: 2023-07-17T23:42Z

Running containers is basically my bread and butter these days. While at work I have to use kubernetes, outside of work and at previous employers I prefer using nomad for workload orchestration.

Following the steps described in this post should yield you a functional cluster with load-balancing, name-based routing and the ability to add plenty of plugins for things like auto-scaling your services.

Why not kubernetes

Don’t get me wrong, kubernetes is fine, especially with the whole ecosystem that has grown around it, but properly managing it in a production environment easily takes a whole team just to ensure no downtime during cluster upgrades and the like.

There are simply too many moving parts in kubernetes and its plugins for a single person to understand the whole platform, let alone recover it in a disaster scenario.

While things get a bit better when using a package manager like helm, it obscures even further how the components of an application fit together, and it doesn’t help much when you’re writing your own software.

Quick references

Some terms that nomad uses might be a bit confusing. Here’s a quick list of what those terms mean, even though I’ll stick with my own terms.

Theirs   Mine      Description
node     node      Machine participating in the cluster
server   manager   Instance of nomad that orchestrates workload
client   worker    Instance of nomad that executes workload

Prerequisites

We need a single machine with the following attributes: a static (public) IP address, a Linux installation (I’ll be using Void Linux and its xbps package manager in the examples), and docker available for running containers.

While technically you can work with a dynamic address on a home server as well, that’ll require setting up a DDNS service which we’ll not cover in this post.

In this post, I’ll be assuming your machine is running within a 172.16.0.0/24 subnet with the DHCP range from .100 through .200, and that your public IP address is 1.2.3.4. This means the addresses ending below .100 are free to use within the local network. Adjust as applicable to your environment.

Setting up the network

Because our machine(s) will fulfill a couple of roles, and any machine in the cluster may go down, we’ll start by setting up floating IP addresses.

For the floating IP addresses, we’ll use keepalived. It’s routing software, but we’re only interested in its VRRP implementation.

xbps-install -S keepalived

For the 3 main roles we’ll have in our cluster, we’ll define 3 virtual routers using keepalived on the internal network, roughly as follows. Keep in mind that your network interface and network range may differ.

# /etc/keepalived/keepalived.conf

# Internal IP address for the loadbalancer roles
vrrp_instance VI_LOADBALANCER {
  state BACKUP
  interface eth0
  virtual_router_id 10
  priority 100
  advert_int 1
  virtual_ipaddress {
    172.16.0.10/24
  }
  authentication {
    auth_type PASS
    auth_pass changeme-lb
  }
}

# Internal IP address for manager nodes
vrrp_instance VI_MANAGEMENT {
  state BACKUP
  interface eth0
  virtual_router_id 20
  priority 100
  advert_int 1
  virtual_ipaddress {
    172.16.0.20/24
  }
  authentication {
    auth_type PASS
    auth_pass changeme-management
  }
}

# Internal IP address for worker nodes
vrrp_instance VI_WORKER {
  state BACKUP
  interface eth0
  virtual_router_id 80
  priority 100
  advert_int 1
  virtual_ipaddress {
    172.16.0.80/24
  }
  authentication {
    auth_type PASS
    auth_pass changeme-worker
  }
}

After (re)starting the service, you should be able to see the IP addresses registered on the interface of your machine using ip addr.
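
On Void Linux, which I’m assuming based on the xbps-install commands, the (re)start and the check could look like this; the runit service path is the packaging default:

# Enable and (re)start keepalived under Void's runit
ln -s /etc/sv/keepalived /var/service/
sv restart keepalived

# All 3 floating addresses should now be listed on the interface
ip addr show dev eth0 | grep '172.16.0.'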

Setting up the service mesh

Now the more modern fun starts: getting a service mesh up-and-running. For keeping track of what’s in the cluster, we’ll use consul. Aside from tracking which services are running, consul will also bootstrap the nomad cluster’s connections, so we don’t have to write that information twice.

To install consul on the management nodes in our cluster (all 1 of them), run the following command:

xbps-install -S consul

The configuration of consul is quite simple as we only have a single machine, which also implies a single datacenter.

Don’t forget to pick a name for your datacenter. Because I’m located in Western Europe, I’ll simply pick eu-west-1. You don’t really need the ui to be enabled, but it may help with debugging.

# /etc/consul.d/consul.hcl

datacenter  = "eu-west-1"
data_dir    = "/opt/consul"
bind_addr   = "0.0.0.0"
client_addr = "0.0.0.0"
server      = true

ui_config {
  enabled = true
}

bootstrap_expect = 1
advertise_addr   = "172.16.0.20"
retry_join       = ["172.16.0.20"]
retry_join_wan   = []

Yes, consul will listen on all IP addresses due to the 0.0.0.0 settings there. This way we don’t have to restart consul when this machine takes over the floating IP, and consul won’t crash trying to bind to a floating IP it doesn’t currently own. The firewall between the machine and the internet should prevent unauthorized access to consul’s ports.
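
As a quick sanity check, you can list the TCP ports consul is bound to using ss from iproute2:

# Consul's HTTP (8500), DNS (8600), serf (8301/8302) and server RPC (8300)
# ports should all show up bound to 0.0.0.0
ss -ltnp | grep consul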

Note that the “retry_join” field is set to the manager IP we defined during the keepalived section. If you add another consul server instance to the cluster, this will allow them to connect and work together.

After placing that configuration and (re)starting the consul service on your machine, you should be able to connect to it with a browser on port 8500 if you’re in the same network. If you’re not in the same network, you can test this by setting up a port forward using SSH.
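
Such a forward might look like the following, with user and 1.2.3.4 standing in for your own login and public IP:

# Forward local port 8500 to consul, then browse to http://localhost:8500
ssh -L 8500:172.16.0.20:8500 user@1.2.3.4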

We’ll not be walking through setting up authentication here, so make sure to only host services in your environment which you trust.

Setting up the workload orchestrator

Setting up nomad is roughly as involved as setting up consul, but instead of letting nomad handle its own clustering, we’ll point it at consul to connect the machines together.
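
As with consul, install nomad from the package repository first, assuming your distribution packages it like Void does:

xbps-install -S nomad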

Here we’ll start to differentiate a bit more between manager, worker and loadbalancer nodes, as we’ll need metadata to distinguish between the roles.

Let’s start with the configuration present on all nodes that’ll run nomad which will connect the nomad machines together:

# /etc/nomad.d/common.hcl

datacenter = "eu-west-1"
data_dir   = "/opt/nomad"
bind_addr  = "0.0.0.0"

consul {
  address = "localhost:8500"
  server_service_name = "nomad"
  client_service_name = "nomad-client"
  auto_advertise = true
  server_auto_join = true
  client_auto_join = true
}

On the manager machines, you’ll include the following section to make that machine act as a manager. Note that bootstrap_expect is set to 1 here because we’re running only 1 machine. Preferably you would run 3 or 5 manager machines in your cluster and set bootstrap_expect to 3 or 5 respectively for added resilience.

# /etc/nomad.d/server.hcl

server {
  enabled          = true
  bootstrap_expect = 1
}

And finally, on the worker machines you’ll add the following piece of configuration to enable it to perform work.

# /etc/nomad.d/client.hcl

client {
  enabled = true
}

plugin "docker" {
  config {
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}

These 3 configuration files can live next to each other on the machines that need to perform those roles, so you can simply copy them across machines as needed.
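
Copying them over could be as simple as the following, with node2 being a hypothetical second machine:

# Share the common nomad configuration with another node
scp /etc/nomad.d/common.hcl root@node2:/etc/nomad.d/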

However, since we’ll assign roles through node metadata, as we do below, the client configuration becomes node-specific and can no longer be shared verbatim between machines.

# /etc/nomad.d/client.hcl

client {
  enabled = true

  meta {
    roles = "loadbalancer,manager,worker"
  }

}

// <docker block>

Remove or add roles based on your intended use-case, but for this post we’ll use those 3.

After a (re)start of the nomad service on your machine, visiting port 4646 with a browser should yield the nomad ui, indicating a distinct lack of services on your freshly created cluster.

You can double-check consul’s web ui to see whether nomad registered itself as a service in the mesh, verifying that the connection is set up properly.
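
The same can be verified from the command line with the consul and nomad binaries:

# Consul should list this machine as a server member
consul members

# Nomad should show 1 server and 1 ready client node
nomad server members
nomad node status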

Setting up load balancing

At this point, your cluster is already running, but there’s no easy way to reach any services you’d run on it. To fix that, we’ll set up the fabio loadbalancer.

While you could technically use caddy as well, which would let you set up automatic SSL certificates through Let’s Encrypt, that’d require some templating with the data present in consul which we’ll not cover in this post.

The easiest way to get fabio running is through docker, as we’ve already enabled that on the loadbalancer machine. You can use the following nomad job specification to get started, but remember to replace the relevant values with what works in your environment:

# /opt/services/fabio.hcl

job "fabio" {
  datacenters = ["eu-west-1"]
  type        = "system"

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  constraint {
    operator  = "set_contains"
    attribute = "${meta.roles}"
    value     = "loadbalancer"
  }

  group "fabio" {

    network {
      port "ui" {}
      port "lb" {
        static = 80
        to     = 9999
      }
    }

    task "fabio" {
      driver = "docker"

      template {
        destination = "local/fabio.properties"
        data = <<-EOH
        proxy.addr           = :{{ env "NOMAD_PORT_lb" }}
        ui.addr              = :{{ env "NOMAD_PORT_ui" }}
        registry.consul.addr = 172.16.0.20:8500
        EOH
      }

      config {
        image        = "fabiolb/fabio:latest"
        ports        = ["ui","lb"]
        network_mode = "bridge"
        volumes      = [
          "local/fabio.properties:/etc/fabio/fabio.properties"
        ]
      }

      resources {
        cpu    = 300
        memory = 128
      }

      service {
        name = "fabio"
        tags = ["urlprefix-fabio.lan/"]
        port = "ui"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "3s"
        }
      }
    }
  }
}

Launch this job by running nomad run /opt/services/fabio.hcl. This starts the load-balancer service, which’ll handle proxying http requests to the correct containers within the cluster.

Keep in mind that the healthcheck is performed by the consul server(s), not by fabio. Fabio simply reads the services listed by consul and routes requests.

Note the network_mode = "bridge" in there, which allows the container to make outgoing connections to the network; that’s not allowed by default. So if your own services need to connect outwards, they’ll need that network mode as well.

The “fabio.lan” in the service tag configures fabio to proxy that hostname to the port you indicate in the service.
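
To verify the whole chain, check the job’s status and send a request with that hostname to the loadbalancer’s floating IP:

# The fabio allocation should be running on the loadbalancer node
nomad job status fabio

# Fabio should route this request to its own ui based on the urlprefix- tag
curl -H 'Host: fabio.lan' http://172.16.0.10/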

(optional) Hosting an actual service

The cluster is functional at this moment, but it’s a bit barren in terms of services. To give you an example of how to host a service, here’s a job file you can use to run a bare nginx server:

# /opt/services/nginx.hcl

job "fabio" {
  datacenters = ["eu-west-1"]
  type        = "system"

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  constraint {
    operator  = "set_contains"
    attribute = "${meta.roles}"
    value     = "worker"
  }

  group "fabio" {

    network {
      port "http" { to = 80 }
    }

    task "fabio" {
      driver = "docker"

      # Writing an env file
      template {
        destination = "local/env"
        env         = true
        data        = <<-EOH
        NGINX_PORT = {{ env "NOMAD_PORT_http" }}
        EOH
      }

      # Or simply defining env directly
      env {
        FOO = "bar"
      }

      # Which container to run and which port to map
      config {
        image = "nginx:latest"
        ports = ["http"]
      }

      # The resources this task is allowed to use
      resources {
        cpu    = 300
        memory = 128
      }

      # And register a named service
      service {
        name = "nginx-example"
        tags = ["urlprefix-example.com/"]
        port = "http"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "3s"
        }
      }
    }
  }
}

Once you run this service as well and point example.com to your cluster in your hosts file, you should see the default nginx landing page. From here, you should be able to get most single-image applications running on your cluster.
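
On your workstation, that hosts entry could look like the following, pointing at the loadbalancer’s floating IP from the keepalived section (use your public IP when you’re outside the local network):

# Point example.com at the cluster, then request the page through fabio
echo '172.16.0.10 example.com' | sudo tee -a /etc/hosts
curl http://example.com/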

I will not cover multi-container applications in this post.

(optional) Route a domain to your cluster

A cluster on its own is nice and all, but if you can’t reach it from the outside world it won’t do you much good when you want to host public websites on it.

The server setup shown in this post receives unencrypted requests on port 80 and does not handle SSL. While handling SSL ourselves is possible, we’ll let cloudflare take care of it.

First you’ll need to add your domain to cloudflare, for which luckily they have documentation themselves: Add site to cloudflare.

After adding your domain to cloudflare, add a dns record for your domain, point it at your public IP and make sure the proxied flag is enabled. The proxy is technically not required, but it allows cloudflare to handle the SSL for us. Make sure to enable SSL/TLS -> Edge Certificates -> Always use HTTPS on your domain as well, so clients never accidentally use plain http while sending over sensitive information.

Now that you have a domain set up, let’s open up port 80 for requests coming through cloudflare. You could open it up to the whole world, but allowing only requests from cloudflare is slightly safer. Here are their IP ranges so you can limit how open your network is: CloudFlare IP Ranges.

Opening up the port may mean adding a port-forward in your router, allowing the port in your cloud firewall, or something else entirely; it fully depends on your environment, so I can’t guide you through it here.
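
As a sketch, if your machine runs iptables directly, limiting port 80 to cloudflare could look like this; 173.245.48.0/20 is just the first of their published IPv4 ranges, so repeat the ACCEPT rule for each entry in their list:

# Allow http from one cloudflare range (repeat per published range)...
iptables -A INPUT -p tcp --dport 80 -s 173.245.48.0/20 -j ACCEPT
# ...and drop http from everywhere else
iptables -A INPUT -p tcp --dport 80 -j DROP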

After following these steps, going to the domain you chose (and set in the example service as well) with your browser should yield an SSL-secured nginx landing page that’s hosted within your own cluster.

Closing thoughts

While this single-node cluster will work fine, adding a second machine will allow you to separate the entry-point (loadbalancer) from the manager. Strictly speaking this shouldn’t be needed for security, but there will always be undiscovered bugs in software and/or firewalls, so separating the loadbalancer gives you an extra layer of defense.

For stability, you should have at least 3 manager nodes within your cluster, which will allow you to continue operations as usual even when 1 of them is down for whatever reason (hardware failure, maintenance, etc). Then add as many worker machines as your workload needs, or even autoscale your cluster to a size that matches your services.