SNI - NGINX and TLM Agent
I've been doing more testing of TLM agent discovery and processing as it relates to multiple domains bound to one IP address, i.e., SNI (Server Name Indication).
Environment:
I have a server with services I'd like to expose through public-facing domains (FQDNs), with several layers of authentication to protect the sites from outside eyes and to avoid creating an unnecessary attack surface.
My home lab and network setup:
- 1 Gbps fiber to the house with a single external IP address
- Edge reverse proxy - configured to ingest external requests and pass them along to internal servers. I terminate :80 at this server and return a 301 to force any :80 requests up to :443, which then forwards the traffic as TLS passthrough.
- Solution servers - I have several that are externally facing and terminate the TLS traffic passed along by the edge reverse proxy. This is where all the work is done.
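The edge proxy pattern above can be sketched with NGINX's stream module (SNI-based routing without terminating TLS) plus a plain :80 redirect. This is a hypothetical sketch with placeholder IPs and FQDNs, not my actual edge config:

```nginx
# Edge proxy nginx.conf - hypothetical sketch, placeholder names/IPs
stream {
    # Route by the SNI name in the ClientHello, without decrypting
    map $ssl_preread_server_name $backend {
        plex.example.com     10.0.0.10:443;
        requests.example.com 10.0.0.10:443;
        default              10.0.0.10:443;
    }

    server {
        listen 443;
        ssl_preread on;       # read SNI from the TLS ClientHello
        proxy_pass $backend;  # pass the raw TLS stream through untouched
    }
}

http {
    # Terminate :80 here and 301 everything up to HTTPS
    server {
        listen 80 default_server;
        return 301 https://$host$request_uri;
    }
}
```

The `ssl_preread` directive is what makes passthrough routing possible: the certificate is never presented by the edge, so the solution servers still terminate TLS themselves.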
While not exposing all the details: one of my "solution servers" is a PlexMediaServer, where I run a few automation tools that streamline media gathering for presentation through my Plex server, for our personal enjoyment.
I'm running the following setup on my PlexMediaServer:
- PlexMediaServer - native to Ubuntu Linux
- Seerr - for requesting TV shows and/or movies - we make the request, it determines what's available and when, then hands the actual management off to:
- Radarr and/or Sonarr - Radarr handles movies and Sonarr handles TV series for tracking against various RSS feeds and Indexers. These two services then push the download request to:
- qBittorrent - which handles the actual downloading and seeding. Sonarr or Radarr monitor the download status and maintain the system of record for what's managed on my Plex server.
What makes this setup unique is that all of these services (Seerr, Sonarr, Radarr), except for Plex itself, are running as Docker containers. Each service is bound to specific ports (which are configurable, but need to be unique). In my case, I left them at the defaults.
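A container stack like this is often defined with Docker Compose. The sketch below is hypothetical (it assumes the common LinuxServer.io images and each project's usual default port - 5055 for Overseerr-style request tools, 8989 for Sonarr, 7878 for Radarr, 8080 for qBittorrent's web UI), not my actual compose file:

```yaml
# docker-compose.yml - hypothetical sketch using typical default ports
services:
  seerr:
    image: sctx/overseerr            # assumption: an Overseerr-style image
    ports:
      - "5055:5055"
    restart: unless-stopped
  sonarr:
    image: lscr.io/linuxserver/sonarr
    ports:
      - "8989:8989"
    restart: unless-stopped
  radarr:
    image: lscr.io/linuxserver/radarr
    ports:
      - "7878:7878"
    restart: unless-stopped
  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent
    ports:
      - "8080:8080"
    restart: unless-stopped
```

Because each container publishes a unique host port, NGINX on the same box can reverse proxy each FQDN to its matching port.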
On the PlexMediaServer, I installed NGINX and configured each server block to capture incoming TLS requests for the respective service AND reverse proxy them to the respective Docker container port - boom!
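Each of those server blocks looks roughly like this - a hedged sketch with placeholder FQDNs, cert paths, and Sonarr's common default port, not my real config:

```nginx
# /etc/nginx/sites-available/sonarr.example.com - hypothetical sketch
server {
    listen 443 ssl;
    server_name sonarr.example.com;

    ssl_certificate     /etc/nginx/certs/sonarr.example.com.crt;
    ssl_certificate_key /etc/nginx/certs/sonarr.example.com.key;

    location / {
        # Forward to the Docker-published port for this service
        proxy_pass http://127.0.0.1:8989;
        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

One block per FQDN, each with its own cert pair and its own `proxy_pass` target.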
This is where TLM came into play. I installed a TLM agent tied to my demo/test portal in production and had it scan local bindings and all certs. It found way more than I was expecting, but initially it found the wrong IP, with a single-vhost cert that I'd bootstrapped with OpenSSL.
Another note: some time back, I had installed Tailscale to access this server remotely through a private VPN. That worked "ok", but resolution for my two primary users (wife & MIL) picking shows/movies was clunky. Hence moving these to public FQDNs.
The first scan by TLM found a single cert bound to a 10.x network, which was where Tailscale lived/bound itself. I then declared a specific IP address for each server block in NGINX (I have 3). TLM didn't see the IP-specific server blocks.
WHERE THE RUBBER MEETS THE ROAD!
One shortcoming of setting up a new TLM agent is that SNI is not enabled by default. After the first scan, if SNI results are expected and not displayed, SNI needs to be enabled and the FQDNs declared.
Some issues with SNI:
- If all FQDN server blocks are bound to a specific IP and it's the SAME IP, the first scan will only show the first vhost in the nginx.conf file and will fail to see the other FQDNs.
- If server blocks are set to just listen on :443, NGINX will bind to the first IP in the stack, so best practice is to declare the IP in the server block: x.x.x.x:443 ensures the FQDN listens on JUST that IP.
- Once the first scan is done, click into the TLM agent, set SNI to enabled, AND declare the specific FQDNs expected. Declare what's configured - TLM will not "discover" these.
- Another oddity: if you have N+1 FQDNs, during automation configuration the second or third FQDN may fail the last "test" step in automation. I've learned two things: hit "Retry" once, and if it fails again, test it manually and/or check the server block configuration to see whether the DigiCert-managed cert was updated. If so, you may have to restart NGINX. My best guess is the TLM agent either skipped restarting NGINX or failed to pick up the right cert. Dropping to the CLI and manually restarting it will sometimes get it reset.
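Putting the IP-binding advice above into config form - a sketch with placeholder IPs and FQDNs; adapt to your own interfaces:

```nginx
# One server block per FQDN, each pinned to an explicit IP:port
# instead of a bare "listen 443 ssl;"
server {
    listen 192.168.1.50:443 ssl;
    server_name plex.example.com;
    # ... certs and proxy_pass for this vhost ...
}

server {
    listen 192.168.1.50:443 ssl;
    server_name sonarr.example.com;
    # ... certs and proxy_pass for this vhost ...
}
```

For the "test it manually" step, `openssl s_client -connect 192.168.1.50:443 -servername sonarr.example.com` shows which certificate NGINX serves for that SNI name; if an updated cert isn't being presented, `sudo systemctl restart nginx` (or `nginx -s reload`) will pick it up.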
Now, in my case, all three FQDNs have quality public certificates issued and are presenting the three services I have available. TLM is managing these, I've established automation for each of the certificates, and I will monitor them.