Remote Systems Engineer jobs – Full‑Time Senior Server Engineer (Remote) – AWS, Terraform, Linux – $110k‑$150k – Georgetown, Texas
Who we are
We’re a 120‑person software platform that started in a cramped coworking space in Georgetown, Texas eight years ago. Today, most of our code lives in the cloud, but we still remember the day we pulled the first rack of servers out of a storage unit and wired them together with duct tape. That scrappy spirit still guides us, even though 70 % of the engineering team now works from home across the U.S. and Europe.
Our product – a SaaS analytics suite for mid‑size retailers – processes an average of 2.5 billion events per month. That volume translates to roughly 5,000 virtual machines, 200 TB of daily data ingestion, and a 99.9 % uptime SLA that our customers have come to expect. As we push into new verticals and double our ARR each year, the complexity of the underlying infrastructure has exploded, and that’s where you come in.
Why this role exists now
Two weeks ago we closed a $50 million Series C round that’s earmarked for a massive hybrid‑cloud migration. We’re moving 40 % of our legacy workloads from on‑prem datacenters in Georgetown, Texas to a mix of AWS and Azure, and we need a seasoned systems engineer to own the end‑to‑end journey. This isn’t a “build‑and‑forget” position: you’ll be responsible for designing, automating, and continuously improving the pipelines that keep our services running, scaling, and secure. If you love figuring out how a change in a Terraform module impacts a Kubernetes node pool, or how a tweak in our Prometheus alerting rules reduces noise for the on‑call rotation, you’ll feel right at home.
What you’ll do (day‑to‑day)
- Architect & automate: Design Terraform configurations for multi‑cloud VPCs, security groups, and IAM roles. Translate those designs into reusable Ansible playbooks that provision Linux (Ubuntu 20.04, RHEL 8) servers in both public clouds and our remaining on‑prem racks in Georgetown, Texas.
- Maintain reliability: Own the 99.9 % uptime SLA for our core services. You’ll monitor health with Prometheus, visualize trends in Grafana, and fine‑tune alert thresholds in PagerDuty to keep unnecessary on‑call noise under 15 minutes per week.
- Capacity planning: Run quarterly forecasts based on product demand, ensuring we have at least 20 % headroom for traffic spikes. Last year we oversaw 200 % YoY growth; this year’s goal is 250 %.
- Security & compliance: Work with our InfoSec lead to enforce CIS Benchmarks, rotate AWS KMS keys, and patch 300+ Linux hosts within the 48‑hour window mandated by our PCI‑DSS audit.
- Incident response: Lead post‑mortems for any breach of SLA, documenting root cause, remediation steps, and actionable improvements. You’ll mentor junior engineers on incident triage and run blameless retrospectives.
- Collaboration: Pair with product managers to understand new feature roll‑outs, and translate those requirements into infrastructure changes. You’ll also contribute to the #infra‑ops channel on Slack, where we discuss everything from “why is my pod stuck in CrashLoopBackOff?” to the best coffee maker for home offices.
Our current stack (the tools you’ll be using)
1. Terraform – IaC for AWS, Azure, and on‑prem VMware.
2. Ansible – Configuration management for Linux and Windows.
3. Docker & Kubernetes (EKS & AKS) – Container orchestration for microservices.
4. Prometheus & Grafana – Metrics collection and dashboarding.
5. Splunk – Log aggregation and search.
6. GitLab CI/CD – Pipelines for building, testing, and deploying.
7. AWS CloudWatch & Azure Monitor – Cloud‑native observability.
8. Jira & Confluence – Project tracking and documentation.
9. Python & Bash – Scripting for automation tasks.
10. HashiCorp Vault – Secrets management.
If you’ve spent even a fraction of your career with three or more of these, you’ll fit right in.
Who you are (the checklist)
- Experience: 5+ years in a systems engineering or server‑engineering role, with at least 3 years of hands‑on Terraform and Ansible. Experience with both AWS and Azure is a plus, but deep expertise in one cloud plus a solid grasp of the other’s APIs will do.
- Mindset: You treat infrastructure as code, not as a collection of “servers”. You love version‑controlling everything, reviewing pull requests, and leaving clear commit messages.
- Metrics‑driven: You can read a latency histogram and explain why the 99th percentile matters to a business stakeholder. You’ve built SLAs and can defend them with data.
- Communication: You can explain a complex networking diagram to a product designer without jargon. The ability to write clear runbooks is just as important as speaking on a Zoom call with a remote on‑call partner.
- Team spirit: You’ll be part of a distributed crew that runs daily stand‑ups across six time zones.