[Remote] Senior DevOps Engineer/Site Reliability Engineer-East Coast

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. Stellar Cyber is a fast-growing global leader in cybersecurity, trusted by major enterprises and government agencies. The company is seeking a highly skilled Senior DevOps / Site Reliability Engineer to build, operate, and scale reliable cloud-native infrastructure and distributed data platforms while upholding operational excellence and reliability best practices.

Responsibilities

  • Administer and maintain Kubernetes clusters and containerized workloads
  • Manage cloud infrastructure across OCI, AWS, GCP, or Azure environments
  • Develop and maintain CI/CD pipelines for reliable application deployments
  • Implement and manage Infrastructure as Code (IaC) using Terraform and Helm
  • Build automation tooling and operational workflows using Python, Go, or Bash
  • Drive observability initiatives including monitoring, logging, tracing, and alerting improvements
  • Monitor, troubleshoot, and resolve production incidents while participating in on-call rotations
  • Support and optimize distributed data platforms including Kafka, Elasticsearch, Spark, Redis, and MongoDB
  • Improve platform reliability, scalability, and operational efficiency using SRE best practices
  • Collaborate with cross-functional teams across multiple time zones
  • Perform Linux system administration and networking troubleshooting
  • Contribute to incident response processes, postmortems, and reliability improvements
  • Support GitOps and deployment workflows using tools such as ArgoCD and GitHub Actions
  • Evaluate and implement AI-assisted operational tooling for auto-remediation, alert correlation, and operational intelligence

Skills

  • 5+ years of experience in DevOps, SRE, or Platform Engineering roles
  • Strong expertise with Kubernetes, Docker, and container orchestration
  • Hands-on experience managing production cloud environments
  • Strong Infrastructure as Code experience with Terraform and Helm
  • Experience with CI/CD tools and deployment automation
  • Advanced troubleshooting skills in Linux systems, networking, and distributed systems
  • Experience with observability platforms including Prometheus, Grafana, Loki, Alertmanager, and Elastic Stack
  • Strong programming and scripting skills in Python, Bash, or Go
  • Experience supporting high-availability production systems and on-call operations
  • Knowledge of incident management and reliability engineering practices
  • Familiarity with data platform technologies such as Kafka, Spark, Elasticsearch, Redis, or MongoDB
  • Understanding of AI-driven operational tooling and automated remediation concepts
  • Excellent communication, collaboration, and problem-solving skills
  • Must reside on the East Coast

Benefits

  • Pre-IPO Stock Options
  • Medical, Dental & Vision care
  • 401(k)
  • Employee Assistance Program
  • Employee Discount Program
  • Life Insurance
  • Paid time off
  • Referral Program
  • Rewards and Recognition Program

Company Overview

  • Stellar Cyber offers an open XDR platform that provides comprehensive security while streamlining operations. The company was founded in 2015 and is headquartered in Santa Clara, California, USA, with a workforce of 51-200 employees. Its website is https://www.stellarcyber.ai.

Company H1B Sponsorship

  • Stellar Cyber has a track record of offering H1B sponsorships, with 2 in 2026, 6 in 2025, 9 in 2024, 7 in 2023, 7 in 2022, 5 in 2021, 2 in 2020. Please note that this does not guarantee sponsorship for this specific role.