[Remote] Principal Software Engineer – Backend
Note: This is a remote position open to candidates in the USA. DeepHow is a Physical AI platform serving industrial manufacturing, pharmaceuticals, and utilities. The company is seeking an experienced Principal Software Engineer – Backend to lead the architecture, development, and optimization of its backend systems, playing a critical role in shaping technical direction and mentoring engineers.
Responsibilities
- Lead the architecture, design, development, and optimization of scalable, high-performance backend systems that support business growth and product innovation
- Define technical roadmaps, architectural standards, and engineering best practices while providing technical leadership and mentorship to development teams
- Develop and maintain backend applications, APIs, microservices, and automation solutions using Node.js and Python
- Design, deploy, and manage cloud-native infrastructure on Google Cloud Platform (GCP), including BigQuery, Cloud Run, Cloud Functions, App Engine, Compute Engine, and Google Kubernetes Engine (GKE)
- Implement and manage Infrastructure as Code (IaC) using Terraform and Helm to ensure scalable and repeatable deployments
- Build and maintain observability frameworks, including monitoring, logging, tracing, and alerting using tools such as Datadog, New Relic, and Google Cloud Monitoring
- Monitor and optimize production machine learning workloads, including model performance, operational health, and data drift detection
- Design and manage scalable data architectures using PostgreSQL, MongoDB, Redis, and Firestore, while developing large-scale data pipelines and supporting dataset versioning practices with tools such as DVC and LakeFS
- Deploy, manage, and optimize containerized applications using Docker and Kubernetes (GKE), including multi-tenant architectures, RBAC, namespace isolation, and resource management
- Design secure cloud networking solutions involving VPCs, load balancers, and network security controls while implementing secure authentication and authorization using OAuth and SAML
- Establish and maintain infrastructure security best practices, including encryption, secrets management, service account governance, and credential rotation
- Build and enhance CI/CD pipelines using Jenkins and support GitOps workflows with tools such as ArgoCD and Flux
- Improve application performance, scalability, reliability, and fault tolerance while implementing asynchronous processing frameworks such as Temporal and Celery
- Integrate ML frameworks, model lifecycle tools, and model-serving platforms, including PyTorch, Ray, Hugging Face, MLflow, Weights & Biases, BentoML, Triton, and TorchServe, within scalable Kubernetes environments
Skills
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical discipline
- Equivalent practical experience will also be considered
- 10+ years of backend software engineering experience
- Proven track record of designing, building, and scaling production-grade systems
- Prior experience in a SaaS company is required
- Strong experience in cloud-native environments and distributed systems
- Previous experience in a Principal Engineer, Staff Engineer, or Senior Lead Engineer role with ownership of architecture and system design
- Demonstrated success leading complex technical initiatives and mentoring engineering teams
- Experience working in startup or high-growth environments is strongly preferred
Company Overview