[Remote] Senior Machine Learning Engineer - Agentic AI

Remote · USA · Full-time

Note: This is a remote position open to candidates in the USA. The University of Texas MD Anderson Cancer Center, a leading institution in cancer care and research, is seeking a Senior Machine Learning Engineer – Agentic AI. The role focuses on designing and operating enterprise-scale agentic AI platform capabilities that support the safe, governed deployment of AI systems in healthcare environments.

Responsibilities

  • Lead the design, evolution, and operation of the enterprise agentic AI platform in collaboration with enterprise architects and platform ML engineers
  • Build platform components that enable interoperability between first‑party and third‑party agents, including identity, state, memory, tool access, orchestration, auditability, and policy enforcement
  • Define and document standardized integration patterns connecting agents with enterprise business systems, data platforms, APIs, and health IT systems
  • Provide reusable platform services, reference implementations, and SDKs that reduce risk and accelerate delivery for applied teams
  • Design and operate validation and de‑risking frameworks, including simulation, sandboxing, shadow execution, canary releases, and continuous behavior monitoring
  • Establish and enforce platform standards for agent development, including interfaces, execution contracts, evaluation hooks, safety constraints, and observability requirements
  • Participate in platform governance, release coordination, and incident response, supporting investigation and remediation of agent‑related failures
  • Implement platform safeguards such as fallback mechanisms, rollback strategies, approval gates, rate limiting, audit trails, and kill‑switch capabilities
  • Partner with software engineering, security, IT, and health IT stakeholders to deploy agentic AI capabilities in secure enterprise environments
  • Support responsible AI practices through traceability of prompts, policies, tools, models, and agent actions, and through documentation of known failure modes and limitations

Skills

  • Bachelor's degree in Computer Science, Software Engineering, Data Science, Physics, Math & Statistics, or another related engineering discipline
  • Five years of experience in machine learning engineering, data science, data engineering, and/or software engineering
  • At least 5 years of industry experience in data science
  • 3+ years as a Senior ML Engineer focused on agentic AI systems
  • Experience building AI or ML platforms that serve multiple downstream teams and production workloads
  • Strong proficiency in Python and integration of modern ML frameworks (e.g., PyTorch) with large language models and agent systems
  • Hands-on experience with agentic AI frameworks such as LangGraph, LangChain, AutoGen, CrewAI, Semantic Kernel, or equivalent
  • Working knowledge of agentic AI protocols and interoperability standards (e.g., MCP, agent-to-agent communication, structured tool invocation)
  • Experience implementing planner-executor loops, hierarchical agents, and multi-agent coordination patterns
  • Familiarity with workflow orchestration tools (Airflow, Prefect, Temporal) and distributed execution frameworks (Ray or equivalent)
  • Experience deploying containerized AI platforms using Kubernetes in enterprise cloud environments with lineage, auditability, and controlled promotion to production
  • Ability to reason at the systems and platform level, balancing safety, performance, flexibility, and usability
  • Experience designing quantitative evaluation strategies for agentic systems, including success rates, latency, cost, recovery behavior, and safety metrics
  • Strong understanding of enterprise data governance, security, and privacy requirements, including healthcare and health IT considerations
  • Ability to identify systemic risks stemming from agent autonomy, non-determinism, tool access, and multi-agent interactions
  • Experience analyzing failure modes caused by prompt drift, model updates, tool changes, and cross-system dependencies
  • Collaborate effectively with architects, applied MLEs, data scientists, software engineers, and IT partners
  • Produce clear documentation covering platform architecture, APIs, integration patterns, validation frameworks, and operational runbooks
  • Communicate platform capabilities, risks, and limitations to leadership and partner teams
  • Contribute to internal standards and shared practices that improve safety, scalability, and consistency of agentic AI development
  • Provide hands-on technical guidance, mentorship, and troubleshooting support to platform adopters
  • Present technical and non-technical concepts clearly in meetings and institutional forums
  • Master's degree or PhD with a concentration in science, engineering, or a related field
  • Experience designing, deploying, and maintaining agentic AI systems that operate autonomously and collaboratively across distributed environments
  • Experience in monitoring and troubleshooting autonomous agents post-deployment, including performance degradation, clinical incidents, model updates, or corrective actions
  • Experience raising the technical bar for team members, such as establishing reproducibility practices, review standards, or shared patterns
  • Experience technically evaluating third-party agentic AI platforms within clinical workflows

Benefits

  • Paid medical benefits
  • Paid time off (PTO)
  • Strong retirement plans
  • Tuition benefits
  • Educational opportunities
  • Individual and team recognition

Company Overview

  • The University of Texas MD Anderson Cancer Center is one of the world’s most respected centers devoted exclusively to cancer patient care, research, education, and prevention. It was founded in 1941 and is headquartered in Houston, Texas, USA, with a workforce of more than 10,000 employees. Its website is https://www.mdanderson.org/.