Senior Principal Infrastructure & Operations Eng - Remote or Hybrid

Minnetonka, Minnesota

Caring. Connecting. Growing together.

With these values to guide us, our people are committed to making a meaningful difference in the lives of those we are honored to serve.

Requisition number: 2370606
Job category: Technology
Primary location: Minnetonka, MN
Date posted: 07/02/2026
Overtime status: Exempt
Travel: No

Optum Tech is a global leader in health care innovation. Our teams develop cutting-edge solutions that help people live healthier lives and help make the health system work better for everyone. From advanced data analytics and AI to cybersecurity, we use innovative approaches to solve some of health care's most complex challenges. Your contributions here have the potential to change lives. Ready to build the next breakthrough? Join us to start Caring. Connecting. Growing together.

This Senior Principal role is accountable for advancing enterprise reliability across a complex, high-scale application portfolio by setting the technical direction, operating model, and leadership approach needed to improve stability, resilience, and operational performance.

You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.

Primary Responsibilities: 

Enterprise Reliability Leadership

  • Establish and execute a comprehensive reliability strategy across a portfolio of 510+ applications supporting critical business operations
  • Define and govern enterprise reliability standards, Service Level Objectives (SLOs), Service Level Indicators (SLIs), error budgets, resiliency requirements, and operational maturity models
  • Create a reliability operating model that spans modern cloud-native platforms, legacy systems, mainframe workloads, third-party hosted solutions, and SaaS applications
  • Serve as the executive leader accountable for enterprise application reliability, stability, recoverability, and operational risk reduction

AI-First SRE Transformation

  • Design and implement an AI-first approach to reliability engineering leveraging generative AI, AIOps, predictive analytics, autonomous remediation, intelligent alert management, and operational copilots
  • Identify opportunities to eliminate manual operational work through automation and machine-driven decision support
  • Establish AI-powered workflows for:
    • Incident detection and triage
    • Root cause analysis
    • Event correlation
    • Capacity forecasting
    • Reliability risk identification
    • Automated remediation
    • Knowledge management
    • Operational reporting
  • Deliver measurable reductions in Mean Time to Detect (MTTD), Mean Time to Resolve (MTTR), operational toil, and incident volume

Portfolio Reliability Management

  • Develop a portfolio-wide reliability framework capable of managing highly heterogeneous technology stacks including:
    • Mainframe platforms
    • Middleware and integration technologies
    • Distributed applications
    • Containerized workloads
    • Public cloud platforms
    • Vendor-hosted applications
    • SaaS ecosystems
  • Establish application criticality tiers and reliability targets across the portfolio
  • Implement standardized observability and operational telemetry strategies regardless of technology platform

Team Leadership

  • Build, lead, and mentor a high-performing team of direct reports and contractors
  • Create an elite SRE organization of five or fewer highly skilled engineers capable of delivering enterprise-scale outcomes through leverage, automation, and platform capabilities
  • Recruit and develop engineers with solid expertise in software engineering, automation, observability, AI, and systems reliability
  • Foster a culture of ownership, innovation, operational excellence, and continuous improvement

Vendor and Partner Management

  • Drive reliability accountability across a complex ecosystem of vendors, managed service providers, and third-party technology partners
  • Establish operational performance expectations, reliability metrics, service level agreements, and governance mechanisms with external partners
  • Ensure vendors contribute actionable telemetry, operational transparency, and incident management discipline
  • Lead escalations and executive-level discussions related to service disruptions and reliability concerns

Observability and Platform Engineering

  • Define and implement enterprise observability standards across metrics, logs, traces, events, synthetic monitoring, and user experience monitoring
  • Drive platform engineering initiatives that simplify operational support and reduce application-specific operational burden
  • Establish self-service reliability capabilities for application teams

Incident and Resilience Management

  • Lead major incident management, post-incident review processes, and enterprise resilience initiatives
  • Drive systemic problem elimination through engineering-led root cause analysis and preventive action programs
  • Develop disaster recovery, business continuity, and resiliency testing strategies
  • Ensure reliability practices are embedded throughout the software development lifecycle

You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.

Required Qualifications: 

  • Undergraduate degree in applicable area of expertise or equivalent experience
  • 10+ years of experience in technology operations, software engineering, infrastructure engineering, platform engineering, or Site Reliability Engineering
  • 5+ years leading enterprise-scale SRE, reliability engineering, or production engineering organizations
  • Demonstrated experience owning reliability outcomes for portfolios exceeding 200 applications, preferably 500+
  • Proven success building and leading high-performing engineering teams
  • Experience managing direct reports, contractors, managed services providers, and vendor relationships
  • Deep understanding of modern SRE principles including:
    • SLOs and SLIs
    • Error budgets
    • Reliability engineering
    • Incident management
    • Resilience engineering
    • Capacity management
    • Observability
  • Experience supporting diverse technology ecosystems spanning legacy platforms, mainframe, distributed systems, and cloud environments
  • Proven executive communication and stakeholder management skills

Preferred Qualifications:  

  • Demonstrated implementation of AI-driven operations, AIOps, or autonomous operations capabilities at enterprise scale
  • Experience leveraging Generative AI, LLMs, operational copilots, agentic workflows, or predictive analytics to improve operational outcomes
  • Experience leading large-scale operational transformations with measurable business results
  • Demonstrated background in software engineering or platform engineering
  • Experience with cloud platforms such as AWS, Azure, or GCP
  • Demonstrated familiarity with observability platforms such as Datadog, Dynatrace, New Relic, Splunk, Grafana, OpenTelemetry, or similar technologies
  • Experience in highly regulated industries such as healthcare, financial services, insurance, or government

*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy.

Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase, and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $134,600 to $230,800 annually based on full-time employment. We comply with all minimum wage laws as applicable.

Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone - of every race, gender, sexuality, age, location and income - deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission.

UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.

UnitedHealth Group is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.

Benefits

Our mission of helping people live healthier lives extends to our team members. Learn more about our range of benefits designed to help you live well.

Life

Resources and support to focus on what matters most to you, in every facet of your life.

Emotional

Education, tools and resources to help you reduce and manage stress, build resilience and more.

Physical

Health plans and other coverage to support wellness for you and your loved ones.

Financial

Benefits for today and to help you plan for the future, including your retirement.


We’re honored to be recognized for our exceptional work culture

AGWF recognition award
2025 Campus Forward Award badge from RippleMatch
LinkedIn Top Companies 2025 award badge
Forbes Best Large Employers in the United States 2024 award badge
America’s Greatest Workplaces 2024 award badge