Lead Site Reliability Engineer

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data, and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits, and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
Primary Responsibilities:
- Lead the design and operation of secure, highly available cloud infrastructure on Azure/AWS
- Lead IAM services and supporting infrastructure across cloud platforms
- Own DevSecOps and SRE practices, including reliability standards (SLOs/SLAs), monitoring, and incident response
- Establish and enforce DevSecOps best practices (shift-left security, automated testing, secrets management, and compliance checks)
- Guide teams on containerization and orchestration using Docker and Kubernetes (AKS), including cluster design and upgrade strategies
- Build and govern CI/CD pipelines using GitHub Actions for fast and reliable releases
- Lead Infrastructure-as-Code (IaC) using Terraform with reusable, secure modules
- Oversee system observability and monitoring using Splunk (logs, metrics, traces) to proactively detect and resolve issues
- Lead production incident response, on-call operations, and root cause analysis (ServiceNow)
- Own SSL/TLS certificate lifecycle, IAM services, and security automation
- Plan and execute application releases and deployment strategies (Blue/Green, Canary)
- Drive disaster recovery, resilience testing, and operational readiness
- Mentor DevSecOps engineers and collaborate closely with development and security teams
- Identify and remediate security vulnerabilities in applications, containers, CI/CD pipelines, cloud resources, and runtime environments
- Promote use of AI-assisted tools to improve delivery quality and efficiency
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regard to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
Required Qualifications:
- Graduate degree or equivalent experience
- B.Tech., MCA, or HS diploma/GED with 8-12+ years of experience in DevOps, DevSecOps, SRE, or platform engineering roles
- Solid experience with RDBMS technologies such as MySQL or PostgreSQL
- Extensive hands-on experience with Docker and Kubernetes (AKS) in production environments
- Proven experience designing and maintaining GitHub Actions CI/CD pipelines
- Experience leveraging AI tools within the SDLC to improve reliability, security, and engineering productivity
- Proficiency in Java with experience building services using Spring Boot
- Solid experience supporting mission-critical production systems, incident response, and on-call operations (ServiceNow)
- Solid expertise in Terraform for large-scale infrastructure provisioning and lifecycle management
- Deep understanding of Azure services (compute, networking, storage, Kubernetes, IAM)
- In-depth knowledge of SSL/TLS, PKI concepts, certificate automation, and secrets management
- Solid understanding of Git, microservices architecture, and modern deployment strategies (Blue/Green, Canary)
- Demonstrated ability to lead complex technical initiatives and influence cross-functional teams
- Proven problem-solving, troubleshooting, and decision-making skills in high-pressure environments
- Excellent communication, collaboration, and stakeholder management skills
Preferred Qualifications:
- AI Builder mindset, with experience in Azure AI, Microsoft Copilot Studio, prompt engineering, GitHub Copilot, generative AI, and RAG frameworks
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone – of every race, gender, sexuality, age, location and income – deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes – an enterprise priority reflected in our mission.
Additional Job Detail Information
Requisition Number 2346025
Business Segment Optum
Travel No
Country IN
Overtime Status Exempt
Telecommuter Position No

