Site Reliability Engineer
Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.
Primary Responsibilities:
- Define and establish industry best practices for alerting and monitoring across the line of business, and design/architect efficient monitoring dashboards on Splunk/DTSaas/DataDog/Grafana that are common to all applications/products across the line of business
- Participate in the 5-9 program and other peak-season readiness initiatives, and collaborate with application teams to evaluate applications from a resiliency, availability, and reliability perspective
- Act as a gatekeeper for changes rolling into production
- Embrace continuous learning of engineering practices to ensure industry best practices and technology adoption, including DevOps, Cloud and Agile thinking
- Drive tech debt reduction and tech transformation, including open source/inner source adoption, Cloud adoption, and HCP assessment and adoption
- Improve processes/runbooks and lead automation of manual support tasks to cut down toil
- Participate in on-call rotation
- Improve operational tooling and frameworks, and perform chaos engineering activities
- Respond to platform emergencies, alerts, and escalations from Customer Support
- Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so
Required Qualifications:
- Undergraduate degree
- Experience in integrating monitoring and alerting into cloud software solutions
- Coding experience with one or more of the following languages: Java, C#, C/C++, Go, Python, Perl, PowerShell or JavaScript, with a willingness and ability to learn new ones
- Experience building and programmatically consuming REST APIs
- Experience with Splunk, Dynatrace, DataDog, Grafana, Telemetry, or similar monitoring tools
- Experience interacting programmatically with a relational database (SQL Server/MySQL/PostgreSQL)
- Experience planning and supporting 99.999% availability for critical applications in production
- Proven ability to communicate effectively to both technical and non-technical, globally distributed audiences
- Proven technical writing skills (creating flow diagrams, end user documentation, etc.)
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.
Additional information about this position
Requisition number 2293080
Business segment Optum
Travel availability No
Country IN
Overtime status Exempt
Telecommuter position No