Lead Site Reliability Engineer

Requisition number: 2292094

Job category: Technology

Job location: Noida, Uttar Pradesh



Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.

Software engineering is the application of engineering to the design, development, implementation, testing and maintenance of software in a systematic manner. The roles in this function cover all primary development activity across all technology functions, ensuring that we deliver high-quality code for our applications, products and services, understand customer needs and develop product roadmaps.

These roles include, but are not limited to, analysis, design, coding, engineering, testing, debugging, standards, methods, tools analysis, documentation, research and development, maintenance, new development, operations and delivery. As with every role in the company, each position requires building quality into every output. This also includes evaluating new tools, techniques and strategies; automating common tasks; building common utilities to drive organizational efficiency; and, with a passion for technology and solutions, influencing thought and leadership on future capabilities and opportunities to apply technology in new and innovative ways. Work is generally self-directed and not prescribed.

Primary Responsibilities:

Manage Azure cloud infrastructure and build resilient, self-scaling systems
Implement solutions to continuously improve operational reliability of the cloud infrastructure
Take responsibility for the availability, performance, monitoring and infrastructure provisioning of the platform, which comprises cloud infrastructure and on-prem technologies
Closely partner with Engineering and Technical Support teams to drive resolution of critical issues
Publish and implement operational standards for all Cloud infrastructure and services
Work towards reducing Operations toil by automating repeatable tasks
Mentor and develop other team members in the SRE subject area
Deploy applications using CI/CD tools, code repositories, code scanning, artifact repositories, compliance scanning, packaging, deployment and configuration management
Build Operations Dashboards leveraging tools like Dynatrace, Splunk or Grafana
Handle incident, change and problem management
Help with provisioning of Infrastructure using Terraform
Enhance platform observability dashboards
Partner closely with development teams to help address platform-related roadblocks
Conduct post-mortems after production issues
React to production deficiencies by continuously implementing automation, self-healing and real-time monitoring in production systems
Work with Docker, Kubernetes, Azure cloud, Prometheus, Grafana, Java, Python and many other modern SaaS technologies
Participate in projects involving people of many different disciplines: Engineering, Cloud, Networking, CI/CD, Project management, Monitoring, alerting etc.
Stay informed of new technologies and innovate
Work with less structured, more complex issues
Serve as a resource to others
Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies in regards to flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so.
#NIC #NJP

Required Qualifications:

Bachelor’s or advanced degree in a related technical field
3+ years of IT experience
3+ years of DevOps experience
2+ years of experience with Infrastructure as Code (Terraform/Ansible/Chef/Puppet)
2+ years of experience with Docker and container orchestration (Kubernetes/OpenShift)
2+ years of experience with DevOps and CI/CD tools such as Git and Jenkins
2+ years of experience with Kafka support
2+ years of experience with monitoring tools and technologies (Splunk, Dynatrace, New Relic)

Preferred Qualifications:

Infrastructure engineering experience
Cloud experience (Azure/AWS/GCP)
Automation experience
Good knowledge of SRE principles
Hands-on scripting with one or more of: YAML, JSON, PowerShell, Bash or Python

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone – of every race, gender, sexuality, age, location and income – deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes – an enterprise priority reflected in our mission.


Additional information about the job

Requisition number: 2292094

Business segment: Optum

Availability to travel: No

Country: IN

Overtime status: Exempt

Telecommuter position: No