Pursue your passion and potential

Data Engineering Consultant - AI Platform

Bengaluru, India

Caring. Connecting. Growing together.

With these values to guide us, our people are committed to making a meaningful difference in the lives of those we are honored to serve.

Integrity, Compassion, Inclusion, Relationships, Innovation, Performance

Requisition number: 2369782
Job category: Technology
Primary location: Bengaluru, Karnataka
Date posted: 06/24/2026
Overtime status: Exempt
Travel: No

Optum is a global organization that delivers care, aided by technology to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health optimization on a global scale. Join us to start Caring. Connecting. Growing together.

We're looking for a hands-on Senior Data Engineer - AI Platforms to build and scale AI-ready data platforms that power AI/ML, Generative AI, Agentic AI, analytics, and intelligent enterprise applications. This role focuses on engineering modern data platforms, data products, semantic foundations, and scalable data pipelines that enable AI systems to consume trusted, governed, and context-rich data.

The ideal candidate brings solid expertise in data engineering, distributed processing, modern cloud data platforms, and AI-centric data foundations. You will work closely with Data Architects, AI/ML Engineers, Applied Scientists, and Platform Engineers to deliver data platforms that support model training, inference, RAG, semantic retrieval, and enterprise AI applications.

Primary Responsibilities:

AI Data Platform Engineering

Build and enhance AI-ready data platforms supporting AI/ML, Generative AI, Agentic AI, analytics, and operational workloads
Develop scalable data pipelines spanning:
- Data ingestion
- Data transformation
- Data processing
- Data serving
- Data consumption
Implement modern data architectures using:
- Lakehouse
- Data Lake
- Data Warehouse
- Medallion Architecture (Bronze, Silver, Gold)
Support data platforms that enable model training, inference, feature engineering, RAG, and enterprise AI applications

Data Engineering & Processing

Develop high-performance pipelines supporting structured, semi-structured, and unstructured data
Build batch, streaming, and real-time processing solutions using modern distributed data technologies
Implement scalable data processing frameworks utilizing:
- Apache Spark
- PySpark
- Kafka
- Cloud-native data services
Optimize data storage, partitioning, indexing, and query performance for scalability and cost efficiency
Implement resilient data processing patterns including checkpointing, retries, recovery mechanisms, and data validation

AI Data Foundations

Build and maintain AI-ready datasets, feature pipelines, and data products
Develop embedding generation pipelines and vectorized data preparation workflows
Support semantic search, retrieval, and RAG use cases through efficient data engineering practices
Enable AI data readiness through:
- Data quality management
- Feature engineering
- Data enrichment
- Metadata management
- Semantic indexing
Contribute to building semantic data layers that provide business context and improve AI consumption of enterprise data

Data Governance & Security

Implement data governance standards covering:
- Metadata management
- Data lineage
- Data quality
- Data cataloging
- Data stewardship
Support compliance with HIPAA, GDPR, PII protection, and enterprise governance standards
Implement secure data access controls using:
- RBAC
- Encryption
- Data masking
- Auditing
Ensure data platforms meet security, privacy, and regulatory requirements

Platform Reliability & Operational Excellence

Implement monitoring, logging, lineage tracking, alerting, and operational dashboards for data platforms
Support platform scalability, reliability, performance, and operational efficiency
Contribute to DataOps practices including:
- CI/CD for data pipelines
- Automated testing
- Deployment automation
- Data observability
Troubleshoot production issues and support continuous improvement initiatives

Collaboration & Innovation

Partner with AI/ML Engineers, Data Scientists, Applied Scientists, Architects, and Platform Teams to deliver AI-ready data solutions
Contribute to reusable engineering frameworks, shared services, and platform accelerators
Support adoption of emerging technologies across AI data platforms, semantic retrieval, and modern data ecosystems
Participate in architecture discussions and contribute to enterprise engineering standards
Comply with the terms and conditions of the employment contract, company policies and procedures, and any and all directives (such as, but not limited to, transfer and/or re-assignment to different work locations, change in teams and/or work shifts, policies regarding flexibility of work benefits and/or work environment, alternative work arrangements, and other decisions that may arise due to the changing business environment). The Company may adopt, vary or rescind these policies and directives in its absolute discretion and without any limitation (implied or otherwise) on its ability to do so

Required Qualifications:

Bachelor's degree in Computer Science, Engineering, Information Systems, Data Engineering, or a related field
8+ years of experience in Data Engineering, Data Platforms, Analytics Engineering, or related disciplines
Experience building and operating enterprise-scale data pipelines and data platforms
Experience implementing modern data architectures including Data Lakes, Lakehouse, Data Warehouses, and Medallion Architecture
Experience developing data pipelines supporting AI/ML and analytics workloads
Experience working with structured, semi-structured, and unstructured datasets
Experience with metadata management, data quality, lineage, and governance practices
Experience implementing CI/CD, automated testing, and DataOps practices
Solid experience with:
- Databricks
- Snowflake
- Apache Spark
- PySpark
- SQL
- Python
Solid understanding of distributed processing, scalability, fault tolerance, and performance optimization
Understanding of security, privacy, and compliance requirements for enterprise data platforms
Familiarity with feature engineering, embeddings, semantic search, vectorized data, and AI-ready data foundations
Proven analytical, communication, problem-solving, and collaboration skills

Preferred Qualifications:

Hands-on experience with Databricks Lakehouse Platform, Snowflake Data Cloud, Delta Lake, Apache Iceberg, and cloud-native data platforms
Experience building AI-ready data platforms that support AI/ML, Generative AI, Agentic AI, and Retrieval-Augmented Generation (RAG) workloads
Experience developing feature stores, embedding pipelines, semantic indexing solutions, and AI data products
Experience with vector databases, semantic retrieval platforms, and enterprise search solutions
Experience implementing batch, streaming, and event-driven architectures using Kafka and related technologies
Experience working with cloud platforms including Azure, AWS, or GCP
Experience contributing to reusable data frameworks, platform accelerators, and shared engineering services
Experience within healthcare, financial services, insurance, banking, or other regulated industries
Experience mentoring junior engineers and contributing to engineering best practices
Solid understanding of DataOps, data observability, automated data quality, and platform engineering practices
Familiarity with semantic layers, knowledge graphs, ontology-driven models, and context-aware data architectures

Technical Stack

Data Platforms: Databricks, BigQuery, Snowflake, Azure Synapse, Delta Lake
Processing: Apache Spark, PySpark, Spark SQL
Streaming: Kafka, Spark Streaming, Event Hub, Kinesis
Storage: S3, ADLS, GCS, Parquet, ORC
Databases: PostgreSQL, MySQL, SQL Server, Cosmos DB, NoSQL
AI Data Layer: Pinecone, ChromaDB, FAISS, embeddings, semantic search, RAG pipelines
Orchestration: Airflow, Azure Data Factory, dbt
Programming: Python, SQL
Cloud: AWS, Azure, GCP
DevOps: Docker, Kubernetes, CI/CD (Jenkins, GitHub Actions)
Observability & Governance: Monitoring, logging, lineage, data catalogs
Security & Compliance: RBAC, IAM, encryption, masking
Integration: REST APIs, microservices

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.

Benefits

Our mission of helping people live healthier lives extends to our team members. Learn more about our range of benefits designed to help you live well.

Life

Resources and support to focus on what matters most to you, in every facet of your life.

Emotional

Education, tools and resources to help you reduce and manage stress, build resilience and more.

Physical

Health plans and other coverage to support wellness for you and your loved ones.

Financial

Benefits for today and to help you plan for the future, including your retirement.

Since joining Optum, my professional growth has been significant. The dynamic environment has enhanced my problem-solving abilities, and the company’s commitment to innovation and continuous learning motivates me to stay. Optum provides continuous training, mentorship, and a clear path for advancement, all while supporting a healthy work-life balance.