Data Engineer: Minimum of 6 years in Data Engineering, with at least 4 years of hands-on experience in the Databricks and Azure ecosystem (this is a must).
End-to-End Data Pipeline Delivery: Proven track record of building and owning ETL/ELT pipelines, data models, and platform components at enterprise scale across complex business domains.
Expert SQL Skills: Advanced proficiency in complex joins, window functions, CTEs, query optimization, and performance tuning across large datasets.
Strong PySpark & Python: Hands-on experience building distributed data processing pipelines, transformations, and automation scripts using PySpark and Python.
Databricks Expertise: Deep working knowledge of Delta Lake, Unity Catalog, Workflows, medallion architecture (bronze-silver-gold), notebook development, and lakehouse best practices.
Clean Code & Engineering Discipline: Writes modular, well-documented, and testable code. Follows coding standards, maintains version-control hygiene, and builds reusable components that other developers can extend.
Problem Solving & Debugging: Strong ability to diagnose and resolve complex data pipeline failures, distributed-system issues, and production incidents under pressure.
Performance Tuning: Demonstrated expertise in optimizing Spark jobs, SQL queries, and storage layers for cost efficiency, throughput, and reliability.
Developer Ownership Mindset: Takes end-to-end responsibility from development through unit testing, deployment validation, and production support, not just writing code and handing it off.
Adaptability: Comfortable navigating evolving technology stacks, shifting priorities, and cross-functional collaboration in a dynamic, fast-paced Agile environment.
Team Player: Collaborative engineer who actively contributes to code reviews, architecture discussions, knowledge sharing, and mentoring within a globally distributed team.
Nice to Have:
AI Exposure: Familiarity with AI/ML concepts such as model training pipelines, feature engineering, RAG architectures, or tools like Azure OpenAI and LangChain.
Supply Chain Domain Knowledge: Experience with supply chain data domains such as procurement, logistics, inventory, or demand planning is a strong plus. Understanding of SAP or ERP data structures in a supply chain context is an added advantage.
Full Project Lifecycle & Cross-Team Collaboration: Has worked through the complete project lifecycle, from requirements gathering to production deployment. Experienced in collaborating with API and UI/UX teams, and understands how data flows end-to-end across upstream sources, backend services, APIs, and front-end applications.

Keyskills: PySpark, SQL, Python, Data Engineering, Data Pipelines, ETL, Databricks