Role: AI/ML Data Engineering
Location: Bangalore (WFO)
Type: C2H
Experience: 7+ years relevant
Mandatory Skills: ML Data Engineering, SQL/NoSQL, Python, PySpark, Airflow, Big Data modeling, MLflow, Unix, Hadoop, Docker, Kubernetes (K8s).
Key Responsibilities:
Design, build, and optimize scalable batch and streaming data pipelines using distributed frameworks such as Spark, Databricks, and Kafka.
Provide data and ML engineering thought leadership (the what, why, and how); design and code robust data models, feature pipelines, and ETL/ELT frameworks for analytics and ML.
Ensure data quality, observability, lineage, and performance across data platforms.
Build and refine ML models end-to-end: feature engineering, training, evaluation, and deployment.
Partner with data scientists to convert prototypes into production-grade ML solutions.
Implement CI/CD, model versioning, monitoring, and automation across data and ML workflows.
Product-driven mindset: collaborate with engineering and product teams to deliver data-driven outcomes.
Required Skills
7+ years of experience in ML/data engineering development.
Strong SQL/NoSQL, Python, and PySpark; ML model lifecycle frameworks (MLflow, Spark ML); orchestration (Airflow, Oozie, Dagster, etc.).
Expertise in Big Data modeling, distributed processing, and data lake and warehouse architectures at large operational scale.
Hands-on with ML lifecycle tools (MLflow, feature stores, model monitoring, evaluation).
Strong analytical and problem-solving skills: data/process-intensive design and architecture, strong debugging and optimization.
Working knowledge of foundational modeling concepts and algorithms such as regression, classification, and statistical models.
Solid grasp of core concepts: distributed file formats, open table formats, distributed transaction management, and workload parallelization.
Hands-on with Unix, Hadoop, and object-store fundamental operations and commands.
Basic skills with containerized processing (Docker + Kubernetes).

Keyskills: Hadoop, Big Data, Docker, ML Data Engineer, Python, Airflow, Unix, PySpark, Warehouse, ML Models, SQL, Containerization, MLflow, Kubernetes