Skills and Responsibilities:
- Experience with PySpark or Python
- Experience with big data tools (Spark, Hadoop, Kafka) would be good to have
- Hands-on experience with Microsoft Azure Data Factory, Azure Logic Apps, and Databricks
- Experience integrating different data sources
- Should have experience with Azure services (e.g. Azure Data Factory, Azure Data Lake Gen 2, Azure SQL, Blob Storage, Azure DevOps, ARM)
- Should have experience with Azure Databricks (PySpark) and Databricks Delta; experience with Scala and Python is nice to have
- Experience building data pipelines using the Azure big data stack
- Extensive experience ingesting, cleaning, transforming, and aggregating massive amounts of data from multiple internal and external sources (Salesforce, Google Analytics, etc.) using Azure services
- Create data pipelines that enable various systems within the ecosystem to stream high-volume data from multiple sources into a central repository for processing
- Enable reporting at scale with Databricks Delta for incremental processing
- Understanding of the Spark framework and tuning of Spark applications
- Extensive experience designing and implementing horizontally scalable, highly available systems, with a focus on performance and resiliency
- Coordinate with all stakeholders, understand the requirements, and meet delivery deadlines
Job Requirements: PySpark, ETL Validation, ETL Design
Location:
Chennai
Experience:
6 to 10 Years
Skills Required:
PySpark
