
Senior Data Engineer @ Idexcel


Job Description

Databricks (Spark)

  • Develop scalable ETL/ELT pipelines using PySpark (RDD/DataFrame APIs), Delta Lake, Auto Loader (cloudFiles), and Structured Streaming.
  • Optimize jobs: partitioning, bucketing, Z-Ordering, OPTIMIZE + VACUUM, broadcast joins, AQE, checkpointing.
  • Manage Unity Catalog: catalogs/schemas/tables, data lineage, permissions, secrets, tokens, and cluster policies.
  • CI/CD for Databricks assets: notebooks, Jobs, Repos, and MLflow.
  • Build Medallion Architecture (Bronze/Silver/Gold) with Delta Live Tables (DLT) and expectations for data quality.
  • Event-driven ingestion: Kafka/Kinesis into Databricks Structured Streaming.
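
For illustration, a minimal sketch of the Auto Loader (cloudFiles) pattern from the first bullet above, assuming a Databricks runtime where spark is already defined; the bucket paths and Unity Catalog table name are placeholders:

```python
# Incrementally ingest raw JSON from S3 into a Bronze Delta table with Auto Loader.
from pyspark.sql import functions as F

raw_path = "s3://example-bucket/raw/orders/"                # hypothetical landing zone
schema_path = "s3://example-bucket/_schemas/orders"         # schema tracking/evolution
checkpoint_path = "s3://example-bucket/_checkpoints/orders" # stream checkpoint

bronze_stream = (
    spark.readStream
    .format("cloudFiles")                                   # Auto Loader source
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", schema_path)
    .load(raw_path)
    .withColumn("_ingested_at", F.current_timestamp())      # ingestion audit column
)

(bronze_stream.writeStream
    .format("delta")
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)                             # process available files, then stop
    .toTable("main.bronze.orders"))                         # three-level Unity Catalog name
```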

Snowflake (DW & ELT)

  • Model and implement star/snowflake schemas, data marts, and secure views.
  • Performance tuning: clustering keys, micro-partitions, result caching, warehouse sizing, and Query Profile analysis.
  • Implement Task/Stream patterns for CDC; external tables for data lakes (S3); Snowpipe for near-real-time ingestion.
  • Python/Snowpark for transformations and UDFs; SQL best practices (CTEs, window functions).
  • Security: Row Level Security (RLS), Column Masking, OAuth/SCIM, network policies, data sharing (reader accounts).
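
As an illustration of the Snowpark and window-function items above, a sketch that deduplicates a staging table and publishes a mart table; the connection parameters and object names are placeholders, and it assumes the snowflake-snowpark-python package:

```python
# Keep the latest row per ORDER_ID and publish the result as a mart table.
from snowflake.snowpark import Session
from snowflake.snowpark import functions as F
from snowflake.snowpark.window import Window

session = Session.builder.configs({
    "account": "<account>", "user": "<user>", "password": "<password>",
    "warehouse": "TRANSFORM_WH", "database": "ANALYTICS", "schema": "STAGING",
}).create()

orders = session.table("STG_ORDERS")

# Window-function dedup: rank rows per key by recency, keep rank 1.
latest_first = Window.partition_by("ORDER_ID").order_by(F.col("UPDATED_AT").desc())
deduped = (orders
    .with_column("RN", F.row_number().over(latest_first))
    .filter(F.col("RN") == 1)
    .drop("RN"))

deduped.write.save_as_table("ANALYTICS.MARTS.DIM_ORDERS", mode="overwrite")
```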

AWS Data Engineering

  • Storage & compute: S3 (lifecycle, encryption, partitioning), EMR (if needed), Lambda, Glue (ETL/Schema registry), Athena, Kinesis (Data Streams/Firehose), RDS/Aurora, Step Functions.
  • Orchestration: MWAA/Airflow or Step Functions (error handling, retries, backfills, SLA alerts).
  • Infra-as-code: Terraform/CloudFormation for reproducible environments (Databricks workspace, IAM, S3, networking).
  • Security/compliance: IAM least privilege, KMS, VPC endpoints/private links, Secrets Manager, CloudTrail/CloudWatch, GuardDuty.
  • Observability: CloudWatch metrics/logs, structured logging, Datadog/Prometheus (optional), cost monitoring (tags/budgets).
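
A sketch of the orchestration item above: an Airflow (MWAA) DAG with retries and an SLA that triggers a Databricks job run. The job_id, connection id, and schedule are placeholders, and it assumes the apache-airflow-providers-databricks package is installed:

```python
# Nightly ELT DAG with retries and an SLA, handing off to a Databricks job.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

default_args = {
    "owner": "data-eng",
    "retries": 3,                           # retry transient failures
    "retry_delay": timedelta(minutes=10),
    "sla": timedelta(hours=2),              # alert if a task run exceeds 2 hours
}

with DAG(
    dag_id="orders_elt_nightly",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",          # 02:00 UTC daily
    catchup=False,                          # backfills are triggered explicitly instead
    default_args=default_args,
) as dag:
    run_bronze_to_gold = DatabricksRunNowOperator(
        task_id="run_bronze_to_gold",
        databricks_conn_id="databricks_default",
        job_id=12345,                       # hypothetical Databricks Jobs ID
    )
```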

Data Quality, Governance & Security

  • Implement unit/integration tests for pipelines (e.g., pytest + Great Expectations + DLT expectations).
  • Data contracts and schema evolution; monitor SLA/SLO; DQ dashboards (missingness, drift, freshness, completeness).
  • PII handling: tokenization/pseudonymization, field-level encryption, adherence to KYB/KYC data-flow requirements; audit trails.
  • Cataloging & lineage through Unity Catalog and/or OpenLineage/Purview (if applicable).
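
A sketch of pipeline data-quality tests in the spirit of the first bullet above, written as plain pytest assertions over a pandas sample; in practice these rules might live in a Great Expectations suite or DLT expectations, and the fixture path and column names are placeholders:

```python
# Null, uniqueness, and freshness checks over a sample of a Silver table.
from datetime import datetime, timedelta

import pandas as pd
import pytest


@pytest.fixture
def silver_orders() -> pd.DataFrame:
    # Hypothetical small extract of the table under test.
    return pd.read_parquet("tests/fixtures/silver_orders_sample.parquet")


def test_primary_key_is_unique_and_non_null(silver_orders):
    assert silver_orders["order_id"].notna().all(), "order_id contains nulls"
    assert not silver_orders["order_id"].duplicated().any(), "duplicate order_id values"


def test_data_is_fresh(silver_orders):
    # Assumes updated_at is stored as naive UTC timestamps.
    latest = pd.to_datetime(silver_orders["updated_at"]).max().to_pydatetime()
    assert datetime.utcnow() - latest < timedelta(hours=24), "Silver data is stale"
```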

DevOps & CI/CD

  • Git workflows (branching, PR reviews), Databricks CLI/Terraform modules for jobs/clusters/UC, Snowflake DevOps (object versioning via schemachange or SQL-based migration).
  • Automated testing in pipelines; feature flags, canary releases for data jobs; rollback strategies.
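
A sketch of the SQL-based migration idea (schemachange-style) from the first bullet above, suitable as a CI step; credentials come from environment variables, all names are placeholders, and it assumes snowflake-connector-python:

```python
# Apply versioned .sql migrations (V001__create_tables.sql, V002__..., ...) in order.
import os
from pathlib import Path

import snowflake.connector  # assumes snowflake-connector-python is installed

conn = snowflake.connector.connect(
    account=os.environ["SNOWFLAKE_ACCOUNT"],
    user=os.environ["SNOWFLAKE_USER"],
    password=os.environ["SNOWFLAKE_PASSWORD"],
    role="DEPLOY_ROLE",
    warehouse="DEPLOY_WH",
)
cur = conn.cursor()
try:
    for script in sorted(Path("migrations").glob("V*.sql")):
        print(f"applying {script.name}")
        # Naive statement split; adequate for simple DDL files in a sketch.
        for statement in script.read_text().split(";"):
            if statement.strip():
                cur.execute(statement)
finally:
    cur.close()
    conn.close()
```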

Client-Facing PoCs & Delivery

  • Rapid PoC builds: define clear success metrics, benchmark cost/performance, and produce a transition plan to production.
  • Present architectural decisions, trade-offs (Spark vs Snowflake ELT), and cost projections (Databricks DBU, Snowflake credits, storage egress).
  • Produce runbooks, operational playbooks, and knowledge transfer documents for client teams.
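
A back-of-envelope sketch of the kind of cost projection mentioned above; every rate and usage figure is a placeholder, since actual DBU and credit prices vary by cloud, region, and tier:

```python
# Rough monthly run-cost estimate combining Databricks, Snowflake, and egress.
dbu_per_hour = 8                 # hypothetical job-cluster DBU consumption
dbu_rate_usd = 0.30              # hypothetical $/DBU for the chosen compute tier
databricks_hours_month = 120

snowflake_credits_month = 200    # hypothetical warehouse usage
credit_rate_usd = 3.00           # hypothetical $/credit for the account edition

egress_tb_month = 0.5
egress_rate_usd_per_tb = 90      # hypothetical cross-region egress rate

monthly_cost = (
    dbu_per_hour * dbu_rate_usd * databricks_hours_month
    + snowflake_credits_month * credit_rate_usd
    + egress_tb_month * egress_rate_usd_per_tb
)
print(f"Projected monthly run cost: ${monthly_cost:,.2f}")
```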

Required Technical Skillset

  • Databricks: PySpark, Delta Lake, Auto Loader, DLT, Jobs, Unity Catalog, MLflow basics.
  • Snowflake: SQL, Snowpipe, Tasks/Streams, Snowpark (Python), warehouse sizing, performance tuning, security policies.
  • Python: strong with core DE packages (pandas, pyarrow, pytest), robust error handling, typing, and packaging.
  • Orchestration: Airflow DAGs (Sensors, Operators, XCom), Step Functions state machines.
  • Streaming & CDC: Kafka/Kinesis, Debezium (nice-to-have), CDC patterns to Delta/Snowflake.
  • AWS: S3, Glue, Lambda, Kinesis, IAM/KMS, VPC, CloudWatch; Terraform/CloudFormation.
  • Data Modeling: 3NF/dimensional modeling, slowly changing dimensions (SCD Type 2), surrogate keys, and surrogate vs. natural key trade-offs.
  • Security & Compliance: encryption at rest/in transit, tokenization, key rotation, audit logging, governance controls.
  • Performance & Cost: Spark job tuning, Snowflake warehouse right-sizing, partitioning/clustering, object storage best practices.
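
A sketch of the SCD Type 2 pattern from the Data Modeling item above, written against Delta Lake; it assumes a Databricks/Spark session where spark is defined, and the table, key, and column names (customer_id, address, change_ts) are illustrative:

```python
# Two-step SCD Type 2 upsert: expire changed "current" rows, then append new versions.
from delta.tables import DeltaTable
from pyspark.sql import functions as F

dim_name = "gold.dim_customer"
updates = spark.table("silver.customers_changes")   # incoming CDC batch
dim = DeltaTable.forName(spark, dim_name)

current = dim.toDF().filter("is_current = true").select("customer_id", "address")

# Rows that are brand new or whose tracked attribute changed.
changed = (updates.alias("u")
    .join(current.alias("c"),
          F.col("u.customer_id") == F.col("c.customer_id"), "left")
    .filter(F.col("c.customer_id").isNull() |
            (F.col("u.address") != F.col("c.address")))
    .select("u.*"))

# Step 1: close out the previous "current" versions for changed keys.
(dim.alias("d")
    .merge(changed.alias("u"), "d.customer_id = u.customer_id AND d.is_current = true")
    .whenMatchedUpdate(set={"is_current": "false", "valid_to": "u.change_ts"})
    .execute())

# Step 2: append the new current versions with SCD2 tracking columns.
(changed
    .select(
        "customer_id", "address",
        F.col("change_ts").alias("valid_from"),
        F.lit(None).cast("timestamp").alias("valid_to"),
        F.lit(True).alias("is_current"))
    .write.format("delta").mode("append").saveAsTable(dim_name))
```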

Nice-to-Have:

  • dbt (Snowflake) with tests & exposures; Great Expectations.
  • Databricks SQL Warehouses and BI connectivity; Photon engine awareness.
  • Lakehouse Federation (UC external locations); Delta Sharing; Iceberg.
  • Kafka Connect/Debezium, NiFi or MuleSoft (for data integrations).
  • Experience in financial services.
  • Exposure to ISO/IEC 27001 controls in data platforms.

Education & Certifications

  • Bachelors/Masters in CS/IT/EE or related.
  • Certifications (plus): Databricks Data Engineer Associate/Professional, Snowflake SnowPro Core/Advanced, AWS Solutions Architect/Big Data/DP.

Job Classification

Industry: Recruitment / Staffing
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time

Contact Details:

Company: Idexcel
Location(s): Hyderabad



Keyskills: Data Engineering, PySpark, Auto Loader, Data Quality, DevOps, Snowflake, Delta Lake, CI/CD, Data Modeling, Databricks, ETL, AWS, Data Governance, Python


Salary: ₹ Not Disclosed


Idexcel

Idexcel Technologies Private Limited. Idexcel is a Professional Services and Technology Solutions provider specializing in Cloud Services, Application Modernization, and Data Analytics. Idexcel is proud that for more than 21 years it has provided services that implement complex technologies that ...