Job Description
We are looking for an experienced AWS Data Engineer to join our data engineering team, with a strong background in Python-based data engineering; AWS services such as S3, Lambda, Glue, Redshift, and Athena; monitoring with CloudWatch; and SQL/PostgreSQL. The ideal candidate is passionate about building robust data pipelines, optimizing data workflows, and delivering scalable cloud-based data solutions.
Responsibilities:
- Design, develop, and maintain data solutions using AWS services, with a particular focus on S3, Athena, Glue, Lambda, and Redshift.
- Implement Apache Airflow to manage and schedule complex data workflows (see the DAG sketch after this list).
- Create and maintain efficient ETL processes using PySpark and other relevant tools (see the PySpark sketch after this list).
- Develop scalable and performant data pipelines to process large volumes of data.
- Implement data quality checks and monitoring systems to ensure data integrity.
- Write efficient SQL queries to extract, manipulate, and analyze data from various relational and non-relational data sources.
- Work with SQL and NoSQL databases, optimizing queries and database structures for performance.
- Design and implement database schemas that align with business requirements and data models.
- Collaborate with data scientists, analysts, and other engineers to understand data needs and deliver reliable solutions.
- Continuously monitor and optimize the performance of data processing jobs and queries.
- Implement best practices for cost optimization in AWS environments.
- Troubleshoot and resolve performance bottlenecks in data pipelines and analytics processes.
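As an illustration of the Airflow responsibility above, here is a minimal sketch of a daily DAG that triggers an AWS Glue job. The DAG id, Glue job name, and region are hypothetical placeholders, not this team's actual pipelines:

```python
# Minimal Airflow DAG sketch: schedule a daily run of an AWS Glue ETL job.
# The dag_id, job_name, and region below are hypothetical examples.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="daily_sales_pipeline",   # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # run once per day (Airflow 2.4+ syntax)
    catchup=False,
) as dag:
    run_glue_etl = GlueJobOperator(
        task_id="run_glue_etl",
        job_name="daily_sales_etl",  # hypothetical existing Glue job
        region_name="us-east-1",
        wait_for_completion=True,    # block until the Glue job finishes
    )
```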
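Similarly, a minimal PySpark sketch of the extract-transform-load pattern this role involves, reading raw CSV from S3 and writing Parquet that Athena can query. All bucket paths and column names are hypothetical:

```python
# Minimal PySpark ETL sketch: read raw CSV from S3, clean it, and write
# partitioned Parquet for Athena. Paths and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV files from S3.
raw = spark.read.csv(
    "s3://example-raw-bucket/orders/", header=True, inferSchema=True
)

# Transform: drop rows missing the key and derive a date partition column.
clean = (
    raw.dropna(subset=["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write Parquet partitioned by date for efficient Athena queries.
clean.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-curated-bucket/orders/"
)
```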
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Proven experience as an AWS Data Engineer or in a similar role.
- Strong expertise in architecture principles, design patterns, and best practices.
- Hands-on experience with cloud platforms (e.g., AWS) and related services (AWS Glue, AWS Athena, AWS Lambda, AWS DevOps, AWS CloudWatch, AWS S3).
- Proficiency in PySpark.
- Experience with databases such as Oracle SQL, IBM DB2, PostgreSQL, SAP HANA, and MySQL, including query optimization and performance tuning.
- Strong ability to write, debug, and optimize complex SQL queries for data extraction and transformation.
- Excellent analytical and problem-solving skills with a strong attention to detail.
- Outstanding communication and interpersonal skills, with the ability to collaborate effectively with diverse teams and stakeholders.
- Experience in Agile methodologies and DevOps practices.
- Experience in digital transformation initiatives, migration projects, and data modeling.
- Knowledge of containerization and orchestration tools.
Skills: PySpark, ETL, Apache Airflow, Oracle SQL, IBM DB2, PostgreSQL, SAP HANA, MySQL, S3, Lambda, Glue, Redshift, Athena, CloudWatch (monitoring)
Desired Skills
- AWS Certified Solutions Architect - Associate
- Experience in containerization and orchestration tools.
Education & Certifications:
B.Tech/M.Tech/MCA
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Engineer
Employment Type: Full time
Contact Details:
Company: Solugenix
Location(s): Indore
Key Skills:
PySpark
ETL
AWS