Job Description
- Develop and implement data pipelines and systems to connect and process data for analytics and business intelligence (BI) platforms.
- Document systems and source-to-target mappings to ensure transparency and a clear
understanding of data flow.
- Re-engineer manual data flows to make them scalable, automated, and efficient for repeated use.
- Adhere to and contribute to best-practice guidelines, continuously striving for optimisation and improvement.
- Write clean, secure, and well-tested code, ensuring reliability, maintainability, and
compliance with development standards.
- Monitor and operate the services and pipelines you build, proactively identifying and
resolving production issues.
- Assess and prioritise feature requests based on business needs, technical feasibility,
and impact.
- Identify opportunities to optimise existing data flows, promoting efficiency and reducing
redundancy.
- Collaborate closely with team members and stakeholders to align efforts and achieve
shared objectives.
- Implement data quality checks and validation processes to ensure accuracy and resolve
data inconsistencies.
Requirements and Skills:
- Strong background in Software Engineering, with proficiency in Python development (3+
years of experience).
- Excellent problem-solving, communication, and organisational skills.
- Ability to work independently and collaboratively within a team environment.
- Understanding of industry-recognised data modelling patterns and standards, and their
practical application.
- Familiarity with data security and privacy principles, ensuring compliance with
governance and regulatory requirements.
- Proficiency in SQL, with experience in PostgreSQL database management.
- Experience in API implementation and integration, with an understanding of REST
principles and best practices.
- Knowledge of validation libraries such as Marshmallow or Pydantic.
- Expertise in Pandas, Polars, or similar libraries for data manipulation and analysis.
- Proficiency in workflow orchestration tools such as Apache Airflow or Dagster, ensuring efficient data pipeline scheduling and execution.
- Experience working with Apache Iceberg, enabling optimised data management and storage within large-scale analytics environments.
- Understanding of data lake architectures, leveraging scalable storage solutions for
structured and unstructured data.
- Familiarity with data warehouse solutions, ensuring efficient data processing, query
performance, and analytics workflows.
- Knowledge of operating systems (Linux) and modern development practices, including
infrastructure deployment (DevOps).
- Proficiency in code versioning tools such as Git/GitHub, and experience with CI/CD
pipelines (e.g., CircleCI).
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Data Science & Analytics
Role Category: Data Science & Machine Learning
Role: Data Engineer
Employment Type: Full-time
Contact Details:
Company: Vserve
Location(s): Bengaluru
Key Skills:
Airflow
GCP
Python