Site Reliability Engineer @ Sonata Software

Job Description

We are seeking a Site Reliability Engineer with strong expertise in AWS, CI/CD, IaC, and Kubernetes to ensure the reliability, scalability, and security of large-scale data infrastructure. The ideal candidate will blend DevOps best practices with data engineering operations, focusing on automation, observability, and cloud-native solutions.

Primary Skills (Must-Have)

  • AWS (core services: EC2, EKS, Lambda, Redshift, S3, IAM, VPC)
  • CI/CD (Jenkins, GitHub Actions, AWS CodePipeline)
  • Infrastructure as Code (Terraform, CloudFormation)
  • Kubernetes (EKS) and container orchestration

Secondary Skills (Good-to-Have)

  • AWS Systems Manager, Dataiku platform operations
  • Experience with platform patching, upgrades, and maintenance

Tools & Platforms

  • Data Warehousing & Processing: Snowflake, Redshift, Apache Airflow, dbt
  • CI/CD & Deployment: Jenkins, GitHub Actions, AWS CodePipeline, Terraform
  • Cloud & Event Processing: AWS Lambda, API Gateway, SNS/SQS, Kafka, Step Functions
  • Monitoring & Logging: DataDog, AWS CloudWatch, Prometheus, Splunk
  • Incident Management: PagerDuty, Opsgenie, AWS Health Dashboard
  • Collaboration & Code Review: GitHub, Jira, Confluence

Key Responsibilities

Data Pipeline Reliability & Observability

  • Maintain highly available, fault-tolerant infrastructure for ETL jobs and real-time data processing
  • Implement monitoring of Airflow DAGs, Snowflake queries, and AWS data workflows
  • Automate health checks, error handling, and self-healing for data pipelines

Infrastructure & Cloud Automation

  • Deploy and manage AWS-based infrastructure with Terraform & CloudFormation
  • Optimize Kubernetes (EKS) clusters for scale and cost efficiency
  • Support scaling and reliability for Redshift, Snowflake, and storage solutions

Performance, Monitoring & Incident Response

  • Build real-time monitoring, logging, and alerting with DataDog, CloudWatch, and Prometheus
  • Define & track SLOs/SLIs to improve data platform uptime
  • Perform RCA, post-mortems, and security audits after incidents

Security & Compliance

  • Ensure compliance with GDPR, CCPA, SOC 2 across data pipelines
  • Apply AWS security best practices (IAM, KMS, Shield, WAF)
  • Secure API Gateways, data access policies, and encryption standards

Collaboration & Leadership

  • Partner with data engineers, analytics, and DevOps teams to improve reliability
  • Participate in DR (Disaster Recovery) planning and security compliance reviews
  • Promote best practices in automation, observability, and cost optimization

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: Sonata Software
Location(s): Hyderabad


Key skills: Jenkins, CI/CD, IaC, AWS, Kubernetes, Terraform

Salary: ₹ Not Disclosed

Sonata Software

Sonata is a global technology company that enables successful platform-based digital transformation initiatives for enterprises, helping them create businesses that are connected, open, intelligent and scalable. Sonata's Platformation™ methodology brings together industry expertise, platform technol...