Role: Senior SRE
Experience: 6-10 years
Location: Bangalore
Roles & Responsibilities
Reliability & Operations
- Design, implement, and maintain highly available and resilient systems in Kubernetes-based environments
- Define and enforce SLOs, SLIs, and error budgets
- Lead incident response, root cause analysis (RCA), and postmortems
- Drive reliability improvements through automation
Observability (Core Focus)
- Architect and operate observability platforms for metrics, logging, tracing, and alerting
- Work with Prometheus, Alertmanager, OpenTelemetry, Grafana, and Loki, ELK, or OpenSearch
- Implement cloud-native monitoring (GCP Cloud Monitoring & Logging preferred)
- Establish actionable alerting standards
Cloud & Platform Engineering
- Build and manage infrastructure on GCP (preferred) or AWS
- Operate Kubernetes clusters (GKE preferred)
- Deploy services using Helm
- Manage containerized workloads using Docker
Automation & Tooling
- Develop automation and tooling in Python (strong Python skills required), with emphasis on reliability, automation, and observability
- Create internal reliability and monitoring tools
- Integrate CI/CD pipelines with observability and reliability checks
Collaboration & Leadership
- Mentor junior engineers
- Influence architecture decisions
- Collaborate across engineering teams
Please send your updated resume to pr*******m@te******s.in

Key skills: ELK, SRE, Python, Terraform, Docker, AWS, Observability, Grafana, Kubernetes
About the Company: Tekskills India is an ISO 9001 certified, CMM Level 3 company with a global presence in the US, UK, and India. In India, we have offices in Hyderabad, Bangalore, Kolkata, Chennai, and Pune. We're a global leader in business and technology services, helping our clients bring the futur...