Monitoring Engineer @ Cirruslabs

Job Description

Experience - 4-6 years

Role Summary
Mid-level engineer to help build, operate, and continuously improve our monitoring and observability platform. You'll ensure reliability and performance by creating actionable alerting, clear dashboards, and resilient telemetry pipelines across infrastructure and applications. You'll also manage stakeholders and maintain customer satisfaction with the monitoring platform.

Key Responsibilities

  • Design and maintain monitoring for services, infrastructure, and cloud resources (metrics, logs, traces).
  • Build and tune alerting rules to reduce noise and improve actionable signal quality.
  • Create dashboards and service health views for engineering and operations stakeholders.
  • Support incident response by improving detection, triage, and post-incident learnings (RCA).
  • Implement SLOs/SLIs and error budgets; track reliability/performance trends.
  • Automate monitoring workflows (onboarding new services, alert routing, report generation).
  • Partner with app teams to instrument code and standardize telemetry (OpenTelemetry where applicable).
  • Maintain and optimize observability tooling, capacity, and cost (data retention, sampling, indexing).
  • Document standards, runbooks, and operational procedures; mentor junior team members.

Required Qualifications

  • 4-6 years of experience in monitoring/observability, SRE, platform engineering, or production operations.
  • Strong fundamentals in cloud platforms, networking, and troubleshooting distributed systems.
  • Hands-on experience with at least one monitoring stack (Datadog, Dynatrace).
  • Familiarity with tracing and instrumentation concepts (e.g., OpenTelemetry, Jaeger, Zipkin).
  • Scripting/automation skills (Python, Go, or Bash) and comfort with APIs.
  • Experience with on-call/incident management and writing postmortems.
  • Clear communication skills and ability to work cross-functionally.

Preferred Qualifications

  • Cloud experience (AWS/Azure/GCP) and managed monitoring services.
  • Container/Kubernetes monitoring (Kubernetes, Helm, service meshes).
  • Infrastructure as Code (Terraform/CloudFormation) and CI/CD (GitHub Actions, Jenkins, GitLab).
  • Experience defining SLOs and implementing alerting based on user impact.
  • Knowledge of IT service management tooling (e.g., ServiceNow) and alert routing.

Soft Skills & Ways of Working

  • Bias for automation and measurable improvements (alert quality, MTTR, availability).
  • Strong ownership mindset for production reliability.
  • Pragmatic approach to standards: improve consistency without blocking delivery.

Job Classification

Industry: Recruitment / Staffing
Functional Area / Department: IT & Information Security
Role Category: IT Infrastructure Services
Role: IT Operations Management
Employment Type: Full time

Contact Details:

Company: Cirruslabs
Location(s): Hyderabad


Key skills: SRE, troubleshooting, Python, IT Infrastructure Management

₹ 9.5-19 Lacs P.A

Cirruslabs

We are CirrusLabs. Our vision is to become the world's most sought-after niche digital transformation company that helps customers realize value through innovation. Our mission is to co-create success with our customers, partners and community. Our goal is to enable employees to dream, grow and make...