MLOps Architect @ CitiusTech

Job Description

MLOps Engineering : Experience operationalizing and managing ML/AI workloads in production environments
Distributed Tracing & Observability : Strong understanding and hands-on implementation of metrics, logs, and traces (the three pillars of observability)
Monitoring & Alerting : Production experience building Grafana dashboards and actionable alert systems; understands that dashboards without alerts lack operational value
Azure Databricks Operations : Cluster management, performance optimization, timeout resolution, library troubleshooting, and compute issue resolution
Azure Cloud Services : Deep knowledge of Azure PaaS, AKS, cloud-native architectures, and the Azure monitoring/diagnostics ecosystem


Good-to-Have Skills
GCP Experience : Exposure to Google Cloud Platform services and telemetry collection
Multi-Cloud Operations : Experience across Azure, GCP, or AWS environments
Apache Airflow : Workflow orchestration experience (basic level acceptable; can be learned on job)
Python/Scripting : Automation and scripting proficiency
MLOps Knowledge : Understanding of ML lifecycle management and MLOps practices


Technology Stack
Primary Cloud : Microsoft Azure
Key Platforms : Azure Databricks, Azure Kubernetes Service (AKS), Azure PaaS services
Observability : Grafana, distributed tracing tools, metrics/logs/traces platforms
Orchestration : Apache Airflow (basic usage)
Secondary Cloud : GCP services (limited scope)


Key Responsibilities
Design and implement comprehensive observability solutions using metrics, logs, and distributed traces
Build unified Grafana dashboards for single-pane-of-glass visibility across multi-cloud environments
Establish actionable alerting frameworks that drive incident response
Implement distributed tracing for AI/ML workloads and microservices
Proactively identify and remediate performance bottlenecks
Monitor, troubleshoot, and optimize Azure Databricks compute environments
Right-size clusters and resolve performance issues (timeouts, long-running jobs, library failures)
Build observability layers where current gaps exist
Manage and optimize AKS workloads and Azure PaaS offerings
Collect telemetry from Azure and GCP services and route it to the observability stack
Integrate diverse cloud services into unified monitoring infrastructure
Implement logging, metrics collection, and tracing across heterogeneous environments
Ensure comprehensive visibility across the entire technology stack
Create and manage support cases with Databricks and Microsoft
Provide technical support for AI/ML workloads on cloud infrastructure
Research and implement solutions for unfamiliar technologies

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: IT & Information Security
Role Category: IT Infrastructure Services
Role: Infrastructure Architect
Employment Type: Full time

Contact Details:

Company: CitiusTech
Location(s): Pune


Keyskills: Grafana, Azure, Kubernetes, ML, MLOps, Kubeflow, AKS, ML Deployment, ML Pipelines, MLflow

₹ Not Disclosed

CitiusTech

CitiusTech is a specialist provider of healthcare technology services and solutions, with a strong presence across the globe. As a strategic partner to some of the world's largest healthcare organizations, CitiusTech plays a deep and meaningful role in accelerating technology innovation and shaping th...