Lead Site Reliability Engineer (GCP Ops, Terraform, Python, GitHub Actions) @ Optum

Job Description

Position Overview:
We are seeking a motivated and detail-oriented Site Reliability Engineer (SRE) to help us improve the reliability, scalability, and performance of our systems. As an SRE, you will collaborate with cross-functional teams to design, build, and maintain the infrastructure and tools that support our applications. This is an excellent opportunity for someone who is passionate about DevOps, automation, and cloud-native technologies.

Key Responsibilities:

  • Design, deploy, and maintain Kubernetes-based infrastructure to ensure high availability and scalability of applications.
  • Build and manage CI/CD pipelines using GitHub Actions to enable fast and reliable deployments.
  • Use Terraform to provision and manage infrastructure in Google Cloud Platform (GCP).
  • Manage and optimize Apache Kafka-based systems to ensure reliable message streaming and data processing.
  • Monitor and improve system performance and reliability using Prometheus and Grafana.
  • Collaborate with developers to automate workflows and implement best practices for infrastructure-as-code (IaC).
  • Write Python scripts for automation and tooling to enhance operational efficiency.
  • Troubleshoot and resolve system issues to minimize downtime and impact on users.
  • Participate in on-call rotations and incident response to ensure high service reliability.

Required Skills & Qualifications:

  • Familiarity with Google Cloud Platform (GCP) services such as Compute Engine, Kubernetes Engine, and Cloud Storage.
  • Hands-on experience with Kubernetes for deploying and managing containerized applications.
  • Understanding of GitHub Actions for creating and maintaining CI/CD pipelines.
  • Basic to intermediate knowledge of Terraform for infrastructure provisioning and management.
  • Proficiency in Python for scripting, automation, and tooling.
  • Experience with Apache Kafka for building, maintaining, and troubleshooting message-driven systems.
  • Experience using Prometheus and Grafana for monitoring and observability.
  • Strong problem-solving skills and an eagerness to learn new technologies.
  • Excellent communication and teamwork skills.

Nice-to-Have Skills (Optional):

  • Familiarity with other cloud providers (e.g., AWS or Azure).
  • Knowledge of Helm for Kubernetes package management.
  • Experience with debugging and optimizing distributed systems.
  • Exposure to security best practices for cloud infrastructure.
  • Knowledge of Java for developing and troubleshooting backend systems.
  • Familiarity with DataHub or similar data cataloging and metadata management platforms.
  • Understanding of Artificial Intelligence (AI) concepts and tools, such as building or managing machine learning pipelines, integrating AI models, or working with ML platforms like TensorFlow, PyTorch, or Vertex AI.
  • Experience with Golang for developing infrastructure tools or cloud-native applications.

Education & Experience:

  • Bachelor's degree in Computer Science, Information Technology, or related field (or equivalent work experience).
  • 1-3 years of experience in DevOps, SRE, or related roles (internships and project experience are acceptable for entry-level candidates).

Job Classification

Industry: Analytics / KPO / Research
Functional Area / Department: Engineering - Software & QA
Role Category: DevOps
Role: Site Reliability Engineer
Employment Type: Full time

Contact Details:

Company: Optum
Location(s): Bengaluru


Keyskills: GCP Ops, GitHub Actions, Terraform, Python, Kubernetes, GenAI, Artificial Intelligence


₹ Not Disclosed
