
Sr. Developer @ Cognizant


Job Description

 


 

Job Summary

We are seeking a highly skilled GCP Infrastructure Engineer to design, build, and manage the cloud infrastructure that powers Generative AI (GenAI) applications at scale. In this role, you will leverage Google Cloud Platform (GCP), Vertex AI, IBM Watsonx, and containerization technologies such as Docker and Kubernetes (GKE) to deliver secure, scalable, and high-performance AI solutions. You will own the end-to-end infrastructure lifecycle, from design and provisioning to automation, monitoring, and optimization, while enabling data scientists and ML engineers to seamlessly deploy and operate GenAI workloads.

Key Responsibilities

Cloud Infrastructure and Platform Engineering

Design, provision, and maintain scalable, secure, and cost-efficient infrastructure for GenAI applications on GCP.

Deploy and manage containerized workloads using Docker and Kubernetes (GKE).

Configure and optimize Vertex AI and IBM Watsonx platforms for training, fine-tuning, and serving LLMs and other generative models.

Implement high-performance GPU/TPU clusters to support distributed training and large-scale inference.

Ensure business continuity through backup, disaster recovery, and multi-region deployments.

Automation and Reliability

Develop and maintain Infrastructure as Code (IaC) templates with Terraform or Cloud Deployment Manager.

Adopt GitOps practices (Flux) for infrastructure lifecycle management.

Build and optimize CI/CD pipelines for data pipelines, model workflows, and GenAI applications.

Apply SRE principles (SLIs, SLOs, SLAs) to guarantee platform reliability and uptime.

Security, Governance, and Compliance

Embed DevSecOps best practices across the infrastructure lifecycle, including policy-as-code, vulnerability scanning, and secrets management.

Enforce identity and access management (IAM), network segmentation, and data encryption in compliance with standards (HIPAA, SOX, GDPR, FedRAMP).

Collaborate with enterprise security and compliance teams to implement governance frameworks for GenAI platforms.

Monitoring, Observability, and Cost Optimization

Implement observability stacks (Prometheus, Grafana, Cloud Monitoring, Datadog) for both infrastructure health and ML-specific metrics (model drift, data anomalies).

Define KPIs to monitor system health, performance, and adoption across AI workloads.

Optimize cloud cost efficiency for GPU/TPU-intensive workloads using autoscaling, preemptible instances, and utilization monitoring.

Collaboration and Enablement

Partner with data scientists, ML engineers, and software teams to streamline GenAI application development and deployment.

Provide onboarding documentation and reusable templates to enable faster adoption of AI infrastructure.

Stay current with the latest advancements in GenAI, cloud-native infrastructure, and container orchestration.

Required Education

Bachelor's or master's degree in Computer Science, Software Engineering, or a related field.

Required Experience

5 years of experience in cloud infrastructure engineering, DevOps, or platform engineering.

Experience with GenAI use cases (chatbots, content generation, code assistants, etc.).

Strong hands-on expertise with Google Cloud Platform (GCP), especially Vertex AI.

Experience with IBM Watsonx for AI application deployment and management.

Proven skills in Docker, Kubernetes (GKE), and container orchestration at scale.

Proficiency in Python, Bash, or other relevant scripting languages.

Strong understanding of cloud networking, IAM, and security best practices.

Experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins) and IaC tools (Terraform, Pulumi, Ansible, Deployment Manager).

Familiarity with data pipelines and integration tools (Dataflow, Apache Beam, Pub/Sub, Kafka).

Excellent problem-solving, debugging, and communication skills.

Preferred Experience

Experience in MLOps practices for model deployment, monitoring, and retraining.

Exposure to multi-cloud or hybrid cloud environments (GCP, AWS, Azure, on-prem).

Hands-on experience with feature stores (Vertex AI Feature Store, Feast) and ML observability tools (EvidentlyAI, Fiddler).

Knowledge of distributed training frameworks (Horovod, DeepSpeed, PyTorch Distributed).

Contributions to open-source projects in infrastructure, MLOps, or GenAI.

Experience managing infrastructure in regulated industries.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Software Development
Role: Data Platform Engineer
Employment Type: Full time

Contact Details:

Company: Cognizant
Location(s): Pune



Keyskills: kubernetes, continuous integration, orchestration, vertex, networking, docker, ansible, cloud, scripting, iam, gcp, devops, jenkins, debugging, scripting languages, communication skills, deployment, cd, python, github, microsoft azure, engineering, grafana, kafka, cloud infrastructure, bash, gitlab, aws


₹ Not Disclosed

