Responsible for designing, building, and deploying large-scale, high-performance ML solutions grounded in strong core computer science fundamentals. The role focuses on developing deep learning models, optimizing inference systems for low latency and high throughput, and integrating them into distributed production environments. The position also involves improving system efficiency, ensuring secure and compliant architectures, and collaborating with cross-functional teams to convert research models into production-grade applications.
Accountabilities:
End-to-End ML Pipeline Development
Build and optimize model training, evaluation, and deployment pipelines for large-scale production environments.
High-Performance Inference Engineering
Architect and scale distributed inference systems capable of processing large request volumes with minimal latency and efficient compute utilization.
Model Optimization and Tuning
Implement model refinement techniques such as quantization, pruning, ONNX/TensorRT acceleration, and GPU-level optimizations to improve real-time inference performance.
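As an illustration of the kind of optimization work this covers, the sketch below shows post-training int8 weight quantization in plain Python. It is a hypothetical, simplified example (real work would use framework tooling such as PyTorch quantization or TensorRT); the function names and sample weights are illustrative, not from this posting.

```python
# Minimal sketch of affine int8 quantization (illustrative only).
# Production systems would rely on PyTorch / ONNX Runtime / TensorRT tooling.

def quantize_int8(weights):
    """Affine-quantize floats to int8; return (q, scale, zero_point)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # guard against a constant tensor
    zero_point = round(-lo / scale) - 128
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Reconstruct approximate float weights from int8 values."""
    return [(v - zero_point) * scale for v in q]

weights = [0.1, -0.53, 0.97, 0.0]
q, s, z = quantize_int8(weights)
restored = dequantize_int8(q, s, z)
```

Quantization like this trades a small amount of precision (reconstruction error is bounded by roughly half the scale) for a 4x reduction in weight memory versus float32, which is one lever behind the latency and memory-footprint goals listed under the performance metrics below.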
Data Engineering and Processing
Develop robust data ingestion, preprocessing, and augmentation frameworks for structured, unstructured, and multimodal datasets while maintaining data integrity and quality.
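A toy sketch of the ingestion-and-integrity step described above, using only the standard library. The record schema and field names here are hypothetical examples, not part of the role description.

```python
# Hypothetical preprocessing step for structured records: trim whitespace,
# normalize text, and drop incomplete rows to preserve data integrity.

import csv
import io

RAW = """id,text,label
1, Great product!! ,pos
2,,neg
3,Terrible.,neg
"""

def preprocess(row):
    """Clean one CSV row; return None for records failing integrity checks."""
    text = (row.get("text") or "").strip().lower()
    if not text:
        return None  # skip rows with missing text
    return {"id": int(row["id"]), "text": text, "label": row["label"]}

records = [r for r in (preprocess(row) for row in csv.DictReader(io.StringIO(RAW))) if r]
```

In practice this kind of logic would live inside a framework such as Airflow or a streaming pipeline, but the core concerns (normalization, validation, dropping bad records) are the same.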
Model Deployment & Performance Metrics:
1. Reduction in inference latency, compute cost, and system memory footprint
2. Successful deployment of ML models meeting defined SLAs (e.g., throughput, latency)
Pipeline Reliability & Efficiency:
1. Uptime, stability, and scalability of ML services in production
2. Faster development cycle time through optimized pipelines and tooling
Educational Qualifications
Bachelor's or Master's degree in Computer Science, Engineering, or a related field
Skills Required (Technical and/or Behavioral)
Strong fundamentals in computer architecture, operating system internals, and system design.
Experience in designing and maintaining scalable API-based systems.
(Preferred) Proficiency in Python with deep learning frameworks such as PyTorch and TensorFlow, as well as experience in system-level programming (C++ preferred).
Experience building high-performance APIs using frameworks such as FastAPI.
(Preferred) Knowledge of computer vision and NLP frameworks including OpenCV and HuggingFace Transformers.
(Preferred) Familiarity with MLOps tools and platforms such as MLflow, Kubeflow, Airflow, Docker, and Kubernetes.

Keyskills: Artificial Intelligence, Machine Learning, AI/ML, Deep Learning, Python
Publicis Sapient is a digital business transformation company. We partner with global organizations to help them create and sustain competitive advantage in a world that is increasingly digital. We operate through our expert SPEED capabilities: Strategy and Consulting, Product, Experience, Engineeri...