Job Description
QA Specialist with strong Databricks experience
Job summary:
SMBC is seeking highly skilled AI/ML QA Specialists with strong Databricks experience to ensure the quality, reliability, and regulatory readiness of AI/ML platforms. This role focuses on end-to-end testing of CCAR and ESG projects, covering data pipelines, feature engineering, model training, validation, deployment, and monitoring. The ideal candidate blends data engineering QA, ML lifecycle validation, and platform testing in cloud-native environments. The plan is to automate QA testing and make it part of the regression suite that will run for every future deployment.
Key Responsibilities
Model Development Platform QA
- Validate data ingestion, feature engineering, and training pipelines built on Databricks (Spark, Delta, MLflow).
- Design and execute QA strategies for:
  - Dataset quality, schema validation, and lineage
  - Feature consistency and drift checks
  - Reproducibility of model training and experiments
- Test MLflow experiments, model versioning, and artifacts for completeness and traceability.
- Ensure compliance with model risk management (MRM), audit, and documentation standards.
- Conduct regression testing to ensure existing functionality remains unaffected after updates or enhancements.
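To illustrate the kind of check this work involves, here is a minimal sketch of a schema-validation test for a pipeline output. Column names and types are hypothetical; a real Databricks suite would read the schema from a Delta table via Spark, while a plain dict stands in for it here to keep the sketch self-contained.

```python
# Hypothetical expected schema for a pipeline output table (illustrative names).
EXPECTED_SCHEMA = {"account_id": "string", "balance": "double", "as_of_date": "date"}

def validate_schema(actual_schema: dict) -> list[str]:
    """Compare an observed column->type mapping against the expected schema.

    Returns a list of violations; an empty list means the check passes.
    """
    errors = []
    # Missing columns and type mismatches.
    for col, dtype in EXPECTED_SCHEMA.items():
        if col not in actual_schema:
            errors.append(f"missing column: {col}")
        elif actual_schema[col] != dtype:
            errors.append(f"type mismatch on {col}: expected {dtype}, got {actual_schema[col]}")
    # Columns that appeared without being declared.
    for col in actual_schema:
        if col not in EXPECTED_SCHEMA:
            errors.append(f"unexpected column: {col}")
    return errors
```

Checks like this are typically collected into a test suite (e.g. pytest) and run against every pipeline output before promotion.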
Model Execution / Production Platform QA
- Test model deployment pipelines, including batch and real-time model execution.
- Validate:
  - Model scoring accuracy and performance
  - Input/output data contracts and SLAs
  - Error handling, fallback logic, and retries
- Perform regression, performance, and volume testing for production workloads.
- Validate monitoring metrics (model health, drift, latency, failures).
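Testing error handling, fallback logic, and retries usually means unit-testing the scoring wrapper with a deliberately flaky test double. A minimal sketch, with hypothetical names and a plain callable standing in for a real model endpoint:

```python
def score_with_retry(score_fn, fallback_value, max_attempts=3):
    """Call score_fn up to max_attempts times; return fallback_value if all attempts fail."""
    for _ in range(max_attempts):
        try:
            return score_fn()
        except Exception:
            continue  # transient failure: retry
    return fallback_value

class FlakyModel:
    """Test double that raises a set number of times before succeeding."""
    def __init__(self, failures):
        self.remaining_failures = failures

    def __call__(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise RuntimeError("transient scoring error")
        return 0.87  # hypothetical score
```

A QA suite would assert both paths: that transient failures are retried to success, and that the fallback value is returned once retries are exhausted.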
Automation & Tooling
- Build and maintain automated test frameworks for data and ML pipelines (Databricks notebooks, PySpark, Python).
- Implement data-driven QA checks (DQ rules, nulls, thresholds, statistical validation).
- Integrate QA into CI/CD pipelines for ML workflows.
- Build an automated regression suite to run before every deployment.
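A data-driven DQ check of the kind listed above can be sketched as rules evaluated against a dataset. The rule format and column names here are assumptions for illustration; on Databricks these would run as PySpark aggregations over Delta tables, while plain Python keeps the sketch self-contained.

```python
def null_rate(rows, col):
    """Fraction of rows where the given column is null/missing."""
    return sum(1 for r in rows if r.get(col) is None) / len(rows)

def run_dq_rules(rows, rules):
    """rules: list of (column, max_null_rate) thresholds. Returns failing columns."""
    return [col for col, max_rate in rules if null_rate(rows, col) > max_rate]

# Hypothetical scoring-input sample: pd/lgd are illustrative risk-model features.
rows = [
    {"pd": 0.02, "lgd": 0.4},
    {"pd": None, "lgd": 0.5},
    {"pd": 0.03, "lgd": None},
]
failures = run_dq_rules(rows, [("pd", 0.5), ("lgd", 0.2)])  # lgd exceeds its threshold
```

Wired into CI/CD, a non-empty failure list would block the deployment, making the DQ rules part of the regression gate.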
Governance & Collaboration
- Partner with Data Scientists, ML Engineers, Platform Engineers, and Model Risk teams.
- Support UAT, audit reviews, and regulatory validation initiatives.
- Document QA results clearly for technical and nontechnical stakeholders.
Required Skills & Qualifications
- 5-8+ years of QA or data validation experience, with a strong focus on AI/ML or data platforms
- Hands-on experience with Databricks:
  - Spark / PySpark
  - Delta Lake
  - MLflow
- Strong Python experience for testing and automation
- Solid understanding of the ML lifecycle (data → features → training → validation → deployment)
- Experience testing:
  - Data pipelines and large-scale datasets
  - Batch and real-time model execution
- Knowledge of cloud platforms (Azure preferred)
- Familiarity with CI/CD, Git, and automated testing frameworks
Preferred / Nice-to-Have
- Experience with model risk management (MRM) or regulated environments (banking, risk, compliance).
- Exposure to:
  - Feature stores
  - Model monitoring and drift detection
  - Power BI or downstream analytical reporting validation
- Experience with performance testing at scale in distributed environments.
- Prior work on platform modernization or cloud migration initiatives.
Education
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Software & QA
Role Category: Quality Assurance and Testing
Role: Software Developer in Test (SDET)
Employment Type: Full time
Contact Details:
Company: Infobeans
Location(s): Indore
Keyskills:
databricks
ETL Testing
Python