Bachelor's degree in computer science, information technology, or a related field.
At least 5 years of SRE experience handling observability, with 2+ years in the development and operations of applications/services with a 99.9% production uptime requirement.
Strong hands-on coding experience in one or more programming languages such as Python, Golang, Java, Bash or other shell scripting languages.
Proficient in creating and configuring dashboards and logging, and in using observability tools such as Prometheus, New Relic, and Datadog. Knowledge of search technologies such as Elasticsearch, OpenSearch, Typesense, Vespa, and Manticore preferred.
Proven experience in managing AWS infrastructure in a large-scale environment focusing on performance optimization.
AWS DevOps Engineer certification is a strong plus. Exposure to alerting systems such as PagerDuty.
Experience in measuring and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) across standalone and distributed systems.
Well-versed in troubleshooting, debugging, and diagnosing operational issues and production incidents, and driving them to closure.
Ability to work with a creative, fast-growing engineering team.
Understanding of Agile/Lean SDLC and DevOps processes.
Experience with containerization and orchestration technologies, such as Docker and Kubernetes, is advantageous.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Production, Manufacturing & Engineering
Role Category: Engineering
Role: Application Engineer
Employment Type: Full time