Site Reliability Engineer (SRE) (Canada Remote)

Luxoft

Canada Remote

Responsibilities

• Work closely with software product development teams (ITSO, Product Owner, SME) to implement monitoring & observability instrumentation within their platforms.
• Drive adoption of best practices in monitoring, alerting, automation, and site reliability.
• Lead or contribute to engineering efforts from design to implementation, focusing on instrumentation of logs, metrics, and traces.
• Drive use of automation in software instrumentation as well as in response to service degradation events.
• Identify and execute on opportunities to implement instrumentation in pre-production environments.
• Proactively pursue continuous improvement and expansion in observability coverage, service reliability best practices, incident management, and problem management.

Skills

Must have

• Advanced Splunk experience and technical proficiency required.
• 5+ years of IT-related experience, preferably in a DevOps, system administration, and/or developer role.
• 3+ years of cumulative experience with the following technologies: Splunk/ITSI, AWS CloudWatch, APM (AppDynamics), SolarWinds, Grafana, Prometheus, or similar.
• 2+ years of experience in service-oriented architecture (SOA), microservices, and/or API network design paradigms.
• Working knowledge of software development using modern programming languages such as C#/VB (.NET Core), Python, Go, etc.
• Working knowledge of network protocols/technology, databases, and application servers and their roles in service delivery.
• Experience using cloud-native technologies (Kubernetes, OpenTelemetry, GitHub, etc.) in a production environment.


Job information can change without notice
