High Contract

JB061398 - Site Reliability Engineer

  • Start Date:
  • Interview Types:
  • Skills: Production support
  • Visa Types: Green Card, US Citiz..

High‑Level Requirements

  • Strong expertise in microservices architecture, with practical experience designing, deploying, and supporting distributed systems in production environments.
  • Deep hands‑on knowledge of Kubernetes, including deployment management, scaling, upgrades, troubleshooting, and cluster operations, with a strong focus on reliability, resilience, and performance.
  • Working proficiency with API Gateway platforms such as Azure API Management (APIM), Kong, and IBM API Connect (APIC) for traffic management, rate limiting, routing, and API observability.
  • Solid experience with observability and monitoring tools, including Splunk, AppDynamics, Instana, or similar platforms, covering log analytics, metrics, distributed tracing, dashboards, alerting, and SLO‑based monitoring.
  • Proven ability to diagnose and resolve complex production issues, perform root cause analysis (RCA), and implement preventative and corrective measures.
  • Familiarity with Site Reliability Engineering (SRE) best practices, including error budgets, SLIs/SLOs, incident response, post‑mortems, automation, and continuous improvement initiatives.
  • 5–7 years of relevant experience, primarily focused on operations support; administrative support experience is also considered highly desirable.