
High Contract

JB060313 - Senior DevOps and Site Reliability Engineer

Skills	AWS, cloud infrastru..
Visa Types	Green Card, US Citiz..

Job Description

Job Title: Senior DevOps and Site Reliability Engineer

Location: Washington, DC (Onsite – No Remote)

Randstad is seeking a Senior DevOps and Site Reliability Engineer (SRE) to support a key client in the DC Metro area. This is a high-impact, senior-level role responsible for enhancing the reliability, performance, security, and scalability of mission-critical production environments hosted on AWS.

The ideal candidate is a hands-on technical leader with deep expertise in DevOps, Infrastructure-as-Code (IaC), observability, and incident response. You will implement automation at scale, lead reliability engineering efforts, and help define SRE practices across cross-functional teams.

Key Responsibilities

Deployment & Automation

Build and maintain CI/CD pipelines (GitHub Actions, AWS CodePipeline, Jenkins).
Automate infrastructure provisioning using IaC tools (Terraform, CloudFormation, AWS CDK).
Develop automation scripts and self-service tools to streamline operations.
Use programming languages (Python, Go, Java) for automation and debugging.

Site Reliability Engineering

Lead incident response as an on-call engineer, including disaster recovery activities.
Conduct post-incident reviews and implement systemic improvements.
Define and monitor SLIs and SLOs, and manage error budgets.
Use observability tools (Dynatrace preferred; ELK Stack and AppDynamics also relevant) for monitoring and root cause analysis.
Implement distributed tracing and anomaly detection dashboards.

Performance, Capacity & Cost Optimization

Forecast system capacity needs and plan for scalability.
Lead cost optimization across cloud infrastructure.
Implement performance/resiliency testing frameworks.
Manage auto-scaling configurations for resource optimization.

Security & Governance

Investigate and respond to security incidents.
Automate compliance checks and security enforcement.
Drive adoption of zero-trust security models in cloud environments.
Apply ITIL principles using ITSM tools (ServiceNow preferred).

Required Qualifications

Education & Experience

Bachelor’s degree in Computer Science, Engineering, or a related field.
5–8 years in DevOps, SRE, or Platform Engineering roles.
3+ years supporting high-availability production systems.
Proven experience leading complex technical initiatives.

Technical Expertise

Strong expertise in AWS cloud infrastructure (certification a plus).
Mastery of IaC tools: Terraform, CloudFormation, AWS CDK.
Proficient in Python, Go, or Java.
Deep understanding of observability/APM tools (Dynatrace strongly preferred).
Familiarity with database technologies (relational, NoSQL, cloud-native).

Professional & Leadership Skills

Effective team mentor and cross-functional collaborator.
Strong documentation skills (e.g., RCAs, technical articles).
Willingness to support on-call duties and non-standard hours during critical incidents.
