Site Reliability Engineer

Location:

London/Remote

Duration:

Contract - 6 Months - Day Rate Negotiable

We are currently seeking a highly skilled Site Reliability Engineer (SRE) to join our financial services client. This is a contract role offering the flexibility to work remotely or from our client’s London office. The ideal candidate will have strong experience in monitoring and observability platforms, particularly Datadog, and be well-versed in Kubernetes and automation technologies.

Key Responsibilities:

Datadog Expertise:

Implement, maintain, and enhance monitoring solutions using Datadog, ensuring optimal performance and real-time observability across client environments.

Kubernetes & OpenShift (OCP):

Leverage extensive experience with OCP and Kubernetes to manage, scale, and optimize containerized applications and infrastructure.

Automation & Testing:

Apply your automation skills to streamline operations and testing workflows using industry-specific tools, ensuring efficiency, reliability, and scalability.

Disaster Recovery & Operational Excellence:

Develop and maintain Disaster Recovery (DR) strategies and ensure the adoption of Operational Excellence best practices within client infrastructure.

Cloud & Container Certification:

Demonstrate expertise through certifications in AWS and Kubernetes while applying this knowledge to client projects.

Client Engagement:

Collaborate directly with clients, bringing your consulting experience to deliver technical solutions that meet their unique needs and business objectives.

Required Skills and Experience:

Datadog Experience:

Proven track record of implementing and managing Datadog in production environments.

OCP/Kubernetes:

Strong experience in managing Kubernetes and OpenShift (OCP) platforms in high-availability environments.

Automation Tools Knowledge:

Hands-on experience with automation tools and frameworks, such as Terraform, Ansible, or similar, to optimize infrastructure as code.

Certifications:

Certification in AWS (Solutions Architect, SysOps Administrator, or similar) and/or Kubernetes (CKA/CKAD).

Disaster Recovery & Best Practices:

Strong knowledge of DR strategies, coupled with expertise in Operational Excellence frameworks and best practices.

Consulting & Client-Facing Experience:

Preferred background in a consulting or client-facing role, with the ability to communicate effectively with both technical and business stakeholders.

What You’ll Bring:

A proactive, solutions-driven mindset with a focus on automation and resilience.

The ability to work independently and manage multiple projects in a fast-paced, client-driven environment.

Strong communication skills and the ability to collaborate across teams and with clients.

This is an excellent opportunity for an experienced Site Reliability Engineer to make a significant impact on a dynamic financial services organization, utilizing cutting-edge technology and best practices.
