Location: London/Remote
Duration: Contract - 6 Months - Day Rate Negotiable

We are currently seeking a highly skilled Site Reliability Engineer (SRE) to join our financial services client. This is a contract role offering the flexibility to work remotely or from our client’s London office. The ideal candidate will have strong experience in monitoring and observability platforms, particularly Datadog, and be well-versed in Kubernetes and automation technologies.

Key Responsibilities:
Datadog Expertise: Implement, maintain, and enhance monitoring solutions using Datadog, ensuring optimal performance and real-time observability across client environments.
Kubernetes & OpenShift (OCP): Leverage extensive experience with OCP and Kubernetes to manage, scale, and optimize containerized applications and infrastructure.
Automation & Testing: Apply your automation skills to streamline operations and testing workflows using industry-specific tools, ensuring efficiency, reliability, and scalability.
Disaster Recovery & Operational Excellence: Develop and maintain Disaster Recovery (DR) strategies and ensure the adoption of Operational Excellence best practices within client infrastructure.
Cloud & Container Certification: Demonstrate expertise through certifications in AWS and Kubernetes while applying this knowledge to client projects.
Client Engagement: Collaborate directly with clients, bringing your consulting experience to deliver technical solutions that meet their unique needs and business objectives.

Required Skills and Experience:
Datadog Experience: Proven track record of implementing and managing Datadog in production environments.
OCP/Kubernetes: Strong experience in managing Kubernetes and OpenShift (OCP) platforms in high-availability environments.
Automation Tools Knowledge: Hands-on experience with automation tools and frameworks, such as Terraform, Ansible, or similar, to optimize infrastructure as code.
Certifications: Certification in AWS (Solutions Architect, SysOps Administrator, or similar) and/or Kubernetes (CKA/CKAD).
Disaster Recovery & Best Practices: Strong knowledge of DR strategies, coupled with expertise in Operational Excellence frameworks and best practices.
Consulting & Client-Facing Experience: Preferred background in a consulting or client-facing role, with the ability to communicate effectively with both technical and business stakeholders.

What You’ll Bring:
A proactive, solutions-driven mindset with a focus on automation and resilience.
The ability to work independently and manage multiple projects in a fast-paced, client-driven environment.
Strong communication skills and the ability to collaborate across teams and with clients.

This is an excellent opportunity for an experienced Site Reliability Engineer to make a significant impact on a dynamic financial services organization, utilizing cutting-edge technology and best practices.