About Beeks
Beeks Group is a leading managed cloud provider working exclusively within the agile and fast-moving financial services and capital markets sector. Our Infrastructure-as-a-Service (IaaS) model is optimised for low-latency private cloud compute, connectivity and analytics, providing the flexibility to deploy and connect to exchanges, trading venues and public cloud for a true hybrid cloud experience.
Founded in 2011, Beeks Group is listed on the London Stock Exchange's AIM market and has enjoyed continued growth each year. Beeks Group now employs over 100 team members across the globe and has an international network of over thirty data centres.
We have a fantastic opportunity for a Site Reliability Engineer to join us at our unique Head Office in Renfrew, which includes a state-of-the-art gym with weekly circuit training, a personal trainer and yoga classes, as well as the Beeks Bar and a weekly masseuse to help you unwind!
About the role
As part of the newly formed Site Reliability Engineering team at Beeks, you will work closely with other teams throughout the business to foster the adoption of SRE practices and methodologies. In particular, you will drive improvements in reliability through the design and implementation of automation, and lead a shift from classical monitoring towards observability. You will also assist the existing NOC-based on-call team by acting as a senior point of escalation for incident management.
Key Responsibilities
Champion the adoption of SRE culture and practices in other teams throughout the business
Identify opportunities for the implementation of automation and improved tooling
Enhance service reliability by maintaining and improving monitoring and alerting systems
Take part in the product design lifecycle to advocate for best practices around reliability
Be involved in the incident management process, working closely with the existing NOC-based on-call team when incidents occur
Proactively identify areas where the application of SRE methodologies could lead to improvements in reliability and efficiency
Required Qualifications and Skills
Proven track record of success as an SRE or in a related role (DevOps, Infrastructure Engineer, etc.)
Highly experienced with modern monitoring and observability solutions (Prometheus/Grafana, ELK stack, or third-party hosted solutions such as Datadog or New Relic)
Highly proficient with automation and orchestration platforms (Ansible, Chef, Puppet, etc.)
Fluent in a programming or scripting language (preferably Python)
Experienced in the use of CI/CD tools (e.g. Bitbucket, Jenkins)
Desired Skills
Bachelor's or Master's degree in Computer Science, Engineering, or a related field
Formal training or certifications in SRE concepts or related disciplines
Experience with installation and support of Kubernetes container hosting platforms
Familiarity with maintaining websites and applications built with Django
Experienced in the use of Atlassian Jira for project and change management
What We Can Offer You
Compensation & Benefits
A competitive salary
A unique and highly rewarding Share Options scheme
Highly competitive pension scheme
EV salary exchange scheme
Life assurance cover
Investment in Training
Private Health Insurance
Lifestyle
Hybrid working (3 days in the office, 2 days at home)
Flexible work hours
33 days annual leave
This full-time position is available only to candidates who have full Right to Work in the UK.
We are an equal opportunity employer.