Site Reliability Engineer

Full time
Location: London
Job offered by: Luupli
Luupli is a social media app that has equity, diversity, and equality at its heart. We believe that social media can be a force for good, and we are committed to creating a platform that maximizes the value that creators and businesses can gain from it, while making a positive impact on society and the planet. Luupli began internal testing in June 2024 and is preparing for commercial beta testing from December 2024, with the hope of launching fully in the summer of 2025.

Job Title: Site Reliability Platform Engineer

About Luupli:

Our team is made up of passionate and dedicated individuals who are committed to making Luupli a success.

Role Description:

We are seeking a talented and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure and services, primarily hosted on AWS. If you have a passion for problem-solving, a deep understanding of AWS services, hands-on experience with Terraform, and proficiency in scripting with Python or Bash, we invite you to apply for this exciting opportunity.

Role and Responsibilities:

Infrastructure Design and Automation:

Collaborate with software engineering and operations teams to design, build, and maintain cloud-based infrastructure using AWS and Terraform. Implement and enhance infrastructure-as-code (IaC) practices using Terraform to ensure reproducibility and scalability of infrastructure components.

Monitoring and Incident Management:

Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues. Participate in incident response and root cause analysis efforts to drive continuous improvement and prevent future incidents.

Reliability and Performance Optimization:

Optimize system performance, reliability, and cost efficiency through continuous monitoring, performance tuning, and capacity planning. Identify opportunities to automate manual processes and improve system resilience.

Scripting and Automation:

Utilize Python or Bash scripting to create and maintain automation tools for various operational tasks and deployments. Implement and improve continuous integration and continuous deployment (CI/CD) pipelines.

Security and Compliance:

Collaborate with security teams to implement best practices for securing cloud infrastructure and services. Ensure compliance with relevant industry standards and regulations.

Deployment and Release Management:

Support CI/CD pipelines for application deployments and updates. Contribute to the design and implementation of deployment strategies that promote zero-downtime releases.

Documentation and Knowledge Sharing:

Maintain clear and up-to-date documentation for infrastructure configurations, processes, and incident resolution procedures. Participate in knowledge sharing with team members to enhance overall expertise and skill sets.

Requirements:

Education and Experience:

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience). Proven experience as a Site Reliability Engineer or similar role.

Technical Skills:

Extensive experience with Amazon Web Services (AWS) and its core services (EC2, S3, RDS, IAM, etc.). Strong proficiency in infrastructure-as-code (IaC) tools, with a focus on Terraform. Proficient in scripting with Python or Bash for automation and operational tasks. Solid understanding of networking principles and protocols. Knowledge of CI/CD pipelines and related tools.

Problem-Solving and Analytical Abilities:

Ability to diagnose and resolve complex technical issues in a fast-paced environment. Analytical mindset to proactively identify potential system weaknesses and performance bottlenecks.

Collaboration and Communication:

Strong teamwork and collaboration skills to work effectively with cross-functional teams. Excellent verbal and written communication skills.

Compensation:

This is an equity-only position, offering a unique opportunity to gain a stake in a rapidly growing company and contribute directly to its success.
