Last updated: 2025-06-06

35 Site Reliability Engineering jobs in London.

Hiring now: Sr Software Engineer @ Palantir, Site Reliability Engineer SRE @ Pioneering, Site Reliability Engineer @ Blockchain, Sr Site Reliability Engineer @ Reddit, Staff Platform Engineer @ Birdie, Site Reliability Engineer @ Axon, Sr Site Reliability Engineer @ Pleo, Staff Software Engineer AI Reliability E @ Anthropic, Staff Platform Engineer @ Blacklane, Sr Infrastructure Engineer External @ Serotonin. Explore more at jobswithgpt.com

🔥 Skills

Kubernetes (22) AWS (15) Terraform (12) CI/CD (10) Python (9) Site Reliability Engineering (7) DevOps (7) Automation (6) observability (6) Site Reliability Engineer (5)

📍 Locations

London (34) Staines (1)

Palantir

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…

Senior Software Engineer

London

  • Skills: Kubernetes, Production Infrastructure, Cloud Hyperscalers, K8s Clusters, Engineering Rigor, Operational Excellence, Automation, Self-Healing Systems, Compliance Regimes, CNCF Components
  • Experience: Senior level experience in software engineering, particularly in infrastructure and Kubernetes.

Pioneering Technology Company

Specialising in cutting-edge Language Models (LLM) and Machine Learning solutions.

Site Reliability Engineer (SRE)

London

  • Skills: Site Reliability Engineer, SRE, Language Models, Machine Learning, infrastructure, reliability, scalability, performance, monitor system health, cross-functional teams

Blockchain

Blockchain is the world's leading software platform for digital assets, offering the largest production blockchain platform globally and aiming to build an ope…

Site Reliability Engineer

London

  • Skills: Site Reliability Engineer, infrastructure, monitoring tools, GCP, AWS, Terraform, GitOps, containerization, programming languages, configuration management
  • Experience: Experience with containerization and service orchestration; strong knowledge of at least one programming language; experience with cloud solutions; experience with modern monitoring tools; experience with infrastructure as code tools; solid background with configuration management tools; experience with GitOps and CI; experience with messaging systems; experience with database management.
  • Type: Full-time
  • Salary: Full-time salary based on experience and meaningful equity

Palantir

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…

Site Reliability Engineer

London

  • Skills: Site Reliability Engineer, production infrastructure, cloud environments, on-prem environments, automate processes, systems design, diagnosing production issues, resolution of issues, partner teams, high-performance services
  • Type: Full-time

Reddit

Reddit is a community of communities, built on shared interests, passion, and trust. It hosts one of the most open and authentic conversations on the internet.

Senior Site Reliability Engineer

London

  • Skills: Site Reliability Engineering, software development, distributed systems, Kubernetes, Cloud systems, Go, Python, observability, DevOps, automation
  • Experience: 5+ years of experience in Software Engineering, Site Reliability Engineering, or a development-focused DevOps role.

Birdie

Birdie is the leading home healthcare technology platform that aims to radically transform the lives of older adults. Its all-in-one solution supports around 4…

Staff Platform Engineer

London

  • Skills: AWS, cloud-native architecture, Kubernetes, DevOps, CI/CD, SRE, security best-practices, micro-services, platform engineering, observability
  • Type: Full-time
  • Salary: £105k - £125k per annum

Axon

At Axon, we’re on a mission to Protect Life. We’re explorers, pursuing society’s most critical safety and justice issues with our ecosystem of devices and clou…

Site Reliability Engineer

London

  • Skills: cloud-native, site reliability, Azure, AWS, Kubernetes, Python, Go, CI/CD, Infrastructure as Code, observability
  • Experience: 10+ years of applicable experience.

Pleo

Pleo is a scale-up in the fintech industry, focused on providing the best possible experience for customers while managing expenses. Their mission is to help e…

Senior Site Reliability Engineer

London

  • Skills: Grafana, AWS, GCP, Terraform, Kubernetes, Flux, GitOps, Istio, GitHub Actions, Go
  • Type: Full-time

Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a who…

Staff Software Engineer, AI Reliability Engineering

London

  • Skills: Service Level Objectives, monitoring systems, high-availability infrastructure, automated failover, incident response, cost optimization, distributed systems, SLO/SLA frameworks, chaos engineering, AI infrastructure
  • Type: Full-time
  • Salary: £255,000 - £390,000 GBP per annum

Blacklane

Blacklane provides professional chauffeur services, prioritizing scalability, resilience, and security in their platform. They emphasize employee growth, commu…

Staff Platform Engineer

London

  • Skills: Site Reliability Engineering, Developer Experience, Technical Leadership, Collaboration, Problem Solving, Mentorship, Scalability, Resilience, Security, Continuous Improvement
  • Experience: Proven expertise in Site Reliability Engineering
  • Type: Full-time
  • Salary: Up to 120,000

Serotonin

A leading institutional investment platform in the digital asset space, providing comprehensive infrastructure and technology for investors to manage their ent…

Senior Infrastructure Engineer (External)

London

  • Skills: Infrastructure, AWS, Linux, Networking, Automation, Security, Solana, Blockchain, DeFi, Digital Assets
  • Type: Full-time

Axon

At Axon, we’re on a mission to Protect Life. We’re explorers, pursuing society’s most critical safety and justice issues with our ecosystem of devices and clou…

SRE Contributor

London

  • Skills: site reliability engineering, cloud-native, Kubernetes, Azure, AWS, Python, Terraform, CI/CD, observability, infrastructure as code
  • Experience: 7+ years of applicable experience.

Toggle

Toggle's software engineers drive the development of cutting-edge technologies that transform how millions of users connect, explore information, and engage wi…

Site Reliability Engineer

London

  • Skills: Site Reliability Engineering, Kubernetes, Distributed Systems, Cloud-native Applications, GitOps, Microservice Architectures, Automation, Performance, Data Structures, Networking
  • Experience: 4 years of experience as a software engineer.
  • Type: Full-time
  • Salary: £50,000 - £70,000 GBP

VML Enterprise Solutions

At VML, we are a beacon of innovation and growth in an ever-evolving world. Our heritage is built upon a century of combined expertise, where creativity meets …

Lead Observability Engineer

London

  • Skills: New Relic, observability, monitoring solutions, cloud platforms, programming languages, stakeholder management, technical documentation, incident response, problem identification, training
  • Experience: Deep expertise in New Relic platform configuration and optimisation and extensive understanding of observability principles.

Cisco ThousandEyes

Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network – even t…

Senior Site Reliability Engineering Manager, Production Engineering

London

  • Skills: Site Reliability Engineering, Kubernetes, cloud platforms, operational excellence, security compliance, incident response, DevOps, microservices architecture, cross-functional collaboration, performance optimization
  • Experience: 3+ years in a management role
  • Type: Full-time

Orgvue

Orgvue is an organisational design and planning platform that empowers businesses to transform their workforce by understanding the work people do and the skil…

Principal Site Reliability Engineer

London

  • Skills: Site Reliability Engineering, Kubernetes, AWS, Infrastructure as Code, Observability, Automation, CI/CD, Incident Management, Disaster Recovery, DevOps

Forter

Forter was founded on the insight that it's not about what is being purchased, nor where— but who is behind the interaction. The Forter Decision Engine finds p…

Senior Observability Engineer

London

  • Skills: Observability, Monitoring, Alerting, Scalable systems, High availability, Continuous learning, Collaboration, ELK stack, Cloud Infrastructure, Microservices
  • Experience: 4+ years of experience in the relevant field
  • Type: Full-time
  • Salary: Competitive salary and bonus plan

Axon

Axon is dedicated to building technology for public safety and justice, including body cameras, sensors, and evidence platforms that power the future of real-t…

Site Reliability Engineer (SRE)

London

  • Skills: site reliability engineering, cloud-native services, Kubernetes, CI/CD, Python, AWS, Azure, observability tools, Infrastructure as Code, problem-solving
  • Experience: 5+ years of applicable experience

Isomorphic Labs

Isomorphic Labs was founded in 2021 to advance human health through AI-driven drug discovery, building on the success of Google DeepMind’s AlphaFold.

Software Engineer (Reliability Engineering)

London

  • Skills: reliability, cloud infrastructure, docker, kubernetes, terraform, CI/CD, DevOps, monitoring, Python, ML operations

Cisco ThousandEyes

Cisco ThousandEyes is a leading Digital Experience Assurance platform that empowers organizations to deliver seamless digital experiences across every network.

Senior Site Reliability Engineer, Production Engineering

London

  • Skills: SaaS, cloud-native, AWS, Kubernetes, OpenTelemetry, automation, disaster recovery, microservice, multi-region, scale testing
  • Experience: 5+ years

Kiln

Kiln is the leading enterprise-grade rewards platform that enables institutional customers to stake assets and integrate staking & DeFi functionality into thei…

Senior SRE - Protocol

London

  • Skills: Blockchain Protocols, Site Reliability Engineering, Kubernetes, Terraform, Helm, Prometheus, Hashicorp Vault, Infrastructure-as-Code, Crypto, Web3
  • Experience: 5+ years in Software or Infrastructure
  • Type: Full-time

Engine by Starling

Engine is Starling's SaaS business, powering Starling Bank and available to banks and financial institutions worldwide.

Platform Engineer

London

  • Skills: Cloud Infrastructure, GCP, Kubernetes, Terraform, Golang, Site Reliability Engineering, Automation, Security, Monitoring, CI/CD

Genomics England

Genomics England is involved in genomics and bioinformatics services, focusing on healthcare and bioinformatics.

Principal Site Reliability Engineer

London

  • Skills: SRE principles, platform engineering, AWS, CI/CD, infrastructure as code, monitoring and alerting, bioinformatics, cloud computing, software automation, resilience

Writer

Site Reliability Engineer

London

  • Skills: cloud infrastructure, automation, Terraform, Python, monitoring, reliability engineering, system architecture, cloud platforms, containerization, Kubernetes
  • Experience: 7+ years of hands-on experience in Site Reliability Engineering
  • Type: Full-time

Kiln

Kiln is the leading enterprise-grade rewards platform that enables institutional customers to stake assets and integrate staking & DeFi functionality into thei…

Senior Site Reliability Engineer - Platform

London

  • Skills: cloud infrastructure, multi-cloud, bare metal, Kubernetes, Terraform/Terragrunt, Helm, Prometheus, Vault, SOC 2
  • Experience: 5+ years in infrastructure/platform engineering in high-standard environments (FinTech a plus)
  • Type: Full-time

Apollo Research

Apollo Research focuses on behavioral model evaluations, black-box approaches, and applied interpretability in AI, with an emphasis on frontier AI evaluations …

Evals Software Engineer (Infrastructure focus)

London

  • Skills: infrastructure, AI evaluations, Kubernetes, AWS, security, Terraform, Python, scalability, cloud, IaC
  • Type: Full-time

Splunk

Build a safer and more resilient digital world by helping security, IT, and DevOps teams keep their organizations secure and reliable.

Site Reliability Engineer

London

  • Skills: cloud, operational, Linux, networking, distributed systems, automation, monitoring, DevOps, scaling, large-scale
  • Experience: operational experience at scale, hands-on with operating systems and networking, possibly cloud technologies

Gorgias

Gorgias is the conversational AI platform for ecommerce that drives sales and resolves support inquiries. Trusted by over 15,000 ecommerce brands, Gorgias supp…

Senior Site Reliability Engineer (SRE)

London

  • Skills: PostgreSQL, GKE, Kubernetes, Cloud Providers, Terraform, CI/CD, Python, Linux, Disaster Recovery, Observability
  • Experience: 5+ years experience as a Site Reliability Engineer or similar role

Behavox

Behavox is shaping the future for how businesses harness their most important raw material - data. Our mission is bold: Organize enterprise data into actionabl…

Site Reliability Engineer 3 (TS&CG)

London

  • Skills: Site Reliability Engineer, DevOps, high-load, data processing, Public Clouds, GCP, AWS, Automation, Python, Golang
  • Experience: 5+ years of experience as an SRE/DevOps engineer responsible for deployment and maintenance of production systems

ServiceNow

It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today …

Senior Reliability Engineer

Staines

  • Skills: AI, relational databases, Java, scripting languages, Unix/Linux, web applications, performance degradation, team environment, customer service, Technical Support
  • Type: Full-time

Teleport

Teleport is the market leader in Identity-Native Infrastructure Access Management. Every company must protect its critical computing infrastructure from hacker…

Senior Customer Reliability Engineer

London

  • Skills: Identity-Native Infrastructure Access, Customer Reliability Engineer, technical support, Kubernetes, AWS, GCP, Azure, Linux server administration, Ansible, zero trust
  • Type: Full-time
  • Salary: £129,000 - £151,000 per annum

Palantir Technologies

Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…

Ceph Infrastructure Engineer

London

  • Skills: Ceph, Kubernetes, automation, infrastructure design, system design, SRE, data centers, networking principles, Terraform, Go
  • Type: Full-time

Duffel

We are rebuilding the infrastructure that underpins the travel industry to make the future of travel effortless.

Site Reliability Engineer

London

  • Skills: Elixir, Phoenix, Kubernetes, Google Cloud Platform, Terraform, OpenTelemetry, Grafana, Prometheus, CI/CD, Honeycomb
  • Type: Full-time