Last updated: 2025-07-21
20 Site Reliability Engineering jobs in London.
Palantir
Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…
London
- Skills: Kubernetes, Production Infrastructure, Cloud Hyperscalers, K8s Clusters, Engineering Rigor, Operational Excellence, Automation, Self-Healing Systems, Compliance Regimes, CNCF Components
- Level: mid
- Type: full_time
Pioneering Technology Company
Specialising in cutting-edge Large Language Models (LLMs) and Machine Learning solutions.
London
- Skills: Site Reliability Engineer, SRE, Language Models, Machine Learning, infrastructure, reliability, scalability, performance, monitor system health, cross-functional teams
- Level: mid
- Type: full_time
Blockchain
Blockchain is the world's leading software platform for digital assets, offering the largest production blockchain platform globally and aiming to build an ope…
London
- Skills: Site Reliability Engineer, infrastructure, monitoring tools, GCP, AWS, Terraform, GitOps, containerization, programming languages, configuration management
- Level: mid
- Type: full_time
Palantir
Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…
London
- Skills: Site Reliability Engineer, production infrastructure, cloud environments, on-prem environments, automate processes, systems design, diagnosing production issues, resolution of issues, partner teams, high-performance services
- Level: mid
- Type: full_time
Reddit
Reddit is a community of communities, built on shared interests, passion, and trust. It hosts one of the most open and authentic conversations on the internet.
London
- Skills: Site Reliability Engineering, software development, distributed systems, Kubernetes, Cloud systems, Go, Python, observability, DevOps, automation
- Level: mid
- Type: full_time
Axon
At Axon, we’re on a mission to Protect Life. We’re explorers, pursuing society’s most critical safety and justice issues with our ecosystem of devices and clou…
London
- Skills: cloud-native, site reliability, Azure, AWS, Kubernetes, Python, Go, CI/CD, Infrastructure as Code, observability
- Level: senior
- Type: full_time
Pleo
Pleo is a scale-up in the fintech industry, focused on providing the best possible experience for customers while managing expenses. Their mission is to help e…
London
- Skills: Grafana, AWS, GCP, Terraform, Kubernetes, Flux, GitOps, Istio, GitHub Actions, Go
- Level: mid
- Type: full_time
Axon
At Axon, we’re on a mission to Protect Life. We’re explorers, pursuing society’s most critical safety and justice issues with our ecosystem of devices and clou…
London
- Skills: site reliability engineering, cloud-native, Kubernetes, Azure, AWS, Python, Terraform, CI/CD, observability, infrastructure as code
- Level: senior
- Type: full_time
Orgvue
Orgvue is an organisational design and planning platform that empowers businesses to transform their workforce by understanding the work people do and the skil…
London
- Skills: Site Reliability Engineering, Kubernetes, AWS, Infrastructure as Code, Observability, Automation, CI/CD, Incident Management, Disaster Recovery, DevOps
- Level: mid
- Type: full_time
Forter
Forter was founded on the insight that it's not about what is being purchased, nor where, but who is behind the interaction. The Forter Decision Engine finds p…
London
- Skills: Observability, Monitoring, Alerting, Scalable systems, High availability, Continuous learning, Collaboration, ELK stack, Cloud Infrastructure, Microservices
- Level: mid
- Type: full_time
Axon
Axon is dedicated to building technology for public safety and justice, including body cameras, sensors, and evidence platforms that power the future of real-t…
London
- Skills: site reliability engineering, cloud-native services, Kubernetes, CI/CD, Python, AWS, Azure, observability tools, Infrastructure as Code, problem-solving
- Level: mid
- Type: full_time
Kiln
Kiln is the leading enterprise-grade rewards platform that enables institutional customers to stake assets and integrate staking & DeFi functionality into thei…
London
- Skills: Blockchain Protocols, Site Reliability Engineering, Kubernetes, Terraform, Helm, Prometheus, HashiCorp Vault, Infrastructure-as-Code, Crypto, Web3
- Level: mid
- Type: full_time
Engine by Starling
Engine is Starling's SaaS business, powering Starling Bank and available to banks and financial institutions worldwide.
London
- Skills: Cloud Infrastructure, GCP, Kubernetes, Terraform, Golang, Site Reliability Engineering, Automation, Security, Monitoring, CI/CD
- Level: mid
- Type: full_time
Writer
London
- Skills: cloud infrastructure, automation, Terraform, Python, monitoring, reliability engineering, system architecture, cloud platforms, containerization, Kubernetes
- Level: senior
- Type: full_time
Kiln
Kiln is the leading enterprise-grade rewards platform that enables institutional customers to stake assets and integrate staking & DeFi functionality into thei…
London
- Skills: cloud infrastructure, multi-cloud, bare metal, Kubernetes, Terraform/Terragrunt, Helm, Prometheus, Vault, SOC 2
- Level: mid
- Type: full_time
Splunk
Build a safer and more resilient digital world by helping security, IT, and DevOps teams keep their organizations secure and reliable.
London
- Skills: cloud, operational, Linux, networking, distributed systems, automation, monitoring, DevOps, scaling, large-scale
- Level: mid
- Type: full_time
Palantir Technologies
Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…
London
- Skills: Ceph, Kubernetes, automation, infrastructure design, system design, SRE, data centers, networking principles, Terraform, Go
- Level: mid
- Type: full_time
Duffel
We are rebuilding the infrastructure that underpins the travel industry to make the future of travel effortless.
London
- Skills: Elixir, Phoenix, Kubernetes, Google Cloud Platform, Terraform, OpenTelemetry, Grafana, Prometheus, CI/CD, Honeycomb
- Level: mid
- Type: full_time
Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Ap…
London
- Skills: Production Engineering, Reliability, Scalability, Cloud Infrastructure, Load Balancers, Relational Databases, APIs, Distributed Systems, Network Fundamentals, Debugging
- Level: mid
- Type: full_time
Birdie
Birdie is the leading home healthcare technology platform that aims to radically transform the lives of older adults. It supports care providers with tools to …
London
- Skills: Kubernetes, AWS, DevOps, CI/CD, SRE, DevSecOps, cloud-native, observability, GitOps, micro-services
- Level: mid
- Type: full_time