Last updated: 2025-07-21
19 Site Reliability Engineering jobs in Seattle.
Hive
Hive is the leading provider of cloud-based AI solutions to understand, search, and generate content, and is trusted by hundreds of the world's largest and mos…
Seattle
- Skills: DevOps, Systems Engineer, machine learning, cloud-based solutions, automation, high performance computing, enterprise SaaS, reliability, optimizing performance, hybrid infrastructure
- Level: mid
- Type: full_time
Palantir
Palantir builds the world’s leading software for data-driven decisions and operations. By bringing the right data to the people who need it, our platforms empo…
Seattle
- Skills: Kubernetes, production infrastructure, cloud hyperscalers, K8s clusters, tooling, CNCF components, scale, stability, security, automation
- Level: mid
- Type: full_time
Gusto
Gusto is a modern, online people platform that helps small businesses take care of their teams. On top of full-service payroll, Gusto offers health insurance, …
Seattle
- Skills: storage infrastructure, MySQL, Postgres, data streaming, Kafka, cloud platforms, AWS, Terraform, resiliency, automation
- Level: mid
- Type: full_time
SpaceX
SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today Spa…
Redmond
- Skills: automation, Kubernetes, Linux, Bash, Python, site reliability engineering, DevOps, infrastructure, scalable products, performance improvement
- Level: mid
- Type: full_time
Kirkland
- Skills: Site Reliability Engineering, software systems engineering, distributed systems, capacity planning, system health, incident response, software platforms, automation, algorithm complexity, large-scale systems
- Level: senior
- Type: full_time
Salesforce
Salesforce is a cloud-based software company that provides customer relationship management (CRM) services and applications.
Seattle
- Skills: Site Reliability Engineering, resiliency engineering, observability, automation, cloud-native environments, infrastructure as code, CI/CD, Kubernetes, Terraform, compliance
- Level: senior
- Type: full_time
Google
Google is a global technology company that specializes in Internet-related services and products, including search engines, online advertising technologies, cl…
Kirkland
- Skills: Site Reliability Engineering, Software Development, Distributed Systems, Automation, Scaling Systems, Incident Response, System Design Consulting, Monitoring, Fault-tolerant Systems, Complexity Analysis
- Level: senior
- Type: full_time
Salesforce
Salesforce is a customer relationship management software company that helps businesses connect with customers, streamline operations, and drive growth, focusi…
Bellevue
- Skills: observability, distributed systems, telemetry, metrics, infrastructure, automation, Java, Python, CI/CD, Docker
- Level: mid
- Type: full_time
Google
Google is a global technology company that specializes in internet-related services and products, including search engine technology, online advertising, cloud…
Kirkland
- Skills: Site Reliability Development, software development, large-scale systems, data structures, automation, capacity planning, incident response, system health, technical leadership, cloud computing
- Level: mid
- Type: full_time
Salesforce
Salesforce is a global leader in CRM software solutions, helping businesses improve their operations through innovative technology.
Seattle
- Skills: Platform Orchestration, AWS, Infrastructure as Code, Terraform, Continuous Integration, Continuous Delivery, Incident Workflow Automation, Service Reliability, Scalable Systems, Cloud Resource Optimization
- Level: mid
- Type: full_time
Veeam
Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides dat…
Des Moines
- Skills: SaaS, Azure, DevOps, CI/CD, container orchestration, monitoring, cloud infrastructure, distributed systems, data resilience, problem-solving
- Level: mid
- Type: full_time
Seattle
- Skills: Site Reliability Engineering, software development, data structures, algorithms, fault-tolerant systems, infrastructure, automation, coding, complex system design, code review
- Level: mid
- Type: full_time
Rokt
Rokt is the global leader in ecommerce, unlocking real-time relevance in the moment that matters most. Their AI Brain and ecommerce Network powers billions of …
Seattle
- Skills: Kubernetes, AWS EKS, cloud-native infrastructure, container orchestration, Terraform, Python, Bash, network topology, Spark, observability
- Level: mid
- Type: full_time
Apple
Apple is a technology company that designs, manufactures, and sells consumer electronics, computer software, and online services.
Seattle
- Skills: Site Reliability Engineering, CloudKit, Distributed Systems, Monitoring, Alerting, Automation, Linux/Unix, Networking, Systems Management, Fault Analysis
- Level: mid
- Type: full_time
Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a who…
Seattle
- Skills: AI reliability engineering, service level objectives, monitoring systems, high-availability infrastructure, incident response, cost optimization, distributed systems, model serving, SLO/SLA frameworks, chaos engineering
- Level: mid
- Type: full_time
Cognitiv
At Cognitiv, we are industry trailblazers redefining media buying with our Deep Learning Advertising Platform.
Bellevue
- Skills: Site Reliability Engineering, AWS, Hybrid Cloud, Infrastructure as Code, Python, Bash, Data-Center, Service Management, Monitoring, Disaster Recovery
- Level: mid
- Type: full_time
Flexe
Flexe is a team of technology entrepreneurs and logistics experts who are transforming the $1.5T logistics industry. We deliver technology-powered flexible war…
Seattle
- Skills: Site Reliability Engineering, infrastructure as code, Terraform, cloud platforms, Kubernetes, CI/CD, monitoring, observability, automation, cross-team collaboration
- Level: mid
- Type: full_time
Stacklok
Stacklok is an AI-first company led by Kubernetes co-creator Craig McLuckie, helping enterprise developers connect the data, systems, and services that power t…
Bellevue
- Skills: Kubernetes, Terraform, ArgoCD, Cloud-Native, Incident Response, Monitoring, Automation, Observability, SLO, Infrastructure as Code
- Level: mid
- Type: full_time
OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the bounda…
Seattle
- Skills: distributed systems, cloud infrastructure, reliability engineering, observability, Python, Go, C++, Rust, Kubernetes, Terraform
- Level: mid
- Type: full_time