Baseten
Join our dynamic team at Baseten, where we’re revolutionizing AI deployment with cutting-edge inference infrastructure. Backed by premier investors such as IVP…
Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineer, Kubernetes, Scalable Infrastructure, Infrastructure-as-Code, CI/CD Tools, Project Management, Collaboration, Mentorship, Performance Optimization, Machine Learning
- Experience: 3+ years of professional work experience in a fast-paced, high-growth environment
- Type: Full-time
Alchemy
Alchemy is the only complete developer platform that offers the powerful APIs, SDKs, and tools necessary to build and scale onchain apps and rollups. Our infra…
Infrastructure Engineer (Reliability Focus)
San Francisco
- Skills: Reliability, Observability, Infrastructure Engineer, Production Systems, AWS, Docker, Kubernetes, CI/CD, Infrastructure-as-Code, Engineering Excellence
- Experience: 5+ years of experience as an Infrastructure Engineer focused on Reliability (e.g., Site Reliability Engineer, Production Engineer, Platform Engineer)
- Type: Full-time
- Salary: $135,000 - $350,000 annually
Arkose Labs
Arkose Labs protects enterprises from cybercrime and abuse, offering the world's first $1M warranties for credential stuffing and SMS toll fraud. They have a s…
Senior Director of Engineering
San Mateo
- Skills: Platform Engineering, Infrastructure, Site Reliability, Cloud Infrastructure, Incident Response, AWS, Azure, Distributed Systems, CI/CD, Infrastructure-as-Code
- Experience: 5+ years of leadership experience in Platform, Infrastructure, SRE, or related fields; 10+ years of experience in software engineering.
- Type: Full-time
- Salary: $270,000.00-$350,000.00
Astranis
Astranis is a telecommunications company that operates satellites from geostationary orbit (GEO) to connect millions of people worldwide, currently expanding i…
Senior Site Reliability Engineer - Ground Software
San Francisco
- Skills: Kubernetes, site reliability engineer, DevOps, Linux, monitoring, deployment practices, software systems, automation, mission control, shell programming
- Experience: 7+ years of experience as a Site Reliability Engineer, DevOps or DevSecOps; 7+ years of experience on Linux
- Type: Full-time
- Salary: $150,000 - $215,000 USD
Succinct
Succinct is focused on making zero knowledge proofs accessible to developers, with state of the art zkVM technology and a proving network infrastructure.
Senior Software Engineer
San Francisco
- Skills: zero knowledge proofs, zkVM, distributed system, infrastructure, container orchestration, autoscaling, monitoring stack, observability, Rust, Golang
- Experience: Previous experience with container-based orchestration, autoscaling, and monitoring/observability stack.
- Type: Full-time
- Salary: Above-market salary and generous equity compensation
Figma
Figma is growing our team of passionate people on a mission to make design accessible to all. Figma helps entire product teams brainstorm, design and build bet…
Senior Engineer, Production Engineering
San Francisco
- Skills: Production Engineering, reliability, durability, scalability, performance, infrastructure, AWS, Cloud services, operational maturity, debugging
- Experience: 5+ years of experience operating infrastructure components/services at scale
- Type: Full-time
- Salary: $149,000 – $350,000 USD
Hive
Hive is the leading provider of cloud-based AI solutions to understand, search, and generate content, and is trusted by hundreds of the world's largest and mos…
DevOps and Systems Engineer
San Francisco
- Skills: cloud-based AI solutions, machine learning, DevOps, Site Reliability, automation, enterprise SaaS, distributed computing, high performance computing, hybrid infrastructure, GPU integration
- Type: Full-time
Neuralink
We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…
Infrastructure Team Member
Fremont
- Skills: software engineering, cloud architecture, infrastructure, networking protocols, Linux systems, hybrid cloud, security fundamentals, IaC tools, cryptographic protocols, systems administration
- Experience: Experience building hybrid cloud/on-prem infrastructure, software engineering skills, and system administration experience.
- Type: Full-time
- Salary: $35/hr USD
NewsBreak
NewsBreak is redefining the way users interact with local news and their communities. By bridging local users, local content creators, and local businesses, ou…
Software Engineer in Reliability & Availability
Mountain View
- Skills: AWS, Kubernetes (EKS), EMR (Elastic MapReduce), service reliability, fault-tolerant architectures, Infrastructure-as-Code (IaC), CI/CD pipelines, monitoring tools (Prometheus, Grafana), high-availability strategies, incident response
- Experience: 2+ years in SRE, DevOps, or Infrastructure Engineering roles
- Type: Full-time
- Salary: $130,000 – $260,000 USD
Luma AI
Site Reliability Engineer (SRE)
Palo Alto
- Skills: Site Reliability Engineer, SRE, Infrastructure, GPU clusters, H100 GPUs, Monitoring tools, Management tools, Performance problems, Maintenance problems, Data Processing
Xero
Xero helps businesses by automating routine tasks and connecting them with the right data, advisors, and apps, ultimately contributing to a stronger economy.
Team Lead of Product SRE
San Mateo
- Skills: Product SRE, SRE engineers, reliability, Observability, high performing services, Engineering, high performing teams, Product SRE strategy, transformation, expert communicator
- Experience: Strong Engineering background, deep experience in SRE
Zoox
Staff Technical Operations Engineer
Foster City
- Skills: IT Technical Operations, real-time command center, monitoring services, Site Reliability Engineering (SRE), Technical Operations Engineering, stability, live robot missions, strategic initiatives, innovative solutions, reliability and performance
Replit
Replit is the fastest way to turn ideas into software. With our powerful AI-powered Agent and Assistant, anyone can create and launch apps from natural languag…
Site Reliability Engineer
Foster City
- Skills: Site Reliability Engineering, SRE, Infrastructure Automation, Monitoring Solutions, Infrastructure as Code, CI/CD Pipelines, Incident Management, Performance Optimization, Distributed Systems, Cloud-native Technologies
- Experience: 3+ years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering)
- Type: Full-time
Gusto
Gusto is a modern, online people platform that helps small businesses take care of their teams. On top of full-service payroll, Gusto offers health insurance, …
Storage Infrastructure Engineer
San Francisco
- Skills: storage infrastructure, MySQL, Postgres, data streaming, Kafka, cloud platforms, AWS, Terraform, resiliency, automation
- Experience: 4+ years of experience with software development and architecture; 2+ years of experience with database technologies like MySQL or Postgres; 2+ years of experience with data streaming technologies, particularly Kafka
- Type: Full-time
- Salary: $164,000-$237,000 in Denver & most remote locations, $235,000-$265,000 for San Francisco & New York
Robinhood Markets
Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…
Staff Software Engineer - Reliability
Menlo Park
- Skills: reliability, scalability, performance, security, software engineering, distributed systems, incident metrics, operational excellence, mentoring, infrastructure
- Experience: 8+ years experience in designing, building, and maintaining large-scale, distributed systems
- Type: Full-time
- Salary: $217,000 - $255,000 USD (Zone 1); $190,000 - $224,000 USD (Zone 2); $169,000 - $199,000 USD (Zone 3)
Twitter
Twitter is a social media platform that allows users to post and interact with messages known as tweets.
Site Reliability Engineering Team Lead
San Francisco
- Skills: site reliability engineering, team leadership, engineering collaboration, technical design, reliability practices, coaching, team empowerment, personal development, cross-team communication, system scalability
- Experience: 5+ years in a leadership role within engineering
- Type: Full-time
Meta
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Ap…
Production Engineer
Menlo Park
- Skills: Production Engineering, DevOps Engineer, Site Reliability Engineer, UNIX, TCP/IP, Python, Kubernetes, Terraform, MySQL, Infrastructure Management
- Experience: 2+ years of experience in UNIX and TCP/IP network fundamentals, 2+ years of coding experience
- Type: Full-time
- Salary: $117,000/year to $173,000/year + bonus + equity + benefits
Coupang
Coupang is a large-scale e-commerce company, operating complex systems to deliver mission-critical services.
Site Reliability Engineer (SRE)
Mountain View
- Skills: Site Reliability Engineering, Automation, Infrastructure Automation, Cloud-based Infrastructure, DevOps, CI/CD, Kubernetes, Observability, Large-Scale Systems, E-commerce
- Experience: 10+ years of industry experience building and operating large-scale distributed systems.
- Type: Full-time
Sigma Computing
Sigma is the only cloud analytics and business intelligence tool empowering business teams to break free from the confines of the dashboard, explore data for t…
Senior Software Engineer - Observability and Reliability
San Francisco
- Skills: observability, distributed tracing, application performance management, cloud security, GCP, AWS, Azure, data analytics, Kubernetes, best practices
- Experience: 5+ years industry experience building and maintaining high-quality software
- Type: Full-time
- Salary: $150k - $220k annually
Palo Alto Networks
Palo Alto Networks is a cybersecurity company that offers advanced firewalls and cloud-based security services to secure the digital transformation.
Manager, Site Reliability Engineering (Cortex, Tools and Platforms)
Santa Clara
- Skills: DevOps, Site Reliability Engineering, Cortex, Security, Engineering Management, Cloud, Platforms, Production Operations, AI, Software Development
- Type: Full-time
Fieldguide
Fieldguide is establishing a new state of trust for global commerce and capital markets through automating and streamlining the work of assurance and audit pra…
Infrastructure Engineer
San Francisco
- Skills: SRE, DevOps, infrastructure, AWS, Kubernetes, TypeScript, Python, Go, IaC, reliability
- Experience: 5+ years of total experience
- Type: Full-time
Glean
We’re on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work.
Senior Site Reliability Engineer (SRE)
Palo Alto
- Skills: SRE, cloud infrastructure, automation, monitoring, incident management, performance optimization, scalability, security compliance, software development, cloud platforms
- Experience: 8+ years of experience in a senior-level role within Site Reliability Engineering or similar role
- Type: Full-time
- Salary: $155,000 - $250,000 annually
Watershed
Watershed is the enterprise sustainability platform. Companies like Airbnb, Carlyle Group, FedEx, Visa, and Dr. Martens use Watershed to manage climate and ESG…
Senior Engineer, Developer Infrastructure
San Francisco
- Skills: engineering experience, developer infrastructure, testing tooling, observability, production monitoring, JavaScript, TypeScript, React, GCP, Terraform
- Experience: 6+ years
- Salary: $189,000 - $241,500 USD
Orb
Orb is on a mission to revolutionize billing infrastructure for the modern era of AI and software. We empower businesses to align their monetization with produ…
Infrastructure Engineer
San Francisco
- Skills: infrastructure, reliability, observability, scalability, performance-critical, event processing, cloud, AWS, resiliency, mentorship
- Experience: 5+ years in software engineering, 4+ years in infrastructure domain
- Type: Full-time
Visa
Transform global payment systems through automation and innovation.
Middleware Reliability Engineer
Foster City
- Skills: automation, infrastructure as code, DevOps, observability, CI/CD, Terraform, Ansible, Python, Java, Go
- Type: Hybrid
Braze
Braze is the leading customer engagement platform that empowers brands to Be Absolutely Engaging.™ Braze allows any marketer to collect and take action on any …
Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineer, infrastructure, automation, Kubernetes, Docker, Terraform, DevOps, monitoring, scalability, Linux
- Experience: 3+ years of experience as a Software, DevOps, or Site Reliability Engineer
- Type: Full-time
- Salary: $120,960 - $194,400/year, with an expected On Target Earnings (OTE) between $134,400 and $216,000/year
2K Games
2K is a global video game company, publishing titles developed by some of the most influential game development studios in the world. Our portfolio of titles i…
Sr Manager of DevOps and Observability
Novato
- Skills: DevOps, Observability, SRE, cloud services, infrastructure management, CI/CD, security, performance, stakeholder management, microservices
- Experience: 5+ years in SRE, DevOps, or system engineering fields; 3+ years coaching and mentoring senior technical talent.
- Type: Full-time
- Salary: $155,800 - $230,560 per Year
Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a who…
Staff Software Engineer, AI Reliability Engineering
San Francisco
- Skills: Software Engineering, Reliability Engineering, Service Level Objectives, Monitoring Systems, High-Availability Infrastructure, Incident Response, Cost Optimization, Distributed Systems, AI Infrastructure, Chaos Engineering
- Type: Full-time
- Salary: $320,000 - $485,000 USD
Palo Alto Networks
Palo Alto Networks is a cybersecurity company that aims to redefine protection and security in the digital age. Their mission is to be the cybersecurity partne…
Sr Site Reliability Engineer (App Service Team)
Santa Clara
- Skills: Site Reliability Engineering, DevOps, cloud-native applications, AWS, GCP, Terraform, Kubernetes, automation, programming languages, CI/CD
- Experience: 4+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering; 2+ years building high availability, scalable cloud-native applications on AWS and GCP
- Type: Full-time
Hippocratic AI
Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…
Senior Site Reliability Engineer (GCP / Kubernetes)
Palo Alto
- Skills: infrastructure automation, deployment pipelines, monitoring, scalable systems, cloud platforms, Kubernetes, Terraform, Ansible, Jenkins, security compliance
- Experience: At least 5 years of professional experience in DevOps engineering or a related field
- Type: Full-time
Luma AI
Senior Software Engineer - Reliability
Palo Alto
- Skills: SRE, GPU, infrastructure, monitoring, cloud providers, automation, scalability, containerization, observability, problem-solving
- Experience: 5+ years
- Type: Full-time
OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the bounda…
Software Engineer, Reliability
San Francisco
- Skills: reliability, scalability, performance, monitoring, automation, Infrastructure as Code, containerization, cloud infrastructure, observability, microservices
- Experience: Proven experience as a reliability engineer or a similar role in a fast-paced, rapidly scaling company.
- Type: Full-time
- Salary: $255K – $405K
Sesame
Sesame believes in a future where computers are lifelike - with the ability to see, hear, and collaborate with us in ways that feel natural and human. With thi…
Backend Infrastructure Engineer
San Francisco
- Skills: backend, infrastructure, systems, reliability engineering, monitoring, deployments, Terraform, Kubernetes, automation, data engineering
- Type: Full-time
- Salary: $175K - $280K
Palo Alto Networks
Senior Staff DevOps Engineer
Santa Clara
- Skills: DevOps, SRE, Cloud infrastructure, Automation, Terraform, Kubernetes, GitLab CI/CD, Monitoring, Security, Reliability
Foundry Technologies, Inc.
Foundry is actively seeking talented candidates at the Senior to Principal level, with a goal to transform how AI companies access compute power. They are buil…
Senior Site Reliability Engineer, Supply
San Francisco
- Skills: Site Reliability Engineer, Cloud Infrastructure, GPU Management, Incident Response, Monitoring Systems, Ansible, Scripting, Data Center Operations, Technical Documentation, AI Workloads
- Experience: Experience working with Linux systems administration and command-line interfaces, experience leading incident response and root cause analysis.
- Type: Full-time
- Salary: $170,000 - $230,000
Sustainable Talent
Sustainable Talent is a staffing agency partnered with Nvidia, focusing on providing talent for tech roles in infrastructure and data centers.
Platform Reliability & Lab Support Engineer
Santa Clara
- Skills: Infrastructure, Data Centers, Hardware, Software, Networking, Troubleshooting, DevOps, Maintenance, Collaboration, Testing
- Experience: 4+ years of equivalent experience in a Lab or Datacenter environment.
- Type: Full-time
- Salary: $70/hr - $80/hr
Cisco ThousandEyes
Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network – even t…
Lead Site Reliability Engineer II, Production Engineering
San Francisco
- Skills: DevSecOps, SRE, cloud-native, Kubernetes, Docker, AWS, security architecture, CI/CD pipelines, vulnerability management, observability
- Experience: 8+ years of experience in SRE, DevSecOps, or similar roles, with a strong focus on security.
- Type: Full-time
- Salary: $198,600 - $282,900 USD
Palo Alto Networks
Palo Alto Networks is a cybersecurity company committed to protecting our digital way of life. The company aims to redefine cybersecurity standards and focuses…
Principal Site Reliability Engineer (WildFire Cloud Infrastructure)
Santa Clara
- Skills: Site Reliability Engineer, DevOps, Cloud infrastructure, Automation, Kubernetes, GCP, AWS, Python, Docker, Terraform
- Experience: BS or MS in Computer Science, a related field, or equivalent professional experience
- Type: Full-time
- Salary: $160,000 - $225,000/YR
Cisco ThousandEyes
Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network – even t…
Senior Site Reliability Engineer, Infrastructure
San Francisco
- Skills: AWS, Terraform, Infrastructure-as-Code, Python, Go, Docker, Networking, Security, Site Reliability Engineering, Distributed Systems
- Experience: 5+ years
- Type: Full-time
- Salary: $174,300 - $203,100 USD
Pano AI
Pano AI is a 90-person growth stage start-up, headquartered in San Francisco, that is the leader in early wildfire detection and intelligence, helping fire pro…
Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineer, monitoring systems, automation, cloud services, containerization, Infrastructure as Code, troubleshooting, deployment processes, GCP, SRE mindset
- Experience: 5+ years of professional experience in a fast-paced SaaS or a similar business environment, 3+ years of hands-on experience supporting production systems as a Site Reliability Engineer (SRE) or a DevOps Engineer
- Type: Full-time
- Salary: $150,000 - $205,000 a year
Serotonin
A leading institutional investment platform in the digital asset space, providing comprehensive infrastructure and technology for investors to manage their ent…
Celonis
Celonis helps some of the world’s largest and most esteemed brands make processes work for people, companies and the planet. With over 5,000 enterprise custome…
Site Reliability Engineer
Redwood City
- Skills: Site Reliability Engineering, Microservices, Kubernetes, Automation, Incident management, Cloud computing, Java, Python, Observability, CI/CD
- Experience: Minimum of 5 years of experience building and maintaining cloud-based software applications.
- Type: Full-time
- Salary: $160,000 - $210,000 USD
Box
Box (NYSE:BOX) is the leader in Intelligent Content Management. Our platform enables organizations to fuel collaboration, manage the entire content lifecycle, …
Site Reliability Engineer
Redwood City
- Skills: SRE, reliability, scalability, cloud-native, Kubernetes, AWS, GCP, observability, automation, distributed systems
- Experience: 5+ years of working experience designing, developing, and operating large-scale, customer-facing products or services
- Type: Full-time
Xero
Xero is here to help you supercharge your business by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, ad…
Technical Duty Officer / Sr. Site Reliability Engineer
San Mateo
- Skills: Site Reliability Engineer, Incident Management, AWS, Troubleshooting, Process Frameworks, Service Reliability, Collaboration, Technical Leadership, Automation, Communication
- Experience: 5+ years of experience as a Site Reliability Engineer, with relevant experience in an Operations or Engineering environment.
- Type: Permanent / Hybrid
- Salary: $185,000 - $230,000 a year
Celonis
Celonis helps some of the world’s largest and most esteemed brands make processes work for people, companies, and the planet. With over 5,000 enterprise custom…
Site Reliability Engineer
Redwood City
- Skills: Site Reliability Engineering, SRE principles, observability, automation, incident prevention, cloud platforms, Java, Python, Kubernetes, error budgets
- Experience: Minimum of 8+ years of experience in software engineering or SRE roles.
- Type: Full-time
- Salary: $195,000 - $235,000 USD
Robinhood Markets
Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…
Staff Software Engineer - Reliability Engineering
Menlo Park
- Skills: reliability, scalability, performance, security, distributed systems, programming languages, Linux, networking, incident metrics, monitoring
- Experience: 8+ years
- Type: Full-time
- Salary: $217,000 - $255,000 USD
GoodLeap
GoodLeap is a technology company delivering best-in-class financing and software products for sustainable solutions, from solar panels and batteries to energy-…
Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineer, software engineering, system engineering, automation, monitoring, incident response, infrastructure management, DevOps, observability, AWS
- Type: Full-time
- Salary: $97,000 - $141,000 a year
Loft Orbital
Loft Orbital is revolutionizing access to space by building reliable, shareable satellites that drastically reduce the time and complexity traditionally requir…
Senior Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineering, Cloud Infrastructure, DevOps, satellites, space operations, integration, delivery, reliability, automated infrastructure, SatDevOps
Crusoe
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…
Senior Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineer, infrastructure, automation, monitoring, performance, availability, CI/CD, Kubernetes, Docker, Linux
- Experience: 5+ years of professional SRE experience
- Type: Full-time
- Salary: $183,000 - $210,000
Crusoe Energy Systems
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…
Staff Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineering, infrastructure design, automation, monitoring, incident response, cloud infrastructure, AI applications, network programming, Unix/Linux, programming languages
- Experience: 8+ years of professional SRE experience
- Type: Full-time
- Salary: $250,000
Aerospike
Aerospike, a leader in next-generation, always-on, hyperscale data solutions, enables extreme-scale, real-time applications for various industry leaders.
Performance & Reliability Engineer
Mountain View
- Skills: performance engineering, reliability, distributed systems, database concepts, performance tuning, Linux/Unix, observability tools, problem-solving, collaboration, communication
- Experience: Experience with distributed systems or large-scale services, preferably in a production setting.
- Type: Full-time
- Salary: $140,000 - $175,000
Crusoe Energy Systems
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…
Staff Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineering, AI infrastructure, automation, monitoring, incident response, system performance, network programming, security best practices, CI/CD, cloud infrastructure
- Experience: 8+ years of professional SRE experience
- Type: Full-time
- Salary: up to $250,000 per year + Bonus
Crusoe
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…
Senior Site Reliability Engineer
San Francisco
- Skills: Site Reliability Engineering, AI infrastructure, production systems, system reliability, automation, monitoring, Unix/Linux, Cloud, Kubernetes, CI/CD
- Experience: 5+ years of professional SRE experience and 5+ years of experience contributing to the architecture and design of new and current systems.
- Type: Full-time
- Salary: $183,000 - $210,000 per year + Bonus
ServiceNow
It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today …
Senior Staff Engineer – Operations & Reliability (DevOps Focus)
Santa Clara
- Skills: DevOps, operations, reliability, automation, observability, continuous improvement, incident management, cloud, Kubernetes, Docker
- Experience: 10+ years of software engineering experience
- Type: Full-time
- Salary: $162,600 - $284,600
Personalis, Inc
Personalis is transforming the active management of cancer through breakthrough personalized testing, focusing on cancer management and patient care.
Senior Software Engineer
Fremont
- Skills: software engineering, LIMS, CI/CD pipelines, Python, Java, PostgreSQL, MySQL, Flask, Django, site reliability engineering
- Experience: 5+ years of experience in software engineering, site reliability engineering, and/or DevOps.
- Type: Full-time
- Salary: $147,000 to $180,000 per year
Intuit
Intuit is the global financial technology platform that powers prosperity for the people and communities we serve. With approximately 100 million customers wor…
Staff Software Engineer
Mountain View
- Skills: Kubernetes, AWS, DevOps, Platform Engineering, Reliability Engineering, Cloud Architecture, Automation, Observability, Incident Management, Data Analysis
- Experience: 7+ years
- Type: Full-time
- Salary: $184,500 - $250,000
Crusoe
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…
Director of Engineering
San Francisco
- Skills: AI infrastructure, Cloud infrastructure, SRE organization, Incident Management, Operational Excellence, Reliability best practices, Observability standards, Mentorship, Incident Management program, Reliability engineering
- Type: Full-time
- Salary: $320,000 - $360,000
ServiceNow
ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500®. Our intelli…
Senior Staff Machine Learning Engineer - DevOps/Site Reliability Engineer
Santa Clara
- Skills: Machine Learning, DevOps, Site Reliability Engineer, AI, infrastructure, platform operations, Kubernetes, Python, software engineering, cloud-based platform
- Experience: 8+ years of experience with infrastructure and platform operations, deployments, SRE, and DevOps
Autify, Inc.
Autify, Inc. is a San Francisco-based startup that was founded by the first Japanese team to graduate from Alchemist Accelerator, one of the top accelerators i…
Infrastructure Engineer
San Francisco
- Skills: automation, reliable, secure, cost-efficient, cloud infrastructure, software reliability, engineering, SRE, GenAI, test automation
Checkr
Checkr is building the data platform to power safe and fair decisions. Established in 2014, Checkr’s innovative technology and robust data platform help custom…
Site Reliability Engineer II
San Francisco
- Skills: Site Reliability Engineer, AWS, Azure, containers, micro-service architecture, REST APIs, incident commander, production issues, data quality, collaboration
- Experience: 3+ years
- Type: Full-time
- Salary: $135,000 to $159,000
ServiceNow
PLATO (Platform Engineering and AI Technology Organization) at ServiceNow is a customer-focused innovative group building intelligent software using a variety …
Visa Technology & Operations LLC
A subsidiary of Visa Inc. focusing on technology and operations.
Sr. Data Engineer
Foster City
- Skills: Platform-as-a-Service, scalable, secure, Kubernetes, Open-Source, high availability, CRDs, production, debug, documentation
Crusoe
Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…
Senior Staff+ Infrastructure Engineer
San Francisco
- Skills: Kubernetes, distributed systems, networking, APIs, scalability, event-driven systems, Go, Rust, AWS, security
- Experience: 8+ years in platform, infrastructure, or backend engineering, 2+ years at a Staff or Principal level
- Type: Full-time
- Salary: $245,000 - $290,000 + Bonus
Coupang
A fast-growing retail company disrupting the commerce industry from South Korea, combining startup culture with large global resources.
Technical Program Manager - Site Reliability Engineering (SRE) and Performance
Mountain View
- Skills: site reliability engineering, performance, distributed systems, large-scale systems, project management, security, privacy, compliance, stakeholders, scalability
- Experience: Minimum 12 years managing large-scale cross-functional projects
- Type: Full-time
- Salary: $159,000 - $324,000 per year
Altana
Altana applies AI to the world's largest organized body of supply chain data to power a more resilient, secure, and sustainable model of global commerce, focus…
Senior Manager, Technical Operations & Observability
San Francisco
- Skills: Observability, SRE, Incident Management, IT Operations, FinOps, Automation, Reliability, Cloud Platforms, Monitoring, Alerting
- Salary: $185,000 - $220,000 USD
Astronomer
Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified Data…
Director of Reliability Engineering
San Francisco
- Skills: Reliability Engineering, SRE, Cloud-native, Automation, Observability, Scalability, Incident Management, Service Uptime, Distributed Systems, Team Leadership
- Experience: 10+ years in software engineering, SRE, or DevOps roles; 5+ years in technical leadership
- Type: Full-time
- Salary: $260,000 - $290,000 plus equity
Speak
Speak is on a journey to fix the language learning experience by creating AI-powered conversational tools to help billions gain fluency.
SRE Engineer, Lead
San Francisco
- Skills: reliability, infrastructure, Kubernetes, GCP, Node.js, PostgreSQL, Redis, observability, incident response, scalability
- Experience: 7+ years in SRE, DevOps, or infrastructure-focused engineering roles
Zoox
Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the…
Senior Site Reliability Engineer
Foster City
- Skills: Site Reliability Engineering, Autonomous Vehicles, Microservice Architecture, Kubernetes, Data Pipelines, Performance Metrics, Linux, Python, C/C++, AWS
- Experience: 2+ years
- Type: Full-time
- Salary: $210,000 to $250,000
Moody's Shared Services, Inc.
Senior Systems Engineer
Newark
- Skills: Design, Build, Operate, System operation, Monitoring, Hardware upgrades, Disaster recovery, Vendor communication, Big data Spark clusters, Kubernetes
- Experience: At least two (2) years as a Systems Engineer or related role
- Type: Full-time
- Salary: $110,032 - $220,250/yr
Abridge
Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare, transforming patient-clinician conversations into structured clini…
Software Engineer, SRE
San Francisco
- Skills: platform engineering, SRE, cloud-native, Kubernetes, CI/CD, Google Cloud Platform, Terraform, GCP, security, automation
- Experience: 5+ years of platform/devops experience in a cloud-native software company
- Type: Full-time
- Salary: $162K – $234K
Benchling
Benchling’s mission is to unlock the power of biotechnology. The world’s most innovative biotech companies use Benchling’s R&D Cloud to power the development o…
Infrastructure Engineer
San Francisco
- Skills: infrastructure, AWS, security, monitoring, automation, site reliability, cloud computing, Kubernetes, Terraform, CI/CD
- Experience: 5 or more years in DevOps, SRE, or platform engineering
- Type: Full-time
- Salary: $157,150 to $212,750
Ripple
Ripple is building a world where value moves like information does today, providing crypto solutions for financial institutions, businesses, governments, and d…
Staff Software Engineer - Platform Engineering
San Francisco
- Skills: platform, infrastructure, automation, DevOps, Kubernetes, CI/CD, Terraform, observability, disaster recovery, scalability
- Experience: 10+ years professional experience, including 5+ years in cloud-native infrastructure
- Type: Full-time
- Salary: $188,000 - $211,500 USD (CA range)
Reliable Robotics
Building safety-enhancing technology for aviation to improve safety, convenience, and transformation of air transportation.
Site Reliability Engineer (SRE)
Mountain View
- Skills: support, monitoring, infrastructure, tools, deploying, systems safety, technology, automation, supporting, improvement
ABC Labs
ABC Labs contributes to the development of the Reserve protocol and helps to support and grow the Reserve ecosystem.
Rubrik
Rubrik (NYSE: RBRK) is on a mission to secure the world’s data. With Zero Trust Data Security™, we help organizations achieve business resilience against cyber…
Site Reliability Engineer
Palo Alto
- Skills: Site Reliability Engineering, Relational Databases, SQL, Kubernetes, Golang, Python, Java, Scalability, Disaster Recovery, FedRAMP
- Type: Full-time
- Salary: $176,800 - $265,200 annually
Wayve
Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understa…
Software Engineer
Sunnyvale
- Skills: Site Reliability Engineering, Python, C++, Rust, Cloud Computing, CI/CD, Containerization, Monitoring, Troubleshooting, Autonomous Vehicles
- Type: Full-time
Veza
Veza is the identity security company. Identity and security teams use Veza to secure identity access across SaaS apps, on-prem apps, data systems, and cloud i…
Site Reliability Engineer
Redwood City
- Skills: Site Reliability Engineering, Cloud Automation, Kubernetes, Terraform, AWS, Monitoring Tools, Incident Response, Technical Documentation, Customer Technical Support, GitOps
- Type: Full-time
Augment Code
The best software comes from Augmenting developers, not replacing them. We’re bringing joy back to software engineering and keeping developers in flow by build…
Software Engineer, SRE
Palo Alto
- Skills: Kubernetes, GCP, Linux, monitoring, containers, cloud infrastructure, Go, Shell, Jsonnet, AI products
- Type: Full-time
- Salary: $225,000 - $300,000 annually
Genmo
We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI.
Site Reliability Engineer
San Francisco
- Skills: GPU clusters, Kubernetes operations, Infrastructure-as-Code, GitOps workflows, CI/CD pipelines, observability stack, high-performance networking, NVIDIA DCGM, containerized GPU stacks, distributed training
- Type: Full-time