jobswithgpt - 77 Site Reliability Engineer jobs in San Francisco

Replit

Skills & Focus: Site Reliability Engineering, SRE, Infrastructure Automation, Monitoring Solutions, Infrastructure as Code, CI/CD Pipelines, Incident Management, Performance Optimization, Distributed Systems, Cloud-native Technologies

About the Company: Replit is the fastest way to turn ideas into software. With our powerful AI-powered Agent and Assistant, anyone can create and launch apps from natural languag…

Experience: 3+ years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering)

Type: Full-Time

Benefits: Flexible Work Hours, Competitive Salary & Equity, Home Office Set-Up Stipend, Health, Dental, Vision and Life Insurance…

Visa

Senior Director of Site Reliability Engineering Foster City

Skills & Focus: Site Reliability Engineering, automation, performance, reliability, DevSecOps, monitoring, scalability, incident management, Service Level Objectives, CI/CD

Type: Hybrid

Zoox

Platform/Site Reliability Engineer Foster City

Skills & Focus: site reliability engineer, uptime, autonomous vehicles, fault-tolerant systems, deployment, operation, data-processing pipelines, compute-intensive tasks, CPUs, GPUs

About the Company: Zoox is a robotics company focused on developing autonomous vehicles with an ethos of automation throughout the infrastructure components they build.

Staff Technical Operations Engineer Foster City

Skills & Focus: IT Technical Operations, real-time command center, monitoring services, Site Reliability Engineering (SRE), Technical Operations Engineering, stability, live robot missions, strategic initiatives, innovative solutions, reliability and performance

Google

Software Developer Manager II, Site Reliability Engineering Fremont

Skills & Focus: Site Reliability Engineering, software development, automation, distributed systems, team management, project leadership, scalability, performance, problem solving, cloud infrastructure

Experience: 8 years of experience with data structures or algorithms, 5 years of experience with software development, 3 years of experience managing people or teams

Neuralink

Infrastructure Engineer Fremont

Skills & Focus: software engineering, networking protocols, Linux systems, cloud infrastructure, system administration, DevOps, automating processes, cryptographic protocols, production environments, Brain-Computer Interface (BCI)

About the Company: We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…

Experience: Robust software engineering skills, experience in Linux systems, cloud/on-prem infrastructure.

Salary: $116,000 - $235,000 USD

Type: Full-time

Benefits: Medical, dental, and vision insurance, paid holidays, commuter benefits, meals provided, equity + 401(k) plan, parental…

Infrastructure Team Member Fremont

Skills & Focus: software engineering, cloud architecture, infrastructure, networking protocols, Linux systems, hybrid cloud, security fundamentals, IAC tools, cryptographic protocols, systems administration

About the Company: We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…

Experience: Experience building hybrid cloud/on-prem infrastructure, software engineering skills, and system administration experience.

Salary: $35/Hr USD

Type: Full-time

Benefits: An opportunity to change the world, growth potential, excellent medical/dental/vision insurance, paid holidays, commute…

Robinhood Markets

Senior Software Engineer - Reliability Menlo Park

Skills & Focus: Reliability, Software Engineering, Large-scale systems, Distributed systems, Production issues, Monitoring, Best practices, Operational excellence, Collaboration, Infrastructure

About the Company: Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…

Experience: 5+ years experience in designing, building, and maintaining large-scale, distributed systems

Salary: $187,000 - $220,000 USD

Type: Full-time

Benefits: 100% paid health insurance for employees with 90% coverage for dependents, Annual lifestyle wallet for personal wellnes…

Character.AI

Site Reliability Engineer Menlo Park

Skills & Focus: DevOps, SRE, Python, Golang, SQL, Linux, CI/CD, Kubernetes, Terraform, GCP

About the Company: Character.AI empowers people to connect, learn and tell stories through interactive entertainment. Over 20 million people visit Character.AI every month, using…

Experience: 5+ years

Robinhood Markets

Staff Software Engineer - Reliability Menlo Park

Skills & Focus: reliability, scalability, performance, security, software engineering, distributed systems, incident metrics, operational excellence, mentoring, infrastructure

About the Company: Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…

Experience: 8+ years experience in designing, building, and maintaining large-scale, distributed systems

Salary: $217,000 - $255,000 USD (Zone 1); $190,000 - $224,000 USD (Zone 2); $169,000 - $199,000 USD (Zone 3)

Type: Full-time

Benefits: 100% paid health insurance for employees with 90% coverage for dependents; Annual lifestyle wallet for personal wellnes…

Coupang

Site Reliability Engineer (SRE) Mountain View

Skills & Focus: Site Reliability Engineering, Automation, Infrastructure Automation, Cloud-based Infrastructure, DevOps, CI/CD, Kubernetes, Observability, Large-Scale Systems, E-commerce

About the Company: Coupang is a large-scale e-commerce company, operating complex systems to deliver mission-critical services.

Experience: 10+ years of industry experience building and operating large-scale distributed systems.

Type: Full-time

Observability Engineer Mountain View

Skills & Focus: observability solutions, monitoring, alerting, logging, tracing, Kubernetes, DevOps, SRE practices, cloud-based infrastructure, performance indicators

About the Company: Coupang is a leading force in South Korean commerce, known for its exceptional customer service and innovative approach to retail and e-commerce. The company b…

Experience: Strong experience in implementing and managing observability solutions in large-scale, complex environments.

Salary: $159,000 - $324,000/year

Type: Full-time

Benefits: Medical/Dental/Vision/Life insurance, Flexible Spending Accounts, Long-term/Short-term Disability, Employee Assistance …

Inworld AI

Staff Cloud DevOps/Site Reliability Engineer (SRE) Mountain View

Skills & Focus: DevOps, Site Reliability Engineering, Terraform, Kubernetes, AWS, Azure, GCP, CI/CD, Microservices, Infrastructure-as-Code

About the Company: Inworld is the leading provider of AI technology for real-time interactive experiences, with a $500 million valuation and backing from top tier investors inclu…

Experience: 7+ years

Salary: $180,000 - $280,000

Type: Full-time

Benefits: Total compensation includes equity and benefits.

NewsBreak

Software Engineer in Reliability & Availability Mountain View

Skills & Focus: AWS, Kubernetes (EKS), EMR (Elastic MapReduce), service reliability, fault-tolerant architectures, Infrastructure-as-Code (IaC), CI/CD pipelines, monitoring tools (Prometheus, Grafana), high-availability strategies, incident response

About the Company: NewsBreak is redefining the way users interact with local news and their communities. By bridging local users, local content creators, and local businesses, ou…

Experience: 2+ years in SRE, DevOps, or Infrastructure Engineering roles

Salary: $130,000 – $260,000 USD

Type: Full-time

Benefits: Discretionary bonus and options may also be available; overall rewards package designed to attract top talents.

Google

Software Engineer III, Site Reliability Engineering Mountain View

Skills & Focus: Site Reliability Engineering, software development, large-scale systems, automation, coding, algorithms, problem solving, mentorship, collaboration, performance optimization

About the Company: Google is a global company focused on technology and innovation, committed to creating a culture of belonging and supporting a diverse workforce.

Experience: 2 years of experience with data structures/algorithms and software development in one or more programming languages.

Salary: $141,000-$202,000 + bonus + equity + benefits

Type: Full-time

Benefits: Comprehensive benefits including bonuses and equity

2K Games

Sr Manager of Devops and Observability Novato

Skills & Focus: DevOps, Observability, SRE, cloud services, infrastructure management, CI/CD, security, performance, stakeholder management, microservices

About the Company: 2K is a global video game company, publishing titles developed by some of the most influential game development studios in the world. Our portfolio of titles i…

Experience: 5+ years in SRE, DevOps, or system engineering fields, 3+ years coaching and mentoring senior technical talent.

Salary: $155,800 - $230,560 per Year

Type: Full-time

Benefits: Full range of medical, financial, and/or other benefits, including a bonus and/or equity awards.

Luma AI

Site Reliability Engineer (SRE) Palo Alto

Skills & Focus: SRE, Infrastructure, GPU clusters, H100 GPUs, Training, Data Processing, Monitoring, Management tools, Performance, Maintenance

Senior Software Engineer - Reliability Palo Alto

Skills & Focus: SRE, GPU, infrastructure, monitoring, cloud providers, automation, scalability, containerization, observability, problem-solving

Experience: 5+ years

Type: Full-time

Hippocratic AI

Senior Site Reliability Engineer (GCP / Kubernetes) Palo Alto

Skills & Focus: infrastructure automation, Kubernetes, DevOps, monitoring, scalability, cloud platforms, security compliance, deployment pipelines, disaster recovery, mentorship

About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare, aiming to improve accessibility and outcomes by applying deep healthca…

Experience: At least 5 years of professional experience in DevOps engineering or a related field

Type: Full time

Luma AI

Site Reliability Engineer (SRE) Palo Alto

Skills & Focus: Site Reliability Engineer, SRE, Infrastructure, GPU clusters, H100 GPUs, Monitoring tools, Management tools, Performance problems, Maintenance problems, Data Processing

Staff Software Engineer - Reliability Palo Alto

Skills & Focus: SRE, GPU infrastructure, monitoring systems, automation tools, scalability, cloud providers, containerization, IaC tools, service level objectives, problem-solving

Experience: 10+ years

Salary: $200,000 - $250,000

Type: Full time

Benefits: competitive equity packages in the form of stock options and a comprehensive benefits plan

Hippocratic AI

Senior Site Reliability Engineer (GCP / Kubernetes) Palo Alto

Skills & Focus: infrastructure automation, deployment pipelines, monitoring, scalable systems, cloud platforms, Kubernetes, Terraform, Ansible, Jenkins, security compliance

About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…

Experience: At least 5 years of professional experience in DevOps engineering or a related field

Type: Full time

Senior ML Infrastructure Engineer Palo Alto

Skills & Focus: ML Infrastructure, Kubernetes, Terraform, multi-cloud environments, orchestration platform, cloud platforms, resource optimization, automation, system health monitoring, capacity planning

About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…

Experience: 3-5 years

Type: Full time

Mistral AI

Site Reliability Engineer (SRE) Palo Alto

Skills & Focus: Site Reliability Engineer, DevOps, cloud computing, distributed systems, Kubernetes, Terraform, CI/CD, monitoring tools, incident response, scripting languages

About the Company: A tight-knit, nimble team dedicated to bringing cutting-edge AI technology to the world, making AI ubiquitous and open.

Experience: 5+ years of experience in a DevOps/SRE role

Salary: Competitive salary and bonus structure

Type: Full-time

Benefits: Comprehensive benefits package; Opportunities for professional growth and development

Glean

Senior Site Reliability Engineer (SRE) Palo Alto

Skills & Focus: SRE, cloud infrastructure, automation, monitoring, incident management, performance optimization, scalability, security compliance, software development, cloud platforms

About the Company: We’re on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work.

Experience: 8+ years of experience in a senior-level role within Site Reliability Engineering or similar role

Salary: $155,000 - $250,000 annually

Type: Full-time

Benefits: Competitive compensation, Medical, Vision and Dental coverage, Flexible work environment and time-off policy, 401k, Com…

Google

Software Developer Manager II, Site Reliability Engineering San Bruno

Skills & Focus: Site Reliability Engineering, software development, automation, distributed systems, team management, project leadership, scalability, performance, problem solving, cloud infrastructure

Experience: 8 years of experience with data structures or algorithms, 5 years of experience with software development, 3 years of experience managing people or teams

Abridge

Site Reliability Engineer (SRE) San Francisco

Skills & Focus: Site Reliability Engineer, SRE, Compensation, Full time, Hybrid, San Francisco, New York, Equity, Location, Engineering

Salary: $180K – $265K

Type: Full time

Benefits: Offers Equity

Crusoe

Senior/Staff+ Site Reliability Engineer I - Observability San Francisco

Skills & Focus: Site Reliability Engineering, Observability, Automation, Monitoring, Infrastructure Design, Telemetry, Collaboration, Performance Analysis, Coding, Continuous Improvement

About the Company: Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…

Experience: 12+ years of professional SRE experience

Salary: up to $250,000

Type: Full time

Benefits: Industry competitive pay, Restricted Stock Units, health insurance, paid parental leave, 401(k) with a 100% match-up to…

Astranis

Senior Site Reliability Engineer - Ground Software San Francisco

Skills & Focus: Kubernetes, site reliability engineer, DevOps, Linux, monitoring, deployment practices, software systems, automation, mission control, shell programming

About the Company: Astranis is a telecommunications company that operates satellites from geostationary orbit (GEO) to connect millions of people worldwide, currently expanding i…

Experience: 7+ years of experience as a Site Reliability Engineer, DevOps or DevSecOps; 7+ years of experience on Linux

Salary: $150,000 - $215,000 USD

Type: Full-time

Benefits: Equity, high quality company-subsidized healthcare, disability and life insurance benefits, flexible PTO, 401(K) retire…

Fieldguide

Infrastructure Engineer San Francisco

Skills & Focus: SRE, DevOps, infrastructure, AWS, Kubernetes, TypeScript, Python, Go, IaC, reliability

About the Company: Fieldguide is establishing a new state of trust for global commerce and capital markets through automating and streamlining the work of assurance and audit pra…

Experience: 5+ years of total experience

Type: Full-time

Benefits: Competitive compensation packages with meaningful ownership, Unlimited PTO, 401k, Wellness benefits including free ther…

Abridge

Site Reliability Engineer (SRE) San Francisco

Skills & Focus: SRE, Kubernetes, CI/CD pipelines, cloud security, observability, GCP, distributed systems, engineering enablement, scalability, incident response

About the Company: Abridge was founded in 2018 with the mission of powering deeper understanding in healthcare. Our AI-powered platform was purpose-built for medical conversation…

Experience: 6+ years of software engineering experience, with at least 2 years as a back-end engineer.

Salary: $180K – $265K

Type: Full time

Benefits: Generous Time Off, Comprehensive Health Plans, Paid Parental Leave, 401k Matching, Learning and Development Budget, Sab…

OpenAI

Infrastructure Engineer, Public Sector San Francisco

Skills & Focus: infrastructure, engineering, Kubernetes, Python, FastAPI, Cosmos DB, Postgres, Terraform, reliable systems, cloud

About the Company: OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the bounda…

Experience: 8+ years in engineering, including 4+ years in infrastructure

Salary: $220.5K – $385K

Type: Full time

Benefits: Medical, dental, and vision insurance, mental health and wellness support, 401(k) plan with 50% matching, generous time…

Orb

Infrastructure Engineer San Francisco

Skills & Focus: infrastructure, reliability, observability, scalability, performance-critical, event processing, cloud, AWS, resiliency, mentorship

About the Company: Orb is on a mission to revolutionize billing infrastructure for the modern era of AI and software. We empower businesses to align their monetization with produ…

Experience: 5+ years in software engineering, 4+ years in infrastructure domain

Type: Full-time

Benefits: Excellent medical, dental, and vision insurance - 100% coverage for you and dependents; Unlimited PTO (with 15 days min…

Crusoe Energy Systems

Site Reliability Engineer (SRE) San Francisco

Skills & Focus: observability, monitoring, telemetry, automation, collaboration, SRE, infrastructure, Python, Docker, Kubernetes

About the Company: Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…

Experience: 5+ years of professional SRE experience

Salary: $135,000 - $158,000

Type: Full-time

Benefits: Hybrid work schedule, Industry competitive pay, Restricted Stock Units, Health insurance package options, Employer cont…

Loft Orbital

Director, Cloud Infrastructure San Francisco

Skills & Focus: Cloud Infrastructure, Site Reliability Engineering, cloud-based infrastructure, scalability, security, efficiency, cloud automation, Infrastructure-as-Code, CI/CD pipelines, space access

About the Company: Loft Orbital builds a space infrastructure providing a fast & simple path to orbit. We operate satellites, fly customer payloads onboard, and handle the entire…

Freed

Site Reliability Engineer San Francisco

Skills & Focus: cloud infrastructure, Azure, Kubernetes, IaC, observability, monitoring, SQL, Git, security, databases

About the Company: Freed combines clinician love with the latest AI tech to create products that make clinicians happier, including an AI scribe that automates medical documentat…

Experience: 7+ years

Salary: $180K – $230K

Type: Full-time

Benefits: Competitive salary, equity, medical and dental vision for US-based employees, unlimited PTO, company-sponsored annual r…

Abridge

Site Reliability Engineer (SRE) San Francisco

Skills & Focus: Site Reliability Engineer, SRE, Hybrid, Onsite, San Francisco, New York, Full time, Compensation, Equity, Engineering

Salary: $180K – $265K

Type: Full time

Benefits: Equity participation in company stock option plan

Google

Software Engineering Manager II, Site Reliability Engineering San Francisco

Skills & Focus: Software Engineering, Site Reliability, Systems Engineering, Automation, Performance, Scalability, Technical Leadership, Problem-Solving, Project Management, Distributed Systems

About the Company: Google is a global technology company that specializes in Internet-related services and products.

Experience: 8 years of experience with data structures or algorithms; 5 years of software development experience; 3 years managing teams.

Salary: $197,000-$291,000

Type: Full-time

Benefits: Full benefits package including bonus, equity, health, and wellness.

Carta

Senior Site Reliability Engineer, Cloud Networking San Francisco

Skills & Focus: Kubernetes, EKS, Docker, AWS, Cloud Infrastructure, Terraform, Networking, CI/CD, API Services, Infrastructure as Code

About the Company: Carta develops purpose-built software that transforms traditional accounting into a powerful growth engine. Carta’s world-class fund administration platform su…

Salary: $181,688 - $213,750 in Seattle, WA; $191,250 - $225,000 in San Francisco, CA or Santa Clara, CA

Benefits: equity for all full-time roles, exceptional benefits, and commissions plans for applicable roles.

OpenAI

Stream Infrastructure Engineer San Francisco

Skills & Focus: stream infrastructure, Kafka, Azure EventHub, AWS Kinesis, infrastructure tooling, Terraform, Kubernetes, data platform, scalability, reliability

About the Company: OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the bounda…

Experience: 4+ years in stream infrastructure engineering

Salary: $200K – $385K

Type: Full time

Benefits: Medical, dental, and vision insurance, mental health support, 401(k) with 50% matching, generous time off, paid parenta…

Alchemy

Infrastructure Engineer (Reliability Focus) San Francisco

Skills & Focus: Reliability, Observability, Infrastructure Engineer, Production Systems, AWS, Docker, Kubernetes, CI/CD, Infrastructure-as-Code, Engineering Excellence

About the Company: Alchemy is the only complete developer platform that offers the powerful APIs, SDKs, and tools necessary to build and scale onchain apps and rollups. Our infra…

Experience: 5+ years of experience as an Infrastructure Engineer focused on Reliability (e.g., Site Reliability Engineer, Production Engineer, Platform Engineer)

Salary: $135,000 - $350,000 annually

Type: Full-time

Benefits: Comprehensive medical, dental, and vision coverage, 401k, unlimited flexible time off, equity options

Baseten

Site Reliability Engineer San Francisco

Skills & Focus: Site Reliability Engineer, Kubernetes, Scalable Infrastructure, Infrastructure-as-Code, CI/CD Tools, Project Management, Collaboration, Mentorship, Performance Optimization, Machine Learning

About the Company: Join our dynamic team at Baseten, where we’re revolutionizing AI deployment with cutting-edge inference infrastructure. Backed by premier investors such as IVP…

Experience: 3+ years of professional work experience in a fast-paced, high-growth environment

Type: Full-time

Benefits: Competitive compensation package (Unlimited PTO, 401k, covered healthcare premiums), A unique opportunity to be part of…

OpenAI

Site Reliability Engineer, Enterprise IAM San Francisco

Skills & Focus: Site Reliability Engineer, IAM, Infrastructure, SRE, DevOps, Cloud, Python, Equity, Team Collaboration, System Reliability

Salary: $255K – $325K

Type: Full time

Benefits: Medical, dental, and vision insurance, mental health support, 401(k) plan with 50% matching, generous time off, paid pa…

Humane

Senior Site Reliability Engineer San Francisco

Skills & Focus: Infrastructure-as-Code, cloud technologies, software development, observability, security, CI/CD, Kubernetes, Terraform, Python, AWS

About the Company: Humane is a team of proven industry experts who have invented, built, and shipped category-defining hardware and software products to billions of people across…

Experience: 5+ years of Production Engineering, SRE, or similar experience

Salary: $180,000 - $230,000

Type: Full-time

Benefits: comprehensive healthcare insurance, disability insurance, life insurance, flexible spending accounts, and a 401K plan; …

Primer

Staff Site Reliability Engineer San Francisco

Skills & Focus: Site Reliability Engineer, Infrastructure, Fault-tolerant systems, Observability, Incident management, Automation, Monitoring, DevOps, Kubernetes, Microservices

About the Company: Primer exists to make the world a safer place. We do this by providing trusted decision-ready AI to the world's most critical organizations. Our software enabl…

Experience: 10+ years in production systems engineering, SRE, or DevOps roles

Salary: $180,000 to $230,000

Type: Full-time

Benefits: Full medical, dental, and vision coverage, fertility benefits, mental health coverage, 401(k), remote work stipends, mo…

Gusto

Storage Infrastructure Engineer San Francisco

Skills & Focus: storage infrastructure, MySQL, Postgres, data streaming, Kafka, cloud platforms, AWS, Terraform, resiliency, automation

About the Company: Gusto is a modern, online people platform that helps small businesses take care of their teams. On top of full-service payroll, Gusto offers health insurance, …

Experience: 4+ years of experience with software development and architecture; 2+ years of experience with database technologies like MySQL or Postgres; 2+ years of experience with data streaming technologies, particularly Kafka

Salary: $164,000-$237,000 in Denver & most remote locations, $235,000-$265,000 for San Francisco & New York

Type: Full-time

Benefits: Health insurance, 401(k), expert HR, Total Rewards philosophy

Anthropic

Staff Software Engineer, AI Reliability Engineering San Francisco

Skills & Focus: Software Engineering, Reliability Engineering, Service Level Objectives, Monitoring Systems, High-Availability Infrastructure, Incident Response, Cost Optimization, Distributed Systems, AI Infrastructure, Chaos Engineering

About the Company: Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a who…

Salary: $320,000 - $485,000 USD

Type: Full-time

Benefits: competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexibl…

Sesame

Backend Infrastructure Engineer San Francisco

Skills & Focus: backend, infrastructure, systems, reliability engineering, monitoring, deployments, Terraform, Kubernetes, automation, data engineering

About the Company: Sesame believes in a future where computers are lifelike - with the ability to see, hear, and collaborate with us in ways that feel natural and human. With thi…

Salary: $175K - $280K

Type: Full-time

Benefits: 401k matching, 100% employer-paid health, vision, and dental benefits, Unlimited PTO and sick time, Flexible spending a…

Focal Systems

Sr. DevOps/Site Reliability Engineer (SRE) San Francisco

Skills & Focus: DevOps, Site Reliability Engineer, GCP, Kubernetes, CI/CD, Infrastructure Automation, Cloud Services, Docker, Monitoring/Alerting, Python

About the Company: Focal Systems is the industry leader in retail AI solutions. We are a Silicon Valley based startup that has more than doubled in size every year since inceptio…

Experience: Solid experience in an infrastructure or Site Reliability Engineer (SRE) role

Salary: $170-190k + stock

Type: Full-time

Benefits: Competitive Salary & Attractive Stock, Paid Time Off, Quarterly Team Retreats, Education grants

OpenAI

Site Reliability Engineer, Public Sector San Francisco

Skills & Focus: Site Reliability Engineer, Infrastructure, Systems, Cloud, Public Sector, Kubernetes, Docker, Security Clearance, Automation, Troubleshooting

About the Company: OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the bounda…

Experience: 5+ years

Salary: $279K – $385K

Type: Full-time

Benefits: Medical, dental, and vision insurance, mental health and wellness support, 401(k) plan with 50% matching, generous time…

Twitter

Site Reliability Engineering Team Lead San Francisco

Skills & Focus: site reliability engineering, team leadership, engineering collaboration, technical design, reliability practices, coaching, team empowerment, personal development, cross-team communication, system scalability

About the Company: Twitter is a social media platform that allows users to post and interact with messages known as tweets.

Experience: 5+ years in a leadership role within engineering

Type: Full-time

Crusoe

Site Reliability Engineer II - Observability San Francisco

Skills & Focus: observability, infrastructure, monitoring, analytics, telemetry, collaboration, automation, logging, Kubernetes, Python

About the Company: Crusoe is building the World’s Favorite AI-first Cloud infrastructure company, focusing on purpose-built AI infrastructure solutions powered by clean, renewabl…

Experience: 5+ years of professional SRE experience

Salary: $135,000 - $158,000

Type: Full-time

Benefits: Industry competitive pay, Restricted Stock Units, health insurance options, paid parental leave, 401(k) with match, gen…

Peregrine Technologies

Senior Infrastructure Engineer San Francisco

Skills & Focus: Infrastructure Engineer, AWS GovCloud, Kubernetes, Docker, Terraform, Platform Engineering, DevOps, Site Reliability Engineering, Data Security, CI/CD

About the Company: Peregrine supports public safety agencies across the country, empowering public servants to improve operations and make better decisions. They partner with cus…

Experience: 8+ years of experience building and maintaining complex infrastructure for web applications with strict uptime requirements

Salary: $170,000 - $250,000 Annually + Benefits + Equity + Bonus

Type: Full-time

Benefits: Benefits offered include health insurance, equity, and bonuses.

Mercury

Infrastructure Engineer San Francisco

Skills & Focus: AWS, Terraform, Nix/NixOS, Monitoring, Observability, Prometheus, Grafana, OpenTelemetry, CI/CD, ECS

About the Company: Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A., and Evolve Bank & Trust; Members FDIC.

Experience: 3+ years

Salary: $173,600 - $204,200 (US employees); $158,000 - $185,800 (Canadian employees)

Type: Full-time

Benefits: Highly competitive salary and equity ranges within the SaaS and fintech industry; commitment to diversity and equal opp…

Braze

Site Reliability Engineer San Francisco

Skills & Focus: Site Reliability Engineer, infrastructure, automation, Kubernetes, Docker, Terraform, DevOps, monitoring, scalability, Linux

About the Company: Braze is the leading customer engagement platform that empowers brands to Be Absolutely Engaging.™ Braze allows any marketer to collect and take action on any …

Experience: 3+ years of experience as a Software, DevOps, or Site Reliability Engineer

Salary: $120,960 - $194,400/year, with an expected On Target Earnings (OTE) between $134,400 and $216,000/year

Type: Full-time

Benefits: Competitive compensation that may include equity, retirement and employee stock purchase plans, flexible paid time off,…

Hive

DevOps and Systems Engineer San Francisco

Skills & Focus: cloud-based AI solutions, machine learning, DevOps, Site Reliability, automation, enterprise SaaS, distributed computing, high performance computing, hybrid infrastructure, GPU integration

About the Company: Hive is the leading provider of cloud-based AI solutions to understand, search, and generate content, and is trusted by hundreds of the world's largest and mos…

Type: Full-time

Sigma Computing

Senior Software Engineer - Observability and Reliability San Francisco

Skills & Focus: observability, distributed tracing, application performance management, cloud security, GCP, AWS, Azure, data analytics, Kubernetes, best practices

About the Company: Sigma is the only cloud analytics and business intelligence tool empowering business teams to break free from the confines of the dashboard, explore data for t…

Experience: 5+ years industry experience building and maintaining high-quality software

Salary: $150k - $220k annually

Type: Full-time

Benefits: Equity, generous health benefits, flexible time off policy, paid bonding time for all new parents, traditional and Roth…

Salesforce

Principal Software Engineering - Availability San Francisco

Skills & Focus: Service Reliability Engineering, software development, system design, automation, incident response, cloud computing, Kubernetes, monitoring, Agile methodology, service ownership

Experience: 15+ years of software development and engineering experience

Salary: $230,800 - $384,100 (California); $211,500 - $351,800 (Washington)

Type: Full-time

Crusoe Energy Systems

Site Reliability Engineer (SRE) - Observability San Francisco

Skills & Focus: Observability, Site Reliability Engineering, Infrastructure, Telemetry, Monitoring, Analytics, Collaboration, Automation, CI/CD, Security

About the Company: Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solution…

Experience: 7+ years of professional SRE experience.

Salary: $155,000 - $183,000

Type: Full-time

Benefits: Hybrid work schedule, industry competitive pay, restricted stock units, health insurance options, HSA contributions, pa…

Writer

Site Reliability Engineer (SRE) San Francisco

Skills & Focus: Site Reliability Engineering, cloud infrastructure, Terraform, Python, AWS, GCP, Docker, Kubernetes, monitoring tools, system optimization

About the Company: Writer is the full-stack generative AI platform delivering transformative ROI for the world’s leading enterprises. Named one of the top 50 companies in AI by F…

Experience: Minimum of 7 years of hands-on experience in Site Reliability Engineering

Type: Full-time

Benefits: Generous PTO, medical, dental, vision coverage, paid parental leave, fertility and family planning support, flexible sp…

6sense

Software Engineer - Data Infrastructure San Francisco

Skills & Focus: Data Infrastructure, Platform Engineering, Distributed Systems, Data Platform, Python, SQL, Scalability, Performance Tuning, Kubernetes, Terraform

About the Company: 6sense is on a mission to revolutionize how B2B organizations create revenue by predicting customers most likely to buy and recommending the best course of act…

Experience: Proven hands-on experience in distributed systems; strong coding skills in Python and SQL; solid foundation in computer science with competencies in data structures, algorithms, and software design.

Salary: $158,735.37 - $232,811.87

Type: Full-time

Benefits: Generous health insurance coverage, life and disability insurance, 401K employer matching, paid holidays, self-care day…

Amplitude

Senior Infrastructure Engineer San Francisco

Skills & Focus: infrastructure engineer, digital analytics, high-volume evaluation service, AWS cloud infrastructure, Java, TypeScript, Python, automated testing, AB testing, cross-functional collaboration

About the Company: Amplitude is the leading digital analytics platform that helps companies unlock the power of their products. Over 3,800 customers, including Atlassian, NBCUniv…

Experience: 5+ years of experience building robust, scalable software systems which consistently meet SLA requirements.

Salary: $170,000 - $256,000 total target cash (inclusive of bonus or commission)

Type: Full-time

Benefits: Excellent Medical, Dental and Vision insurance coverages; Flexible time off; Generous stipends for wellness, learnin…

Arkose Labs

Senior Director of Engineering San Mateo

Skills & Focus: Platform Engineering, Infrastructure, Site Reliability, Cloud Infrastructure, Incident Response, AWS, Azure, Distributed Systems, CI/CD, Infrastructure-as-Code

About the Company: Arkose Labs protects enterprises from cybercrime and abuse, offering the world's first $1M warranties for credential stuffing and SMS toll fraud. They have a s…

Experience: 5+ years of leadership experience in Platform, Infrastructure, SRE, or related fields; 10+ years of experience in software engineering.

Salary: $270,000.00-$350,000.00

Type: Full-time

Benefits: Competitive salary + Equity; 401k plan; Robust benefits package (85% medical, dental, vision for employees; 75% for dep…

Xero

Site Reliability Engineer San Mateo

Skills & Focus: Site Reliability Engineering, SRE, software engineering, systems engineering, product reliability, observability, performance, failure tolerance, engineering support, data connectivity

About the Company: Xero helps businesses by automating routine tasks, surfacing actionable insights, and connecting them with the right data, advisors, and apps. They aim to make…

Type: Full-time

Team Lead, Product SRE San Mateo

Skills & Focus: Product SRE, reliability, Observability, high performing services, SRE culture, engineering, change management, team building, communication, strategy

About the Company: At Xero, we help you supercharge your business by automating routine tasks, surfacing actionable insights and connecting businesses with data, advisors, and ap…

Experience: Strong Engineering background, deep experience in SRE

Type: Full-time

Team Lead of Product SRE San Mateo

Skills & Focus: Product SRE, SRE engineers, reliability, Observability, high performing services, Engineering, high performing teams, Product SRE strategy, transformation, expert communicator

About the Company: Xero helps businesses by automating routine tasks and connecting them with the right data, advisors, and apps, ultimately contributing to a stronger economy.

Experience: Strong Engineering background, deep experience in SRE

Palo Alto Networks

Manager, Site Reliability Engineering (Cortex, Tools and Platforms) Santa Clara

Skills & Focus: DevOps, Site Reliability Engineering, Cortex, Security, Engineering Management, Cloud, Platforms, Production Operations, AI, Software Development

About the Company: Palo Alto Networks is a cybersecurity company that offers advanced firewalls and cloud-based security services to secure the digital transformation.

Type: Full-time

Carta

Senior Site Reliability Engineer, Cloud Networking Santa Clara

Skills & Focus: Kubernetes, EKS, Docker, AWS, Cloud Infrastructure, Terraform, Networking, CI/CD, API Services, Infrastructure as Code

About the Company: Carta develops purpose-built software that transforms traditional accounting into a powerful growth engine. Carta’s world-class fund administration platform su…

Salary: $181,688 - $213,750 in Seattle, WA; $191,250 - $225,000 in San Francisco, CA or Santa Clara, CA

Benefits: Equity for all full-time roles, exceptional benefits, and commission plans for applicable roles.

Palo Alto Networks

Sr Site Reliability Engineer (SASE) Santa Clara

Skills & Focus: Site Reliability Engineering, DevOps, Infrastructure as Code, CI/CD, Cloud Security, Kubernetes, Automation, Agile, Monitoring, Scalability

About the Company: At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. We are a comp…

Experience: 4+ years of total experience with Unix/Linux; 2+ years of working with microservice architectures

Type: Full-time

Benefits: FLEXBenefits wellbeing spending account, mental and financial health resources, personalized learning opportunities.

ServiceNow

Manager of SRE team Santa Clara

Skills & Focus: SRE, team management, career development, automation, reliability, crisis management, continuous improvement, training, onboarding, support

Palo Alto Networks

Senior Staff DevOps Engineer Santa Clara

Skills & Focus: DevOps, SRE, Cloud infrastructure, Automation, Kubernetes, Terraform, GitOps, Performance, Security, Reliability

About the Company: Palo Alto Networks runs a large infrastructure and is one of the largest GCP customers.

Sr Site Reliability Engineer (App Service Team) Santa Clara

Skills & Focus: Site Reliability Engineering, DevOps, cloud-native applications, AWS, GCP, Terraform, Kubernetes, automation, programming languages, CI/CD

About the Company: Palo Alto Networks is a cybersecurity company that aims to redefine protection and security in the digital age. Their mission is to be the cybersecurity partne…

Experience: 4+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering; 2+ years building high availability, scalable cloud-native applications on AWS and GCP

Type: Full-time

Benefits: FLEXBenefits wellbeing spending account, mental and financial health resources, personalized learning opportunities

NetApp, Inc.

Cloud Infrastructure Site Reliability Engineer Sunnyvale

Skills & Focus: Cloud, Scripting, Automation, Containers, Kubernetes, DevOps, SRE, AWS, Azure, Google Cloud

Experience: 8+ years experience

Salary: $152,150 - $196,900

Benefits: Health Insurance, Life Insurance, Retirement or Pension Plans, Paid Time Off (PTO), various Leave options, Performance-…

SpaceX

Sr. Software Infrastructure Engineer (Starlink) Sunnyvale

Skills & Focus: Linux, Terraform, Ansible, Kubernetes, Docker, Python, DevOps, Site Reliability, Automation, Infrastructure

About the Company: SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today Spa…

Experience: 5+ years of professional experience in systems administration, site reliability engineering, or DevOps; OR 7+ years of professional experience in systems administration, site reliability engineering, or DevOps in lieu of a degree

Salary: $170,000.00 - $230,000.00 per year

Type: Full-time

Benefits: comprehensive medical, vision, and dental coverage; 401(k)-retirement plan; short & long-term disability insurance; lif…
