44 Site Reliability Engineer jobs in San Jose.

Hiring now: Site Reliability Engr @ Replit, Sr Dir Of Site Reliabilit @ Visa, Platformsite Reliability @ Zoox, Software Dev Mgr II Site @ Google, Infrastructure Engr @ Neuralink, Sr Software Engr Reliabil @ Robinhood, Site Reliability Engr @ Character.AI, Production Engr @ Meta, Site Reliability Engr SRE @ Coupang, Staff Cloud Devopssite Re @ Inworld AI. Explore more at jobswithgpt.com.

🔥 Skills

Kubernetes (18) DevOps (13) Site Reliability Engineering (12) SRE (12) Terraform (11) automation (10) CI/CD (9) scalability (8) monitoring (7) AWS (6)

📍 Locations

Palo Alto (9) Santa Clara (6) Mountain View (5) Foster City (4) Menlo Park (4) San Jose (4) San Mateo (4) Sunnyvale (4) Fremont (3) San Bruno (1)

Replit

Skills & Focus: Site Reliability Engineering, SRE, Infrastructure Automation, Monitoring Solutions, Infrastructure as Code, CI/CD Pipelines, Incident Management, Performance Optimization, Distributed Systems, Cloud-native Technologies
About the Company: Replit is the fastest way to turn ideas into software. With our powerful AI-powered Agent and Assistant, anyone can create and launch apps from natural languag…
Experience: 3+ years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering)
Type: Full-time
Benefits: Flexible Work Hours, Competitive Salary & Equity, Home Office Set-Up Stipend, Health, Dental, Vision and Life Insurance…

Visa

Skills & Focus: Site Reliability Engineering, automation, performance, reliability, DevSecOps, monitoring, scalability, incident management, Service Level Objectives, CI/CD
Type: Hybrid

Zoox

Skills & Focus: site reliability engineer, uptime, autonomous vehicles, fault-tolerant systems, deployment, operation, data-processing pipelines, compute-intensive tasks, CPUs, GPUs
About the Company: Zoox is a robotics company focused on developing autonomous vehicles with an ethos of automation throughout the infrastructure components they build.
Skills & Focus: IT Technical Operations, real-time command center, monitoring services, Site Reliability Engineering (SRE), Technical Operations Engineering, stability, live robot missions, strategic initiatives, innovative solutions, reliability and performance

Google

Skills & Focus: Site Reliability Engineering, software development, automation, distributed systems, team management, project leadership, scalability, performance, problem solving, cloud infrastructure
Experience: 8 years of experience with data structures or algorithms, 5 years of experience with software development, 3 years of experience managing people or teams

Neuralink

Skills & Focus: software engineering, networking protocols, Linux systems, cloud infrastructure, system administration, DevOps, automating processes, cryptographic protocols, production environments, Brain-Computer Interface (BCI)
About the Company: We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…
Experience: Robust software engineering skills, experience in Linux systems, cloud/on-prem infrastructure.
Salary: $116,000 - $235,000 USD
Type: Full-time
Benefits: Medical, dental, and vision insurance, paid holidays, commuter benefits, meals provided, equity + 401(k) plan, parental…
Skills & Focus: software engineering, cloud architecture, infrastructure, networking protocols, Linux systems, hybrid cloud, security fundamentals, IAC tools, cryptographic protocols, systems administration
About the Company: We are creating devices that enable a bi-directional interface with the brain. These devices allow us to restore movement to the paralyzed, restore sight to th…
Experience: Experience building hybrid cloud/on-prem infrastructure, software engineering skills, and system administration experience.
Salary: $35/Hr USD
Type: Full-time
Benefits: An opportunity to change the world, growth potential, excellent medical/dental/vision insurance, paid holidays, commute…

Robinhood Markets

Skills & Focus: Reliability, Software Engineering, Large-scale systems, Distributed systems, Production issues, Monitoring, Best practices, Operational excellence, Collaboration, Infrastructure
About the Company: Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…
Experience: 5+ years experience in designing, building, and maintaining large-scale, distributed systems
Salary: $187,000 - $220,000 USD
Type: Full-time
Benefits: 100% paid health insurance for employees with 90% coverage for dependents, Annual lifestyle wallet for personal wellnes…

Character.AI

Skills & Focus: DevOps, SRE, Python, Golang, SQL, Linux, CI/CD, Kubernetes, Terraform, GCP
About the Company: Character.AI empowers people to connect, learn and tell stories through interactive entertainment. Over 20 million people visit Character.AI every month, using…
Experience: 5+ years

Robinhood Markets

Skills & Focus: reliability, scalability, performance, security, software engineering, distributed systems, incident metrics, operational excellence, mentoring, infrastructure
About the Company: Robinhood Markets was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood…
Experience: 8+ years experience in designing, building, and maintaining large-scale, distributed systems
Salary: $217,000 - $255,000 USD (Zone 1); $190,000 - $224,000 USD (Zone 2); $169,000 - $199,000 USD (Zone 3)
Type: Full-time
Benefits: 100% paid health insurance for employees with 90% coverage for dependents; Annual lifestyle wallet for personal wellnes…

Meta

Skills & Focus: Production Engineering, DevOps Engineer, Site Reliability Engineer, UNIX, TCP/IP, Python, Kubernetes, Terraform, MySQL, Infrastructure Management
About the Company: Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Ap…
Experience: 2+ years of experience in UNIX and TCP/IP network fundamentals, 2+ years of coding experience
Salary: $117,000/year to $173,000/year + bonus + equity + benefits
Type: Full-time
Benefits: Meta offers various benefits including bonuses and equity options.

Coupang

Skills & Focus: Site Reliability Engineering, Automation, Infrastructure Automation, Cloud-based Infrastructure, DevOps, CI/CD, Kubernetes, Observability, Large-Scale Systems, E-commerce
About the Company: Coupang is a large-scale e-commerce company, operating complex systems to deliver mission-critical services.
Experience: 10+ years of industry experience building and operating large-scale distributed systems.
Type: Full-time
Skills & Focus: observability solutions, monitoring, alerting, logging, tracing, Kubernetes, DevOps, SRE practices, cloud-based infrastructure, performance indicators
About the Company: Coupang is a leading force in South Korean commerce, known for its exceptional customer service and innovative approach to retail and e-commerce. The company b…
Experience: Strong experience in implementing and managing observability solutions in large-scale, complex environments.
Salary: $159,000 - $324,000/year
Type: Full-time
Benefits: Medical/Dental/Vision/Life insurance, Flexible Spending Accounts, Long-term/Short-term Disability, Employee Assistance …

Inworld AI

Skills & Focus: DevOps, Site Reliability Engineering, Terraform, Kubernetes, AWS, Azure, GCP, CI/CD, Microservices, Infrastructure-as-Code
About the Company: Inworld is the leading provider of AI technology for real-time interactive experiences, with a $500 million valuation and backing from top tier investors inclu…
Experience: 7+ years
Salary: $180,000 - $280,000
Type: Full-time
Benefits: Total compensation includes equity and benefits.

NewsBreak

Skills & Focus: AWS, Kubernetes (EKS), EMR (Elastic MapReduce), service reliability, fault-tolerant architectures, Infrastructure-as-Code (IaC), CI/CD pipelines, monitoring tools (Prometheus, Grafana), high-availability strategies, incident response
About the Company: NewsBreak is redefining the way users interact with local news and their communities. By bridging local users, local content creators, and local businesses, ou…
Experience: 2+ years in SRE, DevOps, or Infrastructure Engineering roles
Salary: $130,000 – $260,000 USD
Type: Full-time
Benefits: Discretionary bonus and options may also be available; overall rewards package designed to attract top talents.

Google

Skills & Focus: Site Reliability Engineering, software development, large-scale systems, automation, coding, algorithms, problem solving, mentorship, collaboration, performance optimization
About the Company: Google is a global company focused on technology and innovation, committed to creating a culture of belonging and supporting a diverse workforce.
Experience: 2 years of experience with data structures/algorithms and software development in one or more programming languages.
Salary: $141,000-$202,000 + bonus + equity + benefits
Type: Full-time
Benefits: Comprehensive benefits including bonuses and equity

Luma AI

Skills & Focus: SRE, Infrastructure, GPU clusters, H100 GPUs, Training, Data Processing, Monitoring, Management tools, Performance, Maintenance
Skills & Focus: SRE, GPU, infrastructure, monitoring, cloud providers, automation, scalability, containerization, observability, problem-solving
Experience: 5+ years
Type: Full-time

Hippocratic AI

Skills & Focus: infrastructure automation, Kubernetes, DevOps, monitoring, scalability, cloud platforms, security compliance, deployment pipelines, disaster recovery, mentorship
About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare, aiming to improve accessibility and outcomes by applying deep healthca…
Experience: At least 5 years of professional experience in DevOps engineering or a related field
Type: Full-time

Luma AI

Skills & Focus: Site Reliability Engineer, SRE, Infrastructure, GPU clusters, H100 GPUs, Monitoring tools, Management tools, Performance problems, Maintenance problems, Data Processing
Skills & Focus: SRE, GPU infrastructure, monitoring systems, automation tools, scalability, cloud providers, containerization, IaC tools, service level objectives, problem-solving
Experience: 10+ years
Salary: $200,000 - $250,000
Type: Full-time
Benefits: competitive equity packages in the form of stock options and a comprehensive benefits plan

Hippocratic AI

Skills & Focus: infrastructure automation, deployment pipelines, monitoring, scalable systems, cloud platforms, Kubernetes, Terraform, Ansible, Jenkins, security compliance
About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…
Experience: At least 5 years of professional experience in DevOps engineering or a related field
Type: Full-time
Skills & Focus: ML Infrastructure, Kubernetes, Terraform, multi-cloud environments, orchestration platform, cloud platforms, resource optimization, automation, system health monitoring, capacity planning
About the Company: Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthca…
Experience: 3-5 years
Type: Full-time

Mistral AI

Skills & Focus: Site Reliability Engineer, DevOps, cloud computing, distributed systems, Kubernetes, Terraform, CI/CD, monitoring tools, incident response, scripting languages
About the Company: A tight-knit, nimble team dedicated to bringing cutting-edge AI technology to the world, making AI ubiquitous and open.
Experience: 5+ years of experience in a DevOps/SRE role
Salary: Competitive salary and bonus structure
Type: Full-time
Benefits: Comprehensive benefits package; Opportunities for professional growth and development

Glean

Skills & Focus: SRE, cloud infrastructure, automation, monitoring, incident management, performance optimization, scalability, security compliance, software development, cloud platforms
About the Company: We’re on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work.
Experience: 8+ years of experience in a senior-level role within Site Reliability Engineering or similar role
Salary: $155,000 - $250,000 annually
Type: Full-time
Benefits: Competitive compensation, Medical, Vision and Dental coverage, Flexible work environment and time-off policy, 401k, Com…

Google

Skills & Focus: Site Reliability Engineering, software development, automation, distributed systems, team management, project leadership, scalability, performance, problem solving, cloud infrastructure
Experience: 8 years of experience with data structures or algorithms, 5 years of experience with software development, 3 years of experience managing people or teams

Leading Destination For Short-Form Mobile Video

Skills & Focus: Site Reliability Engineering, service lifecycle, cloud-managed infrastructure, Kubernetes, Redis, MySQL, Flink, automate scaling systems, distributed systems, problem solving
About the Company: It is the largest Unicorn startup and the leader in short-form video hosting service with approximately 1.5 billion monthly active users worldwide.
Experience: 5+ years
Type: Full-time
Benefits: 100% premium coverage for employee medical insurance, approximately 75% premium coverage for dependents, dental, vision…

NetApp

Skills & Focus: Cloud, Software Engineering, SRE, Incident Management, Observability, Application Security, Python, Golang, DevSecOps, Virtualization
About the Company: NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. No matter the data type, workload or …

Arkose Labs

Skills & Focus: Platform Engineering, Infrastructure, Site Reliability, Cloud Infrastructure, Incident Response, AWS, Azure, Distributed Systems, CI/CD, Infrastructure-as-Code
About the Company: Arkose Labs protects enterprises from cybercrime and abuse, offering the world's first $1M warranties for credential stuffing and SMS toll fraud. They have a s…
Experience: 5+ years of leadership experience in Platform, Infrastructure, SRE, or related fields; 10+ years of experience in software engineering.
Salary: $270,000.00-$350,000.00
Type: Full-time
Benefits: Competitive salary + Equity; 401k plan; Robust benefits package (85% medical, dental, vision for employees; 75% for dep…

Xero

Skills & Focus: Site Reliability Engineering, SRE, software engineering, systems engineering, product reliability, observability, performance, failure tolerance, engineering support, data connectivity
About the Company: Xero helps businesses by automating routine tasks, surfacing actionable insights, and connecting them with the right data, advisors, and apps. They aim to make…
Type: Full-time
Skills & Focus: Product SRE, reliability, Observability, high performing services, SRE culture, engineering, change management, team building, communication, strategy
About the Company: At Xero, we help you supercharge your business by automating routine tasks, surfacing actionable insights and connecting businesses with data, advisors, and ap…
Experience: Strong Engineering background, deep experience in SRE
Type: Full-time
Skills & Focus: Product SRE, SRE engineers, reliability, Observability, high performing services, Engineering, high performing teams, Product SRE strategy, transformation, expert communicator
About the Company: Xero helps businesses by automating routine tasks and connecting them with the right data, advisors, and apps, ultimately contributing to a stronger economy.
Experience: Strong Engineering background, deep experience in SRE

Palo Alto Networks

Skills & Focus: DevOps, Site Reliability Engineering, Cortex, Security, Engineering Management, Cloud, Platforms, Production Operations, AI, Software Development
About the Company: Palo Alto Networks is a cybersecurity company that offers advanced firewalls and cloud-based security services to secure the digital transformation.
Type: Full-time

Carta

Skills & Focus: Kubernetes, EKS, Docker, AWS, Cloud Infrastructure, Terraform, Networking, CI/CD, API Services, Infrastructure as Code
About the Company: Carta develops purpose-built software that transforms traditional accounting into a powerful growth engine. Carta’s world-class fund administration platform su…
Salary: $181,688 - $213,750 in Seattle, WA; $191,250 - $225,000 in San Francisco, CA or Santa Clara, CA
Benefits: equity for all full-time roles, exceptional benefits, and commissions plans for applicable roles.

Palo Alto Networks

Skills & Focus: Site Reliability Engineering, DevOps, Infrastructure as Code, CI/CD, Cloud Security, Kubernetes, Automation, Agile, Monitoring, Scalability
About the Company: At Palo Alto Networks everything starts and ends with our mission: Being the cybersecurity partner of choice, protecting our digital way of life. We are a comp…
Experience: 4+ years of total experience with Unix/Linux; 2+ years of working with microservice architectures
Type: Full-time
Benefits: FLEXBenefits wellbeing spending account, mental and financial health resources, personalized learning opportunities.

ServiceNow

Skills & Focus: SRE, team management, career development, automation, reliability, crisis management, continuous improvement, training, onboarding, support

Palo Alto Networks

Skills & Focus: DevOps, SRE, Cloud infrastructure, Automation, Kubernetes, Terraform, GitOps, Performance, Security, Reliability
About the Company: Palo Alto Networks runs a large infrastructure and is one of the largest GCP customers.
Skills & Focus: Site Reliability Engineering, DevOps, cloud-native applications, AWS, GCP, Terraform, Kubernetes, automation, programming languages, CI/CD
About the Company: Palo Alto Networks is a cybersecurity company that aims to redefine protection and security in the digital age. Their mission is to be the cybersecurity partne…
Experience: 4+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering; 2+ years building high availability, scalable cloud-native applications on AWS and GCP
Type: Full-time
Benefits: FLEXBenefits wellbeing spending account, mental and financial health resources, personalized learning opportunities

Google

Skills & Focus: software development, systems engineering, large-scale systems, fault-tolerant systems, automation, debugging, coding best practices, design reviews, collaboration, problem solving
About the Company: Google is a global technology company known for its search engine, software products, and cloud services.
Experience: Experience with data structures/algorithms and software development in one or more programming languages.
Salary: $118,000-$170,000
Type: Full-time
Benefits: Bonus, equity, and other benefits.

NetApp, Inc.

Skills & Focus: Cloud, Scripting, Automation, Containers, Kubernetes, DevOps, SRE, AWS, Azure, Google Cloud
Experience: 8+ years experience
Salary: $152,150 - $196,900 USD
Benefits: Health Insurance, Life Insurance, Retirement or Pension Plans, Paid Time Off (PTO), various Leave options, Performance-…

SpaceX

Skills & Focus: Linux, Terraform, Ansible, Kubernetes, Docker, Python, DevOps, Site Reliability, Automation, Infrastructure
About the Company: SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today Spa…
Experience: 5+ years of professional experience in systems administration, site reliability engineering, or DevOps; OR 7+ years of professional experience in systems administration, site reliability engineering, or DevOps in lieu of a degree
Salary: $170,000.00 - $230,000.00 per year
Type: Full-time
Benefits: comprehensive medical, vision, and dental coverage; 401(k)-retirement plan; short & long-term disability insurance; lif…

Meta

Skills & Focus: Production Engineering, DevOps Engineer, Site Reliability Engineer, UNIX, TCP/IP, Python, Kubernetes, Terraform, MySQL, Infrastructure Management
About the Company: Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Ap…
Experience: 2+ years of experience in UNIX and TCP/IP network fundamentals, 2+ years of coding experience
Salary: $117,000/year to $173,000/year + bonus + equity + benefits
Type: Full-time
Benefits: Meta offers various benefits including bonuses and equity options.