Site Reliability Engineer

Palo Alto, CA

Date Posted: 06-May-2026

Work Type: On-Site

Job Number: 484337

Job Description

Position: Site Reliability Engineer
Location: Palo Alto, CA
Duration: 9 Months

Top skills required for this role:

• Programming: Proficiency in languages like Python, Java, or Go.
• System Administration: Strong understanding of Linux/Unix systems.
• Cloud Infrastructure: Experience with AWS.
• Infrastructure as Code (IaC): Knowledge of tools like Terraform or Ansible.
• Monitoring Tools: Proficiency with tools such as Prometheus, Grafana, or Datadog.

Job Description / Responsibilities

• Automation and Tooling: Writing code to automate operational tasks such as provisioning, configuration changes, and system updates, reducing manual work and human error.
• System Monitoring and Alerting: Developing and maintaining observability stacks (logs, metrics, tracing) to proactively detect issues before they impact users.
• Incident Response and On-Call: Managing a 24/7 on-call rotation to respond to, troubleshoot, and resolve production incidents.
• Post-Incident Reviews (Postmortems): Conducting blameless, in-depth reviews of incidents to identify root causes and implement preventive measures.
• Capacity Planning: Analyzing system resource utilization to ensure infrastructure can scale to handle future load requirements.
• Performance Optimization: Identifying and fixing bottlenecks in software and infrastructure to improve system efficiency and responsiveness.
• Error Budget Management: Setting and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to determine whether a service is reliable enough to allow new feature deployments.
• Chaos Engineering: Testing system resilience by intentionally introducing failures to ensure systems are fault-tolerant.

Applicant Notices & Disclaimers

For information on benefits, equal opportunity employment, and location-specific applicant notices, click here

At SPECTRAFORCE, we are committed to maintaining a workplace that ensures fair compensation and wage transparency in adherence with all applicable state and local laws. This position's starting pay is: $59.00/hr.
