
Senior Incident Manager (Remote - US)

Jobgether
Full-time
On-site
remote

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Incident Manager in the United States.

This role offers a critical leadership opportunity in managing high-impact incidents for cloud-based services. You will coordinate cross-functional teams during major incidents, ensuring swift resolution while maintaining clear, accurate, and timely communication with stakeholders and customers. The position combines operational leadership, technical expertise, and strong communication skills to drive reliability, root cause analysis, and continuous improvement. You will mentor peers, improve incident response processes, and influence how complex distributed systems are monitored and maintained. This role is ideal for someone passionate about operational excellence, proactive problem solving, and driving confidence in technical systems during high-pressure events.

Accountabilities

  • Lead critical production incidents, coordinating multi-disciplinary response teams to mitigate impact and restore operations rapidly.
  • Drive root cause analysis and collaborate with engineering teams to implement long-term reliability improvements.
  • Summarize key learnings from incidents, communicate actionable items, and ensure follow-through of technical and procedural improvements.
  • Own incident communications, providing timely, accurate updates to internal stakeholders and empathetic notifications to customers.
  • Mentor and train colleagues in incident management, communication best practices, and technical response strategies to elevate overall team performance.
  • Continuously refine incident response processes, playbooks, and automation to improve efficiency and reduce downtime.

Requirements

  • 5+ years of experience in incident management, site reliability engineering, or production operations for large-scale, cloud-native systems.
  • Proven ability to lead high-severity incidents, identify impacts, isolate fault domains, and coordinate multi-team responses.
  • Strong knowledge of cloud infrastructure (AWS, Azure, or GCP) including compute, networking, storage, and observability.
  • Hands-on experience with log analysis, debugging, and observability systems (Datadog, Elasticsearch, Splunk, Prometheus, Grafana, OpenTelemetry, etc.).
  • Proficiency in at least one programming or scripting language (Python, Go, Bash) for diagnostics and automation.
  • Experience creating and maintaining incident playbooks and communication templates for consistent, high-quality updates.
  • Exceptional communication and writing skills to summarize complex technical situations for both technical and business audiences.
  • BS, Master’s, or advanced degree in Computer Science, Computer Engineering, or related technical field.

Benefits

  • Competitive base salary: $143,300 – $200,600 USD, with potential for performance bonus and equity.
  • Comprehensive health benefits including medical, dental, vision, and life insurance.
  • Paid parental leave, flexible PTO, and additional company holidays.
  • Opportunities for professional growth and development.
  • Remote work options with occasional on-site collaboration if needed.
  • Supportive, inclusive, and collaborative team environment focused on operational excellence.

Jobgether is a Talent Matching Platform that partners with companies worldwide to efficiently connect top talent with the right opportunities through AI-driven job matching.

When you apply, your profile goes through our AI-powered screening process, designed to identify top talent efficiently and fairly.
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience, and achievements.
📊 It compares your profile to the job’s core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the three candidates who best match the role.
🧠 When necessary, our human team may conduct an additional manual review to ensure no strong profile is overlooked.

The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role. Once the shortlist is complete, we share it directly with the company that owns the job opening. The final decision and next steps (such as interviews or assessments) are made by their internal hiring team.

Thank you for your interest!

 

