Job Details
Revolutionizing protection.
Define what’s next in cybersecurity.
Principal DevOps Engineer
Our Mission
At Palo Alto Networks®, we’re united by a shared mission—to protect our digital way of life. We thrive at the intersection of innovation and impact, solving real-world problems with cutting-edge technology and bold thinking. Here, everyone has a voice, and every idea counts. If you’re ready to do the most meaningful work of your career alongside people who are just as passionate as you are, you’re in the right place.
Who We Are
In order to be the cybersecurity partner of choice, we must trailblaze the path and shape the future of our industry. This is something our employees work at each day and is defined by our values: Disruption, Collaboration, Execution, Integrity, and Inclusion. We weave AI into the fabric of everything we do and use it to augment the impact every individual can have. If you are passionate about solving real-world problems and ideating beside the best and the brightest, we invite you to join us!
We believe collaboration thrives in person. That’s why most of our teams work from the office full time, with flexibility when it’s needed. This model supports real-time problem-solving, stronger relationships, and the kind of precision that drives great outcomes.
Job Summary
We’re seeking an experienced, hands-on Cloud SRE engineer to lead high-severity incident and problem management across our AWS- or GCP-centric platforms. This role combines deep technical troubleshooting with process ownership, ensuring rapid recovery, root cause elimination, and long-term reliability improvements. You will own L2 on-call responsibilities, drive post-incident learning, and champion automation and operational excellence.
Key Responsibilities
- Implement and lead post-mortem processes within SLAs, identify root causes, and drive corrective actions to reduce repeat incidents.
- Rapidly diagnose and resolve failures across Kubernetes, Terraform, and AWS or GCP using advanced troubleshooting frameworks.
- Implement automation and enhanced monitoring to proactively detect issues and reduce incident frequency.
- Work with AWS or GCP TAMs and other vendors to request new features and follow up on updates.
- Coach and elevate SRE and DevOps teams, promoting best practices in reliability and incident/problem management.
- Establish and maintain a problem backlog, ensuring timely resolution and continuous process improvement.
- Envision how a modern SRE team should operate by leveraging AI/ML.
Qualifications
Required Qualifications
- 12+ years of experience in SRE/DevOps/Infrastructure roles, with a strong foundation in AWS or GCP cloud-based environments.
- 5+ years of proven experience managing SRE/DevOps teams, preferably with a strong focus on AWS or GCP.
- Deep hands-on knowledge of Terraform, Kubernetes (EKS/GKE), GitLab CI/CD, and modern observability practices (e.g., Prometheus, OpenTelemetry).
- Strong knowledge of data platforms such as RDS, Cassandra, Kafka, and MemSQL is mandatory.
- Strong experience in managing incident response and postmortems, reducing MTTR, and driving proactive reliability improvements.
- Proficiency with cloud platforms such as AWS or GCP.
- Solid grasp of Infrastructure as Code, container orchestration, and scalable cloud architectures.
- Track record of building tools for system reliability, automated remediation, and performance tuning.
- Expertise in SLI/SLO/SLA design and implementation, and driving operational maturity through data.
- Strong interpersonal and leadership skills, with a demonstrated ability to coach, mentor, and inspire teams.
- Effective communicator, capable of translating complex technical concepts to non-technical stakeholders.
- Committed to inclusion, collaboration, and creating a culture where every voice is heard and respected.
Preferred Qualifications
- Experience leveraging AI/ML-based operations tools for automation, anomaly detection, and predictive alerting.
Our Commitment
We’re trailblazers that dream big, take risks, and challenge cybersecurity’s status quo. It’s simple: we can’t accomplish our mission without diverse teams innovating, together.
We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at accommodations@paloaltonetworks.com.
Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.
All your information will be kept confidential according to EEO guidelines.
Is role eligible for Immigration Sponsorship? No. Please note that we will not sponsor applicants for work visas for this position.