Adyen

Incident Engineer

Department
Engineering
Job Type / Location
remote / onsite
Experience Required
7+ years

This is Adyen

Adyen provides payments, data, and financial products in a single solution for customers like Meta, Uber, H&M, and Microsoft, making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition.

For our teams, we create an environment with opportunities for our people to succeed, backed by the culture and support to ensure they are enabled to truly own their careers. We are motivated individuals who tackle unique technical challenges at scale and solve them as a team. Together, we deliver innovative and ethical solutions that help businesses achieve their ambitions faster.

Incident Engineer

This team sits within Global Platform Operations, under the Monitoring Engineering pillar. It combines unwavering attention to detail with a deep understanding of what platform-wide monitoring means for all merchants.

In this role, you will be on call to monitor platform performance, communicate with merchants, work on monitoring frameworks, and provide feedback to product engineering teams to improve the reliability of the platform. You will start and lead initiatives across our platform offerings that prioritize merchant impact, so that issues are detected proactively and merchants are informed quickly.

What you’ll do

  • Participate in 24/7 on-call monitoring: observe platform and merchant performance and proactively detect issues to mitigate risks in partnership with Engineering teams.
  • Communicate expertly with merchants in real time during an incident, presenting the most accurate and up-to-date information to keep them informed.
  • Work together with Operations, Product, Engineering, and Reliability teams to integrate, grow, and continuously improve our monitoring strategy and increase our reliability.
  • Improve operations by leading and project-managing initiatives and tools, including the development of automation for effective monitoring.
  • Investigate alerts and provide feedback to engineering teams to build effective logging and alerts across the platform architecture.
  • Mitigate merchant impact risk by acting on alerts in partnership with Engineering teams, and contribute to the monitoring playbook by documenting your learnings.
  • Focus on ruthlessly prioritizing, automating, and scaling every aspect of our detection capabilities.

Who you are

  • You have 5 to 10 years of experience in client communication during incidents and in platform monitoring operations.
  • You're willing to participate in the on-call rotation and work in a fast-paced, dynamic environment.
  • You have experience with monitoring and logging tools like Prometheus, Grafana, ELK Stack, etc.
  • You have experience with observability platforms such as Datadog, Dynatrace, and Splunk.
  • You have excellent analytical and problem-solving skills, with the ability to analyze complex systems and identify the root cause of issues.
