Reward Gateway

Application Operations Problem Manager

Reward Gateway  •  €28k - €33k/yr  •  Plovdiv, BG (Hybrid)  •  2 days ago

Job Description

Department: Engineering

Employment Type: Full Time

Location: Plovdiv

Reporting To: Director of Application Operations

Compensation: €28,000 - €33,000 / year

Reward Gateway, part of Edenred, is a global leader in benefits and employee engagement. We help businesses attract, engage, and retain top talent through strategic reward, recognition, and well-being solutions.

Guided by our shared missions - ‘Making the World a Better Place to Work’ and ‘Enriching Connections, For Good’ - we’re committed to transforming workplaces and improving people’s daily lives.

Our team embodies entrepreneurial spirit, innovation, and respect. We push boundaries, speak up, and stay human, fostering a culture where imagination thrives.

Your Role in our Mission:
As Problem Manager, you will own and drive the Problem Management function within the Platform Engineering & Technical Operations (PETO) organisation, reporting directly to the Director of Application Operations. You will play a critical role in reducing the frequency and impact of incidents across our platforms by identifying root causes, managing known errors, and delivering preventative actions that lead to measurable, systemic improvements.

This is a hands-on, high-impact role. You will work at the intersection of operational excellence and engineering quality, partnering closely with L2/L3 support teams, Platform, Infrastructure, and product-aligned Engineering squads to ensure that problems are properly identified, investigated, and resolved and, most importantly, do not recur.

What’s In It For Me?
A chance to be part of a well-established, stable, high-growth ‘Unicorn’ SaaS company with an employee benefits package that includes:
  • Annual Wellness Bonus
  • Monthly Edenred Electronic Food Voucher
  • Udemy: Access for your professional development
  • Flexible Holiday plan & other leave benefits
  • Book Benefit: Professional development books and an additional annual budget for fiction books of your choice
  • Subsidised sports card and many other benefits!

Flexible Hybrid Working: This is a hybrid role that requires presence in the office at least twice per week, as agreed.

What You’ll be Doing:

Problem Management Process Ownership
  • Own the end-to-end Problem Management lifecycle in line with ITIL best practice: problem detection, logging, categorisation, prioritisation, investigation, resolution, and closure
  • Maintain and govern the Problem Record backlog in Jira Service Management, ensuring all records are accurate, prioritised, and progressing toward resolution
  • Define and enforce the standards for problem identification, including criteria for reactive problem management (post-incident) and proactive problem management (trend analysis and risk identification)
  • Manage the Known Error Database (KEDB), ensuring it is current, accurate, and actively used by L1/L2 support teams to improve first-contact resolution

Root Cause Analysis (RCA)
  • Lead and facilitate structured RCA sessions following major and recurring incidents, using recognised methodologies (e.g. 5 Whys, Fishbone/Ishikawa, fault tree analysis)
  • Produce high-quality Problem Records and RCA reports that clearly articulate the root cause, contributing factors, timeline, and recommended corrective/preventative actions
  • Ensure RCA outputs translate into tracked, accountable action plans with clear owners, timelines, and success criteria
  • Challenge superficial root cause findings and push for systemic, durable fixes rather than symptomatic workarounds

Proactive Problem Management
  • Analyse incident, change, and event data to proactively identify trends, recurring issues, and systemic risks before they become major incidents
  • Collaborate with Observability and Platform teams to use monitoring signals, error budgets, and SLO breach data as early-warning inputs to the problem management process
  • Contribute to the shift-left support agenda by feeding problem findings into runbooks, playbooks, and operability improvements

Stakeholder Engagement & Reporting
  • Communicate problem status, known errors, and risk exposure clearly to technical and non-technical stakeholders, including engineering leads and senior management
  • Produce regular problem management reporting, including metrics such as: number of open problems by age/severity, incident recurrence rate, time to root cause, and percentage of problems with preventative actions closed on time
  • Present insights and trends to the Director of Application Operations and wider PETO leadership to inform prioritisation decisions and continuous improvement initiatives

Collaboration & Integration
  • Work closely with Incident Management to ensure seamless handoff from major incidents into the problem management process
  • Partner with L2.5/L3 engineering teams to coordinate investigation effort, agree timelines, and remove blockers to root cause resolution
  • Integrate problem management activity into the Service Catalogue and Jira Service Management workflows, ensuring service ownership and escalation paths are respected
  • Contribute to Change Management processes by ensuring known problems and risks are visible to change approvers, reducing the risk of change-induced incidents

Continuous Improvement
  • Continuously assess and improve the Problem Management process itself, maturing capability over time and aligning with evolving ITIL and organisational standards
  • Build and maintain problem management documentation, templates, and guidance to enable consistent, high-quality practice across the PETO organisation
  • Support the development of L2 team capability in recognising and logging potential problems, contributing to the team's progression toward greater autonomy

Experience and Skills You Need in this Role:

Essential
  • Solid, demonstrable experience in an ITIL-aligned Problem Management role, ideally within a fast-paced, product-led technology organisation
  • Strong working knowledge of ITIL Problem Management practices (ITIL 4 Foundation certification or above preferred), including the distinction between reactive and proactive problem management and the role of the KEDB
  • Hands-on experience facilitating RCA sessions using structured methodologies (5 Whys, Fishbone, fault tree analysis, etc.) and translating findings into actionable improvement plans
  • Experience working with Jira Service Management or a comparable ITSM platform to manage problem records, workflows, and reporting
  • Ability to analyse incident and operational data to identify trends and systemic issues, with experience using dashboards or reporting tools to communicate findings
  • Strong written and verbal communication skills, with the ability to produce clear RCA reports and updates for both technical audiences and senior non-technical stakeholders
  • Collaborative working style with experience engaging engineering, infrastructure, and operations teams in problem investigation and resolution
  • Familiarity with Agile ways of working and the ability to integrate ITIL practices within a modern, product-centric engineering environment

Desirable
  • Experience with observability and monitoring tooling (e.g. Datadog, Grafana, PagerDuty) as inputs to proactive problem management
  • Understanding of SLOs, error budgets, and their relationship to operational risk and problem prioritisation
  • Experience contributing to or maintaining a knowledge base (e.g. Confluence), including runbooks and known error documentation
  • Exposure to cloud-native application architectures and API-first platforms
  • ITIL 4 Specialist or Practitioner certification in relevant practices (e.g. Problem Management, Incident Management)
  • Experience with operational metrics and reporting frameworks, including DORA metrics or similar

The Interview Process:

  • Screening call with Talent Acquisition Partner
  • First Stage Interview with the Director of Application Operations & the VP Platform Engineering

At Reward Gateway | Edenred, we are committed to ensuring an inclusive and accessible recruitment process for all candidates. If you have any specific requirements or need reasonable adjustments at any stage of the recruitment journey, please let your Talent Acquisition Partner know. Your needs are important to us, and we want to ensure an equitable experience for every candidate.

Be comfortable. Be you.
We want every employee to feel comfortable bringing their passion, creativity and individuality to work. We value all cultures, backgrounds and experiences, because we believe diversity drives innovation and makes us stronger. Our approach to hiring and building teams is about more than filling roles - it’s about creating an environment where everyone can thrive, feel supported, and contribute to our mission of making the world a better place to work!

About Reward Gateway

Since 2006, we’ve helped the most innovative companies and HR leaders transform the employee experience to attract and retain top talent through employee benefits, strategic reward and recognition, wellbeing and much more. Across the globe, over 750 of us work together to make the world a better place to work. As an ambitious, fast-growth HR Tech SaaS company, we’re flexible, inclusive and keen to meet talented individuals who are passionate about positively impacting the future of work. Clients include American Express, Unilever, Samsung, IBM and McDonald's. For further information, please visit: www.rewardgateway.com

Industry: IT & Software

Company Size: 501-1,000 employees

Headquarters: London, GB

Year Founded: 2006