Full-time
5-10

Mercury L2/L3 Application Support Engineer

2/4/2026

The role involves owning incidents in ServiceNow through to restoration, triaging issues by impact and priority, and performing deep, engineering-level investigations for root cause analysis and permanent fixes.

Working Hours

40 hours/week

Company Size

10,001+ employees

Language

English

Visa Sponsorship

No

About The Company
For over two decades, we have been harnessing technology to drive meaningful change. By combining world-class engineering, industry expertise and a people-centric mindset, we consult and partner with our customers to create technological solutions that drive innovation and transform businesses. Working side by side with leading brands, we build strategies, products and solutions tailored to unique needs, regardless of industry, region or scale. From ideation to production, we support our customers through every step of their digital transformation journey, creating dynamic platforms and intelligent digital experiences across various industries.
About the Role

Company Description

Technology is our how. And people are our why.
 
By combining world-class engineering, industry expertise and a people-centric mindset, we consult and partner with leading brands from various industries to create dynamic platforms and intelligent digital experiences that drive innovation and transform businesses.
 
From prototype to real-world impact - be part of a global shift by doing work that matters.

Job Description

Key responsibilities (combined L2/L3)

Incident management and restoration (L2-led, L3 support)

  • Own incidents/requests assigned via ServiceNow.
  • Triage with structure: impact, scope, priority, user journey breakpoint.
  • Investigate using logs/metrics/traces and correlation identifiers.
  • Apply approved recovery actions (safe reprocess/retry where documented).
  • Identify ownership (Mercury vs dependency) and route correctly.
  • Provide clear updates: impact, actions, evidence-based ETAs.
  • Escalate with a complete diagnostic pack (timeline, evidence, repro, what’s ruled out).

Deep investigation, RCA, and permanent fix (L3-led)

  • Perform engineering-level analysis across code/config/data/environment/dependencies.
  • Reproduce issues using lower environments where feasible.
  • Produce RCAs with prevention actions (detection gaps, fixes, safeguards).
  • Drive permanent remediation: code/config/data fixes and controlled hotfixes.
  • Improve resilience: timeouts, retries/backoff, idempotency, DLQ handling, circuit-breaker-style protections.
  • Reduce operational toil through automation and better tooling.

Observability, change, and operational readiness (L2 + L3)

  • Operate and improve monitoring and alert playbooks across CloudWatch, Splunk, New Relic.
  • Improve alert quality (reduce noise, tune thresholds, add actionable signals).
  • Support releases/changes: risk review, rollback/validation readiness, early-life monitoring, post-release checks.
  • Understand backup/recovery and support restore verification (DynamoDB/Aurora via AWS managed backups).
  • Maintain runbooks/known errors/FAQs and create fast-diagnosis assets (queries, dashboards, checklists).
  • Coordinate with Product Owners, Tech Leads, and dependent teams during incidents.

Qualifications

Required technical skills (must have)

  • Proven L2/L3 production support experience.
  • Strong troubleshooting using logs/metrics/traces.
  • Strong ServiceNow incident ownership and discipline.
  • AWS operational knowledge: Lambda (timeouts/retries/concurrency), DynamoDB (throttling/latency/capacity), Aurora PostgreSQL (connections/slow queries/failover awareness).
  • Strong API troubleshooting: HTTP, REST/GraphQL basics, JSON, auth/token patterns.
  • Clear communication under pressure to business and technical audiences.

Preferred / nice to have

  • Node.js debugging and production engineering.
  • APM + distributed tracing experience.
  • Support for event/onsite constraints (device/browser variability, offline/edge).
  • Reliability automation/tooling experience.
  • Change governance and release engineering exposure.

Additional Information

At Endava, we’re committed to creating an open, inclusive, and respectful environment where everyone feels safe, valued, and empowered to be their best. We welcome applications from people of all backgrounds, experiences, and perspectives—because we know that inclusive teams help us deliver smarter, more innovative solutions for our customers. Hiring decisions are based on merit, skills, qualifications, and potential. If you need adjustments or support during the recruitment process, please let us know.

Key Skills
Incident Management, L2/L3 Support, ServiceNow, Log Analysis, Metrics Analysis, Trace Analysis, AWS, Lambda, DynamoDB, Aurora PostgreSQL, API Troubleshooting, HTTP, REST, GraphQL, Communication, RCA
Categories
Technology, Engineering, Software, Customer Service & Support