DevOps Engineer GroundOS (REF5181Z)
3/3/2026
The role involves managing the deployment, observability, and lifecycle of thousands of remote mini-PCs and associated Cloud components, including executing reliable Over-The-Air (OTA) updates across the massive edge fleet. Key tasks include configuring NATS JetStream, setting up OpenTelemetry for monitoring, architecting resilient systems against fleet reconnection events, and managing security aspects like secrets and mTLS communication.
Working Hours
40 hours/week
Company Size
5,001-10,000 employees
Language
English
Visa Sponsorship
No
Company Description
As Hungary’s most attractive employer in 2025 (according to Randstad’s representative survey), Deutsche Telekom IT Solutions is a subsidiary of the Deutsche Telekom Group. The company provides a wide portfolio of IT and telecommunications services with more than 5300 employees. We have hundreds of large corporate customers in Germany and other European countries. DT-ITS received the Best in Educational Cooperation award from HIPA in 2019 and was recognized as the Most Ethical Multinational Company in 2019. The company continuously develops its four sites in Budapest, Debrecen, Pécs and Szeged and is looking for skilled IT professionals to join its team.
Job Description
Are you an expert in deploying, observing, and maintaining distributed fleets of devices? Do you build infrastructure that scales effortlessly and recovers automatically from mass reconnections? Join our team to oversee the operational backbone of our edge-to-cloud ecosystem. If you love automating complex deployments and diving deep into observability metrics, you are the right fit for us!
Project Description:
Our project, GroundOS, is not just another screen manager. It is a next-generation Universal Display System (UDS) built to power the future of global mobility. We are building an "Operating System for Reality" that orchestrates massive, data-driven signage networks across critical infrastructure, from major international airports to sprawling public transport systems. GroundOS moves beyond static displays; it uses a state-of-the-art digital twin to process and react to real-time operational data. To guarantee continuous operation, the platform features a resilient, offline-first edge architecture that ensures screens keep running smoothly even if the network fails. Join us to blend high-performance Rust edge computing with modern TypeScript cloud services and help us set a new global standard for how hundreds of millions of passengers experience their journey.
Tasks
- Manage the deployment, observability, and lifecycle of thousands of remote mini-PCs alongside Cloud components.
- Execute Over-The-Air (OTA) updates reliably across a massive edge fleet.
- Configure and manage NATS JetStream, including Leaf Nodes for edge-cloud bridging, stream retention, and cluster HA.
- Set up and maintain tracing and metrics using OpenTelemetry to monitor cross-system health.
- Architect resilient systems capable of withstanding mass fleet reconnection events (thundering herd) without performance loss.
- Manage secrets, certificates, and secure mTLS communication between edge devices and the central control plane.
- Lead incident management and root-cause analysis for fleet-wide issues.
- Design scalable operations workflows to keep maintenance effort constant as the fleet grows.
Qualifications
- Extensive experience with infrastructure automation and remote fleet management.
- High proficiency in containerization (Docker), specifically optimized for edge devices (multi-arch builds, ARM/x64).
- Deep operational knowledge of NATS JetStream or similar high-throughput event brokers.
- Strong background in observability, tracing, and metric collection.
- Solid understanding of Zero-Trust security architectures and certificate management.
- Ability to remain calm and analytical during high-pressure incident response situations.
- Expert knowledge of agile development.
- Solid knowledge of Scrum.
- Experience working in agile projects and teams.
- Excellent English skills, both written and spoken (B2–C1).
- Excellent technical and analytical skills, as well as problem-solving abilities.
- Ability to handle stressful situations and work independently.
Advantages:
- Experience with Google Cloud's GKE for the central cloud control plane.
- Prior experience with specific edge orchestration tools.
Additional Information
* Please note that remote work is only possible within Hungary due to European taxation regulations.