Full-time
Remote
5-10

ML Engineer - AI Infra Group

7/15/2025

Architect and build scalable ML infrastructure for training and inference workloads across heterogeneous compute environments. Collaborate with AI researchers, data scientists, and product teams to understand their workflows and translate them into reusable platform services and APIs.

Working Hours

40 hours/week

Company Size

51-200 employees

Language

English

Visa Sponsorship

No

About The Company
Dream is a pioneering AI cybersecurity company delivering revolutionary defense through artificial intelligence. Our proprietary AI platform creates a unified security system safeguarding assets against existing and emerging generative cyber threats. Dream's advanced AI automates discovery, calculates risks, performs real-time threat detection, and plans an automated response. With a core focus on the "unknowns," our AI transforms data into clear threat narratives and actionable defense strategies. Dream's AI cybersecurity platform represents a paradigm shift in cyber defense, employing a novel, multi-layered approach across all organizational networks in real-time. At the core of our solution is Dream's proprietary Cyber Language Model, a groundbreaking innovation that provides real-time, contextualized intelligence for comprehensive, actionable insights into any cyber-related query or threat scenario.
About the Role

At Dream, we redefine cyber defense by combining AI and human expertise to create products that protect nations and critical infrastructure. This is more than a job; it’s a Dream job. Dream is where we tackle real-world challenges, redefine AI and security, and make the digital world safer. Let’s build something extraordinary together.


Dream's AI cybersecurity platform applies a new, out-of-the-ordinary, multi-layered approach, covering endless and evolving security challenges across the entire infrastructure of the most critical and sensitive networks. Central to our platform are Dream's proprietary Cyber Language Models, innovative technologies that provide contextual intelligence for the future of cybersecurity.


At Dream, our talented team, driven by passion, expertise, and innovative minds, inspires us daily. We are not just dreamers; we are dream-makers.


The Dream Job

We are on an expedition to find you, someone who is passionate about creating intuitive, out-of-this-world production-grade AI infrastructure. This group builds scalable, high-performance AI systems for internal users and external customers, designed to run seamlessly across cloud and on-premise environments using the latest hardware advancements. 


The Dream-Maker Responsibilities

  • Design and optimize LLM serving infrastructure using inference engines (vLLM, TensorRT-LLM, Triton Inference Server)
  • Implement and tune distributed inference strategies including tensor parallelism, pipeline parallelism, and multi-node serving
  • Develop and apply model compression techniques to optimize cost, latency, and memory footprint while maintaining model quality
  • Build self-service fine-tuning platforms that enable data scientists to run experiments (LoRA, QLoRA, full fine-tuning) in a standardized, reproducible, and governed manner
  • Optimize inference performance through batching strategies, KV-cache tuning, and speculative decoding
  • Develop reusable APIs, abstractions, and platform services for model deployment, scaling, and lifecycle management
  • Collaborate with AI researchers and product teams to productionize models and meet latency/throughput requirements
  • Evaluate and benchmark new model architectures, compression methods, and serving frameworks

The Dream Skill Set

  • 5+ years of experience in software engineering or ML engineering, with a significant focus on ML systems or backend infrastructure
  • Strong proficiency in Python and deep learning frameworks (PyTorch)
  • Hands-on experience with LLM inference engines (vLLM, TensorRT-LLM, Triton Inference Server)
  • Deep understanding of transformer architectures and LLM-specific optimizations (attention mechanisms, KV-cache, and quantization methods and formats such as GPTQ, AWQ, and GGUF)
  • Experience with distributed training/fine-tuning frameworks (Ray, DeepSpeed, FSDP)
  • Ability to build developer-facing tools and platforms with clear APIs and documentation
  • Understanding of GPU performance profiling and optimization
  • Familiarity with LLM evaluation methodologies and benchmarking

Never Stop Dreaming...

If you think this role doesn't fully match your skills but you are eager to grow and break glass ceilings, we’d love to hear from you!


Key Skills
Distributed Systems, Machine Learning, Python, Software Engineering, Kubernetes, Airflow, Ray, Spark, GPU Infrastructure, Containerization, Cloud-native Architectures, ML Workflows, Model Training, Model Evaluation, Inference Pipelines