Full-time
5-10

Principal Researcher - GPU Performance

11/28/2025

Design, implement, and optimize GPU kernels for complex computational workloads such as AI inferencing. Collaborate with other researchers to improve model performance and document optimization strategies.

Working Hours

40 hours/week

Company Size

10,001+ employees

Language

English

Visa Sponsorship

No

About The Company
Every company has a mission. What's ours? To empower every person and every organization to achieve more. We believe technology can and should be a force for good and that meaningful innovation contributes to a brighter world in the future and today. Our culture doesn’t just encourage curiosity; it embraces it. Each day we make progress together by showing up as our authentic selves. We show up with a learn-it-all mentality. We show up cheering on others, knowing their success doesn't diminish our own. We show up every day open to learning our own biases, changing our behavior, and inviting in differences. Because impact matters. Microsoft operates in 190 countries and is made up of approximately 228,000 passionate employees worldwide.
About the Role
Design, implement, and optimize GPU kernels for complex computational workloads such as AI inferencing. Research and develop novel optimization techniques for generating GPU kernels. Profile and analyze kernel performance using advanced diagnostic tools. Generate automated solutions for kernel optimization and tuning. Collaborate with other researchers to improve model performance. Document optimization strategies and maintain performance benchmarks. Contribute to the development of internal GPU computing frameworks.
Doctorate in a relevant field AND 3+ years of related research experience, OR equivalent experience.
The requirements for this role include, but are not limited to, specialized security screenings.
3+ years of experience in GPU architecture, memory hierarchies, parallel computing, and algorithm optimization.
3+ years of experience in GPU programming and optimization; familiarity with CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks, including performance profiling and optimization tools.
Experience with machine learning frameworks (PyTorch, TensorFlow).
Working knowledge of Large Language Model architectures.
Publication record in relevant conferences or journals (MLSys, NeurIPS, ICML, ICLR, AISTATS, ACL, EMNLP, NAACL, ISCA, MICRO, ASPLOS, HPCA, SOSP, OSDI, NSDI, etc.).
Key Skills
GPU Architecture, Memory Hierarchies, Parallel Computing, Algorithm Optimization, GPU Programming, CUDA, ROCm, Triton, PTX, CUTLASS, Performance Profiling, Machine Learning, PyTorch, TensorFlow, Large Language Models, Publication Record
Categories
Technology, Science & Research, Engineering, Data & Analytics