Senior Linux & Infrastructure IT Engineer
2/7/2026
Operate and scale hybrid AWS and on-prem Linux compute infrastructure for chip design and verification workloads. Own day-to-day reliability, performance tuning, capacity planning, and incident response.
Working Hours
40 hours/week
Company Size
51-200 employees
Language
English
Visa Sponsorship
No
About the Role
We are a fast-growing semiconductor startup building next-generation silicon. Our design and verification pipelines rely on large-scale Linux compute infrastructure spanning AWS and on-prem environments.
We are seeking a senior, hands-on Cloud & Infrastructure IT Engineer to own the reliability, performance, and automation of our mission-critical EDA platforms. You will work directly with chip design teams to ensure our compute environments are fast, stable, secure, and ready to scale.
What You’ll Do
- Operate and scale hybrid AWS + on-prem Linux compute infrastructure for chip design and verification workloads.
- Own day-to-day reliability, performance tuning, capacity planning, and incident response.
- Build and maintain AWS environments using Terraform and Ansible.
- Automate provisioning of VPCs, IAM, EC2, FSx, EBS, S3, VPNs, and security controls.
- Tune Linux systems for CPU-, memory-, and I/O-intensive EDA workloads.
- Operate and optimize grid / job scheduling platforms such as Slurm, LSF, or Grid Engine.
- Design and manage high-throughput storage solutions for simulation pipelines.
- Develop automation and self-service tooling using Python and Bash.
- Implement observability and alerting using Prometheus and Grafana.
- Participate in the on-call rotation and lead root-cause analysis for production incidents.
Required Qualifications
- AWS: VPC, EC2, IAM, FSx, EBS, S3, VPN, security controls
- Infrastructure as Code: Terraform, Ansible
- Linux / HPC: Kernel, filesystem, and network performance tuning
- Schedulers: Slurm / LSF / Grid Engine
- Automation: Python, Bash
- Observability: Prometheus, Grafana
- CI/CD: GitHub Actions / GitLab CI
Requirements
- 7+ years of hands-on experience operating large-scale Linux infrastructure.
- Strong experience managing AWS production environments.
- Advanced proficiency with Terraform, Ansible, Python, and Bash.
- Deep understanding of networking, storage, and Linux internals.
- Comfortable owning business-critical systems in a fast-moving startup.
- Experience supporting semiconductor / EDA / HPC workloads.
Preferred
- Exposure to Azure or GCP.
- Experience with cloud cost optimization / FinOps.