

Production Systems Engineer, AI Systems
Location
Menlo Park, CA
Level
Full-Time
Department
Software Development
Type
Salary
$104,000 - $155,000
Job Description
Posted on:
2025-03-24
Responsibilities
- Support new AI platform introduction by driving scale-up and scale-out interface integration.
- Create experiments and tooling to detect and diagnose hardware/firmware/software health issues.
- Develop an understanding of AI workload traffic to support new product introduction.
- Enable hacks and prototypes for future technology exploration in AI workloads.
- Troubleshoot and diagnose system failures, isolating components while collaborating with partners.
- Develop visibility through data visualization and implement solutions for hardware health issues.
- Use production experience to drive quality improvements with teams.
Job Requirements
- Bachelor's degree in Computer Science, Computer Engineering, or a relevant technical field.
- 2+ years of experience in Network ASIC/Platform development or related areas.
- Knowledge of server architecture and components.
- Experience with Linux and TCP/IP networking, including tools such as iperf.
- Hands-on troubleshooting and debugging experience.
- Preferred: experience with Network Interface Cards (NICs) and RDMA/RoCE.
- Experience with large-scale deployments and Python scripting.