OpenAI

Reliability/DFX Engineer

Job Description

Posted on: 2025-10-09

Responsibilities

  • Oversee DFX architecture, implementation, and execution in silicon from concept to deployment.
  • Build system-level reliability models based on empirical data.
  • Collaborate with chip and platform design teams to implement DFX features.
  • Partner with hardware health and platform design teams to improve reliability.
  • Serve as the DFX/reliability champion to align the industry ecosystem with OpenAI’s requirements.
  • Propose high-ROI features to enhance reliability and fault tolerance.
  • Analyze data to drive continuous improvements across the stack.

Job Requirements

  • BS with 15+ years, MS with 10+ years, or PhD with 3+ years of relevant experience.
  • Hands-on experience with RTL design and DFT is required.
  • Detailed understanding of ML chip and platform architecture is necessary.
  • Strong fundamentals in reliability modeling and empirical data analysis.