OpenAI

Reliability/DFX Engineer

Job Description

Posted on: 2026-01-12

Responsibilities

  • Oversee DFX architecture, implementation, and execution in silicon from concept to high-volume deployment.
  • Build system-level reliability models to guide DFX and reliability strategy.
  • Collaborate with chip and platform architecture/design teams to implement DFX features.
  • Partner with hardware health and platform design teams to improve reliability and fault tolerance.
  • Design experiments and analyze the resulting data to drive continuous improvement.
  • Serve as the DFX/reliability champion to align the industry ecosystem with OpenAI’s requirements.
  • Propose high-ROI features to enhance reliability and fault tolerance.

Job Requirements

  • BS with 15+ years, MS with 10+ years, or PhD with 3+ years of relevant experience.
  • Hands-on experience with RTL design and DFT.
  • Understanding of ML chip and platform architecture and workload characteristics.
  • Strong fundamentals in reliability modeling and empirical data analysis.
  • Experience in physical implementation and/or silicon ATE is preferred.
  • Ability to collaborate across teams and communicate effectively.
  • Knowledge of DFX methodology and reliability strategies.
