Google

System Hardware Reliability Engineer, TPU Systems

Job Description

Posted on: 2026-02-18

Responsibilities

  • Lead system hardware reliability efforts and identify potential failure-inducing factors.
  • Define and manage reliability needs for TPU system deployments at customer data centers.
  • Monitor and improve times to repair and system hardware availability.
  • Conduct failure analysis to identify root causes and create actionable insights.
  • Collaborate with external partners and develop in-house testing capabilities.
  • Support early detection of unexpected system failure trends and timely action on them.
  • Engage with cross-functional internal groups to enhance hardware reliability.

Job Requirements

  • Bachelor's degree in Electrical, Mechanical, Industrial, or Materials Engineering, or a related field.
  • 8 years of experience in manufacturing.
  • Preferred: Master's degree or PhD in a related engineering field.
  • Experience in manufacturing assembly processes and product launches.
  • Technical leadership and project management experience.
  • Proficiency in statistical analysis and reliability statistics.
  • Knowledge of hardware systems, particularly solder joint reliability and PCBA.