Dissertation Defense
A Holistic Solution for Reliability of 3D Parallel Systems
This event is free and open to the public.
ABSTRACT: As device scaling slows down, emerging technologies such as 3D integration and carbon nanotube field-effect transistors are among the most promising ways to increase device density and performance. These technologies offer shorter interconnects, higher performance, and lower power. However, their higher operating temperatures and current densities are projected to cause significantly higher failure rates. Moreover, because the manufacturing processes are still immature, with high variation and high defect densities, chip designers are reluctant to treat these technologies as a stand-alone replacement for silicon-based transistors.
The goal of this dissertation is to introduce new architectural and circuit techniques that work around the high fault rates of emerging 3D technologies, bringing their performance and reliability to levels comparable to silicon. We propose a holistic approach to the reliability problem that addresses the necessary aspects of an effective solution (detection, diagnosis, repair, and prevention) synergistically. By leveraging 3D fabric layouts, we propose an underlying architecture that efficiently repairs the system in the presence of faults. This thesis presents a fault detection scheme that re-executes instructions on idle identical units, distinguishes between transient and permanent faults, and localizes them to the granularity of a pipeline stage. Furthermore, using a dynamic and adaptive reconfiguration policy based on activity factors and temperature variation, we propose a framework that significantly improves lifetime management and prevents faults due to aging.
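The detection idea described above can be illustrated with a minimal software sketch. It is not the dissertation's implementation: it models a pipeline stage as a hypothetical Python function, assumes the idle identical unit is fault-free, and uses an arbitrary retry count. All names (`classify_fault`, `localize_stage`, the example stages) are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical model: a pipeline stage is a function from an input word to an output word.
Stage = Callable[[int], int]

def classify_fault(suspect: Stage, spare: Stage, value: int, retries: int = 3) -> str:
    """Compare a suspect stage against an idle identical spare (assumed fault-free)."""
    reference = spare(value)
    if suspect(value) == reference:
        return "no_fault"
    # Mismatch seen: re-execute the same operation on the suspect stage.
    for _ in range(retries):
        if suspect(value) == reference:
            return "transient"  # the error did not repeat on re-execution
    return "permanent"  # the error repeated on every retry

def localize_stage(suspect: Sequence[Stage], spare: Sequence[Stage], value: int) -> Optional[int]:
    """Return the index of the first stage whose output differs from the spare's."""
    for index, (s, r) in enumerate(zip(suspect, spare)):
        expected = r(value)
        if s(value) != expected:
            return index
        value = expected  # feed the known-good result to the next stage
    return None

# Example stages (illustrative only).
def healthy(x: int) -> int:
    return x + 1

def stuck_at_one(x: int) -> int:
    return (x + 1) | 1  # models output bit 0 stuck at 1

def make_glitchy() -> Stage:
    state = {"first": True}
    def stage(x: int) -> int:
        if state["first"]:
            state["first"] = False
            return (x + 1) ^ 4  # a single bit flip on the first execution only
        return x + 1
    return stage
```

In this sketch a mismatch that disappears on re-execution is labeled transient, while one that persists across all retries is labeled permanent. Comparing stage by stage, rather than only at the pipeline output, is what lets the fault be attributed to a single stage.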
Finally, we present a design framework that supports large-scale chip production with carbon nanotube-based technology by mitigating yield and variation failures. The framework efficiently supports high-variation technologies by protecting against manufacturing defects at two granularities: the module level and the pipeline-stage level.
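To see why the granularity of protection matters for yield, consider the following hedged sketch. It assumes a hypothetical defect map in which each row is one pipeline and each column is one stage position, and it further assumes that stage-level repair can combine any defect-free stage at a given position with stages from other pipelines. Neither assumption is taken from the dissertation; they only illustrate the general argument.

```python
from typing import List

# Hypothetical defect map: defects[p][s] is True if stage s of pipeline p is defective.
DefectMap = List[List[bool]]

def usable_pipelines_module_level(defects: DefectMap) -> int:
    """A pipeline is usable only if every one of its own stages is defect-free."""
    return sum(1 for pipeline in defects if not any(pipeline))

def usable_pipelines_stage_level(defects: DefectMap) -> int:
    """Assumes good stages at the same position can be shared across pipelines.

    The number of complete pipelines is then limited by the stage position
    with the fewest defect-free instances.
    """
    if not defects:
        return 0
    stage_count = len(defects[0])
    good_per_position = [
        sum(1 for pipeline in defects if not pipeline[s]) for s in range(stage_count)
    ]
    return min(good_per_position)

# Three pipelines of four stages, each with one defect at a different position.
example_map: DefectMap = [
    [False, True, False, False],
    [False, False, False, True],
    [True, False, False, False],
]
```

For `example_map`, module-level protection leaves no usable pipeline because every pipeline contains a defect, whereas the stage-level model can still assemble two complete pipelines from the defect-free stages.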