Bespoke processors
Sampath VP
ASIC/FPGA Design Professional | SoC Architecture | Technology Evangelist | IEEE Reviewer
Bespoke processors target applications with ultra-low area and power constraints. Low-power processors are widely used and are expected to power a large number of emerging applications. Such processors tend to be simple, run relatively simple applications, and do not support non-determinism, which makes symbolic simulation-based techniques a good fit for them.
Bespoke processor design tailors a general-purpose processor IP to a target application by removing all gates from the design that can never be used by the application. A bespoke processor, tailored to a target application, must be functionally equivalent to the original processor when executing the application. A bespoke implementation of a processor design should therefore retain all the gates from the original design that might be needed to execute the application. Any gate that could be toggled by the application and propagate its toggle to a state element or output port performs a necessary function and must be retained to maintain functional equivalence.
Conversely, any gate that can never be toggled by the application can safely be removed, as long as each fan-out location for the gate is fed with the gate’s constant output value for the application. Removing constant gates for an application could result in significant area and power savings and, unlike conventional energy saving techniques, will introduce no performance degradation.
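As a rough illustration of this pruning rule, the Python sketch below removes the gates of a toy netlist that never change value and feeds their fan-out with the constant they held. It is a simplification under stated assumptions, not the actual design flow: the three-gate netlist, the `mode` input that the application never asserts, and all names (`Gate`, `find_constant_gates`, `prune`) are hypothetical, and "can never be toggled" is approximated by exhaustively simulating the input vectors the application can produce rather than by symbolic simulation. It also ignores whether a toggle propagates to a state element or output port.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Gate:
    name: str      # name of the net driven by this gate
    op: str        # "AND", "OR" or "NOT"
    inputs: tuple  # names of the driving nets (primary inputs or gates)


OPS = {
    "AND": lambda vals: all(vals),
    "OR": lambda vals: any(vals),
    "NOT": lambda vals: not vals[0],
}


def simulate(gates, net_values):
    """Evaluate gates in the listed (topological) order; return net -> bool."""
    nets = dict(net_values)
    for g in gates:
        nets[g.name] = bool(OPS[g.op]([nets[i] for i in g.inputs]))
    return nets


def find_constant_gates(gates, input_vectors):
    """Return {gate name: value} for gates whose output never changes."""
    seen = {g.name: set() for g in gates}
    for vec in input_vectors:
        nets = simulate(gates, vec)
        for g in gates:
            seen[g.name].add(nets[g.name])
    return {name: next(iter(vals)) for name, vals in seen.items() if len(vals) == 1}


def prune(gates, constants):
    """Drop constant gates; their nets are tied off to the constant value."""
    return [g for g in gates if g.name not in constants]


# Hypothetical netlist: out = (a AND b) OR (mode AND a).
gates = [
    Gate("g1", "AND", ("a", "b")),
    Gate("g2", "AND", ("mode", "a")),
    Gate("g3", "OR", ("g1", "g2")),
]

# Assumption: the target application never asserts "mode".
app_vectors = [
    {"a": a, "b": b, "mode": False} for a, b in product([False, True], repeat=2)
]

constants = find_constant_gates(gates, app_vectors)  # {"g2": False}
bespoke = prune(gates, constants)                    # keeps g1 and g3

# The pruned netlist, with g2's fan-out fed by its constant value, matches
# the original output for every vector the application can produce.
equivalent = all(
    simulate(bespoke, {**vec, **constants})["g3"] == simulate(gates, vec)["g3"]
    for vec in app_vectors
)
```

In this toy case one of three gates disappears because the application never exercises `mode`. The equivalence holds only for the application's own inputs, which is the sense of functional equivalence used here.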
Bespoke processors can achieve significantly lower area and power than their general-purpose counterparts without any performance degradation, since the removed gates are never used by the application. In addition, gate removal can expose additional timing slack that can be exploited to increase the area and power savings or the performance of a bespoke design. Bespoke processor design reduces area and power by 62% and 50%, respectively, on average, while exploiting exposed timing slack improves average power savings to 65%.
Area- and power-constrained microprocessors and microcontrollers are the most abundant type of processor produced and used today, with deployment projected to grow exponentially in the near future. This explosive growth is fueled by emerging area- and power-constrained applications, such as the Internet of Things, wearables, implantables, and sensor networks.
The microprocessors and microcontrollers used in these applications are designed to include a wide variety of functionalities in order to support a large number of diverse applications with different requirements. On the other hand, the embedded systems designed for these applications typically run one application, or a small number of applications, over and over on a general-purpose processor for the lifetime of the system. Given that a particular application may use only a small subset of the functionalities provided by a general-purpose processor, there may be a considerable amount of logic in the processor that is never used by that application.
Cost concerns drive many of the above applications to use general-purpose microprocessors and microcontrollers instead of much more area- and power-efficient ASICs, since, among other benefits, the development cost of microprocessor IP cores can be amortized by the IP core licensor over a large number of chip makers and licensees. In fact, ultra-low-area- and power-constrained microprocessors and microcontrollers powering these applications are already the most widely used type of processing hardware in terms of production and usage, in spite of their well-known inefficiency compared to ASIC and FPGA-based solutions. Given this mismatch between the extreme area and power constraints of emerging applications and the relative inefficiency of general-purpose microprocessors and microcontrollers compared to their ASIC counterparts, there exists a considerable opportunity to make microprocessor-based solutions for these applications much more area- and power-efficient.
One big source of area inefficiency is that a general-purpose microprocessor is designed to target an arbitrary application and thus contains many more gates than a specific application needs. These unused gates also continue to consume power, resulting in significant power inefficiency. While adaptive power management techniques (e.g., power gating) help to reduce the power consumed by unused gates, their effectiveness is limited by the coarse granularity at which they must be applied, as well as by significant implementation overheads such as domain isolation and state retention. These techniques also worsen area inefficiency.