
Published on Chip Design Magazine

Power Insanity: Front-to-Back

By Solaiman Rahim, PhD, Atrenta Inc., and Jason Xing, ICScape Inc.

The Power Insanity Problem – Earlier is Better!

Solaiman Rahim, PhD
Senior Director R&D, 
SpyGlass Power

Atrenta Inc.
San Jose, CA

The recent boom in the mobile, gaming, automotive and other hi-tech consumer electronics markets has created the need to run multiple applications on system-on-chip (SoC) devices faster than ever before. This trend has created a major challenge to extend battery life for handheld devices or reduce cooling costs for wired devices. Software and hardware engineers are constantly looking for new methods to reduce power. Traditionally, power optimization has been done at the physical design stage, mainly by implementing clock gating, using different VT libraries, performing cell sizing and so forth, all at synthesis or place and route.

It is now mandatory to perform low power analysis during the SoC architecture planning or RTL coding stages, designing for the shutdown of IP subsystems during different modes of operation to reduce leakage power. Accurate RTL power estimation plays a key role in identifying such opportunities as early as possible. Sequential optimization can also be used at RTL to reduce dynamic power. These techniques leverage formal analysis to identify redundancies in the design at sequential depths of several hundred cycles, and exploit them to find new clock gating opportunities or improve the efficiency of existing clock gating.

Sequential optimization mainly identifies two types of redundancy conditions. The first is the ODC, or "observability don't care" condition: the condition under which the output of a flop is not observable for a certain number of clock cycles, which can be used to gate registers that are not otherwise shut down.

The second is the STC, or "stability" condition: the condition under which the input of a flop is stable for a certain number of clock cycles, which can be used to shut the flop down.
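The STC idea can be illustrated with a toy Python model of a single D flop: when the input already equals the stored value, the clock edge is redundant and can be gated without changing behavior. This is a sketch of the concept only; real tools derive the stability condition by formal analysis over the surrounding logic, not by comparing values at runtime.

```python
# Toy model of stability-condition (STC) clock gating for one D flop.
# Illustrative only: real sequential optimization proves the stability
# condition formally and synthesizes an enable from upstream signals.

def simulate(d_stream):
    """Simulate a D flop whose clock is gated whenever D is stable.

    Returns (outputs, clocked): the per-cycle output (identical to an
    ungated flop's output in this model) and how many clock edges the
    gated flop actually received.
    """
    q = 0
    outputs = []
    clocked = 0
    for d in d_stream:
        if d != q:          # input unstable -> the flop must be clocked
            q = d
            clocked += 1
        # if d == q, the edge is gated; q keeps its value anyway
        outputs.append(q)
    return outputs, clocked

# A burst of activity followed by long stable stretches:
stream = [1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
outs, clocked = simulate(stream)
print(outs)      # same sequence an ungated flop would produce
print(clocked)   # only 4 of 10 cycles actually clock the flop
```

The gap between `clocked` and the stream length is exactly the dynamic clock power the STC condition recovers.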

Memory power is also an important component of the design’s power. The memory controller might sometimes generate redundant “write” or “read” signals resulting in wasted power. Sequential optimization using ODC and STC techniques can also be used to detect and eliminate such redundancies. 

With the move to the 28nm technology node and below, memories come with a lower leakage mode (also called sleep mode). Designers can distinguish several types of sleep modes: a "light sleep" mode, where the content of the memory is retained, and a "deep sleep" mode, where the content of the memory is lost. If the memory is put into light sleep or deep sleep for a certain number of clock cycles, it enters a low leakage state that can yield substantial power savings. Because entering and exiting sleep mode causes a rush of current, which in turn increases the power consumed, the memory must stay in sleep mode for at least N cycles to compensate for the in-rush current and effectively save power.
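The break-even point N falls out of simple energy bookkeeping: sleeping saves the leakage difference every cycle but costs a one-time in-rush energy. The sketch below uses invented numbers (no datasheet behind them) just to make the arithmetic concrete.

```python
# Back-of-the-envelope break-even for memory light-sleep mode.
# All numbers are illustrative, not from any real memory datasheet.
import math

def min_sleep_cycles(active_leak_mw, sleep_leak_mw, wake_energy_pj, cycle_ns):
    """Smallest idle span N (cycles) for which sleeping saves energy.

    Per-cycle saving is the leakage difference times the cycle time;
    conveniently, mW * ns = pJ. Sleep pays off once
    N * saving_per_cycle strictly exceeds the in-rush energy.
    """
    saving_per_cycle_pj = (active_leak_mw - sleep_leak_mw) * cycle_ns
    return math.floor(wake_energy_pj / saving_per_cycle_pj) + 1

# 2 mW awake leakage vs 0.5 mW in light sleep, 30 pJ in-rush cost,
# 1 ns clock period:
n = min_sleep_cycles(2.0, 0.5, 30.0, 1.0)
print(n)  # 21 cycles: at 20 cycles the saving only matches the cost
```

Sequential formal analysis then has to prove the memory is idle for at least this many cycles before the sleep entry is worth inserting.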

While use of the “deep sleep” mode can only be done at the architectural level, sequential formal techniques at RTL can be used to identify the state(s) for which the memory is not accessed for N cycles, allowing the memory to be put into “light sleep” mode. Identifying these sequential conditions is only the first step to effectively reduce power. 

The second step is to estimate if the new gating opportunity will save power or not. The power saving is a factor of the logic needed to build the opportunity and the temporal relationship between the signals. This requires accurate power estimation at RTL and the ability to perform differential power computations.  
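A differential power computation of this kind reduces to comparing what the gating removes against what the new gating logic itself consumes. The function below is a deliberately minimal sketch with hypothetical names and numbers, standing in for the much richer activity-based estimation a real RTL power tool performs.

```python
# Hypothetical differential power check for one gating opportunity.
# Names and figures are illustrative, not from any tool or library.

def net_saving_uw(flop_dyn_uw, activity_before, activity_after, gating_logic_uw):
    """Dynamic power removed by gating, minus the power of the added
    gating logic. Positive means the opportunity is a net win."""
    saved = flop_dyn_uw * (activity_before - activity_after)
    return saved - gating_logic_uw

# Gating cuts a 100 uW register bank's clock activity from 0.8 to 0.2,
# but the new enable logic itself burns 20 uW: still a net win.
print(net_saving_uw(100.0, 0.8, 0.2, 20.0))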

The third step is the implementation of power reduction opportunities. Sequential optimization introduces additional structures and can result in very complex gating conditions. RTL designers are usually reluctant to make complex RTL changes, since they need to preserve backward compatibility with physical synthesis to resolve potential engineering change orders (ECOs). This was also the case when "synthesis retiming optimizations" were introduced in the industry almost ten years ago.

However, with the constant need to build faster chips, “retiming” is now adopted by the industry. The same will certainly apply to sequential optimization for power reduction. In the meantime, it is important to report the power reduction opportunities or automatically generate new power optimized RTL that can be understood by designers.

Finally, the correctness of the changes has to be verified. As the changes are sequential in nature, logic equivalence checking (LEC) tools will fail to verify them. It is important to use a sequential equivalence checking (SEC) tool to verify the functional correctness of the design after implementation of low power optimization. In addition to the functional correctness, it is important to validate that power optimization did not introduce clock domain crossing (CDC) problems. Power intent verification should also be used to verify the correctness of the electrical power structures.

With the constant need for the industry to come up with exciting hi-tech products that are smaller, faster and perform more tasks, the power crisis will not be over soon. However, it can be lessened if chip designers address the problem early in the design flow. At RTL, design teams can achieve substantial power reductions by using sequential optimization, accurate power estimation and validation. These techniques produce better results than waiting until later in the flow, when physical implementation begins.

Multi-dimensional thinking and optimization is the key to fixing power insanity

Jason Xing
VP, Engineering 
ICScape Inc.
Santa Clara, California

As we all know, from the chip design perspective, the power crisis is getting out of control. The single-dimensional methodology of days gone by no longer works and indeed will make us go crazy. So we have to stop thinking like that.

I think multi-dimensional simultaneous power and timing optimization methodologies are needed to effectively balance performance and power for today’s relentlessly-rigorous, consumer-driven chip designs. Let me explain.

In the past, if we wanted to optimize power, we ratcheted performance down a bit.  So there was an easy tradeoff:  if you had to get better “performance” (whether faster speed or smaller area) you increased power consumption.  Usually that’s what happened. Power considerations were definitely secondary.  After all, wasn’t our desktop computer plugged into the wall outlet?

What are we driven by today? Consumer-driven mobile applications and handheld devices. Battery life is a fundamental, primary consumer demand. Blazing speed and monstrous functional or "app" capacity are also important, but if your battery dies, you have no capability. Chip designs for mobile consumer applications are now pushing the limits of performance and power. Performance and power are no longer a simple tradeoff pair, in my opinion. They've become orthogonal considerations and get pushed simultaneously when we mull over the design objective.

Today, if you want to increase chip performance and reduce power, you run into barriers such as signal integrity, manufacturability, and ever smaller area requirements.  For example, if you want to minimize power today, you need to use complex techniques such as clock gating to reduce dynamic power or multiple voltage thresholds so that you can reduce power leakage.  Chip designers are also using complicated techniques such as power domain and multi-threshold CMOS (MTCMOS).   However, those techniques may create a huge number of corners and design modes for physical implementation and timing closure to address.  What’s the problem? They dramatically increase the complexity of the physical design as well as slow down timing ECO convergence for multi-corner and multi-mode design.

As we already see, ECO timing closure is becoming much more difficult and an increasingly greater portion of the physical design process. So what do we have to do to solve the ECO timing closure problem?

Timing closure tools will need to have multi-dimensional optimization engine technology on top of existing layout engines and timing engines. Such an optimization engine should simultaneously consider power and multi-corner multi-mode constraints. This timing engine approach should offer high correlation with the timing signoff engine.  Plus, the layout engine should have high correlation with the implementation placement and routing engine.   Without this optimization approach, we’ll be stuck trying to fix the multi-dimensional power-performance tradeoff problem with single-dimensional tools.   It simply won't work.

Just as power and performance are orthogonal today, so are timing and power. We can no longer materially trade one against the other, as we used to be able to do. It's as if we went from two-dimensional tic-tac-toe (nine squares) to 3D tic-tac-toe (27 cells), then added a fourth dimension, and on and on. How do we develop SoC design tools to win as new layers, new dimensions get added?

The solution is to solve the multi-dimensional problems simultaneously. Contemporary chip designs have advanced quickly from 40 and 28 nm down to 20, 14 nm and below. Low leakage power is becoming a gating signoff criterion for chip designs. Typically, multi-VT cell swapping techniques are used for leakage power optimization. However, when a low-leakage, high-VT cell is used, we sacrifice chip performance. We need to minimize leakage power while considering timing slacks. Again, such timing considerations have to cover multi-corner and multi-mode to combat leakage power and the greater power-performance-timing tradeoff problem.
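The slack-aware flavor of multi-VT swapping can be sketched as a greedy loop: try the biggest leakage savers first and skip any swap whose delay penalty would drive a path's slack negative. The cell data below is invented for illustration, and a real optimizer would re-time the design and refresh slacks across all corners after every swap rather than use the static numbers shown here.

```python
# Greedy sketch of multi-VT leakage recovery: swap cells to high-VT
# (slower, lower leakage) wherever timing slack allows. Cell names,
# slacks and savings are invented; real flows update slack per swap
# and check every corner/mode, which this toy deliberately skips.

cells = [
    # worst slack through the cell (ps), leakage saved (nW) and
    # delay penalty (ps) if swapped from low-VT to high-VT
    {"name": "u1", "slack_ps": 120, "save_nw": 40, "penalty_ps": 30},
    {"name": "u2", "slack_ps": 15,  "save_nw": 55, "penalty_ps": 25},
    {"name": "u3", "slack_ps": 80,  "save_nw": 25, "penalty_ps": 60},
]

def greedy_hvt_swap(cells):
    """Try the largest leakage savings first; reject any swap whose
    delay penalty would make the path slack negative."""
    swapped, total_save_nw = [], 0
    for c in sorted(cells, key=lambda c: -c["save_nw"]):
        if c["slack_ps"] - c["penalty_ps"] >= 0:
            swapped.append(c["name"])
            total_save_nw += c["save_nw"]
    return swapped, total_save_nw

names, saved = greedy_hvt_swap(cells)
print(names, saved)  # u2 is skipped: its 25 ps penalty exceeds its 15 ps slack
```

The interesting case is u2: it offers the largest saving but sits on the most critical path, which is exactly the power-performance tension the article describes.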

Power insanity is not unique to physical design. Power insanity permeates the entire chip design process and has to be tackled from beginning to end: i.e., from architecture, to RTL design, to synthesis, and through to physical design (floorplanning, placement, clock tree synthesis, and routing). During any of the design phases, designers and their design tools must consider performance and power simultaneously. Without a rigorously consistent effort, the design will never meet the ever-increasing demand for higher performance and lower power.

Jason Xing, VP, Engineering, ICScape Inc.

Jason Xing is co-founder and Vice President of Engineering at ICScape, where he architected the timing, clock and power closure products. Jason has over 15 years of EDA research and development experience. In 1997, he joined Sun Labs after receiving his PhD in Computer Science from the University of Illinois at Urbana-Champaign. At Sun Labs, Xing did research on concurrent physical and logical design methodologies and shape-based routing technologies. In 2001, he joined the Sun Microsystems internal CAD development team before starting ICScape in 2005. Jason also holds a PhD in Mathematics from the University of Louisiana.


Solaiman Rahim, PhD, Senior Director R&D, SpyGlass Power, Atrenta Inc.
Solaiman Rahim has over 12 years of experience in the field of Electronic Design Automation and has worked on all aspects of formal verification, power, synthesis and timing constraints. He is the author of numerous papers presented at international conferences and published in journals. He has been granted several patents in the fields of power, formal verification and timing constraints. He previously worked at Synplicity, where he led the formal verification technology, and was one of the early employees at Softknot. Solaiman holds a Master's in Electronics and Computer Science from the French "Grande École" ESEO, a Master's in Physics from the University of Provence, and earned his PhD in Formal Verification and Synthesis from LIRMM, Montpellier, France.
