

# **L16: Power Dissipation in Digital Systems**

#### **Problem #1: Power Dissipation/Heat**



**Courtesy Intel (S. Borkar)** 

## How do you cool these chips??




## **Problem #2: Energy Consumption**




#### **Dynamic Energy Dissipation**





## The Transition Activity Factor $\alpha_{0 \to 1}$

| Current<br>Input | Next<br>Input | Output<br>Transition |
|------------------|---------------|----------------------|
| 00               | 00            | 1 -> 1               |
| 00               | 01            | 1 -> 1               |
| 00               | 10            | 1 -> 1               |
| 00               | 11            | 1 -> 0               |
| 01               | 00            | 1 -> 1               |
| 01               | 01            | 1 -> 1               |
| 01               | 10            | 1 -> 1               |
| 01               | 11            | 1 -> 0               |
| 10               | 00            | 1 -> 1               |
| 10               | 01            | 1 -> 1               |
| 10               | 10            | 1 -> 1               |
| 10               | 11            | 1 -> 0               |
| 11               | 00            | 0 -> 1               |
| 11               | 01            | 0 -> 1               |
| 11               | 10            | 0 -> 1               |
| 11               | 11            | 0 -> 0               |



Assume inputs (A, B) arrive at rate *f* and are uniformly distributed. What is the average power dissipation?

Of the 16 equally likely (current, next) input pairs in the table, 3 produce a 0 -> 1 output transition:

$$\alpha_{0 \to 1} = 3/16$$

$$P = \alpha_{0 \to 1} C_L V_{DD}^2 f$$
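The 3/16 figure can be checked by brute-force enumeration. The sketch below assumes the gate is a 2-input NAND (the table's outputs match one) and that all 16 input pairs are equally likely; the function names are illustrative, not from the lecture.

```c
/* Output of a 2-input NAND gate; `in` packs inputs A and B into bits 1 and 0.
 * Assumption: the tabulated gate is a NAND, as its outputs suggest. */
static int nand2(unsigned in) {
    return !((in & 1u) && ((in >> 1) & 1u));
}

/* Fraction of (current, next) input pairs that cause a 0 -> 1 output
 * transition, with all 16 pairs taken as equally likely. */
double nand2_alpha_0_to_1(void) {
    int rising = 0;
    for (unsigned cur = 0; cur < 4; cur++) {
        for (unsigned next = 0; next < 4; next++) {
            if (nand2(cur) == 0 && nand2(next) == 1) {
                rising++;
            }
        }
    }
    return rising / 16.0;
}

/* P = alpha * C_L * V_DD^2 * f, in watts when the inputs are in SI units. */
double dynamic_power(double alpha, double c_load, double vdd, double freq) {
    return alpha * c_load * vdd * vdd * freq;
}
```

With hypothetical values $C_L$ = 1 pF, $V_{DD}$ = 2 V and *f* = 1 GHz, `dynamic_power(nand2_alpha_0_to_1(), 1e-12, 2.0, 1e9)` evaluates to 0.75 mW.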


### **Junction (Silicon) Temperature**







(bolt case to extended metal surface – heat sink)

#### **Intel Pentium 4 Thermal Guidelines**

- Pentium 4 @ 3.06 GHz dissipates 81.8 W!
- Maximum $T_C = 69\ ^{\circ}C$
- $R_{CA} < 0.23$ °C/W for 50 °C ambient
- Typical chips dissipate 0.5-1 W (cheap packages without forced-air cooling)

[Figure: die thermal map. Labels: Cache, 70 °C, Integer & FP execution core. Temperature scale from 76.87 °C to 105.2 °C.]



| Processor and<br>Core Frequency   | Thermal Design<br>Power <sup>1,2</sup> (W) |
|-----------------------------------|--------------------------------------------|
| **Processors with VID = 1.500 V** |                                            |
| 2 GHz                             | 52.4                                       |
| 2.20 GHz                          | 55.1                                       |
| 2.26 GHz                          | 56.0                                       |
| 2.40 GHz                          | 57.8                                       |
| 2.50 GHz                          | 59.3                                       |
| 2.53 GHz                          | 59.3                                       |
| **Processors with VID = 1.525 V** |                                            |
| 2 GHz                             | 54.3                                       |
| 2.20 GHz                          | 57.1                                       |
| 2.26 GHz                          | 58.0                                       |
| 2.40 GHz                          | 59.8                                       |
| 2.50 GHz                          | 61.0                                       |
| 2.53 GHz                          | 61.5                                       |
| 2.60 GHz                          | 62.6                                       |
| 2.66 GHz                          | 66.1                                       |
| 2.80 GHz                          | 68.4                                       |
| **Processors with multiple VIDs** |                                            |
| 2 GHz                             | 54.3                                       |
| 2.20 GHz                          | 57.1                                       |
| 2.26 GHz                          | 58.0                                       |
| 2.40 GHz                          | 59.8                                       |
| 2.50 GHz                          | 61.0                                       |
| 2.53 GHz                          | 61.5                                       |
| 2.60 GHz                          | 62.6                                       |
| 2.66 GHz                          | 66.1                                       |
| 2.80 GHz                          | 68.4                                       |
| 3.06 GHz                          | 81.8                                       |
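The case-to-ambient thermal resistance quoted for the 3.06 GHz part follows from the usual steady-state model, $T_C = T_A + R_{CA} \cdot P$. The sketch below assumes that model; the function names are made up for illustration.

```c
/* Case temperature (deg C) for a given ambient temperature (deg C),
 * case-to-ambient thermal resistance (deg C per watt) and power (W).
 * Assumes the steady-state model T_C = T_A + R_CA * P. */
double case_temperature(double t_ambient, double r_ca, double power) {
    return t_ambient + r_ca * power;
}

/* Largest thermal resistance that keeps the case at or below t_case_max. */
double max_thermal_resistance(double t_case_max, double t_ambient,
                              double power) {
    return (t_case_max - t_ambient) / power;
}
```

Calling `max_thermal_resistance(69.0, 50.0, 81.8)` gives roughly 0.23 °C/W, in line with the guideline for a 69 °C case limit, 50 °C ambient and 81.8 W.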





$$P = \alpha_{0 \to 1} C_L V_{DD}^2 f$$

- Reduce Transition Activity or Switching Events
- Reduce Capacitance (e.g., keep wires short)
- Reduce Power Supply Voltage
- Frequency is typically fixed by the application, though this can be adjusted to control power

# **Optimize at all levels of design hierarchy**

### Clock Gating is a Good Idea!




Hundreds of different clocks in a microprocessor

#### **Clock Gating Reduces Energy, does it reduce Power?**

### **Does your GHz Processor run at a GHz?**



Note that there is a difference between average and peak power.

An on-chip thermal sensor (diode based) measures the silicon temperature.

If the silicon junction gets too hot (say 125 °C), then the activity is reduced (e.g., reduce the clock rate or use clock gating).
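A minimal sketch of such a feedback rule is shown below. The policy (a single threshold and a fixed slowdown factor) and every name in it are assumptions for illustration; real processors use their own, more elaborate schemes.

```c
/* Hypothetical throttling policy: run at the nominal clock while the
 * junction is below the limit, otherwise scale the clock by `slowdown`
 * (a factor between 0 and 1). */
double throttled_clock_hz(double t_junction_c, double t_limit_c,
                          double f_nominal_hz, double slowdown) {
    if (t_junction_c >= t_limit_c) {
        return f_nominal_hz * slowdown;
    }
    return f_nominal_hz;
}
```

For example, `throttled_clock_hz(130.0, 125.0, 3.06e9, 0.5)` returns half the nominal rate, so a "3.06 GHz" part would run at 1.53 GHz while it is hot.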

## **Use of Thermal Feedback**

#### **Power Supply Resonance**





#### Number Representation: Two's Complement vs. Sign Magnitude



Consider a 16-bit bus whose input toggles between +1 and -1 (i.e., a small noise input). Which representation is more energy efficient?
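One way to answer the question is to count how many bus wires flip between the two values under each encoding. The sketch below does that; the helper names are illustrative, and the sign-magnitude layout (bit 15 as sign, bits 14..0 as magnitude) is an assumption.

```c
#include <stdint.h>

/* 16-bit two's complement encoding of v (assumes -32768 <= v <= 32767). */
uint16_t encode_twos_complement(int v) {
    return (uint16_t)v;
}

/* 16-bit sign-magnitude encoding: bit 15 is the sign, bits 14..0 hold the
 * magnitude (assumes -32767 <= v <= 32767). */
uint16_t encode_sign_magnitude(int v) {
    uint16_t magnitude = (uint16_t)(v < 0 ? -v : v);
    return (uint16_t)((v < 0 ? 0x8000u : 0u) | magnitude);
}

/* Number of bus wires that change when the value goes from a to b. */
int bit_transitions(uint16_t a, uint16_t b) {
    unsigned diff = (unsigned)(a ^ b);
    int count = 0;
    while (diff) {
        count += (int)(diff & 1u);
        diff >>= 1;
    }
    return count;
}
```

Going from +1 to -1 flips 15 of the 16 wires in two's complement (0x0001 to 0xFFFF) but only the sign wire in sign magnitude (0x0001 to 0x8001), so sign magnitude switches far less capacitance for this input.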


### **Time Sharing is a Bad Idea**





### **Time Sharing Increases Switching Activity**

### Not just a 6-1 Issue: "Cool" Software ???



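One loop that alternates between the two arrays:

```
#include <math.h>   /* sin, cos */

float a[256], b[256];
float pi = 3.14;
int i;

/* each iteration touches a[], then b[] */
for (i = 0; i < 256; i++) {
    a[i] = sin(pi * i / 256);
    b[i] = cos(pi * i / 256);
}
```

Two loops that handle one array at a time:

```
#include <math.h>   /* sin, cos */

float a[256], b[256];
float pi = 3.14;
int i;

/* finish all of a[] before starting b[] */
for (i = 0; i < 256; i++) { a[i] = sin(pi * i / 256); }
for (i = 0; i < 256; i++) { b[i] = cos(pi * i / 256); }
```

Estimated bit transitions:

- Single loop: 512(8)+2+4+8+16+32+64+128+256 ≈ 4607 bit transitions
- Two loops: 2(8)+2(2+4+8+16+32+64+128+256) ≈ 1030 transitions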
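The gap between the two loop orderings can be reproduced with a small address-bus simulation. The sketch below rests on assumptions that are not in the lecture: a 16-bit word-addressed bus, `a[]` at base 0x0000 and `b[]` at base 0xFF00 (so the upper 8 address bits all differ), and only the array accesses counted.

```c
#include <stdint.h>

#define ARRAY_LEN 256
#define A_BASE 0x0000u  /* hypothetical base address of a[] */
#define B_BASE 0xFF00u  /* hypothetical base address of b[] */

/* Number of address lines that differ between x and y. */
static int hamming16(uint16_t x, uint16_t y) {
    unsigned diff = (unsigned)(x ^ y);
    int count = 0;
    while (diff) {
        count += (int)(diff & 1u);
        diff >>= 1;
    }
    return count;
}

/* Total bus transitions over a sequence of n addresses. */
static int bus_transitions(const uint16_t *addr, int n) {
    int total = 0;
    for (int i = 1; i < n; i++) {
        total += hamming16(addr[i - 1], addr[i]);
    }
    return total;
}

/* One loop: a[0], b[0], a[1], b[1], ... */
int interleaved_transitions(void) {
    uint16_t addr[2 * ARRAY_LEN];
    for (int i = 0; i < ARRAY_LEN; i++) {
        addr[2 * i] = (uint16_t)(A_BASE + i);
        addr[2 * i + 1] = (uint16_t)(B_BASE + i);
    }
    return bus_transitions(addr, 2 * ARRAY_LEN);
}

/* Two loops: all of a[], then all of b[]. */
int separate_transitions(void) {
    uint16_t addr[2 * ARRAY_LEN];
    for (int i = 0; i < ARRAY_LEN; i++) {
        addr[i] = (uint16_t)(A_BASE + i);
        addr[ARRAY_LEN + i] = (uint16_t)(B_BASE + i);
    }
    return bus_transitions(addr, 2 * ARRAY_LEN);
}
```

Under these assumptions the simulation counts 4590 transitions for the single loop and 1020 for the two loops. These are close to the estimates above; the small differences come from how the sequence boundaries are tallied.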

L16: 6.111 Spring 2006

Introductory Digital Systems Laboratory







- Balancing paths reduces glitching transitions
- Structures such as multipliers have a lot of glitching transitions
- Keeping logic depths short (e.g., pipelining) reduces glitching

## **Reduce Supply Voltage: But is it Free?**





Scaling $V_{DD}$ from 2 V to 1 V: energy $\downarrow$ by 4×, delay $\uparrow$ by 2×
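The 4× energy figure follows from the quadratic dependence of switching energy on $V_{DD}$. A minimal sketch (the function name is illustrative; the delay penalty depends on the process and is not modeled here):

```c
/* Ratio of switching energy after a supply change to the energy before it,
 * using E proportional to V_DD^2 with capacitance and activity unchanged. */
double energy_ratio(double vdd_old, double vdd_new) {
    return (vdd_new * vdd_new) / (vdd_old * vdd_old);
}
```

`energy_ratio(2.0, 1.0)` returns 0.25, i.e. one quarter of the original energy per operation.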



#### **Trade Area for Low Power**





#### **Exploit Time Varying Algorithmic Workload To Vary the Power Supply Voltage**

#### **Dynamic Voltage Scaling (DVS)**








#### **DVS on a Processor**





The μOS selects an appropriate clock frequency based on workload and latency constraints.
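One simple selection rule is to pick the slowest clock that still meets the deadline, since a slower clock permits a lower supply voltage. The sketch below assumes that rule and a small table of available frequencies; the function, its parameters and the example numbers are all hypothetical.

```c
#include <stddef.h>

/* Return the lowest frequency (Hz) in `levels_hz` that completes `cycles`
 * of work within `deadline_s` seconds. `levels_hz` must be sorted in
 * ascending order and hold at least one entry. If no level is fast enough,
 * the fastest one is returned. */
double select_clock_hz(double cycles, double deadline_s,
                       const double *levels_hz, size_t n_levels) {
    for (size_t i = 0; i < n_levels; i++) {
        if (cycles / levels_hz[i] <= deadline_s) {
            return levels_hz[i];
        }
    }
    return levels_hz[n_levels - 1];
}
```

With levels of 100, 200 and 400 MHz, a task of 3 million cycles due in 20 ms would run at 200 MHz: 100 MHz would take 30 ms, while 200 MHz finishes in 15 ms.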




#### Hardware vs. Software





Courtesy of R. Brodersen, J. Rabaey, TI, ARM/StrongARM

**Energy/Operation** 



### **Energy Efficiency of Software**





#### "Software" Energy Dissipation has Large Overhead



#### **Trends: Leakage and Power Gating**







#### **MEMS Generator**



Jose Mur Miranda/ Jeff Lang

#### Vibration-to-Electric Conversion

**~10 μW**


#### **Power Harvesting Shoes**



Joe Paradiso (Media Lab)

After 3-6 steps, it provides 3 mA for 0.5 sec

~10 mW