





Prof. Hyesoon Kim











# Intel's Embedded Systems

Atom Processors



- 32-bit, Hyper-Threading, low-power, in-order processors

http://en.wikipedia.org/wiki/File:Atom\_Z520\_vs\_1Cent.JPG









| Specification                | Intel® Atom™ Processor Z670 (tablets and fanless netbooks) | Intel® Atom™ Processor D525 (desktop)        | Intel® Atom™ Processor D425 (desktop)        |
|------------------------------|------------------------------------------------------------|----------------------------------------------|----------------------------------------------|
| Processor Frequency          | 1.50 GHz                                                   | 1.80 GHz                                     | 1.80 GHz                                     |
| Number of Cores / Threads    | 1/2                                                        | 2/4                                          | 1/2                                          |
| Intel® Smart Cache           | 512 KB L2                                                  | 2 x 512 KB L2                                | 512 KB L2                                    |
| Graphics                     | Intel® Graphics Media Accelerator 600                      | Intel® Graphics Media Accelerator 3150       | Intel® Graphics Media Accelerator 3150       |
| Intel® 64 Architecture       | No                                                         | Yes                                          | Yes                                          |
| Integrated Memory Controller | Yes                                                        | Yes                                          | Yes                                          |
| Memory Support               | Single-channel DDR2 800 MT/s, up to 2 GB                   | Single-channel DDR3 and DDR2 800 MHz, up to 4 GB | Single-channel DDR3 and DDR2 800 MHz, up to 4 GB |
| Manufacturing Process        | 45 nm                                                      | 45 nm                                        | 45 nm                                        |
| Processor Package Size       | 13.8 mm x 13.8 mm                                          | 22 mm x 22 mm                                | 22 mm x 22 mm                                |
| Intel® Express Chipset       | SM35                                                       | NM10                                         | NM10                                         |
| Chipset Package Size         | 14 mm x 14 mm                                              | 17 mm x 17 mm                                | 17 mm x 17 mm                                |









#### **Atom on Phone Processors**











#### **Die Photo**



- L1 (32 KB I-cache, 24 KB D-cache)
- L2 cache (512 KB)
- Hardware prefetcher
- In-order processor









#### **Data Path**



- 16-stage pipeline
- 128-bit data path
- Hyper-threading (SMT): 2-way support







## **Power Management**

- Idle power management
- Aggressive power gating
- SpeedStep
- C-state/C-mode







### Power 101: Power Dissipation in CMOS



$$P_{tot} = P_{dyn} + P_{sta} = C_L V_{dd}^2 f + V_{dd} I_{leak}$$
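As a rough illustration of the formula above, the following sketch evaluates the dynamic and static terms separately. The function name `cmos_power` and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements of any Atom part.

```python
def cmos_power(c_load, v_dd, freq, i_leak):
    """Return (dynamic, static, total) power in watts.

    Implements P_tot = C_L * V_dd^2 * f + V_dd * I_leak.
    c_load in farads, v_dd in volts, freq in hertz, i_leak in amperes.
    """
    p_dyn = c_load * v_dd ** 2 * freq   # switching (dynamic) power
    p_sta = v_dd * i_leak               # leakage (static) power
    return p_dyn, p_sta, p_dyn + p_sta


# Hypothetical operating point: 1 nF switched capacitance, 1.0 V, 1.5 GHz, 0.2 A leakage.
p_dyn, p_sta, p_tot = cmos_power(c_load=1e-9, v_dd=1.0, freq=1.5e9, i_leak=0.2)
print(f"dynamic={p_dyn:.2f} W, static={p_sta:.2f} W, total={p_tot:.2f} W")
```

Note that lowering the frequency alone only reduces the first term, while lowering the supply voltage reduces both.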









## SpeedStep = Dynamic Frequency Scaling

$$P_{tot} = P_{dyn} + P_{sta} = C_L V_{dd}^2 f + V_{dd} I_{leak}$$

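- $f \propto V_{dd}$, so lowering the voltage together with the frequency gives $P_{dyn} \propto V_{dd}^3$
- Setting a different frequency changes the power consumption
- Saves idle power
- The available operating points are called P-states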

**Table 3.6** Intel® Atom™ Processor N450 P-states

| Performance State | Clock Speed |
|-------------------|-------------|
| P0                | 1.67 GHz    |
| P1                | 1.33 GHz    |
| P2                | 1.00 GHz    |
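To get a feel for what the P-states in Table 3.6 could mean for power, the sketch below applies the slide's $f \propto V \rightarrow P \propto V^3$ relation to the three listed clock speeds. It assumes that voltage scales linearly with frequency across all three states and it ignores static power; neither assumption is confirmed for the N450. The names `P_STATES_GHZ` and `relative_dynamic_power` are made up for this example.

```python
# Clock speeds from Table 3.6 (Intel Atom Processor N450 P-states).
P_STATES_GHZ = {"P0": 1.67, "P1": 1.33, "P2": 1.00}


def relative_dynamic_power(freq_ghz, ref_ghz):
    """Dynamic power relative to the reference state, assuming V scales with f.

    With P_dyn = C * V^2 * f and V proportional to f, P_dyn scales as f^3.
    """
    return (freq_ghz / ref_ghz) ** 3


ref = P_STATES_GHZ["P0"]
relative = {name: relative_dynamic_power(f, ref) for name, f in P_STATES_GHZ.items()}
for name, ratio in relative.items():
    print(f"{name}: {P_STATES_GHZ[name]:.2f} GHz -> {ratio:.2f}x dynamic power of P0")
```

Under these assumptions, dropping from P0 to P1 roughly halves dynamic power while giving up about 20% of the clock speed.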



Thermal control (dynamic thermal management)







# **Power Gating**



- A sleep signal turns off the supply voltage to the gated block
- Saves both dynamic power and leakage power

Microarchitectural Techniques for Power Gating of Execution Units, Hu et al.
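Whether gating a block pays off depends on how long it stays idle. The sketch below is a deliberately idealized model, not the method of the cited paper: it assumes a gated block consumes no power while asleep and that waking it costs a fixed energy overhead. The function names and all numbers are hypothetical.

```python
def idle_energy(p_dyn_idle, p_leak, idle_s, gated, wake_energy=0.0):
    """Energy in joules spent over an idle interval of idle_s seconds.

    Ungated: the block keeps burning idle dynamic power plus leakage.
    Gated (idealized): nothing is consumed while asleep, only the wake-up cost.
    """
    if gated:
        return wake_energy
    return (p_dyn_idle + p_leak) * idle_s


def break_even_time(p_dyn_idle, p_leak, wake_energy):
    """Idle duration beyond which gating saves energy in this model."""
    return wake_energy / (p_dyn_idle + p_leak)


# Hypothetical block: 50 mW idle dynamic power, 150 mW leakage, 2 uJ wake-up cost.
t_be = break_even_time(p_dyn_idle=0.05, p_leak=0.15, wake_energy=2e-6)
print(f"break-even idle time: {t_be * 1e6:.1f} us")
```

Idle periods shorter than the break-even time cost more energy with gating than without it, which is why the decision of when to assert the sleep signal matters.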









# **Power Gating with Ground**







- Longer wake-up time, lower leakage power consumption
- Provides multiple sleep modes









## **Clock Gating**

- Adds additional logic to a circuit to prune the clock tree
- Simplest gating mechanism
- Reduces dynamic power consumption
- Power up delay (timing prot
- Variations in current
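The effect of pruning the clock tree can be sketched by counting how many clock edges actually reach a register. This is a behavioral toy model with hypothetical names (`clocked_edges`, `enable_trace`), under the assumption that dynamic clock power is proportional to the number of delivered edges.

```python
def clocked_edges(enable_trace, gated):
    """Count clock edges delivered to a register over a cycle-by-cycle trace.

    enable_trace holds one 0/1 value per cycle: 1 means the register has
    new data to capture. Without gating every cycle delivers an edge;
    with gating only enabled cycles do.
    """
    if gated:
        return sum(1 for enable in enable_trace if enable)
    return len(enable_trace)


# Hypothetical activity: the register is written in 3 of 8 cycles.
trace = [1, 0, 0, 1, 0, 0, 0, 1]
ungated = clocked_edges(trace, gated=False)
gated = clocked_edges(trace, gated=True)
print(f"edges without gating: {ungated}, with gating: {gated}")
print(f"clock activity removed: {1 - gated / ungated:.1%}")
```

Leakage is untouched in this model because the supply stays on, which matches the slide's point that clock gating addresses dynamic power only.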











### **C-states**

| Mode   | Name                  | Description                                                                  |
|--------|-----------------------|------------------------------------------------------------------------------|
| C0     | Operating State       | CPU fully turned on                                                          |
| C1     | Halt                  | Stops the CPU clock via software; bus interfaces keep running                |
| C1E    | Enhanced Halt         | Stops the CPU clock via software and reduces voltage; bus interfaces keep running |
| C2E    | Extended Stop Grant   | Similar to C1E, but entered via hardware                                     |
| C3     | Sleep                 | Stops all CPU internal clocks                                                |
| C4     | Deeper Sleep          | Reduces CPU voltage                                                          |
| C4E/C5 | Enhanced Deeper Sleep | Reduces CPU voltage and turns off the memory cache                           |
| C6     | Deep Power Down       | Reduces CPU voltage close to 0                                               |









## **Dynamic Cache Sizing**



Control the number of active ways
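A small sketch of what controlling the number of ways does to usable capacity. It uses the 512 KB L2 size from the die-photo slide, but the associativity of 8 is an assumption made for the example, and `active_capacity_bytes` is a hypothetical helper.

```python
def active_capacity_bytes(total_bytes, total_ways, active_ways):
    """Usable cache capacity when only some ways are powered.

    Every set keeps the same index bits; each set simply holds fewer
    lines, so capacity shrinks in proportion to the active ways.
    """
    if not 0 <= active_ways <= total_ways:
        raise ValueError("active_ways must be between 0 and total_ways")
    return total_bytes * active_ways // total_ways


L2_BYTES = 512 * 1024   # L2 size from the slides
L2_WAYS = 8             # assumed associativity, for illustration only

for ways in (8, 4, 2, 0):
    kib = active_capacity_bytes(L2_BYTES, L2_WAYS, ways) // 1024
    print(f"{ways} active ways -> {kib} KB usable")
```

Powered-down ways stop leaking, at the cost of a higher miss rate once the workload's working set no longer fits.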











## **Before Going to Sleep**



Figure legend: dirty block

- Is this OK?
- What about Multiple caches?



- Flush the cache → generate write-back requests → sleep
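The flush sequence can be sketched as a walk over the cache that writes every dirty line back before the array is powered down. This is a simplified software model with hypothetical names (`CacheLine`, `flush_before_sleep`); backing memory is modeled as a dictionary keyed by tag, and real hardware performs this in the cache controller.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    tag: int
    data: int
    valid: bool = False
    dirty: bool = False


def flush_before_sleep(lines, memory):
    """Write back dirty lines, invalidate everything, return the write-back count."""
    writebacks = 0
    for line in lines:
        if line.valid and line.dirty:
            memory[line.tag] = line.data   # write-back request to memory
            writebacks += 1
        line.valid = False                 # contents are lost once power is cut
        line.dirty = False
    return writebacks


memory = {0x10: 1, 0x20: 2}
cache = [
    CacheLine(tag=0x10, data=99, valid=True, dirty=True),   # modified copy
    CacheLine(tag=0x20, data=2, valid=True, dirty=False),   # clean copy
]
print("write-backs issued:", flush_before_sleep(cache, memory))
print("memory after flush:", memory)
```

Skipping the write-back step would silently discard the modified copy of `0x10`, which is the problem the slide's "Is this OK?" question points at.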









#### **Atom's Power States**

- Atom can put any thread into the C1, C2, or C4 state
- C4 and C4E support dynamic cache sizing
- PLL (Phase Locked Loop): interface logic

**Table 3.5** Intel® Atom™ Processor Z6xx Power States

|             | C0 HFM | C0 LFM | C0 ULFM | C1/C2   | C4            | C6   |
|-------------|--------|--------|---------|---------|---------------|------|
| Core clock  | On     | On     | On      | Off     | Off           | Off  |
| PLL         | On     | On     | On      | On      | Off           | Off  |
| L1 cache    | Active | Active | Active  | Flushed | Flushed       | Off  |
| L2 cache    | Active | Active | Active  | Active  | Partial flush | Off  |
| Wakeup time | Active | Active | Active  | Short   | Medium        | Long |
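The trade-off in Table 3.5 (deeper states save more but wake up more slowly) can be sketched as a simple selection rule: pick the deepest state whose wake-up latency still fits a given budget. The latencies below are placeholders standing in for the table's "Short", "Medium", and "Long" entries, not datasheet values, and `pick_state` is a hypothetical helper rather than how Atom or any operating system actually decides.

```python
# (state, wake-up latency in microseconds), ordered from shallowest to deepest.
# The latency numbers are hypothetical placeholders for Short/Medium/Long.
STATES = [
    ("C0", 0.0),
    ("C1/C2", 1.0),
    ("C4", 100.0),
    ("C6", 1000.0),
]


def pick_state(wake_budget_us):
    """Return the deepest state whose wake-up latency fits within the budget."""
    chosen = STATES[0][0]
    for name, wake_us in STATES:
        if wake_us <= wake_budget_us:
            chosen = name
    return chosen


for budget in (0.5, 50.0, 100.0, 5000.0):
    print(f"budget {budget:>7.1f} us -> {pick_state(budget)}")
```

A tight latency budget keeps the core in a shallow state with caches intact, while a generous one allows C6, where the table shows both caches off.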









## **Friday**

- Progress Meeting
- Each team 6 min

