

# INTEL INNOVATION IN DATA PROCESSING

Alexey Belogortsev | Technical Consultant EMEA

October 1, 2015

### IT Challenges: What to Worry About Next?





### Intel Data Center Hardware Building Blocks





### Server Roadmap

| Product line | Target workloads | Shipping | Listed specifications | Future |
| --- | --- | --- | --- | --- |
| Intel<sup>®</sup> Xeon Phi<sup>™</sup> | Optimized for highly parallel applications that utilize scale-out clusters and power-dense cores | Intel<sup>®</sup> Xeon Phi<sup>™</sup> Coprocessor | 61 cores, 1+ TFLOPs, 16GB GDDR | **Knights Landing**: highly integrated, bootable, 3+ TFLOPs, integrated Omni-Path |
| Intel<sup>®</sup> Xeon<sup>®</sup> E7 & Itanium<sup>®</sup> (not shown) | Targeted at mission-critical and storage applications that value a scale-up system with large memory capacity and advanced RAS. Itanium for additional OSes (HP-UX). | **Brickland Platform**: E7-8800/4800 v3 | 18 cores, 4S+, AVX2, DDR3/4, 9.6 GT/s, 22nm | Future Xeon E7 |
| Intel<sup>®</sup> Xeon<sup>®</sup> E5 | Targeted at a wide variety of server, storage, and networking applications that value a balanced system and performance/watt/cost | **Grantley-EP Platform**: E5-4600 v3 (4S), E5-2600 v3 | 18 cores, AVX2, DDR4, 9.6 GT/s, 22nm | Future Xeon E5 (4S), Future Xeon E5 |
| Intel<sup>®</sup> Xeon<sup>®</sup> E3 | Utilized for a variety of workloads that value entry capabilities or integrated graphics, including SMB servers, network security, storage archival, and media streaming | **Denlow Platform**: Broadwell | 4 cores, AVX2, DDR4, GT3 Gfx, 14nm | Future Xeon E3 Platform: Future Xeon E3 |
| Intel<sup>®</sup> Xeon<sup>®</sup> D | Targeted at mid-range network, storage, embedded IoT, and lightweight web applications that value fast cores and density | **Grangeville Platform**: Xeon-D 1500 | 8 cores, integrated 10 GbE & chipset, 14nm | Future Xeon D Platform: **Future Xeon D** |
| Intel Atom<sup>™</sup> | Targeted at entry networking, entry storage, and lightweight web applications that value low power and density | **Edisonville Platform**: Atom C2000 | 8 cores, integrated GbE & chipset, Quick Assist, 14nm | Future Atom Platform: Future Atom |




### Intel<sup>®</sup> Xeon Phi<sup>™</sup> Product Family



#### Available Today Knights Corner (KNC)

Intel® Xeon Phi™ x100 Product Family

- 22 nm process
- Coprocessor only
- >1 TF DP Peak
- Up to 61 Cores
- Up to 16GB GDDR5



#### TBA Knights Landing (KNL)

Intel® Xeon Phi™ x200 Product Family

- 14 nm process
- Host Processor & Coprocessor
- >3 TF DP Peak<sup>1</sup>
- Up to 72 Cores
- Up to 16GB HBM
- Up to 384GB DDR4<sup>2</sup>
- High BW memory
- Integrated Fabric

#### Future Knights Hill (KNH)

3<sup>rd</sup> generation

- 10 nm process
- Integrated Fabric (2<sup>nd</sup> Generation)
- In Planning...




All projections are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.

<sup>1</sup> Over 3 Teraflops of peak theoretical double-precision performance is preliminary and based on current expectations of cores, clock frequency and floating point operations per cycle.
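The footnote names three inputs to the peak figure: cores, clock frequency, and floating point operations per cycle. A minimal sketch of that arithmetic follows. The core counts come from the slide; the clock frequencies and the per-core FLOPs-per-cycle values are illustrative assumptions, not figures from this presentation.

```python
def peak_dp_tflops(cores: int, clock_ghz: float, dp_flops_per_cycle: int) -> float:
    """Peak theoretical double-precision TFLOPS.

    cores * GHz * FLOPs-per-cycle gives GFLOPS; divide by 1000 for TFLOPS.
    """
    return cores * clock_ghz * dp_flops_per_cycle / 1000.0


# Core counts are taken from the slide ("Up to 61 Cores", "Up to 72 Cores").
# Clock frequencies and FLOPs per cycle below are ASSUMED for illustration.
knc = peak_dp_tflops(cores=61, clock_ghz=1.2, dp_flops_per_cycle=16)
knl = peak_dp_tflops(cores=72, clock_ghz=1.4, dp_flops_per_cycle=32)

print(f"KNC (assumed inputs): {knc:.2f} TF DP peak")
print(f"KNL (assumed inputs): {knl:.2f} TF DP peak")
```

With these assumed inputs the results land in the ">1 TF" and ">3 TF" ranges quoted on the slide, but the real values depend on the shipping clock frequencies.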

## **INTEL® OMNI-PATH ARCHITECTURE IS CHANGING FABRIC ECONOMICS**



### Minimizes fabric cost, maximizes cluster compute capability

<sup>1</sup> Latency reductions based on Mellanox CS7500 Director Switch and Mellanox SB7700/SB7790 Edge switch product briefs posted on <u>www.Mellanox.com</u> as of July 1, 2015, compared to Intel measured port-to-port latency (100ns), calculated from the difference between a back-to-back osu\_latency test and an osu\_latency test through one switch hop. 10ns variation due to "near" and "far" ports on an Intel® OPA edge switch. All tests performed using Intel® Xeon® E5-2697v3 with Turbo Mode enabled. Cluster configuration is a 1024-node full bisectional bandwidth (FBB) Fat-Tree configuration (3-tier, 5 total switch hops), using a 48-port switch for Intel® Omni-Path cluster and 36-port switch ASIC for either Mellanox or Intel® True Scale clusters.

<sup>2</sup> Claim of up to ½ fewer switches based on a 1024-node full bisectional bandwidth (FBB) Fat-Tree configuration, using a 48-port switch for Intel® Omni-Path cluster and 36-port switch ASIC for either Mellanox or Intel® True Scale clusters.

<sup>3</sup> 2.3X claim based on 27,648 nodes in a cluster configured with the Intel® Omni-Path ASICs, as compared with a 36-port switch chip that can support up to 11,664 nodes.
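The node counts quoted in the footnotes (27,648 and 11,664) match the standard capacity formula for a three-tier full-bisection fat tree built from identical k-port switches, which supports k³/4 end nodes. The sketch below assumes that formula is how the figures were derived; the slide does not state the derivation, and the function name is hypothetical.

```python
def max_nodes_three_tier(radix: int) -> int:
    """Maximum end nodes in a three-tier full-bisection fat tree
    built from identical switches with `radix` ports (k^3 / 4)."""
    return radix ** 3 // 4


opa_nodes = max_nodes_three_tier(48)    # 48-port switch
other_nodes = max_nodes_three_tier(36)  # 36-port switch ASIC
ratio = opa_nodes / other_nodes

print(opa_nodes, other_nodes, round(ratio, 2))
```

Under this assumption the 48-port design reaches 27,648 nodes against 11,664 for a 36-port design, a ratio of about 2.37, consistent with the 2.3X claim.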



# **INTEL® OMNI-PATH ARCHITECTURE PRODUCT LINE COVERAGE**



### Top-to-bottom Intel<sup>®</sup> OPA product line coverage

HFI and switch ASICs that enable custom OEM solutions



# **CURRENT MEMORY HIERARCHY**



For illustration only.



# **NEW MEMORY HIERARCHY**



For illustration only. Potential future options are targets, subject to change without further notification.

# WHAT IS 3D XPOINT<sup>™</sup>?

#### **Crosspoint Structure**

Selectors allow dense packing and individual access to bits

#### **Breakthrough Material Advances**

Compatible switch and memory cell materials

#### **Scalable**

Memory layers can be stacked in a 3D manner

#### **High Performance**

Cell and array architecture that can switch states 1000x faster than NAND

## INTEL® OPTANE™ PRODUCTS

Based on Intel 3D XPoint™ Technology




