Cell

Someone at IBM thought that multicore programming was not as hard as it needed to be when they made this one.

The Cell Broadband Engine is a PowerPC-based processor used in the PS3. It is made up of one Power Processing Element and eight Synergistic Processing Elements, all running at 3.2 GHz.

Power Processing Element (PPE)

The PPE is your everyday 64-bit, dual-threaded PowerPC processor with the Vector Multimedia Extension (VMX). It has a 32KB L1 instruction cache, a 32KB L1 data cache and a 512KB L2 cache. C++ code can be compiled for the Cell without any change, but it will then only run on the PPE.

Synergistic Processing Elements (SPE)

Each SPE is a special-purpose processor with its own instruction set, focused on SIMD capability. It is made up of the Synergistic Processing Unit (SPU), a Memory Flow Controller (MFC) and 256KB of local storage memory. Most documents about coding on the SPE use the name SPU. The local storage holds both the code and the data used by the SPE. The SPE cannot access main memory directly, so all data has to be moved into the local storage by DMA operations. The SPE instructions operate on 128-bit SIMD registers and are not the same as the PPE instructions.
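To make the local storage and DMA idea a bit more concrete, here is a minimal portable sketch that simulates it on an ordinary machine. It is not the Cell SDK API: `dma_get`, `dma_get_large` and the constants are hypothetical names made up for this illustration, both memories are plain byte vectors, addresses are offsets into them, and the copy is a synchronous `memcpy`. It only models the common case where a transfer is a multiple of 16 bytes, at most 16KB, with both addresses 16-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Figures modelled by this sketch (names are hypothetical).
constexpr std::size_t kLocalStoreSize = 256 * 1024; // 256KB of local storage per SPE
constexpr std::size_t kMaxDmaSize     = 16 * 1024;  // largest single transfer modelled here
constexpr std::size_t kDmaAlign       = 16;         // alignment required for size and addresses

// Simulates one "get" transfer: main memory -> local storage.
// Returns false if the request breaks one of the rules modelled above.
bool dma_get(std::vector<unsigned char>& local_store, std::size_t ls_offset,
             const std::vector<unsigned char>& main_memory, std::size_t ea_offset,
             std::size_t size)
{
    if (size == 0 || size > kMaxDmaSize) return false;
    if (size % kDmaAlign != 0) return false;
    if (ls_offset % kDmaAlign != 0 || ea_offset % kDmaAlign != 0) return false;
    // Bounds checks, written so they cannot overflow.
    if (ls_offset > local_store.size() || size > local_store.size() - ls_offset) return false;
    if (ea_offset > main_memory.size() || size > main_memory.size() - ea_offset) return false;

    std::memcpy(local_store.data() + ls_offset, main_memory.data() + ea_offset, size);
    return true;
}

// Splits a bigger transfer into pieces of at most kMaxDmaSize bytes.
bool dma_get_large(std::vector<unsigned char>& local_store, std::size_t ls_offset,
                   const std::vector<unsigned char>& main_memory, std::size_t ea_offset,
                   std::size_t size)
{
    while (size > 0) {
        const std::size_t chunk = size < kMaxDmaSize ? size : kMaxDmaSize;
        if (!dma_get(local_store, ls_offset, main_memory, ea_offset, chunk)) return false;
        ls_offset += chunk;
        ea_offset += chunk;
        size      -= chunk;
    }
    return true;
}
```

On real hardware the transfer is queued on the MFC and runs asynchronously, so SPU code normally starts a transfer, does other work, and waits for completion before touching the data. The SDK header `spu_mfcio.h` provides `mfc_get` for this; the sketch above only imitates the size and alignment checks, not that interface.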

Why SPUs need aligned data - 2011

SPU Assisted Rendering

The Little Optimization that couldn't.

Reference