NVIDIA

GeForce 256

This page is about GPUs from NVIDIA.

Maxwell - switch

2014

Architecture Overview

The Maxwell GPU is partitioned into multiple GPCs (Graphics Processing Clusters). Each GPC contains multiple SMs (Streaming Multiprocessors) and one raster engine. The SMs run the shader programs, and the raster engine turns triangles into pixels.

The parts of the SM that run shaders are called cores, and each SM has many of them. Threads run in groups of 32, so a vertex shader, for example, works on 32 vertices at the same time. Each group is called a warp, and more than one warp can be active on the same cores at once. Each cycle, the Warp Scheduler checks all the active warps to find one that is not stalled. A warp is stalled when it is waiting for something, for example a pixel shader waiting for a texture read to complete. The selected warp gets to execute some instructions, and then the Warp Scheduler might let another warp run. Switching between many warps is how the GPU works around the latency of certain operations, i.e. it does useful work even while part of its work is waiting. The number of warps active at the same time is called the SM occupancy. High occupancy gives the Warp Scheduler more warps to switch between, which lets it keep the cores as busy as possible.
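The latency hiding described above can be illustrated with a toy model. This is only a sketch, not how the real Warp Scheduler is implemented: the function name `simulate_sm`, the one-instruction-per-cycle issue rate, the "first ready warp wins" policy, and the numbers for `compute_instrs` and `stall_latency` are all assumptions made up for the illustration.

```python
def simulate_sm(num_warps, cycles=10_000, compute_instrs=4, stall_latency=20):
    """Toy model: return the fraction of cycles in which an instruction was issued.

    Assumptions (hypothetical, for illustration only):
    - the scheduler issues at most one instruction per cycle,
    - every warp runs `compute_instrs` instructions, then stalls for
      `stall_latency` cycles (think: waiting for a texture read),
    - the scheduler picks the first warp that is not stalled.
    """
    # ready_at[w] is the first cycle at which warp w is no longer stalled.
    ready_at = [0] * num_warps
    # remaining[w] counts instructions left before warp w stalls again.
    remaining = [compute_instrs] * num_warps
    issued = 0
    for cycle in range(cycles):
        for w in range(num_warps):
            if ready_at[w] <= cycle:
                issued += 1
                remaining[w] -= 1
                if remaining[w] == 0:
                    # The warp now waits for its long-latency operation.
                    ready_at[w] = cycle + 1 + stall_latency
                    remaining[w] = compute_instrs
                break  # only one issue slot per cycle in this model
    return issued / cycles


if __name__ == "__main__":
    for n in (1, 2, 3, 6, 8):
        print(n, "warps ->", round(simulate_sm(n), 3), "of cycles busy")
```

With these made-up numbers a single warp leaves the scheduler idle most of the time, while adding warps fills the gaps until every cycle has something to issue. Beyond that point, extra warps no longer help in this model.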

The register file holds 64K 32-bit registers and provides the registers for each thread. The more registers a thread needs, the fewer warps can run at the same time.
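As a rough sketch of that trade-off, the register file size and the warp size of 32 give an upper bound on how many warps fit. The function name `max_warps_by_registers` is made up for this example, the `hw_warp_limit` of 64 resident warps per SM is an assumption not stated on this page, and the calculation ignores the allocation granularity that real hardware applies, so treat the results as estimates.

```python
REGISTER_FILE_SIZE = 64 * 1024  # 32-bit registers per SM (from the text above)
WARP_SIZE = 32                  # threads per warp (from the text above)


def max_warps_by_registers(regs_per_thread, hw_warp_limit=64):
    """Estimate how many warps fit in the register file.

    hw_warp_limit is an assumed cap on resident warps per SM; real hardware
    also rounds register allocations up, which this sketch ignores.
    """
    regs_per_warp = regs_per_thread * WARP_SIZE
    return min(hw_warp_limit, REGISTER_FILE_SIZE // regs_per_warp)


if __name__ == "__main__":
    for regs in (32, 64, 128):
        print(regs, "registers/thread ->", max_warps_by_registers(regs), "warps")
```

Under these assumptions, doubling the registers used per thread halves the number of warps that can be resident, which in turn gives the Warp Scheduler fewer warps to switch between.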

SASS

The assembly code used by the SM is called SASS. It changes with each architecture, but it can still be useful to be able to read some of it.

SASS Instruction Set

Links

New GPU Features of NVIDIA's Maxwell Architecture - 2015

Life of a triangle - NVIDIA's logical pipeline - 2015

Don't be conservative with Conservative Rasterization - 2014

Maxwell Whitepaper

Performance Guidelines

GPU Architecture

Ampere (GeForce 3000) - 2020

NVIDIA Ampere Architecture In-Depth - 2020