Speed of Light
Nsight focuses on hardware metrics: how well the hardware units and sub-units are utilized and how close each one runs to its maximum throughput. This is reported as a percentage of the unit's theoretical throughput, called Speed of Light (SOL). The SOL can be shown for a whole frame, a range of graphics API calls, or a single API call. For details about NVIDIA's GPU architecture, see the NVIDIA GPU page.
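As a rough illustration of what the SOL percentage expresses, the sketch below divides an achieved throughput by a theoretical peak for a few units. The function name `sol_percent` and all numbers are made up for illustration; they are not taken from any real capture, and the exact counters the tool uses internally are not reproduced here.

```python
def sol_percent(achieved: float, peak: float) -> float:
    """Return achieved throughput as a percentage of the theoretical peak."""
    if peak <= 0:
        raise ValueError("peak throughput must be positive")
    return 100.0 * achieved / peak


# Hypothetical (achieved, theoretical peak) pairs in arbitrary units.
# These values are illustrative only, not measurements.
units = {
    "SM": (620.0, 1000.0),
    "L1TEX": (450.0, 500.0),
    "VRAM": (180.0, 600.0),
}

sol = {name: sol_percent(a, p) for name, (a, p) in units.items()}

# The unit with the highest SOL% is the one running closest to its limit.
top_unit = max(sol, key=sol.get)

for name, pct in sorted(sol.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:6s} {pct:5.1f}% SOL")
```

Sorting by SOL% puts the unit closest to its limit first, which is usually where to start looking for a bottleneck.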
Stall reasons (smsp__warp_stall_*_pct)
- long_scoreboard: Warps stalled waiting for a scoreboard dependency on an L1TEX (local, global, surface, or texture) operation.
- short_scoreboard: Warps stalled waiting for a scoreboard dependency on an MIO (memory input/output) operation, for example special math instructions or dynamic branching.
- drain: Warps stalled after EXIT, waiting for all memory operations to complete so that the warp can be freed.
- imc_miss: Warps stalled waiting for an immediate constant cache miss.
- no_instructions: Warps waiting to be selected to fetch an instruction or waiting on an instruction cache miss.
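When several of these metrics are exported together, it can help to strip the common prefix and suffix and rank the stall reasons by percentage. The following is a minimal sketch under the assumption that the metrics are available as a name-to-value mapping; the helper `rank_stalls` and the sample values are hypothetical and do not come from a real profile.

```python
from typing import Dict, List, Tuple

PREFIX = "smsp__warp_stall_"
SUFFIX = "_pct"


def rank_stalls(metrics: Dict[str, float]) -> List[Tuple[str, float]]:
    """Extract stall reasons from metric names, sorted by percentage (highest first)."""
    stalls = []
    for name, value in metrics.items():
        # Keep only metrics matching smsp__warp_stall_*_pct.
        if name.startswith(PREFIX) and name.endswith(SUFFIX):
            reason = name[len(PREFIX):-len(SUFFIX)]
            stalls.append((reason, value))
    return sorted(stalls, key=lambda item: item[1], reverse=True)


# Hypothetical sample values, for illustration only.
sample = {
    "smsp__warp_stall_long_scoreboard_pct": 41.5,
    "smsp__warp_stall_short_scoreboard_pct": 12.0,
    "smsp__warp_stall_drain_pct": 0.5,
    "smsp__warp_stall_imc_miss_pct": 3.0,
    "smsp__warp_stall_no_instructions_pct": 8.0,
    "some_other_metric": 99.0,  # ignored: does not match the pattern
}

ranked = rank_stalls(sample)
for reason, pct in ranked:
    print(f"{reason:18s} {pct:5.1f}%")
```

In this made-up sample, long_scoreboard dominates, which would point toward L1TEX memory latency rather than instruction fetch or constant cache misses.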
GPU-Driven Rendering - 2016