Premature optimization is the root of bugs that you need to handle for 18 years.

Optimization is the process of improving the performance of a program. The two things one most commonly tries to improve are speed, so the program runs faster, and memory, so it uses less of it. It is often possible to trade one for the other, for example by calculating things in advance so the program runs faster but consumes more memory. The process of optimization is made up of the following three steps.
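As a rough illustration of trading memory for speed, the sketch below counts the set bits in a byte in two ways: with a loop, and with a table calculated in advance. The function names are made up for this example, and whether the table version is actually faster on a given CPU is something only a profiler can tell.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Counts set bits with a loop: no extra memory, more work per call.
inline int popcount_loop(std::uint8_t value) {
    int count = 0;
    while (value != 0) {
        count += (value & 1u) ? 1 : 0;
        value = static_cast<std::uint8_t>(value >> 1);
    }
    return count;
}

// Calculates all 256 answers in advance; this costs 256 bytes of memory.
inline std::array<std::uint8_t, 256> build_popcount_table() {
    std::array<std::uint8_t, 256> table{};
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = static_cast<std::uint8_t>(
            popcount_loop(static_cast<std::uint8_t>(i)));
    }
    return table;
}

// Same result as popcount_loop, but a single table read per call.
inline int popcount_table(std::uint8_t value) {
    // Built once, on first use (hypothetical design choice for this sketch).
    static const std::array<std::uint8_t, 256> table = build_popcount_table();
    return table[value];
}
```

Both functions return the same value for every input; only the cost in time and memory differs.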

Profile to find the hotspot

First one needs to find an area to improve. The best candidate is the one with the biggest potential to improve the overall performance. To find it one can use an external or an internal profiler. An internal profiler is built into the game itself, while an external profiler is a separate program that measures the game's performance. Either way, the profiler gives a measure of the current performance of that area. Common units are seconds (for speed) and bytes (for size), usually with a unit prefix such as milli or mega depending on the scale. When measuring speed, use seconds and not frames per second, because equal changes in frames per second do not correspond to equal amounts of time.
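A very small internal profiler could look something like the sketch below. It is only an assumption of how one might start; `Profiler` and `ScopedTimer` are hypothetical names, and the real profilers listed further down do far more. A timer object measures the lifetime of the scope it lives in and adds the elapsed seconds to a named area.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical minimal profiler: accumulates seconds per named area.
class Profiler {
public:
    void add(const std::string& area, double seconds) {
        totals_[area] += seconds;
    }

    // Total seconds recorded for an area, or 0 if it was never measured.
    double seconds(const std::string& area) const {
        auto it = totals_.find(area);
        return it == totals_.end() ? 0.0 : it->second;
    }

private:
    std::map<std::string, double> totals_;
};

// Measures the enclosing scope and reports the result when destroyed.
class ScopedTimer {
public:
    ScopedTimer(Profiler& profiler, std::string area)
        : profiler_(profiler),
          area_(std::move(area)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        profiler_.add(area_, elapsed.count());
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    Profiler& profiler_;
    std::string area_;
    std::chrono::steady_clock::time_point start_;
};
```

To use it, place `ScopedTimer timer(profiler, "update");` at the top of the block to measure, then read `profiler.seconds("update")` once the block has finished.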

Optimize it

Do whatever is needed to improve the speed, size or other measure of the hotspot.

Verify improvement

Run the tools (profiler, memory trackers) again to see if the performance has improved. If not, try again :).
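When comparing the measurement before and after, it helps to express the gain in seconds rather than frames per second. The sketch below (with hypothetical helper names) converts between the two. Going from 30 to 60 FPS saves about 16.7 ms per frame, while going from 300 to 330 FPS saves only about 0.3 ms, although both are a gain of 30 FPS.

```cpp
// Converts frames per second to seconds per frame.
inline double frame_seconds(double fps) {
    return 1.0 / fps;
}

// Seconds saved per frame when the frame rate goes from
// fps_before to fps_after. Negative if the game got slower.
inline double seconds_saved(double fps_before, double fps_after) {
    return frame_seconds(fps_before) - frame_seconds(fps_after);
}
```

For example, `seconds_saved(30.0, 60.0)` is 1/60 of a second, while `seconds_saved(300.0, 330.0)` is roughly 0.0003 seconds.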

Build Time

Optimizing this will cut into your fencing practice

The time it takes to compile and link your game is time wasted. Waiting for a build to complete can make you lose focus, and it can tempt you to avoid changes that would force a time-consuming recompilation.

Profile - Seconds

Clang Build Analyzer - 2019

time-trace: timeline / flame chart profiler for Clang - 2019

Investigating compile times, and Clang -ftime-report - 2019

Another cool MSVC flag: /d1reportTime - 2019

Best unknown MSVC flag: d2cgsummary - 2017

C++ Compilation: Lies, Damned Lies, and Statistics - 2019


Optimizing C++ Compilation: The Trouble With Templates - 2019

Reducing build times by 20 % with a one line change - 2019

C++ Compilation: Fixing It - 2019

Reduce Compilation Times With extern template - 2019

Compiler investigations - 2017

Even More Experiments with Includes - 2005

The Care and Feeding of Pre-Compiled Headers - 2005

Physical Structure and C++ – Part 1 and Part 2 - 2004

std::vector and Minimizing Includes - 2019

Introducing vcperf /timetrace for C++ build time analysis - 2020

Build C++ from source: Part 1/N - Improving compile times - 2020

Code Binaries Size

Coding for console, are we?

Profile - Bytes

Sizer - Win32/64 executable size report utility - 2014


One Simple Trick For Reducing Code Bloat - 2019

Executable Bloat – How it happens and how we can fight it - 2011

C++ Weekly - Ep 154 - One Simple Trick For Reducing Code Bloat

Development Time

No fencing for the design or art team

This is measured in seconds and is the time wasted waiting on the tools you use. It should be as low as possible for the optimal production of the game. One example: when an artist changes a texture, how long does it take from saving in the image software until the new texture can be seen in the game?

Profile - Seconds


A modern asset pipeline: 7 reasons to optimize content - 2015

CPU Speed

Run as fast as possible

Profile - Seconds

External: AMD CodeXL, AQtime, VTune

Internal: Tracy Profiler, Remotery, Minitrace, EasyProfiler, Brofiler

Simple instrumenting profiler - 2008

Analysing Stutter – Mining More from Percentiles - 2014


Why (most) High Level Languages are Slow - 2015

The Cost of Enabling Exception Handling - 2011

The Cost of Buffer Security Checks in Visual C++ - 2011

The Cost of _SECURE_SCL - 2011

Optimization 101: ordering conditions - 2010

GPU Speed



The less memory we use the better

Memory usage is measured in bytes, but not all memory is the same. Some of it might only be usable by the GPU, or there might be a small amount of memory that is faster.

Profile - Bytes



Access memory in the right order

This is about improving the memory access pattern of the code, and in that way also its speed. More information can be found under cache and data-oriented design.
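As a small sketch of what an access pattern is, the two functions below add up the same grid of values. The `Grid` type and function names are invented for this example. The first function reads the cells in the order they are stored in memory; the second jumps a whole row ahead between reads. On large grids the first is usually friendlier to the cache, but how big the difference is depends on the hardware and should be measured.

```cpp
#include <cstddef>
#include <vector>

// A width x height grid stored row by row in one contiguous vector.
struct Grid {
    std::size_t width;
    std::size_t height;
    std::vector<float> cells;  // cell (x, y) lives at cells[y * width + x]
};

// Visits the cells in the same order as they are laid out in memory.
inline double sum_row_major(const Grid& grid) {
    double sum = 0.0;
    for (std::size_t y = 0; y < grid.height; ++y) {
        for (std::size_t x = 0; x < grid.width; ++x) {
            sum += grid.cells[y * grid.width + x];
        }
    }
    return sum;
}

// Same work, but consecutive reads are `width` elements apart.
inline double sum_column_major(const Grid& grid) {
    double sum = 0.0;
    for (std::size_t x = 0; x < grid.width; ++x) {
        for (std::size_t y = 0; y < grid.height; ++y) {
            sum += grid.cells[y * grid.width + x];
        }
    }
    return sum;
}
```

Both functions return the same sum; only the order in which memory is touched differs.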



Content Size

Small is good

Many games are now downloaded and patched online, so the size of your game can be a download cost for your players. Size optimization is when you try to get the game package as small as possible. This can mean the size of the download, the storage used on the customer's computer, or the save file. Progress is measured in bytes.

Profile - Bytes

    • Common Limits
      • iOS Over-the-Air App Store limit: 100 MB
      • Google Play APK limit: 50 MB


    • Make a pass over all assets and make sure all of them are still in use. Old, expired ones often get left in the build.
    • Make a tool to list all assets by type and size, and verify that they do not use more memory than they need for the place they have in the game.
    • Common assets to look at are textures and sound.




Optimizations in C++ Compilers - 2019

Rules of optimization - 2018

The Elusive Frame Timing - 2018

Optimization and performance measurement - 2018

Profiling: The Case of the Missing Milliseconds - 2018

More performance, more gameplay - 2017

CppCon 2016: Nicolas Fleury "Rainbow Six Siege: Quest for Performance" - 2016

CppCon 2016: Jason Turner "Practical Performance Practices" - 2016

CppCon 2016: Timur Doumler "Want fast C++? Know your hardware!" - 2016

Taming the Jaguar: x86 Optimization at Insomniac Games - 2016

C++ Performance: Common Wisdoms and Common “Wisdoms” - 2016

Stop Misquoting Donald Knuth! - 2015

Understanding Compiler Optimization - Chandler Carruth - Opening Keynote Meeting C++ - 2015

Code Clinic : How to write code the compiler can actually optimize - 2015

Optimizing software in C++ - 2014

Looking For a Good Sort - 2014

The microarchitecture of Intel, AMD and VIA CPUs - 2014

Optimizing subroutines in assembly language - 2014

Vessel: Common Performance Issues - 2013

Don't Help the Compiler - 2013

A Profiling Primer - 2013

Optimisation lessons learned - 2012 Part 1, Part 2 and Part 3.

Visual C++ Performance Pitfalls - 2011

The Windows Heap Is Slow When Launched from the Debugger - 2011

Finding Bottlenecks by Random Breaking - 2011

Hotspots, FLOPS, and uOps: To-The-Metal CPU Optimization - 2011

Optimisation Lesson - 2011: 1: Profiling, 2: The Pipeline and 3: The Memory Bottleneck.

Optimizations that aren't - 2010

Writing Efficient Game Code for Next-Gen Console Architectures - 2005

OPT#1:Profiling - 2020


Optimizing Trilinear Interpolation

OPT#3:SIMD (part 1 of 2)

OPT#4:SIMD (part 2 of 2)

AMD Ryzen™ Processor Software Optimization - Video / Slides

Optimizing for the Radeon™ RDNA Architecture - Video / Slides

From Source to ISA: A Trip Down the Shader Compiler Pipeline - Video / Slides