
Data-oriented design

Data-oriented design (DOD) is a programming style that focuses on the fact that code ultimately runs on real hardware. While other paradigms try to create an easy mental model for the programmer, DOD aims to make it easy for the hardware to read and write the data. The problems it typically addresses are as follows.
  • The CPU has to wait for memory access when what it needs is not in the cache. The faster the CPU gets relative to memory, the more CPU cycles are wasted on waiting. This applies both to loading instructions and to loading data.
  • In multithreaded code, the CPU has to wait for resources held elsewhere to be unlocked before it can use them.
The goal is to maximize the flow of data, with no stalls spent waiting on anything. This is done as follows.
  • Focus on data. How is it read, transformed and written.
  • Usage pattern
    • Activity - How often a field is accessed. Frequently accessed fields are hot; rarely accessed fields are cold.
    • Correlation - How often fields are used together.
  • Split data into hot and cold. One option is a struct with the hot data that holds a pointer to the cold data. All the hot data can then live in one array and the cold data in another. If both are plain arrays addressed by the same index, the pointers can be skipped.
  • Improve locality by keeping data that is accessed together close to each other.
  • Watch compiler padding so that data structures do not bloat.
  • Linear data (contiguous and accessed sequentially) is best.
  • The source data and the native data do not need to be the same.
  • Linearize data at runtime.
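
The hot/cold split, the index-based parallel arrays, and the padding point above can be sketched roughly as follows in C++. This is only an illustration under assumptions: the particle fields and the names `ParticleHot`, `ParticleCold`, `Particles`, `integrate`, `Padded` and `Reordered` are hypothetical and do not come from any particular library, and the struct sizes mentioned in the comments are what common ABIs produce, not something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example data: a particle system.

// Hot data: read and written on every update.
struct ParticleHot {
    float x, y;    // position
    float vx, vy;  // velocity
};
// Four floats of equal alignment should leave no room for padding.
static_assert(sizeof(ParticleHot) == 4 * sizeof(float),
              "unexpected padding in ParticleHot");

// Cold data: touched rarely, for instance when inspecting a particle.
struct ParticleCold {
    std::string name;
    std::uint32_t creation_frame;
};

// Two parallel arrays. Element i of `hot` and element i of `cold`
// describe the same particle, so no pointer from hot to cold is needed.
struct Particles {
    std::vector<ParticleHot> hot;
    std::vector<ParticleCold> cold;

    // Appends one particle to both arrays and returns its shared index.
    std::size_t add(float x, float y, float vx, float vy,
                    std::string name, std::uint32_t frame) {
        hot.push_back({x, y, vx, vy});
        cold.push_back({std::move(name), frame});
        assert(hot.size() == cold.size());
        return hot.size() - 1;
    }
};

// Linear pass over the hot array only; the cold data is never loaded.
void integrate(Particles& p, float dt) {
    for (ParticleHot& h : p.hot) {
        h.x += h.vx * dt;
        h.y += h.vy * dt;
    }
}

// Padding illustration: same fields, different order.
// On common ABIs Padded is 12 bytes and Reordered is 8 bytes.
struct Padded {
    std::uint8_t a;
    std::uint32_t b;  // compiler usually inserts 3 bytes before this
    std::uint8_t c;   // and 3 bytes after this
};
struct Reordered {
    std::uint32_t b;
    std::uint8_t a;
    std::uint8_t c;
};
```

Using a shared index instead of a pointer keeps the hot struct small, so more particles fit in each cache line during `integrate`. The cost is that both arrays have to be kept in step whenever elements are added or removed.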


Data-Oriented Hash Table - 2015

Is Data-Oriented Design a Paradigm? - 2010
Musings on Data-Oriented Design - 2010
Be nice to your cache

The Story behind The Truth: Designing a Data Model - 2017