Category Databases

Galaxy Schema

Most introductions to dimensional modeling start with the star schema – a single fact table surrounded by dimensions. It’s clean, simple, and perfect for explaining the basics. But real businesses are rarely that simple, and you can model from a…

Star Schema

If you’ve ever worked with a data warehouse or business intelligence tool, then you’ve probably encountered a star schema. If you haven’t, then you might want to look into it… This elegant data modeling pattern has become the de facto standard…

The CAP Theorem

In the late 1990s and early 2000s, the rapid growth of the web changed database design forever. Traditional monolithic, single-node relational databases, the backbone of enterprise applications for decades, suddenly faced workloads that spanned continents, scaled to millions of concurrent…

Understanding Z-Ordering

When you index data, you don’t often think, “I wonder where all this data physically resides.” Or maybe you do, but anyway, when working with massive datasets, even the smartest indexes and metadata can’t help if your data is scattered…

Zone Maps

The fastest query is the one you don’t need to make. Zone maps are a lightweight indexing technique that lets the query engine skip large chunks of data by storing summary statistics for each data block. They’re simple, space-efficient, and…

Slowly Changing Dimensions

In the world of analytics and data warehousing, one of the trickiest challenges is keeping track of how things change over time. In operational systems, these changes often overwrite the old value without a second thought. But in analytical systems,…

Data Skipping

In large-scale analytics systems, scanning every record to answer a query is a recipe for slow performance and high cost. Data skipping is the technique that lets the engine read only the files or blocks that might contain relevant data, thus…

Spatial Databases

Spatial databases are designed to store, query, and manipulate data that represents objects in space — from cities and roads to oceans and underground utility lines. Unlike traditional databases that handle purely numeric or text data, spatial databases deal with…

Spatial Indexing

When you query a traditional (relational) database column, a B-tree index usually handles the search efficiently. Spatial data, however, is not just a single dimension; it’s at least two (latitude, longitude) and often more (3D coordinates, time, attributes). This makes spatial…

Rise and Fall(?) of the Data Warehouse

For nearly three decades, the enterprise data warehouse reigned supreme. It was the single source of truth, the vault where all valuable corporate data lived, cleaned, and neatly structured for consumption. Vendors promised that if you put everything in the…