Digital Twins

In industries like manufacturing and logistics, digital twins are dynamic digital replicas of physical systems. These are well established. For example, here is an NVIDIA presentation of how digital twins…

Medallion Architecture

Data architectures need organizing principles that help teams understand where data lives, what quality to expect, and how transformations progress. And what better than a catchy description? Without clear structure,…

Graph Databases

Most databases are designed around storing things. Relationships between these things exist, but are treated as secondary concerns, represented through foreign keys and join tables that the database supports. Queries…

Why Delta Lake

Data Lakes promised everything. Store all your data in one place, in any format, ready for any workload. The reality was horrific. Data Lakes became data swamps, filled with inconsistent…

The Logical Model

The backbone of any data modelling strategy. A backbone so critical, so important, so key that … it is almost always ignored. See my post that talks to this: So…

Edgar Codd’s 12 Rules

In the late 1960s, Edgar F. Codd, an Oxford-educated mathematician working at IBM’s research lab in San Jose, looked at the way data was being stored and thought “…this is…

Data Masking Strategies

Your production database contains millions of customer records with real names, addresses, credit card numbers, social security numbers, and medical histories. Your developers need realistic data to test new features.…

Master Data Management

Ask five different systems in your organization for information about customer number 12345, and you’ll likely get five different answers. The CRM shows one address, the order management system shows…

Centipede Schema

I only recently discovered this pattern for data model design. If you’ve heard of the centipede schema previously, then it’s probably as a warning rather than a recommendation. This rare…