Streaming data systems and relational databases each offer immense value. Streaming excels at delivering low-latency, real-time insights; relational databases give strong consistency, rich transactional guarantees, and mature tooling. Yet, integrating…

It first appeared in 1992, bundled with Microsoft Office, promising a way for anyone to build a database without writing a line of SQL. Three decades later, Microsoft Access is…

For decades, the B-tree has been the backbone of database indexing, enabling efficient lookups, inserts, and range queries in block-oriented storage systems. For more information, please see my post on…

We all understand the concept of a marketplace. But it’s usually for things we spend money to acquire. In today’s digital economy, data also has value. Real, tangible value…

The term data product has become ubiquitous in modern data organizations, but its meaning often remains fuzzy. Teams talk about building data products, while creating the same old dashboards, reports,…

I have previously had exposure to a Domain-Specific Language (DSL) using Xtext. The DSL was used to define context and models. It meant we were able to give the…

This is a notoriously challenging problem in the realm of data management. But why is it hard? It’s just dates and times. Data with timestamps is everywhere. Server logs record…

Data is everywhere, but meaning is scarce. Businesses increasingly struggle with silos, databases older than most people operating them (hello DB2), inconsistent or missing semantics, and limited discoverability. Knowledge Graphs…

Every piece of data in your system has a temperature, whether you’ve thought about it that way or not. Some data is accessed constantly, queried hundreds of times per second,…

DataFrames dominate modern data analysis. If I had a £ for every time I typed “import pandas as pd”… Whether it’s Pandas/Polars in Python, Spark DataFrames, or R’s original…

Data contracts are becoming the backbone of modern data architectures. As organisations shift from ad-hoc pipelines to product-oriented data ecosystems, they need guarantees: that data will arrive on time, in…

Databricks (like Snowflake) doesn’t rely on traditional B-trees, because it’s built on a cloud-native, columnar, distributed file architecture. It avoids B-trees entirely because the cost of maintaining per-row index structures…