Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

Is the answer AI? Ummm… not yet (correct at time of writing). Data engineering today looks remarkably different from five years ago. The role that emerged to build Hadoop clusters and write MapReduce jobs has evolved into something unrecognizable from its…

Memory is cheap, but it ain’t free. In the world of modern data engineering, compression is everywhere. It’s in your Parquet files, your Kafka messages, your database storage engine, your .txt file and your API responses. Yet despite its ubiquity,…

Writing this blog reminded me that I need a holiday. Anyway, there’s a class of data problem that shows up all over the place: On the surface, these seem like different problems. But they’re all the same problem in disguise:…

When storing and retrieving large volumes of ordered data, the goal is to keep search, insert, and delete operations efficient while staying balanced. B‑trees (Balanced Trees) do exactly this: maintain sorted keys, keep all leaves at the same depth, and…

Spatial databases are designed to store, query, and manipulate data that represents objects in space — from cities and roads to oceans and underground utility lines. Unlike traditional databases that handle purely numeric or text data, spatial databases deal with…

We don’t really think about where the data is physically stored. In the digital world, storage is the unsung hero. Data scientists, AI models and analytics tools often take centre stage, but none of them work without a place to…

When I first came across this algorithm I was intrigued. What did it mean? Working/playing with vector databases like Milvus, Pinecone, Weaviate, or Vespa, I saw HNSW mentioned in the index settings. I discovered this algorithm, Hierarchical Navigable Small World…

The encryption protecting your data today relies on mathematical problems that are practically impossible for classical computers to solve. Factoring large numbers, computing discrete logarithms, and similar hard problems form the foundation of RSA, ECC, and Diffie-Hellman, the cryptographic algorithms…

Another stroll back in time here to talk about the way we order data and how it affects how efficiently we can store, retrieve, and analyze it. Sorting a single-dimensional list is straightforward, but real-world data is often multi-dimensional –…

The AI boom has brought a new class of databases into the spotlight: vector databases. But in the background, triple stores have been quietly powering knowledge graphs and the Semantic Web for over two decades. At one of my previous…