Category Uncategorized

Consensus Algorithms

Ever tried to arrange anything with friends via group chat? Messages arrive out of order, some people don’t respond, and others change their minds. Now swap friends for computers and scale this to thousands of nodes trying to agree on…

Liquid Clustering

This form of clustering removes the need for Z-Ordering and partitioning, thus simplifying database layouts and increasing query performance. Databricks actually quotes 10x query performance gains using this approach, so it clearly warrants a closer look. The technology uses Predictive…

Data Governance

As your organisation grows, so does the overhead of managing and controlling the data. In large organisations, it is one of the most complex and time-consuming activities, requiring teams of specialists to oversee and manage. Data governance is the…

Databricks Indexing

Databricks (like Snowflake) doesn’t rely on traditional B-trees, because it’s built on a cloud-native, columnar, distributed file architecture. It avoids B-trees entirely because the cost of maintaining per-row index structures would destroy the scalability benefits of its append-only, distributed Parquet…