Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124

Honestly, we will do the conceptual and logical models later. Well, maybe. Probably.
Yeah, we won’t.
Data modelling is an art, but not everyone is an artist. We entrust this highly specialised domain to engineers with (sometimes) little training and just hope for the best. Delivery trumps design every time and we end up with hidden technical debt. But what debt…we delivered on time and it works in <insert DB of choice>, right?
You did. And it does. But when I look at the models I see numeric fields marked as strings and flags marked as integers. No primary keys. Why are there no labels? Why do all the field descriptions say “to be added later”? What does the field U_HOS_B mean? These are the fundamental components of a data model. They are not optional.
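The checks above can be automated. Here is a minimal sketch in Python, assuming the model metadata (declared type, description, a few sample values) has already been exported from the database. The `Column` and `Table` classes, the `lint_table` function, the placeholder list and the table name `LEGACY_TABLE` are all hypothetical names for illustration, not part of any modelling tool.

```python
from dataclasses import dataclass, field

# Assumed placeholder texts that count as "no description".
PLACEHOLDERS = {"", "tbd", "to be added later", "description to be added later"}
STRING_TYPES = {"VARCHAR", "STRING", "TEXT", "CHAR"}
INTEGER_TYPES = {"INT", "INTEGER", "BIGINT", "SMALLINT"}


@dataclass
class Column:
    name: str
    data_type: str                 # declared physical type, e.g. "VARCHAR"
    description: str = ""
    is_primary_key: bool = False
    sample_values: list = field(default_factory=list)


@dataclass
class Table:
    name: str
    columns: list


def _all_numeric_strings(values):
    """True if there are samples and every one is a string that parses as a number."""
    if not values:
        return False
    for value in values:
        if not isinstance(value, str):
            return False
        try:
            float(value)
        except ValueError:
            return False
    return True


def lint_table(table):
    """Return a list of human-readable problems found in the table definition."""
    problems = []
    if not any(c.is_primary_key for c in table.columns):
        problems.append(f"{table.name}: no primary key")
    for c in table.columns:
        declared = c.data_type.upper()
        if c.description.strip().lower() in PLACEHOLDERS:
            problems.append(f"{table.name}.{c.name}: description missing or placeholder")
        if declared in STRING_TYPES and _all_numeric_strings(c.sample_values):
            problems.append(f"{table.name}.{c.name}: numeric values stored as {declared}")
        if declared in INTEGER_TYPES and c.sample_values and set(c.sample_values) <= {0, 1}:
            problems.append(f"{table.name}.{c.name}: 0/1 flag stored as {declared}")
    return problems


# Hypothetical table; the sample values are made up for the example.
legacy = Table("LEGACY_TABLE", [
    Column("U_HOS_B", "VARCHAR", "to be added later", sample_values=["12.5", "7"]),
    Column("IS_ACTIVE", "INTEGER", "Whether the record is active", sample_values=[0, 1, 1]),
])
problems = lint_table(legacy)
for problem in problems:
    print(problem)
```

The sample-value checks are heuristics: a column that only ever holds 0 and 1 is probably a flag, but someone who knows the data still has to confirm it.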

What they have shipped is likely a ticking time bomb. Good luck to the next team that wants to use these models. We may be lucky: the new team might have someone who intuitively knows what the column U_HOS_B means (that’s a real column name by the way…I still have no idea what it does). If your database has one table, if it is never updated and everything always works, then cool. But that’s not a world most of us operate in.
It’s not good enough. What happens when we want to ship a Snowflake physical model to Databricks? We copy the already poor physical model and hand-write it all again. Why? Because we don’t have a logical model. If we did, we would simply forward engineer it to a Databricks physical model using a data modelling tool. I talk about logical models here:
Your organisation needs discipline. Your organisation needs standards and policies. You need to stop the deployment of poor quality data models. Someone needs to have the power to say “no”!
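To make the forward-engineering point concrete, here is a rough sketch of what a modelling tool does when it turns one logical model into two physical models. Everything in it is an assumption for illustration: the `Entity` and `Attribute` classes, the `forward_engineer` function, the `account` example and the type mapping are hypothetical, and the generated DDL is indicative only and has not been run against either platform.

```python
from dataclasses import dataclass

# Assumed mapping from platform-neutral logical types to physical types.
TYPE_MAP = {
    "snowflake": {
        "text": "VARCHAR",
        "integer": "NUMBER(38,0)",
        "decimal": "NUMBER(18,2)",
        "boolean": "BOOLEAN",
        "date": "DATE",
    },
    "databricks": {
        "text": "STRING",
        "integer": "BIGINT",
        "decimal": "DECIMAL(18,2)",
        "boolean": "BOOLEAN",
        "date": "DATE",
    },
}


@dataclass
class Attribute:
    name: str
    logical_type: str      # a key of TYPE_MAP[platform]
    description: str       # mandatory: no "to be added later"
    required: bool = False
    is_key: bool = False


@dataclass
class Entity:
    name: str
    attributes: list


def forward_engineer(entity, platform):
    """Render a CREATE TABLE statement for the given platform from a logical entity."""
    types = TYPE_MAP[platform]
    lines = []
    for a in entity.attributes:
        parts = [a.name, types[a.logical_type]]
        if a.required or a.is_key:
            parts.append("NOT NULL")
        comment = a.description.replace("'", "''")  # escape single quotes for SQL
        parts.append(f"COMMENT '{comment}'")
        lines.append("  " + " ".join(parts))
    keys = [a.name for a in entity.attributes if a.is_key]
    if keys:
        lines.append(f"  PRIMARY KEY ({', '.join(keys)})")
    body = ",\n".join(lines)
    return f"CREATE TABLE {entity.name} (\n{body}\n);"


# Hypothetical logical entity, defined once and rendered twice.
account = Entity("account", [
    Attribute("account_id", "integer", "Surrogate key for the account", is_key=True),
    Attribute("balance", "decimal", "Current ledger balance", required=True),
    Attribute("is_closed", "boolean", "True once the account is closed"),
])
snowflake_ddl = forward_engineer(account, "snowflake")
databricks_ddl = forward_engineer(account, "databricks")
print(snowflake_ddl)
print(databricks_ddl)
```

The point of the sketch is that names, keys and descriptions live in one place. Moving platform means changing a type mapping, not hand-writing the model again.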
In the age of AI, semantics are everything. If all the AI model has to go on is a table with 200 fields, all with labels that say “description to be added later”, then good luck and well done. You’ve just successfully weaponised chaos.
Fundamentally it all comes down to having the wrong people in the role and having management that just don’t see the problem. In the same way that longevity apparently makes people good managers, being able to write a few lines of Java makes someone a good data modeller.
How did we end up here? Well, there are a few reasons. The elephant in the room is the role of the architect. I have been all flavours of architect over the years: enterprise, solution, technical and now informational. And we need architects, yes, but first and foremost we need pragmatic architects. The best advice I was given years ago was:
…to be credible as an architect you need one foot in the ivory tower and one in the trenches

You should have a vision and a goal. But you need to bring the engineers and modellers along with you. Using words like ontology, semantics and taxonomy is great, but if you are the only one who understands the terms, then the message is already lost. It all comes down to a simple, shared understanding. In financial services the majority of work deals with tabular data of varying shapes and sizes. We know how to model this. We’ve known since the 1970s:
I used the words conceptual and logical models at the top of this post. I always assumed that everyone working in technology knows what they mean, why they exist and what we use them for. This is absolutely not the case. But in a room full of their peers, people just pretend to understand. You can’t assume anything.
Get everyone in the tent with a shared understanding.
For other examples, please see the blog I wrote on data literacy.