There was a time when I thought the Entity Relationship (ER) language was past its prime. But I now realize I was wrong, really wrong. The ER language feels stronger than it has been for a long time.
In this post I try to write down why I have changed my mind. Let’s see if you agree!
Brief introduction to the ER language
Originally proposed by Peter Chen in 1976 in the paper The entity-relationship model—toward a unified view of data, the ER language quickly gained massive adoption in the emerging (at that time) database community. Indeed, the ER model became the de facto standard for designing databases with many vendors (beginning with most of the companies selling relational database management systems) offering modeling environments for ER diagrams with the possibility to automatically translate such models to SQL DDL scripts.
As the name suggests, the key elements of the ER language are entities and relationships. Entities are the equivalent of objects in the Object-Oriented world (i.e. a specific person, product or event) and relationships are the links between these entities. Similar entities are grouped into entity sets, also known as entity types. Similarly, relationships are grouped into relationship types. Clearly, entity types resemble the concept of classes in OO terminology, while relationship types would be the equivalent of associations. Entity types can also have attributes. And both attributes and relationship types may be restricted by multiplicity constraints.
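To make these concepts concrete, here is a minimal sketch of how they could be represented in Python. The class names (`Attribute`, `EntityType`, `Role`, `RelationshipType`), the `allows` helper and the Employee/Department example are all my own illustrative assumptions, not part of any ER standard or tool. I also assume one particular reading of multiplicities (how many links a single entity in that role may take part in), since notations differ on this.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    name: str
    datatype: str


@dataclass
class EntityType:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Role:
    """One end of a relationship type, with its multiplicity bounds."""
    entity_type: EntityType
    min_card: int                 # minimum number of links per entity
    max_card: Optional[int]       # None stands for "many" (unbounded)


@dataclass
class RelationshipType:
    name: str
    roles: List[Role]

    def allows(self, role_index: int, count: int) -> bool:
        """Check whether `count` links respect the bounds of the given role."""
        role = self.roles[role_index]
        upper_ok = role.max_card is None or count <= role.max_card
        return role.min_card <= count and upper_ok


# Hypothetical example: every employee works for exactly one department,
# while a department may have any number of employees.
employee = EntityType("Employee", [Attribute("name", "string")])
department = EntityType("Department", [Attribute("title", "string")])
works_for = RelationshipType(
    "WorksFor",
    [Role(employee, 1, 1), Role(department, 0, None)],
)
```

Calling `works_for.allows(0, 2)` asks whether an employee may be linked to two departments, which the 1..1 bound rules out.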
Several extensions to the original ER language were proposed, commonly known as the EER family of languages (“Extended Entity Relationship”). One such key extension added inheritance relationships between entity types.
Given this limited set of modeling concepts, it is easy to see that the mapping from such models to relational database schemas was mostly a straightforward generation. At a time when many vendors were not yet strictly following the SQL standard, the main challenge for ER editors was to generate SQL scripts for all the different flavors of SQL that existed.
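To give an idea of how direct this mapping can be, here is a small sketch in Python. It assumes the simplest possible strategy: one table per entity type, and a foreign key on the "many" side of a many-to-one relationship type. The function names, the table and column names, and the SQL types are hypothetical choices of mine, and a real tool would also have to handle the dialect differences mentioned above.

```python
def entity_to_ddl(name, attributes, key):
    """Render one entity type as a CREATE TABLE statement.

    attributes: dict mapping column name -> SQL type (assumed already chosen)
    key: name of the column used as primary key
    """
    cols = [f"  {col} {sqltype}" for col, sqltype in attributes.items()]
    cols.append(f"  PRIMARY KEY ({key})")
    return f"CREATE TABLE {name} (\n" + ",\n".join(cols) + "\n);"


def many_to_one_ddl(child, fk_column, parent, parent_key):
    """Render a many-to-one relationship type as a foreign key on the child."""
    return (
        f"ALTER TABLE {child} ADD FOREIGN KEY ({fk_column}) "
        f"REFERENCES {parent} ({parent_key});"
    )


# Hypothetical model: many employees work for one department.
department_ddl = entity_to_ddl(
    "department", {"id": "INTEGER", "name": "VARCHAR(100)"}, "id"
)
employee_ddl = entity_to_ddl(
    "employee",
    {"id": "INTEGER", "name": "VARCHAR(100)", "department_id": "INTEGER"},
    "id",
)
works_for_ddl = many_to_one_ddl("employee", "department_id", "department", "id")

print(department_ddl)
print(employee_ddl)
print(works_for_ddl)
```

Many-to-many relationship types and inheritance need extra decisions (a join table, or one of several table-per-type strategies), which is why I said "mostly" straightforward.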
The dark years of ER
ER always had its community, especially in the database world, but it lost its appeal in the broader software development community.
IMHO for two reasons:
- The arrival of UML. With UML you can model whatever you can model with the ER language (or almost) and much more, as UML lets you define all the dimensions of your software project and not “just” your domain data (even if, to me, this is still the core element).
- The NoSQL trend. With NoSQL databases, there is no fixed schema, at least in theory. I like to say that NoSQL is not schemaless; at most, we can say it has “less schema” than other kinds of databases. But still, creating database schemas went out of fashion.
The resurrection of the Entity Relationship language
After a few years, people understood that UML had many qualities but came at a price: as a language, it was far more complex than ER. So if you just wanted to create a database, or build an application that was basically a web wrapper on top of a database (which could be easily generated from the database itself), maybe it was a better idea to drop the complexity of UML and stick to the simpler ER language.
It also soon became obvious that NoSQL vendors were adding SQL support, as SQL was the one language that everybody knew. So even if internally the data was stored using a variety of NoSQL strategies, they brought back the “illusion” of a partial schema that could, again, be modeled.
Moreover, a clear sign that a language is getting more and more interest is the creation of new modeling tools for it.
And I have witnessed this around the ER language. After a period in which only some of the classical database design tools were around (e.g. PowerDesigner, which I already used over 20 years ago), we are now seeing plenty of new tooling initiatives, including online ER tools, textual ER modeling tools and even an ER plugin for VS Code.
I would even dare to add that the explosion of Machine Learning (ML) has also helped, as it has given a lot of importance to the datasets required to train the ML models. And where you have datasets, you have the need to describe such datasets (and nothing more, so, again, no need to use a full-fledged modeling language like UML for that).
All in all, given the clear role of data and domain models in all types of initiatives and the better understanding we all have about the strengths and drawbacks of ER compared to other languages, I think ER will remain popular for a long time. True, the hardcore software engineering community will probably never adopt it, but we sometimes tend to forget that there are many other communities around us that have different needs and for which the ER language could be a perfect fit.