This is the first in a series of posts highlighting typical UML (or, more generally, modeling) errors I’ve seen, especially among my students. I think novices (and not-so-novices) can benefit from them. As always, suggestions are more than welcome.
To illustrate our first problem, imagine that you need to record information about the parents of the people stored in the system. Given this requirement, many would draw something like this:
At first sight, it seems like a reasonable solution, doesn’t it? It says that every person can have zero or more children, and that each person has exactly two parents (some refine this a little to accept people with just one parent). So, where is the issue?
The issue is that this model forces users to enter an infinite number of ancestors for every person they add to the system!
This is easy to see with an example. Let’s assume that we insert a person named Albert in the system. According to the model, we also need to record Albert’s parents in order to satisfy the minimum multiplicity condition in the association isParentOf. So we insert two additional people (e.g. Joseph and Martha) as Albert’s parents. Now, since Joseph and Martha are also instances of Person, we need to insert four additional people in the system (Joseph’s parents and Martha’s parents, i.e. Albert’s grandparents). Do you see it already? After that, we will have to insert the eight great-grandparents, then the 16 great-great-grandparents, and so on.
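The growth described above can be sketched in a few lines of Python. This is only an illustration: the function name `people_required` is hypothetical, and it assumes every recorded person needs exactly two distinct parents, with no ancestor shared between branches.

```python
def people_required(generations: int) -> int:
    """Count the people that must be stored when one person and all of
    their ancestors up to `generations` levels are recorded, assuming
    exactly two distinct parents per person."""
    total = 1  # the person we actually wanted to insert (e.g. Albert)
    for level in range(1, generations + 1):
        # 2 parents, 4 grandparents, 8 great-grandparents, ...
        total += 2 ** level
    return total


for generations in range(5):
    print(generations, people_required(generations))
```

However many generations are entered, the last one still lacks its own parents, so the multiplicity constraint is never satisfied by a finite, cycle-free population.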
The only way to stop the recursion is to say that somebody is (directly or indirectly) his/her own ancestor, which obviously does not make sense.
Solution: the easiest way to avoid the problem is to weaken the multiplicity condition. In the real world all people may have parents, but this cannot hold for the people recorded in the system. So a possible solution would be:
Remember: reflexive associations with a minimum multiplicity of 1 or more only make sense if the domain admits cycles among the instances of the class.
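To make the rule concrete, here is a minimal sketch that checks a finite population against a minimum-parents constraint and looks for ancestry cycles. The dictionary representation and the helper names `satisfies_minimum` and `has_ancestor_cycle` are assumptions made for this illustration; they do not come from any UML tool.

```python
from typing import Dict, List


def satisfies_minimum(parents: Dict[str, List[str]], minimum: int) -> bool:
    """True if every recorded person has at least `minimum` recorded parents."""
    return all(len(ps) >= minimum for ps in parents.values())


def has_ancestor_cycle(parents: Dict[str, List[str]]) -> bool:
    """True if somebody is, directly or indirectly, their own ancestor."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {person: UNSEEN for person in parents}

    def visit(person: str) -> bool:
        state[person] = IN_PROGRESS
        for parent in parents.get(person, []):
            if state.get(parent, UNSEEN) == IN_PROGRESS:
                return True  # reached a person already on the current path
            if state.get(parent, UNSEEN) == UNSEEN and visit(parent):
                return True
        state[person] = DONE
        return False

    return any(state[p] == UNSEEN and visit(p) for p in parents)


# A finite, cycle-free population: the oldest generation has no parents.
acyclic = {"Albert": ["Joseph", "Martha"], "Joseph": [], "Martha": []}
# A finite population where everybody has two parents, at the cost of cycles.
cyclic = {"A": ["A", "B"], "B": ["A", "B"]}

print(satisfies_minimum(acyclic, 2), has_ancestor_cycle(acyclic))
print(satisfies_minimum(cyclic, 2), has_ancestor_cycle(cyclic))
print(satisfies_minimum(acyclic, 0), has_ancestor_cycle(acyclic))
```

The acyclic population only becomes valid once the minimum is relaxed to 0, which is the weakening suggested in the solution.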
FNR Pearl Chair. Head of the Software Engineering RDI Unit at LIST. Affiliate Professor at University of Luxembourg.
Jordi – this problem only matters if your model is a design for a technical system. If your model is a model of a real-world domain, every person really does have (1 or) 2 parents, and as we never have to instantiate them, the infinite regress doesn’t matter. As you move from a domain model to a design model, you somehow have to fix this, because the instances in an information system don’t have the same properties as real-world entities.
True. In fact, my point was exactly the one you mention: we need to be aware of the differences between the two perspectives when modeling the system (but you’ve explained the issue much better!).
To be honest, though, I don’t think most people modeling the infinite recursive version do so being aware that they’ll need to fix that later on when refining the models.
Jordi,
If “most people” are modelling infinitely recursive solutions it’s because they’ve been poorly trained.
I can’t think of a single practical use for such a model, and surely the whole point of kicking off this debate about using modelling in the real world is practicality.
Any recursive solution must be fully optional to be practical. Moreover, most recursive relationships have hidden information, for example mother of or father of, or even birth mother of, adoptive father of, foster father of; these need to be broken out into another entity, i.e. ‘relationship’. You can see here that your original supposition that someone can have 2 parents at most is fallacious.
When I was running training courses on data modelling back in the 80s, I always told people to be very suspicious of recursive relationships in physical models for these very reasons. They only belong in logical models looking at things at a high level, rather than in detailed solutions.
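As a rough sketch of what breaking the relationship out into its own entity could look like, the Python below models each parent link as a separate object carrying its kind. All class, field and enum names here (`Person`, `ParentRelationship`, `ParentKind`, `parents_of`) are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class ParentKind(Enum):
    BIRTH_MOTHER = "birth mother"
    BIRTH_FATHER = "birth father"
    ADOPTIVE_FATHER = "adoptive father"
    FOSTER_FATHER = "foster father"


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class ParentRelationship:
    """One link between a parent and a child; zero or more per person."""
    parent: Person
    child: Person
    kind: ParentKind


def parents_of(child: Person,
               relationships: List[ParentRelationship]) -> List[Tuple[Person, ParentKind]]:
    """Return every recorded parent of `child` together with the kind of link."""
    return [(r.parent, r.kind) for r in relationships if r.child == child]


albert = Person("Albert")
relationships = [
    ParentRelationship(Person("Martha"), albert, ParentKind.BIRTH_MOTHER),
    ParentRelationship(Person("Joseph"), albert, ParentKind.BIRTH_FATHER),
    ParentRelationship(Person("Henry"), albert, ParentKind.ADOPTIVE_FATHER),
]

for parent, kind in parents_of(albert, relationships):
    print(parent.name, "is the", kind.value, "of", albert.name)
```

With the link as a first-class entity, a person can have no recorded parents at all or more than two, and nothing forces further ancestors to be entered.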
You can use a tool like our UMLtoCSP verification tool
I would pretty much agree that most people model recursive relationships incorrectly due to poor training (or just lack of experience). However, I also can think of a practical use for the model as originally shown by Jordi. As previously commented, it depends on what you are modeling — or, perhaps more to the point, why you are modeling.
If your model is ontological, then it is about what “exists”, not just about what you “know”. Thus, according to the original model, every person has exactly two parents, whether you know what they are or not. This is useful, since it means that if your fact basis only includes information on one parent for a person, you can still deduce that the person has another parent — but that you are missing that information.
On the other hand, if your model is epistemological, then it is about what you “know”. In this case, the original model is not appropriate, since you can never know the parents of all people for all time. Making the multiplicity 0..2 still limits you to knowing no more than two parents for a person, but it also lets you know fewer.
Traditional software systems are epistemological: they “know” only about the finite amount of information represented in their databases (or whatever data structures they have), and class models are usually taken to specify a required structure for those databases. However, with the interest in semantic technologies these days, the ontological modeling approach also has practical uses.
The ontological approach may also be more appropriate for conceptual modeling, requiring care when transitioning to a design model, as previously noted. However, even conceptually, most recursive relationships are not actually infinite. The example I usually like to use is that of the supervisor of an employee, who is also an employee. Unless you allow cyclic supervision relationships, in any real company there generally has to be some top boss who does not have any supervisor, even ontologically!
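The difference between the two readings can be illustrated with a small sketch. It assumes the stored data follows the relaxed 0..2 multiplicity, while the ontological rule "every person has exactly two parents" is applied on top of it to infer what is missing. The constant and function names are hypothetical.

```python
from typing import List

# Ontological assumption: every person has exactly two parents in reality.
EXPECTED_PARENTS = 2


def missing_parent_count(known_parents: List[str]) -> int:
    """Return how many parents exist but are not yet known to the system.

    The stored list is epistemological (0..2 known parents); the result is
    an ontological deduction from the assumption above.
    """
    if len(known_parents) > EXPECTED_PARENTS:
        raise ValueError("more parents recorded than the model allows")
    return EXPECTED_PARENTS - len(known_parents)


print(missing_parent_count(["Martha"]))  # one parent is known, one is inferred
print(missing_parent_count([]))          # both parents exist but are unknown
```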
— Ed
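The supervisor example from the last comment can be sketched as follows. The optional supervisor corresponds to a 0..1 multiplicity, and the walk up the hierarchy rejects cyclic supervision. The function name `top_boss` and the employee names are made up for this illustration.

```python
from typing import Dict, Optional


def top_boss(supervisor_of: Dict[str, Optional[str]], employee: str) -> str:
    """Follow supervisor links upward until someone has no supervisor.

    Raises ValueError if the supervision chain loops back on itself.
    """
    seen = set()
    current = employee
    while supervisor_of.get(current) is not None:
        if current in seen:
            raise ValueError("cyclic supervision relationship")
        seen.add(current)
        current = supervisor_of[current]
    return current


# Carol supervises Ann, Ann supervises Bob; Carol has no supervisor.
company = {"Carol": None, "Ann": "Carol", "Bob": "Ann"}
print(top_boss(company, "Bob"))
```

Because the supervisor is optional, the chain ends at a finite depth, so the recursive relationship causes no infinite regress.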