{"id":8810,"date":"2024-01-28T22:06:55","date_gmt":"2024-01-28T22:06:55","guid":{"rendered":"https:\/\/modeling-languages.com\/?p=8810"},"modified":"2024-01-28T22:15:54","modified_gmt":"2024-01-28T22:15:54","slug":"data-models-and-ai","status":"publish","type":"post","link":"https:\/\/modeling-languages.com\/data-models-and-ai\/","title":{"rendered":"The perfect three-way: data, models and AI"},"content":{"rendered":"
In the last decade, we have witnessed an explosion of research on new architectures, training methods, fine-tuning strategies, etc. for machine learning (ML). But we are now entering a new phase where all these new approaches are becoming a commodity. Platforms like HuggingFace<\/a> do an outstanding job of making all the latest results accessible to everyone. And its exponential growth<\/a> confirms it.<\/p>\n Therefore, using the latest ML architectures alone is not a competitive advantage<\/strong> anymore. Instead, companies need to turn their attention to the data<\/strong> used for training the ML models. Better data turns into better ML<\/strong>. As simple as this. And the only way to evaluate the quality of the data is to understand it: both its underlying structure and its gathering and annotation process.<\/a><\/p>\n Data and data models are back in fashion<\/strong> thanks to the AI fever. But this also means that classical problems like data annotation, data mining, data fusion, data composition, etc., now in an ML context, must be revisited. For instance, ML often relies on big data sources that seem to be schemaless<\/a>. But this is not really true. At most, we can say they are \u201cless-schema<\/strong>\u201d than other data, and we may need to first infer the implicit schema behind that data to be able to interpret it. And we could discover more than one possible schema, as a schema is not always a static artefact. It’s rather a partial, dynamic and temporal view of the data that facilitates manipulating it at that specific instant.<\/p>\n I believe that the three-way relationship resulting from the interweaving of data, data models and AI reinforces each of them. Let’s look at a representative scenario for each combination:<\/p>\n These scenarios impose new requirements on the conceptual modeling field. 
In this new AI age, models are no longer a static element<\/strong> in the development process: they become dynamic, as they often need to change and evolve to remain aligned with the data (and the data drifts). They are also partial<\/strong> (as they may represent only parts of the data) and uncertain<\/strong> (as we may not be completely sure of how accurate they are, e.g. when they are automatically inferred).<\/p>\n But despite these challenges, conceptual models remain a key asset. A good example of this is the promotion of the common European data spaces<\/a> to facilitate data exchange among partners within a data domain. This exchange requires the partners to agree on a unified conceptual data model to ensure interoperability. It’s no surprise that modeling languages such as the Entity Relationship language are experiencing a revival<\/a>.<\/p>\n And let’s not forget, ML models are also models!<\/strong><\/p>\n Everything is a model – Jean B\u00e9zivin (On the unification power of models)<\/p><\/blockquote>\n This means that the conceptual modeling community has the chance to bring its expertise to the AI world<\/strong>, helping the AI community improve the way it represents, transforms, reuses and deploys ML artefacts. Looking forward to seeing how we can bring AI-based engineering to a whole new level thanks to our decades of expertise in conceptual modeling.<\/p>\n This reflection is part of the panel discussion AI-Driven Software Engineering \u2013 The Role of Conceptual Modeling<\/a> we had at ICSOFT<\/a> 2023. The panel was coordinated by Hans-Georg Fill, and my co-panelists were <\/span>Wolfgang Maass and Marten Van Sinderen.<\/p>\n<\/span>","protected":false},"excerpt":{"rendered":" In the last decade, we have witnessed an explosion of research on new architectures, training methods, fine-tuning strategies, etc. for machine learning (ML). But we are now entering a new phase where all these new approaches are becoming a commodity. 
Platforms like HuggingFace do an outstanding job in making all the latest results accessible to […]<\/p>\n","protected":false},"author":2,"featured_media":8813,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[609,17,49,78],"tags":[741,856],"hashtags":[],"_links":{"self":[{"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/posts\/8810"}],"collection":[{"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/comments?post=8810"}],"version-history":[{"count":7,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/posts\/8810\/revisions"}],"predecessor-version":[{"id":8818,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/posts\/8810\/revisions\/8818"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/media\/8813"}],"wp:attachment":[{"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/media?parent=8810"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/categories?post=8810"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/tags?post=8810"},{"taxonomy":"hashtags","embeddable":true,"href":"https:\/\/modeling-languages.com\/wp-json\/wp\/v2\/hashtags?post=8810"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}\n