After 20 years of research ( 🥴 ) in this field, in 2020 we (Gwendal Daniel and myself) thought it was time to eat our own dog food and transform one of our research projects (Xatkit) into a company. Overall, the adventure lasted for four years, including a first phase as a pure research project, then the company creation process (market study, product maturity, …) thanks to seed funding from the Catalan government, and finally the company incorporation and its dissolution at the end of 2022. While the end of the company was strongly related to personal changes, I think we learned a lot about model-driven engineering (MDE) along the way and I hope this kind of “post-mortem” recap of our model-driven company is also interesting for you!
You can read the full story in the paper “Applying model-driven engineering to the domain of chatbots: The Xatkit experience”, published in the Science of Computer Programming journal (free unedited version), or keep reading for the highlights. We don’t claim all our findings are surprising (indeed, the adoption of model-driven engineering is the topic of many discussions), but I do believe that, as creators of a commercial model-driven tool and of a model-driven company ourselves, we bring a different perspective from previous studies, which were mostly based on interviews with external professionals.
Key lessons learned
Overall, our main conclusions are that:
- Model-driven engineering can be successfully applied to new domains such as AI-based software. In particular, our company aimed to commercialize a model-driven framework for the definition and runtime execution of chatbots in a platform independent way.
- Model-driven tools are suitable for prototype creation but have limitations when it comes to developing industrial-strength solutions where more aggressive and “less-pure” solutions are needed for scalability and agility reasons.
- Commercial success of model-driven approaches depends on numerous factors beyond technical ones. While the benefits of software modeling still hold, they are not enough to guarantee commercial success, as clients may not regard them as key reasons to buy your product, e.g. they may not have a long-term view and therefore couldn’t care less about platform independence.
Next, we go deeper into some of these aspects, explaining how our initial bot framework and our company itself evolved to try to achieve product-market fit.
Technical-driven evolution
From a technical perspective, the framework went through some major disruptive evolutions after the initial release, once we started using Xatkit to build more than “toy bots” and began to realize the limitations of our initial design decisions.
Those initial decisions were mostly driven by our previous experience with other existing modeling frameworks and our knowledge of how they were built internally. In this sense, we are confident some of these evolutions and reflections could serve as informed recommendations for future MDE tool builders and influence their own design trade-offs.
Moving to an Internal DSL to avoid reinventing the wheel
While a pure stand-alone MDE infrastructure provided us a powerful toolkit to implement a first version of Xatkit, we quickly realized that the MDE toolkits and language workbenches we used were lacking seamless integration capabilities with more traditional software development techniques, languages and libraries.
As a consequence, evolving all the modeling artifacts required to improve the expressiveness of our DSL (e.g. adding new primitives to better support a new use case) was quite complex and time-consuming. And we had the feeling we were starting to reinvent the wheel. Indeed, once you start adding conditionals, iterators and other basic language constructs to your DSL when you could get them for free in other languages, it is time to rethink your choices.
Therefore, we decided to switch from an external DSL to an internal one, implemented as a Fluent Interface to isolate as much as possible the bot definition from other parts of the Java program. We’re not suggesting internal DSLs are always the best option, just saying the opposite is not true either. We’re always tempted to start from scratch and go “the external way” but this has a cost that for Xatkit was not worth paying.
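To make the idea of an internal DSL more concrete, here is a minimal sketch of a fluent interface for bot definitions in plain Java. All names (`BotBuilder`, `intent`, `trainingSentence`, `reply`, `respond`) are hypothetical and chosen for illustration; this is not Xatkit’s actual API, and exact string matching stands in for a real NLU engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical mini-DSL for illustration only; these names are not Xatkit's API.
final class Intent {
    final String name;
    final List<String> trainingSentences = new ArrayList<>();
    Function<String, String> reply = utterance -> "";

    Intent(String name) {
        this.name = name;
    }
}

final class Bot {
    private final List<Intent> intents;
    private final String fallback;

    Bot(List<Intent> intents, String fallback) {
        this.intents = intents;
        this.fallback = fallback;
    }

    // Naive case-insensitive exact matching stands in for a real NLU engine.
    String respond(String utterance) {
        String normalized = utterance.trim().toLowerCase(Locale.ROOT);
        for (Intent intent : intents) {
            for (String sentence : intent.trainingSentences) {
                if (sentence.toLowerCase(Locale.ROOT).equals(normalized)) {
                    return intent.reply.apply(utterance);
                }
            }
        }
        return fallback;
    }
}

final class BotBuilder {
    private final List<Intent> intents = new ArrayList<>();
    private Intent current;
    private String fallback = "Sorry, I did not understand.";

    static BotBuilder bot() {
        return new BotBuilder();
    }

    // Each method returns the builder so calls can be chained fluently.
    BotBuilder intent(String name) {
        current = new Intent(name);
        intents.add(current);
        return this;
    }

    BotBuilder trainingSentence(String sentence) {
        requireIntent();
        current.trainingSentences.add(sentence);
        return this;
    }

    BotBuilder reply(Function<String, String> reply) {
        requireIntent();
        current.reply = reply;
        return this;
    }

    BotBuilder fallback(String message) {
        this.fallback = message;
        return this;
    }

    Bot build() {
        return new Bot(new ArrayList<>(intents), fallback);
    }

    private void requireIntent() {
        if (current == null) {
            throw new IllegalStateException("call intent(...) first");
        }
    }
}

final class BotDslExample {
    // The conditional inside reply(...) is ordinary Java: no DSL primitive needed.
    static Bot greetingBot() {
        return BotBuilder.bot()
                .intent("Greeting")
                .trainingSentence("hi")
                .trainingSentence("hello")
                .reply(u -> u.length() > 3 ? "Hello to you too!" : "Hi!")
                .build();
    }
}
```

The point of the sketch is the last class: the conditional in the reply is a regular Java expression, so the DSL itself never had to grow a construct for it.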
Try to get your DSL primitives right as soon as possible
Evolving your DSL is expensive, especially when you follow a pure model-based approach where every evolution should start by evolving the metamodel, then regenerating the artefacts derived from it to finally evolve the code consuming those artefacts.
Our recommendation here is to spend some time building a variety of examples before settling on the abstract syntax for your DSL, to minimize as much as possible the need for an immediate evolution. It will need to evolve, for sure, but try at least to start with a solid and validated base.
Balancing a model-based perspective with optimized hand coded components
We designed Xatkit as an extensible framework, always keeping a platform-independent perspective so that all components extending the framework would offer a unified interface to facilitate the deployment of Xatkit bots across a number of platforms and input/output libraries. The goal here is to find the balance between staying at the modeling level and simply deploying your solution on top of existing components, or getting “your hands dirty” and building some components yourself on top of which to deploy your bots.
The two options are not contradictory as long as the bot language remains independent of the low-level solution, even your own, so that clients are free to deploy your bot on top of other third-party solutions. Make sure you avoid the temptation to take the easy route and hard-code your own integration directly into the modeling framework (it is faster to implement, as you do not need to respect all the interfaces external components have to follow, but it would break the platform-independent philosophy).
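As a rough illustration of that balance, the sketch below keeps the bot dependent only on an interface, so a hand-coded component and an adapter for an external provider are interchangeable. Every name here (`IntentRecognizer`, `KeywordRecognizer`, `RemoteNluAdapter`, `PortableBot`) is an assumption made for the example rather than part of Xatkit, and the "remote" adapter just wraps a function where a real one would call the provider’s SDK.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical contract: everything the bot layer expects from any NLU backend.
interface IntentRecognizer {
    Optional<String> recognize(String utterance);
}

// Hand-coded, in-house component. It follows the same contract as any other backend.
final class KeywordRecognizer implements IntentRecognizer {
    private final Map<String, String> keywordToIntent;

    KeywordRecognizer(Map<String, String> keywordToIntent) {
        this.keywordToIntent = Map.copyOf(keywordToIntent);
    }

    @Override
    public Optional<String> recognize(String utterance) {
        String lower = utterance.toLowerCase(Locale.ROOT);
        return keywordToIntent.entrySet().stream()
                .filter(e -> lower.contains(e.getKey()))
                .map(Map.Entry::getValue)
                .findFirst();
    }
}

// Stand-in for an adapter around a third-party service.
// The Function is a placeholder for the provider's client call.
final class RemoteNluAdapter implements IntentRecognizer {
    private final Function<String, String> client;

    RemoteNluAdapter(Function<String, String> client) {
        this.client = client;
    }

    @Override
    public Optional<String> recognize(String utterance) {
        return Optional.ofNullable(client.apply(utterance));
    }
}

// The bot only knows the interface, never a concrete backend.
final class PortableBot {
    private final IntentRecognizer recognizer;
    private final Map<String, String> repliesByIntent;

    PortableBot(IntentRecognizer recognizer, Map<String, String> repliesByIntent) {
        this.recognizer = recognizer;
        this.repliesByIntent = Map.copyOf(repliesByIntent);
    }

    String respond(String utterance) {
        return recognizer.recognize(utterance)
                .map(repliesByIntent::get)
                .orElse("Sorry, I did not understand.");
    }
}
```

Hard-coding the in-house recognizer inside `PortableBot` would save the interface and the adapter, which is exactly the shortcut the paragraph above warns against.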
Commercial-driven evolution
We use the term commercial to refer both to the evolution required to align better with client requirements and to the evolution required to bring the product to an industrial-strength quality level.
Platform independence is not always a selling point
Sad but true. Platform independence was not a selling point for many of our clients. In our opinion, platform independence becomes more relevant once a domain reaches a certain level of maturity: companies have more technical options to choose from, have already experimented with some of them, and may already have suffered the pain of migrating from one to another. This is not yet the case in the chatbot domain. Clients were never excited about the benefits of using Xatkit to abstract from concrete NLU providers (DialogFlow, Amazon Lex, …) or about the potential future benefits when migrating from one to the other; they were just thinking short-term. It turned out that being able to adapt the look and feel of the chatbot widget to match the corporate color palette was much more important to them.
MDE helped us for sure to adapt to the current technology stack and infrastructure each client wanted. MDE was a good paradigm to build bots faster ourselves but not a commercial strategy per se.
Build your model-based solution with scalability in mind
It is often said that premature optimization is the root of all evil but for small teams building a model-based infrastructure, initial decisions can be very costly to change afterward. In our case, we started with a Java, Eclipse/EMF and Xtext stack as this was our “natural” option.
But while Java is a great language, most of the NLP and ML developments are first released in Python and therefore, probably building Xatkit in Python (instead of having to wrap Python APIs and libraries) would have been a wiser choice (🚨 stay tuned for more news on this!!! 🚨). A second mistake was not to think about a multi-tenant approach from the beginning (we had to implement this for our ecommerce bot, where many clients would be sharing the same template bot, instantiated with their individual data in every case).
Stop looking for the “right” concrete syntax
A critical decision when designing a DSL is the choice of the concrete syntax. But we often fail to remember that the same DSL could be linked to many different concrete syntaxes, each with their own trade-offs (e.g. in terms of expressiveness and usability).
We quickly realized that each user profile required a different syntax. So, we dropped the idea of a common syntax that would be a good compromise among all the user profiles (us, technical end-users, non-technical users, …). Instead, we created the “right” syntax for us and then experimented with other syntaxes more oriented towards non-technical users. This even included an Excel-based syntax where users could simply use Excel to define their bots.
Moreover, we also found out that, often, our clients had no interest in defining the bot themselves. They just wanted to hand us the documentation (website, support emails and tickets, manuals, …) and have us build the bot for them.
Ask yourself whether you are selling a model-based tool or a service that you internally develop with your own model-based infrastructure. The right notation in both cases may be very different.
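The idea of one DSL with several concrete syntaxes can be sketched as two small front ends that produce the same model. The sketch assumes a toy abstract syntax (`IntentDef`), an invented textual notation and a three-column table as it might be exported from a spreadsheet. None of these formats are the ones Xatkit actually used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Abstract syntax (hypothetical): the single model every notation maps to.
final class IntentDef {
    final String name;
    final List<String> sentences;
    final String reply;

    IntentDef(String name, List<String> sentences, String reply) {
        this.name = name;
        this.sentences = List.copyOf(sentences);
        this.reply = reply;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof IntentDef)) {
            return false;
        }
        IntentDef other = (IntentDef) o;
        return name.equals(other.name)
                && sentences.equals(other.sentences)
                && reply.equals(other.reply);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, sentences, reply);
    }
}

final class ConcreteSyntaxes {
    // Textual notation (invented): "intent Greeting: hi | hello -> Hello!"
    static List<IntentDef> fromText(List<String> lines) {
        List<IntentDef> result = new ArrayList<>();
        for (String line : lines) {
            String[] headAndReply = line.split("->", 2);
            String[] nameAndSentences = headAndReply[0].split(":", 2);
            String name = nameAndSentences[0].replaceFirst("^\\s*intent\\s+", "").trim();
            List<String> sentences = new ArrayList<>();
            for (String s : nameAndSentences[1].split("\\|")) {
                sentences.add(s.trim());
            }
            result.add(new IntentDef(name, sentences, headAndReply[1].trim()));
        }
        return result;
    }

    // Tabular notation: rows of (intent, training sentence, reply).
    // The reply is taken from the first row of each intent.
    static List<IntentDef> fromRows(List<String[]> rows) {
        Map<String, List<String>> sentences = new LinkedHashMap<>();
        Map<String, String> replies = new LinkedHashMap<>();
        for (String[] row : rows) {
            sentences.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            replies.putIfAbsent(row[0], row[2]);
        }
        List<IntentDef> result = new ArrayList<>();
        for (String name : sentences.keySet()) {
            result.add(new IntentDef(name, sentences.get(name), replies.get(name)));
        }
        return result;
    }
}
```

Because both front ends return the same `IntentDef` list, everything downstream (execution, deployment, analysis) stays unaware of which notation a given user preferred.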
Final remarks
We still believe model-driven engineering is the correct approach when developing any complex software product (and of course, our new low-code platform is also model-driven!). Nevertheless, we also tried to make clear in our lessons learned that each of these model-driven benefits and technical enhancements will need to be analyzed not only from a purely technical point of view but from a broader perspective to make sure they are aligned with the needs of future stakeholders and users of your specific scenario and domain.
(And yes, I did realize the featured image of the post has a typo. It was generated with DALL-E. I decided to leave it as a reminder of the time when image generators were not yet good at embedding text in images.)
FNR Pearl Chair. Head of the Software Engineering RDI Unit at LIST. Affiliate Professor at University of Luxembourg.
This is an excellent article, and I can only support everything Jordi says here based on our own experience – including ending up using a generic tool, such as MS Excel, for modeling. For whatever reason ;-), business users and Subject Matter Experts in many domains simply *love* Excel and want to keep using it for everything.