During my participation in the development of several products that embed Machine Learning (ML) and other AI components at the core of their logic, I’ve faced many issues, most of them concerning communication and alignment among team members. The teams were largely multidisciplinary and included profiles beyond developers and software engineers, e.g., data scientists, psychologists and AI experts. It wasn’t clear to anybody, let alone standardized:
- what to communicate
- what to expect from different parties
- when and how often to meet
- when to expect results
It felt like most of the information was lost in translation: people had diverse backgrounds and perspectives, and oftentimes seemed to speak different languages. To my relief, I’m not alone in the desert: the need for proper guidance when executing a project to develop AI-based software is a common problem, and it has been repeatedly reported in recent studies with practitioners [2,4,6,11].
We need a way to provide full visibility and traceability of the work decomposition within an organization, along with the responsibilities of its participants and the standards and knowledge of the company. Thankfully, adopting a process-centric approach ensures standardization and consistency in the execution of tasks, provides a common language that eases communication across the organization, and nurtures a transparent, collaborative environment.
In this sense, we propose a domain-specific language (DSL) to facilitate the specification of AI engineering processes. The motivation for a DSL is to have a shared language for a particular problem space that can foster communication and collaboration. Our DSL encompasses standard process modeling concepts plus AI-specific process primitives based on the analysis of research and grey literature. So far, we have introduced concepts from Machine Learning, leaving other AI activities and facets for future work. We will present this work at the upcoming International Conference on Product-Focused Software Process Improvement (PROFES 2022).
Among other expected benefits, our DSL aims to:
- facilitate the definition of AI engineering processes by enabling stakeholders to discuss and specify a single and formalized representation of such processes;
- enable automatic processing, by providing constructs that can be assembled and automatically interpreted;
- ease the detection of hidden or conflicting practices; and
- simplify the onboarding of new team members.
Let me briefly walk you through the DSL in the following.
There are several scientific papers and grey literature describing the development of real AI projects. Among those, we have selected the most cited research publications and influential contributions from the industry as inspiration for our DSL.
In particular, we have chosen 3 industrial methods: CRISP-DM as the de facto standard process for data science projects; and Microsoft’s Team Data Science Process and IBM’s AI Model Lifecycle Management as two major players in the field. We have also included 3 scientific references that discuss the Machine Learning lifecycle in actual applications [2,3,7], and 1 paper that blueprints a maturity framework for AI model management.
Each of those proposals has a slightly different grouping, distribution and granularity of activities, but all share the following high-level structure:
- Business Understanding, to set the business objectives and criteria linked to an AI project, and produce a plan along with an initial assessment of tools and techniques.
- Data Collection & Preparation, to gather data, clean noise and inaccuracies, and prepare datasets and features for creating AI models.
- AI Model Training & Evaluation, to select AI modeling techniques, optimize hyperparameters and train the AI models, which will be evaluated and ranked according to evaluation and business criteria.
- Production & Operation, to finally make the AI models available for consumption and to build a monitoring system and pipelines for continuous improvement.
Our DSL generalizes and unifies these concepts to enable the specification of end-to-end AI engineering processes.
A DSL is commonly defined through a metamodel that represents its domain entities and their relationships. As shown in the following figure, at its core, our DSL contains the generic description of activities, the relationships between them, and the main elements they are related to. Based on the analysis of existing literature, we predefine four main activities: BusinessActivity, DataActivity, AIModelingActivity and AIModelDeploymentActivity.
At this moment, we focus on the two most AI-specific ones: DataActivity and AIModelingActivity. We only briefly cover the AIModelDeploymentActivity and leave the BusinessActivity for future work. For the sake of simplicity, in this post we introduce an excerpt of the DSL, whereas the complete metamodel is available here.
Note that our DSL does not prescribe any concrete AI engineering process model. Instead, it offers the modeling constructs so that each organization can easily define its own process.
Activity Core Elements
An Activity constitutes the core element of any process. Activities may be composed of other activities (association composedOf), and completing an activity may require completing all of its subactivities (attribute requiresAllSubactivities). Process creators define whether an activity is optional or mandatory (attribute isOptional). There may also be a precedence relationship between activities (association next).
Several Roles perform the activities during development. Their participation could be specified according to the organization’s levels of responsibility, e.g., as responsible or accountable (class Participant).
Activities consume (inputs) and produce (outputs) Artifacts. An artifact could be, for instance, a document generated as an output of one activity and consumed as an input by the next one. Other examples of artifacts are discussed in the following sections.
Finally, Resources might be helpful to complete an activity. Resources are neither consumed nor produced by the process; they are supporting components. An example of a resource would be a template for the document mentioned in the previous paragraph.
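To make these core elements concrete, here is a minimal Python sketch of how they relate to each other. This is a hypothetical in-memory representation for illustration only, not our actual EMF-based implementation; the class and attribute names simply mirror the metamodel concepts described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Artifact:
    name: str  # e.g., a document produced by one activity, consumed by the next

@dataclass
class Resource:
    name: str  # supporting component, e.g., a document template

@dataclass
class Role:
    name: str

@dataclass
class Participant:
    role: Role
    responsibility: str  # level of responsibility, e.g., "responsible" or "accountable"

@dataclass
class Activity:
    name: str
    is_optional: bool = False                 # attribute isOptional
    requires_all_subactivities: bool = True   # attribute requiresAllSubactivities
    composed_of: List["Activity"] = field(default_factory=list)  # association composedOf
    next: Optional["Activity"] = None         # precedence relationship (association next)
    participants: List[Participant] = field(default_factory=list)
    inputs: List[Artifact] = field(default_factory=list)    # consumed artifacts
    outputs: List[Artifact] = field(default_factory=list)   # produced artifacts
    resources: List[Resource] = field(default_factory=list)

# A tiny example: two sequenced subactivities inside a parent activity.
clean = Activity("Clean the data")
ingest = Activity("Ingest the data", next=clean,
                  participants=[Participant(Role("Data engineer"), "responsible")])
data_activity = Activity("Data Activity", composed_of=[ingest, clean])
```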
Data Activity
The DataCollectionActivity covers the acquisition of Data from DataSources. The participants move data from internal or external data sources (attribute isExternal) into a destination (attribute location) for further processing.
In the DataProcessingActivity, data is cleaned and transformed via different techniques (e.g., dummy substitution for cleaning empty values in relevant attributes, decimal scaling of numeric discrete data, and data reduction or augmentation) to overcome deficiencies that might result in bad predictions if used for training an AI model. Additionally, data could be labelled to help AI models identify concepts in production data.
The FeatureEngineeringActivity (not included in the figure above) comprises the tasks and statistical techniques used to transform the attributes of a Data instance into features that an AI model can use and that enhance its prediction accuracy. During this activity, correlations between features are identified. As a result, a set of features is extracted from the data instances.
Data is then usually split into three disjoint sets: a training dataset, a validation dataset and a test dataset.
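As a simple illustration of the data processing and splitting steps above, the following Python sketch applies decimal scaling (one of the cleaning techniques mentioned) and then produces the three disjoint datasets. The function names are hypothetical and this is not part of the DSL itself, just an example of what such an activity does with the data.

```python
import random

def decimal_scale(values):
    """Decimal scaling: divide by a power of 10 so every value has magnitude < 1."""
    max_abs = max(abs(v) for v in values)
    digits = len(str(int(max_abs)))  # digit count of the largest magnitude
    return [v / (10 ** digits) for v in values]

def split_dataset(data, train=0.7, validation=0.15, seed=42):
    """Split data into three disjoint sets; the remainder becomes the test set."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for reproducibility
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (shuffled[:n_train],                      # training dataset
            shuffled[n_train:n_train + n_val],       # validation dataset
            shuffled[n_train + n_val:])              # test dataset
```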
AI Modeling Activity
The AIModelTrainingActivity is the activity for creating, training and validating new AI models from the collected and prepared data. An AIModel is trained by an AI algorithm using the observations held in the TrainingDataset.
Once an AI model is initially trained, a data scientist tunes its Hyperparameters looking for the OptimalValues that yield the best performance. The ValidationDataset is used in this procedure. Finally, the hyperparameter values that maximize the AI model’s performance are fixed for production.
The AIModelPerformanceCriteria drive the AI model training and are used to decide whether to pursue or discard an AI model; in other words, they dictate when it is no longer worthwhile to keep improving an AI model.
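A simplified view of this train-tune-decide loop can be sketched in Python as follows. Everything here is hypothetical (the function names, the grid search strategy, and a single minimum-score criterion standing in for the AIModelPerformanceCriteria); it only illustrates how validation data and performance criteria interact during training.

```python
import itertools

def tune(train_fn, score_fn, grid, validation_set, min_score):
    """Grid-search hyperparameters; return the best (params, score) that meets
    the performance criterion, or None if every candidate should be discarded."""
    best = None
    keys = list(grid)
    for combo in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, combo))
        model = train_fn(params)                 # fit on the training dataset
        score = score_fn(model, validation_set)  # evaluate on validation, never test
        if score >= min_score and (best is None or score > best[1]):
            best = (params, score)
    return best  # the winning values would then be fixed for production
```

A toy run with stand-in train/score functions shows the shape of a result, e.g. `tune(lambda p: p, lambda m, v: m["lr"] * m["depth"], {"lr": [0.1, 0.01], "depth": [2, 3]}, None, 0.05)` keeps only candidates whose score clears the criterion.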
In the AIModelEvaluationActivity (not part of the figure above), a data scientist checks whether an AI model satisfies the AI model success criteria, along with its adequacy to the AI model requirements. A test dataset is used for this assessment. Data scientists then rank the AI models.
AI Model Deployment
In the AIModelDeploymentActivity, an AI model is deployed to a production Platform to serve end users or other systems. It may be useful to run Scripts to automate its installation and setup. An AI model can be deployed (attribute pattern) statically, dynamically (on the user’s device or on a server), or via streaming. An AI model can make its inferences either: (a) in batch mode, periodically making predictions offline and serving the results to a repository; or (b) in real-time, making and serving predictions whenever requested.
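The two inference modes can be contrasted with a short Python sketch. The function names and the list standing in for a results repository are hypothetical; `model` is any callable that maps an input to a prediction.

```python
def batch_predict(model, repository, inputs):
    """Batch mode: score a whole collection of inputs offline and
    store the predictions in a repository for later consumption."""
    repository.extend(model(x) for x in inputs)

def realtime_predict(model, x):
    """Real-time mode: make and serve a single prediction on request."""
    return model(x)

# Example: a trivial "model" used in both modes.
double = lambda x: x * 2
repo = []
batch_predict(double, repo, [1, 2, 3])   # periodic offline scoring
on_demand = realtime_predict(double, 5)  # answered immediately per request
```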
We have implemented our DSL on top of Sirius Web, an open-source subproject of Eclipse Sirius. Given a metamodel and its mapping to a set of visual notation elements, Sirius Web generates a modeling environment that can then be used by modelers to design new graphical models conforming to that metamodel.
As an example, the following figure depicts an excerpt of the DataActivity of a simple process model with four subactivities: (1) Ingest the data, (2) Clean the data, (3) Reduce the data, and (4) Set the data pipeline. The first one is an instance of the DataCollectionActivity and employs a technique (Load data to SQL Server on Azure VM) for transferring data from the data source Employees ERP into the data Extraction ref A0O451. The activity Ingest the data has one participant, a Data engineer, who is responsible for its execution. The activities Clean the data and Reduce the data are instances of the DataProcessingActivity, and each of them performs different techniques to process the data instance.
The DSL provides flexibility for adding elements that are specific to a method. In the example, Set up the data pipeline does not correspond to any predefined AI activity described in our DSL. Therefore, it is based on the generic Activity.
There are dozens of process modeling languages, e.g., BPMN & SPEM and their extensions, UML profiles, and formal languages. Specifically, SPEM is an OMG standard for describing software development processes, but it purposely does not include any distinct feature for particular domains or disciplines –like Artificial Intelligence. To the best of our knowledge, none of the process modeling languages includes AI-specific extensions.
Regarding DSLs for AI, there are languages to model certain AI activities, such as OptiML, ScalOps, Arbiter or Pig Latin. There are also DSLs for creating AI artifacts, like ML-Schema, an ontology for interchanging information on ML experiments, or DeepDSL for the creation of deep learning networks. Nevertheless, none of those DSLs focuses on process aspects.
Therefore, as far as we know, our DSL is the first that provides elements for describing AI-specific activities and enables modeling AI engineering processes.
Conclusions and Future Work
In this post, we have presented a first version of a DSL to model AI engineering processes. Our language covers the needs of such processes as described in academic and industry proposals. We believe this DSL is a step forward towards the standardization of development practices in the AI area and will help teams coordinate and collaborate efficiently and effectively.
Our DSL will facilitate the formalization of AI processes within organizations. Moreover, this formalization will also enable the manipulation of the models via any of the existing model-driven tools –especially the EMF-based ones, which will be directly compatible with our DSL implementation.
As further work, we plan to extend the DSL. In particular, we will dive deeper into the BusinessActivity for contextualizing AI projects and setting their business purposes. Similarly, we will enrich the AIModelDeploymentActivity with monitoring elements to ensure that the performance of deployed AI models remains within acceptable limits. We will also go beyond ML and include other AI methods.
Additional further work involves creating a tool set to enact and automate these modeled AI processes, thus providing real-time information on running processes and guidance for intervention. We will also add process snippets and templates to help companies create their own processes without starting from scratch. Finally, we plan to empirically validate the usability of our DSL.
If you’d like to try our DSL to model your internal company process, we’ll be more than happy to explore this collaboration. Get in touch!
[1] Akkiraju, R., Sinha, V., Xu, A., Mahmud, J., Gundecha, P., Liu, Z., Liu, X., Schumacher, J.: Characterizing machine learning processes: A maturity framework. In: Business Process Management. pp. 17–31. Springer (2020)
[2] Amershi, S., Begel, A., Bird, C., DeLine, R., Gall, H., Kamar, E., Nagappan, N., Nushi, B., Zimmermann, T.: Software engineering for machine learning: A case study. In: ICSE-SEIP. pp. 291–300 (2019)
[3] Ashmore, R., Calinescu, R., Paterson, C.: Assuring the machine learning lifecycle: Desiderata, methods, and challenges. ACM Comput. Surv. 54(5) (2021)
[4] Bosch, J., Olsson, H., Crnkovic, I.: Engineering AI systems: A research agenda. Artificial Intelligence Paradigms for Smart Cyber-Physical Systems pp. 1–19 (2021)
[5] García-Borgoñón, L., Barcelona, M., García-García, J., Alba, M., Escalona, M.: Software process modeling languages: A systematic literature review. Information and Software Technology 56(2), 103–116 (2014)
[6] Hill, C., Bellamy, R., Erickson, T., Burnett, M.: Trials and tribulations of developers of intelligent systems: A field study. In: 2016 IEEE Symposium on VL/HCC. pp. 162–170 (2016)
[7] Nascimento, E.d.S., Ahmed, I., Oliveira, E., Palheta, M.P., Steinmacher, I., Conte, T.: Understanding development process of machine learning systems: Challenges and solutions. In: ESEM 2019. pp. 1–6 (2019)
[8] Olston, C., Reed, B., Srivastava, U., Kumar, R., Tomkins, A.: Pig Latin: A not-so-foreign language for data processing. In: SIGMOD. pp. 1099–1110. ACM (2008)
[9] Publio, G.C., Esteves, D., Lawrynowicz, A., Panov, P., Soldatova, L., Soru, T., Vanschoren, J., Zafar, H.: ML-Schema: exposing the semantics of machine learning with schemas and ontologies. arXiv preprint arXiv:1807.05351 (2018)
[10] Sujeeth, A.K., Lee, H., Brown, K.J., Rompf, T., Chafi, H., Wu, M., Atreya, A.R., Odersky, M., Olukotun, K.: OptiML: An implicitly parallel domain-specific language for machine learning. In: ICML. pp. 609–616 (2011)
[11] Wan, Z., Xia, X., Lo, D., Murphy, G.C.: How does machine learning change software development practices? IEEE Trans. on Software Eng. 47(9), 1857–1871 (2021)
[12] Weimer, M., Condie, T., Ramakrishnan, R., et al.: Machine learning in ScalOps, a higher order cloud computing language. In: NIPS. vol. 9, pp. 389–396 (2011)
[13] Zhao, T., Huang, X.: Design and implementation of DeepDSL: A DSL for deep learning. Computer Languages, Systems & Structures 54, 39–70 (2018)
[14] Zucker, J., d’Leeuwen, M.: Arbiter: A domain-specific language for ethical machine learning. In: AAAI/ACM Conf. on AI, Ethics, and Society. pp. 421–425 (2020)
I am a member of the SOM Research Lab (IN3-UOC), researching at the crossroads of Software Engineering and AI Engineering.